• @[email protected]
    15
    5 months ago

    There is a huge corporate incentive here that not everyone is realizing. With screen recording + OCR, this data could be used to replace some labor-intensive but simple tasks in operating a business. If you can build RPA+ML+LLM that can rerun repetitive tasks, you have the holy grail on your hands. I think this is one of the big reasons why M$ is pushing this.

    I expect to be downvoted to oblivion, but I do business automation and integration for a living, and I am scared and excited at the same time.

    • @[email protected]
      9
      5 months ago

      Lmao do you have any idea how quickly that’s going to go off the rails? They’re going to get into a hallucination feedback loop, which will destroy the integrity of their systems and processes, and they’ll richly deserve it.

      At any rate, most highly-effective technical teams have already automated the shit out of all their rote operations without using ML.

    • @[email protected]
      8
      5 months ago

      Absolutely. Corporations - at least, shitty ones (most of them) - are absolutely salivating at using this. They want to be able to see and easily summarize eeeeeeverything you’re doing.

      Some are absolutely already using a form of this. It’s not a hypothetical - this is currently happening and many want way way more.

    • @[email protected]
      6
      5 months ago

      Automation suites exist, and they are very much tuned to individual apps. It seems that giving ML an OCR readout of a page is not enough for it to know (accurately) what it should do. We have had a training set for “booking flights in a browser” for about 6 years now, and no one has figured out how to have it disrupt automated testing: https://miniwob.farama.org/

    • @[email protected]
      4
      5 months ago

      I was thinking about this, but I don’t know what the plan is for annotating new flows with descriptions of the actions. There’s no point in learning how to send an email or open a webpage; that’s already easy. The value is in a database of uncommon interactions, but it’s only valuable if there is a description to train on.