- OpenAI has officially launched its first AI Agent: Operator
- It works within a web browser to complete tasks for you, and is out now as a limited research preview
- Operator can make a dinner reservation, fill out a form, and complete other web tasks
OpenAI is always looking for the next big thing to add to ChatGPT, and after months of rumors, including a report from earlier this week that teased a launch, the technology giant’s first AI Agent is here. Operator is designed to complete web tasks for you, all at the touch of a button.
Essentially, Operator is built on a Computer-Using Agent (CUA) model that uses GPT-4o’s visual skills to browse and search the web. This means that it can understand the context of what to search for, and because the model is multimodal, it understands what it sees as it searches. It’s available now as a research preview for ChatGPT Pro subscribers in the United States.
Operator is described as “an agent that can use its own browser to perform tasks for you.” OpenAI released a demo showing Operator browsing the web as we (that is, we humans) do. You might ask Operator to book a dinner reservation for you, fill out an arduously long form, order groceries from a service, or even book a flight. It can use OpenTable to find and book a reservation at a restaurant, as shown in the demo. Operator will even walk you through its steps.
Operator is a ‘research preview,’ so know that it’s in its early days. OpenAI does impose some limitations. We haven’t had the chance to go hands-on yet, but it certainly looks impressive. This is OpenAI’s first entry into the world of AI agents, which will likely be the theme of the year in the realm of artificial intelligence.
In the blog post announcing Operator, OpenAI writes that it “is one of our first agents, which are AIs capable of doing work for you independently,” adding that “you give it a task and it will execute it.” This hints that other agents are in the pipeline, which Altman confirmed during the live demo, and that they are all built around the notion of doing things for you. That would be a big step in the quest to make AI even more helpful and give us some time back.
Operator is powered by the new Computer-Using Agent (CUA) model, which pairs GPT-4o’s vision skills with advanced reasoning. This all comes together to let Operator understand and use elements within a browser: the search bar, various buttons, and on-screen content.
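To make the idea of an agent that looks at a page, decides what to do, and then acts a little more concrete, here is a toy sketch of that kind of loop. It is not OpenAI's implementation and does not use any OpenAI API: `FakeBrowser`, `choose_action`, and `run_agent` are hypothetical names invented for illustration, the "screenshot" is a plain dictionary rather than pixels, and a few hard-coded rules stand in for the model's reasoning.

```python
from dataclasses import dataclass, field


@dataclass
class FakeBrowser:
    """Hypothetical stand-in for a real browser: one search box, one results list."""
    search_text: str = ""
    results: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real agent would receive pixels; this toy returns a dict describing the page.
        return {"search_text": self.search_text, "results": list(self.results)}

    def type_into_search(self, text: str) -> None:
        self.search_text = text

    def click_search(self) -> None:
        self.results = [f"Result for {self.search_text}"]


def choose_action(task: str, observation: dict) -> tuple:
    """Toy rules standing in for the model's reasoning step (an assumption)."""
    if observation["results"]:
        return ("done", None)
    if observation["search_text"] != task:
        return ("type", task)
    return ("click_search", None)


def run_agent(task: str, browser: FakeBrowser, max_steps: int = 10) -> list:
    """Observe, decide, act, and repeat until the task is done or steps run out."""
    log = []
    for _ in range(max_steps):
        observation = browser.screenshot()
        action, argument = choose_action(task, observation)
        log.append(action)
        if action == "done":
            break
        if action == "type":
            browser.type_into_search(argument)
        elif action == "click_search":
            browser.click_search()
    return log


if __name__ == "__main__":
    browser = FakeBrowser()
    print(run_agent("table for two", browser))
    print(browser.results)
```

The returned log of actions loosely mirrors how Operator is said to walk you through its steps, and the `max_steps` cap is one simple way such a loop might be kept from running indefinitely.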