Paul Krill
Editor at Large

Google unveils Gemini 2.0 AI model for agentic era

news
Dec 11, 2024

Google cites multimodality advances in Gemini 2.0 Flash, which is expected to enable the development of AI agents as universal assistants.

Emphasizing a new AI model for the agentic era, Google has introduced Gemini 2.0, which the company calls its most capable model yet.

Announced December 11, the experimental Gemini 2.0 Flash model will be available to all Gemini users. Gemini 2.0 is billed as advancing multimodality, with native image and audio output and native tool use. Google anticipates that Gemini 2.0 will enable the development of new AI agents closer to the vision of a universal assistant. Agentic models can understand more, think multiple steps ahead, and take action on a user's behalf, with supervision, Google CEO Sundar Pichai said.
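For developers who want to experiment with the new model through the Gemini API, a minimal sketch follows. It assumes the google-generativeai Python SDK and the experimental model identifier "gemini-2.0-flash-exp"; both the package usage and the model string are assumptions based on the launch timeframe, not details confirmed in this article, so check current documentation before relying on them.

# Minimal sketch: calling the experimental Gemini 2.0 Flash model.
# Assumes the google-generativeai SDK (pip install google-generativeai)
# and the model ID "gemini-2.0-flash-exp" (an assumption; verify against
# current Gemini API docs).
import os

import google.generativeai as genai

# Authenticate with an API key from Google AI Studio.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Select the experimental Gemini 2.0 Flash model.
model = genai.GenerativeModel("gemini-2.0-flash-exp")

# Plain text in, text out; image and audio output are separate capabilities.
response = model.generate_content(
    "Summarize in two sentences what an agentic AI model is."
)
print(response.text)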

Gemini 2.0's advances are underpinned by decade-long investments in a differentiated, full-stack approach to AI innovation, Pichai said. The model was built on custom hardware such as Trillium, Google's sixth-generation TPUs (tensor processing units), which powered Gemini 2.0 training and inference. Trillium is also generally available to customers who want to build with it.

With this announcement, Google also introduced a new feature, Deep Research, which leverages advanced reasoning and long-context capabilities to act as a research assistant, exploring complex topics and compiling reports. Deep Research is available in Gemini Advanced.

While Gemini 1.0, introduced in December 2023, was about organizing and understanding information, Gemini 2.0 is about making the information more useful, Pichai said. In touting Gemini 2.0, Google cited Project Mariner, an early research prototype built with Gemini 2.0 that explores the future of human-agent interaction, starting with a browser. As a research prototype, it can understand and reason across information in a browser screen, including pixels and web elements like text, code, images, and forms, and then use that information via an experimental Chrome extension to complete tasks.


Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld's news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld's audience of software developers and other information technology professionals. Paul has won a "Best Technology News Coverage" award from IDG.