Paul Krill
Editor at Large

Google previews Gemini 2.5 Flash-Lite

Jun 18, 2025

Reasoning model optimized for cost and speed shines at high-throughput tasks such as classification or summarization at scale, Google said.

Google has unveiled a preview of Gemini 2.5 Flash-Lite, a reasoning model optimized for cost and speed, and announced that two other Gemini models, Gemini 2.5 Pro and Gemini 2.5 Flash, are now generally available.

Google made the announcements June 17. Gemini 2.5 models are thinking models, capable of reasoning through thoughts before responding, resulting in enhanced performance and improved accuracy, Google said.

Gemini 2.5 Flash-Lite has the lowest cost and lowest latency in the Gemini 2.5 model family, Google said. Flash-Lite is a reasoning model that enables dynamic control of the thinking budget via an API parameter, but because Flash-Lite is optimized for low latency and low cost, thinking is turned off by default. This model is “great” for high-throughput tasks such as classification or summarization at scale, Google said. Built as an upgrade to the Gemini 1.5 Flash and 2.0 Flash models, Gemini 2.5 Flash-Lite offers better performance across most evals and a lower time to first token, while also achieving higher tokens-per-second decode, according to Google. Each Gemini 2.5 model offers control over the thinking budget, giving developers the ability to choose when and how much the model thinks before generating a response.
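As a rough illustration of what setting the thinking budget through an API parameter might look like, the Python sketch below builds a request body without sending it. The function name is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are assumed from the Gemini REST API's request shape rather than taken from Google's announcement, so they should be checked against the current documentation. A budget of 0 is assumed here to mean thinking is off.

```python
import json


def build_generate_request(prompt: str, thinking_budget: int = 0) -> dict:
    """Build a JSON-serializable request body with an explicit thinking budget.

    Field names are an assumption based on the Gemini REST API request
    shape; verify them against the current documentation before use.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


if __name__ == "__main__":
    # Zero budget (assumed to disable thinking) for a bulk classification call.
    fast = build_generate_request("Classify this ticket: 'My invoice is wrong.'")
    # A nonzero budget for a harder prompt; 1024 is an arbitrary illustrative value.
    careful = build_generate_request(
        "Summarize the root cause in this log.", thinking_budget=1024
    )
    print(json.dumps(fast, indent=2))
    print(json.dumps(careful, indent=2))
```

Keeping the budget as an ordinary function argument lets a caller leave thinking off for high-volume work and raise it only for requests that need more reasoning.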

The Gemini 2.5 Pro and Gemini 2.5 Flash models are now available and stable, with no changes from the previews. However, pricing for Gemini 2.5 Flash has changed. The price per 1M input tokens has been raised to $0.30 from $0.15, and the price per 1M output tokens has been lowered to $2.50 from $3.50. The price difference for thinking vs. non-thinking has been removed.
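The effect of the Gemini 2.5 Flash price change depends on a workload's mix of input and output tokens. The Python sketch below works through the arithmetic using only the four per-1M-token figures quoted above. The function names are hypothetical, and the comparison does not model the separate thinking and non-thinking output prices that the preview carried.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)

# USD per 1M tokens for Gemini 2.5 Flash, as quoted in the article.
OLD_PRICES = {"input": Decimal("0.15"), "output": Decimal("3.50")}
NEW_PRICES = {"input": Decimal("0.30"), "output": Decimal("2.50")}


def job_cost(prices: dict, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of a job at the given per-1M-token prices."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / PER_MILLION


def price_change(input_tokens: int, output_tokens: int) -> Decimal:
    """Return new cost minus old cost; a negative value means cheaper."""
    return job_cost(NEW_PRICES, input_tokens, output_tokens) - job_cost(
        OLD_PRICES, input_tokens, output_tokens
    )


if __name__ == "__main__":
    # Input-heavy job: 10M tokens in, 1M tokens out.
    print(price_change(10_000_000, 1_000_000))
    # Balanced job: 1M tokens in, 1M tokens out.
    print(price_change(1_000_000, 1_000_000))
```

On these figures, input rises by $0.15 per 1M tokens and output falls by $1.00 per 1M tokens, so a job breaks even when its output tokens equal 15% of its input tokens. Jobs with a smaller output share than that cost more under the new prices, relative to the $3.50 output figure.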

Google said Gemini 2.5 Flash-Lite is best for high-volume, cost-efficient tasks, while Gemini 2.5 Flash is best for fast performance on everyday tasks and Gemini 2.5 Pro is best for coding and highly complex tasks. Gemini 2.5 was introduced March 25.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
