Paul Krill
Editor at Large

Google ships Gemini 1.5 Flash-8B AI model

Oct 4, 2024 | 2 mins

Smaller and faster variant of 1.5 Flash offers half the price, twice the rate limits, and lower latency on small prompts compared with its forerunner.


Google’s Gemini 1.5 Flash-8B AI model is now production-ready. The company said the stable release of Gemini 1.5 Flash-8B has the lowest cost per intelligence of any Gemini model.

Availability was announced October 3. Developers can access gemini-1.5-flash-8B for free via Google AI Studio and the Gemini API. Gemini 1.5 Flash-8B is priced 50% lower than 1.5 Flash and has twice the rate limits. It also offers lower latency on small prompts.

An experimental version of Gemini 1.5 Flash-8B had been released in September as a smaller, faster variant of 1.5 Flash. Flash-8B nearly matches the performance of the 1.5 Flash model launched in May across multiple benchmarks and performs well on tasks such as chat, transcription, and long-context language translation, Google said.

The stable release of Gemini 1.5 Flash-8B is priced at the following rates:

  • $0.0375 per 1 million input tokens on prompts < 128K
  • $0.15 per 1 million output tokens on prompts < 128K
  • $0.01 per 1 million tokens on cached prompts < 128K
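
To make the rates above concrete, here is a minimal Python sketch of a per-request cost estimate. It is an illustration, not Google's billing logic: the function name `estimate_cost` is hypothetical, "128K" is read as 128,000 tokens, and cached tokens are assumed to be billed separately from regular input tokens. Only the three per-million-token rates come from the article.

```python
# Rates quoted in the article, in US dollars per 1 million tokens.
# They apply only to prompts under 128K tokens.
INPUT_RATE = 0.0375
OUTPUT_RATE = 0.15
CACHED_RATE = 0.01

# Assumption: "128K" is read as 128,000 tokens.
PROMPT_LIMIT = 128_000


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the cost in dollars of one request (hypothetical helper).

    Assumption: cached prompt tokens are counted separately from regular
    input tokens and billed at the cached rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens + cached_tokens >= PROMPT_LIMIT:
        # The article lists no rates for larger prompts, so refuse to guess.
        raise ValueError("no rate listed for prompts of 128K tokens or more")
    total = (
        input_tokens * INPUT_RATE
        + output_tokens * OUTPUT_RATE
        + cached_tokens * CACHED_RATE
    )
    return total / 1_000_000


if __name__ == "__main__":
    # A 100,000-token prompt with a 2,000-token response.
    print(f"${estimate_cost(100_000, 2_000):.5f}")
```

Under these assumptions, a 100,000-token prompt with a 2,000-token response comes to about 0.4 cents.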

Developers on the paid tier will be billed beginning October 14. The new price, along with work Google has done to drive down developer costs with the 1.5 Flash and 1.5 Pro models, shows the company’s commitment to ensuring that developers have the freedom to build products and services that push the world forward, Google said.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
