Paul Krill
Editor at Large

Qdrant Cloud adds service for generating text and image embeddings

Qdrant Cloud Inference simplifies building applications with multimodal search, retrieval-augmented generation, and hybrid search, Qdrant said.


Qdrant has launched Qdrant Cloud Inference, a managed service that allows developers to generate, store, and index text and image embeddings in the Qdrant Cloud. The service, which uses integrated models within a managed vector search engine, is designed to simplify building applications with multimodal search, retrieval-augmented generation, and hybrid search, according to the company.

Announced July 15, Qdrant Cloud Inference is a managed vector database offering multimodal inference, using separate image and text embedding models that are natively integrated in Qdrant Cloud. The service combines dense, sparse, and image embeddings with vector search in one managed environment. Users can generate, store, and index embeddings in a single API call, turning unstructured text and images into search-ready vectors, Qdrant said.

Directly integrating model inference into Qdrant Cloud removes the need for separate inference infrastructure, manual pipelines, and redundant data transfers, simplifying workflows, accelerating development cycles, and eliminating unnecessary network hops for developers, according to Qdrant. “Traditionally, embedding generation and vector search have been handled separately in developer workflows,” said André Zayarni, CEO and co-founder of Qdrant. “With Qdrant Cloud Inference, it feels like a single tool: one API call with optimal resources for each component.”
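To make the contrast between the two workflows concrete, here is a minimal sketch in plain Python. It does not use Qdrant's client or API: `ToyVectorStore`, `upsert_vector`, `upsert_text`, `search_text`, and the hash-based `toy_embed` function are all hypothetical names invented for illustration, and the toy embedding stands in for a real model such as MiniLM. The sketch only shows the shape of the idea, where the caller either embeds first and ships a vector, or hands raw text to a store that embeds and indexes it in one call.

```python
import hashlib
import math
from dataclasses import dataclass, field


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for an embedding model (assumption): hash tokens into a unit vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class ToyVectorStore:
    """Hypothetical in-memory store for illustration; not Qdrant's API."""
    points: dict[int, tuple[list[float], str]] = field(default_factory=dict)

    def upsert_vector(self, point_id: int, vector: list[float], text: str) -> None:
        # Separate workflow: the caller ran inference elsewhere and sends the vector.
        self.points[point_id] = (vector, text)

    def upsert_text(self, point_id: int, text: str) -> None:
        # Integrated workflow: one call embeds the text and stores the result.
        self.points[point_id] = (toy_embed(text), text)

    def search_text(self, query: str, limit: int = 1) -> list[str]:
        # Embed the query with the same model, then rank by dot product
        # (equal to cosine similarity here because vectors are unit length).
        q = toy_embed(query)
        ranked = sorted(
            self.points.values(),
            key=lambda point: -sum(a * b for a, b in zip(q, point[0])),
        )
        return [text for _, text in ranked[:limit]]


# Separate workflow: two steps owned by the caller.
separate = ToyVectorStore()
separate.upsert_vector(1, toy_embed("vector search engine"), "vector search engine")

# Integrated workflow: a single call with raw text.
integrated = ToyVectorStore()
integrated.upsert_text(1, "vector search engine")
integrated.upsert_text(2, "a photo of a cat on a sofa")
```

In this toy version both paths store the same vector for the same text; the difference is only who runs the embedding step and how many round trips the caller has to manage.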

Supported models in Qdrant Cloud Inference include MiniLM, SPLADE, BM25, Mixedbread Embed-Large, and CLIP for both image and text. Additional models will become available over time. The new offering includes up to five million free tokens per model each month, with unlimited tokens for BM25. Qdrant Cloud Inference is currently available only in US regions and only for paid clusters. Support for inference in other regions is coming soon, Qdrant said.

Paul Krill

Paul Krill is editor at large at InfoWorld. Paul has been covering computer technology as a news and feature reporter for more than 35 years, including 30 years at InfoWorld. He has specialized in coverage of software development tools and technologies since the 1990s, and he continues to lead InfoWorld’s news coverage of software development platforms including Java and .NET and programming languages including JavaScript, TypeScript, PHP, Python, Ruby, Rust, and Go. Long trusted as a reporter who prioritizes accuracy, integrity, and the best interests of readers, Paul is sought out by technology companies and industry organizations who want to reach InfoWorld’s audience of software developers and other information technology professionals. Paul has won a “Best Technology News Coverage” award from IDG.
