Granite Embedding Multilingual R2: Open Apache 2.0 Multilingual Embeddings with 32K Context — Best Sub-100M Retrieval Quality

TL;DR

Granite has released two new open-source multilingual embedding models, one full-size and one compact, with enhanced language support, longer context handling, and state-of-the-art retrieval scores. These models aim to bridge the gap between speed and multilingual accuracy for enterprise applications.

The two models, granite-embedding-97m-multilingual-r2 and granite-embedding-311m-multilingual-r2, are released under the Apache 2.0 license and are designed to support over 200 languages with improved retrieval performance and long-context handling.

The 97M-parameter model achieves a retrieval score of 60.3 on the Multilingual MTEB benchmark, outperforming previous models of similar size by 9.4 points. The full-size 311M model scores 65.2, ranking second among open models under 500M parameters, with support for 200+ languages and programming code retrieval. Both models are compatible with popular frameworks like sentence-transformers, LangChain, and Haystack, requiring no task-specific tuning.
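As a sketch of how embedding models like these are typically used for retrieval with sentence-transformers: the ranking helper below is our own illustration (not part of the release), and the Hugging Face hub id in `demo` is an assumption inferred from the model name in this article.

```python
import numpy as np


def rank_documents(query_emb: np.ndarray, doc_embs: np.ndarray) -> np.ndarray:
    """Return document indices sorted by descending cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))


def demo() -> None:
    # Downloads the model from the Hugging Face Hub; the id below is an
    # assumption based on the article's naming and may differ in practice.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("ibm-granite/granite-embedding-97m-multilingual-r2")
    docs = ["El gato duerme en el sofá.", "The stock market fell sharply today."]
    query_emb = model.encode("Where is the cat sleeping?")
    print(rank_documents(query_emb, model.encode(docs)))
```

Because the models need no task-specific tuning, the same encode-then-rank pattern covers cross-lingual search out of the box.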

Built on ModernBERT, these models feature a 32,768-token context length, a 64-fold increase over the 512-token window of the R1 models, enabling better handling of long documents and complex queries. They are trained on curated datasets, emphasizing responsible data governance, and include optimized weights for CPU inference via ONNX and OpenVINO. The models support 52 languages with explicit cross-lingual training, including major languages like Chinese, Arabic, and Spanish, as well as code in nine programming languages.
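With a 32,768-token window, most documents fit in a single pass, but inputs beyond that limit still need splitting before encoding. A minimal sketch of a token-window chunker (our own helper, not part of the release), assuming the reported 32,768-token maximum:

```python
from typing import List

MAX_TOKENS = 32_768  # context window reported for the R2 models


def chunk_token_ids(ids: List[int], max_len: int = MAX_TOKENS,
                    overlap: int = 0) -> List[List[int]]:
    """Split a token-id sequence into windows of at most max_len tokens,
    optionally overlapping consecutive windows to preserve context."""
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    step = max_len - overlap
    return [ids[i:i + max_len] for i in range(0, len(ids), step)]
```

Each chunk can then be embedded separately and the vectors averaged or indexed individually, a common pattern for long-document retrieval.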

Why It Matters

This release addresses a persistent challenge in multilingual AI: balancing broad language coverage with model size and performance. By offering high-quality, open-source models that support extensive language and code retrieval, Granite enables enterprises to deploy multilingual AI solutions more efficiently and cost-effectively, potentially transforming cross-lingual search, retrieval-augmented generation, and code understanding in global teams.

Background

Prior to this release, open multilingual embedding models faced limitations in either size, language coverage, or retrieval quality. The earlier R1 models were built on XLM-RoBERTa with a 512-token window, which constrained long-document processing. Granite’s new models leverage ModernBERT, which revisits transformer architectures for better efficiency and longer context support. The release follows ongoing industry efforts to improve open multilingual models amid increasing enterprise demand for scalable, responsible AI solutions.

“Our new models set a new standard for open multilingual embeddings, combining high retrieval quality with enterprise-ready features like long context support and code retrieval.”

— Granite AI spokesperson

“The focus on responsible data governance and open licensing makes these models suitable for wide adoption in commercial applications.”

— IBM Data Scientist

What Remains Unclear

It is not yet clear how these models will perform in real-world, large-scale enterprise environments under diverse operational conditions, or how they compare to proprietary solutions in deployment scenarios.

What’s Next

Granite plans to provide further benchmarking, user feedback, and integration guides. Future updates may include fine-tuning tools, additional language support, and deployment optimizations tailored for specific industries or use cases.

Key Questions

What are the main advantages of these models over previous versions?

The new models offer significantly improved retrieval scores, support for longer contexts (up to 32,768 tokens), and coverage of over 200 natural languages plus nine programming languages, all under an open Apache 2.0 license.

Can these models be integrated into existing AI frameworks?

Yes, both models are compatible with sentence-transformers, transformers, LangChain, Haystack, and Milvus, typically requiring only a one-line change to the model name.
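In practice the swap could look like the sketch below. The hub id is an assumption based on the model name above, and `langchain_huggingface` reflects the package naming in recent LangChain releases; both are illustrative, not confirmed by the announcement.

```python
MODEL_ID = "ibm-granite/granite-embedding-97m-multilingual-r2"  # assumed hub id


def sentence_transformers_encoder(model_id: str = MODEL_ID):
    """Build an encoder via sentence-transformers (downloads the model)."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_id)


def langchain_embeddings(model_id: str = MODEL_ID):
    """Build a LangChain embeddings wrapper around the same model."""
    from langchain_huggingface import HuggingFaceEmbeddings
    return HuggingFaceEmbeddings(model_name=model_id)
```

Switching from an R1 model would then amount to changing `MODEL_ID` and nothing else.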

Are these models suitable for production use?

They are designed with enterprise deployment in mind, featuring optimized weights for CPU inference and responsible data governance, making them suitable for commercial applications, pending further testing.
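A hedged sketch of what CPU inference with the ONNX weights could look like via Hugging Face Optimum: `ORTModelForFeatureExtraction` is a real Optimum class, but the loading details are assumptions, and mean pooling is shown only as one common choice; check the model card for the pooling the models actually expect.

```python
import numpy as np


def mean_pool(hidden: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions (one common pooling)."""
    m = mask[..., None].astype(hidden.dtype)
    return (hidden * m).sum(axis=1) / np.clip(m.sum(axis=1), 1e-9, None)


def load_onnx_encoder(model_id: str):
    """Load a tokenizer plus an ONNX-exported encoder for CPU inference."""
    from transformers import AutoTokenizer
    from optimum.onnxruntime import ORTModelForFeatureExtraction

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = ORTModelForFeatureExtraction.from_pretrained(model_id)
    return tokenizer, model
```

Running the exported graph through ONNX Runtime avoids a GPU dependency, which is the usual motivation for shipping ONNX and OpenVINO weights alongside the PyTorch checkpoints.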

What languages are explicitly supported?

The models support 52 languages with explicit training, including major languages like Chinese, Arabic, Spanish, and many European and Asian languages, alongside programming languages.
