You can now use smaller, faster embedding models for multilingual search and retrieval without sacrificing quality: F2LLM-v2 offers efficient options for resource-constrained deployments, while its largest variant ranks first on major benchmarks.
F2LLM-v2 is a family of multilingual embedding models (80M to 14B parameters), trained on 60 million high-quality samples, that supports 200+ languages, including underserved low-resource ones. Using Matryoshka representation learning and knowledge distillation, these models achieve top benchmark performance while being more efficient than previous LLM-based embeddings.
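A practical payoff of Matryoshka representation learning is that an embedding can be truncated to a short prefix and re-normalized, trading a little quality for much cheaper storage and search. A minimal sketch of that truncation step, using random NumPy vectors as stand-ins for real model outputs (illustrative only, not the F2LLM-v2 API):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` coordinates of a Matryoshka-trained
    embedding and re-normalize, so dot products remain valid
    cosine similarities in the smaller space."""
    prefix = vec[:dim]
    return prefix / np.linalg.norm(prefix)

# Toy full-size "embeddings" standing in for model output.
rng = np.random.default_rng(0)
a = rng.standard_normal(1024)
b = rng.standard_normal(1024)

# Truncate from 1024 to 64 dimensions before indexing/search.
a64 = truncate_embedding(a, 64)
b64 = truncate_embedding(b, 64)
similarity = float(a64 @ b64)  # cosine similarity in the 64-dim prefix
```

In a real deployment, the truncation width is chosen once per index (e.g. 64 or 256 dimensions) to balance retrieval quality against memory and latency.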