BGE-M3 is a multilingual text embedding model developed by BAAI, distinguished by its Multi-Linguality (supporting 100+ languages), Multi-Functionality (unified dense, multi-vector, and sparse retrieval), and Multi-Granularity (handling inputs from short queries to 8192-token documents). It achieves state-of-the-art retrieval performance across diverse benchmarks while maintaining a single model for multiple retrieval modes.
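Since the model is typically called through a hosted embeddings endpoint, here is a minimal sketch of building a request payload. The endpoint path and field names (`model`, `input`, `encoding_format`) are assumptions based on common OpenAI-compatible embeddings APIs, not confirmed by this page.

```python
import json

# Hypothetical request body for an OpenAI-compatible embeddings endpoint;
# field names are assumptions, only the model identifier comes from this page.
payload = {
    "model": "BAAI/bge-m3",
    "input": [
        "what is BGE-M3?",
        "BGE-M3 is a multilingual embedding model.",
    ],
    "encoding_format": "float",
}

body = json.dumps(payload)  # serialized request body to POST
```

The response would then resemble the JSON example shown below on this page.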
Dense: one fixed-size, low-dimensional embedding per input text, with every dimension populated by the neural model; used for standard semantic similarity search.
Sparse: one high-dimensional vector per input text in which each word is assigned a lexical weight and most values are zero; used for lexical (BM25-style) matching.
Colbert: a set of contextualized vectors, one BERT-derived embedding per input token; used for fine-grained late-interaction (ColBERT-style) matching.
normalize: whether to L2-normalize the computed embeddings, so that dot product equals cosine similarity.
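The `normalize` option above typically means L2-normalizing each embedding to unit length, so that a plain dot product between two embeddings equals their cosine similarity. A minimal sketch of that operation:

```python
import math

def l2_normalize(vec):
    # Scale the vector to unit length; after this, dot products between
    # normalized vectors are cosine similarities.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

unit = l2_normalize([3.0, 4.0])  # [0.6, 0.8], a vector of length 1
```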
{
  "input_tokens": 42,
  "embeddings": [
    [0, 0.5, 1],
    [1, 0.5, 0]
  ],
  "sparse": [
    [0, 0, 1],
    [0, 0.6, 0]
  ],
  "colbert": [
    [
      [0.5, 0.1, 1],
      [0.3, 0.6, 0.5]
    ],
    [
      [0.3, 0.6, 0.5]
    ]
  ],
  "embedding_jsons": [
    "[0.0, 0.5, 1.0]",
    "[1.0, 0.5, 0.0]"
  ]
}
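To show how the three representations in the response above can be used, here is a sketch that scores a query against a document with each mode, reusing the toy vectors from the example (these are illustrative values, not real model output). Dense and sparse scores are dot products; the ColBERT score uses the standard MaxSim late-interaction rule.

```python
def dot(q, d):
    """Dot product; equals cosine similarity when vectors are L2-normalized."""
    return sum(a * b for a, b in zip(q, d))

def maxsim(query_tokens, doc_tokens):
    """ColBERT late interaction: each query token keeps only its
    best-matching document token, and the per-token maxima are summed."""
    return sum(max(dot(qt, dt) for dt in doc_tokens) for qt in query_tokens)

# Toy vectors shaped like the response fields above.
dense_q, dense_d = [0, 0.5, 1], [1, 0.5, 0]
sparse_q, sparse_d = [0, 0, 1], [0, 0.6, 0]
colbert_q = [[0.5, 0.1, 1], [0.3, 0.6, 0.5]]
colbert_d = [[0.3, 0.6, 0.5]]

dense_score = dot(dense_q, dense_d)      # 0.25
sparse_score = dot(sparse_q, sparse_d)   # 0: no shared nonzero dimension
colbert_score = maxsim(colbert_q, colbert_d)
```

The three scores can then be used separately or combined (e.g. a weighted sum) for hybrid retrieval.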