intfloat/multilingual-e5-large cover image

intfloat/multilingual-e5-large

The Multilingual-E5-large model is a 24-layer text embedding model with an embedding size of 1024, trained on a mixture of multilingual datasets and supporting 100 languages. The model achieves state-of-the-art results on the Mr. TyDi benchmark, outperforming other models such as BM25 and mDPR. The model is intended for use in text retrieval and semantic similarity tasks, and should be used with the "query: " and "passage: " prefixes for input texts to achieve optimal performance.

The Multilingual-E5-large model is a 24-layer text embedding model with an embedding size of 1024, trained on a mixture of multilingual datasets and supporting 100 languages. The model achieves state-of-the-art results on the Mr. TyDi benchmark, outperforming other models such as BM25 and mDPR. The model is intended for use in text retrieval and semantic similarity tasks, and should be used with the "query: " and "passage: " prefixes for input texts to achieve optimal performance.

Public
$0.010 / Mtoken
fp32
512
PaperLicense

OpenAI-compatible HTTP API

DeepInfra supports the OpenAI embeddings API. The following creates an embedding vector representing the input text

curl "https://api.deepinfra.com/v1/openai/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "intfloat/multilingual-e5-large",
    "encoding_format": "float"
  }'

which will return something similar to

{
  "object":"list",
  "data":[
    {
      "object": "embedding",
      "index":0,
      "embedding":[
        -0.010480394586920738,
        -0.0026091758627444506
        ...
        0.031979579478502274,
        0.02021978422999382
      ]
    }
  ],
  "model": "intfloat/multilingual-e5-large",
  "usage": {
    "prompt_tokens":12,
    "total_tokens":12
  }
}

Input fields

modelstring

model name


inputarray

sequences to embed


encoding_formatstring

format used when encoding

Default value: "float"

Allowed values: float

Input Schema

Output Schema