A LLM-based embedding model with in-context learning capabilities that achieves SOTA performance on BEIR and AIR-Bench. It leverages few-shot examples to enhance task performance.
A LLM-based embedding model with in-context learning capabilities that achieves SOTA performance on BEIR and AIR-Bench. It leverages few-shot examples to enhance task performance.
DeepInfra supports the OpenAI embeddings API. The following creates an embedding vector representing the input text
curl "https://api.deepinfra.com/v1/openai/embeddings" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-d '{
"input": "The food was delicious and the waiter...",
"model": "BAAI/bge-en-icl",
"encoding_format": "float"
}'
which will return something similar to
{
"object":"list",
"data":[
{
"object": "embedding",
"index":0,
"embedding":[
-0.010480394586920738,
-0.0026091758627444506
...
0.031979579478502274,
0.02021978422999382
]
}
],
"model": "BAAI/bge-en-icl",
"usage": {
"prompt_tokens":12,
"total_tokens":12
}
}
service_tier
stringThe service tier used for processing the request. When set to 'priority', the request will be processed with higher priority.
Allowed values: default
priority
dimensions
integerThe number of dimensions in the embedding. If not provided, the model's default will be used.If provided bigger than model's default, the embedding will be padded with zeros.
Range: 32 ≤ dimensions
Run models at scale with our fully managed GPU infrastructure, delivering enterprise-grade uptime at the industry's best rates.