$0.010 / 1M tokens
llama-nemotron-embed-vl-1b-v2 is a multimodal embedding model that transforms text queries and document images into dense vector representations for retrieval systems. It is particularly suited to complex visual content such as charts, tables, and infographics.
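
To illustrate how dense vectors are typically used for retrieval, here is a minimal sketch that ranks documents against a query. It assumes cosine similarity as the scoring metric, which is a common choice but is not stated on this page. The helper names `cosine_similarity` and `rank_documents` are hypothetical, and the toy 3-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors (assumption): real embeddings would come from the model.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
print(rank_documents(query, docs))  # prints [1, 2, 0]
```

In a real system the document vectors would be computed once and stored, and only the query would be embedded at search time.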

DeepInfra supports the OpenAI embeddings API. The following request creates an embedding vector representing the input text:
curl "https://api.deepinfra.com/v1/openai/embeddings" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{
    "input": "The food was delicious and the waiter...",
    "model": "nvidia/llama-nemotron-embed-vl-1b-v2",
    "encoding_format": "float"
  }'
The response will look similar to the following:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.010480394586920738,
        -0.0026091758627444506
        ...
        0.031979579478502274,
        0.02021978422999382
      ]
    }
  ],
  "model": "nvidia/llama-nemotron-embed-vl-1b-v2",
  "usage": {
    "prompt_tokens": 12,
    "total_tokens": 12
  }
}
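
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client. The helper names `build_request`, `parse_embeddings`, and `embed` are hypothetical, and the parsing assumes the response has the shape shown above.

```python
import json
import urllib.request

API_URL = "https://api.deepinfra.com/v1/openai/embeddings"
MODEL = "nvidia/llama-nemotron-embed-vl-1b-v2"


def build_request(text, token):
    """Build the POST request mirroring the curl example (hypothetical helper)."""
    payload = {"input": text, "model": MODEL, "encoding_format": "float"}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def parse_embeddings(response):
    """Return the embedding vectors ordered by their "index" field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(text, token):
    """Send the request and return the parsed vectors (performs a network call)."""
    with urllib.request.urlopen(build_request(text, token)) as resp:
        return parse_embeddings(json.load(resp))
```

Calling `embed("The food was delicious and the waiter...", os.environ["DEEPINFRA_TOKEN"])` (after `import os`) would return a list containing one vector for the single input string. Error handling and retries are left out for brevity.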
© 2026 Deep Infra. All rights reserved.