nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL

$0.20 in / $0.60 out

The model is an auto-regressive vision language model that uses an optimized transformer architecture. The model enables multi-image reasoning and video understanding, along with strong document intelligence, visual Q&A and summarization capabilities.

Public
fp8
131,072
Multimodal
ProjectNemotron

OpenAI-compatible HTTP API

You can POST to our OpenAI Chat Completions compatible endpoint.

Passing the URL of an image is the easiest way to perform OCR.

curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{
      "model": "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL",
      "max_tokens": 4092,
      "messages": [
        {
          "role": "user",
          "content": [
            {
              "type": "image_url",
              "image_url": {
                "url": "https://url.com/to/shakespeare.png"
              }
            }
          ]
        }
      ]
    }'
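The request above sends only an image. Because the endpoint follows the OpenAI Chat Completions format, a text part can presumably be placed alongside the image to tell the model what to do with it. The following is a minimal sketch under that assumption: `build_payload` is a hypothetical helper (not part of the API), the prompt wording is illustrative, and no JSON escaping is performed, so the arguments must not contain double quotes or backslashes.

```shell
# Hypothetical helper: print a chat-completions request body that pairs a
# text instruction with an image URL.
# $1 = text prompt, $2 = image URL.
# Assumption: neither argument contains double quotes or backslashes,
# because no JSON escaping is done here.
build_payload() {
  cat <<EOF
{
  "model": "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "$1" },
        { "type": "image_url", "image_url": { "url": "$2" } }
      ]
    }
  ]
}
EOF
}

# Usage (requires DEEPINFRA_TOKEN to be set):
#   build_payload "Transcribe the text in this image." \
#       "https://url.com/to/shakespeare.png" |
#     curl "https://api.deepinfra.com/v1/openai/chat/completions" \
#       -H "Content-Type: application/json" \
#       -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
#       -d @-
```

Keeping the payload in a function makes it easy to inspect the JSON before sending it; for prompts with arbitrary characters, a JSON-aware tool would be a safer way to build the body.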

Another option is to read the image from a file and send it as a base64-encoded data URL.


# Encode the image as a single line of base64 (-w 0 disables line wrapping in GNU coreutils)
BASE64_IMAGE=$(base64 -w 0 shakespeare.png)

curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d @- <<EOF
{
  "model": "nvidia/NVIDIA-Nemotron-Nano-12B-v2-VL",
  "max_tokens": 4092,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,$BASE64_IMAGE"
          }
        }
      ]
    }
  ]
}
EOF

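The file-based request hard-codes `image/png` and relies on the GNU-specific `-w 0` flag. The sketch below shows one way to build the data URL more portably. `image_data_url` is a hypothetical helper, the MIME type is guessed from the file extension, and only PNG and JPEG are handled; which image formats the endpoint accepts is not stated here, so treat the list as an assumption.

```shell
# Hypothetical helper: print a data URL for an image file.
# $1 = path to the image.
# Assumption: the MIME type can be inferred from the file extension.
image_data_url() {
  case "$1" in
    *.png)        mime="image/png" ;;
    *.jpg|*.jpeg) mime="image/jpeg" ;;
    *) echo "unsupported image type: $1" >&2; return 1 ;;
  esac
  # Strip newlines with tr instead of relying on GNU-only `base64 -w 0`.
  printf 'data:%s;base64,%s' "$mime" "$(base64 < "$1" | tr -d '\n')"
}

# Usage: substitute the result into the "url" field of the request body.
#   IMAGE_URL=$(image_data_url shakespeare.png)
```

Large images produce long data URLs, so passing the request body on standard input (`-d @-`), as the example above does, avoids command-line length limits.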
