

nvidia/llama-nemotron-rerank-vl-1b-v2

$0.010 / 1M tokens

The llama-nemotron-rerank-vl-1b-v2 is a 1.7B parameter multimodal reranking model designed to evaluate and order the relevance of document images and text against specific user queries. It excels at understanding complex visual content like charts, tables, and infographics.

Public
bf16
10,240
Paper · License

HTTP/cURL API

You can use cURL or any other HTTP client to run inference:

curl -X POST \
    -d '{"queries": ["What is the capital of United States of America?"], "documents": ["The capital of USA is Washington DC."]}'  \
    -H "Authorization: bearer $DEEPINFRA_TOKEN"  \
    -H 'Content-Type: application/json'  \
    'https://api.deepinfra.com/v1/inference/nvidia/llama-nemotron-rerank-vl-1b-v2'

The response has the following shape (the values shown are illustrative placeholders):

{
  "scores": [
    0.1,
    0.2,
    0.3
  ],
  "input_tokens": 42,
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0,
    "output_length": 0
  }
}
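
The same request can be sent from Python. The following is a minimal sketch using only the standard library, not an official client. The endpoint, headers, and the "queries"/"documents" payload come from the cURL example above; the helper names build_request, rank_documents, and main are hypothetical. It also assumes that "scores" contains one score per document, in input order, which the sample response does not confirm.

```python
import json
import os
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.deepinfra.com/v1/inference/nvidia/llama-nemotron-rerank-vl-1b-v2"


def build_request(queries, documents, token):
    """Build the same POST request as the cURL example (hypothetical helper)."""
    payload = json.dumps({"queries": queries, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rank_documents(documents, response):
    """Pair each document with its score and sort by score, highest first.

    Assumption: "scores" holds one score per document, in input order.
    """
    scores = response["scores"]
    if len(scores) != len(documents):
        raise ValueError("expected one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


def main():
    # Requires DEEPINFRA_TOKEN in the environment and network access.
    token = os.environ["DEEPINFRA_TOKEN"]
    docs = ["The capital of USA is Washington DC."]
    req = build_request(["What is the capital of United States of America?"], docs, token)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    for doc, score in rank_documents(docs, body):
        print(f"{score:.3f}  {doc}")
```

Calling main() with DEEPINFRA_TOKEN set sends the request and prints each document with its score. Request construction is kept separate from ranking so that each part can be checked without network access.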

