Qwen/Qwen3-Reranker-8B

The Qwen3 Embedding model series is the latest proprietary model series of the Qwen family, designed specifically for text embedding and ranking tasks. Built on the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embedding and reranking models in three sizes (0.6B, 4B, and 8B).

Public
$0.050 / Mtoken
32,768

HTTP/cURL API

You can use cURL or any other HTTP client to run inference:

curl -X POST \
    -d '{"queries": ["What is the capital of United States of America?"], "documents": ["The capital of USA is Washington DC."]}'  \
    -H "Authorization: bearer $DEEPINFRA_TOKEN"  \
    -H 'Content-Type: application/json'  \
    'https://api.deepinfra.com/v1/inference/Qwen/Qwen3-Reranker-8B'

The response will look similar to this:

{
  "scores": [
    0.1,
    0.2,
    0.3
  ],
  "input_tokens": 42,
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
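
The same request can be sent from any HTTP client. Below is a minimal Python sketch using only the standard library. The URL, the `queries` and `documents` fields, the `Authorization` header, and the `scores` response field come from the example above; the function names `build_request`, `rank_documents`, and `rerank` are hypothetical helpers introduced here for illustration. It also assumes that `scores[i]` is the score for `documents[i]`, which the placeholder response above does not confirm.

```python
import json
import urllib.request

API_URL = "https://api.deepinfra.com/v1/inference/Qwen/Qwen3-Reranker-8B"


def build_request(queries, documents, token):
    """Build the same POST request as the cURL example (hypothetical helper)."""
    payload = json.dumps({"queries": queries, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def rank_documents(documents, scores):
    """Pair each document with its score and sort by score, highest first.

    Assumption: scores[i] is the score for documents[i].
    """
    if len(documents) != len(scores):
        raise ValueError("expected exactly one score per document")
    return sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True)


def rerank(queries, documents, token):
    """Send the request and return (document, score) pairs, best match first."""
    request = build_request(queries, documents, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return rank_documents(documents, body["scores"])
```

To try it, read the token from the `DEEPINFRA_TOKEN` environment variable and call `rerank(["What is the capital of United States of America?"], ["The capital of USA is Washington DC."], token)`. Keeping `build_request` and `rank_documents` separate from the network call makes both easy to check without contacting the API.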
