deepseek-ai/Janus-Pro-7B

Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still using a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus-Pro surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus-Pro make it a strong candidate for next-generation unified multimodal models.

Public
$0.002 / img

HTTP/cURL API

You can use cURL or any other HTTP client to run inference:

curl -X POST \
    -H "Authorization: bearer $DEEPINFRA_TOKEN"  \
    -F image=@my_image.jpg  \
    -F 'question=Explain this image.'  \
    'https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B'

The response will look similar to:

{
  "response": "A photo of an astronaut riding a horse on Mars.",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
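As a minimal sketch of how a client might read this response, the Python snippet below parses the sample body shown above and pulls out the answer and a few status fields. The helper name `summarize` and the choice of fields to keep are illustrative assumptions, not part of the API.

```python
import json

# Sample body copied from the example response above.
SAMPLE = """{
  "response": "A photo of an astronaut riding a horse on Mars.",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}"""


def summarize(body: str) -> dict:
    """Hypothetical helper: extract the answer and basic status fields."""
    data = json.loads(body)
    # Tolerate a missing or null status object rather than raising.
    status = data.get("inference_status") or {}
    return {
        "answer": data.get("response"),
        "request_id": data.get("request_id"),
        "status": status.get("status"),
        "cost": status.get("cost"),
    }


summary = summarize(SAMPLE)
print(summary["answer"])
```

Using `.get` throughout keeps the helper from failing if a field is absent, which seems prudent given that the page describes the response only as "something similar to" the sample.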


Input fields

image (string)

Input image bytes for the visual question answering task


question (string)

Question about the provided image


seed (integer)

Random seed for reproducibility. If omitted, a random seed is used.

Range: 0 ≤ seed < 18446744073709552000 (approximately 2^64)


top_p (number)

Top-p sampling parameter. Higher values increase diversity.

Default value: 0.95

Range: 0 ≤ top_p ≤ 1


temperature (number)

Temperature parameter. Higher values increase randomness.

Default value: 0.1

Range: 0 ≤ temperature ≤ 1


webhook (file)

The webhook to call when inference is done. By default, the output is returned in the response to your inference request.
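The input fields above can be assembled and checked on the client side before a request is sent. The following Python sketch validates the documented ranges and builds the form fields. The names `build_fields` and `ask` are hypothetical, the seed limit is the value printed on this page, and `ask` assumes the third-party `requests` package is installed. It mirrors the cURL call (multipart upload of `image` plus form fields) and is not an official client.

```python
from typing import Optional

# Endpoint taken from the cURL example on this page.
API_URL = "https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B"
# Upper bound for seed as printed in the field reference (exclusive).
SEED_LIMIT = 18446744073709552000


def build_fields(question: str, seed: Optional[int] = None,
                 top_p: float = 0.95, temperature: float = 0.1) -> dict:
    """Hypothetical helper: validate inputs against the documented ranges."""
    if not question:
        raise ValueError("question must be a non-empty string")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0 <= top_p <= 1")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must satisfy 0 <= temperature <= 1")
    fields = {
        "question": question,
        "top_p": str(top_p),
        "temperature": str(temperature),
    }
    if seed is not None:
        if not 0 <= seed < SEED_LIMIT:
            raise ValueError("seed is outside the documented range")
        fields["seed"] = str(seed)
    return fields


def ask(image_path: str, token: str, **params) -> dict:
    """Hypothetical sender: posts the image and fields like the cURL example."""
    import requests  # third-party; imported lazily so validation works without it

    with open(image_path, "rb") as fh:
        resp = requests.post(
            API_URL,
            headers={"Authorization": f"bearer {token}"},
            files={"image": fh},
            data=build_fields(**params),
            timeout=60,
        )
    resp.raise_for_status()
    return resp.json()


fields = build_fields("Explain this image.", seed=42)
print(fields)
```

A call such as `ask("my_image.jpg", token, question="Explain this image.", temperature=0.2)` would then send the request. Leaving `seed` out of the form when it is not set lets the service pick a random seed, as the field description states.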
