Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still utilizing a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus-Pro surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus-Pro make it a strong candidate for next-generation unified multimodal models.
You can use cURL or any other HTTP client to run inferences:
curl -X POST \
-H "Authorization: bearer $DEEPINFRA_TOKEN" \
-F image=@my_image.jpg \
-F 'question=Explain this image.' \
'https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B'
which will give you back something similar to:
{
  "response": "A photo of an astronaut riding a horse on Mars.",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
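If you only want the generated text out of the JSON response, one quick way (assuming `jq` is installed) is to pipe the response through a filter. Here the sample response is fed in directly for illustration:

```shell
# Extract just the "response" field from the inference result.
# (Sample JSON piped in directly; in practice, pipe the curl output.)
echo '{"response": "A photo of an astronaut riding a horse on Mars."}' \
  | jq -r '.response'
# → A photo of an astronaut riding a horse on Mars.
```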
top_p (number)
Top-p sampling parameter; higher values increase diversity.
Default value: 0.95
Range: 0 ≤ top_p ≤ 1

temperature (number)
Temperature parameter; higher values increase randomness.
Default value: 0.1
Range: 0 ≤ temperature ≤ 1
webhook (file)
The webhook to call when inference is done; by default you will get the output in the response of your inference request.
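The parameters above can be supplied alongside the image and question. This sketch assumes they are accepted as additional multipart form fields in the same request; the values chosen here are arbitrary examples, not recommendations:

```shell
# Same inference request with sampling parameters overridden
# (top_p and temperature values are illustrative examples).
curl -X POST \
  -H "Authorization: bearer $DEEPINFRA_TOKEN" \
  -F image=@my_image.jpg \
  -F 'question=Explain this image.' \
  -F top_p=0.9 \
  -F temperature=0.5 \
  'https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B'
```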