deepseek-ai/
Janus-Pro is a novel autoregressive framework that unifies multimodal understanding and generation. It addresses the limitations of previous approaches by decoupling visual encoding into separate pathways, while still using a single, unified transformer architecture for processing. The decoupling not only alleviates the conflict between the visual encoder’s roles in understanding and generation, but also enhances the framework’s flexibility. Janus-Pro surpasses previous unified models and matches or exceeds the performance of task-specific models. The simplicity, high flexibility, and effectiveness of Janus-Pro make it a strong candidate for next-generation unified multimodal models.
You can use cURL or any other HTTP client to run inference:
curl -X POST \
    -H "Authorization: bearer $DEEPINFRA_TOKEN" \
    -F image=@my_image.jpg \
    -F 'question=Explain this image.' \
    'https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B'
which will give you back something similar to:
{
  "response": "A photo of an astronaut riding a horse on Mars.",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0
  }
}
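The same request can be made from Python. The following is a minimal sketch using only the standard library, not an official client: it mirrors the cURL call above (same URL, the same `image` and `question` form fields, the same `Authorization: bearer` header) and reads the fields shown in the sample response. The helper names `build_multipart`, `ask_about_image`, and `summarize` are hypothetical, and the multipart encoding is hand-rolled on the assumption that the endpoint accepts standard `multipart/form-data`, as the `-F` flags suggest.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

# Endpoint taken from the cURL example above.
API_URL = "https://api.deepinfra.com/v1/inference/deepseek-ai/Janus-Pro-7B"


def build_multipart(fields, files):
    """Encode text fields and files as multipart/form-data.

    fields: dict mapping field name -> str
    files:  dict mapping field name -> (filename, bytes)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, data) in files.items():
        # Fall back to a generic type when the extension is unknown.
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def ask_about_image(image_path, question, token):
    """POST an image and a question; return the decoded JSON reply."""
    with open(image_path, "rb") as fh:
        image_bytes = fh.read()
    body, content_type = build_multipart(
        {"question": question},
        {"image": (os.path.basename(image_path), image_bytes)},
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": content_type,
        },
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(reply):
    """Pick out the answer and the status fields seen in the sample reply."""
    status = reply.get("inference_status") or {}
    return {
        "answer": reply.get("response"),
        "status": status.get("status"),
        "runtime_ms": status.get("runtime_ms"),
        "cost": status.get("cost"),
    }
```

With `DEEPINFRA_TOKEN` set in the environment, `summarize(ask_about_image("my_image.jpg", "Explain this image.", os.environ["DEEPINFRA_TOKEN"]))["answer"]` would return the model's text answer. Using `.get()` throughout keeps the sketch tolerant of fields that are missing or `null`, as `request_id` is in the sample.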