You can use cURL or any other HTTP client to run inference:
curl -X POST \
-d '{"prompt": "A happy golden retriever bounds along the wet sand at the edge of the ocean, running toward the camera at a slight angle, tongue out and ears flapping, kicking up small splashes of water with each stride. Gentle waves roll in behind it under warm golden-hour sunlight, the wet sand and sea spray catching the light. Medium tracking shot, vivid natural colors, dynamic but smooth motion, shallow depth of field, cinematic, sharp focus, shot on a DSLR.", "seconds": 5, "resolution": "480p", "orientation": "landscape"}' \
-H "Authorization: bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/FastVideo/FastWan-QAD-FP8-1.3B'
This returns a response similar to:
{
  "video_url": "/model/inference/pyramid_sample.mp4",
  "seed": "12345",
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0,
    "output_length": 0
  }
}
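As an illustration of the "any other HTTP client" route, here is a minimal Python sketch that uses only the standard library. It sends the same endpoint, headers, and JSON fields as the cURL command above and reads the fields shown in the sample response. The helper names (`build_request`, `parse_response`, `run_inference`) and the 300-second timeout are assumptions made for this example and are not part of the DeepInfra API. The sketch also assumes the response arrives in a single JSON body, as in the sample.

```python
import json
import urllib.request

# Endpoint taken from the cURL example above.
API_URL = "https://api.deepinfra.com/v1/inference/FastVideo/FastWan-QAD-FP8-1.3B"


def build_request(prompt, token, seconds=5, resolution="480p", orientation="landscape"):
    """Build the same POST request as the cURL example, without sending it.

    Hypothetical helper; the defaults mirror the values in the cURL payload.
    """
    payload = {
        "prompt": prompt,
        "seconds": seconds,
        "resolution": resolution,
        "orientation": orientation,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_response(body):
    """Extract (video_url, status, cost) from a response body like the sample above."""
    data = json.loads(body)
    status = data["inference_status"]
    return data["video_url"], status["status"], status["cost"]


def run_inference(prompt, token, timeout=300):
    """Send the request and return the parsed fields.

    The timeout value is an assumption; adjust it to your needs.
    """
    request = build_request(prompt, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read().decode("utf-8"))
```

To try it, read the token from the same environment variable the cURL command uses, for example `run_inference("A happy golden retriever on a beach", os.environ["DEEPINFRA_TOKEN"])` after `import os`. Keeping request construction separate from sending makes it possible to inspect the payload before any network call is made.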