ACE-Step/
$0.001 / second of audio
ACE-Step v1.5 is an open-source music foundation model that turns a text prompt into a complete song (vocals, lyrics, and instrumentation) at quality that rivals commercial tools. We run the high-quality XL checkpoint with its planning step ("thinking") on by default, so generations favor musical structure and coherence over raw speed.
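
As a rough budgeting aid, the listed rate of $0.001 per second of audio can be turned into a per-clip estimate. The sketch below assumes billing is linear in the duration of the generated audio, with no minimum charge or rounding; the page does not state this, and the function name is illustrative only.

```python
# Listed price from the model page: USD per second of generated audio.
PRICE_PER_SECOND_USD = 0.001


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the cost of one generation, assuming purely linear billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # Round to 6 decimals to hide floating-point noise in the product.
    return round(duration_seconds * PRICE_PER_SECOND_USD, 6)


print(estimate_cost_usd(180))  # a 3-minute song -> 0.18
```

The `cost` field under `inference_status` in the API response is the authoritative figure; this estimate is only for planning.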

You can use cURL or any other HTTP client to run inference:
curl -X POST \
-d '{"prompt": "a gentle, melodic acoustic ballad \u2014 soft fingerpicked guitar and warm piano, tender female vocals, calm and hopeful, welcoming the sunrise, around 70 bpm, C major", "lyrics": "[verse]\nThe moon clocks out, the stars go to bed\n[chorus]\nUp pops the sun, all gold and red", "response_format": "mp3"}' \
-H "Authorization: bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/ACE-Step/acestep-v15-xl-sft'
This returns a response similar to:
{
  "audio": null,
  "output_format": "mp3",
  "duration_seconds": 0,
  "seed": 0,
  "generated_lyrics": null,
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0,
    "output_length": 0
  }
}
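
The same request can be made from Python. The following is a minimal sketch using only the standard library; the endpoint, headers, and field names are taken from the cURL example and sample response above, while the helper names (`build_request`, `generate`, `summarize`) and the 600-second timeout are assumptions made for illustration. The sample response shows `audio` as `null`, so its encoding is not documented here and the sketch does not try to decode it.

```python
import json
import urllib.request

# Endpoint copied from the cURL example.
URL = "https://api.deepinfra.com/v1/inference/ACE-Step/acestep-v15-xl-sft"


def build_request(prompt: str, lyrics: str, token: str,
                  response_format: str = "mp3") -> urllib.request.Request:
    """Build the POST request with the same body and headers as the cURL call."""
    body = json.dumps({
        "prompt": prompt,
        "lyrics": lyrics,
        "response_format": response_format,
    }).encode("utf-8")
    headers = {
        "Authorization": f"bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(URL, data=body, headers=headers, method="POST")


def generate(prompt: str, lyrics: str, token: str) -> dict:
    """Send the request and return the parsed JSON response."""
    req = build_request(prompt, lyrics, token)
    # Timeout is an assumed value; generation may take a while.
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.load(resp)


def summarize(result: dict) -> dict:
    """Pick out a few fields shown in the sample response."""
    status = result.get("inference_status") or {}
    return {
        "duration_seconds": result.get("duration_seconds"),
        "seed": result.get("seed"),
        "status": status.get("status"),
        "cost": status.get("cost"),
    }


# Usage (requires a valid token in the DEEPINFRA_TOKEN environment variable):
#   import os
#   result = generate("a gentle acoustic ballad, around 70 bpm",
#                     "[verse]\nThe moon clocks out, the stars go to bed",
#                     os.environ["DEEPINFRA_TOKEN"])
#   print(summarize(result))
```

Keeping request construction separate from the network call makes it possible to inspect or log the payload before anything is billed.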
© 2026 DeepInfra. All rights reserved.