

ACE-Step/acestep-v15-xl-sft

$0.001 / second of audio


ACE-Step v1.5 is a powerful open-source music foundation model that turns a text prompt into a complete song — vocals, lyrics, and instrumentation — at quality that rivals commercial tools. We run the high-quality XL checkpoint with its planning step ("thinking") on by default, so generations favor musical structure and coherence over raw speed.


HTTP/cURL API

You can use cURL or any other HTTP client to run inference:

curl -X POST \
    -d '{"prompt": "a gentle, melodic acoustic ballad \u2014 soft fingerpicked guitar and warm piano, tender female vocals, calm and hopeful, welcoming the sunrise, around 70 bpm, C major", "lyrics": "[verse]\nThe moon clocks out, the stars go to bed\n[chorus]\nUp pops the sun, all gold and red", "response_format": "mp3"}'  \
    -H "Authorization: bearer $DEEPINFRA_TOKEN"  \
    -H 'Content-Type: application/json'  \
    'https://api.deepinfra.com/v1/inference/ACE-Step/acestep-v15-xl-sft'
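
The same request can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the endpoint, headers, and JSON fields mirror the cURL call above, while the function names `build_request` and `generate` and the 600-second timeout are illustrative assumptions.

```python
import json
import urllib.request

# Endpoint copied from the cURL example above.
API_URL = "https://api.deepinfra.com/v1/inference/ACE-Step/acestep-v15-xl-sft"


def build_request(prompt, lyrics, token, response_format="mp3"):
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "prompt": prompt,
        "lyrics": lyrics,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt, lyrics, token, timeout=600):
    """Send the request and return the decoded JSON response.

    The timeout is an assumed value, not a documented limit.
    """
    request = build_request(prompt, lyrics, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To call it, read the token from the environment, for example `generate("a gentle acoustic ballad", "[verse]\n...", os.environ["DEEPINFRA_TOKEN"])` after `import os`. Keeping request construction separate from sending makes the payload easy to inspect without spending credits.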

The response has the following shape (the values shown here are placeholders):

{
  "audio": null,
  "output_format": "mp3",
  "duration_seconds": 0,
  "seed": 0,
  "generated_lyrics": null,
  "request_id": null,
  "inference_status": {
    "status": "unknown",
    "runtime_ms": 0,
    "cost": 0.0,
    "tokens_generated": 0,
    "tokens_input": 0,
    "output_length": 0
  }
}
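
As a rough illustration of reading this response, the sketch below pulls out the status and cost fields and compares the reported cost with an estimate from the listed price of $0.001 per second of audio. The helper name `summarize` is hypothetical, and the assumption that billing equals `duration_seconds` times the listed price is not confirmed by the page. The encoding of the `audio` field is not shown in the example, so the sketch only checks whether it is present.

```python
# Listed price on this page: $0.001 per second of generated audio.
PRICE_PER_AUDIO_SECOND = 0.001


def summarize(response):
    """Return a small summary dict from a decoded JSON response."""
    status = response.get("inference_status") or {}
    duration = response.get("duration_seconds") or 0
    return {
        "status": status.get("status"),
        "duration_seconds": duration,
        "seed": response.get("seed"),
        "reported_cost": status.get("cost"),
        # Assumption: cost scales linearly with audio duration.
        "estimated_cost": round(duration * PRICE_PER_AUDIO_SECOND, 6),
        # The audio encoding is not documented here, so only test presence.
        "has_audio": response.get("audio") is not None,
    }
```

Under that pricing assumption, a 120-second generation would be estimated at $0.12.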


Input fields
