hexgrad/
Kokoro is an open-weight text-to-speech (TTS) model with 82 million parameters. Despite its lightweight architecture, it delivers quality comparable to larger models while being significantly faster and more cost-efficient. With Apache-licensed weights, Kokoro can be deployed anywhere from production environments to personal projects.
Text to convert to speech
Settings
Select the desired format for the speech output. Supported formats include mp3, opus, flac, wav, and pcm.
Select the desired voice for the speech output. You can select multiple voices to mix them.
Speed of the speech (Default: empty, 0.25 ≤ speed ≤ 4)
Whether to stream the output
Whether to return timestamps
Sample rate for the output audio. (Default: empty)
Minimum number of tokens for the output. (Default: empty)
Maximum number of tokens for the output. (Default: empty)
Absolute maximum number of tokens for the output. (Default: empty)
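The settings above could be assembled into a request body along the following lines. This is only a sketch: the page lists the settings but not their wire names, so every field name here (`text`, `output_format`, `voice`, `speed`, `stream`, `return_timestamps`, `sample_rate`), the helper `build_tts_request`, and the voice identifiers are hypothetical. Only the supported formats and the speed range come from the descriptions above. The token-limit settings are left out because their exact meaning is not specified.

```python
from typing import Any, Dict, Optional, Sequence

# Taken from the settings descriptions above.
SUPPORTED_FORMATS = ("mp3", "opus", "flac", "wav", "pcm")
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_tts_request(
    text: str,
    output_format: str = "wav",  # arbitrary default chosen for this sketch
    voices: Optional[Sequence[str]] = None,
    speed: Optional[float] = None,
    stream: bool = False,
    return_timestamps: bool = False,
    sample_rate: Optional[int] = None,
) -> Dict[str, Any]:
    """Validate the settings and return a request body (hypothetical field names)."""
    if not text:
        raise ValueError("text must be non-empty")
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format!r}")
    if speed is not None and not (MIN_SPEED <= speed <= MAX_SPEED):
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")

    payload: Dict[str, Any] = {
        "text": text,
        "output_format": output_format,
        "stream": stream,
        "return_timestamps": return_timestamps,
    }
    # Settings whose default is "empty" are omitted unless explicitly set.
    if voices:
        payload["voice"] = list(voices)  # several entries request a voice mix
    if speed is not None:
        payload["speed"] = speed
    if sample_rate is not None:
        payload["sample_rate"] = sample_rate
    return payload


# Placeholder voice identifiers, not real voice names.
request_body = build_tts_request(
    "Hello from Kokoro.",
    output_format="mp3",
    voices=["voice_a", "voice_b"],
    speed=1.25,
)
```

Validating the format and speed on the client side catches out-of-range values before a request is sent; the actual endpoint and authentication scheme are not described on this page and would need to be taken from the provider's API documentation.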