Qwen/
$20.00 / 1M characters
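As a rough illustration of the per-character price above, the following sketch estimates the cost of synthesizing a piece of text. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes that every character (including whitespace and punctuation) is billed and that a "character" matches what Python's `len()` counts; the page does not state either detail.

```python
# Listed price from this page: $20.00 per 1M input characters.
PRICE_PER_MILLION_CHARS_USD = 20.00


def estimate_cost_usd(text: str) -> float:
    """Estimate synthesis cost for `text` at the listed rate.

    Assumption: every character counts toward billing, and characters
    are counted the way len() counts them (Unicode code points).
    """
    return len(text) * PRICE_PER_MILLION_CHARS_USD / 1_000_000


if __name__ == "__main__":
    sample = "Hello, world! " * 100  # 1,400 characters
    print(f"{len(sample)} characters cost about ${estimate_cost_usd(sample):.4f}")
```

At this rate, a 5,000-character script would come to roughly ten cents under the same assumptions.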
Qwen3-TTS-VoiceDesign is a voice design variant of Qwen3-TTS by Alibaba's Qwen team. Instead of selecting from preset voices, you describe the voice you want in natural language, and the model generates speech in that voice.

Key capabilities:
- Natural language voice control: describe any voice with free text (e.g. "a deep male voice with a calm, authoritative presence", "a young cheerful female with a warm and friendly tone")
- 10 languages: English, Chinese, Japanese, Korean, German, French, Russian, Spanish, Italian, Portuguese
- Streaming support: real-time PCM streaming
- Multiple output formats: WAV, MP3, FLAC, PCM

Built on the same 1.7B-parameter architecture as Qwen3-TTS, it uses discrete multi-codebook language modeling and a custom 12 Hz acoustic tokenizer for high-quality end-to-end speech synthesis.
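Raw PCM, as delivered by the streaming mode mentioned above, carries no header, so a player cannot tell the sample rate or bit depth on its own. The sketch below shows one way to collect streamed chunks into a playable WAV file using only Python's standard library. The helper name `pcm_chunks_to_wav` is hypothetical, and the defaults (24 kHz, mono, 16-bit little-endian samples) are assumptions, not values stated on this page; check the actual stream parameters before relying on them.

```python
import io
import wave
from typing import Iterable


def pcm_chunks_to_wav(
    chunks: Iterable[bytes],
    sample_rate: int = 24000,  # assumed, not confirmed by the source
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes each)
) -> bytes:
    """Wrap raw PCM chunks in a WAV container and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        for chunk in chunks:
            # Chunks are appended in arrival order; the header sizes are
            # patched when the writer is closed.
            wav.writeframes(chunk)
    return buf.getvalue()


if __name__ == "__main__":
    # Stand-in for a network stream: two chunks of silent 16-bit samples.
    fake_stream = [b"\x00\x00" * 100, b"\x00\x00" * 50]
    wav_bytes = pcm_chunks_to_wav(fake_stream)
    with open("output.wav", "wb") as f:
        f.write(wav_bytes)
```

Buffering everything before writing the header is the simplest approach; for long streams, writing directly to a file opened with `wave.open(path, "wb")` avoids holding the whole recording in memory.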

2026-03-06T23:07:14+00:00
© 2026 Deep Infra. All rights reserved.