
inworld-ai/
$35.00 / 1M characters
Realtime TTS 2.0 is a low-latency text-to-speech model with natural language steering: you control tone and emotion directly in the prompt by prefixing the text with a bracketed instruction (e.g., “[be happy and upbeat] Hello!”). It supports multiple languages and cross-lingual voices, so the same voice can speak consistently across languages. This is an early access preview ahead of the full launch, and voice quality and steering are still being improved.
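As a rough illustration of the steering format and the per-character price, the sketch below builds a steered prompt and estimates what it would cost at $35.00 per 1M characters. The helper names `steer` and `estimate_cost_usd` are hypothetical and not part of any DeepInfra or Inworld SDK. It also assumes the bracketed instruction counts toward billed characters, which this page does not confirm.

```python
# Minimal sketch: hypothetical helpers, not an official client.
PRICE_USD_PER_MILLION_CHARS = 35.00  # price listed on this page


def steer(text: str, direction: str) -> str:
    """Prefix the text with a bracketed steering instruction."""
    return f"[{direction}] {text}"


def estimate_cost_usd(prompt: str) -> float:
    """Estimate cost from character count.

    Assumption: every character of the prompt, including the
    bracketed steering instruction, is billed.
    """
    return len(prompt) * PRICE_USD_PER_MILLION_CHARS / 1_000_000


if __name__ == "__main__":
    prompt = steer("Hello!", "be happy and upbeat")
    print(prompt)  # [be happy and upbeat] Hello!
    print(f"Estimated cost: ${estimate_cost_usd(prompt):.6f}")
```

If steering text turns out not to be billed, pass only the spoken text to `estimate_cost_usd` instead of the full prompt.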

© 2026 DeepInfra. All rights reserved.