


inworld-ai/realtime-tts-2

Partner

$35.00 / 1M characters

Realtime TTS 2.0 is a low-latency text-to-speech model with natural language steering, allowing you to control tone and emotion directly in the prompt (e.g., “[be happy and upbeat] Hello!”). It supports cross-lingual voices and multiple languages, enabling the same voice to speak consistently across different languages. This is an early access preview ahead of full launch, with ongoing improvements to voice quality and steering.
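As a rough illustration of the steering syntax and the listed price, the sketch below builds a steered prompt in the bracketed form shown above and estimates what it would cost at $35.00 per 1M characters. The helper names `steer` and `estimate_cost_usd` are hypothetical and not part of any official SDK, and the sketch assumes every character of the input, including the bracketed instruction, counts toward billing, which this page does not confirm. It makes no API call.

```python
from decimal import Decimal

# Listed price on this page: $35.00 per 1M characters.
PRICE_PER_MILLION_CHARS = Decimal("35.00")


def steer(text: str, instruction: str) -> str:
    """Prefix text with a bracketed steering instruction (hypothetical helper)."""
    instruction = instruction.strip()
    if not instruction:
        # No instruction given: send the text unchanged.
        return text
    if "[" in instruction or "]" in instruction:
        raise ValueError("instruction must not contain square brackets")
    return f"[{instruction}] {text}"


def estimate_cost_usd(text: str) -> Decimal:
    """Estimate cost, ASSUMING all characters (steering tag included) are billed."""
    return PRICE_PER_MILLION_CHARS * len(text) / Decimal(1_000_000)


prompt = steer("Hello!", "be happy and upbeat")
print(prompt)                     # the steered text to send to the model
print(estimate_cost_usd(prompt))  # estimated USD cost under the assumption above
```

If steering instructions turn out not to be billed, pass only the spoken text to `estimate_cost_usd` instead.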

Public

1YIJAoTc

2026-05-04T22:17:44+00:00