DeepInfra raises $107M Series B to scale the inference cloud — read the announcement
nvidia/
$0.00020
/ minute
Nemotron 3.5 ASR Streaming Multilingual is an open 0.6B-parameter prompt-conditioned cache-aware FastConformer-RNNT model, engineered for low-latency streaming transcription across 40+ languages. It powers real-time captioning, voice agents, and multilingual transcription pipelines—replacing separate per-language Whisper deployments with a single inference pass.

© 2026 DeepInfra. All rights reserved.