To use DeepInfra's services, you'll need an API key. You can get one by signing up on our platform.
Your API key will be used to authenticate all your requests to the DeepInfra API.
Now let's deploy a model to production and use it for inference.
You can deploy models through the web dashboard or through our API. A model is also deployed automatically the first time you send it an inference request.
Once a model is deployed on DeepInfra, you can call it through our REST API. For example, this curl command sends an audio file to the openai/whisper-small model:
curl -X POST \
  -F "audio=@/path/to/audio.mp3" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  'https://api.deepinfra.com/v1/inference/openai/whisper-small'
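The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names `build_inference_request` and `run_inference` are hypothetical, and the response is returned as raw bytes because this page does not describe the response format. The URL, the `audio` form field, and the `Authorization: Bearer` header are taken from the curl command above.

```python
import os
import uuid
import urllib.request

# Base URL taken from the curl example above.
API_BASE = "https://api.deepinfra.com/v1/inference"


def build_inference_request(model, audio_bytes, filename, api_key):
    """Build (but do not send) a multipart POST equivalent to the curl call.

    Hypothetical helper for illustration; not part of any DeepInfra SDK.
    """
    boundary = uuid.uuid4().hex
    # Hand-built multipart/form-data body with a single file field named
    # "audio", mirroring curl's -F "audio=@/path/to/audio.mp3".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (
            'Content-Disposition: form-data; name="audio"; '
            f'filename="{os.path.basename(filename)}"\r\n'
        ).encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        f"{API_BASE}/{model}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def run_inference(model, audio_path, api_key):
    """Send the request and return the raw response body as bytes."""
    with open(audio_path, "rb") as f:
        request = build_inference_request(model, f.read(), audio_path, api_key)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

To use it, call `run_inference("openai/whisper-small", "/path/to/audio.mp3", api_key)` with your own key. Reading the key from an environment variable (for example one named `DEEPINFRA_API_KEY`, a name assumed here) keeps it out of source code.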
© 2026 DeepInfra. All rights reserved.