Getting Started
Published on 2023.03.02 by Nikola Borisov

Getting an API Key

To use DeepInfra's services, you'll need an API key. You can get one by signing up on our platform.

  1. Sign up or log in to your DeepInfra account at deepinfra.com
  2. Navigate to the Dashboard and select API Keys
  3. Create a new API key and save it securely

Your API key will be used to authenticate all your requests to the DeepInfra API.
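One way to keep the key out of your scripts is to store it in an environment variable and build the header from it. The sketch below is only an illustration: the variable name DEEPINFRA_API_KEY and the helper deepinfra_auth_header are hypothetical names chosen here, not something DeepInfra requires. The header format matches the one used in the curl example in the Inference section.

```shell
# Hypothetical variable name (an assumption, not mandated by DeepInfra).
# If it is not already set, fall back to the placeholder so the mistake is obvious.
export DEEPINFRA_API_KEY="${DEEPINFRA_API_KEY:-YOUR_API_KEY}"

# Hypothetical helper: prints the Authorization header in the
# "Bearer <key>" form used by the curl example in this post.
deepinfra_auth_header() {
  printf 'Authorization: Bearer %s' "$DEEPINFRA_API_KEY"
}
```

You could then pass the result to curl with -H "$(deepinfra_auth_header)" instead of pasting the key into each command.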

Deployment

Now let's actually deploy some models to production and use them for inference. It's really easy.

You can deploy models through the web dashboard or by using our API. Models are automatically deployed when you first make an inference request.

Inference

Once a model is deployed on DeepInfra, you can use it with our REST API. Here's how to use it with curl:

curl -X POST \
  -F "audio=@/path/to/audio.mp3" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  'https://api.deepinfra.com/v1/inference/openai/whisper-small'
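If you call the endpoint often, the command above can be wrapped in a small shell function. This is a rough sketch under a few assumptions: the function names deepinfra_url and transcribe and the DEEPINFRA_API_KEY variable are hypothetical, and the idea that other models live under the same /v1/inference/ prefix is inferred from the single URL shown above rather than confirmed here.

```shell
# Hypothetical helper: builds the inference URL for a model name.
# Assumes other models follow the same pattern as the whisper-small URL above.
deepinfra_url() {
  printf 'https://api.deepinfra.com/v1/inference/%s' "$1"
}

# Hypothetical wrapper around the curl command from this post.
# Usage: transcribe /path/to/audio.mp3 [model]
transcribe() {
  audio_file="$1"
  model="${2:-openai/whisper-small}"

  # Refuse to send a request without a key (exit status 2).
  if [ -z "${DEEPINFRA_API_KEY:-}" ]; then
    echo "DEEPINFRA_API_KEY is not set" >&2
    return 2
  fi

  # Catch a wrong path before curl does (exit status 1).
  if [ ! -f "$audio_file" ]; then
    echo "audio file not found: $audio_file" >&2
    return 1
  fi

  # --fail makes curl return a non-zero status on HTTP errors;
  # -sS hides the progress meter but still prints error messages.
  curl -sS --fail -X POST \
    -F "audio=@${audio_file}" \
    -H "Authorization: Bearer ${DEEPINFRA_API_KEY}" \
    "$(deepinfra_url "$model")"
}
```

With the key exported, transcribe /path/to/audio.mp3 sends the same request as the curl command above. The early checks are a design choice so that a missing key or a wrong file path fails locally instead of producing a confusing HTTP error.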