We use essential cookies to make our site work. With your consent, we may also use non-essential cookies to improve user experience and analyze website traffic…

DeepInfra raises $107M Series B to scale the inference cloud — read the announcement

Getting Started
Published on 2023.03.02 by Nikola Borisov
Getting Started

Getting an API Key

To use DeepInfra's services, you'll need an API key. You can get one by signing up on our platform.

  1. Sign up or log in to your DeepInfra account at deepinfra.com
  2. Navigate to the Dashboard and select API Keys
  3. Create a new API key and save it securely

Your API key will be used to authenticate all your requests to the DeepInfra API.

Deployment

Now lets actually deploy some models to production and use them for inference. It is really easy.

You can deploy models through the web dashboard or by using our API. Models are automatically deployed when you first make an inference request.

Inference

Once a model is deployed on DeepInfra, you can use it with our REST API. Here's how to use it with curl:

curl -X POST \
  -F "audio=@/path/to/audio.mp3" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  'https://api.deepinfra.com/v1/inference/openai/whisper-small'
copy
Related articles
Gemma 4 Pricing, Benchmarks & Real-World Cost AnalysisGemma 4 Pricing, Benchmarks & Real-World Cost Analysis<p>Gemma 4 puts a serious open-weight reasoning model into a genuinely competitive provider market. The same Gemma 4 26B A4B model is available across seven API providers, with blended pricing ranging from $0.10 to $0.70 per 1M tokens — real variation that changes production economics. Released April 3, 2026 by Google DeepMind under Apache 2.0, [&hellip;]</p>
Unleashing the Potential of AI for Exceptional Gaming ExperiencesUnleashing the Potential of AI for Exceptional Gaming ExperiencesGaming companies are constantly in search of ways to enhance player experiences and achieve extraordinary outcomes. Recent research indicates that investments in player experience (PX) can result in substantial returns on investment (ROI). By prioritizing PX and harnessing the capabilities of AI...
Llama 3.1 70B Instruct API from DeepInfra: Snappy Starts, Fair Pricing, Production Fit - Deep InfraLlama 3.1 70B Instruct API from DeepInfra: Snappy Starts, Fair Pricing, Production Fit - Deep Infra<p>Llama 3.1 70B Instruct is Meta’s widely-used, instruction-tuned model for high-quality dialogue and tool use. With a ~131K-token context window, it can read long prompts and multi-file inputs—great for agents, RAG, and IDE assistants. But how “good” it feels in practice depends just as much on the inference provider as on the model: infra, batching, [&hellip;]</p>