Getting Started
Published on 2023.03.02 by Nikola Borisov
Getting an API Key

To use DeepInfra's services, you'll need an API key. You can get one by signing up on our platform.

  1. Sign up or log in to your DeepInfra account at deepinfra.com
  2. Navigate to the Dashboard and select API Keys
  3. Create a new API key and save it securely

Every request to the DeepInfra API is authenticated with this key, which is sent as a bearer token in the Authorization header.
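
As a minimal sketch of keeping the key out of source code, the helper below reads it from an environment variable and builds the Authorization header. The variable name DEEPINFRA_API_KEY and the function name auth_headers are assumptions made for this illustration, not names defined by DeepInfra.

```python
import os


def auth_headers(api_key=None):
    """Return the Authorization header for a DeepInfra request.

    DEEPINFRA_API_KEY is a hypothetical environment variable name
    chosen for this sketch; use whatever name fits your setup.
    """
    key = api_key or os.environ.get("DEEPINFRA_API_KEY")
    if not key:
        raise RuntimeError("Set DEEPINFRA_API_KEY or pass api_key explicitly")
    # The bearer scheme matches the header used in the curl example.
    return {"Authorization": f"Bearer {key}"}
```

Calling auth_headers() with no argument picks the key up from the environment, so the key never has to appear in a script or a shell history.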

Deployment

Now let's deploy a model to production and use it for inference.

You can deploy models through the web dashboard or by using our API. Models are automatically deployed when you first make an inference request.

Inference

Once a model is deployed on DeepInfra, you can call it through our REST API. The following curl command uploads an audio file to the openai/whisper-small model:

curl -X POST \
  -F "audio=@/path/to/audio.mp3" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  'https://api.deepinfra.com/v1/inference/openai/whisper-small'
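
The same request can be sketched in Python using only the standard library. This is a rough equivalent of the curl command under stated assumptions: the helper names build_request and transcribe are hypothetical, the multipart body is assembled by hand to mirror curl's -F flag, and the response is returned as raw text because its format is not described here.

```python
import os
import urllib.request
import uuid

API_URL = "https://api.deepinfra.com/v1/inference/openai/whisper-small"


def build_request(audio_bytes, filename, api_key):
    """Build (without sending) a multipart/form-data POST with one 'audio' file field."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


def transcribe(path, api_key):
    """Upload the file at `path` and return the raw response body as text."""
    with open(path, "rb") as f:
        req = build_request(f.read(), os.path.basename(path), api_key)
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

A call such as transcribe("/path/to/audio.mp3", "YOUR_API_KEY") performs the upload. Splitting request construction from sending makes it possible to inspect the headers and body without touching the network.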