Documentation

Introduction

DeepInfra lets you run the latest machine learning models with ease. We handle the heavy lifting of running, scaling, and monitoring the models, so you can focus on your application and integrate the models through simple REST API calls. Running models on DeepInfra is much simpler than running them on your own infrastructure, and it can cost less because you pay only for the time your request is running. The service works like lambda functions for machine learning inference. More than 100 of the top models are available, and you can also deploy your own custom models on DeepInfra.

Check out the Getting Started guide for a quick dive.

Try out the Dashboard to deploy a model in seconds.

You can find your authorization tokens and monitor your deployments, usage, and logs in the Dashboard.

We offer LangChain integration for supported LLMs.

For announcements and tutorials, please check our Blog.
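To make the "simple REST API calls" mentioned in the introduction concrete, here is a minimal sketch of a single inference request using only the Python standard library. The base URL, the `bearer` authorization scheme, and the `input` payload field are assumptions made for illustration, and the helper names `build_inference_request` and `run_inference` are hypothetical; check the API reference for the exact endpoint and the fields each model expects.

```python
import json
import urllib.request

# Assumed base URL for illustration; confirm the real endpoint in the API reference.
API_BASE = "https://api.deepinfra.com/v1/inference"


def build_inference_request(model: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for one inference call."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/{model}",
        data=body,
        method="POST",
        headers={
            # Assumed scheme: the authorization token from the Dashboard.
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run_inference(model: str, token: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_inference_request(model, token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call might then look like `run_inference("openchat/openchat_3.5", token, {"input": "Hello"})`, where `token` is an authorization token copied from the Dashboard and the payload shape is a placeholder. Keeping request construction separate from sending makes it easy to inspect the URL and headers before any billable request is made.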

Latest Models

Gryphe/MythoMax-L2-13b

Phind/Phind-CodeLlama-34B-v2

openchat/openchat_3.5

openai/whisper-tiny

bigcode/starcoder2-15b

Featured Models

lizpreciatior/lzlv_70b_fp16_hf

openai/whisper-large

llava-hf/llava-1.5-7b-hf

mistralai/Mixtral-8x22B-Instruct-v0.1

microsoft/WizardLM-2-8x22B

stability-ai/sdxl