
Documentation

Deploying a LoRA adapter model

How to deploy a LoRA adapter model

Navigate to the dashboard https://deepinfra.com/dash
Click on the 'New Deployment' button
Click on the 'LoRA Model' tab
Fill in the form:
- LoRA model name: the name used to reference the deployment
- Hugging Face Model Name: the Hugging Face repository name of the LoRA adapter model
- Hugging Face Token: (optional) a Hugging Face token, required only if the LoRA adapter model is private

To use a LoRA adapter model, you need:

A LoRA adapter model hosted on Hugging Face
A base model that supports LoRA adapters at DeepInfra (the list of supported base models is shown in the LoRA upload form)
A Hugging Face token, if the LoRA adapter model is private
A DeepInfra account and a DeepInfra API key

Example flow:

Prerequisites:

The LoRA adapter model is askardeepinfra/llama-3.1-8B-rank-32-example-lora, hosted on Hugging Face
The base model is meta-llama/Meta-Llama-3.1-8B-Instruct, which is supported at DeepInfra
The LoRA adapter model is public, so no Hugging Face token is needed
The DeepInfra API key is generated on the https://deepinfra.com/dash/api_keys page

Then deploy the model:

Navigate to the dashboard https://deepinfra.com/dash
Click on the 'New Deployment' button
Click on the 'LoRA Model' tab
Fill in the form:
- LoRA model name: asdf/lora-example
- Hugging Face Model Name: askardeepinfra/llama-3.1-8B-rank-32-example-lora
Click on the 'Upload' button

The deployment should now appear on the https://deepinfra.com/dash/deployments page under the name asdf/lora-example. Its state is initially "Initializing"; after a while it should change to "Deploying" and then to "Running". Once the state is "Running", you can use the model.
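If you want to script the wait instead of watching the dashboard, the state progression above can be sketched as a small polling loop. This is only an illustration: this page does not describe an API for reading deployment state, so get_state is a hypothetical callable that you would have to supply, and wait_until_running, timeout_s and poll_s are made-up names. Treating any state other than the three named above as an error is also an assumption.

```python
import time


def wait_until_running(get_state, timeout_s=600.0, poll_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns "Running".

    get_state is a hypothetical callable returning the current deployment
    state as a string; how that state is obtained is not covered here.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "Running":
            return state
        # Assumption: only the states named in the text are expected.
        if state not in ("Initializing", "Deploying"):
            raise RuntimeError(f"unexpected deployment state: {state!r}")
        if clock() >= deadline:
            raise TimeoutError(f"deployment still {state!r} after {timeout_s}s")
        sleep(poll_s)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting.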

Navigate to https://deepinfra.com/asdf/lora-example, where you can find all the information about the model, including:

Pricing
Precision
Demo page, where you can test the model
API reference, where you can find information on how to run inference on the model using the REST API

Here is an example of inference with curl:

curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_API_KEY" \
  -d '{
      "model": "asdf/lora-example",
      "messages": [
        {
          "role": "user",
          "content": "Hello!"
        }
      ]
    }'
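The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: build_chat_request and chat are made-up helper names, the API key is assumed to be in the DEEPINFRA_API_KEY environment variable as in the curl example, and the response is assumed to be a JSON body.

```python
import json
import os
import urllib.request

API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"


def build_chat_request(model, prompt, api_key):
    """Build (without sending) the same POST request as the curl example."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(model, prompt, api_key=None):
    """Send the request and return the decoded JSON response."""
    # Assumption: the key lives in DEEPINFRA_API_KEY when not passed in.
    api_key = api_key or os.environ["DEEPINFRA_API_KEY"]
    request = build_chat_request(model, prompt, api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling chat("asdf/lora-example", "Hello!") would send the request for the deployment created above. If the response follows the usual OpenAI chat completions shape, which is an assumption here, the reply text would be under choices[0]["message"]["content"].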
