How to deploy a LoRA adapter model
- Navigate to the dashboard https://deepinfra.com/dash
- Click on the 'New Deployment' button
- Click on the 'LoRA Model' tab
- Fill in the form:
  - LoRA model name: the name used to reference the deployment
  - Hugging Face Model Name: the Hugging Face repository name of the LoRA adapter model
  - Hugging Face Token: (optional) needed only if the LoRA adapter model is private
To use a LoRA adapter model, you need:
- A LoRA adapter model hosted on Hugging Face
- A base model that DeepInfra supports for LoRA adapters (the list of supported base models is shown in the LoRA upload form)
- A Hugging Face token, if the LoRA adapter model is private
- A DeepInfra account and a DeepInfra API key
Example flow:
Prerequisites:
- The LoRA adapter model is askardeepinfra/llama-3.1-8B-rank-32-example-lora, hosted on Hugging Face
- The base model is meta-llama/Meta-Llama-3.1-8B-Instruct, which DeepInfra supports
- The LoRA adapter model is public, so no Hugging Face token is needed
- A DeepInfra API key, generated on the https://deepinfra.com/dash/api_keys page
Next, deploy the model:
- Navigate to the dashboard https://deepinfra.com/dash
- Click on the 'New Deployment' button
- Click on the 'LoRA Model' tab
- Fill in the form:
  - LoRA model name: asdf/lora-example
  - Hugging Face Model Name: askardeepinfra/llama-3.1-8B-rank-32-example-lora
- Click on the 'Upload' button
The deployment should now appear on the https://deepinfra.com/dash/deployments page under the name asdf/lora-example.
Initially the state is "Initializing". After a while it should change to "Deploying" and then to "Running". Once the state is "Running", you can use the model.
Navigate to https://deepinfra.com/asdf/lora-example, where you can find all the information about the model, including:
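If you want to script the wait, the state progression above can be sketched as a small polling loop. This page does not name an API for reading the deployment state, so `get_state` is a hypothetical callable you would have to supply yourself. The state names are the ones listed above, and any other value is treated as unexpected, which is an assumption of this sketch.

```python
import time
from typing import Callable

# States described on this page; anything else is treated as unexpected.
KNOWN_STATES = ("Initializing", "Deploying", "Running")


def wait_until_running(
    get_state: Callable[[], str],  # hypothetical: returns the current state string
    poll_seconds: float = 10.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_state() until it reports "Running" or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_state()
        if state == "Running":
            return
        if state not in KNOWN_STATES:
            raise RuntimeError(f"unexpected deployment state: {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deployment still {state!r} after {timeout_seconds}s")
        sleep(poll_seconds)
```

The `sleep` parameter exists only so the loop can be exercised without real delays.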
- Pricing
- Precision
- Demo page, where you can test the model
- API reference, which describes how to run inference on the model through the REST API
Here is an example of inference with curl (it expects your API key in the DEEPINFRA_API_KEY environment variable):
curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_API_KEY" \
  -d '{
    "model": "asdf/lora-example",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
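The same request can be sent from Python. The following is a minimal sketch using only the standard library. The URL, headers, and payload mirror the curl command above. The helper names are illustrative, and `extract_reply` assumes the response follows the OpenAI chat completions shape (`choices[0].message.content`), which this page does not confirm.

```python
import json
import urllib.request

API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"


def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build the POST request equivalent to the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response_body: dict) -> str:
    """Pull the assistant text out of the response (assumed OpenAI-style shape)."""
    return response_body["choices"][0]["message"]["content"]


def chat(api_key: str, model: str, user_message: str) -> str:
    """Send the request and return the assistant's reply text."""
    req = build_chat_request(api_key, model, user_message)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return extract_reply(json.load(resp))
```

With the API key exported as in the curl example, `chat(os.environ["DEEPINFRA_API_KEY"], "asdf/lora-example", "Hello!")` sends the same message (after `import os`).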