
Databricks Dolly is an instruction-tuned, 12-billion-parameter causal language model based on EleutherAI's pythia-12b, which was pretrained on The Pile, GPT-J's pretraining corpus. The model was then fine-tuned on databricks-dolly-15k, an open-source instruction-following dataset.
To get started, you'll need an API key from DeepInfra.
You can deploy the databricks/dolly-v2-12b model easily through the web dashboard or by using our API. The model will be automatically deployed when you first run an inference request.
You can use it with our REST API. Here's how to call the model using curl:
curl -X POST \
-d '{"prompt": "Who is Elvis Presley?"}' \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer YOUR_API_KEY" \
'https://api.deepinfra.com/v1/inference/databricks/dolly-v2-12b'
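The same request can be made from Python. Here is a minimal sketch using only the standard library, mirroring the curl call above (`YOUR_API_KEY` is a placeholder for your DeepInfra API key, and the response is assumed to be a JSON body):

```python
import json
import urllib.request

API_URL = "https://api.deepinfra.com/v1/inference/databricks/dolly-v2-12b"

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    # Build a POST request matching the curl example:
    # JSON body with a "prompt" field, plus auth and content-type headers.
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

def query_model(prompt: str, api_key: str) -> dict:
    # Send the request and parse the JSON response body.
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return json.load(resp)
```

For example, `query_model("Who is Elvis Presley?", "YOUR_API_KEY")` returns the parsed JSON response from the model.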
We charge per inference request based on execution time, at $0.0005 per second. Inference runs on Nvidia A100 GPUs. For the full documentation on calling this model, check out the model page on our website.
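Since billing is proportional to execution time, estimating the cost of a request is simple arithmetic; a quick sketch (the 4-second duration is just an illustrative figure):

```python
PRICE_PER_SECOND = 0.0005  # USD per second of execution time, per the pricing above

def inference_cost(seconds: float) -> float:
    # Billing scales linearly with how long the inference request runs.
    return seconds * PRICE_PER_SECOND

# An illustrative 4-second generation would cost 4 * $0.0005 = $0.002.
cost = inference_cost(4.0)
```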
You can browse all available models on our models page.
If you have any questions, just reach out to us on our Discord server.