We offer an OpenAI-compatible API for a number of models, including meta-llama/Llama-2-70b-chat-hf (used in the examples below). The supported API is chat completions.
The api_base is https://api.deepinfra.com/v1/openai.
Python example
import openai

stream = True  # or False

# Point the OpenAI client to our endpoint
openai.api_key = "<YOUR DEEPINFRA TOKEN: deepctl auth token>"
openai.api_base = "https://api.deepinfra.com/v1/openai"

MODEL_DI = "meta-llama/Llama-2-70b-chat-hf"

chat_completion = openai.ChatCompletion.create(
    model=MODEL_DI,
    messages=[{"role": "user", "content": "Hello world"}],
    stream=stream,
    max_tokens=100,
    # top_p=0.5,
)

if stream:
    # print each streamed event as it arrives
    for event in chat_completion:
        print(event.choices)
else:
    # print the final message text
    print(chat_completion.choices[0].message.content)
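Each streamed event carries an incremental delta rather than a full message. A minimal sketch of printing just the generated text, assuming the openai<1.0 client and the standard streaming response shape used above:

# Sketch: print only the incremental text from each streamed event
# (assumes stream=True and the openai<1.0 response objects above).
for event in chat_completion:
    delta = event.choices[0].delta
    print(delta.get("content", ""), end="", flush=True)
print()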
You can of course use regular HTTP:
export TOKEN="$(deepctl auth token)"
export URL_DI="https://api.deepinfra.com/v1/openai/chat/completions"
export MODEL_DI="meta-llama/Llama-2-70b-chat-hf"

curl "$URL_DI" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $TOKEN" \
    -d '{
        "stream": true,
        "model": "'$MODEL_DI'",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ],
        "max_tokens": 100
    }'
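The same call also works from Python without the openai client. A minimal sketch using the third-party requests library (the library choice is an assumption; any HTTP client will do), sending a non-streaming request and reading the reply from the standard response shape:

import os

import requests  # assumption: any HTTP client works

# Assumes TOKEN is exported as in the shell example above.
resp = requests.post(
    "https://api.deepinfra.com/v1/openai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['TOKEN']}"},
    json={
        "stream": False,
        "model": "meta-llama/Llama-2-70b-chat-hf",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 100,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])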
If you're already using OpenAI's chat completion endpoint, you can just set the base_url (api_base in openai<1.0 clients) and the API token, change the model name, and you're good to go.
Our pricing is USD 1.00 per 1 million tokens, so you'll save some money too.
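At that rate, a conversation that consumes 10,000 tokens in total costs USD 0.01.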
Please note that we're not yet 100% compatible; drop us a line on Discord if you'd like us to prioritize something that's missing. Supported request attributes:

- model (one of the supported models listed above)
- messages (including system messages)
- temperature
- top_p
- stream
- max_tokens
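For reference, here is a request that exercises every supported attribute; a sketch, with arbitrary example values for temperature and top_p:

import openai

openai.api_key = "<YOUR DEEPINFRA TOKEN: deepctl auth token>"
openai.api_base = "https://api.deepinfra.com/v1/openai"

# Uses every supported request attribute from the list above.
chat_completion = openai.ChatCompletion.create(
    model="meta-llama/Llama-2-70b-chat-hf",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.7,  # example value
    top_p=0.9,        # example value
    stream=False,
    max_tokens=100,
)
print(chat_completion.choices[0].message.content)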