
Function Calling

Function Calling allows models to call external functions provided by the user and use the results to give a comprehensive response to the user's query. To learn more, read our blog.

We provide an OpenAI-compatible API.

Function calling is currently supported for selected models.

Example

Let's go through a simple example of requesting the weather.

This is how you set up our endpoint:

import openai
import json

client = openai.OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="<Your-DeepInfra-API-Key>",
)

import OpenAI from "openai";

const client = new OpenAI({
    baseURL: 'https://api.deepinfra.com/v1/openai',
    apiKey: "<Your-DeepInfra-API-Key>",
});

This is the function that we will execute whenever the model asks us to do so:

# Example dummy function hard coded to return the same weather
# In production, this could be your backend API or an external API
def get_current_weather(location):
    """Get the current weather in a given location"""
    print("Calling get_current_weather client side.")
    if "tokyo" in location.lower():
        return json.dumps({
            "location": "Tokyo",
            "temperature": "75"
        })
    elif "san francisco" in location.lower():
        return json.dumps({
            "location": "San Francisco",
            "temperature": "60"
        })
    elif "paris" in location.lower():
        return json.dumps({
            "location": "Paris",
            "temperature": "70"
        })
    else:
        return json.dumps({"location": location, "temperature": "unknown"})
copy
// Example dummy function hard coded to return the same weather
// In production, this could be your backend API or an external API

// Get the current weather in a given location
async function get_current_weather(location) {
  console.log("Calling get_current_weather client side.")
  
  if (location.toLowerCase().includes("tokyo")) {
    return JSON.stringify({
      "location": "Tokyo",
      "temperature": "75"
    });
  } else if (location.toLowerCase().includes("san francisco")) {
    return JSON.stringify({
      "location": "San Francisco",
      "temperature": "60"
    });
  } else if (location.toLowerCase().includes("paris")) {
    return JSON.stringify({
      "location": "Paris",
      "temperature": "70"
    });
  } else {
      return JSON.stringify({"location": location, "temperature": "unknown"});
  }
}

Let's now call our DeepInfra endpoint with tools and a user request:

# here is the definition of our function
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": 
                        "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        },
    }
}]

# here is the user request
messages = [
    {
        "role": "user",
        "content": "What is the weather in San Francisco?"
    }
]

# let's send the request and print the response
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
tool_calls = response.choices[0].message.tool_calls
for tool_call in tool_calls:
    print(tool_call.model_dump())

// here is the definition of our function
const tools = [{
  "type": "function",
  "function": {
    "name": "get_current_weather",
    "description": "Get the current weather in a given location",
    "parameters": {
      "type": "object",
      "properties": {
        "location": {
          "type": "string",
          "description": "The city and state, e.g. San Francisco, CA"
        }
      },
      "required": ["location"]
    },
  }
}]
    

async function main() {
  // here is the user request
  const messages = [
    {
      "role": "user",
      "content": "What is the weather in San Francisco?"
    }
  ];

  // let's send the request and print the response
  const response = await client.chat.completions.create({
    model: "meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages: messages,
    tools: tools,
    tool_choice: "auto",
  });

  const tool_calls = response.choices[0].message.tool_calls

  for (const tool_call of tool_calls) {
    console.log(tool_call);
  }
}

main();

Output:

{'id': 'call_X0xYqdnoUonPJpQ6HEadxLHE', 'function': {'arguments': '{"location": "San Francisco"}', 'name': 'get_current_weather'}, 'type': 'function'}
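
Note that the model is not obliged to call a tool; if it answers the question directly, tool_calls will be empty. A minimal guard before iterating (a sketch, not part of the example above):

message = response.choices[0].message
if not message.tool_calls:
    # the model answered directly; nothing to execute
    print(message.content)
else:
    tool_calls = message.tool_calls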

Now let's respond with the function's result and see what the model says:

# extend conversation with assistant's reply
messages.append(response.choices[0].message)

for tool_call in tool_calls:
    function_name = tool_call.function.name
    if function_name == "get_current_weather":
        function_args = json.loads(tool_call.function.arguments)
        function_response = get_current_weather(
            location=function_args.get("location")
        )

        # extend conversation with the function response
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "content": function_response,
        })


# get a new response from the model where it can see the function responses
second_response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

print(second_response.choices[0].message.content)

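// note: this continues the main() function from the previous snippet,
// where messages, response and tool_calls are still in scope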
// extend conversation with assistant's reply
messages.push(response.choices[0].message)

for (const tool_call of tool_calls) {
  const function_name = tool_call.function.name

  if (function_name == "get_current_weather") {
    const function_args = JSON.parse(tool_call.function.arguments);
    const function_response = await get_current_weather(function_args.location);

    // extend conversation with function response
    messages.push({
      "tool_call_id": tool_call.id,
      "role": "tool",
      "content": function_response,
    })
  }
}

// get a new response from the model where it can see the function responses
const second_response = await client.chat.completions.create({
  model: "meta-llama/Meta-Llama-3.1-70B-Instruct",
  messages: messages,
  tools: tools,
  tool_choice: "auto",
})

console.log(second_response.choices[0].message.content)

Output:

The current temperature in San Francisco, CA is 60 degrees.

Tips on using function calling

Here are some tips to get the most out of function calling:

  • Make sure the function descriptions are well written; this helps the model perform better (see the schema sketch after this list).
  • Use lower temperatures (< 1.0); this reduces the chance of the model filling parameters with arbitrary values.
  • Try not to use system messages.
  • Function-calling quality degrades as the number of functions supplied grows.
  • Keep top_p and top_k at their default values.
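
As an illustration of a more fully described function, here is a sketch of a richer parameter schema. The unit parameter and its enum are hypothetical additions for illustration, not part of the example above:

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location. "
                       "Use the city the user mentions; ask for clarification if it is ambiguous.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to return"
                }
            },
            "required": ["location"]
        },
    }
}]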

Notes

Function calling adds extra prompt usage, and your function definitions are also counted toward usage.
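
With the OpenAI Python SDK you can see this in the usage field of a non-streaming response, for example:

print(response.usage.prompt_tokens)      # includes the serialized tool definitions
print(response.usage.completion_tokens)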

Supported:

  • single calls
  • parallel calls (quality may be lower; this is under active development)
  • tool_choice with only auto or none
  • streaming mode (a sketch of consuming streamed tool calls is at the end of this page)

Not supported:

  • nested calls
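
In streaming mode, tool-call arguments arrive as incremental deltas in the same chunk format as the OpenAI streaming API, so the fragments need to be concatenated before parsing. A minimal sketch reusing client, messages and tools from the example above (the accumulation logic is illustrative, not part of the original example):

stream = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    messages=messages,
    tools=tools,
    tool_choice="auto",
    stream=True,
)

# accumulate tool-call fragments by index until the stream ends
calls = {}
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    for tc in delta.tool_calls or []:
        entry = calls.setdefault(tc.index, {"name": "", "arguments": ""})
        if tc.function and tc.function.name:
            entry["name"] = tc.function.name
        if tc.function and tc.function.arguments:
            entry["arguments"] += tc.function.arguments

for entry in calls.values():
    print(entry["name"], json.loads(entry["arguments"]))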