Guaranteed JSON output on Open-Source LLMs.

Published on 2024.03.08 by Patrick Reiter Horn

DeepInfra is proud to announce that we have released "JSON mode" across all of our text language models. It is available through the "response_format" object, which currently supports only {"type": "json_object"}. Our JSON mode will guarantee that all tokens returned in the output of a langua...Read More
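The snippet below is a minimal sketch of what requesting JSON mode can look like through our OpenAI-compatible endpoint; the model name and prompt are placeholders, and a recent openai Python client (>= 1.0) is assumed.

```python
# Minimal JSON-mode sketch, assuming DeepInfra's OpenAI-compatible endpoint
# and the openai Python client >= 1.0. The model name is a placeholder.
from openai import OpenAI

client = OpenAI(
    api_key="<DEEPINFRA_TOKEN>",                     # your DeepInfra API token
    base_url="https://api.deepinfra.com/v1/openai",  # OpenAI-compatible endpoint
)

resp = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "List three colors as a JSON object."}],
    response_format={"type": "json_object"},  # constrain the output to valid JSON
)
print(resp.choices[0].message.content)
```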


Deploy Custom LLMs on DeepInfra

Published on 2024.03.01 by Iskren Chernev

Did you just finetune your favorite model and are wondering where to run it? Well, we have you covered: simple API and predictable pricing. Put your model on huggingface. Use a private repo, if you wish, we don't mind. Create a hf access token just for the repo for better security. Create c...Read More


Enhancing Open-Source LLMs with Function Calling Feature

Published on 2024.01.26 by Pernekhan Utemuratov

We're excited to announce that the Function Calling feature is now available on DeepInfra. We're offering Mistral-7B and Mixtral-8x7B models with this feature. Other models will be available soon. LLMs are powerful tools for various tasks. However, they're limited in their ability to per...Read More
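As a rough illustration, a function-calling request through the OpenAI-compatible endpoint can look like the sketch below; the tool definition is hypothetical, and the openai client (>= 1.0) is assumed.

```python
# Function-calling sketch, assuming DeepInfra's OpenAI-compatible endpoint
# and the OpenAI-style "tools" format. The get_weather tool is hypothetical.
from openai import OpenAI

client = OpenAI(
    api_key="<DEEPINFRA_TOKEN>",
    base_url="https://api.deepinfra.com/v1/openai",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
# Instead of plain text, the model can answer with a structured tool call.
print(resp.choices[0].message.tool_calls)
```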


Long Context models incoming

Published on 2023.11.21 by Iskren Chernev

Many users requested longer context models to help them summarize bigger chunks of text or write novels with ease. We're proud to announce our long context model selection that will grow bigger in the coming weeks. Models: Mistral-based models have a context size of 32k, and Amazon recently r...Read More


Unleashing the Potential of AI for Exceptional Gaming Experiences

Published on 2023.11.10 by Tsveta Gavanozova

Gaming companies are constantly in search of ways to enhance player experiences and achieve extraordinary outcomes. Recent research indicates that investments in player experience (PX) can result in substantial returns on investment (ROI). By prioritizing PX and harnessing the capabilities of AI...Read More


Lzlv model for roleplaying and creative work

Published on 2023.11.02 by Nikola Borisov

Recently an interesting new model was released. It is called Lzlv, and it is basically a merge of a few existing models. This model uses the Vicuna prompt format, so keep this in mind if you are using our raw [API](/lizpreciatior/lzlv_70b...Read More
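For illustration, a Vicuna-style prompt sent to the raw inference API might look like the sketch below; the exact model id, endpoint path, and payload fields are assumptions, so check the model page for the authoritative details.

```python
# Vicuna-style prompt against the raw inference API. The model id, endpoint
# path, and payload fields below are assumptions; consult the model page.
import os
import requests

prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    "USER: Write a short scene set in a rainy harbor town.\n"
    "ASSISTANT:"
)

resp = requests.post(
    "https://api.deepinfra.com/v1/inference/lizpreciatior/lzlv_70b",  # assumed model id
    headers={"Authorization": f"Bearer {os.environ['DEEPINFRA_TOKEN']}"},
    json={"input": prompt, "max_new_tokens": 256},
)
resp.raise_for_status()
print(resp.json())
```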


Langchain improvements: async and streaming

Published on 2023.10.25 by Iskren Chernev

Starting with langchain v0.0.322 you can do efficient async generation and token streaming with deepinfra. Async generation: the deepinfra wrapper now supports native async calls, so you can expect better performance (no more t...Read More
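A minimal sketch of both features with the LangChain wrapper is shown below; it assumes the DeepInfra class from langchain.llms (import path as of ~v0.0.322), a DEEPINFRA_API_TOKEN set in the environment, and a placeholder model id.

```python
# Async generation and token streaming with LangChain's DeepInfra wrapper.
# Assumes langchain ~v0.0.322 and DEEPINFRA_API_TOKEN set in the environment.
import asyncio
from langchain.llms import DeepInfra

llm = DeepInfra(model_id="meta-llama/Llama-2-70b-chat-hf")  # placeholder model id
llm.model_kwargs = {"temperature": 0.7, "max_new_tokens": 128}

async def main():
    # native async generation
    result = await llm.agenerate(["Tell me a joke about boats."])
    print(result.generations[0][0].text)

    # token-by-token streaming
    async for token in llm.astream("Tell me a joke about trains."):
        print(token, end="", flush=True)

asyncio.run(main())
```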


Compare Llama2 vs OpenAI models for FREE.

Published on 2023.09.28 by Nikola Borisov

At DeepInfra we host the best open-source LLMs. We are always working hard to make our APIs simple and easy to use. Today we are excited to announce a very easy way to quickly try our models like Llama2 70b and [Mistral 7b](/mistralai/Mistral-7B-Instruc...Read More


Use OpenAI API clients with LLaMas

Published on 2023.08.28 by Iskren Chernev

Getting started

```bash
# create a virtual environment
python3 -m venv .venv
# activate environment in current shell
. .venv/bin/activate
# install openai python client
pip install openai
```

Choose a model: meta-llama/Llama-2-70b-chat-hf, [meta-llama/L...Read More
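With the client installed, pointing it at the OpenAI-compatible endpoint looks roughly like the sketch below; it assumes a recent openai client (>= 1.0), while older 0.x clients set openai.api_base instead.

```python
# Streaming chat completion against DeepInfra's OpenAI-compatible endpoint,
# assuming the openai Python client >= 1.0 (0.x clients use openai.api_base).
from openai import OpenAI

client = OpenAI(
    api_key="<DEEPINFRA_TOKEN>",
    base_url="https://api.deepinfra.com/v1/openai",
)

stream = client.chat.completions.create(
    model="meta-llama/Llama-2-70b-chat-hf",
    messages=[{"role": "user", "content": "Say hello in three languages."}],
    stream=True,  # stream tokens as they are generated
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```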


Fork of Text Generation Inference.

Published on 2023.08.09 by Nikola Borisov

The Text Generation Inference open-source project by huggingface looked like a promising framework for serving large language models (LLMs). However, huggingface announced that they will change the license of the code with version v1.0.0. While the previous license, Apache 2.0, was permissive, the new on...Read More