Long Context models incoming

Published on 2023.11.21 by Iskren Chernev

Many users requested longer context models to help them summarize bigger chunks of text or write novels with ease. We're proud to announce our long context model selection, which will grow in the coming weeks. Models: Mistral-based models have a context size of 32k, and Amazon recently r...Read More


Unleashing the Potential of AI for Exceptional Gaming Experiences

Published on 2023.11.10 by Tsveta Gavanozova

Gaming companies are constantly in search of ways to enhance player experiences and achieve extraordinary outcomes. Recent research indicates that investments in player experience (PX) can result in substantial returns on investment (ROI). By prioritizing PX and harnessing the capabilities of AI...Read More


Lzlv model for roleplaying and creative work

Published on 2023.11.02 by Nikola Borisov

An interesting new model was recently released. It is called Lzlv, and it is a merge of a few existing models. It uses the Vicuna prompt format, so keep this in mind if you are using our raw [API](/lizpreciatior/lzlv_70b...Read More


Langchain improvements: async and streaming

Published on 2023.10.25 by Iskren Chernev

Starting from langchain v0.0.322, you can use efficient async generation and token streaming with DeepInfra. Async generation: The DeepInfra wrapper now supports native async calls, so you can expect better performance (no more t...Read More


Compare Llama2 vs OpenAI models for FREE.

Published on 2023.09.28 by Nikola Borisov

At DeepInfra we host the best open source LLMs. We are always working hard to make our APIs simple and easy to use. Today we are excited to announce a very easy way to quickly try our models like Llama2 70b and [Mistral 7b](/mistralai/Mistral-7B-Instruc...Read More


Use OpenAI API clients with LLaMas

Published on 2023.08.28 by Iskren Chernev

Getting started # create a virtual environment python3 -m venv .venv # activate environment in current shell . .venv/bin/activate # install openai python client pip install openai Choose a model meta-llama/Llama-2-70b-chat-hf [meta-llama/L...Read More


Fork of Text Generation Inference.

Published on 2023.08.09 by Nikola Borisov

The Text Generation Inference open source project by Hugging Face looked like a promising framework for serving large language models (LLMs). However, Hugging Face announced that they will change the license of the code with version v1.0.0. While the previous license, Apache 2.0, was permissive, the new on...Read More


The easiest way to build AI applications with Llama 2 LLMs.

Published on 2023.08.02 by Nikola Borisov

The long-awaited Llama 2 models are finally here! We are excited to show you how to use them with DeepInfra. This collection of models represents the state of the art in open source language models. They are made available by Meta AI and the l...Read More


How to deploy Databricks Dolly v2 12b, instruction tuned causal language model.

Published on 2023.04.12 by Yessen Kanapin

Databricks Dolly is an instruction-tuned, 12 billion parameter causal language model based on EleutherAI's pythia-12b. It was pretrained on The Pile, GPT-J's pretraining corpus. [databricks-dolly-15k](http...Read More


How to use OpenAI Whisper with per-sentence and per-word timestamp segmentation using DeepInfra

Published on 2023.04.05 by Yessen Kanapin

Whisper is a Speech-To-Text model from OpenAI. Read More