Long Context models incoming

Published on 2023.11.21 by Iskren Chernev

Long Context models incoming header picture

Many users requested longer context models to help them summarize bigger chunks of text or write novels with ease.

We're proud to announce our long context model selection that will grow bigger in the comming weeks.


Mistral-based models have a context size of 32k, and amazon recently released a model fine-tuned specifically on longer contexts.

We also recently released the highly praised Yi models. Keep in mind they don't support chat, just the old-school text completion (new models are in the works):

Context FAQ

  • What does context mean? - Context is the number of tokens the model can look at at the same time. In practise this is the limit of the sent tokens + max new tokens (2k default) that can be generated. There is no 1:1 correspondence between tokens and words, general rule-of-thumb is 100 tokens is 75 words.
  • Can I go above the context? - Technically -- no. Any API to a 4k model that lets you pass more will one way or another truncate the tokens given to the LLM to it's context size. You (as users) can also try reducing a long input (by removing some from the middle, for example) and re-submit.
  • My input fits the context size, but the model doesn't take it into account - The context size of a model is a hard creation-specified limit. Just because a model is listed as having a certain context size doesn't mean you can cram it with information and expect excellent results. Models differ in many parameters, and one such parameter is how far back they can recall information.Check MistalLite HF for some metrics shared by Amazon.
  • What can I do to make the model take into account more of the data I sent - There is no one answer to this question, and it varies by model, and each model capacity varies. The best you can do is test a particular model with a single task (like summarization, question answering), with different context lengths and find what works for you. Also try placing system prompt at the start and/or end. For most control you can utilize the text-completion endpoint (not the chat one).
  • Is a longer context model guaranteed to understand more context than a short-context model - Unfortunately -- no. For example a very bad model with huge context size may fail to act on a single sentence, whereas a good model with shorter context length could comprehend a few paragraphs or even pages.
  • I can't find a good model for my large context needs! - Don't loose hope! This is a rapidly evolving ecosystem with models being released every day. We try our best to provide the best open-source models to you, as quickly as possible. Let us know which models you like by emailing feedback@deepinfra.com or drop us a message in discord