Browse DeepInfra models:

All the categories and models you can try out and use directly on DeepInfra:

Phind/Phind-CodeLlama-34B-v2
fp16 · 4k context · Replaced
  • text-generation

Phind-CodeLlama-34B-v2 is an open-source language model fine-tuned on 1.5B tokens of high-quality programming-related data, achieving a pass@1 rate of 73.8% on HumanEval. It is multilingual and proficient in Python, C/C++, TypeScript, Java, and more. It was trained on a proprietary dataset of instruction-answer pairs rather than code-completion examples, and it is instruction-tuned on the Alpaca/Vicuna format to be steerable and easy to use. It accepts the Alpaca/Vicuna instruction format and generates one completion per prompt.
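
As a concrete illustration of the prompt format, here is a minimal sketch of calling this model over HTTP. It assumes DeepInfra's generic inference endpoint (https://api.deepinfra.com/v1/inference/{model}), a DEEPINFRA_API_KEY environment variable, and the response shape shown in the comments; check the model's API page for the exact contract.

```python
import os
import requests

# Alpaca/Vicuna-style instruction prompt (template is illustrative; the
# model card documents the exact format this deployment expects).
prompt = (
    "### System Prompt\n"
    "You are an intelligent programming assistant.\n\n"
    "### User Message\n"
    "Write a Python function that checks whether a string is a palindrome.\n\n"
    "### Assistant\n"
)

resp = requests.post(
    "https://api.deepinfra.com/v1/inference/Phind/Phind-CodeLlama-34B-v2",
    headers={"Authorization": f"bearer {os.environ['DEEPINFRA_API_KEY']}"},
    json={"input": prompt, "max_new_tokens": 256},
    timeout=120,
)
resp.raise_for_status()
# Assumed response shape: {"results": [{"generated_text": "..."}]}
print(resp.json()["results"][0]["generated_text"])
```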

Qwen/Qwen2-72B-Instruct
bfloat16 · 32k context · Deprecated
  • text-generation

The 72-billion-parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.

Qwen/Qwen2-7B-Instruct
bfloat16 · 32k context · Deprecated
  • text-generation

The 7-billion-parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.

Sao10K/L3.1-70B-Euryale-v2.2
fp8 · 128k context · $0.35 in / $0.40 out per Mtoken
  • text-generation

Euryale v2.2, built on Llama 3.1 70B, is a model from Sao10k focused on creative roleplay.
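
To make the in/out pricing concrete, here is a small cost helper; the token counts in the example are invented for illustration.

```python
# $0.35 per million input tokens, $0.40 per million output tokens
# (prices from the listing above).
IN_PER_MTOKEN, OUT_PER_MTOKEN = 0.35, 0.40

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-Mtoken rates."""
    return (input_tokens * IN_PER_MTOKEN + output_tokens * OUT_PER_MTOKEN) / 1e6

# A long roleplay turn: 6,000 tokens of context in, 800 tokens out.
print(f"${request_cost(6_000, 800):.6f}")  # $0.002420
```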

XpucT/Deliberate
Replaced
  • text-to-image

The Deliberate model allows for the creation of anything desired, with better results as the user's knowledge and prompt detail increase. It is well suited to meticulous anatomy artists, creative prompt writers, art designers, and those seeking explicit content.

bigcode/starcoder2-15b
fp16 · 16k context · Replaced
  • text-generation

StarCoder2-15B is a 15B-parameter model trained on 600+ programming languages. It specializes in code completion.
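
Since the model targets plain code completion (not chat), a minimal local sketch with Hugging Face transformers looks like this; it assumes enough GPU memory for a 15B model and accelerate installed for device_map="auto".

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "bigcode/starcoder2-15b"
tok = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt, device_map="auto")

# Base model: feed raw code and let it continue; no chat template.
prompt = "def fibonacci(n: int) -> int:\n"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```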

bigcode/starcoder2-15b-instruct-v0.1
fp16 · Replaced
  • text-generation

We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself, without any human annotations or distilled data from huge, proprietary LLMs.

codellama/CodeLlama-34b-Instruct-hf
fp16 · 4k context · Replaced
  • text-generation

Code Llama is a state-of-the-art LLM capable of generating code, and natural language about code, from both code and natural-language prompts. This particular instance is the 34B instruct variant.

codellama/CodeLlama-70b-Instruct-hf
fp16 · 4k context · Replaced
  • text-generation

CodeLlama-70b is the largest and latest code-generation model in the Code Llama collection.

cognitivecomputations/dolphin-2.6-mixtral-8x7b
bfloat16 · 32k context · Replaced
  • text-generation

The Dolphin 2.6 Mixtral 8x7b model is a fine-tuned version of Mixtral-8x7b, trained for 3 days on 4 A100 GPUs on a variety of data, including coding data. It is uncensored and requires trust_remote_code. The model is very obedient and good at coding, but it is not DPO-tuned. The dataset was filtered to remove alignment and bias. The model is compliant with user requests and can be used for purposes such as generating code or general chat.
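
Because the card says trust_remote_code is required, loading it with transformers looks roughly like the sketch below; enabling trust_remote_code executes code shipped with the repository, so review that code first.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "cognitivecomputations/dolphin-2.6-mixtral-8x7b"

# trust_remote_code=True runs custom code from the checkpoint repo;
# inspect it before enabling.
tok = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    trust_remote_code=True,
    device_map="auto",  # requires accelerate; the 8x7B weights are large
)
```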

cognitivecomputations/dolphin-2.9.1-llama-3-70b
bfloat16 · 8k context · Replaced
  • text-generation

Dolphin 2.9.1 is a fine-tuned Llama-3-70b model. Trained on filtered data, it is more compliant but uncensored, and it demonstrates improvements in instruction following, conversation, coding, and function-calling abilities.

databricks/dbrx-instruct
bfloat16 · 32k context · Replaced
  • text-generation

DBRX is an open-source LLM created by Databricks. It uses a mixture-of-experts (MoE) architecture with 132B total parameters, of which 36B are active on any given input. It outperforms existing open-source LLMs like Llama 2 70B and Mixtral-8x7B on standard industry benchmarks for language understanding, programming, math, and logic.
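
To illustrate why only 36B of the 132B parameters are active per input, here is a generic top-k mixture-of-experts routing sketch in PyTorch (not DBRX's actual implementation); DBRX reportedly routes each token to 4 of 16 experts, which the toy sizes below mirror.

```python
import torch

def moe_forward(x, experts, router, k=4):
    """Route each token to its top-k experts; the rest stay idle."""
    weights, idx = router(x).softmax(-1).topk(k, dim=-1)  # [tokens, k]
    out = torch.zeros_like(x)
    for t in range(x.shape[0]):
        for w, e in zip(weights[t], idx[t]):
            out[t] += w * experts[e](x[t])  # only k experts run per token
    return out

experts = torch.nn.ModuleList(torch.nn.Linear(16, 16) for _ in range(16))
router = torch.nn.Linear(16, 16)  # one score per expert
print(moe_forward(torch.randn(3, 16), experts, router).shape)  # torch.Size([3, 16])
```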

deepinfra/airoboros-70b
fp16 · 4k context · Replaced
  • text-generation

The latest version of the Airoboros model: a fine-tuned version of Llama-2-70b trained on the Airoboros dataset. This deployment currently runs jondurbin/airoboros-l2-70b-2.2.1.

google/codegemma-7b-it
fp16 · 8k context · Replaced
  • text-generation

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available in three variants: a 7-billion-parameter pretrained variant specializing in code completion and code generation, a 7-billion-parameter instruction-tuned variant for code chat and instruction following, and a 2-billion-parameter pretrained variant for fast code completion.
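
The instruction-tuned variant expects Gemma's chat turn format rather than raw code. A small sketch using the tokenizer's built-in chat template follows; the rendered control tokens in the comment are Gemma's, to the best of my knowledge.

```python
from transformers import AutoTokenizer

# Gemma repos are gated: accepting the license on Hugging Face is required.
tok = AutoTokenizer.from_pretrained("google/codegemma-7b-it")

messages = [
    {"role": "user",
     "content": "Rewrite `for x in xs: ys.append(x*x)` as a list comprehension."}
]
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# Expected to render roughly:
# <start_of_turn>user ... <end_of_turn>\n<start_of_turn>model\n
print(prompt)
```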

google/gemma-1.1-7b-it
bfloat16 · 8k context · Replaced
  • text-generation

Gemma is an open-source model designed by Google. This is Gemma 1.1 7B (IT), an update over the original instruction-tuned Gemma release. Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains in quality, coding capabilities, factuality, instruction following, and multi-turn conversation quality.

intfloat/e5-base-v2
512 context · $0.005 per Mtoken
  • embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. The model has 12 layers and an output dimension of 768.

intfloat/e5-large-v2
512 context · $0.010 per Mtoken
  • embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. The model has 24 layers and an output dimension of 1024.

intfloat/multilingual-e5-large
fp32 · 512 context · $0.010 per Mtoken
  • embeddings

The Multilingual-E5-large model is a 24-layer text embedding model with an embedding size of 1024, trained on a mixture of multilingual datasets and supporting 100 languages. The model achieves state-of-the-art results on the Mr. TyDi benchmark, outperforming other models such as BM25 and mDPR. The model is intended for use in text retrieval and semantic similarity tasks, and should be used with the "query: " and "passage: " prefixes for input texts to achieve optimal performance.
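
A minimal local sketch of the prefix convention using sentence-transformers; the example texts are invented for illustration.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/multilingual-e5-large")

# E5 convention: prefix search queries with "query: " and documents
# with "passage: ".
query = ["query: how much protein should a female eat"]
passages = [
    "passage: The recommended daily protein intake for women is about 46 g.",
    "passage: Mount Everest is the highest mountain above sea level.",
]

q = model.encode(query, normalize_embeddings=True)
p = model.encode(passages, normalize_embeddings=True)
print(q @ p.T)  # cosine similarities; the protein passage should score higher
```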