Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

Featured

Our most popular AI models used by thousands of users in their apps and research. What will you create today?

Qwen/Qwen2-72B-Instruct cover image
$0.59 in / $0.79 out per Mtoken
  • text-generation

The 72 billion parameter Qwen2 excels in language understanding, multilingual capabilities, coding, mathematics, and reasoning.
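Text-generation models like this one are typically called over HTTP. A minimal sketch, assuming DeepInfra exposes an OpenAI-compatible chat-completions endpoint (the URL and payload shape below are assumptions; check the official docs for the current path):

```python
import json
import os
import urllib.request

# Assumption: DeepInfra serves chat models through an OpenAI-compatible
# chat-completions endpoint at this URL.
API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completion request for a text-generation model."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "Qwen/Qwen2-72B-Instruct",
    "Summarize mixture-of-experts in one sentence.",
    os.environ.get("DEEPINFRA_API_KEY", ""),
)
# With a valid key: resp = urllib.request.urlopen(req); print(resp.read())
```

The request is only constructed here, not sent, so the sketch runs without an account; swap in any other text-generation model name from this page.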

microsoft/Phi-3-medium-4k-instruct cover image
$0.14 / Mtoken
  • text-generation

The Phi-3-Medium-4K-Instruct is a powerful and lightweight language model with 14 billion parameters, trained on high-quality data to excel in instruction following and safety measures. It demonstrates exceptional performance across benchmarks, including common sense, language understanding, and logical reasoning, outperforming models of similar size.

openchat/openchat-3.6-8b cover image
$0.08 / Mtoken
  • text-generation

OpenChat 3.6 is a Llama-3-8B fine-tune that outperforms the base model on multiple benchmarks.

mistralai/Mistral-7B-Instruct-v0.3 cover image
$0.07 / Mtoken
  • text-generation

Mistral-7B-Instruct-v0.3 is an instruction-tuned model and the next iteration of Mistral 7B; it has a larger vocabulary, a newer tokenizer, and supports function calling.

meta-llama/Meta-Llama-3-70B-Instruct cover image
$0.59 in / $0.79 out per Mtoken
  • text-generation

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes.
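Entries like this one list separate input and output rates per Mtoken (million tokens), so the cost of a single request is a weighted sum of the two token counts. A small sketch using the $0.59 / $0.79 rates shown above:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      in_price: float = 0.59, out_price: float = 0.79) -> float:
    """Cost in USD for a model priced per Mtoken (million tokens),
    with separate rates for input (prompt) and output (completion) tokens."""
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Example: a 2,000-token prompt producing a 500-token answer.
cost = estimate_cost_usd(2_000, 500)  # a fraction of a cent
```

Models that list a single price (e.g. $0.08 / Mtoken) charge the same rate for both directions, which is the same formula with `in_price == out_price`.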

meta-llama/Meta-Llama-3-8B-Instruct cover image
$0.08 / Mtoken
  • text-generation

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes.

mistralai/Mixtral-8x22B-Instruct-v0.1 cover image
$0.65 / Mtoken
  • text-generation

This is the instruction fine-tuned version of Mixtral-8x22B, the latest and largest mixture-of-experts (MoE) large language model from Mistral AI. This state-of-the-art model uses eight 22B-parameter expert networks, of which two are selected per token during inference. This architecture allows large models to be fast and cheap at inference.
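The speed/cost claim follows from simple arithmetic: only the routed experts' weights are used for each token. A rough back-of-the-envelope sketch (this ignores shared attention and router parameters, so real active-parameter counts differ):

```python
# Naive MoE arithmetic for an "8x22B" model: 8 experts, 2 routed per token.
experts_total = 8
experts_active = 2          # experts selected per token during inference
params_per_expert = 22e9    # the "22B" in "8x22B"

# Fraction of expert weights that participate in any one token's forward pass.
active_fraction = experts_active / experts_total          # 0.25

# Expert parameters actually used per token vs. stored in total.
active_expert_params = experts_active * params_per_expert  # 44e9
total_expert_params = experts_total * params_per_expert    # 176e9
```

So per-token compute scales with roughly a quarter of the expert weights, while the full set still has to fit in memory.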

microsoft/WizardLM-2-8x22B cover image
$0.65 / Mtoken
  • text-generation

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared to leading proprietary models.

microsoft/WizardLM-2-7B cover image
$0.07 / Mtoken
  • text-generation

WizardLM-2 7B is the smaller variant of Microsoft AI's latest Wizard model. It is the fastest and achieves performance comparable to leading open-source models that are 10x larger.

google/gemma-1.1-7b-it cover image
$0.07 / Mtoken
  • text-generation

Gemma is an open-source model designed by Google. This is Gemma 1.1 7B (IT), an update over the original instruction-tuned Gemma release. Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality.

mistralai/Mixtral-8x7B-Instruct-v0.1 cover image
$0.24 / Mtoken
  • text-generation

Mixtral is a mixture-of-experts (MoE) large language model from Mistral AI. This state-of-the-art model uses eight 7B-parameter expert networks, of which two are selected per token during inference. This architecture allows large models to be fast and cheap at inference. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks.

mistralai/Mistral-7B-Instruct-v0.2 cover image
$0.07 / Mtoken
  • text-generation

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruction fine-tuned version of the Mistral-7B-v0.2 generative text model, trained on a variety of publicly available conversation datasets.

cognitivecomputations/dolphin-2.6-mixtral-8x7b cover image
$0.24 / Mtoken
  • text-generation

The Dolphin 2.6 Mixtral 8x7b model is a finetune of the Mixtral-8x7b model, trained on a variety of data including coding data for 3 days on 4 A100 GPUs. The dataset was filtered for alignment and bias, so the model is uncensored and highly compliant with user requests; it requires trust_remote_code. It is very obedient and good at coding, but not DPO-tuned, and can be used for purposes such as generating code or general chat.

lizpreciatior/lzlv_70b_fp16_hf cover image
$0.59 in / $0.79 out per Mtoken
  • text-generation

A Mythomax/MLewd_13B-style, multi-model merge of several LLaMA2 70B finetunes for roleplaying and creative work. The goal was to create a model that combines creativity with intelligence for an enhanced experience.

openchat/openchat_3.5 cover image
$0.07 / Mtoken
  • text-generation

OpenChat is a library of open-source language models that have been fine-tuned with C-RLFT, a strategy inspired by offline reinforcement learning. These models can learn from mixed-quality data without preference labels and have achieved exceptional performance comparable to ChatGPT. The developers of OpenChat are dedicated to creating a high-performance, commercially viable, open-source large language model and are continuously making progress towards this goal.

llava-hf/llava-1.5-7b-hf cover image
$0.34 / Mtoken
  • text-generation

LLaVA is a multimodal model that combines a vision encoder with a language model.

stability-ai/sdxl cover image
$0.0005 / sec
  • text-to-image

SDXL consists of an ensemble of experts pipeline for latent diffusion: In a first step, the base model is used to generate (noisy) latents, which are then further processed with a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps. Note that the base model can be used as a standalone module.
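Non-chat models such as SDXL are not served through a chat-completions interface. A hedged sketch, assuming DeepInfra exposes a generic per-model inference endpoint that accepts a JSON body with an "input" object (both the URL pattern and the payload schema here are guesses; consult the model's own page for the real interface):

```python
import json
import urllib.request

# Assumption: a generic inference endpoint keyed by model name.
API_URL = "https://api.deepinfra.com/v1/inference/stability-ai/sdxl"

def build_sdxl_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-image request for SDXL."""
    payload = {"input": {"prompt": prompt}}  # assumed schema
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_sdxl_request("a lighthouse at dawn, oil painting", "YOUR_API_KEY")
# With a valid key: urllib.request.urlopen(req) would return the generated image data.
```

Note that this model (and Whisper below) is billed per second of inference time rather than per token, so cost depends on runtime, not prompt length.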

openai/whisper-large cover image
$0.0005 / sec
  • automatic-speech-recognition

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.