Browse deepinfra models:

All categories and models you can try out and use directly on deepinfra:

databricks/dbrx-instruct
bfloat16 · 32k context · Replaced · text-generation

DBRX is an open-source LLM created by Databricks. It uses a mixture-of-experts (MoE) architecture with 132B total parameters, of which 36B are active on any input. It outperforms existing open-source LLMs like Llama 2 70B and Mixtral-8x7B on standard industry benchmarks for language understanding, programming, math, and logic.

deepinfra/airoboros-70b
fp16 · 4k context · Replaced · text-generation

Latest version of the Airoboros model: a fine-tuned version of llama-2-70b using the Airoboros dataset. This model is currently running jondurbin/airoboros-l2-70b-2.2.1.

google/codegemma-7b-it
fp16 · 8k context · Replaced · text-generation

CodeGemma is a collection of lightweight open code models built on top of Gemma. CodeGemma models are text-to-text and text-to-code decoder-only models, available as a 7 billion parameter pretrained variant that specializes in code completion and code generation tasks, a 7 billion parameter instruction-tuned variant for code chat and instruction following, and a 2 billion parameter pretrained variant for fast code completion.
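
As a rough illustration (not DeepInfra-specific), the instruction-tuned variant can be driven locally through the Hugging Face Transformers chat template; the prompt and generation settings below are made up for the sketch.

```python
# Minimal sketch: chatting with the instruction-tuned CodeGemma variant via
# Hugging Face Transformers, run locally (not through the DeepInfra API).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/codegemma-7b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Example prompt; any coding question works here.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```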

intfloat/e5-base-v2
512 context · $0.005 / Mtoken · embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. This model has 12 layers and an embedding size of 768.

intfloat/e5-large-v2
512 context · $0.010 / Mtoken · embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. This model has 24 layers and an embedding size of 1024.

intfloat/multilingual-e5-large
fp32 · 512 context · $0.010 / Mtoken · embeddings

The Multilingual-E5-large model is a 24-layer text embedding model with an embedding size of 1024, trained on a mixture of multilingual datasets and supporting 100 languages. The model achieves state-of-the-art results on the Mr. TyDi benchmark, outperforming other models such as BM25 and mDPR. The model is intended for use in text retrieval and semantic similarity tasks, and should be used with the "query: " and "passage: " prefixes for input texts to achieve optimal performance.
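
A minimal sketch of the prefix convention described above, using the sentence-transformers library locally rather than the DeepInfra endpoint; the query and passages follow the examples given on the model card.

```python
# Illustrative sketch of the "query: " / "passage: " prefix convention for E5 models.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/multilingual-e5-large")

queries = ["query: how much protein should a female eat"]
passages = [
    "passage: As a general guideline, the CDC's average requirement of protein "
    "for women ages 19 to 70 is 46 grams per day.",
    "passage: Definition of summit for English language learners: the highest point of a mountain.",
]

q_emb = model.encode(queries, normalize_embeddings=True)
p_emb = model.encode(passages, normalize_embeddings=True)

# Cosine similarity between the query and each passage; the first passage should score higher.
print(util.cos_sim(q_emb, p_emb))
```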

meta-llama/Llama-2-13b-chat-hf
fp16 · 4k context · Replaced · text-generation

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 13B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format.

meta-llama/Llama-2-70b-chat-hf
fp16 · 4k context · Replaced · text-generation

Llama 2 is a collection of LLMs trained by Meta. This is the 70B chat-optimized version. This endpoint has per-token pricing.

meta-llama/Llama-2-7b-chat-hf
fp16 · 4k context · Replaced · text-generation

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format.

mistralai/Mistral-7B-Instruct-v0.1
fp16 · 32k context · $0.06 / Mtoken · text-generation

The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.
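
For models with live per-token pricing like this one, here is a hedged sketch of a chat-completion call through an OpenAI-compatible client; the base URL and environment-variable name are assumptions to check against DeepInfra's documentation.

```python
# Hedged sketch: calling a hosted chat model through an OpenAI-compatible
# chat-completions endpoint. The base_url and auth scheme are assumptions;
# check DeepInfra's docs for the exact values.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPINFRA_API_KEY"],        # assumed env var name
    base_url="https://api.deepinfra.com/v1/openai",  # assumed OpenAI-compatible base URL
)

response = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "Summarize what a 32k context window means."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```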

mistralai/Mistral-7B-Instruct-v0.2
fp16 · 32k context · $0.06 / Mtoken · text-generation

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2 generative text model, trained on a variety of publicly available conversation datasets.

mistralai/Mixtral-8x22B-v0.1
fp16 · 64k context · Replaced · text-generation

Mixtral-8x22B is the latest and largest mixture-of-experts large language model (LLM) from Mistral AI. It is a state-of-the-art model that uses a mixture of 8 experts (MoE), each a 22B-parameter model; during inference, 2 experts are selected per token. This architecture allows large models to be fast and cheap at inference. This model is not instruction tuned.

openai/clip-features
$0.0005 / sec · custom

Returns CLIP features from the clip-vit-large-patch14 model.
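
To show what these features look like, here is an illustrative sketch computed locally with Transformers' CLIPModel (the DeepInfra endpoint's request format is not shown here); the image path is hypothetical.

```python
# Illustrative sketch of CLIP image features, computed locally rather than
# through the DeepInfra endpoint.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "openai/clip-vit-large-patch14"
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # hypothetical local file
inputs = processor(images=image, return_tensors="pt")

# Projected image embedding; for ViT-L/14 this is a 768-dimensional vector.
features = model.get_image_features(**inputs)
print(features.shape)  # torch.Size([1, 768])
```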

openai/clip-vit-base-patch32
$0.0005 / sec · zero-shot-image-classification

The CLIP model was developed by OpenAI to investigate the robustness of computer vision models. It uses a Vision Transformer architecture and was trained on a large dataset of image-caption pairs. The model shows promise in various computer vision tasks but also has limitations, including difficulties with fine-grained classification and potential biases in certain applications.
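
A minimal local sketch of zero-shot image classification with this checkpoint via the Transformers pipeline; the image path and candidate labels are made up.

```python
# Minimal sketch: zero-shot image classification with CLIP, run locally.
from transformers import pipeline

classifier = pipeline("zero-shot-image-classification", model="openai/clip-vit-base-patch32")

result = classifier(
    "example.jpg",  # hypothetical local image file
    candidate_labels=["a photo of a cat", "a photo of a dog", "a photo of a car"],
)
print(result)  # list of {"label": ..., "score": ...} entries, highest score first
```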

openai/clip-vit-large-patch14-336
$0.0005 / sec · zero-shot-image-classification

A zero-shot image classification model released by OpenAI. The model card for clip-vit-large-patch14-336 does not document its training data, evaluation results, intended uses, or limitations, and the training procedure's optimizer and precision are unspecified; the reported framework versions are Transformers 4.21.3, TensorFlow 2.8.2, and Tokenizers 0.12.1.

openai/whisper-base
$0.0005 / sec · automatic-speech-recognition

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680k hours of labelled data and demonstrates a strong ability to generalize to many datasets and domains without fine-tuning. The model is based on a Transformer encoder-decoder architecture. Whisper models are available for various languages including English, Spanish, French, German, Italian, Portuguese, Russian, Chinese, Japanese, Korean, and many more.
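
A minimal local sketch of transcription with this checkpoint via the Transformers pipeline; the audio file path is made up.

```python
# Minimal sketch: speech-to-text with Whisper, run locally.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")

result = asr("meeting.wav")  # hypothetical local audio file
print(result["text"])
```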

openai/whisper-base.en
$0.0005 / sec · automatic-speech-recognition

Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. It was trained on 680k hours of labelled data and demonstrates a strong ability to generalize to many datasets and domains without fine-tuning. Whisper checkpoints are available in five configurations of varying model sizes, with the smaller configurations trained on English-only or multilingual data and the largest trained on multilingual data only. This one is English-only.