We use essential cookies to make our site work. With your consent, we may also use non-essential cookies to improve user experience and analyze website traffic…

Browse deepinfra models:

All categories and models you can try out and directly use in deepinfra:

Viewing all

featured

text-generation

automatic-speech-recognition

text-to-speech

embeddings

text-to-video

text-to-image

reranker

zero-shot-image-classification

multimodal

Category/embeddings

Embeddings are a crucial concept in AI and NLP, used to represent categorical data as continuous vectors in a high-dimensional space. They are particularly useful for working with text data, capturing the semantic meanings, syntax, and context of words and phrases. By converting sparse one-hot encoded vectors into dense vectors, embeddings help machine learning models learn meaningful relationships between categories.

Embeddings are not limited to text data; they can also be used for other types of categorical data such as images, audio, and video. Using embeddings for these data types captures relationships between different objects or sounds, improving machine learning models' performance.

Incorporating embeddings into AI models can significantly enhance their accuracy, making them an essential tool for data scientists and AI practitioners. By understanding embeddings, you can unlock new possibilities in your AI projects and create more sophisticated and accurate models.

512

$0.005 / Mtoken

BAAI/

bge-base-en-v1.5

embeddings

BGE embedding is a general Embedding Model. It is pre-trained using retromae and trained on large-scale pair data using contrastive learning. Note that the goal of pre-training is to reconstruct the text, and the pre-trained model cannot be used for similarity calculation directly, it needs to be fine-tuned

$0.010 / Mtoken

BAAI/

bge-en-icl

embeddings

A LLM-based embedding model with in-context learning capabilities that achieves SOTA performance on BEIR and AIR-Bench. It leverages few-shot examples to enhance task performance.

512

$0.010 / Mtoken

BAAI/

bge-large-en-v1.5

embeddings

fp32

$0.010 / Mtoken

BAAI/

bge-m3

embeddings

BGE-M3 is a versatile text embedding model that supports multi-functionality, multi-linguality, and multi-granularity, allowing it to perform dense retrieval, multi-vector retrieval, and sparse retrieval in over 100 languages and with input sizes up to 8192 tokens. The model can be used in a retrieval pipeline with hybrid retrieval and re-ranking to achieve higher accuracy and stronger generalization capabilities. BGE-M3 has shown state-of-the-art performance on several benchmarks, including MKQA, MLDR, and NarritiveQA, and can be used as a drop-in replacement for other embedding models like DPR and BGE-v1.5.

$0.010 / Mtoken

BAAI/

bge-m3-multi

embeddings

BGE-M3 is a multilingual text embedding model developed by BAAI, distinguished by its Multi-Linguality (supporting 100+ languages), Multi-Functionality (unified dense, multi-vector, and sparse retrieval), and Multi-Granularity (handling inputs from short queries to 8192-token documents). It achieves state-of-the-art retrieval performance across diverse benchmarks while maintaining a single model for multiple retrieval modes.

32k

$0.002 / Mtoken

Qwen/

Qwen3-Embedding-0.6B

embeddings

The Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. Building upon the dense foundational models of the Qwen3 series, it provides a comprehensive range of text embeddings and reranking models in various sizes (0.6B, 4B, and 8B).

32k

$0.005 / Mtoken

Qwen/

Qwen3-Embedding-4B

embeddings

32k

$0.010 / Mtoken

Qwen/

Qwen3-Embedding-8B

embeddings

512

$0.005 / Mtoken

intfloat/

e5-base-v2

embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Model has 24 layers and 1024 out dim.

512

$0.010 / Mtoken

intfloat/

e5-large-v2

embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training. Model has 24 layers and 1024 out dim.

fp32

512

$0.010 / Mtoken

intfloat/

multilingual-e5-large

embeddings

The Multilingual-E5-large model is a 24-layer text embedding model with an embedding size of 1024, trained on a mixture of multilingual datasets and supporting 100 languages.

512

$0.0005 / sec

intfloat/

multilingual-e5-large-instruct

embeddings

The Multilingual-E5 models, initialized from XLM-RoBERTa, support up to 512 tokens per input — any longer text will be silently truncated. To ensure optimal performance, always prefix inputs with “query:” or “passage:”, as the model was explicitly trained with this format.

512

$0.005 / Mtoken

sentence-transformers/

all-MiniLM-L12-v2

embeddings

We present a sentence transformation model that generates semantically similar sentences. Our model is based on the Sentence-Transformers architecture and was trained on a large dataset of sentence pairs. We evaluate the effectiveness of our model by measuring its ability to generate similar sentences that are close to the original sentence in meaning.

512

$0.005 / Mtoken

sentence-transformers/

all-MiniLM-L6-v2

embeddings

We present a sentence transformation model that achieves state-of-the-art results on various NLP tasks without requiring task-specific architectures or fine-tuning. Our approach leverages contrastive learning and utilizes a variety of datasets to learn robust sentence representations. We evaluate our model on several benchmarks and demonstrate its effectiveness in various applications such as text classification, sentiment analysis, named entity recognition, and question answering.

512

$0.005 / Mtoken

sentence-transformers/

all-mpnet-base-v2

embeddings

A sentence transformation model that has been trained on a wide range of datasets, including but not limited to S2ORC, WikiAnwers, PAQ, Stack Exchange, and Yahoo! Answers. Our model can be used for various NLP tasks such as clustering, sentiment analysis, and question answering.

$0.005 / Mtoken

sentence-transformers/

clip-ViT-B-32

embeddings

The CLIP model maps text and images to a shared vector space, enabling various applications such as image search, zero-shot image classification, and image clustering. The model can be used easily after installation, and its performance is demonstrated through zero-shot ImageNet validation set accuracy scores. Multilingual versions of the model are also available for 50+ languages.

512

$0.005 / Mtoken

sentence-transformers/

clip-ViT-B-32-multilingual-v1

embeddings

This model is a multilingual version of the OpenAI CLIP-ViT-B32 model, which maps text and images to a common dense vector space. It includes a text embedding model that works for 50+ languages and an image encoder from CLIP. The model was trained using Multilingual Knowledge Distillation, where a multilingual DistilBERT model was trained as a student model to align the vector space of the original CLIP image encoder across many languages.

512

$0.005 / Mtoken

sentence-transformers/

multi-qa-mpnet-base-dot-v1

embeddings

We present a sentence transformation model that maps sentences and paragraphs to a 768-dimensional dense vector space, suitable for semantic search tasks. The model is trained on 215 million question-answer pairs from various sources, including WikiAnswers, PAQ, Stack Exchange, MS MARCO, GOOAQ, Amazon QA, Yahoo Answers, Search QA, ELI5, and Natural Questions. Our model uses a contrastive learning objective.

Unlock the most affordable AI hosting

Run models at scale with our fully managed GPU infrastructure, delivering enterprise-grade uptime at the industry's best rates.

Contact Sales Get Started

Latest Models

Gryphe/

MythoMax-L2-13b

Phind/

Phind-CodeLlama-34B-v2

bigcode/

starcoder2-15b

openchat/

openchat_3.5

openai/

whisper-tiny

Featured Models

zai-org/

GLM-4.5-Air

meta-llama/

Llama-4-Maverick-17B-128E-Instruct-Turbo

mistralai/

Mistral-Small-3.2-24B-Instruct-2506

moonshotai/

Kimi-K2-Instruct

mistralai/

Voxtral-Small-24B-2507

deepseek-ai/

DeepSeek-V3-0324-Turbo

Company

Pricing

Docs

Compare

DeepStart

About

Careers

Trust Center

Privacy

Terms

Have questions or need a custom solution?

Contact Sales