Browse DeepInfra models:

All categories and models that you can try out and use directly on DeepInfra:

sentence-transformers/all-mpnet-base-v2
512
$0.005 / Mtoken
  • embeddings

A sentence-transformers model trained on a wide range of datasets, including S2ORC, WikiAnswers, PAQ, Stack Exchange, and Yahoo! Answers. It can be used for NLP tasks such as clustering, sentiment analysis, and question answering.

sentence-transformers/clip-ViT-B-32
512
$0.005 / Mtoken
  • embeddings

The CLIP model maps text and images to a shared vector space, enabling applications such as image search, zero-shot image classification, and image clustering. The model is easy to use once installed, and its quality is reported as zero-shot accuracy on the ImageNet validation set. Multilingual versions of the model are also available for 50+ languages.
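To illustrate how a shared vector space enables zero-shot image classification, here is a minimal sketch. It assumes the image and each candidate label text have already been embedded; the three-dimensional vectors, the label strings, and the helper names `cosine` and `zero_shot_label` are hypothetical and stand in for real model output.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_label(image_vec, label_vecs):
    """Return the label whose text embedding is closest to the image embedding."""
    return max(label_vecs, key=lambda label: cosine(image_vec, label_vecs[label]))

# Hypothetical toy embeddings; real embeddings would come from the model.
image_vec = [0.9, 0.1, 0.0]
label_vecs = {
    "a photo of a dog": [1.0, 0.0, 0.0],
    "a photo of a cat": [0.0, 1.0, 0.0],
    "a photo of a car": [0.0, 0.0, 1.0],
}

best = zero_shot_label(image_vec, label_vecs)
print(best)
```

No classifier is trained here: adding a new class only requires embedding one more label text.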

sentence-transformers/clip-ViT-B-32-multilingual-v1
512
$0.005 / Mtoken
  • embeddings

This model is a multilingual version of the OpenAI CLIP-ViT-B32 model, which maps text and images to a common dense vector space. It includes a text embedding model that works for 50+ languages and an image encoder from CLIP. The model was trained using Multilingual Knowledge Distillation, where a multilingual DistilBERT model was trained as a student model to align the vector space of the original CLIP image encoder across many languages.

sentence-transformers/multi-qa-mpnet-base-dot-v1
512
$0.005 / Mtoken
  • embeddings

We present a sentence-transformers model that maps sentences and paragraphs to a 768-dimensional dense vector space and is designed for semantic search. The model was trained with a contrastive learning objective on 215 million question-answer pairs from various sources, including WikiAnswers, PAQ, Stack Exchange, MS MARCO, GOOAQ, Amazon QA, Yahoo Answers, Search QA, ELI5, and Natural Questions.
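As a rough sketch of how such embeddings could be used for semantic search, the snippet below ranks passages by their dot product with a query vector. Dot-product scoring is an assumption suggested by the "dot" in the model name; the three-dimensional toy vectors stand in for real 768-dimensional embeddings, and `rank_passages` is a hypothetical helper.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_passages(query_vec, passage_vecs):
    """Return passage indices ordered from highest to lowest dot-product score."""
    return sorted(
        range(len(passage_vecs)),
        key=lambda i: dot(query_vec, passage_vecs[i]),
        reverse=True,
    )

# Hypothetical toy embeddings; real ones would have 768 dimensions.
query_vec = [1.0, 2.0, 0.0]
passage_vecs = [
    [0.5, 0.5, 0.0],  # score 1.5
    [1.0, 2.0, 1.0],  # score 5.0
    [0.0, 0.0, 3.0],  # score 0.0
]

ranking = rank_passages(query_vec, passage_vecs)
print(ranking)
```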

sentence-transformers/paraphrase-MiniLM-L6-v2
512
$0.005 / Mtoken
  • embeddings

We present a sentence similarity model based on the Sentence Transformers architecture, which maps sentences to a 384-dimensional dense vector space. The model uses a pre-trained BERT encoder and applies mean pooling on top of the contextualized word embeddings to obtain sentence embeddings. We evaluate the model on the Sentence Embeddings Benchmark.
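The mean-pooling step can be sketched as follows. This is an illustration under stated assumptions, not the model's actual implementation: the attention mask (1 for real tokens, 0 for padding) is an assumption not mentioned above, and the two-dimensional token vectors stand in for real 384-dimensional ones.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is non-zero."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                totals[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [total / count for total in totals]

# Hypothetical toy input: two real tokens followed by one padding token.
token_embeddings = [[1.0, 3.0], [3.0, 5.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]

sentence_embedding = mean_pool(token_embeddings, attention_mask)
print(sentence_embedding)
```

Masking matters because padded positions would otherwise pull the average toward meaningless values.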

shibing624/text2vec-base-chinese
512
$0.005 / Mtoken
  • embeddings

A sentence similarity model that can be used for various NLP tasks such as text classification, sentiment analysis, named entity recognition, question answering, and more. It utilizes the CoSENT architecture, which consists of a transformer encoder and a pooling module, to encode input texts into vectors that capture their semantic meaning. The model was trained on the nli_zh dataset and achieved high performance on various benchmark datasets.

stabilityai/stable-diffusion-2-1
$0.0005 / sec
  • text-to-image

Stable Diffusion is a latent text-to-image diffusion model. It generates realistic images from a text description.

thenlper/gte-base
512
$0.005 / Mtoken
  • embeddings

The GTE models are trained by Alibaba DAMO Academy. They are mainly based on the BERT framework and currently offer three different sizes of models, including GTE-large, GTE-base, and GTE-small. The GTE models are trained on a large-scale corpus of relevance text pairs, covering a wide range of domains and scenarios. This enables the GTE models to be applied to various downstream tasks of text embeddings, including information retrieval, semantic textual similarity, text reranking, etc.

thenlper/gte-large
512
$0.010 / Mtoken
  • embeddings

The GTE models are trained by Alibaba DAMO Academy. They are mainly based on the BERT framework and currently offer three different sizes of models, including GTE-large, GTE-base, and GTE-small. The GTE models are trained on a large-scale corpus of relevance text pairs, covering a wide range of domains and scenarios. This enables the GTE models to be applied to various downstream tasks of text embeddings, including information retrieval, semantic textual similarity, text reranking, etc.
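For a sense of what the listed prices mean in practice, the following sketch estimates embedding cost for the two GTE models in this listing. It assumes "Mtoken" means one million tokens and that billing is linear in token count; the 250-million-token workload and the `embedding_cost` helper are hypothetical.

```python
from decimal import Decimal

# Prices in USD per million tokens, taken from the listing above.
PRICE_PER_MTOKEN = {
    "thenlper/gte-base": Decimal("0.005"),
    "thenlper/gte-large": Decimal("0.010"),
}

def embedding_cost(model, tokens):
    """Estimated USD cost of embedding `tokens` tokens with `model`."""
    return PRICE_PER_MTOKEN[model] * Decimal(tokens) / Decimal(1_000_000)

# Hypothetical workload: 250 million tokens.
tokens = 250_000_000
for model in PRICE_PER_MTOKEN:
    print(model, embedding_cost(model, tokens))
```

`Decimal` is used instead of floats so that small per-token prices do not accumulate rounding error.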

uwulewd/custom-diffusion
$0.0005 / sec
  • text-to-image

Stable Diffusion with the ability to change checkpoints; still a work in progress.