Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

sentence-transformers/paraphrase-MiniLM-L6-v2
512 max input tokens
$0.005 / Mtoken
  • embeddings

We present a sentence similarity model based on the Sentence Transformers architecture, which maps sentences to a 384-dimensional dense vector space. The model uses a pre-trained BERT encoder and applies mean pooling on top of the contextualized word embeddings to obtain sentence embeddings. We evaluate the model on the Sentence Embeddings Benchmark.
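As a quick illustration, the model can be queried over DeepInfra's HTTP inference API. This is a minimal sketch, assuming the `https://api.deepinfra.com/v1/inference/<model>` URL pattern, bearer-token auth via a `DEEPINFRA_TOKEN` environment variable, and an `inputs`/`embeddings` request/response shape; check the model's API page for the exact schema.

```python
# Minimal sketch: request 384-dimensional sentence embeddings from DeepInfra.
# The {"inputs": [...]} -> {"embeddings": [...]} schema is an assumption;
# verify it against the model's API documentation.
import os
import requests

API_URL = "https://api.deepinfra.com/v1/inference/sentence-transformers/paraphrase-MiniLM-L6-v2"
TOKEN = os.environ["DEEPINFRA_TOKEN"]  # placeholder: your DeepInfra API token

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"inputs": ["A man is eating food.", "A man is eating a piece of bread."]},
)
resp.raise_for_status()
embeddings = resp.json()["embeddings"]  # one 384-dim vector per input sentence
print(len(embeddings), len(embeddings[0]))
```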

shibing624/text2vec-base-chinese
512 max input tokens
$0.005 / Mtoken
  • embeddings

A sentence similarity model that can be used for various NLP tasks such as text classification, sentiment analysis, named entity recognition, question answering, and more. It utilizes the CoSENT architecture, which consists of a transformer encoder and a pooling module, to encode input texts into vectors that capture their semantic meaning. The model was trained on the nli_zh dataset and achieved high performance on various benchmark datasets.

smanjil/German-MedBERT
$0.0005 / sec
  • fill-mask

A German BERT model fine-tuned for the medical domain, achieving improved performance on the NTS-ICD-10 text classification task. The model was trained with PyTorch and the Hugging Face library on a Colab GPU, using standard parameter settings and up to 25 epochs for classification. Evaluation shows a significant improvement in micro precision, recall, and F1 score over the base German BERT model.
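A fill-mask model like this one can be exercised the same way. The sketch below reuses the assumed inference endpoint pattern; the `{"input": ...}` payload with a `[MASK]` token and the response format are also assumptions, so consult the model's API tab before relying on them.

```python
# Minimal sketch: ask German-MedBERT to fill a masked token.
# The {"input": ...} payload and response field names are assumptions;
# check the model's API documentation for the exact schema.
import os
import requests

API_URL = "https://api.deepinfra.com/v1/inference/smanjil/German-MedBERT"
TOKEN = os.environ["DEEPINFRA_TOKEN"]  # placeholder: your DeepInfra API token

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"input": "Der Patient klagt über starke [MASK] im Rücken."},
)
resp.raise_for_status()
print(resp.json())  # typically a ranked list of candidate tokens with scores
```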

stabilityai/stable-diffusion-2-1
$0.0005 / sec
  • text-to-image

Stable Diffusion is a latent text-to-image diffusion model that generates realistic images from a text description.
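Image models can be called through the same inference endpoint. This sketch assumes a `{"prompt": ...}` payload and a base64-encoded image in an `images` field of the response; both are assumptions to verify against the model's API page.

```python
# Minimal sketch: generate an image with Stable Diffusion 2.1 on DeepInfra.
# The {"prompt": ...} payload and the base64 "images" response field are
# assumptions; check the model's API page for the exact schema.
import base64
import os
import requests

API_URL = "https://api.deepinfra.com/v1/inference/stabilityai/stable-diffusion-2-1"
TOKEN = os.environ["DEEPINFRA_TOKEN"]  # placeholder: your DeepInfra API token

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"prompt": "a watercolor painting of a lighthouse at sunset"},
)
resp.raise_for_status()
image_data = resp.json()["images"][0]   # assumed: data URL or raw base64 string
b64 = image_data.split(",", 1)[-1]      # strip a "data:image/png;base64," prefix if present
with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(b64))
```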

sultan/BioM-ELECTRA-Large-SQuAD2
$0.0005 / sec
  • question-answering

We fine-tuned BioM-ELECTRA-Large, which was pre-trained on PubMed abstracts, on the SQuAD2.0 dataset. Fine-tuning the biomedical language model on SQuAD helps improve its score on the BioASQ challenge; if you plan to work on BioASQ or other biomedical QA tasks, this model is a better choice than BioM-ELECTRA-Large. The TensorFlow version of this model took the lead in the BioASQ9b-Factoid challenge (Batch 5) under the name UDEL-LAB2.
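For extractive question answering, a SQuAD-style request is the natural shape. The sketch below assumes `question` and `context` fields in the payload; the exact field names and response format should be confirmed on the model's API page.

```python
# Minimal sketch: extractive QA with BioM-ELECTRA-Large-SQuAD2.
# The SQuAD-style {"question": ..., "context": ...} payload is an assumption;
# verify field names against the model's API documentation.
import os
import requests

API_URL = "https://api.deepinfra.com/v1/inference/sultan/BioM-ELECTRA-Large-SQuAD2"
TOKEN = os.environ["DEEPINFRA_TOKEN"]  # placeholder: your DeepInfra API token

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "question": "What does metformin treat?",
        "context": "Metformin is a first-line medication for the treatment of type 2 diabetes.",
    },
)
resp.raise_for_status()
print(resp.json())  # typically includes the extracted answer span and a score
```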

thenlper/gte-base
512 max input tokens
$0.005 / Mtoken
  • embeddings

The GTE models are trained by Alibaba DAMO Academy. They are mainly based on the BERT framework and currently come in three sizes: GTE-large, GTE-base, and GTE-small. The GTE models are trained on a large-scale corpus of relevance text pairs covering a wide range of domains and scenarios, which enables them to be applied to various downstream text-embedding tasks, including information retrieval, semantic textual similarity, and text reranking.
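Since the GTE models target retrieval and semantic textual similarity, a typical pattern is to embed a query and a few candidate passages and rank the passages by cosine similarity. The sketch below reuses the assumed `inputs`/`embeddings` schema from the earlier embeddings example and does the ranking in plain Python.

```python
# Minimal sketch: rank passages against a query with thenlper/gte-base embeddings.
# Reuses the assumed {"inputs": [...]} -> {"embeddings": [...]} schema; verify it
# against the model's API page before relying on it.
import math
import os
import requests

API_URL = "https://api.deepinfra.com/v1/inference/thenlper/gte-base"
TOKEN = os.environ["DEEPINFRA_TOKEN"]  # placeholder: your DeepInfra API token

query = "how do I reset my password?"
passages = [
    "To change your password, open account settings and choose 'Reset password'.",
    "Our office is closed on public holidays.",
]

resp = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"inputs": [query] + passages},
)
resp.raise_for_status()
vectors = resp.json()["embeddings"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

q_vec, passage_vecs = vectors[0], vectors[1:]
ranked = sorted(zip(passages, (cosine(q_vec, v) for v in passage_vecs)),
                key=lambda pair: pair[1], reverse=True)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```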

thenlper/gte-large
512 max input tokens
$0.010 / Mtoken
  • embeddings

The GTE models are trained by Alibaba DAMO Academy. They are mainly based on the BERT framework and currently come in three sizes: GTE-large, GTE-base, and GTE-small. The GTE models are trained on a large-scale corpus of relevance text pairs covering a wide range of domains and scenarios, which enables them to be applied to various downstream text-embedding tasks, including information retrieval, semantic textual similarity, and text reranking.

uer/albert-base-chinese-cluecorpussmall
$0.0005 / sec
  • fill-mask

We present a Chinese version of the popular ALBERT language model, trained on the CLUECorpusSmall dataset using the UER-py toolkit. The model achieves state-of-the-art results on various NLP tasks and is trained in two stages, first with a sequence length of 128 and then with a sequence length of 512.

uwulewd/custom-diffusion
$0.0005 / sec
  • text-to-image

Stable Diffusion with the ability to switch checkpoints; still a work in progress.

xlm-roberta-base
$0.0005 / sec
  • fill-mask

The XLM-RoBERTa model is a multilingual version of RoBERTa, pre-trained on 2.5TB of filtered CommonCrawl data containing 100 languages. It was introduced in the paper "Unsupervised Cross-lingual Representation Learning at Scale" by Conneau et al. and first released in this repository. The model learns an inner representation of 100 languages that can be used to extract features useful for downstream tasks.