Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

Jean-Baptiste/roberta-large-ner-english
$0.0005 / sec
  • token-classification

We present a fine-tuned RoBERTa model for English named entity recognition, achieving high performance on both formal and informal datasets. Our approach uses a simplified version of the CoNLL-2003 dataset and removes unnecessary prefixes for improved efficiency. The resulting model outperforms other models, especially on entities that do not begin with an uppercase letter, and can be used for applications such as email signature detection.
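
For illustration, a minimal sketch of running this checkpoint with the Hugging Face transformers pipeline (the same model ID can also be called through DeepInfra's HTTP API); the example sentence is invented:

    from transformers import pipeline

    # Token-classification (NER) pipeline; aggregation_strategy="simple" merges
    # sub-word pieces into whole entity spans.
    ner = pipeline(
        "token-classification",
        model="Jean-Baptiste/roberta-large-ner-english",
        aggregation_strategy="simple",
    )

    # Lowercase "apple" and "paris" are deliberate: the model is meant to handle
    # entities that are not capitalized.
    for entity in ner("apple is opening a new office in paris next year."):
        print(entity["entity_group"], entity["word"], round(entity["score"], 3))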

KB/bert-base-swedish-cased
$0.0005 / sec
  • fill-mask

The National Library of Sweden has released three pre-trained language models based on BERT and ALBERT for Swedish text. The models include a BERT base model, a BERT fine-tuned for named entity recognition, and an experimental ALBERT model. They were trained on approximately 15-20 GB of text data from various sources such as books, news, government publications, Swedish Wikipedia, and internet forums.

Lykon/DreamShaper
$0.0005 / sec
  • text-to-image

DreamShaper started as an open-source alternative to MidJourney. I didn't like how MJ was handled back when I started, how closed it was and still is, and the lack of freedom it gives users compared to SD. Look at all the tools we have now, from TIs to LoRA, from ControlNet to Latent Couple. We can do anything. The purpose of DreamShaper has always been to make "a better Stable Diffusion", a model capable of doing everything on its own, to weave dreams.

Phind/Phind-CodeLlama-34B-v2
4k
$0.60 / Mtoken
  • text-generation

Phind-CodeLlama-34B-v2 is an open-source language model fine-tuned on 1.5B tokens of high-quality programming-related data, achieving a pass@1 rate of 73.8% on HumanEval. It is multi-lingual and proficient in Python, C/C++, TypeScript, Java, and more. It was trained on a proprietary dataset of instruction-answer pairs rather than code completion examples, and is instruction-tuned on the Alpaca/Vicuna format to be steerable and easy to use, generating one completion per prompt.
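
A minimal request sketch, assuming DeepInfra's generic inference endpoint takes the model ID in the URL and an "input" field in the JSON body (the endpoint path, field names, and the DEEPINFRA_API_KEY variable are assumptions; check the model's API page for the exact schema). The Alpaca/Vicuna-style prompt layout follows the format mentioned above:

    import os
    import requests

    # Assumed endpoint shape; verify against the model's API documentation.
    url = "https://api.deepinfra.com/v1/inference/Phind/Phind-CodeLlama-34B-v2"

    prompt = (
        "### System Prompt\n"
        "You are an intelligent programming assistant.\n\n"
        "### User Message\n"
        "Write a Python function that checks whether a string is a palindrome.\n\n"
        "### Assistant\n"
    )

    resp = requests.post(
        url,
        headers={"Authorization": f"bearer {os.environ['DEEPINFRA_API_KEY']}"},
        json={"input": prompt, "max_new_tokens": 256},
    )
    resp.raise_for_status()
    print(resp.json())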

ProsusAI/finbert
$0.0005 / sec
  • text-classification

FinBERT is a pre-trained NLP model for financial sentiment analysis, built by fine-tuning the BERT language model on a large financial corpus. The model provides softmax outputs for three labels: positive, negative, or neutral.
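
As a quick illustration, a sketch using the transformers text-classification pipeline (the example sentence is invented):

    from transformers import pipeline

    # FinBERT returns one of three sentiment labels with a softmax score.
    classifier = pipeline("text-classification", model="ProsusAI/finbert")
    print(classifier("Quarterly revenue grew 20% year over year, beating expectations."))
    # e.g. [{'label': 'positive', 'score': ...}]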

Rostlab/prot_bert
$0.0005 / sec
  • fill-mask

A pre-trained language model developed specifically for protein sequences using a masked language modeling (MLM) objective. It achieved impressive results when fine-tuned on downstream tasks such as secondary structure prediction and sub-cellular localization. The model was trained on uppercase amino acids only with a vocabulary size of 21, and inputs take the form "[CLS] Protein Sequence A [SEP] Protein Sequence B [SEP]".
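
A minimal fill-mask sketch with the transformers pipeline; amino acids are passed as uppercase single letters separated by spaces (the short sequence below is arbitrary):

    from transformers import pipeline

    # Each space-separated uppercase amino acid is one token; the tokenizer
    # adds [CLS] and [SEP] around the sequence itself.
    unmasker = pipeline("fill-mask", model="Rostlab/prot_bert")
    preds = unmasker("M K T A Y I A K Q R Q I S F V K S [MASK] F S R Q L E E R")
    for pred in preds[:3]:
        print(pred["token_str"], round(pred["score"], 3))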

Rostlab/prot_bert_bfd
$0.0005 / sec
  • fill-mask

A pretrained language model on protein sequences using a masked language modeling objective. It achieved high scores on various downstream tasks such as secondary structure prediction and localization. The model was trained on a large corpus of protein sequences in a self-supervised fashion, without human labeling, using a BERT architecture with a vocabulary size of 21.

Salesforce/codegen-16B-mono
2k
$0.0005 / sec
  • text-generation

CodeGen is a family of autoregressive language models for program synthesis; this checkpoint was trained on a Python dataset. The models can extract features from natural language and programming language text and compute its likelihood. They are intended for, and best at, program synthesis: generating executable code from English prompts. Evaluation shows that CodeGen achieves state-of-the-art performance on two code generation benchmarks, HumanEval and MTPB.
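
A minimal local sketch with transformers (the 16B checkpoint needs substantial GPU memory; the prompt is illustrative):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "Salesforce/codegen-16B-mono"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # An English comment plus a function signature; the model completes the body.
    prompt = "# Return the n-th Fibonacci number.\ndef fibonacci(n):"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))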

XpucT/Deliberate
$0.0005 / sec
  • text-to-image

The Deliberate Model allows for the creation of anything desired, with the potential for better results as the user's knowledge and detail in the prompt increase. The model is ideal for meticulous anatomy artists, creative prompt writers, art designers, and those seeking explicit content.
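
A rough sketch of calling this model through DeepInfra's HTTP inference endpoint; the endpoint path, the "prompt" field, the response structure, and the DEEPINFRA_API_KEY variable are all assumptions to verify against the model's API page:

    import os
    import requests

    # Assumed endpoint and field names; check the API docs for the real schema.
    url = "https://api.deepinfra.com/v1/inference/XpucT/Deliberate"
    resp = requests.post(
        url,
        headers={"Authorization": f"bearer {os.environ['DEEPINFRA_API_KEY']}"},
        json={"prompt": "a detailed watercolor painting of a lighthouse at dawn"},
    )
    resp.raise_for_status()
    print(resp.json())  # inspect the response for the generated image data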

albert-base-v1
$0.0005 / sec
  • fill-mask

The ALBERT model is a transformer-based language model developed by Google researchers, designed for self-supervised learning of language representations. The model uses a combination of masked language modeling and sentence order prediction objectives, trained on a large corpus of English text data. Fine-tuning the model on specific downstream tasks can lead to improved performance, and various pre-trained versions are available for different NLP tasks.

albert-base-v2
$0.0005 / sec
  • fill-mask

The ALBERT model is a lighter variant of BERT that shares parameters across layers and factorizes the embedding matrix to reduce model size; this v2 checkpoint was trained longer and with additional data compared to v1. It was trained on a combination of BookCorpus and English Wikipedia and achieved strong results on several benchmark datasets. Fine-tuning ALBERT on specific tasks can further improve its performance.

aubmindlab/bert-base-arabertv02
$0.0005 / sec
  • fill-mask

An Arabic pretrained language model based on Google's BERT architecture, with two versions: AraBERTv1 and AraBERTv2. It uses the same BERT-Base configuration and is trained on a large dataset of 200 million words, including OSCAR-unshuffled, Arabic Wikipedia, and Assafir news articles. The model is available in TensorFlow 1.x format and in the Hugging Face models repository.

bert-base-cased
$0.0005 / sec
  • fill-mask

A transformer-based language model developed by Google Research that achieved state-of-the-art results on a wide range of NLP tasks. The model was pre-trained on a large corpus of English text, including BookCorpus and English Wikipedia, using a masked language modeling objective. Fine-tuned versions of the model are available for various downstream tasks, and the model has been shown to achieve excellent results on tasks such as question answering, sentiment analysis, and named entity recognition.

bert-base-chinese
$0.0005 / sec
  • fill-mask

A pre-trained BERT model for the Chinese language, trained with a masked language modeling (fill-mask) objective on a large corpus of Chinese text. It can be used for natural language processing tasks such as masked language modeling and has been shown to achieve state-of-the-art results on certain benchmarks. However, like other language models, it comes with risks, limitations, and biases, including perpetuating harmful stereotypes and biases present in its training data; users are advised to carefully evaluate and mitigate these risks.

bert-base-german-cased
$0.0005 / sec
  • fill-mask

A pre-trained language model developed using Google's TensorFlow code and trained on a single cloud TPU v2. The model was trained for 810k steps with a batch size of 1024 and a sequence length of 128, then fine-tuned for 30k steps with a sequence length of 512. The authors used a variety of data sources, including German Wikipedia, OpenLegalData, and news articles, and employed spaCy v2.1 for data cleaning and segmentation. The model achieved good performance on various downstream tasks, such as germEval18Fine, germEval18coarse, germEval14, CONLL03, and 10kGNAD, without extensive hyperparameter tuning. Additionally, the authors found that even a randomly initialized BERT can achieve good performance when trained exclusively on labeled downstream datasets.

bert-base-multilingual-cased
$0.0005 / sec
  • fill-mask

A pre-trained multilingual model that uses a masked language modeling objective to learn a bidirectional representation of languages. It was trained on 104 languages with the largest Wikipedias, and its inputs are in the form of [CLS] Sentence A [SEP] Sentence B [SEP]. The model is primarily aimed at being fine-tuned on tasks that use the whole sentence, potentially masked, to make decisions.
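
For illustration, a small sketch showing how the tokenizer produces the [CLS] Sentence A [SEP] Sentence B [SEP] layout described above (the sentence pair is arbitrary):

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
    # Passing a sentence pair yields [CLS] ... [SEP] ... [SEP].
    encoding = tokenizer("How are you?", "¿Cómo estás?")
    print(tokenizer.convert_ids_to_tokens(encoding["input_ids"]))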

bert-base-multilingual-uncased
$0.0005 / sec
  • fill-mask

A transformer-based language model trained on the 102 languages with the largest Wikipedias. It was introduced in a research paper by Google Research and has been widely used for various natural language processing tasks. The model is trained using a masked language modeling objective, where 15% of the tokens are masked and the model predicts the missing tokens.

bert-base-uncased
$0.0005 / sec
  • fill-mask

A transformers model pretrained on a large corpus of English data in a self-supervised fashion. It was trained on BookCorpus, a dataset consisting of 11,038 unpublished books, and English Wikipedia, excluding lists, tables, and headers. The model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks.
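
As a quick illustration, a fill-mask sketch with the transformers pipeline (the sentence is invented):

    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="bert-base-uncased")
    # The model scores candidate tokens for the [MASK] position.
    for pred in unmasker("Paris is the [MASK] of France."):
        print(pred["token_str"], round(pred["score"], 3))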