Browse deepinfra models:

All the categories and models you can try out and use directly on deepinfra:

Category: question-answering

bert-large-uncased-whole-word-masking-finetuned-squad
$0.0005 / sec
  • question-answering

bert-large-uncased-whole-word-masking-finetuned-squad is a transformer-based language model pretrained on a large corpus of English data with a whole-word-masking variant of the masked language modeling objective: 15% of the tokens in a sentence are randomly masked (with all sub-word pieces of a chosen word masked together), and the model has to predict the missing tokens. It was then fine-tuned on the SQuAD dataset for question answering, achieving high scores on both F1 and exact match metrics.
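
For a quick try, here is a minimal sketch of calling a question-answering model like this one through deepinfra's HTTP inference API. The endpoint path, the question/context field names, and the response shape are assumptions based on the standard extractive-QA interface; check the model's API page for the exact schema and use your own API token.

```python
import os
import requests

# Assumed endpoint pattern for deepinfra's inference API; verify it on the model page.
MODEL = "bert-large-uncased-whole-word-masking-finetuned-squad"
URL = f"https://api.deepinfra.com/v1/inference/{MODEL}"

payload = {
    # Extractive QA models take a question plus the passage to search for the answer.
    "question": "What objective was the model pretrained with?",
    "context": (
        "The model was pretrained with a masked language modeling objective "
        "and later fine-tuned on SQuAD for question answering."
    ),
}

response = requests.post(
    URL,
    json=payload,
    headers={"Authorization": f"bearer {os.environ['DEEPINFRA_API_TOKEN']}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # expected to contain the answer span and a confidence score
```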

csarron/bert-base-uncased-squad-v1
$0.0005 / sec
  • question-answering

We present a fine-tuned BERT-base uncased model for question answering on the SQuAD v1 dataset. Our model achieves an exact match score of 80.9104 and an F1 score of 88.2302 without any hyperparameter search.

deepset/bert-large-uncased-whole-word-masking-squad2
$0.0005 / sec
  • question-answering

We present a BERT-based language model called bert-large-uncased-whole-word-masking-squad2, trained on the SQuAD2.0 dataset for extractive question answering. The model achieves high scores on exact match and F1 metrics.

deepset/minilm-uncased-squad2
$0.0005 / sec
  • question-answering

Microsoft's MiniLM-L12-H384-uncased language model, fine-tuned for question answering on SQuAD 2.0, achieves strong results with exact match and F1 scores of 76.13% and 79.54%, respectively. The model was trained on the SQuAD 2.0 dataset with a batch size of 12, a learning rate of 4e-5, and 4 epochs. The authors suggest using the model as a compact starting point for downstream NLP tasks.
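
As a rough illustration of the hyperparameters listed above (batch size 12, learning rate 4e-5, 4 epochs), the matching Hugging Face TrainingArguments would look like the fragment below. This is only a configuration sketch, not deepset's actual training script; the SQuAD preprocessing and training loop live in the transformers question-answering examples.

```python
from transformers import TrainingArguments

# Configuration fragment mirroring the reported fine-tuning setup.
# The output directory name is an illustrative placeholder.
training_args = TrainingArguments(
    output_dir="minilm-uncased-squad2",
    per_device_train_batch_size=12,
    learning_rate=4e-5,
    num_train_epochs=4,
)
print(training_args.learning_rate, training_args.num_train_epochs)
```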

deepset/roberta-base-squad2
$0.0005 / sec
  • question-answering

A pre-trained language model based on RoBERTa, fine-tuned on the SQuAD2.0 dataset for extractive question answering. It achieves 79.87% exact match and 82.91% F1 on the SQuAD2.0 dev set. Deepset, the company behind the open-source NLP framework Haystack, also offers related resources such as a distilled roberta-base-squad2, German BERT, and GermanQuAD datasets and models.
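
The quickest way to try an extractive QA model from this list locally is the Hugging Face transformers pipeline; the snippet below loads deepset/roberta-base-squad2, but any model name on this page can be swapped in. The question and context strings are only illustrative.

```python
from transformers import pipeline

# Downloads the model from the Hugging Face Hub on first use.
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

result = qa(
    question="Which framework is deepset known for?",
    context="Deepset is the company behind the open-source NLP framework Haystack.",
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'Haystack'}
```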

deepset/roberta-base-squad2-covid
$0.0005 / sec
  • question-answering

We present roberta-base-squad2-covid, a RoBERTa-based question answering model for extractive QA on COVID-19-related texts. The model was trained on SQuAD-style CORD-19 annotations and achieved promising results in 5-fold cross-validation.

deepset/roberta-large-squad2
$0.0005 / sec
  • question-answering

This is the roberta-large model, fine-tuned using the SQuAD2.0 dataset.

deepset/tinyroberta-squad2
$0.0005 / sec
  • question-answering

Deepset presents tinyroberta-squad2, a distilled version of their roberta-base-squad2 model that achieves similar performance while being faster. The model was trained on SQuAD 2.0 using Haystack's infrastructure on 4x Tesla V100 GPUs, and it achieves 78.69% exact match and 81.92% F1 on the SQuAD 2.0 dev set.
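
Because the model was trained with Haystack's tooling, it also drops straight into a Haystack reader node. The sketch below assumes Haystack 1.x's FARMReader API; Haystack 2.x reworked the reader interface, so treat the class and method names as version-dependent.

```python
from haystack import Document
from haystack.nodes import FARMReader  # Haystack 1.x API; differs in 2.x

reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2", use_gpu=False)

docs = [Document(content="tinyroberta-squad2 is a distilled reader trained on SQuAD 2.0.")]
prediction = reader.predict(
    query="What dataset was the reader trained on?",
    documents=docs,
    top_k=1,
)
print(prediction["answers"][0].answer)
```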

distilbert-base-cased-distilled-squad
$0.0005 / sec
  • question-answering

The DistilBERT model is a small, fast, cheap, and lightweight Transformer model trained by distilling BERT base. It has 40% fewer parameters than the original BERT model and runs 60% faster while preserving over 95% of BERT's performance. The model was fine-tuned using knowledge distillation on the SQuAD v1.1 dataset and achieved an F1 score of 87.1 on the dev set.

distilbert-base-uncased-distilled-squad
$0.0005 / sec
  • question-answering

DistilBERT is a small, fast, cheap, and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark. This model is a fine-tuned checkpoint of DistilBERT-base-uncased, trained with (a second step of) knowledge distillation on SQuAD v1.1.
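
For context on what that second distillation step optimizes, the snippet below sketches the standard temperature-scaled distillation loss applied to a student's logits against a teacher's (for extractive QA, the start- and end-position logits). The temperature, weighting, and shapes are illustrative; this is a schematic of the technique, not DistilBERT's exact training code.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student distributions.

    Both tensors have shape (batch, seq_len), e.g. the start- or end-position
    logits of an extractive QA head. The T**2 factor keeps gradient magnitudes
    comparable across temperatures.
    """
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature ** 2

# Toy example: a batch of 2 sequences of length 8.
student = torch.randn(2, 8)
teacher = torch.randn(2, 8)
print(distillation_loss(student, teacher))
```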

sultan/BioM-ELECTRA-Large-SQuAD2
$0.0005 / sec
  • question-answering

We fine-tuned BioM-ELECTRA-Large, which was pre-trained on PubMed Abstracts, on the SQuAD2.0 dataset. Fine-tuning the biomedical language model on SQuAD helps improve its scores on the BioASQ challenge, so if you plan to work on BioASQ or other biomedical QA tasks, prefer this model over the base BioM-ELECTRA-Large. The TensorFlow version of this model took the lead in the BioASQ9b-Factoid challenge (Batch 5) under the name UDEL-LAB2.