Browse DeepInfra models:

All categories and models you can try out and use directly on DeepInfra:

hfl/chinese-bert-wwm-ext
$0.0005 / sec
  • fill-mask

Chinese pre-trained BERT with Whole Word Masking, usable for downstream NLP tasks such as question answering, sentiment analysis, and named entity recognition. It follows the original BERT architecture, but during pre-training all WordPiece tokens belonging to the same Chinese word are masked together rather than independently, which improves the learned representations.
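
As a rough illustration of how a fill-mask model like this can be used, here is a minimal sketch that runs the checkpoint locally with the Hugging Face transformers pipeline; the example sentence is our own, not from the model card.

```python
# A minimal sketch (not DeepInfra-specific): running the checkpoint locally
# with the Hugging Face transformers fill-mask pipeline.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="hfl/chinese-bert-wwm-ext")

# [MASK] is the mask token used by BERT-style models; the sentence is an
# arbitrary example ("The weather today is really [MASK].").
for prediction in fill_mask("今天天气真[MASK]。"):
    print(prediction["token_str"], round(prediction["score"], 3))
```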

hfl/chinese-roberta-wwm-ext
$0.0005 / sec
  • fill-mask

A Chinese pre-trained model in the Whole Word Masking family, trained with a RoBERTa-style recipe on extended data and tailored for Chinese natural language processing tasks. Like the BERT variant above, it keeps standard subword tokenization but masks all pieces of a word together during pre-training, which strengthens its language understanding compared to masking subwords independently.

huggingface/CodeBERTa-small-v1
$0.0005 / sec
  • fill-mask

CodeBERTa is a RoBERTa-like model trained on the CodeSearchNet dataset from GitHub. Supported languages: go, java, javascript, php, python, ruby.
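
Because CodeBERTa is RoBERTa-like, its mask token is "<mask>" rather than "[MASK]". A minimal sketch, again using the transformers fill-mask pipeline, on a Python snippet of our own:

```python
# Sketch: masked-token prediction over source code. CodeBERTa is
# RoBERTa-like, so the mask token is <mask> rather than [MASK].
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="huggingface/CodeBERTa-small-v1")

code = "def greet(user): print(f'hello <mask>!')"  # arbitrary Python snippet
for prediction in fill_mask(code):
    print(prediction["token_str"], round(prediction["score"], 3))
```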

hustvl/yolos-base
$0.0005 / sec
  • object-detection

A base-sized Vision Transformer (ViT) trained for object detection with the DETR-style set prediction loss.

hustvl/yolos-small
$0.0005 / sec
  • object-detection

YOLOS is a Vision Transformer (ViT) trained for object detection using the DETR loss: a bipartite matching loss compares predicted classes and bounding boxes to the ground-truth annotations. Despite its simplicity, a base-sized YOLOS model reaches 42 AP on COCO 2017 validation, similar to DETR and to more complex frameworks such as Faster R-CNN. This small variant, fine-tuned on the COCO 2017 object detection dataset (118k/5k annotated images for training/validation), achieves 36.1 AP on the validation set.

hustvl/yolos-tiny
$0.0005 / sec
  • object-detection

The tiny YOLOS variant was pre-trained on ImageNet-1k and fine-tuned on COCO 2017 object detection (118k/5k annotated images for training/validation). Trained with a bipartite matching loss that combines standard cross-entropy for classes with a linear combination of the L1 and generalized IoU losses for bounding boxes, it achieves 28.7 AP on COCO 2017 validation while remaining far simpler than conventional detection pipelines.
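
The YOLOS checkpoints above are all exposed as object-detection models. A minimal local sketch with the transformers object-detection pipeline, using the tiny variant and a placeholder image path:

```python
# Sketch: local object detection with the tiny YOLOS checkpoint via the
# transformers object-detection pipeline. The image path is a placeholder.
from transformers import pipeline
from PIL import Image

detector = pipeline("object-detection", model="hustvl/yolos-tiny")

image = Image.open("street_scene.jpg")  # any RGB image you have on disk
for detection in detector(image):
    print(detection["label"], round(detection["score"], 2), detection["box"])
```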

intfloat/e5-base-v2
512-token max input
$0.005 / Mtoken
  • embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training (E5). The base model has 12 layers and an embedding size of 768.

intfloat/e5-large-v2
512-token max input
$0.010 / Mtoken
  • embeddings

Text Embeddings by Weakly-Supervised Contrastive Pre-training (E5). The large model has 24 layers and an embedding size of 1024.
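
The E5 model cards recommend prefixing inputs with "query: " or "passage: " and mean-pooling the token embeddings. A minimal sketch for computing embeddings locally with transformers; the two texts are our own examples:

```python
# Sketch: sentence embeddings with e5-base-v2 using transformers directly.
# Inputs are prefixed with "query: " / "passage: " and token embeddings are
# mean-pooled with the attention mask, as recommended on the E5 model cards.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("intfloat/e5-base-v2")
model = AutoModel.from_pretrained("intfloat/e5-base-v2")

texts = [
    "query: how do text embeddings work?",                       # example texts
    "passage: Text embeddings map sentences to dense vectors.",
]
batch = tokenizer(texts, padding=True, truncation=True, max_length=512,
                  return_tensors="pt")

with torch.no_grad():
    hidden = model(**batch).last_hidden_state         # (batch, seq, hidden)

mask = batch["attention_mask"].unsqueeze(-1)           # ignore padding tokens
embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # mean pooling
embeddings = F.normalize(embeddings, p=2, dim=1)            # unit length

print(embeddings.shape)                        # (2, 768) for the base model
print((embeddings[0] @ embeddings[1]).item())  # cosine similarity
```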

jackaduma/SecBERT
$0.0005 / sec
  • fill-mask

SecBERT is a pretrained language model for cyber security text, trained on a dataset of papers from various sources, including APTnotes, Stucco-Data, and CASIE. The model has its own wordpiece vocabulary, secvocab, and is available in two versions, SecBERT and SecRoBERTa. The model can improve downstream tasks such as NER, text classification, semantic understanding, and Q&A in the cyber security domain.

klue/bert-base
$0.0005 / sec
  • fill-mask

KLUE BERT base is a BERT model pre-trained on Korean text. It was developed by the KLUE (Korean Language Understanding Evaluation) benchmark team and is released under the cc-by-sa-4.0 license. The model can be used for tasks such as topic classification, semantic textual similarity, natural language inference, and named entity recognition.

meta-llama/Llama-2-13b-chat-hf
4k context
$0.22 / Mtoken
  • text-generation

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the 13B fine-tuned chat model, optimized for dialogue use cases and converted to the Hugging Face Transformers format.
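
For hosted text-generation models, one option is an OpenAI-compatible client. The sketch below assumes DeepInfra's OpenAI-compatible endpoint at https://api.deepinfra.com/v1/openai and uses a placeholder API key and prompt; check the current documentation before relying on either.

```python
# Sketch: calling the hosted chat model through an OpenAI-compatible client.
# Both the base URL and the API key below are placeholders/assumptions --
# verify the endpoint in DeepInfra's current documentation.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPINFRA_API_KEY",                # placeholder
    base_url="https://api.deepinfra.com/v1/openai",  # assumed endpoint
)

response = client.chat.completions.create(
    model="meta-llama/Llama-2-13b-chat-hf",
    messages=[{"role": "user",
               "content": "Summarize whole word masking in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```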

microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
$0.0005 / sec
  • fill-mask

PubMedBERT is a pretrained language model specifically designed for biomedical natural language processing tasks. It was trained from scratch using abstracts and full-text articles from PubMed and PubMedCentral, and achieved state-of-the-art performance on various biomedical NLP tasks.

microsoft/beit-base-patch16-224-pt22k-ft22k
$0.0005 / sec
  • image-classification

The BEiT model is a Vision Transformer (ViT) pre-trained on ImageNet-21k, a dataset of 14 million images and 21,841 classes, using a self-supervised approach. The model was fine-tuned on the same dataset and achieved state-of-the-art performance on various image classification benchmarks. The BEiT model uses relative position embeddings and mean-pools the final hidden states of the patch embeddings for classification.
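
A minimal local sketch for image classification with this checkpoint, using the transformers image-classification pipeline and a placeholder image path:

```python
# Sketch: local image classification with the BEiT checkpoint via the
# transformers image-classification pipeline. The image path is a placeholder.
from transformers import pipeline
from PIL import Image

classifier = pipeline("image-classification",
                      model="microsoft/beit-base-patch16-224-pt22k-ft22k")

image = Image.open("cat.jpg")  # any RGB image you have on disk
for prediction in classifier(image, top_k=3):
    print(prediction["label"], round(prediction["score"], 3))
```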

microsoft/codebert-base-mlm
$0.0005 / sec
  • fill-mask

CodeBERT is a pre-trained language model designed to handle both programming languages and natural language. Trained on the CodeSearchNet corpus with masked language modeling and replaced token detection objectives, it achieves state-of-the-art results on various code understanding tasks while also performing well on natural language benchmarks. This checkpoint is the masked language modeling variant, suited to the fill-mask task.

microsoft/deberta-base
$0.0005 / sec
  • fill-mask

DeBERTa is a variant of BERT that uses disentangled attention and an enhanced mask decoder to improve performance on natural language understanding (NLU) tasks. In the original paper, DeBERTa outperformed BERT and RoBERTa on most NLU tasks while being trained on only 80GB of data, with particularly strong results on SQuAD 1.1/2.0 and MNLI.

microsoft/deberta-v2-xlarge
$0.0005 / sec
  • fill-mask

DeBERTa (Decoding-Enhanced BERT with Disentangled Attention) improves upon BERT and RoBERTa using disentangled attention and enhanced mask decoding, and achieves state-of-the-art results on a range of NLU tasks. This is the V2 xlarge checkpoint.

microsoft/deberta-v3-base
$0.0005 / sec
  • fill-mask

DeBERTaV3 is an improved version of the DeBERTa model that uses ELECTRA-style pre-training with gradient-disentangled embedding sharing. The new model significantly improves performance on downstream tasks compared to DeBERTa, and achieves state-of-the-art results on SQuAD 2.0 and MNLI tasks. DeBERTaV3 has a hidden size of 768 and 86 million backbone parameters, and was trained using a vocabulary of 128K tokens.

microsoft/resnet-50
$0.0005 / sec
  • image-classification

ResNet-50 model pre-trained on ImageNet-1k at resolution 224x224 for image classification.