fill-mask
PubMedBERT is a pretrained language model specifically designed for biomedical natural language processing tasks. It was trained from scratch using abstracts and full-text articles from PubMed and PubMed Central, and achieved state-of-the-art performance on various biomedical NLP tasks.
image-classification
The BEiT model is a Vision Transformer (ViT) pre-trained on ImageNet-21k, a dataset of 14 million images and 21,841 classes, using a self-supervised approach. The model was fine-tuned on the same dataset and achieved state-of-the-art performance on various image classification benchmarks. The BEiT model uses relative position embeddings and mean-pools the final hidden states of the patch embeddings for classification.
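The mean-pooling step mentioned above can be illustrated with a minimal sketch in plain Python. It assumes the sequence of final hidden states starts with one special classification token that is skipped before averaging; the function name, the `has_cls_token` flag, and the toy vectors are hypothetical, and any normalization or classifier head that a real model applies afterwards is left out.

```python
def mean_pool_patches(hidden_states, has_cls_token=True):
    """Average the per-patch vectors into a single image-level vector.

    hidden_states: list of equal-length float lists, one per token.
    has_cls_token: assumption that position 0 is a special token, not a patch.
    """
    patches = hidden_states[1:] if has_cls_token else hidden_states
    if not patches:
        raise ValueError("no patch tokens to pool")
    dim = len(patches[0])
    # Column-wise mean over all patch vectors.
    return [sum(vec[i] for vec in patches) / len(patches) for i in range(dim)]


# Toy input: one special token followed by three patch vectors of size 2.
toy_states = [[9.0, 9.0], [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
pooled = mean_pool_patches(toy_states)
print(pooled)
```

The pooled vector has the same width as each patch vector and would be the input to a classification layer.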
fill-mask
CodeBERT is a pre-trained language model designed to handle both programming languages and natural language. Trained with a hybrid objective that combines masked language modeling and replaced token detection, CodeBERT achieves state-of-the-art results on various code understanding tasks while also performing well on natural language processing benchmarks. We analyze the effects of different design choices and provide insights into the behavior of CodeBERT, demonstrating its potential as a versatile tool for a wide range of applications involving both coding and natural language understanding.
fill-mask
DeBERTa is a variant of BERT that uses disentangled attention and an enhanced mask decoder to improve performance on natural language understanding (NLU) tasks. In a study, DeBERTa outperformed BERT and RoBERTa on most NLU tasks with only 80GB of training data. The model showed particularly strong results on the SQuAD 1.1/2.0 and MNLI tasks.
fill-mask
DeBERTa (Decoding-enhanced BERT with Disentangled Attention) is a language model that improves upon BERT and RoBERTa using disentangled attention and an enhanced mask decoder. It achieves state-of-the-art results on various NLU tasks while requiring fewer computational resources than its predecessors.
fill-mask
DeBERTaV3 is an improved version of the DeBERTa model that uses ELECTRA-style pre-training with gradient-disentangled embedding sharing. The new model significantly improves performance on downstream tasks compared to DeBERTa, and achieves state-of-the-art results on SQuAD 2.0 and MNLI tasks. DeBERTaV3 has a hidden size of 768 and 86 million backbone parameters, and was trained using a vocabulary of 128K tokens.
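ELECTRA-style pre-training trains a discriminator to decide, token by token, whether the input still matches the original text or was swapped by a generator. The sketch below only shows how those per-token targets could be built. The generator is not modelled (the corrupted sentence is hand-written), gradient-disentangled embedding sharing is not shown, and the function name is hypothetical.

```python
def replaced_token_labels(original_tokens, corrupted_tokens):
    """Return 1 where a token differs from the original, 0 where it is unchanged."""
    if len(original_tokens) != len(corrupted_tokens):
        raise ValueError("sequences must be aligned token by token")
    return [int(orig != seen) for orig, seen in zip(original_tokens, corrupted_tokens)]


original = ["the", "chef", "cooked", "the", "meal"]
# Hand-written stand-in for a generator's output: one token was swapped.
corrupted = ["the", "chef", "ate", "the", "meal"]
labels = replaced_token_labels(original, corrupted)
print(labels)
```

A token that the generator happens to reproduce exactly is labeled as original, which is why the comparison is against the original sequence rather than against the mask positions.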
image-classification
ResNet model pre-trained on ImageNet-1k at resolution 224x224 for image classification.
text-generation
The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruction fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets.
text-generation
Mixtral-8x22B is the latest and largest mixture-of-experts (MoE) large language model (LLM) from Mistral AI. It is a state-of-the-art machine learning model that uses a mixture of 8 experts, each a 22B model. During inference, 2 experts are selected. This architecture allows large models to be fast and cheap at inference. This model is not instruction tuned.
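Selecting 2 of 8 experts at inference can be illustrated with a toy routing sketch. Everything here is an assumption made for illustration: the experts are simple scaling functions rather than neural networks, the router scores are hard-coded, and weighting the selected experts by a softmax over their scores is one common choice, not a confirmed detail of this model.

```python
import math

NUM_EXPERTS = 8  # from the description above
TOP_K = 2        # experts selected per token, from the description above


def top_k_route(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = router_logits[chosen[0]]
    # Softmax restricted to the chosen experts (shifted by the peak for stability).
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


def moe_forward(token, experts, router_logits):
    """Run only the selected experts and mix their outputs by routing weight."""
    mixed = [0.0] * len(token)
    for index, weight in top_k_route(router_logits):
        expert_out = experts[index](token)
        mixed = [m + weight * o for m, o in zip(mixed, expert_out)]
    return mixed


# Toy experts: expert number s (1-based) multiplies its input by s.
experts = [lambda x, s=s: [s * v for v in x] for s in range(1, NUM_EXPERTS + 1)]
router_logits = [0.0, 3.0, -1.0, 1.0, 0.5, -2.0, 0.2, 0.1]  # hypothetical scores
routes = top_k_route(router_logits)
output = moe_forward([1.0, 2.0], experts, router_logits)
print(routes, output)
```

Only two expert functions are called per token, which is where the inference savings come from: the other six contribute no compute.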
token-classification
The German BERT model was fine-tuned on the Legal-Entity-Recognition dataset for the named entity recognition (NER) task, achieving an F1 score of 85.67% on the evaluation set. The model uses a pre-trained BERT base model and is trained with a provided script from Hugging Face. The labels covered include various types of legal entities, such as companies, organizations, and individuals.
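For readers unfamiliar with how an NER F1 score is derived, here is a minimal sketch. It assumes entity-level exact matching, where a prediction counts only if its span and label both agree with the gold annotation; whether the score quoted above was computed exactly this way is not stated. The span tuples and label names are hypothetical.

```python
def entity_f1(gold_entities, predicted_entities):
    """Entity-level F1 over (start, end, label) tuples with exact matching."""
    gold, pred = set(gold_entities), set(predicted_entities)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = {(0, 2, "ORG"), (5, 6, "PER"), (9, 11, "LOC")}
# Two exact matches, one wrong label, one spurious entity.
pred = {(0, 2, "ORG"), (5, 6, "ORG"), (9, 11, "LOC"), (12, 13, "PER")}
score = entity_f1(gold, pred)
print(round(score, 4))
```

In this toy case precision is 2/4 and recall is 2/3, so a span with the right boundaries but the wrong label costs both precision and recall.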
token-classification
This paper presents a fine-tuned Spanish BERT model (BETO) for the Named Entity Recognition (NER) task. The model was trained on the CONLL Corpora ES dataset and achieved an F1 score of 90.17%. The authors also compared their model with other state-of-the-art models, including a multilingual BERT and a TinyBERT model, and demonstrated its effectiveness in identifying entities in Spanish text.
fill-mask
The SPLADE CoCondenser EnsembleDistil model is a passage retrieval system based on sparse neural IR models, which achieves state-of-the-art performance on the MS MARCO dev set with an MRR@10 of 38.3 and an R@1000 of 98.3. The model uses a combination of distillation and hard negative sampling techniques to improve its effectiveness.
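The two retrieval metrics quoted above can be sketched as follows. This is a simplified illustration with hypothetical document ids and a three-query toy run, not the official MS MARCO evaluation script; the figures above appear to be on a 0 to 100 scale, while these functions return fractions between 0 and 1.

```python
def reciprocal_rank_at_k(ranked_ids, relevant_ids, k=10):
    """1/rank of the first relevant document within the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of the relevant documents that appear within the top k."""
    if not relevant_ids:
        return 0.0
    return len(set(ranked_ids[:k]) & set(relevant_ids)) / len(relevant_ids)


def mean_over_queries(metric, runs, **kwargs):
    """Average a per-query metric over (ranking, relevant set) pairs."""
    return sum(metric(ranked, relevant, **kwargs) for ranked, relevant in runs) / len(runs)


runs = [
    (["d3", "d1", "d7"], {"d1"}),        # first relevant hit at rank 2
    (["d4", "d5", "d6"], {"d4", "d9"}),  # hit at rank 1, one relevant doc missed
    (["d2", "d8"], {"d0"}),              # nothing relevant retrieved
]
mrr_at_10 = mean_over_queries(reciprocal_rank_at_k, runs, k=10)
recall_at_1000 = mean_over_queries(recall_at_k, runs, k=1000)
print(mrr_at_10, recall_at_1000)
```

MRR@10 rewards placing one relevant passage near the top, while R@1000 measures how much of the relevant set survives into a deep candidate list for later re-ranking.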
fill-mask
A pretrained BERT model for Brazilian Portuguese that achieves state-of-the-art performance on three downstream NLP tasks: Named Entity Recognition, Sentence Textual Similarity, and Recognizing Textual Entailment. The model is available in two sizes, Base and Large, and can be used for various NLP tasks such as masked language modeling and embedding generation.
fill-mask
BERTimbau Large is a pretrained BERT model for Brazilian Portuguese that achieves state-of-the-art performance on three downstream NLP tasks. It is available in two sizes: Base and Large. The model can be used for various NLP tasks such as masked language modeling prediction and generating BERT embeddings.
object-detection
We present a fine-tuned YOLOS model for license plate detection, which achieved an AP of 47.9 on the test set. Our model was trained for 200 epochs on a single GPU using Google Colab, utilizing the DETR loss and a base-sized YOLOS model. We evaluated our model at various IoU thresholds and obtained an average recall of 0.676 at IoU=0.50:0.95, with location being medium.
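The notation IoU=0.50:0.95 refers to averaging over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows the underlying overlap measure and that threshold sweep for a single box pair. The box coordinates are hypothetical and the full AP/AR computation (matching, ranking by confidence, averaging over images) is not reproduced.

```python
def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten thresholds: 0.50, 0.55, ..., 0.95.
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

predicted_plate = (10, 10, 50, 30)  # hypothetical detection
gold_plate = (15, 10, 55, 30)       # hypothetical annotation
iou = box_iou(predicted_plate, gold_plate)
# Thresholds at which this detection would count as a correct match.
passed = [t for t in thresholds if iou >= t]
print(round(iou, 4), passed)
```

A detection that overlaps its target moderately well counts as a hit at the loose thresholds and a miss at the strict ones, which is why scores averaged over the sweep are lower than scores at IoU=0.50 alone.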
fill-mask
LEGAL-BERT is a family of BERT models for the legal domain, designed to assist legal NLP research, computational law, and legal technology applications. It comprises five variants, among them LEGAL-BERT-BASE, which achieved better performance than other models on several downstream tasks. The authors suggest possible applications, such as developing question answering systems for databases, ontologies, document collections, and the web; natural language generation from databases and ontologies; text classification; information extraction and opinion mining; and machine learning in natural language processing.