
microsoft/deberta-v3-base

DeBERTaV3 improves on DeBERTa by using ELECTRA-style pre-training with gradient-disentangled embedding sharing. It significantly outperforms DeBERTa on downstream tasks and achieves state-of-the-art results on SQuAD 2.0 and MNLI. This base model has a hidden size of 768 and 86 million backbone parameters, and was trained with a vocabulary of 128K tokens.

Public
$0.0005/sec

Input

text prompt, should include exactly one [MASK] token
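
The one-mask requirement can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official client: the helper name validate_prompt is hypothetical, and it assumes the mask is written literally as [MASK], as stated above.

```python
MASK_TOKEN = "[MASK]"  # literal mask marker, as described in the Input section


def validate_prompt(prompt: str) -> str:
    """Return the prompt unchanged if it contains exactly one [MASK] token.

    Hypothetical helper for illustration; raises ValueError otherwise.
    """
    count = prompt.count(MASK_TOKEN)
    if count != 1:
        raise ValueError(
            f"prompt must contain exactly one {MASK_TOKEN} token, found {count}"
        )
    return prompt


if __name__ == "__main__":
    # Example prompt chosen for illustration only.
    print(validate_prompt("where is my [MASK]?"))
```

Rejecting malformed prompts locally avoids paying for a request that the model cannot answer meaningfully.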


Output

where is my father? (0.09)

where is my mother? (0.08)
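
Each output line above pairs a completed sentence with a number in parentheses, presumably the model's score for that completion. As a rough sketch, lines in this displayed format could be split into text and score as follows. The function name parse_prediction is hypothetical, and the format is inferred only from the two sample lines shown here, not from a documented API response.

```python
import re

# Matches "<completed text> (<score>)", e.g. "where is my father? (0.09)".
# The pattern is an assumption based on the sample output lines.
_PREDICTION = re.compile(r"^(?P<text>.*\S)\s*\((?P<score>\d*\.?\d+)\)\s*$")


def parse_prediction(line: str) -> tuple[str, float]:
    """Split one displayed prediction line into (text, score)."""
    match = _PREDICTION.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized prediction line: {line!r}")
    return match.group("text"), float(match.group("score"))


if __name__ == "__main__":
    for raw in ["where is my father? (0.09)", "where is my mother? (0.08)"]:
        text, score = parse_prediction(raw)
        print(f"{score:.2f}  {text}")
```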

 


© 2023 Deep Infra. All rights reserved.
