The RoBERTa model was pretrained on a dataset created by combining five sources: BookCorpus, English Wikipedia, CC-News, OpenWebText, and Stories. Text is tokenized with a byte-level Byte-Pair Encoding (BPE) vocabulary of 50,000 tokens. During pretraining, 15% of the tokens in each sequence are selected for masking; of those, 80% are replaced with the special <mask> token, 10% with a random token, and 10% are left unchanged. When fine-tuned on downstream NLP tasks, the model outperforms its predecessor BERT on many benchmarks, including GLUE and SQuAD.
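The masking scheme described above can be sketched in a few lines of Python. This is a minimal illustration of the 15% / 80-10-10 selection rule, not the actual RoBERTa preprocessing code; the token list and toy vocabulary are hypothetical.

```python
import random

def mask_tokens(tokens, mask_token="<mask>", vocab=None, p=0.15, seed=0):
    """Illustrative BERT/RoBERTa-style masking:
    select ~p of positions; of those, 80% become <mask>,
    10% become a random vocabulary token, 10% stay unchanged."""
    rng = random.Random(seed)
    vocab = vocab or ["the", "a", "dog", "ran"]  # toy vocabulary for the sketch
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < p:
            labels.append(tok)  # the model must predict the original token here
            r = rng.random()
            if r < 0.8:
                masked.append(mask_token)      # 80%: special mask token
            elif r < 0.9:
                masked.append(rng.choice(vocab))  # 10%: random token
            else:
                masked.append(tok)             # 10%: left unchanged
        else:
            labels.append(None)  # position not selected; no loss computed
            masked.append(tok)
    return masked, labels
```

Note that RoBERTa applies this masking dynamically, sampling a fresh mask pattern each time a sequence is seen, rather than fixing it once during preprocessing as BERT did.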
You can use this model directly with a fill-mask pipeline; the text prompt should include exactly one <mask> token. For example, the top predictions for a masked prompt might look like:

where is my father? (0.09)
where is my mother? (0.08)
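A minimal usage sketch with the Hugging Face transformers library, assuming the public roberta-base checkpoint; the prompt here is illustrative, not taken from the original page.

```python
from transformers import pipeline

# Fill-mask pipeline over the roberta-base checkpoint.
# Note: RoBERTa's mask token is <mask>, not BERT's [MASK].
unmasker = pipeline("fill-mask", model="roberta-base")

# The prompt must contain exactly one <mask> token.
for pred in unmasker("The capital of France is <mask>."):
    print(f"{pred['token_str'].strip()} ({pred['score']:.2f})")
```

Each prediction is a dict containing the filled-in token string and its probability, sorted from most to least likely.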