A pre-trained language model developed specifically for protein sequences using a masked language modeling (MLM) objective. It achieved strong results when fine-tuned on downstream tasks such as secondary structure prediction and subcellular localization. The model was trained on uppercase amino acids only and used a vocabulary size of 21, with inputs of the form "[CLS] Protein Sequence A [SEP] Protein Sequence B [SEP]".
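To make the input conventions concrete, below is a minimal sketch of masked-token prediction with a BERT-style protein language model via the Hugging Face `transformers` library. The checkpoint name is a placeholder (the description does not name the model), and the space-separated, uppercase residue formatting is an assumption based on the tokenization conventions common to such models.

```python
# Minimal sketch: masked-residue prediction with a BERT-style protein LM.
# NOTE: "some-org/protein-mlm-base" is a hypothetical placeholder checkpoint,
# not the actual model ID described above.
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline

checkpoint = "some-org/protein-mlm-base"  # placeholder model ID

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Amino acids must be uppercase; protein LMs of this kind typically expect
# residues separated by spaces so that each amino acid becomes one token.
sequence = "M K T A Y I A K Q R Q I S F V K S H F S R Q L E E R"

# Mask one residue and ask the model to reconstruct it (the MLM objective).
masked = sequence.replace("A Y I", f"A {tokenizer.mask_token} I", 1)

fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for pred in fill(masked)[:5]:
    print(pred["token_str"], round(pred["score"], 3))
```

For fine-tuning on the downstream tasks mentioned above (e.g. secondary structure prediction or subcellular localization), the same checkpoint would typically be loaded with a task head, such as `AutoModelForTokenClassification` for per-residue labels or `AutoModelForSequenceClassification` for per-protein labels.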