We present a Chinese version of ALBERT, a lite BERT-based language model, pre-trained on the CLUECorpusSmall corpus using the UER-py toolkit. The model achieves competitive results on a range of Chinese NLP tasks. It is trained in two stages: first with a sequence length of 128, then with a sequence length of 512.
You can use the model directly for masked language modeling: given a text prompt containing exactly one [MASK] token, it returns candidate fillers for the masked position ranked by probability.
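As a minimal sketch of this usage with Hugging Face `transformers`, assuming the Base checkpoint is published under the identifier `uer/albert-base-chinese-cluecorpussmall` (UER's Chinese ALBERT models use a BERT-style vocabulary, hence `BertTokenizer`):

```python
from transformers import BertTokenizer, AlbertForMaskedLM, FillMaskPipeline

# Assumed Hugging Face identifier for the Base checkpoint.
model_name = "uer/albert-base-chinese-cluecorpussmall"

tokenizer = BertTokenizer.from_pretrained(model_name)
model = AlbertForMaskedLM.from_pretrained(model_name)

# Fill-mask pipeline: the prompt must contain exactly one [MASK] token.
unmasker = FillMaskPipeline(model=model, tokenizer=tokenizer)
print(unmasker("中国的首都是[MASK]京。"))  # candidate fillers ranked by probability
```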
This is the set of Chinese ALBERT models pre-trained by UER-py (Zhao et al., 2019). The models can also be pre-trained with TencentPretrain (Zhao et al., 2023), which inherits UER-py to support models with more than one billion parameters and extends it to a multimodal pre-training framework.
You can download the model either from the UER-py Modelzoo page or via HuggingFace from the links below:
| Model | Link |
|---|---|
| ALBERT-Base | L=12/H=768 (Base) |
| ALBERT-Large | L=24/H=1024 (Large) |
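For feature extraction rather than mask filling, the checkpoints can be loaded as a plain encoder. A sketch in PyTorch, again assuming the `uer/albert-base-chinese-cluecorpussmall` identifier (the Large model would be loaded the same way under its own identifier):

```python
import torch
from transformers import BertTokenizer, AlbertModel

model_name = "uer/albert-base-chinese-cluecorpussmall"  # assumed identifier
tokenizer = BertTokenizer.from_pretrained(model_name)
model = AlbertModel.from_pretrained(model_name)

text = "中国的首都是北京。"
encoded = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    output = model(**encoded)
print(output.last_hidden_state.shape)  # (batch, seq_len, hidden); hidden=768 for Base
```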
CLUECorpusSmall is used as training data.
@article{lan2019albert,
title={Albert: A lite bert for self-supervised learning of language representations},
author={Lan, Zhenzhong and Chen, Mingda and Goodman, Sebastian and Gimpel, Kevin and Sharma, Piyush and Soricut, Radu},
journal={arXiv preprint arXiv:1909.11942},
year={2019}
}
@article{zhao2019uer,
title={UER: An Open-Source Toolkit for Pre-training Models},
author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
journal={EMNLP-IJCNLP 2019},
pages={241},
year={2019}
}
@article{zhao2023tencentpretrain,
title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
journal={ACL 2023},
pages={217},
year={2023}
}