uer/albert-base-chinese-cluecorpussmall

We present a Chinese version of ALBERT (A Lite BERT), pre-trained on the CLUECorpusSmall corpus with the UER-py toolkit. The model achieves competitive results on a range of downstream Chinese NLP tasks. Pre-training is carried out in two stages: first with a sequence length of 128, then with a sequence length of 512.

Input

A text prompt, which should include exactly one [MASK] token.

Output

where is my father? (0.09)

where is my mother? (0.08)
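
For illustration, here is a minimal sketch of how such predictions could be produced locally with the Hugging Face transformers fill-mask pipeline. It assumes the checkpoint ID uer/albert-base-chinese-cluecorpussmall on the Hugging Face Hub and a BERT-style Chinese vocabulary (hence BertTokenizer rather than the SentencePiece-based ALBERT tokenizer); the example sentence is illustrative only.

```python
# Minimal sketch of fill-mask inference with Hugging Face transformers.
# Assumes the checkpoint "uer/albert-base-chinese-cluecorpussmall" on the
# Hugging Face Hub and a BERT-style Chinese vocabulary.
from transformers import AlbertForMaskedLM, BertTokenizer, FillMaskPipeline

tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
model = AlbertForMaskedLM.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
unmasker = FillMaskPipeline(model=model, tokenizer=tokenizer)

# The prompt must contain exactly one [MASK] token.
# Example sentence: "The capital of China is [MASK]jing."
for candidate in unmasker("中国的首都是[MASK]京。"):
    # Each candidate completion comes with its predicted probability (score).
    print(f"{candidate['sequence']} ({candidate['score']:.2f})")
```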

Chinese ALBERT

Model description

This is the set of Chinese ALBERT models pre-trained by UER-py, which is introduced in this paper. In addition, the models can also be pre-trained by TencentPretrain, introduced in this paper, which inherits UER-py to support models with more than one billion parameters and extends it to a multimodal pre-training framework.

You can download the model either from the UER-py Modelzoo page or via Hugging Face from the links below:

| Model        | Link                |
| ------------ | ------------------- |
| ALBERT-Base  | L=12/H=768 (Base)   |
| ALBERT-Large | L=24/H=1024 (Large) |
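
Once downloaded, the checkpoints can be used directly with transformers. Below is a brief sketch of extracting text features in PyTorch, again assuming the Hugging Face checkpoint ID (Base model shown; the Large model works the same way):

```python
# Sketch: obtaining hidden-state features with the Base checkpoint from the
# Hugging Face Hub (checkpoint ID assumed; swap in the Large model if desired).
from transformers import AlbertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
model = AlbertModel.from_pretrained("uer/albert-base-chinese-cluecorpussmall")

# Tokenize an example sentence ("Replace me with any text you like.")
# and run it through the encoder.
inputs = tokenizer("用你喜欢的任何文本替换我。", return_tensors="pt")
outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size):
# H=768 for ALBERT-Base, H=1024 for ALBERT-Large.
print(outputs.last_hidden_state.shape)
```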

Training data

CLUECorpusSmall is used as training data.

BibTeX entry and citation info

@article{lan2019albert,
  title={Albert: A lite bert for self-supervised learning of language representations},
  author={Lan, Zhenzhong and Chen, Mingda and Goodman, Sebastian and Gimpel, Kevin and Sharma, Piyush and Soricut, Radu},
  journal={arXiv preprint arXiv:1909.11942},
  year={2019}
}

@article{zhao2019uer,
  title={UER: An Open-Source Toolkit for Pre-training Models},
  author={Zhao, Zhe and Chen, Hui and Zhang, Jinbin and Zhao, Xin and Liu, Tao and Lu, Wei and Chen, Xi and Deng, Haotang and Ju, Qi and Du, Xiaoyong},
  journal={EMNLP-IJCNLP 2019},
  pages={241},
  year={2019}
}

@article{zhao2023tencentpretrain,
  title={TencentPretrain: A Scalable and Flexible Toolkit for Pre-training Models of Different Modalities},
  author={Zhao, Zhe and Li, Yudong and Hou, Cheng and Zhao, Jing and others},
  journal={ACL 2023},
  pages={217},
  year={2023}
}