bert-base-chinese


A pre-trained BERT language model for Chinese, developed by the HuggingFace team. It is a fill-mask model trained on a large corpus of Chinese text, so its direct use is masked language modeling: predicting the token hidden behind a [MASK] placeholder. It has been reported to achieve state-of-the-art results on certain benchmarks. Like other language models, it carries risks and limitations, including the tendency to reproduce harmful stereotypes and biases present in its training data. Users should evaluate and mitigate these risks before relying on the model.





Model Details

Model Description

This model has been pre-trained for Chinese. During training, random input masking was applied independently to word pieces (as in the original BERT paper).
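To make the masking step concrete, here is a minimal sketch of per-token random masking in the style of the original BERT paper: each word piece is selected with probability 0.15, and a selected piece is replaced by [MASK] 80% of the time, by a random vocabulary item 10% of the time, and left unchanged 10% of the time. The function name `mask_word_pieces`, the toy vocabulary, and the independent per-token draw are illustrative assumptions, not the actual pre-training code for this model.

```python
import random

MASK = "[MASK]"


def mask_word_pieces(tokens, vocab, mask_prob=0.15, rng=None):
    """Illustrative BERT-style masking (hypothetical helper, not the real training code).

    Returns (masked_tokens, labels). labels[i] holds the original token at
    positions selected for prediction and None everywhere else.
    """
    rng = rng or random.Random()
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Each word piece is considered independently of its neighbours.
        if rng.random() >= mask_prob:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK               # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token in place
    return masked, labels
```

For Chinese text the word pieces are mostly single characters, so a call such as `mask_word_pieces(list("我爱北京"), vocab=list("天地人和"))` masks individual characters rather than whole words.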

  • Developed by: HuggingFace team
  • Model Type: Fill-Mask
  • Language(s): Chinese
  • License: [More Information needed]
  • Parent Model: See the BERT base uncased model for more information about the BERT base model.

Model Sources


Direct Use

This model can be used for masked language modeling.
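A minimal sketch of masked language modeling with this model, assuming the Hugging Face `transformers` library is installed and the checkpoint is available under the id `bert-base-chinese`. The helper names `check_prompt` and `fill_mask` are hypothetical, and the exactly-one-mask check is a choice made here to keep the pipeline output a flat list of candidates.

```python
MASK_TOKEN = "[MASK]"


def check_prompt(text: str) -> str:
    """Return the prompt unchanged if it contains exactly one [MASK] token."""
    n = text.count(MASK_TOKEN)
    if n != 1:
        raise ValueError(f"expected exactly one {MASK_TOKEN} token, found {n}")
    return text


def fill_mask(text: str, top_k: int = 5):
    """Return (token, score) pairs for the masked position.

    Assumes `transformers` is installed; the model weights are downloaded
    on first use. Hypothetical wrapper around the fill-mask pipeline.
    """
    # Imported lazily so check_prompt works without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model="bert-base-chinese")
    results = unmasker(check_prompt(text), top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]
```

Calling `fill_mask("巴黎是[MASK]国的首都。")` returns the highest-scoring candidate tokens for the masked character together with their probabilities.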

Risks, Limitations and Biases

CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)).