A transformer model with 125 million parameters, trained on the Pile dataset for autoregressive language modeling (i.e., next-token prediction with a causal attention mask). Given a prompt, the model generates a fluent text continuation. However, the model has known limitations and biases; in particular, it may produce profane or otherwise offensive text, so users should exercise caution when deploying it in real-world applications.
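As a minimal sketch of prompted generation, the snippet below uses the Hugging Face transformers library to load a causal language model and sample a continuation. The checkpoint identifier `your-org/model-125m` is a hypothetical placeholder, since the original text does not name the published model.

```python
# Minimal sketch: prompted text generation with a small causal LM via the
# Hugging Face transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "your-org/model-125m"  # hypothetical identifier; substitute the real checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The Pile is a large, diverse dataset for"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,          # sample rather than greedy-decode
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,  # avoids a warning for models without a pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Sampling parameters such as `temperature` are illustrative defaults, not values specified by the model's authors.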