SecBERT is a pretrained language model for cyber security text, trained on a dataset of papers from various sources, including APTnotes, Stucco-Data, and CASIE. The model has its own wordpiece vocabulary, secvocab, and is available in two versions, SecBERT and SecRoBERTa. The model can improve downstream tasks such as NER, text classification, semantic understanding, and Q&A in the cyber security domain.
SecBERT is a pretrained language model for cyber security text, trained on a dataset of papers from various sources, including APTnotes, Stucco-Data, and CASIE. The model has its own wordpiece vocabulary, secvocab, and is available in two versions, SecBERT and SecRoBERTa. The model can improve downstream tasks such as NER, text classification, semantic understanding, and Q&A in the cyber security domain.
text prompt, should include exactly one [MASK] token
You need to login to use this model
where is my father? (0.09)
where is my mother? (0.08)
This is the pretrained model presented in SecBERT: A Pretrained Language Model for Cyber Security Text, which is a BERT model trained on cyber security text.
The training corpus was papers taken from
SecBERT has its own wordpiece vocabulary (secvocab) that's built to best match the training corpus.
We trained SecBERT and SecRoBERTa versions.
Available models include:
We proposed to build language model which work on cyber security text, as result, it can improve downstream tasks (NER, Text Classification, Semantic Understand, Q&A) in Cyber Security Domain.
First, as below shows Fill-Mask pipeline in Google Bert, AllenAI SciBert and our SecBERT .
The original repo can be found here.