Jean-Baptiste/camembert-ner

A Named Entity Recognition model fine-tuned from CamemBERT on the Wikiner-FR dataset. Our model achieves high performance on various entities, including Persons, Organizations, Locations, and Miscellaneous entities.

camembert-ner: a model fine-tuned from CamemBERT for the NER task.

Introduction

camembert-ner is an NER model fine-tuned from CamemBERT on the wikiner-fr dataset (~170,634 sentences). The model was validated on email and chat data and outperformed other models on this type of data specifically. In particular, the model seems to work better on entities that do not start with an uppercase letter.
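As a rough illustration of how such a model could be queried, here is a minimal sketch. It assumes the Hugging Face `transformers` library and that the checkpoint is available on the Hub under the name shown on this page. The helper `summarize` and its `min_score` threshold are hypothetical conveniences introduced for the example, not part of the model.

```python
from typing import Dict, List, Tuple

MODEL_ID = "Jean-Baptiste/camembert-ner"  # model name taken from this page


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Load a token-classification pipeline (downloads weights on first call)."""
    # Imported lazily so that summarize() works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole entities.
    return pipeline(
        "ner",
        model=model_id,
        tokenizer=model_id,
        aggregation_strategy="simple",
    )


def summarize(predictions: List[Dict], min_score: float = 0.5) -> List[Tuple[str, str]]:
    """Keep (label, text) pairs whose confidence is at least min_score.

    `min_score` is a hypothetical cut-off chosen for illustration only.
    """
    return [
        (p["entity_group"], p["word"])
        for p in predictions
        if p["score"] >= min_score
    ]


# Example call (needs network access and the transformers package).
# The sentence is deliberately lowercase, the case the introduction highlights:
# ner = build_ner_pipeline()
# print(summarize(ner("bonjour, je suis jean dupont et je travaille chez renault à paris")))
```

With the aggregated output format, each prediction is a dictionary carrying `entity_group`, `score`, `word`, `start`, and `end`, which is what `summarize` relies on.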

Training data

Training data was classified as follows:

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| MISC | Miscellaneous entity |
| PER | Person’s name |
| ORG | Organization |
| LOC | Location |
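To make the label scheme concrete, the sketch below groups per-token tags into entity spans. It assumes one tag per token drawn from the five abbreviations above. Because a checkpoint may emit prefixed variants such as `I-PER`, the code strips any prefix before comparing. The function name `group_entities` is made up for this example.

```python
from typing import List, Tuple

LABELS = {"O", "MISC", "PER", "ORG", "LOC"}  # abbreviations from the table above


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge consecutive tokens sharing the same non-O label into one span."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_label = None
    current_tokens: List[str] = []

    for token, tag in zip(tokens, tags):
        label = tag.split("-")[-1]  # tolerate prefixed tags such as "I-PER"
        if label not in LABELS:
            raise ValueError(f"unknown tag: {tag}")
        if label != current_label:
            # Close the previous span if it was a real entity.
            if current_label not in (None, "O"):
                spans.append((current_label, " ".join(current_tokens)))
            current_label, current_tokens = label, []
        current_tokens.append(token)

    # Flush the last span.
    if current_label not in (None, "O"):
        spans.append((current_label, " ".join(current_tokens)))
    return spans


tokens = ["jean", "dupont", "travaille", "chez", "renault", "à", "paris"]
tags = ["PER", "PER", "O", "O", "ORG", "O", "LOC"]
print(group_entities(tokens, tags))
```

One limitation of this simple grouping: two distinct entities of the same type that sit directly next to each other are merged into a single span, since the unprefixed tags carry no boundary information.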

Model performance (metric: seqeval)

Overall

| precision | recall | f1 |
|---|---|---|
| 0.8859 | 0.8971 | 0.8914 |

By entity

| entity | precision | recall | f1 |
|---|---|---|---|
| PER | 0.9372 | 0.9598 | 0.9483 |
| ORG | 0.8099 | 0.8265 | 0.8181 |
| LOC | 0.8905 | 0.9005 | 0.8955 |
| MISC | 0.8175 | 0.8117 | 0.8146 |
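The f1 column can be sanity-checked against the other two, since F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the figures reported above. It assumes the overall row is a micro average, for which the same identity holds. Small differences in the last digit are expected because the inputs are already rounded to four decimals.

```python
# (precision, recall, reported f1) copied from the tables above
SCORES = {
    "overall": (0.8859, 0.8971, 0.8914),
    "PER": (0.9372, 0.9598, 0.9483),
    "ORG": (0.8099, 0.8265, 0.8181),
    "LOC": (0.8905, 0.9005, 0.8955),
    "MISC": (0.8175, 0.8117, 0.8146),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for name, (p, r, reported) in SCORES.items():
    computed = f1_score(p, r)
    print(f"{name:8s} computed={computed:.4f} reported={reported:.4f}")
```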

For those interested, here is a short article on how I used the results of this model to train an LSTM model for signature detection in emails: https://medium.com/@jean-baptiste.polle/lstm-model-for-email-signature-detection-8e990384fefa