This German BERT model was fine-tuned on the Legal-Entity-Recognition dataset for named entity recognition (NER) and reaches an F1 score of 85.67% on the evaluation set. It starts from a pre-trained BERT base checkpoint and was trained with the NER fine-tuning script provided by Hugging Face. The labels cover several types of legal entities, such as companies, organizations, and individuals.
German BERT (BERT-base-german-cased) fine-tuned on the Legal-Entity-Recognition dataset for the LER (NER) downstream task.
Legal-Entity-Recognition: Fine-grained Named Entity Recognition in Legal Documents.
The dataset consists of court decisions from 2017 and 2018 that were published online by the Federal Ministry of Justice and Consumer Protection. The documents originate from seven federal courts: Federal Labour Court (BAG), Federal Fiscal Court (BFH), Federal Court of Justice (BGH), Federal Patent Court (BPatG), Federal Social Court (BSG), Federal Constitutional Court (BVerfG) and Federal Administrative Court (BVerwG).
Split | # Samples |
---|---|
Train | 1657048 |
Eval | 500000 |
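
The Train figure in the table above appears to be a token count rather than a number of documents or sentences: the per-label counts listed under "Labels covered" add up to exactly that value. A minimal sketch of the check, with the counts copied from that list (the variable names are illustrative):

```python
# Per-label counts copied verbatim from the "Labels covered" list in this card.
LABEL_COUNTS = """
107 B-AN
918 B-EUN
2238 B-GRT
13282 B-GS
1113 B-INN
704 B-LD
151 B-LDS
2490 B-LIT
282 B-MRK
890 B-ORG
1374 B-PER
1480 B-RR
10046 B-RS
401 B-ST
68 B-STR
1011 B-UN
282 B-VO
391 B-VS
2648 B-VT
46 I-AN
6925 I-EUN
1957 I-GRT
70257 I-GS
2931 I-INN
153 I-LD
26 I-LDS
28881 I-LIT
383 I-MRK
1185 I-ORG
330 I-PER
106 I-RR
138938 I-RS
34 I-ST
55 I-STR
1259 I-UN
1572 I-VO
2488 I-VS
11121 I-VT
1348525 O
"""

# Parse "count label" pairs into a dictionary.
counts = {}
for line in LABEL_COUNTS.strip().splitlines():
    n, label = line.split()
    counts[label] = int(n)

total = sum(counts.values())
entity_tokens = total - counts["O"]

print(total)          # 1657048, the Train figure
print(entity_tokens)  # 308523 tokens carry an entity label
print(f"{entity_tokens / total:.1%}")  # 18.6%
```

Roughly four out of five tokens are tagged `O`, which is worth keeping in mind when reading the accuracy score: it is dominated by the `O` class.
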
Training script: Fine-tuning script for NER provided by Hugging Face

Colab: How to fine-tune a model for NER using HF scripts
Labels covered (and their distribution):
107 B-AN
918 B-EUN
2238 B-GRT
13282 B-GS
1113 B-INN
704 B-LD
151 B-LDS
2490 B-LIT
282 B-MRK
890 B-ORG
1374 B-PER
1480 B-RR
10046 B-RS
401 B-ST
68 B-STR
1011 B-UN
282 B-VO
391 B-VS
2648 B-VT
46 I-AN
6925 I-EUN
1957 I-GRT
70257 I-GS
2931 I-INN
153 I-LD
26 I-LDS
28881 I-LIT
383 I-MRK
1185 I-ORG
330 I-PER
106 I-RR
138938 I-RS
34 I-ST
55 I-STR
1259 I-UN
1572 I-VO
2488 I-VS
11121 I-VT
1348525 O
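
The labels follow the BIO scheme: `B-X` opens an entity of type `X`, `I-X` continues it, and `O` marks tokens outside any entity. A minimal sketch of grouping per-token tags into entity spans is shown below. The function name `bio_to_spans` and the example sentence are hypothetical, written for illustration only; they are not taken from the dataset or from the training script.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans = []
    current = None  # [entity_type, [tokens]] for the entity being built
    for token, tag in zip(tokens, tags):
        prefix, _, etype = tag.partition("-")
        # An I- tag of the same type extends the open entity.
        if prefix == "I" and current is not None and current[0] == etype:
            current[1].append(token)
            continue
        # Anything else closes the open entity, if there is one.
        if current is not None:
            spans.append((current[0], " ".join(current[1])))
            current = None
        # B- starts a new entity; a stray I- is leniently treated the same way.
        if prefix in ("B", "I"):
            current = [etype, [token]]
    if current is not None:
        spans.append((current[0], " ".join(current[1])))
    return spans


# Hand-written example, not a sample from the dataset.
tokens = ["Das", "BAG", "verweist", "auf", "§", "1", "KSchG", "."]
tags = ["O", "B-GRT", "O", "O", "B-GS", "I-GS", "I-GS", "O"]
print(bio_to_spans(tokens, tags))
# [('GRT', 'BAG'), ('GS', '§ 1 KSchG')]
```

Treating a stray `I-` tag as the start of a new entity is one possible design choice; a stricter decoder could discard such tokens instead.
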
Metric | Score (%) |
---|---|
F1 | 85.67 |
Precision | 84.35 |
Recall | 87.04 |
Accuracy | 98.46 |
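
F1 is the harmonic mean of precision and recall, so the reported values can be cross-checked against each other. A short sketch using the numbers from the table above:

```python
# Precision and recall as reported in the metrics table (in percent).
precision = 84.35
recall = 87.04

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 85.67
```

The result agrees with the reported F1, which suggests all three scores come from the same evaluation run.
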
Created by Manuel Romero/@mrm8488
Made with ♥ in Spain