put sources. We showed that it is feasible to extract
metadata and segment court rulings with high accuracy.
Nonetheless, this research has some limitations.
Although we used court rulings from different sources
and instances, the system, particularly the
court-specific rule-based modules, was tuned on our
inputs. Even though we evaluated the proposed approach
on unseen court rulings, including rulings from small
courts, verdicts from courts of other jurisdictions
(financial, social, employment) may yield worse results
because their structure might differ. However, the
whole pipeline is implemented in an extensible manner,
so that the rules can easily be extended to match
other inputs.
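To illustrate what such extensibility could look like, the following is a minimal sketch of a rule registry in which court-specific rules are tried before generic ones. It is not the implementation used in this work: the names `RULES`, `register_rule`, and `extract_file_number`, the court keys, and both regular expressions are hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, Dict, List, Optional

# A rule takes the ruling text and returns the extracted value or None.
Rule = Callable[[str], Optional[str]]

# Hypothetical registry: court key -> list of extraction rules.
RULES: Dict[str, List[Rule]] = {}


def register_rule(court: str) -> Callable[[Rule], Rule]:
    """Decorator that attaches a rule to a (hypothetical) court key."""
    def decorator(rule: Rule) -> Rule:
        RULES.setdefault(court, []).append(rule)
        return rule
    return decorator


@register_rule("default")
def file_number_generic(text: str) -> Optional[str]:
    # Illustrative pattern only (e.g. "5 U 123/19"); real formats vary.
    match = re.search(r"\b\d+\s+[A-Za-z]{1,3}\s+\d+/\d{2}\b", text)
    return match.group(0) if match else None


@register_rule("social")  # assumed key for a court of another jurisdiction
def file_number_social(text: str) -> Optional[str]:
    # Illustrative pattern only (e.g. "S 12 KR 345/20").
    match = re.search(r"\bS\s+\d+\s+[A-Z]{1,3}\s+\d+/\d{2}\b", text)
    return match.group(0) if match else None


def extract_file_number(text: str, court: str = "default") -> Optional[str]:
    """Try court-specific rules first, then fall back to the default rules."""
    for rule in RULES.get(court, []) + RULES["default"]:
        result = rule(text)
        if result is not None:
            return result
    return None
```

Under this design, supporting a new court only requires registering an additional rule; the existing rules and the lookup function stay unchanged.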
Another promising approach may be to place a different
head on top of BERT. Specifically, instead of
classifying the whole sequence based on the pooled
representation, one could add a linear layer on top of
the hidden-state outputs to compute span start logits
and span end logits. The model would then only be
responsible for identifying the start and end of each
segment instead of classifying each sentence. Because
this is a token-level classification task, such a head
may be a feasible alternative for our task.
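The idea of a span head can be sketched as follows. This is a dependency-free toy illustration, not a trained model: in practice the hidden states would come from BERT and the linear layer would be learned, whereas here the hidden states, `weight`, and `bias` are made-up values, and the function names `span_logits` and `best_span` are our own.

```python
from typing import List, Tuple

Vector = List[float]


def span_logits(
    hidden_states: List[Vector], weight: List[Vector], bias: Vector
) -> Tuple[List[float], List[float]]:
    """Apply one linear layer per token: hidden vector -> (start, end) logit.

    weight has two rows (start, end), each as long as a hidden vector.
    """
    start_logits, end_logits = [], []
    for h in hidden_states:
        start_logits.append(sum(w * x for w, x in zip(weight[0], h)) + bias[0])
        end_logits.append(sum(w * x for w, x in zip(weight[1], h)) + bias[1])
    return start_logits, end_logits


def best_span(
    start_logits: List[float], end_logits: List[float]
) -> Tuple[int, int]:
    """Return (start, end) maximizing start + end logit with start <= end."""
    best, best_score = (0, 0), float("-inf")
    for i, start in enumerate(start_logits):
        for j in range(i, len(end_logits)):
            score = start + end_logits[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


# Toy input: four tokens with two-dimensional hidden states (assumed values).
hidden = [[0.1, 0.0], [2.0, 0.1], [0.3, 0.2], [0.1, 3.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # row 0 -> start logit, row 1 -> end logit
bias = [0.0, 0.0]

starts, ends = span_logits(hidden, weight, bias)
segment = best_span(starts, ends)  # token indices of the predicted segment
```

With these toy values, the highest-scoring span starts at token 1 and ends at token 3. A real system would predict one such span per segment type, or apply the head repeatedly, which this sketch leaves open.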
While most of our rule-based and heuristic approaches
seem adequate, it is worth investigating in future work
whether modern language models can help classify tokens
for the metadata that did not perform well for us, such
as the previous instances of the court ruling. This
could improve our reported results further.
Last but not least, we implemented our pipeline in a
prototype web application called Verlyze, allowing the
research community to build even more reliable systems
on top of our implementation.
Improving Legal Information Retrieval: Metadata Extraction and Segmentation of German Court Rulings