As governments worldwide continue to release vast amounts of textual information, the need for efficient and insightful tools to extract, interpret and present this data has become increasingly critical. To address this need, we present the Bundestags-Mine: an environment that periodically retrieves pertinent data from the German parliament, parses and analyzes it using pipelines for natural language processing, and then displays the results in a publicly accessible web application. Bundestags-Mine helps to extract key information from parliamentary documents and presents it in a visually appealing manner for many use cases. For instance, the tool can be used by journalists for news detection, lawyers for compliance checking, linguists for discourse analysis, and the general public to inform themselves about the positions of political party members on a topic.
Explanation video (in German): Bundestags-Mine_GitHub_Explanation.mp4
Bundestags-Mine is an environment for evaluating German government documents with various Natural Language Processing techniques. It visualizes the results in a publicly accessible, intuitive, platform-independent and responsive web application, and also provides the resulting data for download. Within this environment, we process the following types of government data:
- Minutes of plenary proceedings
- Agenda Items
- Polls
We gather this data from the official Bundestag Data Service, apply various NLP/AI techniques to it and make the results available on the website.
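The gather, analyze and publish cycle described above can be sketched as an incremental import that skips documents handled in earlier runs. This is a minimal illustration and not the project's actual code: `RawDocument`, `Importer`, `run_once` and `count_words` are hypothetical names, the documents are hard-coded instead of fetched from the Bundestag Data Service, and a simple word count stands in for the real NLP analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass(frozen=True)
class RawDocument:
    """Hypothetical record for one retrieved item (not the project's schema)."""
    doc_id: str
    kind: str  # e.g. "protocol", "agenda_item" or "poll"
    text: str


@dataclass
class Importer:
    """Analyzes only documents that were not seen in earlier runs."""
    analyze: Callable[[RawDocument], dict]
    seen_ids: set = field(default_factory=set)
    results: dict = field(default_factory=dict)

    def run_once(self, fetched: Iterable[RawDocument]) -> list:
        """Analyze new documents and return the ids processed in this run."""
        processed = []
        for doc in fetched:
            if doc.doc_id in self.seen_ids:
                continue  # already imported in a previous run
            self.results[doc.doc_id] = self.analyze(doc)
            self.seen_ids.add(doc.doc_id)
            processed.append(doc.doc_id)
        return processed


def count_words(doc: RawDocument) -> dict:
    # Assumption: a word count stands in for the real NLP analysis.
    return {"kind": doc.kind, "n_words": len(doc.text.split())}


importer = Importer(analyze=count_words)
batch = [
    RawDocument("p-1", "protocol", "Die Sitzung ist eröffnet."),
    RawDocument("poll-1", "poll", "Abstimmung über den Antrag"),
]
first_run = importer.run_once(batch)
second_run = importer.run_once(batch)  # nothing new, so nothing is processed
```

Keeping the set of seen ids separate from the analysis function means a periodic job can call `run_once` repeatedly without re-analyzing documents it has already stored.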
Dive into several features of the web interface to interact with the Bundestag like never before.
Read the newspaper: "Neues vom Schürfer" | Explore all speeches |
---|---|
Analyse specific topics | Browse through topics |
Within the Bundestags-Mine, we preprocess and analyse the governmental documents with pipelines that combine the following NLP techniques:
- Tokenization
- Lemmatization
- POS-Tagging
- Named-Entity-Recognition
- Sentiment Analysis
- Automatic Text Summarization
- Translation
- Topic Modelling
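To illustrate how such techniques can be chained in a pipeline, the following toy sketch runs two of the listed steps, tokenization and sentiment analysis, one after the other on a single sentence. It is an assumption-laden illustration and not the TextImager API: the `Annotated` container, the stage functions, the regular expression and the four-word `POLARITY` lexicon are all hypothetical, and production pipelines use trained models instead of a hand-written lexicon.

```python
import re
from dataclasses import dataclass, field

# Words (Unicode-aware, so umlauts are kept) or single punctuation marks.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")

# Assumption: a tiny hand-written polarity lexicon, for illustration only.
POLARITY = {"gut": 1.0, "wichtig": 0.5, "schlecht": -1.0, "falsch": -0.5}


@dataclass
class Annotated:
    """Hypothetical container that accumulates annotations stage by stage."""
    text: str
    tokens: list = field(default_factory=list)
    sentiment: float = 0.0


def tokenize(doc: Annotated) -> Annotated:
    doc.tokens = TOKEN_PATTERN.findall(doc.text)
    return doc


def score_sentiment(doc: Annotated) -> Annotated:
    # Average the polarity of all tokens found in the lexicon.
    scores = [POLARITY[t.lower()] for t in doc.tokens if t.lower() in POLARITY]
    doc.sentiment = sum(scores) / len(scores) if scores else 0.0
    return doc


def run_pipeline(text: str, stages: list) -> Annotated:
    """Apply each stage in order; later stages can use earlier annotations."""
    doc = Annotated(text)
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline("Das Gesetz ist gut und wichtig.", [tokenize, score_sentiment])
print(result.tokens)
print(result.sentiment)
```

Because every stage takes and returns the same container, further steps such as lemmatization or named-entity recognition could be added to the stage list without changing `run_pipeline`.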
@inproceedings{Boenisch:et:al:2023,
title = {{Bundestags-Mine}: Natural Language Processing for Extracting
Key Information from Government Documents},
isbn = {9781643684734},
issn = {1879-8314},
url = {http://dx.doi.org/10.3233/FAIA230996},
doi = {10.3233/faia230996},
booktitle = {Legal Knowledge and Information Systems},
publisher = {IOS Press},
author = {B\"{o}nisch, Kevin and Abrami, Giuseppe and Wehnert, Sabine and Mehler, Alexander},
year = {2023}
}
The majority of the NLP pipelines are provided by TextImager, developed by the Text Technology Lab.