
Commit

Update README.md
lukasweymann authored Feb 22, 2024
1 parent 84dfd6c commit db730b7
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -8,7 +8,7 @@ Support for language-dependent components has been added for dozens of languages

Automated reports generated out of the tool that is actioned from a web application to which a corpus can be uploaded. Once processed, the viewer will plot the analysis and automatically generate a PDF report containing the same information.

-<img alt="Data Analytics Viewer" src="https://github.com/hplt-project/data-analytics-tool/blob/main/img/data-viewer.png" width=600 />
+<img alt="Data Analytics Viewer" src="https://github.com/lukasweymann/data-analytics-tool/blob/main/img/bilingual.png" width=600 />

Icon: https://thenounproject.com/icon/fingerprint-3530285/

@@ -17,6 +17,7 @@ Running the docker:
* sudo docker-compose build
* sudo docker-compose up


URLS to upload and view a dataset:
* Uploader: localhost:8000/uploader.html
* Viewer: localhost:8000/viewer.html
@@ -44,7 +45,7 @@ Code and data are located in `/work`

- Parallel English-Norwegian HPLT corpus from initial data release: it shows that deduplication needs to be addressed as one of the most important issues.

-<img alt="Data Analytics Viewer" src="https://github.com/hplt-project/data-analytics-tool/blob/main/img/HPLT-en-nn.png" width=600 />
+<img alt="Data Analytics Viewer" src="https://github.com/lukasweymann/data-analytics-tool/blob/main/img/monolingual.png" width=600 />


- Monolingual Turkish corpus from Bianet: it shows that at least a 12% of the corpus is not in Turkish.