
Mention detection with Bert #151

Open · wants to merge 62 commits into main

Conversation

@eriktks (Contributor) commented on Jan 5, 2023

New version of REL with the option to use Bert instead of Flair for mention detection (named entity recognition).

Installation

  1. python3 -m venv --prompt REL venv3
  2. source venv3/bin/activate
  3. git clone https://github.com/informagi/REL.git
  4. cd REL
  5. git checkout md-with-bert-2
  6. pip install -e '.[develop]'
  7. mkdir src/data
  8. cd src/data
  9. curl -O http://gem.cs.ru.nl/generic.tar.gz
  10. curl -O http://gem.cs.ru.nl/ed-wiki-2019.tar.gz
  11. curl -O http://gem.cs.ru.nl/wiki_2019.tar.gz
  12. tar zxf wiki_2019.tar.gz
  13. tar zxf ed-wiki-2019.tar.gz
  14. tar zxf generic.tar.gz
  15. cd ../..
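
As a quick check that the editable install worked (assuming the package is importable as REL, as the repository layout suggests):

```python
# Quick import check; REL is the package installed above with pip install -e.
import REL

print(REL.__file__)  # should point into the cloned repository
```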

Testing

Run the 13 tests:

  • pytest tests

Some warnings may be reported, but all tests should succeed.

List test options:

  • python3 src/scripts/efficiency_test.py -h

Process one test document with Flair, sentence-by-sentence, and report the performance:

  • python3 src/scripts/efficiency_test.py --tagger_ner_name flair --max_docs 1 --process_sentences
    ...
    Results: PMD RMD FMD PEL REL FEL: 94.9% 63.8% 76.3% | 69.2% 46.6% 55.7%
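
The six numbers appear to be precision, recall, and F-score for mention detection (PMD, RMD, FMD) and for entity linking (PEL, REL, FEL); on that reading, each F column is the harmonic mean (F1) of the P and R columns before it, which is easy to verify:

```python
# Sanity check of the reported scores: F1 is the harmonic mean of
# precision (P) and recall (R), here given as percentages.
def f1(p: float, r: float) -> float:
    return 2 * p * r / (p + r)

print(f"{f1(94.9, 63.8):.1f}")  # 76.3, matches FMD above
print(f"{f1(69.2, 46.6):.1f}")  # 55.7, matches FEL above
```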

Process all 50 test documents with uncased Bert base, document-by-document, with documents split into chunks of at most 500 tokens, and report the performance:

  • python3 src/scripts/efficiency_test.py --tagger_ner_name bert_base_uncased --split_docs_value 500
    ...
    Results: PMD RMD FMD PEL REL FEL: 93.5% 62.4% 74.8% | 62.9% 42.0% 50.3%

Use the server to process one test document with Flair, sentence-by-sentence, and report the performance:

  • python3 src/REL/server.py
  • python3 src/scripts/efficiency_test.py --max_docs 1 --use_server
    ...
    Results: PMD RMD FMD PEL REL FEL: 95.0% 63.3% 76.0% | 67.5% 45.0% 54.0%
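
For reference, a minimal client for the locally started server; the port and the payload shape below are assumptions, not taken from this PR (check src/REL/server.py for the actual defaults):

```python
# Hypothetical client for a locally running REL server; the port 5555 and
# the response format are assumptions, not code from this PR.
import requests

document = {
    "text": "REL was developed at Radboud University in Nijmegen.",
    "spans": [],  # no pre-annotated spans: let the tagger detect mentions
}

response = requests.post("http://localhost:5555", json=document)
response.raise_for_status()
for annotation in response.json():
    print(annotation)  # span offsets, mention text, linked entity, scores
```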

@stefsmeets (Contributor) left a comment
Hi @eriktks, this is a very long PR, so a very long review 😅

I found it very tricky to review, because there are small changes everywhere.

I have two main impressions. First, the way the code is currently structured makes it quite tricky to add a new tagger. The code itself is written with Flair in mind, and there is no interface for adding a new tagger (like Bert). I'm sure the code works well, but I would have liked to see a clean interface between Bert and Flair, so that any new tagger just has to implement this interface and it just works. I guess this is a bit of a pipe dream, but I think we should still strive to make small steps toward this.
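
For illustration, such an interface could look roughly like this (all names hypothetical, not code from this PR):

```python
# Hypothetical sketch of a common tagger interface; NERTagger, Span, and the
# concrete classes are illustrative names, not code from this PR.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Span:
    start: int   # character offset of the mention in the input text
    length: int  # length of the mention in characters
    text: str    # surface form of the mention
    tag: str     # NER label, e.g. "PER" or "ORG"


class NERTagger(ABC):
    @abstractmethod
    def predict(self, text: str) -> list[Span]:
        """Return the mentions detected in `text`."""


class FlairTagger(NERTagger):
    def predict(self, text: str) -> list[Span]:
        ...  # wrap the existing Flair code path here


class BertTagger(NERTagger):
    def predict(self, text: str) -> list[Span]:
        ...  # wrap the new Bert code path here
```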

Second, most of the code is written in a way that makes it very difficult to grasp what is going on. This is an issue with the existing code, but also with the code in this PR. Lots of nested dict and list lookups, lots of nested if statements and conditionals. I didn't even attempt to understand what most of it does. This also means it is difficult for me to form a picture of what is going on (see point 1).

  • evaluate_predictions.py - This code only seems to be used inside the efficiency_test.py script. I don't think it should be part of the main REL module. I think it belongs inside the scripts directory.
  • mention_detection.py - This module was already quite complex, and I think the changes you made do not make it easier to read. In fact, the Bert and Flair paths are completely entangled inside the code. For me at least, it is impossible to read it and get a feel for what is going on. I'm afraid this will make the code very difficult to debug and maintain, or, add a new tagger... 🫣
  • Have you thought about how to test this code on GitHub Actions? This is something I'm struggling with a bit, because there is no good 'small' dataset to test with at the moment.
  • Related to that, can you make sure that the GitHub Actions at least do not fail? Maybe you can use markers to mark tests that will not run in the CI? (See the sketch after this list.)
  • Other than that, I left a bunch of remarks with my thoughts on how this code could be improved. Have a look and see what you think.
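
One way to mark such tests could be a custom marker for everything that needs the downloaded data (a sketch; the marker name is made up and would need to be registered in the pytest configuration):

```python
# Sketch: tag data-dependent tests with a custom marker so the CI can
# deselect them with `pytest -m "not needs_data"`. The marker name
# "needs_data" is hypothetical.
import pytest


@pytest.mark.needs_data
def test_efficiency_on_full_dataset():
    ...
```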

Edit:
Just remembered that you can also use this decorator to skip the tests on the CI (I use this elsewhere):

import os

import pytest

@pytest.mark.skipif(
    os.getenv("GITHUB_ACTIONS") == "true", reason="No way of testing this on Github actions."
)
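
Applied to one of the data-dependent tests, that could look like this (the test name is illustrative):

```python
# The decorator above applied to a test; the test name is hypothetical.
import os

import pytest


@pytest.mark.skipif(
    os.getenv("GITHUB_ACTIONS") == "true",
    reason="No way of testing this on Github actions.",
)
def test_mention_detection_on_test_documents():
    ...
```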

Ten inline review threads (all outdated and resolved): five on scripts/efficiency_test.py, four on tests/test_evaluate_predictions.py, and one on tests/test_flair_md.py.
@eriktks (Contributor, Author) commented on Mar 19, 2024

The GitHub Actions build now succeeds after changing the Python version and the pytest arguments.

@stefsmeets self-requested a review on Apr 8, 2024
@stefsmeets previously approved these changes on Apr 8, 2024
@stefsmeets (Contributor) left a comment

We discussed this offline. The changes look good to me! 🚀
