Mention detection with Bert #151
base: main
Conversation
Hi @eriktks, this is a very long PR, so a very long review 😅
I found it very tricky to review, because there are small changes everywhere.
I have two main impressions. First, the way the code is currently structured makes it quite tricky to add a new tagger. The code itself is written with Flair in mind, and there is no interface for plugging in a new tagger (like Bert). I'm sure the code works well, but I would have liked to see a clean interface between Bert and Flair, so that any new tagger just has to implement this interface and it just works. I guess this is a bit of a pipe dream, but I think we should still strive to make small steps toward it.
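To make the interface idea more concrete, here is a minimal sketch of what it could look like. The class names (NERTagger, Mention, BertTagger) and the model name are mine, purely for illustration, and do not exist in REL:

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Mention:
    """A detected mention: character offsets, surface form, NER label, confidence."""
    start: int
    end: int
    text: str
    label: str
    score: float


class NERTagger(ABC):
    """Interface that every mention-detection backend would implement."""

    @abstractmethod
    def predict(self, text: str) -> List[Mention]:
        """Return all mentions found in one document."""


class BertTagger(NERTagger):
    """Example backend built on a HuggingFace token-classification pipeline."""

    def __init__(self, model_name: str = "dslim/bert-base-NER"):
        from transformers import pipeline
        self.pipe = pipeline("ner", model=model_name, aggregation_strategy="simple")

    def predict(self, text: str) -> List[Mention]:
        # Convert the pipeline's aggregated entity dicts into the common Mention type
        return [
            Mention(r["start"], r["end"], r["word"], r["entity_group"], float(r["score"]))
            for r in self.pipe(text)
        ]

The rest of the pipeline would then only ever call predict() and never care which backend is behind it.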
Second, most of the code is written in a way that makes it very difficult to grasp what is going on. This is an issue with the existing code, but also with the code in this PR: lots of nested dict and list lookups, lots of nested if statements and conditionals. I didn't even attempt to understand what most of it does, which also means it is difficult for me to form a picture of the overall design (see point 1).
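Purely as a generic illustration of what I mean (not code taken from this PR; doc and keep are placeholder names), naming intermediate lookups already goes a long way:

# hard to follow: the nested lookup is repeated in every condition
if doc["sentences"][i]["mentions"][j]["tag"] == "PER" and doc["sentences"][i]["mentions"][j]["score"] > 0.5:
    keep(doc["sentences"][i]["mentions"][j])

# easier to follow: name the value once, then work with it
mention = doc["sentences"][i]["mentions"][j]
if mention["tag"] == "PER" and mention["score"] > 0.5:
    keep(mention)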
evaluate_predictions.py
- This code only seems to be used inside the efficiency_test.py script. I don't think it should be part of the main REL module; it belongs inside the scripts directory.
mention_detection.py
- This module was already quite complex, and I think the changes you made do not make it easier to read. In fact, the Bert and Flair paths are completely entangled inside the code. For me at least, it is impossible to read it and get a feel for what is going on. I'm afraid this will make the code very difficult to debug and maintain, or to add a new tagger to... 🫣
- Have you thought about how to test this code on GitHub Actions? This is something I'm struggling with a bit, because there is no good 'small' dataset to test with at the moment.
- Related to that, can you make sure that the GitHub Actions at least do not fail? Maybe you can use markers to mark tests that will not run in the CI? (See the marker sketch below.)
- Other than that, I left a bunch of remarks with my thoughts on how this code could be improved. Have a look and see what you think.
Edit:
Just remembered that you can also use this decorator to skip the tests on the CI (I use this elsewhere):
import os
import pytest

@pytest.mark.skipif(
    os.getenv("GITHUB_ACTIONS") == "true",
    reason="No way of testing this on GitHub Actions.",
)
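For the marker approach mentioned in the review, a minimal sketch; the marker name needs_data is illustrative and not something defined in the repository:

# conftest.py -- register the custom marker so pytest does not warn about it
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "needs_data: test needs the large wiki_2019 data files"
    )

# in a test module: mark tests that cannot run in the CI
import pytest

@pytest.mark.needs_data
def test_bert_mention_detection():
    ...

The CI job can then run pytest -m "not needs_data", while a full local run still executes everything.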
GitHub Actions build now succeeds after changing the Python version and the pytest arguments.
We discussed this offline. The changes look good to me! 🚀
New version of REL with the option to use Bert instead of Flair for mention detection (named entity recognition)
Installation
python3 -m venv --prompt REL venv3
source venv3/bin/activate
git clone https://github.com/informagi/REL.git
cd REL
git checkout md-with-bert-2
pip install -e '.[develop]'
mkdir src/data
cd src/data
curl -O http://gem.cs.ru.nl/generic.tar.gz
curl -O http://gem.cs.ru.nl/ed-wiki-2019.tar.gz
curl -O http://gem.cs.ru.nl/wiki_2019.tar.gz
tar zxf wiki_2019.tar.gz
tar zxf ed-wiki-2019.tar.gz
tar zxf generic.tar.gz
cd ../..
Testing
Run the 13 tests:
pytest tests
Some warnings may be reported, but all tests should succeed.
List test options:
python3 src/scripts/efficiency_test.py -h
Process one test document with Flair, sentence-by-sentence, and report the performance:
python3 src/scripts/efficiency_test.py --tagger_ner_name flair --max_docs 1 --process_sentences
...
Results: PMD RMD FMD PEL REL FEL: 94.9% 63.8% 76.3% | 69.2% 46.6% 55.7%
Process all 50 test documents with uncased Bert base, document-by-document, maximum of 500 tokens, and report the performance:
python3 src/scripts/efficiency_test.py --tagger_ner_name bert_base_uncased --split_docs_value 500
...
Results: PMD RMD FMD PEL REL FEL: 93.5% 62.4% 74.8% | 62.9% 42.0% 50.3%
Use the server to process one test document with Flair, sentence-by-sentence, and report the performance:
python3 src/REL/server.py
python3 src/scripts/efficiency_test.py --max_docs 1 --use_server
...
Results: PMD RMD FMD PEL REL FEL: 95.0% 63.3% 76.0% | 67.5% 45.0% 54.0%
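For completeness, a sketch of querying the running server directly from Python. The port and the payload keys are assumptions based on the public REL API; check the startup output of src/REL/server.py for the actual host and port:

import requests

# Assumed host/port; adjust to what server.py reports on startup.
API_URL = "http://localhost:1235"

document = {
    "text": "Rome is the capital of Italy.",
    "spans": [],  # empty: let the tagger find the mentions itself
}

response = requests.post(API_URL, json=document)
print(response.json())  # detected mentions with their linked Wikipedia entities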