Improve documentation

bltlab · Nov 10, 2021 · 68bce31
1 parent 1a6273a
commit 68bce31

Showing 2 changed files with 208 additions and 25 deletions.
diff --git a/README.md b/README.md
@@ -3,43 +3,196 @@
 [![Documentation Status](https://readthedocs.org/projects/seqscore/badge/?version=latest)](https://seqscore.readthedocs.io/en/latest/?badge=latest)
 
 
-SeqScore: Scoring for named entity recognition and other sequence labeling tasks
+SeqScore provides scoring for named entity recognition and other
+chunking tasks evaluated over sequence labels.
 
 
 # Installation
 
-To install the latest release of SeqScore, run:
+To install the latest official release of SeqScore, run:
 `pip install seqscore`
 
-At this point, the released version is relatively out-of-date, but
-will be updated once new documentation is ready.
-
-For the latest version, check out the `main` branch (stable, but
-sometimes newer than the version on PyPI), or the `dev` branch
-(latest, but less tested).
-
-To install from a clone of this repository, use:
-`pip install -e .`
+This will install the package and add the `seqscore` command to your
+Python environment.
 
 
 # Usage
 
 ## Overview
 
-For a list of commands, run `seqscore --help`.
+For a list of commands, run `seqscore --help`:
+```
+$ seqscore --help
+Usage: seqscore [OPTIONS] COMMAND [ARGS]...
+
+Options:
+  --help  Show this message and exit.
+
+Commands:
+  convert
+  count
+  repair
+  score
+  validate
+```
+
+## Scoring
+
+The most common application of SeqScore is scoring CoNLL-format NER
+predictions. Let's assume you have two files, one containing the
+correct labels (annotation) and the other containing the predictions
+(system output).
+
+The correct labels are in the file `reference.bio`:
+```
+This O
+is O
+a O
+sentence O
+. O
+
+University B-ORG
+of I-ORG
+Pennsylvania I-ORG
+is O
+in O
+West B-LOC
+Philadelphia I-LOC
+, O
+Pennsylvania B-LOC
+. O
+```
+
+The predictions are in the file `predicted.bio`:
+```
+This O
+is O
+a O
+sentence O
+. O
+
+University B-ORG
+of I-ORG
+Pennsylvania I-ORG
+is O
+in O
+West B-LOC
+Philadelphia B-LOC
+, O
+Pennsylvania B-LOC
+. O
+```
+
+To score the predictions, run:
+`seqscore score --labels BIO --reference reference.bio predicted.bio`
+```
+| Type   |   Precision |   Recall |     F1 |   Reference |   Predicted |   Correct |
+|--------|-------------|----------|--------|-------------|-------------|-----------|
+| ALL    |       50.00 |    66.67 |  57.14 |           3 |           4 |         2 |
+| LOC    |       33.33 |    50.00 |  40.00 |           2 |           3 |         1 |
+| ORG    |      100.00 |   100.00 | 100.00 |           1 |           1 |         1 |
+```
+
+A few things to note:
+* The reference file must be specified with the `--reference`
+  flag.
+* The chunk encoding (BIO, BIOES, etc.) must be specified
+  using the `--labels` flag.
+* Both files need to use the same chunk encoding. If you have
+  files that use different chunk encodings, use the `convert` command.
+* You can get output in a different format using the `--score-format`
+  flag.
+
+The above scoring command will work for files that do not have any
+invalid transitions, that is, those that perfectly follow what the
+encoding allows. However, consider this BIO-encoded file, which we'll
+call `invalid.bio`:
+
+```
+This O
+is O
+a O
+sentence O
+. O
+
+University I-ORG
+of I-ORG
+Pennsylvania I-ORG
+is O
+in O
+West B-LOC
+Philadelphia I-LOC
+, O
+Pennsylvania B-LOC
+. O
+```
+
+Note that the token `University` has the label `I-ORG`, but there is
+no preceding `B-ORG`. If we try to score it as before with `seqscore
+score --labels BIO --reference reference.bio invalid.bio`, scoring
+will fail with an error:
+```
+seqscore.encoding.EncodingError: Stopping due to validation errors in invalid.bio:
+Invalid transition 'O' -> 'I-ORG' for token 'University' on line 7
+```
+
+To score output with invalid transitions, we need to specify a repair
+method which can correct them. We can tell SeqScore to use the same
+approach that conlleval uses (which we refer to as "begin" repair in our
+paper): `seqscore score --labels BIO --repair-method conlleval --reference reference.bio invalid.bio`
 
-Some examples:
 ```
-# Score like conlleval
-seqscore score --labels BIO --repair-method conlleval --reference <reference_conll_file> <prediction_conll_file>
-# Score discarding invalid chunks, which sometimes produces higher scores
-seqscore score --labels BIO --repair-method discard --reference <reference_conll_file> <prediction_conll_file>
-seqscore validate --labels BIO <input_conll_file>
-seqscore dump --labels BIO <input_conll_file> <output_delim_file>
+Validation errors in sequence at line 7 of invalid.bio:
+Invalid transition 'O' -> 'I-ORG' for token 'University' on line 7
+Used method conlleval to repair:
+Old: ('I-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'B-LOC', 'O')
+New: ('B-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'B-LOC', 'O')
+| Type   |   Precision |   Recall |     F1 |   Reference |   Predicted |   Correct |
+|--------|-------------|----------|--------|-------------|-------------|-----------|
+| ALL    |      100.00 |   100.00 | 100.00 |           3 |           3 |         3 |
+| LOC    |      100.00 |   100.00 | 100.00 |           2 |           2 |         2 |
+| ORG    |      100.00 |   100.00 | 100.00 |           1 |           1 |         1 |
 ```
 
-Scoring only supports BIO chunk encoding. Validation can be done for IO, BIO, and BIOES.
-At the moment, `dump` only supports BIO, but support will be added for IO and BIOES.
+You can use the `-q` flag to suppress logging of the repairs that are
+applied. You may also want to explore the `discard` repair, which can
+produce higher scores for output from models without a CRF or constrained
+decoding, as such models are more likely to produce invalid transitions.
+
+## Other commands
+
+Other commands are still being documented, but here is a quick summary:
+* `repair`: Apply a repair method to a file, creating an output file with
+  only valid transitions.
+* `convert`: Convert a file from one encoding to another.
+* `count`: Output counts of chunks in the input file.
+* `validate`: Check whether a file has any invalid transitions.
+
+
+# FAQ
+
+## Why can't I score output files that are in the format conlleval expects?
+
+At this time, SeqScore intentionally does not support the "merged"
+format used by conlleval where each line contains a token, correct
+tag, and predicted tag:
+
+```
+University B-ORG B-ORG
+of I-ORG I-ORG
+Pennsylvania I-ORG I-ORG
+is O O
+in O O
+West B-LOC B-LOC
+Philadelphia I-LOC B-LOC
+, O O
+Pennsylvania B-LOC B-LOC
+. O O
+```
+
+We do not support this format because we have found that creating
+predictions in this format is a common source of errors in scoring
+pipelines.
 
 
 # Features coming soon!
@@ -51,17 +204,43 @@ At the moment, `dump` only supports BIO, but support will be added for IO and BI
 # Citation
 
 If you use SeqScore, please cite
-[Addressing Barriers to Reproducible Named Entity Recognition Evaluation](https://arxiv.org/abs/2107.14154).
+[SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation](https://aclanthology.org/2021.eval4nlp-1.5/).
 
+BibTeX:
+```
+@inproceedings{palen-michel-etal-2021-seqscore,
+    title = "{S}eq{S}core: Addressing Barriers to Reproducible Named Entity Recognition Evaluation",
+    author = "Palen-Michel, Chester  and
+      Holley, Nolan  and
+      Lignos, Constantine",
+    booktitle = "Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems",
+    month = nov,
+    year = "2021",
+    address = "Punta Cana, Dominican Republic",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2021.eval4nlp-1.5",
+    pages = "40--50",
+}
+```
 
 # License
 
 SeqScore is distributed under the MIT License.
 
 
-# Setting up for development
+# Development
+
+For the latest development version, check out the `main` branch
+(stable, but sometimes newer than the version on PyPI), or the `dev`
+branch (latest, but less tested).
+
+To install from a clone of this repository, use:
+`pip install -e .`
+
+## Setting up an environment for development
 
-1. Create environment: `conda create -y -n seqscore python=3.8`
+1. Create an environment: `conda create -y -n seqscore python=3.8`
 2. Activate the environment: `conda activate seqscore`
 3. Install dependencies: `pip install -r requirements.txt`
 4. Install seqscore: `pip install -e .`
+   x
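
The scoring tables and the conlleval-style repair shown in the README text above can be reproduced by hand. The following is a minimal sketch, not SeqScore's implementation: the helper names `begin_repair`, `chunks`, and `score` are hypothetical, and it assumes that "begin" repair simply rewrites an `I-` label that does not continue a chunk of the same type as `B-`. It works on the second sentence of the example files only; the first sentence contains no chunks.

```python
def begin_repair(labels):
    """Assumed 'begin' repair: an I- label that does not continue a chunk
    of the same type becomes the matching B- label."""
    repaired = []
    prev = "O"
    for label in labels:
        if label.startswith("I-"):
            prev_type = prev[2:] if prev != "O" else None
            if prev_type != label[2:]:
                label = "B-" + label[2:]
        repaired.append(label)
        prev = label
    return repaired


def chunks(labels):
    """Return the set of (type, start, end) spans in valid BIO labels.
    The end index is exclusive."""
    spans = set()
    start, chunk_type = None, None
    # A trailing "O" sentinel closes a chunk that runs to the last token.
    for i, label in enumerate(list(labels) + ["O"]):
        if start is not None and not label.startswith("I-"):
            spans.add((chunk_type, start, i))
            start = None
        if label.startswith("B-"):
            start, chunk_type = i, label[2:]
    return spans


def score(reference, predicted):
    """Exact-match chunk precision, recall, and F1 as fractions."""
    ref, pred = chunks(reference), chunks(predicted)
    correct = len(ref & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Labels for "University of Pennsylvania is in West Philadelphia , Pennsylvania ."
reference = ["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O"]
predicted = ["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "B-LOC", "O", "B-LOC", "O"]
invalid = ["I-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O"]

p, r, f = score(reference, predicted)
print(f"{p * 100:.2f} {r * 100:.2f} {f * 100:.2f}")  # 50.00 66.67 57.14

repaired = begin_repair(invalid)
print(repaired == reference)  # True
```

The predicted labels split `West Philadelphia` into two `LOC` chunks, so only 2 of the 4 predicted chunks match the 3 reference chunks, which gives the `ALL` row of the first table. After the assumed repair, the invalid file matches the reference exactly, consistent with the 100.00 scores in the second table.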
diff --git a/seqscore/scripts/seqscore.py b/seqscore/scripts/seqscore.py
@@ -213,7 +213,11 @@ def count(
     type=click.Choice(SUPPORTED_SCORE_FORMATS),
     show_default=True,
 )
-@click.option("--delim", default="\t", help="[default: tab]")
+@click.option(
+    "--delim",
+    default="\t",
+    help="the delimiter to be used for delimited output (has no effect on input) [default: tab]",
+)
 @click.option("--quiet", "-q", is_flag=True)
 def score(
     file: List[str],  # Name is "file" to make sense on the command line, but it's a list