Improve documentation
ConstantineLignos committed Nov 10, 2021
1 parent 1a6273a commit 68bce31
Showing 2 changed files with 208 additions and 25 deletions.
227 changes: 203 additions & 24 deletions README.md
@@ -3,43 +3,196 @@
[![Documentation Status](https://readthedocs.org/projects/seqscore/badge/?version=latest)](https://seqscore.readthedocs.io/en/latest/?badge=latest)


SeqScore: Scoring for named entity recognition and other sequence labeling tasks
SeqScore provides scoring for named entity recognition and other
chunking tasks evaluated over sequence labels.


# Installation

To install the latest release of SeqScore, run:
To install the latest official release of SeqScore, run:
`pip install seqscore`

At this point, the released version is relatively out-of-date, but
will be updated once new documentation is ready.

For the latest version, check out the `main` branch (stable, but
sometimes newer than the version on PyPI), or the `dev` branch
(latest, but less tested).

To install from a clone of this repository, use:
`pip install -e .`
This will install the package and add the command `seqscore` in your
Python environment.


# Usage

## Overview

For a list of commands, run `seqscore --help`.
For a list of commands, run `seqscore --help`:
```
$ seqscore --help
Usage: seqscore [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
convert
count
repair
score
validate
```

## Scoring

The most common application of SeqScore is scoring CoNLL-format NER
predictions. Let's assume you have two files, one containing the
correct labels (annotation) and the other containing the predictions
(system output).

The correct labels are in the file `reference.bio`:
```
This O
is O
a O
sentence O
. O

University B-ORG
of I-ORG
Pennsylvania I-ORG
is O
in O
West B-LOC
Philadelphia I-LOC
, O
Pennsylvania B-LOC
. O
```

The predictions are in the file `predicted.bio`:
```
This O
is O
a O
sentence O
. O

University B-ORG
of I-ORG
Pennsylvania I-ORG
is O
in O
West B-LOC
Philadelphia B-LOC
, O
Pennsylvania B-LOC
. O
```

To score the predictions, run:
`seqscore score --labels BIO --reference reference.bio predicted.bio`
```
| Type | Precision | Recall | F1 | Reference | Predicted | Correct |
|--------|-------------|----------|--------|-------------|-------------|-----------|
| ALL | 50.00 | 66.67 | 57.14 | 3 | 4 | 2 |
| LOC | 33.33 | 50.00 | 40.00 | 2 | 3 | 1 |
| ORG | 100.00 | 100.00 | 100.00 | 1 | 1 | 1 |
```
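The numbers in this table can be reproduced by hand. The following is a minimal sketch, not SeqScore's implementation: it assumes the labels are already valid BIO, that each sentence is given as a list of labels, and that a chunk counts as correct only when its type and exact span match. The names `bio_chunks` and `score` are hypothetical helpers introduced for illustration.

```python
from typing import List, Set, Tuple

Chunk = Tuple[str, int, int]  # (type, start index, end index exclusive)


def bio_chunks(labels: List[str]) -> Set[Chunk]:
    """Extract chunks from a label sequence assumed to be valid BIO."""
    chunks: Set[Chunk] = set()
    start, chunk_type = None, None
    for i, label in enumerate(labels + ["O"]):  # sentinel closes a final chunk
        # Any label other than a matching I- label ends the open chunk
        if start is not None and label != f"I-{chunk_type}":
            chunks.add((chunk_type, start, i))
            start, chunk_type = None, None
        if label.startswith("B-"):
            start, chunk_type = i, label[2:]
    return chunks


def score(ref_sents: List[List[str]], pred_sents: List[List[str]]) -> Tuple[float, float, float]:
    """Return chunk-level (precision, recall, F1) over all sentences."""
    n_ref = n_pred = n_correct = 0
    for ref, pred in zip(ref_sents, pred_sents):
        ref_chunks, pred_chunks = bio_chunks(ref), bio_chunks(pred)
        n_ref += len(ref_chunks)
        n_pred += len(pred_chunks)
        n_correct += len(ref_chunks & pred_chunks)  # exact type and span match
    precision = n_correct / n_pred if n_pred else 0.0
    recall = n_correct / n_ref if n_ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Labels from reference.bio and predicted.bio, one list per sentence
reference = [["O"] * 5,
             ["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O"]]
predicted = [["O"] * 5,
             ["B-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "B-LOC", "O", "B-LOC", "O"]]
precision, recall, f1 = score(reference, predicted)
print(f"{precision * 100:.2f} {recall * 100:.2f} {f1 * 100:.2f}")
```

Predicting `B-LOC` for `Philadelphia` splits one reference chunk into two predicted chunks, neither of which matches, which is why 4 chunks are predicted but only 2 are correct.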

A few things to note:
* The reference file must be specified with the `--reference`
flag.
* The chunk encoding (BIO, BIOES, etc.) must be specified
using the `--labels` flag.
* Both files need to use the same chunk encoding. If you have
files that use different chunk encodings, use the `convert` command.
* You can get output in a different format using the `--score-format`
flag.

The above scoring command will work for files that do not have any
invalid transitions, that is, those that perfectly follow what the
encoding allows. However, consider this BIO-encoded file, which we'll
call `invalid.bio`:

```
This O
is O
a O
sentence O
. O

University I-ORG
of I-ORG
Pennsylvania I-ORG
is O
in O
West B-LOC
Philadelphia I-LOC
, O
Pennsylvania B-LOC
. O
```

Note that the token `University` has the label `I-ORG`, but there is
no preceding `B-ORG`. If we try to score it as before with `seqscore
score --labels BIO --reference reference.bio invalid.bio`, scoring
will fail with an error:
```
seqscore.encoding.EncodingError: Stopping due to validation errors in invalid.bio:
Invalid transition 'O' -> 'I-ORG' for token 'University' on line 7
```
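To make the notion of an invalid transition concrete, here is a small sketch of the check for BIO only. It is an illustration under the assumption that an `I-X` label is valid only directly after `B-X` or `I-X`, and it is not SeqScore's own validation code; `invalid_bio_transitions` is a hypothetical name.

```python
from typing import List


def invalid_bio_transitions(labels: List[str]) -> List[int]:
    """Return the indices of labels that are invalid given the previous label."""
    bad = []
    prev = "O"  # treat the start of a sentence like an outside label
    for i, label in enumerate(labels):
        if label.startswith("I-"):
            chunk_type = label[2:]
            # I-X may only continue a chunk of the same type
            if prev not in (f"B-{chunk_type}", f"I-{chunk_type}"):
                bad.append(i)
        prev = label
    return bad


# Labels of the second sentence of invalid.bio
invalid = ["I-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O"]
print(invalid_bio_transitions(invalid))
```

Only the first `I-ORG` is flagged: once it is treated as part of a chunk, the following `I-ORG` labels are valid continuations.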

To score output with invalid transitions, we need to specify a repair
method which can correct them. We can tell SeqScore to use the same
approach that conlleval uses (which we refer to as "begin" repair in our
paper): `seqscore score --labels BIO --repair-method conlleval --reference reference.bio invalid.bio`

Some examples:
```
# Score like conlleval
seqscore score --labels BIO --repair-method conlleval --reference <reference_conll_file> <prediction_conll_file>
# Score discarding invalid chunks, which sometimes produces higher scores
seqscore score --labels BIO --repair-method discard --reference <reference_conll_file> <prediction_conll_file>
seqscore validate --labels BIO <input_conll_file>
seqscore dump --labels BIO <input_conll_file> <output_delim_file>
Validation errors in sequence at line 7 of invalid.bio:
Invalid transition 'O' -> 'I-ORG' for token 'University' on line 7
Used method conlleval to repair:
Old: ('I-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'B-LOC', 'O')
New: ('B-ORG', 'I-ORG', 'I-ORG', 'O', 'O', 'B-LOC', 'I-LOC', 'O', 'B-LOC', 'O')
| Type | Precision | Recall | F1 | Reference | Predicted | Correct |
|--------|-------------|----------|--------|-------------|-------------|-----------|
| ALL | 100.00 | 100.00 | 100.00 | 3 | 3 | 3 |
| LOC | 100.00 | 100.00 | 100.00 | 2 | 2 | 2 |
| ORG | 100.00 | 100.00 | 100.00 | 1 | 1 | 1 |
```
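The repair shown in the log output can be sketched in a few lines. This is an illustrative approximation of the "begin" repair for BIO labels, assuming that every `I-X` that does not continue a chunk of type `X` is rewritten to `B-X`; it is not SeqScore's implementation, and `begin_repair` is a hypothetical name.

```python
from typing import List


def begin_repair(labels: List[str]) -> List[str]:
    """Rewrite any I-X that does not continue an X chunk as B-X."""
    repaired: List[str] = []
    prev_type = None  # type of the chunk the previous label belongs to, if any
    for label in labels:
        if label == "O":
            prev_type = None
        else:
            chunk_type = label[2:]
            if label.startswith("I-") and prev_type != chunk_type:
                label = f"B-{chunk_type}"  # start a new chunk instead
            prev_type = chunk_type
        repaired.append(label)
    return repaired


old = ["I-ORG", "I-ORG", "I-ORG", "O", "O", "B-LOC", "I-LOC", "O", "B-LOC", "O"]
new = begin_repair(old)
print(new)
```

Applied to the labels from `invalid.bio`, this changes only the first label, which matches the `Old` and `New` sequences in the log above.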

Scoring only supports BIO chunk encoding. Validation can be done for IO, BIO, and BIOES.
At the moment, `dump` only supports BIO, but support will be added for IO and BIOES.
You can use the `-q` flag to suppress the logging of all of the repairs
applied. You may want to also explore the `discard` repair, which can
produce higher scores for output from models without a CRF or constrained
decoding as they are more likely to produce invalid transitions.

## Other commands

Other commands are still being documented, but here is a quick summary:
* `repair`: Apply a repair method to a file, creating an output file with
only valid transitions.
* `convert`: Convert a file from one encoding to another.
* `count`: Output counts of chunks in the input file.
* `validate`: Check whether a file has any invalid transitions.


# FAQ

## Why can't I score output files that are in the format conlleval expects?

At this time, SeqScore intentionally does not support the "merged"
format used by conlleval where each line contains a token, correct
tag, and predicted tag:

```
University B-ORG B-ORG
of I-ORG I-ORG
Pennsylvania I-ORG I-ORG
is O O
in O O
West B-LOC B-LOC
Philadelphia I-LOC B-LOC
, O O
Pennsylvania B-LOC B-LOC
. O O
```

We do not support this format because we have found that creating
predictions in this format is a common source of errors in scoring
pipelines.
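If you already have output in the merged format, one workaround is to split it into separate reference and prediction files before scoring. The sketch below is an assumption-laden illustration and not part of SeqScore: it assumes exactly three whitespace-separated fields per line and blank lines between sentences, and `split_merged` is a hypothetical helper.

```python
from typing import Iterable, List, Tuple


def split_merged(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split merged 'token reference predicted' lines into two label files."""
    reference: List[str] = []
    predicted: List[str] = []
    for line in lines:
        fields = line.split()
        if not fields:  # blank line: keep the sentence boundary in both outputs
            reference.append("")
            predicted.append("")
            continue
        # Raises ValueError if a line does not have exactly three fields
        token, ref_label, pred_label = fields
        reference.append(f"{token} {ref_label}")
        predicted.append(f"{token} {pred_label}")
    return reference, predicted


merged = """\
University B-ORG B-ORG
of I-ORG I-ORG
Pennsylvania I-ORG I-ORG
is O O
in O O
West B-LOC B-LOC
Philadelphia I-LOC B-LOC
, O O
Pennsylvania B-LOC B-LOC
. O O
""".splitlines()
ref_lines, pred_lines = split_merged(merged)
print("\n".join(pred_lines))
```

Writing `ref_lines` and `pred_lines` to two files gives inputs in the two-column layout used by `reference.bio` and `predicted.bio` above.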


# Features coming soon!
@@ -51,17 +204,43 @@ At the moment, `dump` only supports BIO, but support will be added for IO and BI
# Citation

If you use SeqScore, please cite
[Addressing Barriers to Reproducible Named Entity Recognition Evaluation](https://arxiv.org/abs/2107.14154).
[SeqScore: Addressing Barriers to Reproducible Named Entity Recognition Evaluation](https://aclanthology.org/2021.eval4nlp-1.5/).

BibTeX:
```
@inproceedings{palen-michel-etal-2021-seqscore,
title = "{S}eq{S}core: Addressing Barriers to Reproducible Named Entity Recognition Evaluation",
author = "Palen-Michel, Chester and
Holley, Nolan and
Lignos, Constantine",
booktitle = "Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems",
month = nov,
year = "2021",
address = "Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.eval4nlp-1.5",
pages = "40--50",
}
```

# License

SeqScore is distributed under the MIT License.


# Setting up for development
# Development

For the latest development version, check out the `main` branch
(stable, but sometimes newer than the version on PyPI), or the `dev`
branch (latest, but less tested).

To install from a clone of this repository, use:
`pip install -e .`

## Setting up an environment for development

1. Create environment: `conda create -y -n seqscore python=3.8`
1. Create an environment: `conda create -y -n seqscore python=3.8`
2. Activate the environment: `conda activate seqscore`
3. Install dependencies: `pip install -r requirements.txt`
4. Install seqscore: `pip install -e .`
6 changes: 5 additions & 1 deletion seqscore/scripts/seqscore.py
@@ -213,7 +213,11 @@ def count(
type=click.Choice(SUPPORTED_SCORE_FORMATS),
show_default=True,
)
@click.option("--delim", default="\t", help="[default: tab]")
@click.option(
"--delim",
default="\t",
help="the delimiter to be used for delimited output (has no effect on input) [default: tab]",
)
@click.option("--quiet", "-q", is_flag=True)
def score(
file: List[str], # Name is "file" to make sense on the command line, but it's a list