From 1ba32f9fc3f5e7dd8937f5a702b8f96e53978383 Mon Sep 17 00:00:00 2001
From: Christopher Bryant
Date: Sun, 12 Aug 2018 15:21:33 +0100
Subject: [PATCH] Added changelog

---
 changelog.md | 22 ++++++++++++++++++++++
 readme.md    |  4 ----
 2 files changed, 22 insertions(+), 4 deletions(-)
 create mode 100644 changelog.md

diff --git a/changelog.md b/changelog.md
new file mode 100644
index 0000000..5831bcc
--- /dev/null
+++ b/changelog.md
@@ -0,0 +1,22 @@
+# Changelog
+
+This document contains descriptions of all the significant changes made to ERRANT since its release.
+
+## 10-08-18
+
+Added support for multiple annotators in `parallel_to_m2.py`.
+Before: `python3 parallel_to_m2.py -orig -cor -out `
+After: `python3 parallel_to_m2.py -orig -cor [ ...] -out `
+This is helpful if you have multiple annotations for the same orig file.
+
+## 17-12-17
+
+In November, spaCy changed significantly when it became version 2.0.0. Although we have not tested ERRANT with this new version, the main change seemed to be a slight increase in performance (pos tagging and parsing etc.) at a significant cost to speed. Consequently, we still recommend spaCy 1.9.0 for use with ERRANT.
+
+## 22-11-17
+
+ERRANT would sometimes run into memory problems if sentences were long and very different. We hence changed the default alignment from breadth-first to depth-first. This bypassed the memory problems, made ERRANT faster and barely affected results.
+
+## 10-05-17
+
+ERRANT v1.0 released.
\ No newline at end of file
diff --git a/readme.md b/readme.md
index 0f4e3e7..0766224 100644
--- a/readme.md
+++ b/readme.md
@@ -34,8 +34,6 @@ Currently, we only support Python 3. It is safest to install everything in a cle
 
 spaCy is a natural language processing (NLP) toolkit available here: https://spacy.io/.
 
-UPDATE 17/12/17: In early November, spaCy underwent significant changes when it became version 2.0.0. Although we have not tested ERRANT with this new version of spaCy, the main difference seems to be a slight increase in performance at a significant cost to speed. As such, we currently recommend the slightly older spaCy v1.9.0 for use with ERRANT.
-
 It can be installed for Python 3 as follows:
 ```
 pip3 install -U spacy==1.9.0
@@ -89,8 +87,6 @@ All these scripts also have additional advanced command line options which can b
 
 In terms of speed, automatic edit extraction is the bottleneck. As a guideline, it takes roughly 10 seconds (including loading times) to extract and classify the edits in 100 sentences on an Intel Core i5-6600 @ 3.30GHz machine. In contrast, it takes just 0.2 seconds to classify the edits in the same 100 sentences if the edit boundaries are already known. Bear in mind that these figures are only a rough estimate and runtime actually depends on how different the original and corrected sentences are and how many edits they contain.
 
-UPDATE 22/11/17: When sentences were long and very different, ERRANT would sometimes run into memory problems. We fixed this by changing the default alignment behaviour from breadth-first to depth-first. Experiments showed this barely affects the results and we even saw improvements. It should also make ERRANT faster.
-
 # Edit Extraction
 
 For more information about the edit extraction phase of annotation, we refer the reader to the following paper: