Building a literature curation software
Objectives
The Curator software answers our (Hill Laboratory) need for a methodological approach to literature curation (click here for a description of what is meant by literature curation in the present context) that has the following desirable features:
- The approach should allow for collaborative curation.
  - Different curators can annotate the same document concurrently.
  - The annotations produced must therefore be easy to merge.
  - For the sake of scientific reproducibility, the history of modifications must be traceable.
- The result of the curation process should be easily reusable by others. It must therefore not rely on the curator's implicit knowledge: the important information associated with the annotations must be stated explicitly.
- The output of the curation process must be easily machine-readable. Although any computer file is machine-readable, what makes it easily readable is consistent formatting (e.g., CSV files) with key fields that use a highly consistent terminology. This terminology should ideally be linked to identifiers from external entities (e.g., ontological entries), allowing annotations to be cross-referenced, indexed, and searched in relation to specific concepts (e.g., specific brain regions).
- Annotations must be precisely and reliably localizable in the document of origin.
- The process of annotating a document must be as light as possible. Curation is performed by domain experts who do it in the service of other, overarching goals. Producing a curation output that respects the criteria laid out here must therefore not be perceived as a significant increase in workload compared with an informal review of the literature.
- The design of the system must be lightweight and simple, and must rest on existing tools, so that it can be designed and implemented in a matter of weeks. Its maintenance should not require a significant amount of work.
- Annotations should be shareable without raising copyright issues (e.g., they cannot be embedded in full-text documents that are themselves copyrighted).
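To make several of these objectives concrete (explicit fields, consistent formatting, links to external identifiers, localization in the source document, and easy merging), here is a minimal Python sketch of what an annotation record could look like. It is only an illustration under stated assumptions: the `Annotation` class, its field names, the `merge` and `to_csv` helpers, and the placeholder identifiers (`doc-001`, `EXAMPLE:0000001`) are all hypothetical and do not describe the actual format used by the Curator software.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class Annotation:
    """One curated fact. Every field name here is a hypothetical illustration."""
    document_id: str  # identifier of the source document (placeholder scheme)
    curator: str      # who produced the annotation
    quoted_text: str  # short excerpt that anchors the annotation
    start: int        # character offset where the excerpt starts
    end: int          # character offset just past the excerpt
    term_id: str      # identifier of an external ontology entry (placeholder)
    term_label: str   # human-readable label of that entry
    value: str        # the curated information itself


def merge(*batches):
    """Combine batches from several curators, dropping exact duplicates
    while preserving the order of first appearance."""
    seen = set()
    merged = []
    for batch in batches:
        for annotation in batch:
            if annotation not in seen:
                seen.add(annotation)
                merged.append(annotation)
    return merged


def to_csv(annotations):
    """Serialize annotations to CSV text with one column per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Annotation)])
    writer.writeheader()
    for annotation in annotations:
        writer.writerow(asdict(annotation))
    return buffer.getvalue()


# Two curators working on the same document; the second batch also
# contains a copy of the first curator's annotation.
first = Annotation("doc-001", "curator_a", "example excerpt", 120, 135,
                   "EXAMPLE:0000001", "example region", "example value")
second = Annotation("doc-001", "curator_b", "another excerpt", 300, 315,
                    "EXAMPLE:0000001", "example region", "another value")

merged = merge([first], [first, second])
csv_text = to_csv(merged)
print(csv_text)
```

Because the record carries only a document identifier, offsets, and a short excerpt rather than the full text, files like this could be shared separately from the copyrighted documents. Keeping them as plain text would also let an existing version-control tool track the history of modifications, although that choice is an assumption of this sketch rather than something stated above.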