Basic idea

The name comes from the Old Norse word spá or spæ, which refers to prophesying and is cognate with the present English word “spy,” continuing Proto-Germanic *spah- and the Proto-Indo-European root *(s)peḱ (to see, to observe). Source: vǫlva (wikipedia)

Unstructured PDF documents remain the main vehicle for dissemination of scientific findings. Those interested in gathering and assimilating data must therefore manually peruse published articles and extract from these the elements of interest. Evidence-based medicine provides a compelling illustration of this: many person-hours are spent each year extracting summary information from articles that describe clinical trials. Machine learning provides a potential means of mitigating this burden by automating extraction.

But for automated approaches to be useful to end-users, we need tools that allow domain experts to interact with, and benefit from, model predictions. To this end, we present a web-based tool called Spá that accepts an article as input and outputs a rendering of that article with automatic visual annotations. More generally, Spá provides a framework for visualizing predictions, both at the document and sentence level, for full-text PDFs.

What is Spá concretely

Spá is our client-side library for rendering and editing annotations on PDF documents. It was initially conceived to render predictions of machine learning systems trained on full-text literature from the biomedical domain.

The original design was published as “Spá: A Web-Based Viewer for Text Mining in Evidence Based Medicine” in the Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2014) [doi, preprint].

Later, Spá was reworked so that it can be included as a git submodule in the Vortext Annotate and Vortext Demo projects.

How does it work?

The major components of Spá are:

Mozilla PDF.js
React
Backbone.js
RequireJS
Hypothesis dom-anchor-bitap (experimental)

PDF.js is responsible for rendering the document. Normally PDF.js does this by rendering the document to a <canvas> and placing a series of <div> elements on top for text selection (the textLayer). We replaced the textLayer with our own custom React component, which gives us full control over what happens in the textLayer without resorting to hacks.
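As an illustration of what a custom text layer makes possible, the sketch below splits one string of text-layer content into plain and highlighted segments, which a component could then render as separate spans. The segmentText helper and the { start, end, label } annotation shape are assumptions made for this example; they are not Spá's actual data format or API.

```javascript
// Hypothetical helper: split one text-layer string into plain and
// highlighted segments. The { start, end, label } shape is an assumption
// made for this sketch, not Spá's actual annotation format.
function segmentText(text, annotations) {
  // Sort a copy by start offset so the caller's array is not mutated.
  const sorted = [...annotations].sort((a, b) => a.start - b.start);
  const segments = [];
  let cursor = 0;
  for (const { start, end, label } of sorted) {
    // Clamp to the string and skip any part already covered (overlaps).
    const from = Math.max(start, cursor);
    const to = Math.min(end, text.length);
    if (from >= to) continue;
    if (from > cursor) {
      segments.push({ text: text.slice(cursor, from), label: null });
    }
    segments.push({ text: text.slice(from, to), label });
    cursor = to;
  }
  // Whatever remains after the last annotation is plain text.
  if (cursor < text.length) {
    segments.push({ text: text.slice(cursor), label: null });
  }
  return segments;
}

// Usage: one predicted span inside a sentence.
const parts = segmentText("Patients were randomised.", [
  { start: 14, end: 24, label: "randomization" },
]);
console.log(parts);
```

Each segment with a non-null label could be rendered with a highlight class, while unlabeled segments stay as ordinary selectable text.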

To maintain state we use Backbone models and collections. We coordinate the model layer and the view layer by using contraptions we call dispatchers. Dispatchers are defined by the projects that include Spá, not here. The general idea is that a dispatcher listens for model changes (Backbone events) and updates the React components’ state accordingly using the setState or forceUpdate methods. The components receive the Backbone models as props, and are allowed to call their methods to initiate change. It’s not as pretty as Flux with immutable data structures (or ClojureScript), but it does the job for now.
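The flow described above can be sketched roughly as follows. To keep the example self-contained, TinyModel and TinyComponent are minimal stand-ins for a Backbone model and a React component; createDispatcher and the page attribute are hypothetical names, since the real dispatchers live in the projects that include Spá.

```javascript
// Minimal stand-in for a Backbone model: get/set plus "change" events.
// (Backbone.Model provides on, get and set with similar semantics.)
class TinyModel {
  constructor(attrs = {}) {
    this.attrs = { ...attrs };
    this.handlers = [];
  }
  on(event, handler) {
    if (event === "change") this.handlers.push(handler);
  }
  get(key) {
    return this.attrs[key];
  }
  set(key, value) {
    this.attrs[key] = value;
    this.handlers.forEach((handler) => handler(this));
  }
}

// Minimal stand-in for a React component: only setState is modelled.
class TinyComponent {
  constructor(props) {
    this.props = props; // the model arrives as a prop
    this.state = {};
  }
  setState(partial) {
    this.state = { ...this.state, ...partial };
  }
  // A user action calls a model method instead of editing state directly.
  selectPage(n) {
    this.props.model.set("page", n);
  }
}

// Hypothetical dispatcher: listens for model changes and pushes them
// into the component's state.
function createDispatcher(model, component) {
  const sync = () => component.setState({ page: model.get("page") });
  model.on("change", sync);
  sync(); // populate the initial state
}

// Usage: the component initiates a change, the dispatcher reflects it back.
const model = new TinyModel({ page: 1 });
const viewer = new TinyComponent({ model });
createDispatcher(model, viewer);
viewer.selectPage(3);
console.log(viewer.state.page);
```

Data moves in one direction here: components call model methods, the model emits a change event, and the dispatcher turns that event into a state update.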

How to use it?

Spá can be used by including it in other projects and defining a dispatcher. It is not meant to be used directly. Currently the following projects use Spá:

Vortext demo (for running predictions)
Vortext

Contributing

Currently this is a research object. The API and organizational structure are subject to change. Comments and suggestions are much appreciated. For code contributions: fork, branch, and send a pull request.

License

Spá is open source and licensed under GPLv3. See LICENSE for more information.
