
Text Mining & Analysis

Juan M. Banda edited this page Apr 12, 2020 · 18 revisions

This page is dedicated to the 'Text mining & Analysis' division of the CoVid 2019-Biohackathon 2020.

The division aims to produce rich analyses, build explanatory visualizations & dashboards, and curate & maintain datasets for future scientific research projects.

Collaborative Covid-19 literature annotation @ PubAnnotation

Since the LitCovid (by NCBI) and CORD-19 (by Allen Institute for AI) datasets were released, many groups have been producing and releasing annotations to these datasets. We have set up an environment to collect and integrate those annotation datasets at PubAnnotation, a public repository of literature annotation, and are organizing collaborative annotation of the Covid-19 literature datasets. Production and collection of various annotation datasets is ongoing, and we aim to release a meaningful amount of rich annotations by the end of the hackathon. Contribution of annotation datasets is completely open, and all contributed annotation datasets are immediately integrated and made accessible in various ways, including search, visualization, and fine-grained access.

Tasks worked on during the hackathon:

Identification of symptoms reported by Twitter users - Quantify how many users are claiming symptoms.

Lead: Juan M. Banda, Github Repo, Mini-Publication

Characterize the information/misinformation around potential COVID-19 treatments using Twitter data

Lead: Ramya Tekumalla, Github Repo, Mini-Publication
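As a rough illustration of the first task, the snippet below counts distinct users with at least one tweet that mentions a symptom term. It is only a minimal sketch and not the method used in the linked repositories: the symptom list, the `(user_id, text)` record layout, and the function name `count_users_mentioning_symptoms` are all assumptions made for this example. A keyword match also only shows that a symptom was mentioned, not that the user claims to have it.

```python
import re
from typing import Iterable, Tuple

# Hypothetical symptom lexicon, for illustration only (not from the hackathon repos).
SYMPTOM_TERMS = ["fever", "cough", "shortness of breath", "loss of smell"]

# Case-insensitive pattern with word boundaries, so a term only matches as a whole word.
SYMPTOM_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SYMPTOM_TERMS) + r")\b",
    re.IGNORECASE,
)


def count_users_mentioning_symptoms(tweets: Iterable[Tuple[str, str]]) -> int:
    """Count distinct users with at least one tweet containing a symptom term.

    `tweets` is assumed to be an iterable of (user_id, text) pairs.
    """
    users = set()
    for user_id, text in tweets:
        if SYMPTOM_PATTERN.search(text):
            users.add(user_id)
    return len(users)


# Made-up sample records, not real Twitter data.
sample = [
    ("u1", "I have a fever and a dry cough"),
    ("u1", "Feeling a bit better today"),
    ("u2", "Stay home, stay safe"),
    ("u3", "Worried about this Shortness of breath"),
]

print(count_users_mentioning_symptoms(sample))
```

Counting users rather than tweets keeps a single very active account from inflating the figure. Separating genuine self-reports from news or retweets would need more than keyword matching.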

Communication

Join the Slack workspace & head over to the 'text-mining-and-analysis' channel. It shall be fun.

Participants

Coordinator Ali Haider Bangash

Coordinator Yagoub A I Adam

Research hypotheses & mini-publications are invited in the realms of:

Transmission, incubation, and environmental stability of SARS-COV-2

Risk factors for CoVid 2019

Genetics, origin, and evolution of SARS-COV-2

Therapeutics & Vaccines against SARS-COV-2

Health systems' capacity to deal with the CoVid 2019 pandemic

Non-pharmaceutical interventions against the CoVid 2019 pandemic

Diagnostics & Surveillance against the CoVid 2019 pandemic

Information sharing and Inter-sectoral collaboration against the CoVid 2019 pandemic

Ethical and social science considerations related to dealing with & combating the CoVid 2019 pandemic

Resources

Paul Mooney's detailed schema, which breaks down the 'tasks' of the COVID-19 Open Research Dataset Challenge (CORD-19): An AI challenge with AI2, CZI, MSR, Georgetown, NIH & The White House into their fine-grained details, will guide you further.

Twitter data analysis using the Zenodo record at https://zenodo.org/record/3735274.
