check import of FAERS indication data #122

Open
andrewsu opened this issue Aug 29, 2019 · 1 comment

Comments

@andrewsu
Member

andrewsu commented Aug 29, 2019

Using the code in https://github.com/SuLab/faers, Greg parsed FAERS data for drug indications and produced this file https://zenodo.org/record/1436000#.XWcVWShKguU. In that file (record 836), there are three diseases listed for the drug bupropion: Depression, Anxiety, Bipolar disorder. However, in the bot run to add FAERS indications, the diff for bupropion only added anxiety: https://www.wikidata.org/w/index.php?title=Q834280&diff=756477246&oldid=737266253

Should investigate why... (and while we're at it, look at automating the parsing and updates...)

(tagging @stuppie in case you remember a reason why this might have been by design...)

@gtsueng

gtsueng commented Sep 19, 2019

Looking at the bupropion example, the Wikidata items for depression and bipolar disorder do not appear to carry the Mondo IDs listed in the FAERS data.
Neither Q4340209 nor Q131755 has a Mondo ID, and SPARQL queries for MONDO:0002050 or MONDO:0004985 return no entries, whereas a query for MONDO:0011918 does. The function 'normalize_to_qids' appears to use Mondo IDs:

mondo_qid = wdi_helpers.id_mapper(PROPS['Mondo ID'])
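To illustrate how a lookup like this could drop two of the three indications without any error, here is a minimal sketch. It assumes `id_mapper` returns a plain dict from external ID to QID and that unmapped IDs are simply skipped, which would match the observed diff but is not confirmed from the bot code. The dict contents, the `Q_FOUND` placeholder, and the helper `split_by_mapping` are hypothetical.

```python
# Stand-in for the dict assumed to come back from wdi_helpers.id_mapper
# (external ID -> QID). "Q_FOUND" is a placeholder, not a real QID.
mondo_qid = {"MONDO:0011918": "Q_FOUND"}


def split_by_mapping(ids, mapping):
    """Return (mapped, unmapped) so that missing IDs are reported, not dropped."""
    mapped = {i: mapping[i] for i in ids if i in mapping}
    unmapped = [i for i in ids if i not in mapping]
    return mapped, unmapped


# The three Mondo IDs mentioned above for the bupropion record.
bupropion_indications = ["MONDO:0002050", "MONDO:0011918", "MONDO:0004985"]

mapped, unmapped = split_by_mapping(bupropion_indications, mondo_qid)
print("mapped:", mapped)
print("unmapped:", unmapped)
```

Logging the `unmapped` list during a bot run would make gaps like this visible instead of silent.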

The FAERS data does have UMLS CUIs for the indications, and a quick SPARQL query with the CUIs for depression and bipolar disorder successfully pulls up the Wikidata items for both.
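A rough sketch of what such a CUI lookup could look like, built as a query string in Python. It assumes the UMLS CUI property is P2892 (worth double-checking on Wikidata), and `build_cui_query` is a hypothetical helper. The CUI in the usage line is a made-up placeholder, not one taken from the FAERS file.

```python
def build_cui_query(cuis, prop="P2892"):
    """Build a SPARQL query returning Wikidata items for the given UMLS CUIs.

    `prop` is assumed to be the Wikidata property for UMLS CUI.
    """
    values = " ".join(f'"{c}"' for c in cuis)
    return (
        "SELECT ?item ?cui WHERE {\n"
        f"  VALUES ?cui {{ {values} }}\n"
        f"  ?item wdt:{prop} ?cui .\n"
        "}"
    )


# "C0000000" is a placeholder CUI used only for illustration.
query = build_cui_query(["C0000000"])
print(query)
```

The resulting string can be pasted into the Wikidata Query Service or sent by whatever SPARQL client the bot already uses.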

Coverage of the FAERS Mondo IDs in Wikidata compared to the FAERS UMLS CUIs:
number of unique UMLS CUIs in FAERS data: 800
number of unique FAERS UMLS CUIs found in Wikidata via SPARQL: 739
number of unique Mondo IDs in FAERS data: 774
number of unique FAERS Mondo IDs found in Wikidata: 524
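Expressed as percentages, the counts above work out as follows (a quick calculation using only the four numbers listed; the `coverage` helper is just for illustration):

```python
def coverage(found, total):
    """Percentage of identifiers that resolved to a Wikidata item."""
    return 100 * found / total


cui_cov = coverage(739, 800)
mondo_cov = coverage(524, 774)

print(f"UMLS CUI coverage: {cui_cov:.1f}%")    # 92.4%
print(f"Mondo ID coverage: {mondo_cov:.1f}%")  # 67.7%
print(f"Mondo IDs with no Wikidata match: {774 - 524}")  # 250
```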

No one-to-many or many-to-one mapping issues were found when pulling Wikidata items by Mondo ID, which is probably why the script used Mondo IDs. In contrast, UMLS CUIs had both one-to-many and many-to-one mapping issues when used to pull Wikidata entities via SPARQL.

75 unique FAERS UMLS CUIs pulled 156 unique Wikidata entities via SPARQL (one-to-many)
54 unique FAERS UMLS CUIs pulled 26 unique Wikidata entities via SPARQL (many-to-one)
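For reference, one way to flag both kinds of conflict from a list of (CUI, QID) pairs returned by SPARQL is sketched below. The helper `mapping_conflicts` and the sample pairs are hypothetical placeholders, not the actual FAERS results.

```python
from collections import defaultdict


def mapping_conflicts(pairs):
    """Split (cui, qid) pairs into one-to-many and many-to-one conflicts."""
    by_cui = defaultdict(set)
    by_qid = defaultdict(set)
    for cui, qid in pairs:
        by_cui[cui].add(qid)
        by_qid[qid].add(cui)
    # One CUI resolving to several Wikidata items.
    one_to_many = {c: q for c, q in by_cui.items() if len(q) > 1}
    # Several CUIs resolving to the same Wikidata item.
    many_to_one = {q: c for q, c in by_qid.items() if len(c) > 1}
    return one_to_many, many_to_one


# Placeholder pairs for illustration only.
pairs = [
    ("CUI_A", "Q1"), ("CUI_A", "Q2"),  # one-to-many
    ("CUI_B", "Q3"), ("CUI_C", "Q3"),  # many-to-one
    ("CUI_D", "Q4"),                   # clean one-to-one
]
one_to_many, many_to_one = mapping_conflicts(pairs)
print(one_to_many)
print(many_to_one)
```

If CUIs were used as a fallback for indications without a Mondo match, restricting the fallback to CUIs that appear in neither dict would avoid the ambiguous cases.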
