Wikidata database label for Metabolite nodes #42

Closed
DeniseSl22 opened this issue Jun 21, 2017 · 14 comments
Comments

@DeniseSl22
Collaborator

Could a database label for Wikidata identifiers be added, please? :)

@JonathanMELIUS
Collaborator

This is because, in the datasources configuration files, Wikidata is registered as a "secondary identifier" (column 8: 0 = secondary identifier).

So it means that if we want Wikidata to appear in the metabolite dialog box in PathVisio, we have to change the 0 to 1.

@egonw, @mkutmon: Is there any reason why Wikidata is a "secondary identifier"?
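
To make the column-8 flag described above concrete, here is a minimal Python sketch. It assumes the file is tab-separated and that the eighth column holds `1` for primary and `0` for secondary, as stated in this comment. The helper name `is_primary` and the example row are hypothetical; the `-` values are placeholders, not real file contents.

```python
# Sketch only: read the primary/secondary flag from one datasources.txt-style row.
# Assumption: tab-separated columns, flag in column 8 (1 = primary, 0 = secondary).
PRIMARY_COLUMN = 8  # 1-based column number, as quoted in the comment above


def is_primary(row: str) -> bool:
    """Return True if the row's column-8 flag is '1'."""
    fields = row.rstrip("\n").split("\t")
    if len(fields) < PRIMARY_COLUMN:
        raise ValueError(
            f"expected at least {PRIMARY_COLUMN} columns, got {len(fields)}"
        )
    return fields[PRIMARY_COLUMN - 1].strip() == "1"


# Hypothetical row: only the name (column 1) and the flag (column 8) are meaningful.
example_row = "\t".join(["Wikidata"] + ["-"] * 6 + ["0"])
print(is_primary(example_row))  # False: the flag is 0, so the source is secondary
```

With the flag set to `0`, a source like this would not be offered in the PathVisio dialog, which matches the behaviour reported in this issue.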

@egonw
Collaborator

egonw commented Jun 22, 2017

I think that may originate from the days when I was not sure about the quality yet... but I'm OK to make it a primary ID now... @mkutmon, also OK?

I think we then need to update BridgeDb in PathVisio, right? Because we still cannot update just the datasources.txt (or .ttl) in PathVisio without a new BridgeDb release (something that annoys me)...

If so, there are one or two further datasources.txt entries I'd like to include in that: 1. EPA CompTox IDs, 2. BRENDA compound IDs...

@egonw
Collaborator

egonw commented Jun 22, 2017

Related to #43

@DeniseSl22
Collaborator Author

DeniseSl22 commented Jun 22, 2017

I think we should look through the whole list in the datasources configuration files, since you (Egon) asked me yesterday to manually change Pubchem_substance to pubchem_compounds. However, as long as people still have the option to use Pubchem_substance as a database, they can keep adding those as metabolite identifiers.

So I think we should evaluate the list, change the 0 to 1 for databases we do want to appear in PathVisio, change the 1 to 0 for databases we do not want to appear in PathVisio, and then update BridgeDb.
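
The review proposed above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the BridgeDb tooling: it assumes tab-separated rows with the data source name in column 1 and the primary flag in column 8. The function name `set_primary_flags` and the two rows are hypothetical, and the `-` values are placeholders.

```python
# Sketch only: flip the column-8 flag for selected data sources.
# Assumptions: tab-separated rows, name in column 1, flag in column 8.


def set_primary_flags(lines, make_primary, make_secondary, column=8):
    """Return new rows with the flag set to 1 or 0 for the named sources."""
    updated = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        if len(fields) >= column:
            if name in make_primary:
                fields[column - 1] = "1"
            elif name in make_secondary:
                fields[column - 1] = "0"
        updated.append("\t".join(fields))
    return updated


# Hypothetical rows using the names mentioned in this thread.
rows = [
    "\t".join(["Wikidata"] + ["-"] * 6 + ["0"]),
    "\t".join(["Pubchem_substance"] + ["-"] * 6 + ["1"]),
]
updated = set_primary_flags(rows, {"Wikidata"}, {"Pubchem_substance"})
for row in updated:
    print(row.split("\t")[0], row.split("\t")[7])
```

Rows that are not named in either set are passed through unchanged, so the list can be reviewed one source at a time.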

@egonw
Collaborator

egonw commented Jun 22, 2017

Denise, but don't just change the data source; change the identifier too!

Yes, making PubChem substance a secondary data source sounds like a good idea!

@DeniseSl22
Collaborator Author

Yes, I got that I need to change the identifier as well, otherwise it is just wrong.

So, are there any other databases that can be changed from primary to secondary and vice versa?

@DeniseSl22
Collaborator Author

Apparently InChI IDs are not valid anymore in PathVisio (they result in broken links, check here). So I think this identifier option should also be removed.

@egonw
Collaborator

egonw commented Aug 14, 2017

The problem with InChIs is that BridgeDb does not support them for many compounds, as the String length exceeds what is allowed by BridgeDb's Derby backend...
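
The length problem described above can be illustrated with a small Python sketch. The limit used here (`MAX_ID_LENGTH = 50`) is a made-up value for illustration; the actual column size in the Derby backend is not stated in this thread. The long string is a stand-in, not a real InChI. The InChIKey shown is the standard key for ethanol; InChIKeys always have a fixed length of 27 characters, whereas an InChI grows with the size of the molecule.

```python
# Sketch only: check whether an identifier fits a length-limited ID column.
MAX_ID_LENGTH = 50  # hypothetical limit; the real Derby column size is not given here


def fits_backend(identifier: str, max_length: int = MAX_ID_LENGTH) -> bool:
    """Return True if the identifier is no longer than the assumed limit."""
    return len(identifier) <= max_length


inchikey = "LFQSCWFLJHTTHZ-UHFFFAOYSA-N"  # ethanol InChIKey, 27 characters
long_stand_in = "InChI=1S/" + "C" * 100    # stand-in for a long InChI, not a real one

print(len(inchikey), fits_backend(inchikey))
print(len(long_stand_in), fits_backend(long_stand_in))
```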

@DeniseSl22
Collaborator Author

Okay, that explains it. But is the InChIKey good to use as an identifier?
About the secondary data IDs: according to Tina and Jonathan, this should be changed in the metabolites BridgeDb file (so that people cannot use certain databases which are known to give problems, like PubChem substance, in PathVisio).

@mkutmon
Collaborator

mkutmon commented Sep 1, 2017

@egonw did you change Wikidata to primary? Because even after your pull request with v2.2.0 it doesn't show up. When I check, it doesn't seem to be primary. @JonathanMELIUS can you confirm this?

@mkutmon
Collaborator

mkutmon commented Sep 1, 2017

hmm seems to be primary in datasources.txt at the moment - need to check to make sure I have got the correct bridgedb jars

@mkutmon
Collaborator

mkutmon commented Sep 1, 2017

okay this is weird - in tag 2.2.0 of BridgeDb Wikidata seems to be a primary ds (https://github.com/bridgedb/BridgeDb/blob/release_2.2.0/org.bridgedb.bio/resources/org/bridgedb/bio/datasources.txt) but if I unjar the org.bridgedb.bio.jar and look at the datasources.txt file in there it is a secondary ds. @egonw @JonathanMELIUS any ideas what might be going on?

In any case, as long as it isn't a primary ds in the datasources.txt it doesn't show up in the dropdown box in PV.
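
The check described above (looking at the datasources.txt packaged inside the jar rather than the one in the repository) can be done without unpacking by hand, since a jar is a zip archive. The Python sketch below uses the standard `zipfile` module. The entry path `org/bridgedb/bio/datasources.txt` is an assumption inferred from the repository URL above, and the name-in-column-1, flag-in-column-8 layout is assumed as well. The function name `primary_flag_in_jar` is hypothetical.

```python
# Sketch only: read the column-8 flag for one data source straight from a jar.
import zipfile


def primary_flag_in_jar(jar_path, source_name,
                        entry="org/bridgedb/bio/datasources.txt", column=8):
    """Return the flag string for source_name, or None if no matching row exists.

    Assumptions: the entry path inside the jar, tab-separated rows,
    name in column 1, flag in the given 1-based column.
    """
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read(entry).decode("utf-8")
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == source_name and len(fields) >= column:
            return fields[column - 1].strip()
    return None


# Example call (jar file name taken from the comment above):
# primary_flag_in_jar("org.bridgedb.bio.jar", "Wikidata")
```

Running this against both the released jar and a freshly built one would show whether the packaged file lags behind the repository.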

@egonw
Collaborator

egonw commented Sep 1, 2017

My bad! My PR was from longer ago, but I was focussed on the bug fix... the commit to make Wikidata primary came after that PR: bridgedb/BridgeDb@7b91a01

I will make a new PR!

@DeniseSl22
Collaborator Author

Nice! Looking forward to seeing that one ;)
