This project contains the source code for generating the disease association evidence data which is used by the Open-Targets platform.
To generate a new data release, run the following script:
% cd src/bin
% ./OpenTargetsCreator -all
This will create several files:
- uniprot-valid.json - contains a JSON object representing a disease association on each line.
- open-targets-*.log - the log file reporting on the progress of the JSON generation.
- cttv011-DD-MM-YYYY.json.gz - a gzipped file containing uniprot-valid.json. This is the file to submit to the Open-Targets CoreDB team. Note: it is only generated if the JSON generation completes successfully.
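Before submitting, it can be useful to confirm that uniprot-valid.json really holds one JSON object per line, as described above. The following is a minimal Python 3 sketch, not part of this codebase; the function name count_valid_json_lines is hypothetical, and the check only tests that each line parses as a JSON object, not that it conforms to the Open-Targets schema.

```python
import json


def count_valid_json_lines(path):
    """Return (valid, invalid) counts for a file with one JSON object per line.

    Hypothetical helper: it checks JSON syntax only, not the evidence schema.
    """
    valid = 0
    invalid = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored rather than counted
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                invalid += 1
                continue
            # Each line is expected to be a JSON object, i.e. a dict in Python.
            if isinstance(record, dict):
                valid += 1
            else:
                invalid += 1
    return valid, invalid
```

Calling count_valid_json_lines("uniprot-valid.json") from the directory that holds the output should report zero invalid lines for a healthy run.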
Check the generated logs to see whether many errors were encountered. If unusual errors appear, fix the codebase and rerun the script.
The log files often contain the following warning, which is not a problem:
- WARN u.a.e.u.ot.mapper.FFOmim2EfoMapper - No mapping found for OMIM: XXXXXX
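To separate the expected OMIM mapping warning from anything unusual, the log can be filtered. This is a rough Python 3 sketch under two assumptions that are not confirmed by this README: that the log level appears as the literal word WARN or ERROR on each line, and that the benign warning always contains the text shown above. The function name unexpected_log_lines is hypothetical.

```python
import re

# The known, harmless warning described in this README.
BENIGN = re.compile(r"FFOmim2EfoMapper - No mapping found for OMIM:")
# Assumption: log levels appear as the bare words WARN or ERROR.
LEVEL = re.compile(r"\b(WARN|ERROR)\b")


def unexpected_log_lines(lines):
    """Return WARN/ERROR lines that are not the known OMIM mapping warning."""
    return [
        line.rstrip("\n")
        for line in lines
        if LEVEL.search(line) and not BENIGN.search(line)
    ]
```

The function accepts any iterable of lines, so an open log file object can be passed directly; an empty result suggests that only the expected warning was logged.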
After the data has been successfully generated, it needs to be deposited with the Open-Targets CoreDB team, in their Google Bucket location.
Should you need to contact the CoreDB team, they can be emailed here: [email protected]
- Please make sure the JSON version in DefaultBaseFactory.java (CTTV_SCHEMA_VERSION) is up to date, as required by the Open-Targets team.
- Please make sure the japi version in the pom points to the latest version in the private repo - https://mvnrepository.com/artifact/uk.ac.ebi.uniprot/japi?repo=ebi-repo
- Rerun the command if it fails with java.net.SocketTimeoutException: Read timed out
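The rerun-on-timeout advice above could be automated with a small wrapper. The following Python 3 sketch is an illustration only: it assumes the exception text is written to the script's standard output or standard error (it may in fact appear only in the log file), and the names run_with_retry and max_attempts are hypothetical.

```python
import subprocess

TIMEOUT_MARKER = "java.net.SocketTimeoutException: Read timed out"


def run_with_retry(command, max_attempts=3, runner=subprocess.run):
    """Run command, rerunning it only when it fails with a read timeout.

    Returns the attempt number that succeeded. Any other failure, or a
    timeout on the final attempt, raises RuntimeError.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(command, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        output = (result.stdout or "") + (result.stderr or "")
        if TIMEOUT_MARKER not in output:
            # Not the known transient error, so do not mask it by retrying.
            raise RuntimeError(f"command failed on attempt {attempt}")
    raise RuntimeError(f"still timing out after {max_attempts} attempts")
```

From src/bin, this would be called as run_with_retry(["./OpenTargetsCreator", "-all"]). Failures other than the timeout are raised immediately so that genuine problems are not hidden by the retry loop.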
Python version 2 was previously required.
31.10.2022: Python 2 is no longer supported, and the script would fail because of its inline Python 2 code. The inline code has been changed to Python 3 syntax, so Python 2 is no longer required.