Create simplified notebook that executes PFOCR clustering #538
Comments
That log message refers to the scores that are reported under |
Got it. Thanks Andrew. |
This is not from our BTE-PFOCR notebook, but I'm putting it here as an example since I made it recently for another project. The rows and columns are swapped from the suggestion above, and it is for a "normal" enrichment of PFOCR rather than the iterative enrichment-with-exclusion that we are planning. Nevertheless, it illustrates what can be learned from a heatmap view and highlights why we are interested in the exclusion strategy. Note the redundant representation by these 16 pathways of gene groupings that could equally be represented by just 2 or 3 pathways. |
Here is the repo I created for this issue. The PFOCR figure results using the example of 7 TRAPI results are in the jupyter notebook: https://github.com/wikipathways/BioThings_Explorer_PFOCR_prioritization/blob/main/bte_clustering_AA.ipynb
|
Ok. We're almost there! Some key feedback and decisions for the completion of this v1 issue:
|
All the features above are complete. Looks pretty good! Should be ready to demo next week. I propose opening new issues for additional bugs and features, or other varieties of PFOCR notebooks. |
And we're trying out a new name :) PET Notebook |
PET Notebook uses PFOCR CSV files as input. The Jupyter notebook that generates the required PFOCR input CSV files has been moved to the PFOCR repo. I have also updated the PFOCR pipeline details to ensure this notebook is run in the next release of PFOCR and the required CSV files are generated. This will keep the inputs required by PET notebook up to date with the latest release of PFOCR. |
(Sorry, forgot to update comments here.) I think the notebook looks great. I don't completely love the examples I suggested in the initial post to demonstrate the utility. Would be great to explore other options (I think one of you had suggested "genes related to Alzheimer's" and I think that would be great). Also noting that Alex presented to the Translator User-Centered Working Group and that was well-received. Would be good to follow up with them in a month or two from now... |
@andrewsu What steps did you follow to get the example URLs below? Kristina is testing some queries on the notebook and we are facing issues which I think might link back to the JSON URL.
|
When a query is posted to the ARS, you get back a JSON object with a primary key (PK). For example, these are the first few lines of an ARS response:
That PK can then be plugged into the ARAX UI. For the example above: https://arax.ncats.io/?r=8d85bbb4-2085-4ad8-a71a-b4dd5099c4a0. You can also get JSON output here: https://arax.ncats.io/api/arax/v1.3/response/8d85bbb4-2085-4ad8-a71a-b4dd5099c4a0. In either case, you can see that there is a child PK for BTE's response to this query: |
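The two URL patterns described above can be sketched as small helpers. This is only an illustration: the function names (`arax_ui_url`, `arax_json_url`, `fetch_response`) are made up for this example, the URL templates are copied from the links in this comment, and the 60-second timeout is an arbitrary assumption. It does not show how to locate the child PK for BTE, since that part of the response is not reproduced here.

```python
import json
from urllib.request import urlopen

# Base URL taken from the example links in this thread.
ARAX_BASE = "https://arax.ncats.io"


def arax_ui_url(pk: str) -> str:
    """URL of the ARAX UI page for a given ARS primary key (PK)."""
    return f"{ARAX_BASE}/?r={pk}"


def arax_json_url(pk: str) -> str:
    """URL of the JSON output for a given PK (v1.3 path as used in this thread)."""
    return f"{ARAX_BASE}/api/arax/v1.3/response/{pk}"


def fetch_response(pk: str, timeout: float = 60.0) -> dict:
    """Download and parse the JSON response for a PK (requires network access)."""
    with urlopen(arax_json_url(pk), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    pk = "8d85bbb4-2085-4ad8-a71a-b4dd5099c4a0"
    print(arax_ui_url(pk))
    print(arax_json_url(pk))
```

Calling `fetch_response(pk)` with a child PK would return the parsed JSON as a Python dict; large responses (tens of MB, as noted in the issue description) may need a longer timeout.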
@khanspers Tagging you here so you can see Andrew's response as well for getting the response JSON URL. |
Done. Pursuing joint strategies with other Translator teams for result list-level enrichment. |
This ticket is based on the exploratory work done in #451, but the goal here is to create a notebook that can be easily used and modified by other data analysts within Translator. This notebook should read in a TRAPI result, perform a PFOCR enrichment analysis over the entities in each TRAPI result, and then report groups of related results with the PFOCR figures that join them.
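As a rough illustration of what "PFOCR enrichment analysis" could look like, here is a minimal sketch that scores figures by gene overlap. The thread does not say which statistical test the notebook uses; a one-sided hypergeometric test is assumed here, and the function names, the `figures` mapping (figure ID to gene list), and the `universe_size` parameter are all hypothetical.

```python
from math import comb


def hypergeom_tail(overlap, figure_size, query_size, universe_size):
    """P(X >= overlap) when drawing query_size genes from universe_size genes,
    of which figure_size belong to the figure (one-sided hypergeometric test)."""
    total = comb(universe_size, query_size)
    upper = min(figure_size, query_size)
    hits = sum(
        comb(figure_size, i) * comb(universe_size - figure_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


def enrich(query_genes, figures, universe_size):
    """Return (figure_id, shared_genes, p_value) tuples, most significant first.

    `figures` is a hypothetical mapping of figure ID -> iterable of gene IDs.
    Figures with no overlap with the query are skipped.
    """
    query = set(query_genes)
    rows = []
    for fig_id, genes in figures.items():
        gene_set = set(genes)
        shared = query & gene_set
        if not shared:
            continue
        p = hypergeom_tail(len(shared), len(gene_set), len(query), universe_size)
        rows.append((fig_id, sorted(shared), p))
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    toy_figures = {"fig1": ["A", "B", "C"], "fig2": ["A", "X", "Y"], "fig3": ["Z"]}
    for row in enrich(["A", "B", "C"], toy_figures, universe_size=10):
        print(row)
```

Results that share an enriched figure could then be grouped under that figure. A real analysis would also need multiple-testing correction, which is omitted here.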
Requirements:
Imatinib - [Gene] - [Gene] - Asthma: https://arax.ncats.io/api/arax/v1.3/response/49d80ecb-7fd9-4ee6-a642-6d7994903f04 (41 MB, 2862 results); "results" in n1 and n2
Imatinib - [Gene] - Asthma: https://arax.ncats.io/api/arax/v1.3/response/7b14f961-9066-41f7-9e3b-d76b2b4a7fac (83 kB, 7 results); "results" in n1
message.query_graph.nodes should be used for clustering. The entities mapped to that node ID are the "result entities".
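The idea of "result entities" for a chosen query-graph node can be sketched as follows. This assumes the usual TRAPI layout, in which each entry in `message.results` has a `node_bindings` object keyed by query-graph node ID, with each binding carrying an `id`. The function name `result_entities` and the toy message are made up for illustration.

```python
def result_entities(trapi, qnode_id):
    """Collect the entity IDs bound to `qnode_id` in each TRAPI result.

    Returns one list of IDs per result, in result order.
    Raises KeyError if `qnode_id` is not a node in message.query_graph.nodes.
    """
    message = trapi["message"]
    if qnode_id not in message["query_graph"]["nodes"]:
        raise KeyError(f"{qnode_id!r} is not a node in message.query_graph.nodes")

    entities = []
    for result in message.get("results", []):
        bindings = result.get("node_bindings", {}).get(qnode_id, [])
        entities.append([binding["id"] for binding in bindings])
    return entities


if __name__ == "__main__":
    # Toy message with hypothetical identifiers, shaped like a TRAPI response.
    toy = {
        "message": {
            "query_graph": {"nodes": {"n0": {}, "n1": {}, "n2": {}}},
            "results": [
                {"node_bindings": {"n0": [{"id": "X:1"}], "n1": [{"id": "GENE:1"}]}},
                {"node_bindings": {"n0": [{"id": "X:1"}], "n1": [{"id": "GENE:2"}]}},
            ],
        }
    }
    print(result_entities(toy, "n1"))
```

For the two-gene example query, the same function could be called once for `n1` and once for `n2`, and the per-result lists combined before enrichment.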