update aggregation scripts to use API to submit instead of pymongo #11
Comments
Thanks for summarizing the situation and laying out the acceptance criteria. I took a look at this today. Here are my English translations of all the database queries performed within generate_functional_agg.py, specifically.
Query 1: "Get all the distinct [...]" (nmdc-aggregator/generate_functional_agg.py, Line 120 in 3abf6ed)
Query 2: "For each document in the [...]" (nmdc-aggregator/generate_functional_agg.py, Line 121 in 3abf6ed)
Query 3: "Insert these documents into the [...]" (nmdc-aggregator/generate_functional_agg.py, Line 134 in 3abf6ed)
Query 4: "Get the document having this [...]"
Finally, here are the aliases that appear in the list of queries above. (nmdc-aggregator/generate_functional_agg.py, Lines 52 to 54 in 3abf6ed)
|
Similarly, here are my English translations of all the database queries performed within generate_metap_agg.py. They mirror the ones in the other file (i.e. same operations, different operands).
Query 1: "Get all the distinct [...]" (nmdc-aggregator/generate_metap_agg.py, Line 162 in 3abf6ed)
Query 2: "For each document in the [...]" (nmdc-aggregator/generate_metap_agg.py, Line 165 in 3abf6ed)
Query 3: "Insert these documents into the [...]" (nmdc-aggregator/generate_metap_agg.py, Line 186 in 3abf6ed)
Query 4: "Get the document having this [...]" (nmdc-aggregator/generate_metap_agg.py, Line 92 in 3abf6ed)
Finally, here are the aliases that appear in the list of queries above. (nmdc-aggregator/generate_metap_agg.py, Lines 56 to 58 in 3abf6ed)
|
At this point, I'm wondering whether the Runtime API already provides the endpoints necessary for performing those operations. If it does, I think this is ready for implementation. |
Query 4 inserts into the aggregation tables (functional_annotation_agg and metap_gene_function_aggregation), not data_object_set. The blocking ticket linked in the description, microbiomedata/nmdc-runtime#611, prevents us from using json:submit to add documents via the API. We could possibly use queries:run; I haven't tested that, but it would be nice to use an endpoint that has more validation. Additionally, metap_gene_function_aggregation is not defined in the schema, so I believe this rules out using any existing endpoints at this time. |
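For reference, here is a rough sketch of what a queries:run request body for an insert might look like. It uses the shape of MongoDB's `insert` database command. Whether queries:run accepts insert commands at all is an assumption that nobody in this thread has tested, and `build_insert_command` and `placeholder_field` are hypothetical names used only for illustration.

```python
import json


def build_insert_command(collection_name, documents):
    """Return a MongoDB-style insert command document.

    Assumption: the endpoint would accept this command shape; that is
    untested and not confirmed anywhere in this thread.
    """
    return {"insert": collection_name, "documents": list(documents)}


# Placeholder document; real aggregation records would come from the script.
command = build_insert_command(
    "metap_gene_function_aggregation",
    [{"placeholder_field": "value"}],
)
request_body = json.dumps(command)
```

Even if this works, it would bypass the schema validation that json:submit is expected to provide, which is the concern raised above.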
I'll add a topic to the agenda for tomorrow's infrastructure meeting, about addressing the things in the Runtime that are, or may be, blocking this. |
@aclum @eecavanna who is this issue assigned to? Who's working on this? |
@kheal addressed generate_metap_agg.py in #26 recently. @mbthornton-lbl could work on generate_functional_agg.py but this hasn't been prioritized yet. This is not currently a blocker but would have to get addressed before moving our mongo instance off of SPIN. |
Justification: To migrate the Runtime to the cloud for increased stability, we need to transition code that interacts with Mongo directly to API queries.
blocked by:
microbiomedata/nmdc-runtime#611 - resolved, we can use json:submit now to enter these records.
Acceptance criteria:
Both generate_functional_agg.py and generate_metap_agg.py generate a request body that is submitted to a Runtime API endpoint instead of using pymongo insert statements.
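A minimal sketch of what "generate a request body and submit it" could look like, assuming the endpoint accepts a JSON object that maps a collection name to a list of documents. The base URL, the `/metadata/json:submit` path, the bearer-token header, the record fields, and both helper function names are assumptions for illustration, not taken from the scripts or confirmed here.

```python
import json
import urllib.request

# Assumptions: placeholder host and endpoint path, not confirmed in this issue.
API_BASE = "https://api.example.org"
SUBMIT_PATH = "/metadata/json:submit"


def build_request_body(collection_name, agg_records):
    """Wrap aggregation records in a {collection_name: [documents]} payload."""
    return {collection_name: list(agg_records)}


def build_submit_request(body, token):
    """Prepare (but do not send) the POST request that carries the payload."""
    return urllib.request.Request(
        API_BASE + SUBMIT_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


# Placeholder record; field names are illustrative only.
records = [{"placeholder_id": "example-1", "count": 3}]
body = build_request_body("functional_annotation_agg", records)
request = build_submit_request(body, token="hypothetical-token")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Keeping body construction separate from the HTTP call means the payload can be checked without network access and the same builder can serve both scripts.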
cc @sanjaypjana @eecavanna @shreddd @mbthornton-lbl
Subtasks: