Graph response formatting fails when no pipeline data is in the database #367
Labels
bug:functional
Epic
Is there an existing issue for this?
Expected Behavior
No response
Current Behavior
When submitting any cohort query to an n-API v0.4.0, the query results in an `Internal server error`.
The same issue does not occur if the n-API is run in aggregated mode, or for a graph database containing at least one subject with pipeline metadata.
Error message
Error in the n-API container logs:
This error is misleading, as it gives the impression that the relevant section of code (api/app/api/crud.py, lines 222 to 232 in 9913736) is using a non-existent or deprecated argument / has some inherent syntax error.
However, what's actually happening is that the code assumes `reset_index()` is operating on a `pd.Series` (which DOES have the `name` argument). But something is going wrong in the logic for `session_completed_pipeline_data` such that it's producing a `pd.DataFrame` instead (which DOESN'T have the `name` argument for `reset_index()`).
Source of problem
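The two pandas behaviours involved in this bug can be reproduced in isolation. The sketch below is not the n-API code: the frame, its column names, and its values are hypothetical stand-ins for query results in which no session has pipeline metadata. It only illustrates that grouping on an all-NaN key with the default `dropna=True` leaves an empty `pd.DataFrame`, and that `reset_index(name=...)` is accepted by a `pd.Series` but rejected by a `pd.DataFrame`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for query results where no session has pipeline
# metadata; the column names are illustrative, not taken from crud.py.
records = pd.DataFrame(
    {
        "session_id": ["ses-01", "ses-02"],
        "pipeline_name": [np.nan, np.nan],
        "pipeline_version": [np.nan, np.nan],
    }
)

# Default dropna=True: rows with a NaN group key are discarded, so an
# all-NaN key leaves no groups and the result is an empty DataFrame.
grouped_default = records.groupby("pipeline_name").count()

# dropna=False keeps NaN as its own group key, so one group survives.
grouped_keep_nan = records.groupby("pipeline_name", dropna=False).count()

# Series.reset_index() accepts `name` to label the values column ...
counts = pd.Series([2, 1], index=pd.Index(["a", "b"], name="key"))
as_frame = counts.reset_index(name="n")

# ... but DataFrame.reset_index() has no `name` parameter, so the same
# call on a DataFrame raises a TypeError about an unexpected keyword.
try:
    counts.to_frame("n").reset_index(name="n")
    dataframe_call_failed = False
except TypeError:
    dataframe_call_failed = True
```

A check on the object type (or on emptiness) before calling `reset_index(name=...)` would be one way to surface this case explicitly, though whether that is the right fix for the n-API depends on what the response should contain when there is no pipeline data.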
Earlier in the code, when `pipeline_grouped_data` is constructed (api/app/api/crud.py, lines 202 to 220 in 9913736), we are dropping NaNs during the `groupby`, meaning that when there are no pipeline names in the data, we get an empty dataframe like:
As a result, when we then try to run `groupby` again on this object to construct `session_completed_pipeline_data`, that has no effect and still returns a `pd.DataFrame`, causing the unexpected keyword error when we then try to run `reset_index()` on it.
If we instead set `dropna=False` in the groupby when constructing `pipeline_grouped_data`, there is no longer an error, but the resulting `completed_pipelines` field for a single subject-session looks like this in the response:
Environment
How to reproduce
No response
Anything else?
Some considerations
Why this wasn't caught by our tests
api/tests/conftest.py
Lines 111 to 168 in 9913736
a. It assumes a dataset with info containing pipeline metadata
b. It is an aggregated response
To avoid similar issues
i. ONLY phenotypic data
ii. ONLY phenotypic + BIDS data (no derivatives)
iii. ONLY phenotypic + derivatives data (no BIDS)
iv. Phenotypic + BIDS + derivatives
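One way to make sure all four dataset compositions are exercised is to generate the test cases from a small table rather than writing each fixture by hand. The following is only a sketch under assumptions: `make_mock_records`, the record fields, and the placeholder values are hypothetical and do not reflect the existing fixtures in api/tests/conftest.py. Each composition is crossed with both response modes, since the existing test only covered an aggregated response.

```python
from itertools import product


def make_mock_records(has_bids: bool, has_derivatives: bool) -> list:
    """Build hypothetical query-result rows for one subject-session."""
    record = {
        "sub_id": "sub-01",
        "session_id": "ses-01",
        "age": 30,  # phenotypic data is present in every variant
        # Placeholder values below are illustrative only.
        "image_modality": "T1w" if has_bids else None,
        "pipeline_name": "example-pipeline" if has_derivatives else None,
        "pipeline_version": "1.0.0" if has_derivatives else None,
    }
    return [record]


# Maps each dataset composition (i-iv) to (has_bids, has_derivatives).
DATASET_VARIANTS = {
    "phenotypic_only": (False, False),
    "phenotypic_bids": (True, False),
    "phenotypic_derivatives": (False, True),
    "phenotypic_bids_derivatives": (True, True),
}

RESPONSE_MODES = ("aggregated", "non_aggregated")

# One case per (composition, response mode) pair: 4 x 2 = 8 cases.
TEST_CASES = [
    (variant, mode, make_mock_records(*flags))
    for (variant, flags), mode in product(DATASET_VARIANTS.items(), RESPONSE_MODES)
]
```

A list like `TEST_CASES` could then be passed to `pytest.mark.parametrize` so that the response-formatting code is run once per case.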