Problem with jmespath: unexpected behavior with single items (not arrays) #323

Closed
colleenXu opened this issue Apr 2, 2024 · 6 comments
@colleenXu

colleenXu commented Apr 2, 2024

I'm trying to query MyChem's drugcentral info and only keep drugcentral.bioactivity objects with the matching action_type value (POSITIVE MODULATOR). So I'm setting the parameter jmespath to drugcentral.bioactivity|[?action_type=='POSITIVE MODULATOR']

But when I use that jmespath on this document, C0017845_doesnt_work.json, I end up with drugcentral.bioactivity: null, which is unexpected. (To retrieve this document yourself, use this GET query.)

A POST query with chemical C0017845 that returns null

curl --location --globoff "https://mychem.info/v1/query?size=1000&fields=drugcentral.bioactivity.uniprot.uniprot_id,drugcentral.bioactivity.action_type,drugcentral.bioactivity.act_source,drugcentral.bioactivity.organism&jmespath=drugcentral.bioactivity|[?action_type=='POSITIVE%20MODULATOR']" \
--header 'Content-Type: application/json' \
--data '{
  "q": [
      [ "C0017845", "POSITIVE MODULATOR"]
      ],
  "scopes": ["drugcentral.xrefs.umlscui", "drugcentral.bioactivity.action_type"]
}'


By contrast, when I use the same jmespath on a different document, C0018549_works.json, it works as intended. (To retrieve this document yourself, use this GET query.)

A POST query with chemical C0018549 that works as intended

curl --location --globoff "https://mychem.info/v1/query?size=1000&fields=drugcentral.bioactivity.uniprot.uniprot_id,drugcentral.bioactivity.action_type,drugcentral.bioactivity.act_source,drugcentral.bioactivity.organism&jmespath=drugcentral.bioactivity|[?action_type=='POSITIVE%20MODULATOR']" \
--header 'Content-Type: application/json' \
--data '{
  "q": [
      [ "C0018549", "POSITIVE MODULATOR"]
      ],
  "scopes": ["drugcentral.xrefs.umlscui", "drugcentral.bioactivity.action_type"]
}'

I do notice a difference between these two documents that may account for this:

  • in the first case, drugcentral.bioactivity is an object (since there was only 1 thing)
  • in the second case, drugcentral.bioactivity is an array of objects

If this is the key, I'd like jmespath to be able to gracefully handle both situations...
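
A minimal Python sketch of the suspected cause, using only the standard library. It mimics one rule from the JMESPath specification: a filter expression such as [?action_type=='POSITIVE MODULATOR'] is only defined for arrays and evaluates to null for any other type. The helper name jmespath_style_filter and the trimmed-down documents are hypothetical; this is not the BioThings or jmespath implementation.

```python
def jmespath_style_filter(value, key, expected):
    """Mimic a JMESPath filter like [?key=='expected'].

    A filter expression is only defined for arrays; applied to any
    other type (for example a single object) it evaluates to null.
    """
    if not isinstance(value, list):
        return None
    return [
        item for item in value
        if isinstance(item, dict) and item.get(key) == expected
    ]


# Hypothetical, trimmed-down documents (values are illustrative only).
single = {"drugcentral": {"bioactivity": {"action_type": "POSITIVE MODULATOR"}}}
multiple = {"drugcentral": {"bioactivity": [
    {"action_type": "POSITIVE MODULATOR"},
    {"action_type": "AGONIST"},
]}}

result_single = jmespath_style_filter(
    single["drugcentral"]["bioactivity"], "action_type", "POSITIVE MODULATOR"
)
result_multiple = jmespath_style_filter(
    multiple["drugcentral"]["bioactivity"], "action_type", "POSITIVE MODULATOR"
)

print(result_single)    # the single-object case
print(result_multiple)  # the array case
```

If this matches what the API does, the single-object document comes back as null even though its one bioactivity entry has the requested action_type.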

@colleenXu

Another example is described in biothings/biothings_explorer#316 (comment)

@colleenXu

colleenXu commented Apr 9, 2024

I've updated the opening post to match the discussions in Slack and to add the JSON documents.

The proposed fix is to adjust the BioThings API's always_list parameter so that it is applied before jmespath. Then adjust the x-bte annotations to add it to queries using jmespath as a precaution. The syntax is always_list=drugcentral.bioactivity (that is, the path to the field that should be an array for jmespath to work on it).
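
To illustrate why the ordering matters, here is a rough Python sketch of an always_list-style normalization run before the filter step. The always_list function below is a hypothetical stand-in written for this example; it only handles dict-only dotted paths and is not the actual BioThings code.

```python
def always_list(doc, dotted_path):
    """Wrap the value at dotted_path in a list if it is not one already.

    Hypothetical helper: modifies doc in place and returns it. Paths that
    are missing or that pass through non-dict values are left untouched.
    """
    *parents, leaf = dotted_path.split(".")
    node = doc
    for key in parents:
        node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            return doc  # path absent: nothing to normalize
    if isinstance(node, dict) and leaf in node and not isinstance(node[leaf], list):
        node[leaf] = [node[leaf]]
    return doc


# Hypothetical single-object document, as in the failing case.
doc = {"drugcentral": {"bioactivity": {"action_type": "POSITIVE MODULATOR"}}}

# Step 1: normalize first, so the field is an array.
always_list(doc, "drugcentral.bioactivity")

# Step 2: the filter now has a list to work on.
hits = [
    entry for entry in doc["drugcentral"]["bioactivity"]
    if entry.get("action_type") == "POSITIVE MODULATOR"
]
print(hits)
```

Running the normalization after the filter would not help, because by then the filter has already produced null for the single-object case.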

@colleenXu

Assigning to myself to review and see if this is fixed.

@colleenXu

colleenXu commented Dec 7, 2024

Using always_list fixes this behavior! I'm going to add it to all jmespath operations while doing biothings/biothings_explorer#733, so track updates there.
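
For reference, a small Python sketch of how the query URL from the opening post could be built with always_list added. The endpoint, field names, jmespath expression, and always_list value are taken from this thread; the variable names are made up for the example, and the sketch only builds and prints the URL without sending a request.

```python
from urllib.parse import urlencode, quote

# Fields requested in the opening post's queries.
fields = [
    "drugcentral.bioactivity.uniprot.uniprot_id",
    "drugcentral.bioactivity.action_type",
    "drugcentral.bioactivity.act_source",
    "drugcentral.bioactivity.organism",
]

params = {
    "size": 1000,
    "fields": ",".join(fields),
    "jmespath": "drugcentral.bioactivity|[?action_type=='POSITIVE MODULATOR']",
    # Path to the field that should always be an array before jmespath runs.
    "always_list": "drugcentral.bioactivity",
}

# quote_via=quote encodes spaces as %20 (not '+'), matching the curl example.
url = "https://mychem.info/v1/query?" + urlencode(params, quote_via=quote)
print(url)
```

Letting urlencode percent-encode the quotes, brackets, and pipe also avoids the shell-quoting pitfalls of writing the jmespath expression by hand.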

@colleenXu

FYI:

  • I think the key BioThings change is 7df44a3. As I understand it, this allows the always_list transformation to be done first, before jmespath processing (see convo in lab Slack).
  • MyChem was updated with that BioThings change on May 9 (lab Slack link)
  • I don't know if other APIs have been updated.

@colleenXu

Regarding current x-bte operations:

  • I ended up adding always_list to all MyChem drugcentral.bioactivity x-bte operations: they all use jmespath and the field's value could be an object (single) or array (multiple)
  • I ended up NOT adding always_list to any other current use of jmespath. For those cases and fields, I saw single-element arrays. So I assumed the field's value was always an array. See comments in these commits.
