Improper merging of PFOCR edge attributes from multiple Records #463
Comments
and maybe only one
Basically, I think the "last" Record processed is what is kept (all prior ones are overwritten?), and instead we want to make the values arrays and add elements to them. Something like this is already done for other APIs ingested through x-bte annotation, like semmeddb (think PubMed IDs) and MyVariant. |
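The suspected behavior and the desired one can be sketched as follows. This is only an illustration under assumptions, not the actual BTE code: the `Attrs` shape and the `mergeOverwrite` / `mergeAccumulate` helpers are hypothetical names, and the attribute values are placeholders rather than real PFOCR data.

```typescript
// Hypothetical shapes for illustration only; not the real BTE Record class.
type AttrValue = string | number;
type Attrs = { [name: string]: AttrValue };
type MergedAttrs = { [name: string]: AttrValue | AttrValue[] };

// "Last Record wins": each Record overwrites whatever earlier Records set.
function mergeOverwrite(records: Attrs[]): MergedAttrs {
  const merged: MergedAttrs = {};
  for (const rec of records) {
    for (const key of Object.keys(rec)) {
      merged[key] = rec[key];
    }
  }
  return merged;
}

// Accumulate instead: keep a single value as-is, and promote it to an
// array once a second value arrives for the same attribute.
function mergeAccumulate(records: Attrs[]): MergedAttrs {
  const merged: MergedAttrs = {};
  for (const rec of records) {
    for (const key of Object.keys(rec)) {
      const existing = merged[key];
      if (existing === undefined) {
        merged[key] = rec[key];
      } else if (Array.isArray(existing)) {
        existing.push(rec[key]);
      } else {
        merged[key] = [existing, rec[key]];
      }
    }
  }
  return merged;
}

// Placeholder values, not real PFOCR data.
const exampleRecords: Attrs[] = [
  { figure_url: "url-1", figure_title: "title-1" },
  { figure_url: "url-2", figure_title: "title-2" },
];
const overwritten = mergeOverwrite(exampleRecords);
const accumulated = mergeAccumulate(exampleRecords);
```

With the overwrite version only the second Record's `figure_url` survives, which matches the symptom reported in this issue; the accumulating version keeps both, and an attribute seen in only one Record stays a single value.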
Maybe semmeddb would have an equivalent test case to see whether the problem happens there too? |
Without going step-by-step through the code, I'm not aware off the top of my head what would be causing this. |
I think I figured out where in the code the issue is being caused. I can probably make a PR soon if I am correct? |
The screenshot looks good; ask @tokebe whether you're working with the correct file. For single values, I think it's fine to leave them as single values (not 1-element arrays). But if you see something different in the code (like single values being converted into 1-element arrays elsewhere), let us know. |
Screenshot looks good, make a PR as soon as you're ready. |
On a related note, @rjawesome @tokebe: I think "sets" (aka unique values only) are more useful than "lists". I've noticed that problem with stuff from MyVariant; look at the civic stuff on edges in the following query. However, maybe this is something separate enough to be a different issue? POST to MyVariant specifically: http://localhost:3000/v1/smartapi/09c8782d9f4027712e65b95424adba79/query
Example:
|
I can make each attribute into a Set, that should be no problem. |
@colleenXu Hmm, your query seems to expose another issue: apparently, an array is sometimes being put as the attribute in the record. For example, for your query, these two arrays are in the two records that combine to form this edge:
[
[
"Sensitivity/Response",
"Sensitivity/Response",
"Sensitivity/Response",
"Resistance",
"Sensitivity/Response",
"Resistance",
"Sensitivity/Response"
],
[
"Sensitivity/Response",
"Sensitivity/Response",
"Sensitivity/Response",
"Sensitivity/Response",
"Resistance",
"Resistance",
"Resistance",
"Sensitivity/Response",
"Sensitivity/Response",
"Sensitivity/Response"
]
]
So, right now my code is making a set or array of these two. Would we like to flatten these arrays? (We can only do this if we are sure no values themselves are of the array type.) The results you are showing are because the attributes are only taken from the last record. |
I think what's happening is that this DBSNP ID actually corresponds to two hits in MyVariant, and the nested nature of the data is what's leading to this. I think it'd be nice to flatten these arrays and then run a set operation / unique-values-only. I was hoping for something in the end like
I currently can't think of cases where we wouldn't want to do this for an edge attribute, but we'd have to test your stuff carefully once it's ready for testing. |
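The flatten-then-unique idea discussed here could look roughly like the sketch below. It is an assumption-laden illustration, not the code from the PR: `flattenUnique` is a hypothetical helper, and it assumes attribute values are strings or numbers (possibly nested in arrays), never arrays that are meaningful as values in their own right. The sample arrays are abbreviated from the ones quoted above.

```typescript
// Hypothetical helper for illustration; assumes no attribute value is
// itself an array that must be preserved as a unit.
type Scalar = string | number;
type Nested = Scalar | Nested[];

function flattenUnique(values: Nested[]): Scalar[] {
  const seen = new Set<Scalar>();
  const visit = (value: Nested): void => {
    if (Array.isArray(value)) {
      value.forEach(visit); // descend into nested arrays
    } else {
      seen.add(value); // a Set keeps only the first copy of each value
    }
  };
  values.forEach(visit);
  return Array.from(seen); // Set iteration follows insertion order
}

// One array per Record, abbreviated from the example in this thread.
const fromTwoRecords: Nested[] = [
  ["Sensitivity/Response", "Sensitivity/Response", "Resistance"],
  ["Sensitivity/Response", "Resistance", "Resistance"],
];
const uniqueValues = flattenUnique(fromTwoRecords);
// → ["Sensitivity/Response", "Resistance"]
```

Flattening before deduplicating matters because a Set compares arrays by reference, so two arrays with identical contents would otherwise both be kept.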
@colleenXu Should be ready for testing, flattening of arrays/usage of set has been implemented. |
@rjawesome @tokebe If I try to run the example query from the first post, I get a status 500 response. console logs
|
Ah I must have made a typo when copying the code over to the other PR... Fix pushed. |
Deployed to prod 🚀 |
When a single edge is derived from multiple Records, the edge attributes appear to be merged improperly, at least in the case of the attributes figure_url, figure_title, and pmc_reference from PFOCR. What appears to be happening is that only one figure_url is kept, instead of each being merged into an array.
To replicate
In the workspace:
main branch and updated: npm run git checkout main & npm run git pull
npm run smartapi_sync
Query:
URL:
http://localhost:3000/v1/smartapi/edeb26858bd27d0322af93e7a9e08761/query
Body:
This query should be returning 9 figures instead of the 1 that appears (attributes of the only edge in message.knowledge_graph.edges).
What has been confirmed
mappedResponse object with correct attributes.
TODO
mappedResponse is turned into edge.attributes
Tagging @colleenXu @ariutta for additional details.