Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

transcriptomic query functions #105

Merged
merged 6 commits into from
Sep 15, 2023
Merged

transcriptomic query functions #105

merged 6 commits into from
Sep 15, 2023

Conversation

Clare72
Copy link
Contributor

@Clare72 Clare72 commented Sep 5, 2023

fixes #104

@dosumis dosumis requested a review from hkir-dev September 8, 2023 11:57
Copy link
Member

@dosumis dosumis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Main method looks good - just small query is returning false rather than empty dataframe when no results are returned consistent with behaviour of other methods on VFB_connect?

  • Not sure why changes were made to lookup. I think these are likely to break other methods.
    Please revert (or explain the motivation so we can discuss other fixes).

  • Having played with running queries I am concerned that the results can appear confusing when multiple clusters are annotated with the same ontology term in a single paper. We need to look at returning more context in these cases. Example:

cell_type cell_type_id ref gene gene_id function extent level
adult Kenyon cell FBbt_00049825 Li et al., 2022, Science 375(6584): eabk2432 14-3-3ε FBgn0020238 [] 0.678571 1268.15800
adult Kenyon cell FBbt_00049825 Li et al., 2022, Science 375(6584): eabk2432 14-3-3ε FBgn0020238 [] 0.705882 1257.36780
adult Kenyon cell FBbt_00049825 Mokashi et al., 2021, Front. Psychiatry 12: 69... 14-3-3ε FBgn0020238 [] 0.805285 1712.31100
adult Kenyon cell FBbt_00049825 Mokashi et al., 2021, Front. Psychiatry 12: 69... 14-3-3ε FBgn0020238 [] 0.823817 1532.72750
adult Kenyon cell FBbt_00049825 Mokashi et al., 2021, Front. Psychiatry 12: 69... 14-3-3ε FBgn0020238 [] 0.816967 1598.16420

We could just return the cluster name as an additional column, but it would be better if we could return more informative context. In these cases the difference appears to be sex.

@@ -237,7 +237,7 @@ def get_lookup(self, limit_by_prefix=None, include_individuals=False,
q = self.commit_list([property_lookup_query])
out.extend(dict_cursor(q))
lookup = {x['name']: x['id'].replace('_', ':') for x in out}
lookup.update({x['id']: x['id'].replace('_', ':') for x in out})
Copy link
Member

@dosumis dosumis Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this? I think the original design makes sure curie inputs are lookups for short_forms. Pretty sure other methods depend on this. The changed code looks like it updates a library with they keys and values it was initialised with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixed an issue I was having - but now unable to replicate! changed back.

This reverts commit d0ebf52.
@Clare72
Copy link
Contributor Author

Clare72 commented Sep 14, 2023

just small query is returning false rather than empty dataframe when no results are returned consistent with behaviour of other methods on VFB_connect?

Yeah, happy to change - I think methods on VFB_neo4j usually return False rather than empty datastructures? I didn't want to throw an error, because having no data isn't really an error, but empty dataframe makes sense too.

@dosumis
Copy link
Member

dosumis commented Sep 14, 2023

I think methods on VFB_neo4j usually return False rather than empty datastructures?

I think either option is fine. We should just be consistent across methods.

@Clare72
Copy link
Contributor Author

Clare72 commented Sep 14, 2023

I think neo_query_wrapper doesn't return the headers if there is no data, so would need to manually specify if we want to return a dataframe with column headers and no data

@Clare72
Copy link
Contributor Author

Clare72 commented Sep 14, 2023

copied return from get_connected_neurons_by_type

@Clare72 Clare72 requested a review from dosumis September 14, 2023 13:50
@dosumis
Copy link
Member

dosumis commented Sep 14, 2023

Looks good. Happy to merge. BTW We should think about how to deal with mixed sex => blank.

@Clare72
Copy link
Contributor Author

Clare72 commented Sep 15, 2023

BTW We should think about how to deal with mixed sex => blank.

I don't think there is any annotation in FlyBase to distinguish mixed sex from unknown sex

@Clare72 Clare72 merged commit d0cfe81 into master Sep 15, 2023
1 of 2 checks passed
@Clare72 Clare72 deleted the gene_functions branch September 15, 2023 14:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Queries for transcriptomic data
2 participants