-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Color by genotype refactoring + gene names with hyphens/underscores #681
Conversation
…instead of (0, length). In the UI, positions are one-origin along the gene length, so the last position should be included.
Consolidates test/encoding/decoding logic into three main functions which are used throughout the codebase instead of hardcoded assumptions about the serialization format ("gt-<gene>_<pos>") in many locations. This resolves existing and prevents future divergence in the serialized format, while allowing it to change in the future with minimal or no change outside of getGenotype.js. Multiple positions are supported, but multiple genes are not at this time. I've left that for future work since there is not a UI for multiple genes. This refactoring will make it easier if we do. Terminology (prot vs. gene) was normalized for consistency when trivial. As a result of improved parsing, gene names with embedded underscores and hyphens are now supported. Resolves #669.
…ction Leaves it up to the top-level render(), where the same conditional exists, to decide when genotype controls should be shown. The nested conditional is unnecessary, and gtPositionInput() now matches the gtGeneSelect() render function's behaviour in this regard. Diff best viewed with whitespace ignored.
Default gene (HA1 or nucleotide) and position (1) selections are replaced by informative placeholders and nothing changes until a full selection is made. If an invalid position is entered, the app remains at the last good state instead of resetting back to a default state. This allows a correction of the position entered instead of requiring a re-entering. The previous placeholders rarely or never triggered because default values were set.
…rectly This helps maintain consistency in the handling across different components.
…lor by field It is easier to reason about and keep track of whole state replacement than the combined effects of individual field updates. This is especially true for logically linked state like geneSelected and positionSelected, which easily becomes out of sync with colorByGenotype. For example, geneSelected and positionSelected were not reset when switching from a genotype coloring to any other color by field. A few individual field updates remain where it doesn't make any more sense to think about them in terms of replacement.
External state in redux now has one entrypoint (componentWillReceiveProps) and one exitpoint (componentDidUpdate). This consolidates calls to redux's dispatch() and ties it to changes in local component state, simplifying the logic of the color by, gene, and position input handlers. setColorBy() and setGenotypeColoring() are now one function (componentDidUpdate()) instead of two interacting functions.
The corresponding JSON schema changes in augur to allow hyphens is https://github.com/nextstrain/augur/compare/schema-gene-names. |
Brilliant. Would you have time tomorrow afternoon to talk through this? |
This is great Tom. The handling of |
@jameshadfield Yep, I'm certainly happy to walk through this. Tomorrow might be tight, however, with lab meeting in the morning and then our offsite meeting in the afternoon. I'll ping you on Slack to figure something else out. |
Walked through this with James. Overall we're happy with it. I'm going to make the color by control debouncing more specific to genotype (so non-genotype color bys don't delay 400ms), and then it should be good to merge. |
This avoids triggering a new state on every keypress in a short sequence, which is not only slow but inconsistent with expectations that intermediate colorations are shown while, for example, backspacing a position input to delete it. 400ms seemed about right in my testing (300ms was too quick for most typists, 500ms seemed sluggish), but that value may want to be adjusted.
It's good practice to avoid hardcoding such magic values into a myriad of places and makes maintenance easier. The name also lends more context to its usage. Using a global constant also means there is only a single place to change the value, at least in auspice, if we need to. It's not unlikely we'll need to change it, because there are plenty of legitimate genes named "nuc", e.g. <https://www.ncbi.nlm.nih.gov/gene/?term=nuc>. Note that this doesn't refactor _all_ references to the string "nuc", as the same value is also used for a different concept, the "mutation type" or "mutType", which is valued "nuc" or "aa" (or sometimes false). It's important that magic value constants aren't used merely for their value alone, but for the conceptual thing they represent so as not to conflate changes in the future.
…to make it clear it returns a "mutType". The mutType is closely related to but different from the genotype gene name. Embedding mut type into the function name makes it clear what's returned and easier to find with other mut type code when grepping.
Though the development webpack config includes source maps, they were disabled in production. When debugging a live build (or a local build where you don't want hot reloading), it's very useful to have a source map for the browser to translate between the packed dist/bundle.js and the original code. This matches the recommendations of webpack's documentation: https://webpack.js.org/guides/production/#source-mapping
…instead of bundling the former with the latter. Changing the mutType must happen _before_ updating colors because the entropy bars need to be recomputed for the new mutType before applying the new genotype colorBy. Otherwise, the entropy component tries to apply the new genotype colorBy to bars of the wrong mutType, which in turn causes all sorts of errors ("entropy out of sync" and selected positions not matching the data bars). The state dependencies are a bit tangled here, but de-tangling them is a larger project for another time.
685c68e
to
1c7b778
Compare
@jameshadfield I've re-applied the debouncing to only the genotype updates and repushed the branch. The new commit is 1b46f72, which makes these changes between the old and new version. Give it a look? |
This branch started as work to add support for hyphens and underscores in gene names (#669). It quickly led to refactoring the color by genotype handling so that the encoding/decoding could be handled in a single, well-factored place. That in turn led to cleanups around the color by controls. The story and rationale for all changes is in the commit messages. It will be more digestible if reviewed commit-by-commit in chronological order rather than as one big lump.
@jameshadfield, I'd like your input on these changes if you have time.
@emmahodcroft, I believe you had the original need for hyphens and underscores being supported in gene names in auspice. It should work on this branch!