Enable a narrative to source multiple builds #1164
Conversation
using redux to cache JSONs
This completes a bunch of TODOs (search TODO:1071 for remaining items):
1. Clear the JSON cache when unmounting the narrative component.
2. Rerender mounted React components with the new dataset upon redux state update; this is achieved using React "key" props (https://reactjs.org/docs/lists-and-keys.html#keys) — see the sketch below.
3. Wait for all datasets to be fetched using Promise.all.
4. Fix loading narratives from a page other than the first.
5. Use the appropriate dataset when leaving the narrative page to explore a dataset.
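For point 2, a minimal sketch of the "key" technique (the component and prop names here are illustrative only, not the actual Auspice components):

```jsx
import React from "react";
import Tree from "./tree"; // hypothetical import, for illustration only

/* Because `key` includes the dataset name, React unmounts the old panel and
 * mounts a fresh one whenever the narrative block switches datasets, instead
 * of trying to update the existing instance in place. */
const NarrativePanel = ({datasetName, tree}) => (
  <Tree key={datasetName} tree={tree} />
);

export default NarrativePanel;
```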
Thanks @eharkins (and @salvatore-fxpig). This is looking great. I'll think a bit about how we're calling `changePage` over the coming days to make sure this is the right direction.

In addition to the in-line comments, in terms of to-dos before the merge / auspice release we'll need:
- A more expansive narrative to test with (e.g. using entropy panel, multiple datasets). This doesn't have to be huge for the purposes of this PR.
- Write a small script to check all files in github.com/nextstrain/narratives and ensure each uses the same dataset for each block. We can fix these as we find them.
- Write a test narrative with a malformed / non-existent dataset in one of its blocks. I imagine there are narratives out there with this.
- Experiment with one of the incorrectly formatted narratives @eharkins identified above. This would have rendered fine on current master, but on this PR would result in the rejection of the `Promise.all` we use to fetch datasets. Ideally we'd use the previous "good" dataset, but at the least we can print a helpful error notification (see the sketch after this list).
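A rough sketch of what surfacing that rejection could look like (the fetch URL, action types, and helper names below are assumptions for illustration, not the real Auspice internals):

```js
// Hypothetical fetch helper -- rejects if the dataset is malformed / non-existent.
const fetchDataset = (name) =>
  fetch(`/charon/getDataset?prefix=${name}`).then((res) => {
    if (!res.ok) throw new Error(`"${name}" returned ${res.status}`);
    return res.json();
  });

const loadNarrativeDatasets = (datasetNames) => (dispatch) =>
  Promise.all(datasetNames.map((name) => fetchDataset(name)))
    .then((datasets) => {
      const jsons = {};
      datasetNames.forEach((name, i) => {jsons[name] = datasets[i];});
      dispatch({type: "CACHE_JSONS", jsons});
    })
    .catch((err) => {
      // One bad dataset rejects the whole Promise.all -- report it (or fall
      // back to the previous "good" dataset) rather than crashing the app.
      console.error("Narrative references an unfetchable dataset:", err);
      dispatch({type: "WARNING_NOTIFICATION", message: "Couldn't load one of the narrative's datasets"});
    });
```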
src/actions/loadData.js
Outdated
```js
// then fetching all the rest in the background so we can use them from the cache upon changing slides.
// Doing that presents the risk of a race case (if you change pages faster than a dataset can be fetched) so we are avoiding it for now.
// Instead we use Promise.all to ensure all the datasets are fetched before we render.
return Promise.all(blocks.map((block, i) => {
```
This actually makes a request for every block, regardless of whether the blocks use the same dataset (`treeName`) -- i.e. the test narrative fetches the zika dataset twice and the mumps/na dataset twice. Prior to the `Promise.all` I think we want to make a set of all `treeName` and `secondTreeName` across the narrative blocks, then perform a `Promise.all` on that (unique) set.
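I.e. something along these lines (a sketch only; `fetchDataset` stands in for whatever actually performs the request):

```js
// Collect each dataset name exactly once across all blocks (main tree plus
// the optional second tree), then fetch the unique names in parallel.
const uniqueNames = new Set();
blocks.forEach((block) => {
  if (block.treeName) uniqueNames.add(block.treeName);
  if (block.secondTreeName) uniqueNames.add(block.secondTreeName);
});
return Promise.all([...uniqueNames].map((name) => fetchDataset(name)));
```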
Added a commit that should fix this. Thanks for pointing it out!
src/actions/loadData.js
Outdated
```js
  secondTreeDataset: false,
  secondTreeName: false
};
// TODO:1050 A more performant fetching strategy would be fetching the dataset for the slide you land on,
```
Agree that this will add complexity and is not strictly necessary, thus it should be addressed as a subsequent PR. The current implementation fetches the datasets in parallel (via `Promise.all`), which it would be interesting to time against a fetch of a single dataset.
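A crude way to get those numbers from the browser console (sketch only; `datasetNames` and `fetchDataset` are placeholders as above):

```js
// Compare the wall-clock cost of fetching everything in parallel vs. only the
// dataset needed for the first slide.
console.time("all datasets in parallel");
await Promise.all(datasetNames.map((name) => fetchDataset(name)));
console.timeEnd("all datasets in parallel");

console.time("first dataset only");
await fetchDataset(datasetNames[0]);
console.timeEnd("first dataset only");
```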
```js
  jsons: {} /* object with dataset names as keys and loaded dataset / narrative jsons as values */
}, action) => {
  switch (action.type) {
    case types.CACHE_JSONS:
```
Since we have the ability to return to the current narrative -- click "explore the dataset" then click "return to narrative" -- we want to preserve the cache until a new dataset is loaded*, i.e. the `CLEAN_START` action. Currently this series of events causes Auspice to crash. This will also remove the need for a separate action here when the narratives unmount. (I realise I may have suggested clearing out the cache when we click "explore the dataset"!!)

\* This should have the effect that changing the dataset to one which happened to be cached will load it from the cache rather than fetching it.
Makes sense! Just pushed a commit that seems to fix the bug with this behavior and only clears the cache on `CLEAN_START`.
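For reference, a minimal sketch of that reducer behaviour (action names taken from the snippet above, everything else assumed):

```js
// Cache fetched narrative datasets keyed by name; keep the cache across
// "explore the dataset" / "return to narrative", and only drop it when a
// genuinely new load begins (CLEAN_START).
const jsonCache = (state = {jsons: {}}, action) => {
  switch (action.type) {
    case "CACHE_JSONS":
      return {...state, jsons: {...state.jsons, ...action.jsons}};
    case "CLEAN_START":
      return {jsons: {}};
    default:
      return state;
  }
};
```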
```js
if (this.props.blocks[idx].dataset !== this.props.blocks[this.props.currentInFocusBlockIdx].dataset) {
```
Is this the only section which needs to be transferred to the mobile narrative display for that to work?
Uuuh I was waiting for this to come up again! PS: I think adding `uniq` from lodash just changed the other-vendors chunkhash; if you're already publishing somewhere with long-term caching I would suggest using
I think the async stuff can be baked in pretty easily, something along these lines should do the trick
Dumping some stuff here: #1167, hopefully makes it easier to play with
also rework narrative json fetching code, fixes bug with: "explore the data yourself" and "return to narrative" buttons
Thanks @salvatore-fxpig !
Makes sense. The reason I used `_uniq` instead of Sets was only because I wanted to use `.map()` on the unique list of datasets without having to convert the Set back to an array to do that. And when I used the `jsons` object keys to ensure no duplicate fetches, it didn't have that behavior because of a misunderstanding of promises on my part. I'm sure there is a better way though, so I will check out #1167 and give it a test!
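For what it's worth, both approaches end up with a plain array you can `.map()` over (illustrative only):

```js
import {uniq} from "lodash";

const treeNames = ["zika", "zika", "mumps/na", "mumps/na"];

const viaLodash = uniq(treeNames);       // ["zika", "mumps/na"] -- already an array
const viaSet = [...new Set(treeNames)];  // same result, spread back into an array

viaLodash.map((name) => name.toUpperCase()); // .map() works identically on either
```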
The original design of nextstrain narratives was that each "section" (paragraph) should define its own dataset, such that a single narrative could display different datasets (or the same one if each section specified the same dataset). This was never implemented -- i.e. in auspice currently only the dataset specified in the YAML frontmatter is ever used -- which has resulted in the inadvertent creation of narratives which specify different (and perhaps invalid) datasets. Nearly every narrative file in this repo had inadvertently defined multiple datasets! Often this appears to be due to copy & paste of paragraphs from previous narratives etc. Auspice will soon "correctly" interpret these multiple datasets (nextstrain/auspice#1164).

This commit changes 206 narratives which previously specified multiple datasets so that each one now only specifies the dataset defined in the YAML frontmatter. This will preserve the current behavior in future versions of Auspice. The only exception is the "test_multiple-datasets.md" narrative. There is also a small script which can be run periodically to identify narratives which define multiple datasets.

Closes #8
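The small script mentioned above could be roughly along these lines (a sketch only; it assumes the standard narrative layout of a `dataset:` line in the YAML frontmatter plus section headings of the form `# [title](https://nextstrain.org/...)`):

```js
#!/usr/bin/env node
/* Sketch: flag narrative markdown files whose sections reference a different
 * dataset than the one declared in the YAML frontmatter.
 * Usage: node check-datasets.js narratives/*.md
 */
const fs = require("fs");

const stripQuery = (url) => url.split("?")[0].replace(/\/+$/, "");

process.argv.slice(2).forEach((filename) => {
  const text = fs.readFileSync(filename, "utf8");
  const frontmatter = (text.match(/^dataset:\s*"?([^"\s]+)"?\s*$/m) || [])[1];
  const sectionUrls = [...text.matchAll(/^#\s*\[[^\]]*\]\(([^)\s]+)\)/gm)].map((m) => m[1]);
  const datasets = new Set([frontmatter, ...sectionUrls].filter(Boolean).map(stripQuery));
  if (datasets.size > 1) {
    console.log(`${filename} references multiple datasets:\n\t${[...datasets].join("\n\t")}`);
  }
});
```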
Description of proposed changes
This builds on an existing contribution #1071 (thanks to the amazing work of @salvatore-fxpig!) to address issue #1050, i.e. enabling a narrative to source multiple builds (datasets). It mostly works, but needs more testing.
Some issues which might prevent this from getting merged (I am still working to reproduce and define these so that we can address them):
Some goals it has inspired:
Some TODO items that can become separate issues (i.e. they don't prevent this from being merged if we don't mind these imperfections), with which we would still appreciate any help, cc @salvatore-fxpig:
Testing
I have been testing with this example narrative created by @jameshadfield and extending it as necessary. As James pointed out, maintaining a huge narrative (or set of narratives) that runs through many combinations of dataset and queries (and thus app states) could be immensely helpful in testing narratives manually as well as with automated tests. I will create an issue as a stub for this and as a place for discussion of how we want to use that big test narrative.
Thank you for contributing to Nextstrain!