future server API #687
Comments
In general I think @tsibley's proposal is probably the better direction. I've sketched some further details out here. I've used the v2.0 schema which we should start using ASAP (no more meta & tree). I've also used the underscore separator as we've had a bunch of discussions about this and have always preferred it over hierarchical structures. (FWIW a change here would be better suited as an augur PR, as augur creates flat file structures.) If we let the extension customisation define the fetch prefixes
This would alleviate a lot of the complexity currently involved in writing a server for auspice.
Serverless builds
My understanding of serverless builds is that
Path completion
Currently path completion is done by the server -- This proposal would result in a fetch to
Two trees
URLs with multiple trees are currently parsed by the server, which gets the appropriate JSONs and combines them. However, auspice could be configured to make the two fetches:
Additional files (frequencies etc)
These will be specified in the main dataset JSON, so auspice can make an additional fetch. E.g.
Community builds
This functionality shouldn't be part of the auspice repo. One option is to follow the above logic and require a server to interpret the request and deliver the appropriate JSON. I.e. The URL
Alternatively, the extension interface could expose a function such as:
function constructFetch(browserURL) {
  // interpret the browser URL and derive the URL to fetch
  const fetchURL = browserURL; // placeholder: a real extension would transform this
  return fetchURL;
}
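As a rough illustration of what an extension-provided `constructFetch` might do for community builds, here is a minimal sketch. The file layout it assumes (`auspice/<repo>[_<rest>].json` on a `master` branch) and the example org/repo names are assumptions for illustration only; the issue does not specify them.

```javascript
// Hypothetical extension hook: translate a browser URL path into the URL to fetch.
// ASSUMPTION: community datasets live at auspice/<repo>[_<rest>].json on the
// repository's "master" branch. This layout is illustrative, not taken from the issue.
function constructFetch(browserURL) {
  const parts = browserURL.split("/").filter((p) => p.length > 0);
  if (parts[0] !== "community" || parts.length < 3) {
    return null; // not a community URL; the caller can fall back to the default request
  }
  const [, org, repo, ...rest] = parts;
  const name = [repo, ...rest].join("_");
  return `https://raw.githubusercontent.com/${org}/${repo}/master/auspice/${name}.json`;
}

// Usage with made-up org/repo names:
console.log(constructFetch("/community/someorg/somerepo/sub"));
```

Returning `null` for URLs the function does not recognise leaves room for the standard request to be made instead.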
Good observations, thank you for spending the time to consider this in detail! My comments are below.
Hmm, I don't think anything fundamental about augur requires flat structures. All of the output files are user-provided and can be any arbitrary path. I expect moving the logically-hierarchical file names (with underscores) to actually-hierarchical names (with slashes) would require approximately no changes to augur.
I'm glad you agree! This is exactly my goal with the proposed API. :-)
I don't agree; I think it is possible to have this type of static build use datasets defined by the URL. It seems like it the
On a related note: I think the term "serverless" is not appropriate for this feature. The feature is comparable to a static-site generator like Jekyll or Gatsby, not a utility computing platform like AWS Lambda.
Moving this URL manipulation into Auspice makes the most sense to me. There are other alternate solutions like symlinking flu.json → flu/seasonal/h3n2/ha/3y.json, but I think the consequences get a little weird without any additional benefit.
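To make the client-side path completion concrete, here is a minimal sketch. It assumes Auspice already holds a list of available dataset paths, ordered so that the preferred default for any prefix comes first; the function name `completePath` and all dataset names other than `flu/seasonal/h3n2/ha/3y` are hypothetical.

```javascript
// ASSUMPTION: the client already holds a list of available dataset paths,
// ordered so that the preferred default for any prefix comes first.
function completePath(partial, available) {
  const wanted = partial.split("/").filter((p) => p.length > 0);
  // Pick the first available dataset whose leading segments match the partial path.
  const match = available.find((candidate) => {
    const parts = candidate.split("/");
    return wanted.every((segment, i) => parts[i] === segment);
  });
  return match === undefined ? null : match;
}

// Illustrative dataset list; only the first entry appears in the discussion above.
const available = [
  "flu/seasonal/h3n2/ha/3y",
  "flu/seasonal/h3n2/ha/6y",
  "flu/seasonal/h1n1pdm/ha/3y",
];
console.log(completePath("flu", available)); // flu/seasonal/h3n2/ha/3y
```

With this approach the browser URL `flu` resolves to the full path before any dataset fetch is made, so no symlinks or server-side rewriting are needed.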
Yep, moving this into Auspice makes the most sense to me.
I like both approaches, and I don't think they are mutually exclusive: in the absence of an extension-provided browser dataset URL → fetch URL transformation function, the standard request will be made and a custom server can exist to handle it. Note that in the GitHub /community/… case, the server doesn't need to be complicated; it can encode the transformation function server-side as HTTP redirects to raw.githubusercontent.com.
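The fallback behaviour described here could look roughly like the sketch below. The names `resolveFetchURL`, `extension.constructFetch`, and `dataPrefix`, as well as the `/dataset/<path>.json` shape of the standard request, are assumptions made for illustration; the thread does not fix any of them.

```javascript
// ASSUMPTION: `extension.constructFetch` is an optional hook and `dataPrefix`
// is a configurable base URL. Both names, and the default URL shape, are hypothetical.
function resolveFetchURL(browserPath, extension = {}, dataPrefix = "/dataset") {
  // Prefer the extension-provided transformation when it exists and returns a URL.
  if (typeof extension.constructFetch === "function") {
    const custom = extension.constructFetch(browserPath);
    if (custom) return custom;
  }
  // Otherwise make the standard request: the browser path maps directly to a fetch path.
  const path = browserPath.replace(/^\/+|\/+$/g, "");
  return `${dataPrefix}/${path}.json`;
}

console.log(resolveFetchURL("/flu/seasonal/h3n2/ha/3y"));
```

Because the default branch is a plain path mapping, a static host or a thin redirecting server can satisfy it without any Auspice-specific logic.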
I really like the proposed API on the auspice fetch side. I think I'm historically the big proponent of flat JSON structures, i.e.
Regardless, at the very least, we should be including in the combined JSON the relevant data fields, so that
(This is an augur issue)
I had also assumed that eventually we would have a database for these JSONs and that they would be requested dynamically, rather than staying fixed to directory hierarchies forever. This seemed like the way to get to proper versioning etc. But maybe we never leave S3-style blob storage...
One question: if we went with the nested structure, what does the augur output JSON become in the example of
Thank you for your comments. I think we are agreed this is the better direction so I've added it to the nextstrain roadmap.
Nod. I guess I tend to consider the path (from the root of whatever project dir) part of the file's name rather than some disconnected, independent thing. Paths are a ubiquitous method of establishing a naming hierarchy with support for traversal and globbing operations, and it's a little weird to me to re-create all of that on disk and in our code using underscores. I won't push on this more for now; it's not super-important, just persistently strange to me.
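For reference, converting between the two naming schemes is mechanical as long as no individual field contains an underscore. The sketch below uses hypothetical helper names and assumes exactly that restriction.

```javascript
// ASSUMPTION: no path segment itself contains "_" or "/"; otherwise the
// flat form is ambiguous and the round trip below would not be lossless.
function flatToNested(fileName) {
  return fileName.replace(/_/g, "/");
}

function nestedToFlat(path) {
  return path.replace(/\//g, "_");
}

console.log(flatToNested("flu_seasonal_h3n2_ha_3y.json")); // flu/seasonal/h3n2/ha/3y.json
console.log(nestedToFlat("flu/seasonal/h3n2/ha/3y.json")); // flu_seasonal_h3n2_ha_3y.json
```

The need for that restriction is one practical difference between the schemes: with slashes, the filesystem and URL parsers already know where one level ends and the next begins.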
Yes, agreed!
Maybe! Though versioning can be done lots of ways "properly", with or without a traditional database, and the manner in which the data is stored doesn't obviate our hierarchical access patterns (i.e. what seasonal flu HA builds do we have?) or subsetting-nature of the dataset generation.
With the v1 schema it'd be
Previously the dataset selectors were not in the desired order due to the default value appearing first. This commit fixes this for the nextstrain.org server. Note that this functionality will be moved to the client when the new server API is implemented (see nextstrain/auspice#687). This commit closes Auspice issue nextstrain/auspice#696.
This issue is to discuss the design of the server API employed by auspice. To briefly recap, auspice needs to know which datasets / narratives are available, and it needs to be able to obtain the dataset or narrative to view.
Here are @tsibley's thoughts, taken from #683 (comment)
Current
The Charon API, as described in the white-labelling docs, relies on a dynamic server able to respond to the following endpoints:
Pros:
Cons:
Requires a dynamic server.
Statically-hosted builds (e.g. for GitHub Pages) and S3-served builds (e.g. for Nextstrain) must be special-cased since no dynamic server is possible.
Paths in the user-facing URL require transformation back and forth, which historically has been a source of bugs.
Proposed
My proposed "data API" is to use standard HTTP access for what it's good at by changing Auspice to request these URLs instead:
Pros:
Endpoints can be provided either
Statically-hosted builds (e.g. for GitHub Pages) and S3-served builds (e.g. for Nextstrain) just work with no special-casing or additional server logic in Auspice.
Paths in the user-facing URL match those behind the scenes without any transformation.
Standard HTTP caching strategies work without a custom server complicating it.
Cons:
I don't see any downsides to this approach, but maybe you do? Looking for feedback!