-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve support for self-hosted runtime #25026
Conversation
Bring self-hosted runtime support in-line with amp-toolbox optimizer: - Do not auto-append /rtv/{rtv} to self-hosted runtime URLs - Support using the default CDN for dynamic components
There is an example included in the PR that can be removed later. It is meant to demonstrate a PWA retrieving self-hosted static components and non-self-hosted dynamic components. The demo is live here:
Notice that in both cases,
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outstanding questions, can you take a look so we can move forward?
src/service/extension-location.js
Outdated
const rtv = getMode().rtvVersion; | ||
if (opt_extensionVersion == null) { | ||
opt_extensionVersion = '0.1'; | ||
} | ||
const extensionVersion = opt_extensionVersion | ||
? '-' + opt_extensionVersion | ||
: ''; | ||
if (urls.cdn !== urls.defaultCdn && !useDefaultCdn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If single pass is enabled does this logic still function as expected? For both conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I mistakenly assumed single pass mode was related to localdev, not experiments. I will rewrite this to include spPath
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is resolved in commit d76da0c. Let me know how the changes look to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if i'm understanding it correctly d76da0c could create a path that looks like cdn.ampproject.org/rtv/${rtvVersion}/sp/v0/${extensionId}${extensionVersion}.js
? if so that might break.
We can actually remove that getSinglePassExperimentPath()
logic now since we assign unique versions to the resources we deploy as a single pass experiment. (meaning they also have theyre own rtv
version deployed that looks like the normal rtv url scheme)
feel free to ping me if you need further help on this part! (or ping me on slack too if you want a quicker response time erwinm
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make my explanation simpler, the whole sp
infix logic is moot if its served on google amp cache (even with single pass experiment ON). it was needed in an old version of our experiment deployment but no longer needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@erwinmombay Since removal of single pass experiment paths is unrelated to this PR topic, would you mind handling that change in a separate commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattwomple will do. I'll take it on today. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changes merged here #25073
src/service/extension-location.js
Outdated
/** | ||
* Dynamic AMP components | ||
*/ | ||
const AMP_DYNAMIC_COMPONENTS = ['amp-geo']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hardcoded list (now in the base runtime) is disconnected from the source of truth that makes a component "dynamic."
Are there many others that are expected to fall into this category? What happens when they do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the future of how many components will become "dynamic components". I'm hesitant to assume it will always be just amp-geo
. The rationale for identifying these components is discussed in ampproject/amp-toolbox#515 and ampproject/amp-toolbox#518. Or, tldr;
Not everyone who self-hosts the AMP runtime has the infrastructure to
support dynamic components like amp-geo, ref: AMP cache guidelines.
I agree with your 'source of truth' comment. This probably isn't the best place for this. Would it make more sense to introduce src/dynamic-components.js
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is resolved in commit d76da0c. Let me know how the changes look to you.
location, | ||
opt_isLocalDev, | ||
opt_useDefaultCdn | ||
) { | ||
if (opt_isLocalDev) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a future note @erwinmombay -- this is the kind of logic we should remove (local dev) when we support a development build.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to clarify, the development will essentially be running in its own pipeline process to separate this out correct? (which will of course clean up these logical branchings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep!
src/config.js
Outdated
defaultCdn: DEFAULT_CDN, | ||
useDefaultCdnForDynamicComponents: Boolean( | ||
env['useDefaultCdnForDynamicComponents'] | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we simplify this API? I think just adding one new field should suffice:
cdnForAmpGeo: env['cdnForAmpGeo'] || 'https://cdn.ampproject.org',
This would allow self-hosters to use caches other than Google's for amp-geo
.
It also avoids introducing "default" and "dynamic component" both of which need explaining. AFAIK there's no plan for other dynamic components so we can cross that bridge when we get there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course I'll be happy to make this change if it really is preferred, but if I were in your shoes and saw someone submit a PR to introduce cdnForAmpGeo
, I'd think "gosh, that's a little short sighted".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or cdnForCacheModifiedExtensions
since this is public. :)
"Dynamic" is a bit overloaded in AMP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine. And then this documentation should be updated as well to adopt the new name: https://github.com/ampproject/amphtml/blob/master/spec/amp-cache-guidelines.md#guidelines-adding-a-new-cache-to-the-amp-ecosystem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit cc4f0ed makes this change, but I didn't want to introduce a 3rd CDN URL, so I opted for cdnSupportsCacheModifiedExtensions
boolean. Let me know what you think.
src/service/extension-location.js
Outdated
*/ | ||
function isDynamicComponent(extensionId) { | ||
return ( | ||
Boolean(extensionId) && AMP_DYNAMIC_COMPONENTS.indexOf(extensionId) >= 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is case sensitivity important? E.g. Element.tagName
is always uppercase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue this is a correct implementation. Fuzzy matching on an ID feels sloppy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit cc4f0ed allows fuzzy matching.
src/service/extension-location.js
Outdated
const rtv = getMode().rtvVersion; | ||
if (opt_extensionVersion == null) { | ||
opt_extensionVersion = '0.1'; | ||
} | ||
const extensionVersion = opt_extensionVersion | ||
? '-' + opt_extensionVersion | ||
: ''; | ||
if (urls.cdn !== urls.defaultCdn && !useDefaultCdn) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is safe to do. It might break RTV pathing for Bing's cache for example.
Instead of inferring when to disable RTV pathing, we should probably make it an explicit option in config.js.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind following-up with @sebastianbenz about this? Here's our discussion: ampproject/amp-toolbox#517 (it's worth viewing the edit history on the original post there, too). If we did not make the right call in dropping '/rtv/{rtv}' for alternate CDNs, I'd like to make sure optimizer in amp-toolbox is updated again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, what's the use case for ampUrlPrefix
? It doesn't seem to have an analog in the context of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ampUrlPrefix
in the optimizer tool is akin to env['cdnUrl']
being defined before the runtime initializes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks, I misread what it does.
I think the optimizer case it would be confusing to have both ampRuntimeVersion
and ampUrlIncludeRtv
options, which doesn't apply here. Though I may be still missing some context: ampproject/amp-toolbox#517 (comment)
I'd like to propose we do not make the following change: Instead we ask hosting environments to emulate the conditions AMP Caches can deliver these resources from. Adding additional logic to servers is almost always preferable than encouraging the work to be done on every users machine. Edit: As part of this change we should also document the location of resources relative to a root host serving AMP binaries so that its easy for many to create servers to host the resources with tests to ensure validity. |
Please reconsider. This requires anyone hosting the runtime (which should not be restricted to large corporations with significant CDN resources) to store 52 copies of the runtime per year. It should be possible to drop |
I'm not sure I am following. Where does the requirement for 52 copies come from? |
I think he means one release/RTV per week. Though I suppose you could have some custom server-side URL rewriting logic to avoid that. |
Or the optimizer can only output urls that are supported by the serving environment. Having a rtv value in the URL is valuable from a caching perspective, these resources should be |
That's not really a necessity. Most CDNs support distinct internal and external |
I tend to agree with Matt here. What do we gain form enforcing RTVs for self-hosting? I'm sure that every serious publisher will self-host the runtime with RTV for the reasons @kristoferbaxter mentioned. However, it increases hosting complexity and I think it's a valid use case for devs to simply download the latest AMP runtime and serve it from their static site. |
src/dynamic-components.js
Outdated
@@ -0,0 +1,32 @@ | |||
/** | |||
* Copyright 2016 The AMP HTML Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: 2019
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, updated in commit cc4f0ed.
Simplification and reduction of the code in the runtime. It's a tradeoff, less complexity in the build/release/authorship/execution processes in exchange for more requirements in the hosting environment. What I'm suggesting is the hosting environment should support the versions it hosts using the I'd like to flip the question around too though, why should the runtime increase in size and testing complexity for work that can reasonably be done in a hosting system by those who choose to self host? Why not add a test suite that gives those who choose to self host more confidence that their setup works for their AMP documents? This testing suite can then also be modified in the future to coincide with changes in hosting strategy. Overall, this is not a blocker from me, but I'd really like for the participants of this discussion to agree to the additional risks and complexity we're adding here. |
A worthwhile discussion. From another light, this change allows AMP publishers to be more easily decoupled from the dominant AMP cache that happens to be run by Google. Also IMO the cost borne by runtime is small-ish (a few LOC). |
+1 to @choumx making it easy to decouple AMP pages from the cache sends an important message. I also really like the test suite idea. Once self hosting the runtime is a thing, I plan to publish a runtime hosting module which emulates the AMP Cache behavior as part of AMP Toolbox. A test suite would be a natural by-product. |
*/ | ||
export function isCacheModifiedExtension(extensionId) { | ||
return ( | ||
Boolean(extensionId) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Boolean check shouldn't be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary, but prevents an unnecessary loop if an empty string is passed for extensionId
.
const cdnSupportsCacheModifiedExtensions = | ||
typeof env['cdnUrlSupportsCacheModifiedExtensions'] == 'string' | ||
? env['cdnUrlSupportsCacheModifiedExtensions'] !== 'false' | ||
: env['cdnUrlSupportsCacheModifiedExtensions'] !== false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a means of supporting cdnUrlSupportsCacheModifiedExtensions
defined as either string or boolean.
src/service/extension-location.js
Outdated
return `${base}/rtv/${rtv}/${spPath}v0/${extensionId}${extensionVersion}.js`; | ||
const extensionPath = `${spPath}v0/${extensionId}${extensionVersion}.js`; | ||
if (urls.cdn !== urls.defaultCdn && !useDefaultCdn) { | ||
return `${base}/${extensionPath}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we allowing this without RTV fingerprint? The fingerprint allows for easy longterm caching (important for the browser, not the cache server), where a shared URL forces frequent rechecks.
I think prioritizing the user's experience means that, if you want to self-host, you need to follow the RTV contract.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, too, that this can cause an infinite reload loop if the loaded extension does not match the version loaded by v0.js. We automatically reload requesting /rtv/${V0's RTV}/v0/${ext}.js
. If we can't specifically request an RTV, then we'll keep requesting /v0/${ext}.js
, which has the incorrect version (because we started the reload loop), and continue looping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we allowing this without RTV fingerprint?
Please read conversation thread here and at ampproject/amp-toolbox#517.
Note, too, that this can cause an infinite reload loop if the loaded extension does not match the version loaded by v0.js.
I'll review the reload conditions this evening to see if changes are warranted here or there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So @jridgewell agrees with @kristoferbaxter. :)
Note, though, that the "user benefit" assumes self-hosters want to match AMP cache's release patterns. In practice, they can set their own release cadence, browser cache hints, etc. Also, being same-origin allows use of a service worker which can ensure matching RTVs between v0.js and extensions.
Point taken though that this is "wild west" territory, dragons, etc. So some documentation on best practices and edge cases (e.g. reload loop) is a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, too, that this can cause an infinite reload loop if the loaded extension does not match the version loaded by v0.js. We automatically reload requesting
/rtv/${V0's RTV}/v0/${ext}.js
.
Commit 6bbbd5d disables version locking when self-hosting. It's up to the cache maintainer to ensure version alignment between runtime and components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Do not auto-append /rtv/{rtv} to self-hosted runtime URLs
I do not want to see this merged. It sacrifices user experience to make it easier to self host. If you want to self host, you must agree to use the RTV directory pattern.
- Support using the default CDN for cache-modified extensions when otherwise using a self-hosted runtime
I'm fine with this part.
Your demand is arbitrary. A cache can still opt to follow the |
I also don't believe there's One Right Way for self-hosting. Also, isn't the spirit of the Bento project to be more flexible with the way people can use AMP? /cc @ampproject/wg-approvers since this is more contentious than I'd anticipated. |
Could we discuss this in a design review? It is definitely a substantial enough change to warrant that. I'm pretty sure we cannot make this change, because we rely on being able to load AMP of a particular version and that all AMP binaries in a page have the exact same version. I think (but am not sure) that this change can break that. Maybe we can find other ways to achieve the DX improvements without breaking that. |
@honeybadgerdontcare and I are currently working on a design for supporting runtime self-hosting in AMP caches and AMP validator. This is closely related to the change proposed in this PR. Ideally we should discuss these changes together in the design review. @mattwomple I'll share a first draft of our design doc for you to comment on as soon as we've got one. |
Update calculateExtensionScriptUrl() and calculateEntryPointScriptUrl() URL generation now that spPath is deprecated.
@erwinmombay Thanks for cleaning up the deprecated experiment paths. This significantly simplifies the logic in @jridgewell I'm sorry I responded to your comment with such an antagonistic tone. I've spent quite a bit of time (well before this PR) considering how to achieve a good balance of optimization and simplicity for small publishers. I'm very excited about the idea of allowing publishers to both optimize their AMP pages and maintain a simple, predictable URL to the AMP runtime. I look forward to continuing the conversation and will keep a more measured tone. @sebastianbenz Thank you for taking the initiative on creating a design doc. I'm glad you and @honeybadgerdontcare are working on it because my first draft of "design" was most certainly a position paper. Let me know how and when I can contribute. |
@mattwomple thank you for your perseverance on this. It's a challenging issue and we want to get it right. As usual cramforce is right in that we should bring this up in a design review. There are a lot of nuances here; preserving user first, AMP ideals, and giving publishers more options in how they serve their content. Would the October 23rd design review work for those interested? |
Could we do Nov 6th instead for the design review? The one on Oct 23th would be quite late for me due to timezones. |
if (urls.cdn === urls.defaultCdn || useDefaultCdn) { | ||
extensionPath = `rtv/${getMode().rtvVersion}/${extensionPath}`; | ||
} | ||
return `${base}/${extensionPath}`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For #25214 (comment), I was actually hoping to return just ${base}/${extensionPath}
here if opt_isLocalDev
or getMode().test
is true. Is that a change we can make?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/CC @honeybadgerdontcare @sebastianbenz Would you two mind taking @cathyxz's request into consideration in your design doc?
My intention is to drop this PR since the two changes made here are impeding progress on introducing self-hosted runtime support and a better alternative is available for
Are there any objections or comments before this is closed? |
Dropping PR for reasons listed in previous comment, as well as because GitHub account @mattwomple will not be used after Jan 2020 (retiring work account but will continue AMP contributions under personal account @mdmower). |
Bring self-hosted runtime support in-line with amp-toolbox optimizer:
/rtv/{rtv}
to self-hosted runtime URLsAdditional members in
AMP_CONFIG
:defaultCdn
: Set tohttps://cdn.ampproject.org
so that cache-modified components can still optionally be retrieved from a supported sourcecdnSupportsCacheModifiedExtensions
: Set tofalse
to usedefaultCdn
when retrieving cache-modified components, instead of the self-hosted componentRecommended reading:
Note: This partially addresses the original issue reported in #24927 but does not handle the case of components loaded by the runtime in singleDoc mode. For example,
amp-loader
andamp-auto-lightbox
do not need to appear in the<head>
of an AMP document to load. They are added during runtime. Since AMP pages have no means of manipulatingAMP.config
, these extensions use the default CDN. Improved self-hosting support for this scenario is intentionally being handled in a separate pull request since it involves more controversial changes and requires updates to the AMP validator.