[RFC] SSG with incremental-static-regeneration and revalidate #804

Closed
arelaxend opened this issue Nov 16, 2020 · 18 comments

@arelaxend

arelaxend commented Nov 16, 2020

Hi,

Is your feature request related to a problem? Please describe.
When you set up a page /[id].ts for SSG with fallback: true and an API route /api/preview.ts to enable preview mode, making "preview" changes works pretty well and you can see live changes to the page. 👍

Still, changes that appear correctly in preview mode do not update the S3 files static-pages/*.html and _next/data/*/*.json accordingly.

So whenever such a page is displayed "publicly" (with preview mode disabled), it does not show the latest changes. What it shows is the latest version of the page stored in S3.

For example, for a specific page yourdomain/abcde, the file static-pages/abcde.html never updates (nor does the corresponding JSON file). When one makes changes in preview mode, the plugin does not update static-pages/abcde.html, so the change does not appear publicly.

How to reproduce the issue
To reproduce this problem, please follow the steps:

  1. Render the page /abced without preview mode (no cookies). This should create the _next/data/<id>abced.json and static-pages/abced.html files in S3 and render the page according to the RFC. 👍
  2. Modify the content of /abced with preview mode ON (with cookies). To do this, use an API route, e.g. /api/preview.
  3. Render the page /abced again. The content shows the updates made in step 2. 👍
  4. Remove the preview mode cookies.
  5. Display the page /abced again. Since preview mode is OFF this time, the content is not the latest: it shows the result from step 1 instead of the modified content from step 3.
  6. If you look, steps 2 and 3 do not update the files _next/data/<id>abced.json and static-pages/abced.html in S3. They do not invalidate the CloudFront cache either.
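
For reference, the setup these steps assume looks roughly like the sketch below. It is only an illustration, not this project's code: the local types stand in for the Next.js ones, fetchPage is a hypothetical data loader, and in a real project the functions would be exported from pages/[id].ts and pages/api/preview.ts.

```typescript
// Minimal stand-ins for the Next.js types, so the sketch is self-contained.
type StaticPropsContext = { params: { id: string }; preview?: boolean };
type PreviewResponse = {
  setPreviewData(data: object): void; // sets the preview-mode cookies
  redirect(url: string): void;
};

// Hypothetical data loader: drafts are only visible in preview mode.
function fetchPage(id: string, draft: boolean): { id: string; body: string } {
  return { id, body: draft ? "draft content" : "published content" };
}

// pages/[id].ts: nothing is prerendered, unknown paths go through the fallback.
function getStaticPaths() {
  return { paths: [] as string[], fallback: true };
}

function getStaticProps(ctx: StaticPropsContext) {
  return { props: fetchPage(ctx.params.id, ctx.preview === true) };
}

// pages/api/preview.ts: turn preview mode on, then go back to the page.
function previewHandler(req: { query: { id: string } }, res: PreviewResponse): void {
  res.setPreviewData({});
  res.redirect(`/${req.query.id}`);
}
```

Steps 2 and 3 go through previewHandler and see the draft; steps 1 and 5 call getStaticProps without preview and are the only ones whose output ends up in S3.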

Describe the solution you'd like
A good solution requires that when you make changes in preview mode, the next render of the page without preview mode shows those changes.

To fix it, we need to make the fallback page never cached (max-age: 0), so the lambda can return the newly generated page (and cache it) the next time the same route is hit. But we also need to store the HTML and JSON files in preview mode for each preview request (RFC STEP 2).
[Screenshot, 2020-11-16 16:14:01]
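
As a rough illustration of the "never cached" part, the handler could stamp the fallback response with a zero max-age before returning it. The header layout follows the Lambda@Edge response format; markUncacheable is a hypothetical helper, not something the plugin provides, and the exact directive string is an assumption.

```typescript
// Shape of the headers map in a Lambda@Edge response.
type EdgeHeaders = { [name: string]: { key: string; value: string }[] };
type EdgeResponse = { status: string; headers: EdgeHeaders; body?: string };

// Hypothetical helper: returns a copy of the response that CloudFront and
// browsers should not reuse, so the next hit reaches the Lambda again.
function markUncacheable(response: EdgeResponse): EdgeResponse {
  return {
    ...response,
    headers: {
      ...response.headers,
      "cache-control": [
        { key: "Cache-Control", value: "public, max-age=0, s-maxage=0, must-revalidate" },
      ],
    },
  };
}

const fallback = markUncacheable({ status: "200", headers: {}, body: "<html>loading</html>" });
console.log(fallback.headers["cache-control"][0].value);
```

Note that CloudFront only honours a zero max-age when the cache behaviour's minimum TTL is also 0, so the distribution settings would need to match.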

Describe alternatives you've considered
The alternative is to force-delete those files in S3 and invalidate the CloudFront cache outside the plugin. This is non-standard, does not follow the RFC, and is an ugly and unsatisfying solution. Per-file cache invalidation is not scalable (AWS limits it to 3000 concurrent invalidations) and it has a cost.

Additional context
@dphang discusses a related issue here :

I did find that issue too, it is because fallback page gets cached after 1st hit. But if you bust the cache after hitting a non-prerendered route (add a random query parameter), with the new change it would pick up the page that was just stored in s3, so all props are populated.

To fix it, we need to make the fallback page never cached (max-age 0), so the lambda at the edges can return the newly generated page (and cache it) the next time the same route is hit. But we also need to version the pages properly as I realized we are not clearing them properly (we only have one set of pages under static-pages, so subsequent deploys may pick up an old version).

Originally posted by @dphang in #798 (comment)

This discussion raises a related issue: after a new deployment, users can still be served cached versions instead of up-to-date ones. The same analysis applies to preview mode, as described in this thread.

Also, #355

A.

@arelaxend arelaxend changed the title SSG with preview mode and fallback true does not update s3 files [RFC proposal] SSG with preview mode and fallback true does not update s3 files Nov 16, 2020
@arelaxend
Author

arelaxend commented Nov 17, 2020

vercel/next.js#11552
#559

https://nextjs.org/blog/next-9-4#incremental-static-regeneration-beta

This would look something like:

  • The page defines what timeout is reasonable, this could be as low as 1 second
  • When a new request comes in the statically generated page is served
  • When a new request comes in and the defined timeout is exceeded:
    • The statically generated page is served
    • Next.js generates a new version of the page in the background and updates the static file for future requests
  • When a new request comes in the updated static file is served

Originally posted by @timneutkens in vercel/next.js#11552
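
The flow in that list can be modelled in a few lines. This is a toy in-memory sketch of the behaviour described, not Next.js's implementation: RevalidatingCache and runBackgroundWork are hypothetical names, and "in the background" is simulated with an explicit job queue so the order of events is easy to follow.

```typescript
type Entry = { html: string; generatedAt: number };

class RevalidatingCache {
  private entries = new Map<string, Entry>();
  private pending: (() => void)[] = [];
  private render: (path: string) => string;
  private revalidateSeconds: number;

  constructor(render: (path: string) => string, revalidateSeconds: number) {
    this.render = render;
    this.revalidateSeconds = revalidateSeconds;
  }

  // Serve a request arriving at time `now` (milliseconds).
  serve(path: string, now: number): string {
    let entry = this.entries.get(path);
    if (entry === undefined) {
      // First request: nothing to serve yet, so generate in the foreground.
      entry = { html: this.render(path), generatedAt: now };
      this.entries.set(path, entry);
    } else if (now - entry.generatedAt > this.revalidateSeconds * 1000) {
      // Timeout exceeded: queue a regeneration, but still serve the stale page.
      this.pending.push(() => {
        this.entries.set(path, { html: this.render(path), generatedAt: now });
      });
    }
    return entry.html;
  }

  // Stand-in for "in the background": run the queued regenerations.
  runBackgroundWork(): void {
    for (const job of this.pending.splice(0)) job();
  }
}

let version = 0;
const cache = new RevalidatingCache(() => `v${++version}`, 1);
const first = cache.serve("/abc", 0);    // generated on the first hit
const stale = cache.serve("/abc", 5000); // past the timeout: stale copy served
cache.runBackgroundWork();
const fresh = cache.serve("/abc", 5500); // updated copy served
console.log(first, stale, fresh); // v1 v1 v2
```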

https://github.com/vercel/next.js/blob/4866c47d9d9b39515d6e57118955e86c85630c4c/packages/next/next-server/server/incremental-cache.ts#L89

That one will be tricky to implement. We could discuss it separately after this RFC is completed.

@danielcondemarin @dphang - do you have any update regarding the implementation of revalidate for getStaticProps?

Implementing revalidate is a tricky one I might have underestimated. In order to implement the stale-while-revalidate spec properly you have to generate a fresh copy of the page in the background whilst serving the stale copy in a non-blocking fashion. Doing that in Lambda is not possible due to the way the Lambda execution environment works. In other words, Lambda doesn't let you return a response to the client and keep on running a task in the background. The closest thing to running something in the background is to set callbackWaitsForEmptyEventLoop to false, but that freezes outstanding events, so not even that would work.
I need to take a closer look at Next.js implementation as I'm not sure if they conform strictly to the spec. Any ideas / suggestions please let me know 🙂

I wonder what I'm missing here - A separate lambda invocation makes sense, but why would SQS, or anything else on top be necessary? Couldn't the first Lambda just invoke the second lambda with InvocationType Event and not await its result?

You're right @janus-reith. That removes the need for SQS, even though I think AWS still uses a queue behind the scenes for async invocations. But if it is abstracting that away from us, that's best!

My only remaining concern would be latency between regions. Say a user is in us-east-1 and hits one of the edge functions in that region: when it invokes the other lambda for background page regeneration, what latency should we expect? I realise that it won't need to wait for the regeneration lambda to process, but the request that puts the event in the queue: in what region does that happen, and how do we know it is not going to a queue somewhere in eu-*?
My initial guess would be the queue AWS uses behind the scenes would be colocated in the same region as the async lambda but I'm not sure of that.

Originally posted by @danielcondemarin in #355 (comment)
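
The asynchronous invocation discussed above could be sketched as follows. The parameter names (FunctionName, InvocationType, Payload) are those of the Lambda Invoke API; everything else is an assumption made for illustration: LambdaLike is a narrow interface standing in for the SDK client, and "regenerate-page" is a hypothetical function name.

```typescript
// Only the part of the Lambda client this sketch needs.
type InvokeParams = {
  FunctionName: string;
  InvocationType: "Event" | "RequestResponse";
  Payload: string;
};
interface LambdaLike {
  invoke(params: InvokeParams): Promise<unknown>;
}

// Hypothetical name; the regeneration Lambda would be deployed separately.
const REGENERATION_FUNCTION = "regenerate-page";

// Queue a background regeneration. With InvocationType "Event" the call
// resolves once Lambda has accepted the event, not when the function finishes.
async function triggerRegeneration(client: LambdaLike, uri: string): Promise<void> {
  await client.invoke({
    FunctionName: REGENERATION_FUNCTION,
    InvocationType: "Event",
    Payload: JSON.stringify({ uri }),
  });
}
```

With the v2 AWS SDK, the adapter would be something like { invoke: (p) => lambda.invoke(p).promise() }. The edge function still waits for the enqueue round trip, so the cross-region latency question raised above remains open.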

To fix it, we need to make the fallback page never cached, so the lambda can return the newly generated page (and cache it) the next time the same route is hit. But we also need to version the pages properly as I realized we are not clearing them properly (we only have one set of pages under static-pages, so subsequent deploys may pick up an old version).

Originally posted by @dphang in #798 (comment)

Also :

Can this feature be added to the schedule? The upstream RFC is older than v10; it goes back to 9.4 👍

@arelaxend arelaxend changed the title [RFC proposal] SSG with preview mode and fallback true does not update s3 files [RFC] SSG with incremental-static-regeneration and revalidate Nov 17, 2020
@krikork

krikork commented Dec 21, 2020

Support for incremental static regeneration would be very handy!

@alexthewilde

Hey @arelaxend, incremental static regeneration imho is a killer feature of Next.js. Do you have any idea if and by when serverless-next.js will support this?

@vosinsky

Hi, is there an update on this? Or is there some other method to solve this problem?

@evangow

evangow commented Feb 20, 2021

Spent some time thinking about this. My pseudocode is below for anyone that might have time to implement this. This covers both the setting { revalidate: Int } on getStaticProps and { fallback: "blocking" } on getStaticPaths (neither of which is currently handled, from my understanding).

// Below is the try/catch logic that would be inserted inside the response lambda.
// Still pseudocode: s3, lambda, page, uploadToS3, serve, fallbackPage and
// notFoundPage are assumed to exist in the handler's scope.
try {
  // https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#headObject-property
  // this will throw an error if the specified key doesn't exist
  const { LastModified } = await s3.headObject({ Bucket, Key: s3Key }).promise()

  // if you get to the below, then s3 didn't throw an error, which means
  // there is an existing s3 page and you must simply choose whether
  // or not to regenerate it in the background
  // https://nextjs.org/docs/basic-features/data-fetching#incremental-static-regeneration

  // LastModified is a Date and revalidate is in seconds, so compare ages in seconds
  const ageInSeconds = (Date.now() - LastModified.getTime()) / 1000
  if (getStaticProps.revalidate && ageInSeconds > getStaticProps.revalidate) {
    // if you get to here, then the existing s3 page is stale
    // call an async lambda to regenerate the page and upload it to s3
    // https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html#invoke-property
    await lambda.invoke({
      FunctionName: "generatePageAndSaveToS3",
      InvocationType: "Event", // asynchronous: returns once the event is queued
      Payload: JSON.stringify({ req }) // assumes req is a serializable request description
    }).promise()
  }

  // regardless of whether revalidate is undefined or the s3 object is stale
  // or not, we simply return the existing s3 page
  return s3Key // i.e. return the existing s3 page
} catch (err) {
  // anything other than "specified key doesn't exist" is a real failure
  if (err.code !== "NotFound") throw err
  // https://nextjs.org/docs/basic-features/data-fetching#the-fallback-key-required
  // fallback can be true, false, or blocking
  if (getStaticPaths.fallback === true) {
    // if fallback is true, we generate the S3 page in the background, but return a fallback page
    await lambda.invoke({
      FunctionName: "generatePageAndSaveToS3",
      InvocationType: "Event",
      Payload: JSON.stringify({ req })
    }).promise()
    return fallbackPage
  } else if (getStaticPaths.fallback === "blocking") {
    // if fallback is blocking, then we wait to generate the page,
    // upload it to s3 and return it
    const { data, html } = await page.renderReqToHTML(req, res)
    await uploadToS3(data, html)
    return serve(html)
  } else {
    return notFoundPage // fallback: false, so respond with the 404 page
  }
}

// async lambda that gets called when a page is stale or missing
// (req and res would be rebuilt from the event payload)
async function generatePageAndSaveToS3(req, res) {
  const { data, html } = await page.renderReqToHTML(req, res)
  await uploadToS3(data, html)
}
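
The staleness test is the part of this that is easiest to get wrong, since S3 reports a Date while revalidate is a number of seconds. A minimal standalone version, with isStale as a hypothetical helper name:

```typescript
// Returns true when a page last written at `lastModified` is older than its
// revalidate window. `revalidateSeconds` is undefined for pages without ISR.
function isStale(
  lastModified: Date,
  revalidateSeconds: number | undefined,
  now: Date = new Date(),
): boolean {
  if (revalidateSeconds === undefined) return false;
  const ageSeconds = (now.getTime() - lastModified.getTime()) / 1000;
  return ageSeconds > revalidateSeconds;
}

const written = new Date("2021-02-20T12:00:00Z");
console.log(isStale(written, 60, new Date("2021-02-20T12:00:30Z"))); // false: 30 s old
console.log(isStale(written, 60, new Date("2021-02-20T12:02:00Z"))); // true: 120 s old
```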
  • Potential Edge Case: if it's past the revalidation window, the async lambda could be called multiple times before it has time to complete.

For example, if you get 100 requests for a particular page immediately after the revalidation period, the async lambda could get called all 100 times if the first invocation wasn't able to upload the new page to S3 before the next 99 requests hit.

  • Potential Solution 1: Set a concurrency limit of 1 on the async lambda. Then, inside the async lambda itself, check again to see if the page needs to be regenerated. This way, the 1st time the function is called, the page will be regenerated and uploaded to S3. The next 99 calls to the lambda will see that a new file has been uploaded within the revalidation window and simply return null.

The concurrency limit on a lambda can't be set based on the function arguments as far as I'm aware, though, so this would create a bottleneck for ALL page regeneration requests. If you've got dozens or hundreds of ISR pages with "revalidate: 1" (one second), you could be calling this lambda more frequently than it could process requests and upload them to S3, which would create an ever-expanding backlog of requests.

For this reason, I think it's better to perhaps let the 99 follow-on requests (from the example) simply regenerate the page.
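
For comparison, the re-check described in Potential Solution 1 could look like the sketch below. It assumes invocations are processed one at a time (the concurrency limit of 1), uses a Map as a stand-in for the S3 bucket, and regenerateIfStale is a hypothetical name.

```typescript
type StoredPage = { html: string; lastModified: number }; // epoch milliseconds

// In-memory stand-in for the S3 bucket.
const store = new Map<string, StoredPage>();
let renderCount = 0;

// Hypothetical regeneration handler: every invocation re-checks freshness
// first, so calls that arrive after a successful regeneration do nothing.
function regenerateIfStale(path: string, revalidateSeconds: number, now: number): boolean {
  const current = store.get(path);
  if (current !== undefined && now - current.lastModified <= revalidateSeconds * 1000) {
    return false; // an earlier invocation already refreshed the page
  }
  renderCount += 1;
  store.set(path, { html: `render #${renderCount}`, lastModified: now });
  return true;
}

store.set("/abc", { html: "old", lastModified: 0 });
// 100 queued invocations, processed serially 10 s after the page was written.
for (let i = 0; i < 100; i++) regenerateIfStale("/abc", 1, 10000 + i);
console.log(renderCount); // 1
```

Only the first invocation renders; the other 99 return early, at the cost of funnelling every page through a single serial worker.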

  • Potential Solution 2: Instead of using an async lambda invocation, the request could be sent to an SQS queue with a deduplication id (never used this before, so my understanding could be wrong).

https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/using-messagededuplicationid-property.html

In this case, the deduplication id would be composed of the LastModified value returned by s3.headObject().promise() along with the path that the request was made on (e.g. /slug/...).

That way, the first request would insert the message into the SQS queue and the other 99 requests would automatically be deduplicated.

The SQS queue would have the generatePageAndSaveToS3 lambda attached to process the queued requests.

Based on my understanding of the docs, the deduplication id lasts for 5 minutes, so as long as generatePageAndSaveToS3 can complete the initial request within that 5-minute timeframe, it should only ever get called once.
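
A sketch of how such a message could be built. The field names are the SQS SendMessage parameters for FIFO queues (message deduplication ids only apply to FIFO queues); deduplicationId and regenerationMessage are hypothetical helpers, the queue URL is made up, and using one message group per page is an assumption.

```typescript
type FifoMessage = {
  QueueUrl: string;
  MessageBody: string;
  MessageGroupId: string;
  MessageDeduplicationId: string;
};

// Same path + same LastModified => same id, so repeats sent within the
// deduplication interval are dropped by SQS.
function deduplicationId(path: string, lastModified: Date): string {
  // encodeURIComponent keeps the id within the characters SQS accepts.
  const id = `${encodeURIComponent(path)}:${lastModified.getTime()}`;
  if (id.length > 128) throw new Error("deduplication id exceeds 128 characters");
  return id;
}

function regenerationMessage(queueUrl: string, path: string, lastModified: Date): FifoMessage {
  return {
    QueueUrl: queueUrl,
    MessageBody: JSON.stringify({ path }),
    MessageGroupId: encodeURIComponent(path), // assumption: one group per page
    MessageDeduplicationId: deduplicationId(path, lastModified),
  };
}

// Hypothetical queue URL, for illustration only.
const queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/regeneration.fifo";
const lastWrite = new Date("2021-02-20T12:00:00Z");
const message = regenerationMessage(queueUrl, "/slug/abc", lastWrite);
console.log(message.MessageDeduplicationId);
```

Once the page is regenerated its LastModified changes, so the next revalidation window produces a new id and is not suppressed by the previous one.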

@avisra

avisra commented Mar 23, 2021

Just confirming my understanding here - incremental static regeneration does not currently work with this repo?

@GuillaumeSD

Hi @avisra
From what I know, ISR does not yet work with this repo. I discovered this when I saw that the revalidate option didn't work with getStaticProps.
ISR is a killer feature imho, really looking forward to it.

@avisra

avisra commented Mar 24, 2021

well... this is a real bummer. my application relies heavily on incremental static regeneration. excuse me while I scramble to find another hosting option...

@dmsolutionz

Have you attempted this solution yet?

@evangow

evangow commented Mar 26, 2021

Are you asking me? If so, no. My application doesn't rely on ISR. I would use it if this repo supported it, but it's not strictly necessary for my use-case.

@adamdottv

I'm looking into implementing this, if nobody else has started the work. I leverage lambda-at-edge in my CLI tool, Ness (https://github.com/nessjs/ness) and want to give users of Ness ISR support.

@dphang
Collaborator

dphang commented Apr 3, 2021

@adamelmore sounds good, thanks! Sorry I haven't had a chance to work on this yet. Do let us know if you need any guidance. I think this feature would require adding a new SQS queue / regional Lambda to trigger the incremental static regeneration / revalidate behavior, which can be called from Lambda@Edge using the SQS client.

@dmsolutionz

Where on the roadmap is this placed at this point? It is genuinely one of the best features I've seen from Next.

@adamdottv

Well, I don't know if there's a roadmap, per se; but, on my personal Trello board it's currently in the "Next Week" column 😅

I want this for my own side project, so I will be working on it; it's just a matter of when. I hope to have started in the next week or so, and then go from there.

@adamdottv

I'm interested in paying a bounty to anyone that wants to take this work on. I've got a lot on my plate but really want to see this completed.

Offering $1200 USD if anyone that stumbles on this wants to take this on. Message me here or on Twitter for details.

https://twitter.com/aeduhm/status/1382093398077796357?s=20

@adamdottv

Upping this to $2000 USD.

@kirkness
Collaborator

Keen on taking you up on that offer @adamelmore 😁

Just put in a WIP PR here - keen for everyone's feedback.

@dphang
Collaborator

dphang commented May 22, 2021

I think this can be closed save for any bugs which I will track here: #1098. Thanks all!

@dphang dphang closed this as completed May 22, 2021