Skip to content
This repository has been archived by the owner on Mar 5, 2020. It is now read-only.

Initial attempt at a lightweight dynamic server #936

Merged
merged 20 commits into from
Jun 9, 2015

Conversation

toolness
Copy link
Contributor

The current state of this WIP is deployed at https://teach-site.herokuapp.com/.

This is intended to resolve #585.

I'm thinking for now that we can just build it and use it alongside the existing static site generator, and simply switch over once we decide it's stable and performant enough to use in production.

Tasks left to do (can be filed as separate issues):

  • Add proper error handling (e.g., "500 Internal Server Error") to the server. (Update: filed as Add proper error handling to lightweight dynamic server #1007.)
  • The sitemap isn't currently being generated. This will take a bit of work, since the existing sitemap is generated by crawling all the index.html files of the static site, which isn't feasible for the dynamic server. (Update: filed as Lightweight dynamic server should generate sitemap.xml #1008.)
  • Appropriate static files (HTML, CSS, JS, etc) should be delivered as compressed/gzipped content, at least in production. (Update: filed as Lightweight dynamic server should serve compressed static files #1009.)
  • Static files should be delivered with reasonable cache headers, at least in production. (Update: filed as Lightweight dynamic server should serve relevant content with cache headers #1010.)
  • Assets should be delivered with relevant security headers (CSP, HSTS, etc.)
  • All packages that aren't required for production deployment should be moved into the devDependencies section of package.json.
  • Include the git-rev meta in generated pages. Turns out this is really hard to do when an app is deployed to Heroku. We should probably just settle for showing the version number from package.json for now.
  • Static files should all be copied into the dist directory via our gulpfile, rather than instantiating multiple express.static() middleware instances for each directory.
  • Add some tests.
  • react-router redirects should actually generate 302s from the server, rather than web pages telling non-JS users to visit the new location.

@toolness toolness changed the title [WIP] Initial attempt at a lightweight dynamic server Initial attempt at a lightweight dynamic server May 21, 2015
@toolness
Copy link
Contributor Author

Hey @mmmavis, this is kind of a big PR with some potentially unfamiliar things in it, so let me know if you want to walk through it on vidyo!

console.log('Rebuilt server-side bundle.');
});

gulp.start('watch-webpack');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the things I had a hard time discovering is that gulp is actually a subclass of Orchestrator. So it has methods like start() that aren't actually documented in the Gulp API docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore that! I realized that it would be easier to just move all development mode logic into the gulpfile, and require devs to use npm run app to start the lightweight dev server. This gets rid of our weird usage of gulp from within app.js, and also means we don't have to use funky undocumented and/or unfamiliar API methods.

@toolness
Copy link
Contributor Author

Ok, so manually testing out this PR will be a bit similar to #839. Note that you'll also want to manually test npm start too--even though its behavior hasn't changed, it's been refactored, so we want to make sure that it still works the same way it used to.

  • While running npm run app (which starts the lightweight dynamic server in development mode) and npm start, intentionally add syntax and runtime errors in various places in the code--LESS files, JSX files, and so on. This is to simulate real developers making real editing mistakes. Then make sure the console logs useful errors!
  • While running npm run app and npm start, add basic edits in various parts of the code, and make sure the site regenerates itself properly.
  • While running npm run app and npm start, edit the gulpfile and make sure the server stops with a helpful message.
  • Run node app.js and make sure the instructions it gives you are helpful.
  • Run npm install and make sure the postinstall step doesn't do anything.
  • Set NODE_ENV="production" and try the following things:
    • Run npm install and make sure the postinstall step generates static assets (because reasons).
    • Run node app.js and make sure the lightweight dynamic server works in production mode.

@mmmavis
Copy link
Member

mmmavis commented May 26, 2015

sorry for the wait, will get to this PR later today 🐢

@mmmavis mmmavis mentioned this pull request May 27, 2015
5 tasks
@mmmavis
Copy link
Member

mmmavis commented May 27, 2015

@toolness

phew I finally started to review this PR, one quick question: does https://teach-site.herokuapp.com/ act like a PROD environment or STAGING/DEVELOPMENT?

throw err;
}

console.log('Built server-side bundle.');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @toolness, I'm trying to understand the 3 bundled js:

  1. app.bundle.js (what's this? I can't find out how this is generated from)
  2. commons.bundle.js (I guess this is from here)?
  3. index-static.bundle.js

Are the first 2 for browser and index-static.bundle.js for server to watch for code changes & generate /dist? Other than being a Webpack novice I think my brain gets confused easily by the types of "server" we have for the site.

So there's a server that generates /dist and our static site just serves w/e is in /dist?
And there's also a lightweight server that dynamically renders HTML (#585, aka what this PR is for)?
Are the "2" servers 2 different things or 1 server doing two jobs??? 👀

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er sorry, yeah, this situation is kinda confusing now because this PR doesn't actually replace our static site generator with a dynamic lightweight server--it adds a lightweight dynamic server as an optional way to deploy the site.

To clarify, there's only one server intended for production here--it's the one in this PR, which dynamically renders HTML. There is no other production server, because our current site is just a bunch of pre-generated static files in dist/ which we upload to S3.

However, there are actually 2 different kinds of servers we can use for development:

  1. The development static file server, which is currently run when we use npm start (or gulp watch if you use that directly). This is a fairly "dumb" server, though, in the sense that it just kinda emulates S3 or any other kind of static file server.
  2. The dynamic lightweight server, which is in this PR, and can be run via npm run app (or gulp app). Once this server is truly ready for production, we can potentially get rid of the development static file server, as well as the static site generator, and replace them with this.

That said, though, we do actually generate (and copy) some static files even for the dynamic lightweight server: These include e.g.:

  • LESS/CSS files,
  • Client-side JavaScript files generated by webpack,
  • Images

The above will only ever change when we push a new release, which is why, even with the lightweight dynamic server, we just generate/copy them once in dist/, and serve that directory as static files, just like we do with the purely static site we already use. In fact, the only thing that's not statically served in the dynamic lightweight server is every HTML page on the site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A number of your questions are related to how we're using webpack's code splitting functionality to efficiently deliver the site to our users.

  1. app.bundle.js (what's this? I can't find out how this is generated from)

This is basically the main entry point for the website. It's generated from this part of our webpack.config.js:

module.exports = {
  entry: {
    app: './lib/main.jsx',
    manualTests: './test/browser/manual-main.jsx',
    tests: './test/browser/main.js'
  },

Basically webpack is generating three different entry points for our code, to be used by three different kinds of pages: the "app", which is loaded by any public page on our site, the automated test suite, and the manual test suite. In the future, we might add more.

(Note also that this file isn't new to this PR, it's been around since the early days of this project.)

  1. commons.bundle.js (I guess this is from here)?

Yup, it's our commons chunk, which contains common modules shared between all our entry points.

(Note also that this file isn't new to this PR, it's been around since the early days of this project.)

  1. index-static.bundle.js

Are the first 2 for browser and index-static.bundle.js for server to watch for code changes & generate /dist?

Yup exactly! Well, mostly exactly... index-static.bundle.js was introduced in #839 and is used to generate the purely static site, but it's also being leveraged by the lightweight dynamic server in this PR, to dynamically generate the HTML pages.

Another way to say it is that the website's HTML pages can either be statically generated in dist or dynamically served, and index-static.bundle.js is required to do both.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, another clarification: the dynamic lightweight server does not serve any .html files in the dist/ directory. In fact, when deploying to Heroku, npm install also generates the dist/ directory via a postinstall step, and you'll note that it doesn't actually generate the static HTML files--only the LESS/CSS, client-side JS, and so on.

Even if there happen to be any leftover .html files in the dist/ directory on your development machine when you run the dynamic lightweight server, they're never actually served--the server's router intercepts all requests to URLs like /clubs/ and dynamically generates the responses before the express static middleware ever sees them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An aside: you might also be wondering what the heck the 1.1.bundle.js file in dist is. That's actually mapbox!

In #430 we used Webpack's code splitting functionality to asynchronously load mapbox on-demand only when the user is on the Clubs page. This is because we don't want to unnecessarily send the user all of mapbox if they e.g. only want to look at the homepage of the site. You can imagine how, as our site grows, we don't want to deliver a massive single JS bundle containing all the functionality the user could ever possibly use on our site--it's more efficient to only give the user what they need to view the page they requested, and then asynchronously load new chunks of code as needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To put it another way: the static site uses indexStatic.generate() to generate every unique page for the site once and writes each page out to a HTML file in the dist directory, whereas the dynamic server uses that same function to dynamically generate the page being requested on-the-fly and send it back to the user without writing it to the file system.

coooool this is very clear!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that won't work because NODE_ENV=production is so optimized for deployment (not development) that it doesn't watch source files or anything--it just compiles the JSX once and then uses it forever, until it's shut down. The only way to have it read the new code is to either reboot the server or use the development mode for the server, which does watch the filesystem for changes and rebuilds the bundle when needed.

I tried rebooting the server but still didn't see the changes 😿 (cmd-c to quit AND NODE_ENV="production" ORIGIN="http://localhost:8008" node app.js to restart)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, sorry I think I phrased my question wrong - I understand the "dynamic" server being "dynamic" part 😛 but I was wondering how template files/JS gets compiled from .jsx. I assume React handles the compilation?

Ohhhh, got it! Yeah, that's actually powered by the infrastructure we added in #839, and it's different for development vs. production mode.

For development, we use the index-static-watcher thing to watch our JSX files (or any of the JS files they depend on) for changes and re-compile the server-side/html-generating JS bundle when needed.

For production, that whole thing is done only once (since we know our JSX/JS won't be tinkered with while the server is running) during the postinstall step.

Either way, the JSX/JS is ultimately bundled, via webpack, into a file in the build directory which the server loads (and potentially _re_loads, if we're in development mode). And that bundle is where indexStatic.generate() lives.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that won't work because NODE_ENV=production is so optimized for deployment (not development) that it doesn't watch source files or anything--it just compiles the JSX once and then uses it forever, until it's shut down. The only way to have it read the new code is to either reboot the server or use the development mode for the server, which does watch the filesystem for changes and rebuilds the bundle when needed.

I tried rebooting the server but still didn't see the changes 😿 (cmd-c to quit AND NODE_ENV="production" ORIGIN="http://localhost:8008" node app.js to restart)

Oh shoot, I actually mis-wrote what you quoted! 😞

Because in production the all-powerful indexStatic.generate() ultimately comes from the bundle generated by the postinstall step, you not only have to reboot the server, but also have to re-run the postinstall step to have any changes to JSX files take effect. :( It's a huge hassle but I guess that's why we prefer development mode, and leave production mode to Heroku, hehehe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay cool, this matches my expectation of the app's behaviours now!

@toolness
Copy link
Contributor Author

one quick question: does https://teach-site.herokuapp.com/ act like a PROD environment or STAGING/DEVELOPMENT?

Oh it acts like a staging environment, in the sense that it's minifying code while also showing the "dev version" ribbon, just like our existing staging site at https://teach.mofostaging.net/.

That said, it also reflects the current state of this PR, which is to say that it doesn't have a lot of the recent improvements on our develop or master branches, like the new homepage.

@mmmavis
Copy link
Member

mmmavis commented Jun 4, 2015

😓 Ahhh, I think I confused myself with the term prod/staging/development site vs what process.env.NODE_ENV can be.
Is it true that we normally set

  • process.env.NODE_ENV to production on S3(production site) & Heroku(staging site)
  • process.env.NODE_ENV to development when we run the app locally (I guess we can still set it to production if we want) ?

@toolness
Copy link
Contributor Author

toolness commented Jun 4, 2015

Yeah, that's correct. For dev/prod parity I wanted staging to be as close to production as possible, so we can spot errors before we push to production.

Er, that said, the only things running on Heroku right now are teach-api, which doesn't use node, and the lightweight dynamic server in this PR... teach.mofostaging.net is S3+CloudFront, just like teach.mozilla.org.

@@ -59,6 +62,8 @@
"s3": "gulp s3",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is npm run s3 (and pres3) what gets run when we deploy the site to staging/production?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmmavis
Copy link
Member

mmmavis commented Jun 4, 2015

Hey @toolness ,

I ran through the code & comments & tested what you mentioned here - everything worked out well 💃 💃 💃 👍

I get the general idea about this PR but I can't say I understand everything 100% - I'll probably still be asking questions here and there in the future as my brain gets developed day by day... lol 😛 I feel like every time I go through this PR I learn something new. Heh. I guess I just need some time to process the code.

That said, I wonder if we should merge this PR since it works as expected? :neckbeard:

@toolness
Copy link
Contributor Author

toolness commented Jun 5, 2015

No worries mavis! I think what makes this PR particularly challenging to understand is the fact that there's now lots of code that's only used in certain contexts--for instance, npm start is only used when we want to develop using the static site generator, while that postinstall step is only intended for use w/ deploying the dynamic server to Heroku, while index-static.jsx is used by both the static site generator and the dynamic server in completely different contexts. It's a whole lot to take in and I'm sorry it's so confusing. 😟

I think it'd actually easier to understand if I just removed all the code only used for static site generation... but we can't really do that until we're completely ready to move to the dynamic server... urgh.

I guess it's probably OK to merge the PR for now, and once we eventually migrate to using the dynamic server in production, we can just throw out the static site generator code completely, which will make the codebase easier to understand.

Hmm, that said, another alternative is for me to just maintain this PR as a sort of fork of the project, which we only merge in when we're ready to migrate to the dynamic server (and at that point I can also change this PR to remove all the static site generation code). I'm not sure which option is better...

@mmmavis
Copy link
Member

mmmavis commented Jun 5, 2015

No worries @toolness and thanks for patiently answering all my questions!

Heh I did spend quite a bit of time on package.json and gulpfile.js just to understand what scripts should be ran in what context, and what command would trigger what scripts that sort of thing. But thanks to your explanation now it becomes a lot more clear to me.

Now I'm curious how we determine if we are "ready" to completely switch over to the dynamic server. 😶

@toolness
Copy link
Contributor Author

toolness commented Jun 5, 2015

Ah cool! Yeah, please don't hesitate to keep the questions coming.

Now I'm curious how we determine if we are "ready" to completely switch over to the dynamic server. 😶

Haha that's a great question, and one that I think I'll need to file a new bug about. At the very least, it will involve making the server provide caching headers that allow CloudFront (or some other reverse cache proxy) to aggressively cache the server's responses for the site's most static content, so that we don't suffer a performance hit in moving to the dynamic server. Because it would really suck if the server actually regenerated every single HTML page for every single request from scratch--that's a lot less efficient than our current setup where CloudFront serves pre-rendered HTML content from an S3 bucket!

toolness added a commit that referenced this pull request Jun 9, 2015
Initial attempt at a lightweight dynamic server
@toolness toolness merged commit bb8aaa9 into mozilla:develop Jun 9, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants