Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add gzip compressed _all.html file to build artifacts #746

Closed
mattpen opened this issue Mar 14, 2019 · 37 comments
Closed

Add gzip compressed _all.html file to build artifacts #746

mattpen opened this issue Mar 14, 2019 · 37 comments

Comments

@mattpen
Copy link
Contributor

mattpen commented Mar 14, 2019

Requested to improve performance for the iOS app in phetsims/phet-ios-app#440.

I'm trying to find a node library that will accomplish lzma compression so we can build this artifact in buildRunnable.html. I'm currently evaluating lzma-native. It works fine on linux machines but fails hard (no catchable exceptions) on Windows machines. I've opened addaleax/lzma-native#78 with the developer.

@mattpen mattpen mentioned this issue Mar 14, 2019
@mattpen mattpen self-assigned this Mar 14, 2019
mattpen added a commit that referenced this issue Mar 14, 2019
@mattpen
Copy link
Contributor Author

mattpen commented Mar 14, 2019

This seems to be working nicely now:

$ grunt build-for-server --brands=phet --allHTML
Running "build" task
Building runnable repository (acid-base-solutions, brands: phet)
Building brand: phet
>> require.js optimization for brand: phet complete (4328192 bytes)
(node:18676) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
>> Minification for phet complete
>> Require.js: 1432412 bytes
>> Preloads: 241309 bytes
>> Mipmaps: 132656 bytes

Done.

$ cd build/phet/
$ ls -la
total 10140
drwxr-xr-x 1 matt 197121       0 Mar 14 14:12 ./
drwxr-xr-x 1 matt 197121       0 Mar 14 14:11 ../
-rw-r--r-- 1 matt 197121    1111 Mar 14 14:12 acid-base-solutions_all_iframe_phet.html
-rw-r--r-- 1 matt 197121 2140359 Mar 14 14:12 acid-base-solutions_all_phet.html
-rw-r--r-- 1 matt 197121  245803 Mar 14 14:12 acid-base-solutions_all_phet.html.xz
-rw-r--r-- 1 matt 197121 5833512 Mar 14 14:12 acid-base-solutions_all_phet_debug.html
-rw-r--r-- 1 matt 197121    1110 Mar 14 14:12 acid-base-solutions_en_iframe_phet.html
-rw-r--r-- 1 matt 197121 1877995 Mar 14 14:12 acid-base-solutions_en_phet.html
-rw-r--r-- 1 matt 197121    9899 Mar 14 14:12 acid-base-solutions-128.png
-rw-r--r-- 1 matt 197121   95847 Mar 14 14:12 acid-base-solutions-600.png
-rw-r--r-- 1 matt 197121   40140 Mar 14 14:12 acid-base-solutions-ios.png
-rw-r--r-- 1 matt 197121  105678 Mar 14 14:12 acid-base-solutions-twitter-card.png
-rw-r--r-- 1 matt 197121    2358 Mar 14 14:12 dependencies.json
drwxr-xr-x 1 matt 197121       0 Mar 14 14:11 xhtml/

@samreid
Copy link
Member

samreid commented Mar 15, 2019

Have you considered using node's built-in zlib library? https://nodejs.org/docs/latest-v11.x/api/zlib.html

@mattpen
Copy link
Contributor Author

mattpen commented Mar 18, 2019

@samreid - I don't believe that supports lzma, which was requested specifically in phetsims/phet-ios-app#440. The compression ratio is somewhat better.

$ xz acid-base-solutions_all_phet.html -c > acid-base-solutions_all_phet.html.xz
$ gzip acid-base-solutions_all_phet.html -c > acid-base-solutions_all_phet.html.gz
$ du -sh acid-base-solutions_all_phet.html*
2.1M    acid-base-solutions_all_phet.html
704K    acid-base-solutions_all_phet.html.gz
604K    acid-base-solutions_all_phet.html.xz

There are a couple of other libraries with xz bindings that we could try if this is creating problems for local development, or we could make native calls to xz. The latter option probably won't work for windows devs, so we would probably need to move that step to the build-server.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 18, 2019

I'm going to investigate https://www.npmjs.com/package/xz

@mattpen
Copy link
Contributor Author

mattpen commented Mar 18, 2019

That doesn't seem to be stable either. I'm getting the same error @chrisklus reported in slack when tyring to install it without removing node_modules first.

> [email protected] install C:\Users\matt\phet\git\chipper\node_modules\xz
> node-gyp rebuild


C:\Users\matt\phet\git\chipper\node_modules\xz>if not defined npm_config_node_gyp (node "C:\Users\matt\AppData\Roaming\npm\node_modules\npm\bin\node-gyp-bin\\..\..\node_modules\node-gyp\bin\node-gyp.js" rebuild )  else (node "" rebuild )
gyp ERR! configure error
gyp ERR! stack Error: Can't find Python executable "C:\Program Files\Python 3.5\python.EXE", you can set the PYTHON env variable.
gyp ERR! stack     at PythonFinder.failNoPython (C:\Users\matt\AppData\Roaming\npm\node_modules\npm\node_modules\node-gyp\lib\configure.js:483:19)
gyp ERR! stack     at PythonFinder.<anonymous> (C:\Users\matt\AppData\Roaming\npm\node_modules\npm\node_modules\node-gyp\lib\configure.js:508:16)
gyp ERR! stack     at C:\Users\matt\AppData\Roaming\npm\node_modules\npm\node_modules\graceful-fs\polyfills.js:284:29
gyp ERR! stack     at FSReqWrap.oncomplete (fs.js:153:21)
gyp ERR! System Windows_NT 10.0.17134
gyp ERR! command "C:\\Program Files\\nodejs\\node.exe" "C:\\Users\\matt\\AppData\\Roaming\\npm\\node_modules\\npm\\node_modules\\node-gyp\\bin\\node-gyp.js" "rebuild"
gyp ERR! cwd C:\Users\matt\phet\git\chipper\node_modules\xz
gyp ERR! node -v v10.15.3
gyp ERR! node-gyp -v v3.6.2
gyp ERR! not ok
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: [email protected] (node_modules\fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for [email protected]: wanted {"os":"darwin","arch":"any"} (current: {"os":"win32","arch":"x64"})

npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] install: `node-gyp rebuild`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 18, 2019

Since none of the available node library bindings appear to be compatible with npm prune && npm update, I'm going to remove this step from chipper and add it to the build-server. I'm going to implement it as a native process call.

mattpen added a commit that referenced this issue Mar 18, 2019
@samreid
Copy link
Member

samreid commented Mar 18, 2019

The compression ratio is somewhat better.
704K acid-base-solutions_all_phet.html.gz
604K acid-base-solutions_all_phet.html.xz

Is the argument that it is important to save 100kb per simulation? If so, why is that important? Let's say we have 100 simulations, saving an average of 100kb / simulation. That is a total of 10MB. Does that cross a threshold?

I'm going to remove this step from chipper and add it to the build-server.

Is it not important that this be controlled as an output from the build step?

I want to understand the reasoning to do something "outside the box" when there is a supported cross-platform solution provided directly by the node.js SDK and doesn't require 3rd party libraries or limitation of running native process call which only works on some platforms.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 18, 2019

Is the argument that it is important to save 100kb per simulation? If so, why is that important? Let's say we have 100 simulations, saving an average of 100kb / simulation. That is a total of 10MB. Does that cross a threshold?

This request is from Chris Huxtable, who is working on our app. I'm not confident it crosses a threshold or that gzip will be unsuitable. gzip is already provided by apache so it would require 0 effort in terms of build tools development.

10MB savings isn't crossing a threshold, but it's a 15% reduction which could be significant at scale.

Is it not important that this be controlled as an output from the build step?

The thing that concerns me the most is that it creates a dependency on the file naming conventions, but there are many other instances of this type of dependency in the build-server already.

I'll follow up with Chris and see if he is ok with gzip and bring it up at dev meeting if not.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 19, 2019

Chris Huxtable's reasoning for requesting an xz/lzma asset is here.

Here are our options for proceeding:

  1. Use gzip and accept the 15% bloat in storage size.

    • This affects server bandwidth and storage on all 100k+ ios devices. This is already implemented server side, we would just need to change which compression algorithm is used in iOS (zlib is supported, which should be able to read gzip files).
  2. Use a node library to create an xz asset with the grunt tools.

    • There is some instability with the npm prune && npm update operation, which would need to be sorted out. We would likely need to revert to the rm -rf node_modules && npm install method, which is quite a bit slower and requires a lot of extra bandwidth during the build process. There may be other unforeseen problems in adding an additional third party npm library to the build tools.
  3. Use a system call to create an xz asset with the grunt tools.

    • All developers would need to install xz in order to use the grunt tools.
  4. Use a system call to create an xz asset in the build-server.

    • This somewhat breaks the design pattern of using grunt to create the build artifacts, and puts an additional dependency on grunts file structure and naming conventions to deploy the artifact. (Note: we already have a dependencies like this)
  5. Adding a path in the tomcat application that fetches the sims and compresses them on the fly

    • very suboptimal in terms of cpu usage on the server
  6. Implementing the xz compression algorithm in node ourselves

    • This is basically reinventing the wheel
  7. Implementing an httpd module to add support for xz compression.

    • This is reinventing the tools needed to reinvent the wheel, and then reinventing the wheel
  8. rewrite chipper in c

    • seems legit

@samreid
Copy link
Member

samreid commented Mar 21, 2019

Unlike gzip, lzma seems to benefit from compressing multiple simulations together. For instance, I compressed 10 simulations together with lzma and they averaged 300kb/sim in the compression. I don't recall seeing the same batch savings with gzip.

We could also investigate reducing the size of the simulations themselves. For instance, jquery takes up about 84kb (uncompressed) of each of our simulations. There is an issue to remove it phetsims/joist#537 There may be other ways to attain savings.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 21, 2019

Unlike gzip, lzma seems to benefit from compressing multiple simulations together.

I think this is not going to be beneficial for our situation. One of the requirements is that we download a single compressed sim and write it directly to disk without processing. They are then decompressed on demand. If we bundle them, at download time the app would need to decompress the bundle, then compress the sims individually and then write them to disk. This requires a lot of cpu processing, which severely impacts performance of the app and excessively drains the battery.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 21, 2019

We could also investigate reducing the size of the simulations themselves. For instance, jquery takes up about 84kb (uncompressed) of each of our simulations. There is an issue to remove it phetsims/joist#537 There may be other ways to attain savings.

This would certainly be beneficial, but lzma would still get better compression ratios than gzip.

@samreid
Copy link
Member

samreid commented Mar 21, 2019

Notes from today's meeting:

  • We would like a solution that works well for the android app. If android doesn't have good lzma support, that may lead us to using gzip.
  • We discussed the download time as 100 sims * 500kb/sim at 300kbps internet connection speed, the download time would be 2.5 minutes LZMA vs 3.0 minutes GZ.
  • What about using different compression levels for GZ? Maybe not helpful after level 6.

Some comments from the team:
SR - Does android support lzma? I’m not too worried about 100kb extra per sim. It also sounds like lzma takes more CPU and time, given that I'd prefer gzip
JG - Don't know enough to cast an informed vote
DB - Not enough know how on this matter. Defer to MP
MB - GZip ok
JO: Built-in gzip with Apache serverside and iOS/Android client-side sounds like enough
MP - +1 for gzip
MK - the build in gzip seems really nice.
CK - Would like more info but gzip seems okay≈
CM
ON
JB - Uninformed gut instinct: Gzip is more common and easy to support, and the difference isn’t that much, so go for that.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 21, 2019

@mbarlow12 - Can you please investigate XZ for Java, either the original package or the Apache commons wrapper?

@mattpen
Copy link
Contributor Author

mattpen commented Mar 21, 2019

I'll spend some more time investigating the npm issues for now.

@ariel-phet - Do you have any input on how we should proceed? To summarize, the benefit is a fairly significant savings in terms of bandwidth and storage on mobile devices. The drawback is that we already have a pretty good compression algorithm working (gzip) and this would require a significant amount of developer time.

@mattpen
Copy link
Contributor Author

mattpen commented Mar 22, 2019

I tested this package again this morning. It requires sh and is not compatible with windows and it looks like the developer isn't planning to add it: https://github.com/robey/node-xz/issues/8

@samreid
Copy link
Member

samreid commented Mar 22, 2019

Have you reviewed https://github.com/LZMA-JS/LZMA-JS ?

@mattpen
Copy link
Contributor Author

mattpen commented Mar 22, 2019

On windows, this package will work but requires all build machines to globally install node-pre-gyp and rimraf otherwise npm prune fails.

I tested this package on MacOS and RHEL as well and was able to npm install, prune, and update without error. However, it looks like npm prune is printing warnings to stderr, and this is causing the build server to think there has been an error. The build-server simply checks for any content in stderr after calling npm update and decides to abort the build if content is present. Maybe we should check stderr content for the (case-insensitive) presence of 'err'?

@mattpen
Copy link
Contributor Author

mattpen commented Mar 22, 2019

This is a pure javascript implementation with zero dependencies so npm operations should work beautifully. Install, prune, and update all work fine on Windows/MinGW, MacOS, and RHEL. Bad news, as far as I can tell it is broken. I tried various compression levels and the resulting artifact ends up 200kB LARGER than the original artifact.

$ du -sh build/phet/*_all_phet.html*
2.0M	build/phet/acid-base-solutions_all_phet.html
2.2M	build/phet/acid-base-solutions_all_phet.html.xz

@mattpen
Copy link
Contributor Author

mattpen commented Mar 22, 2019

No readme published on www.npmjs.com. Has not been updated in 3 years. Does not compile on MacOS.

@samreid
Copy link
Member

samreid commented Mar 22, 2019

I tried various compression levels and the resulting artifact ends up 200kB LARGER than the original artifact.

Maybe it was running decompression instead of compression somehow?

zepumph added a commit to phetsims/states-of-matter-basics that referenced this issue Apr 25, 2019
zepumph added a commit to phetsims/trig-tour that referenced this issue Apr 25, 2019
zepumph added a commit to phetsims/under-pressure that referenced this issue Apr 25, 2019
zepumph added a commit to phetsims/unit-rates that referenced this issue Apr 25, 2019
zepumph added a commit to phetsims/wave-interference that referenced this issue Apr 25, 2019
zepumph added a commit to phetsims/wave-on-a-string that referenced this issue Apr 25, 2019
mattpen added a commit that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/area-model-algebra that referenced this issue Apr 26, 2019
mattpen added a commit that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/area-model-decimals that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/area-model-introduction that referenced this issue Apr 26, 2019
mattpen added a commit that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/area-model-multiplication that referenced this issue Apr 26, 2019
mattpen added a commit that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/energy-forms-and-changes that referenced this issue Apr 26, 2019
mattpen added a commit that referenced this issue Apr 26, 2019
mattpen added a commit to phetsims/isotopes-and-atomic-mass that referenced this issue Apr 26, 2019
zepumph added a commit to phetsims/sherpa that referenced this issue Apr 29, 2019
@mattpen
Copy link
Contributor Author

mattpen commented May 6, 2019

This change was applied to all current release branches and deployed to production where necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants