How to minify/compress thousands of JS files - including some large ones - at the same time or sequentially without crashing the console? #2113

Closed
jslegers opened this issue Jun 16, 2017 · 12 comments


@jslegers

jslegers commented Jun 16, 2017

Context

In a demo I'm currently refactoring, I have a src folder that contains 196 MB of data. About 142 MB of that consists of two binary files.

About 2000 of the remaining 2137 files (about 46 MB) are JavaScript files. More specifically, these are official and complete distributions of the LuciadRia framework and the Dojo toolkit. The largest JavaScript file is about 23 MB. It is unminified code originally written in C++ and compiled with emscripten to asm.js, and it makes up the heart of the LuciadRia rendering engine.

I wanted to write a Node.js script that copies all of my files from the src path to the dist path and minifies every JS or CSS file it encounters along the way. Unfortunately, the number and/or size of JS files involved seems to break my script.


Let's go through the steps I took...

Step 1

I started by writing a small build script that copies all data from my src folder to my dist folder. I was surprised to learn that this process finishes in a matter of seconds.

Below is the code for this script. Note that you'll need Node 8 to run it.

const util = require('util');
const fs = require('fs');
const path = require('path');

const mkdir = util.promisify(require('mkdirp'));
const rmdir = util.promisify(require('rimraf'));
const ncp = util.promisify(require('ncp').ncp);
const readdir = util.promisify(fs.readdir);
const readFile = util.promisify(fs.readFile);
const writeFile = util.promisify(fs.writeFile);
const stat = util.promisify(fs.stat);

const moveFrom = path.join(__dirname,"../src");
const moveTo = path.join(__dirname,"../dist");

var copyFile = function(source, target) {
    return new Promise(function(resolve,reject){
        const rd = fs.createReadStream(source);
        rd.on('error', function(error){
            reject(error);
        });
        const wr = fs.createWriteStream(target);
        wr.on('error', function(error){
            reject(error);
        });
        wr.on('close', function(){
            resolve();
        });
        rd.pipe(wr);
    });
};

var copy = function(source, target) {
    stat(source)
    .then(function(stat){
        if(stat.isFile()) {
            console.log("Copying file %s", source);
            switch (path.extname(target)) {
                default:
                    return copyFile(source, target);
            }
        } else if( stat.isDirectory() ) {
            return build(source, target);
        }
    }).catch(function(error){
        console.error(error);
    });
};

var build = function(source, target) {
    readdir(source)
    .then(function(list) {
        return rmdir(target).then(function(){
            return list;
        });
    })
    .then(function(list) {
        return mkdir(target).then(function(){
            return list;
        });
    }).then(function(list) {
        list.forEach(function(item, index) {
            copy(path.join(source, item), path.join(target, item));
        });
    }).catch(function(error){
        console.error(error);
    })
};

build(moveFrom, moveTo);

Step 2

Next, I added CSS minification so that every CSS file is minified as it is encountered.

For that, I made the following modifications to my code.

First, I added this function:

var uglifyCSS = function(source, target) {
    readFile(source, "utf8")
    .then(function(content){
        return writeFile(target, require('ycssmin').cssmin(content), "utf8");
    }).catch(function(error){
        console.error(error);
    });
}

Then, I modified my copy function like this:

var copy = function(source, target) {
    stat(source)
    .then(function(stat){
        if(stat.isFile()) {
            console.log("Copying file %s", source);
            switch (path.extname(target)) {
            case ".css":
                return uglifyCSS(source, target);
            default:
                return copyFile(source, target);
            }
        } else if( stat.isDirectory() ) {
            return build(source, target);
        }
    }).catch(function(error){
        console.error(error);
    });
};

So far, so good. Everything still runs smoothly at this stage.

Step 3

Then, I did the same to minify my JS.

So again, I added a new function:

var uglifyJS = function(source, target) {
    readFile(source, "utf8")
    .then(function(content){
        return writeFile(target, require('uglify-js').minify(content).code, "utf8");
    }).catch(function(error){
        console.error(error);
    });
}

Then, I modified my copy function again:

var copy = function(source, target) {
    stat(source)
    .then(function(stat){
        if(stat.isFile()) {
            console.log("Copying file %s", source);
            switch (path.extname(target)) {
            case ".css":
                return uglifyCSS(source, target);
            case ".js":
                return uglifyJS(source, target);
            default:
                return copyFile(source, target);
            }
        } else if( stat.isDirectory() ) {
            return build(source, target);
        }
    }).catch(function(error){
        console.error(error);
    });
};

The problem

Here, things go wrong. As the process keeps encountering more and more JS files, it keeps slowing down until the process seems to stop completely.

It appears that too many parallel processes get started and keep consuming more and more memory until no more memory is left and the process just dies silently. I tried other minifiers besides UglifyJS, and I experienced the same issue for all of them. So the problem doesn't appear to be specific to UglifyJS.

Any ideas how to fix this issue?

This is the complete code:

const util = require('util');
const fs = require('fs');
const path = require('path');

const mkdir = util.promisify(require('mkdirp'));
const rmdir = util.promisify(require('rimraf'));
const ncp = util.promisify(require('ncp').ncp);
const readdir = util.promisify(fs.readdir);
const readFile = util.promisify(fs.readFile);
const writeFile = util.promisify(fs.writeFile);
const stat = util.promisify(fs.stat);

const moveFrom = path.join(__dirname,"../src");
const moveTo = path.join(__dirname,"../dist");

var copyFile = function(source, target) {
    return new Promise(function(resolve,reject){
        const rd = fs.createReadStream(source);
        rd.on('error', function(error){
            reject(error);
        });
        const wr = fs.createWriteStream(target);
        wr.on('error', function(error){
            reject(error);
        });
        wr.on('close', function(){
            resolve();
        });
        rd.pipe(wr);
    });
};

var uglifyCSS = function(source, target) {
    readFile(source, "utf8")
    .then(function(content){
        return writeFile(target, require('ycssmin').cssmin(content), "utf8");
    }).catch(function(error){
        console.error(error);
    });
}

var uglifyJS = function(source, target) {
    readFile(source, "utf8")
    .then(function(content){
        return writeFile(target, require('uglify-js').minify(content).code, "utf8");
    }).catch(function(error){
        console.error(error);
    });
}

var copy = function(source, target) {
    stat(source)
    .then(function(stat){
        if(stat.isFile()) {
            console.log("Copying file %s", source);
            switch (path.extname(target)) {
                case ".css":
                    return uglifyCSS(source, target);
                case ".js":
                    return uglifyJS(source, target);
                default:
                    return copyFile(source, target);
            }
        } else if( stat.isDirectory() ) {
            return build(source, target);
        }
    }).catch(function(error){
        console.error(error);
    });
};

var build = function(source, target) {
    readdir(source)
    .then(function(list) {
        return rmdir(target).then(function(){
            return list;
        });
    })
    .then(function(list) {
        return mkdir(target).then(function(){
            return list;
        });
    }).then(function(list) {
        list.forEach(function(item, index) {
            copy(path.join(source, item), path.join(target, item));
        });
    }).catch(function(error){
        console.error(error);
    })
};

build(moveFrom, moveTo);
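
One way to test the "too many files in flight at once" theory is to process entries strictly one after another. The following is a minimal sketch, assuming Node 8 async functions; `walkSequentially` and the `handleFile` callback are hypothetical names that are not part of the script above. It will not help if a single file is large enough to exhaust the heap on its own.

```javascript
// Sketch only: walk a directory tree and handle one entry at a time.
// `handleFile` is a hypothetical callback (copy, minify, ...) that must
// return a promise which settles when the file has been fully processed.
const fs = require('fs');
const path = require('path');
const util = require('util');

const readdir = util.promisify(fs.readdir);
const stat = util.promisify(fs.stat);

async function walkSequentially(dir, handleFile) {
    // Sort for a predictable processing order.
    const entries = (await readdir(dir)).sort();
    for (const name of entries) {
        const full = path.join(dir, name);
        const info = await stat(full);
        if (info.isDirectory()) {
            // Finish the whole subdirectory before moving on.
            await walkSequentially(full, handleFile);
        } else if (info.isFile()) {
            // Wait for this file before starting the next one.
            await handleFile(full, info);
        }
    }
}
```

`handleFile` could wrap functions such as `copyFile`, `uglifyCSS`, or `uglifyJS` from the script above, provided they return their promises. As posted, `uglifyCSS`, `uglifyJS`, `copy`, and `build` do not, so the caller cannot wait for them.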
@kzc
Contributor

kzc commented Jun 16, 2017

Since you're dealing with huge volumes of emscripten-generated asm.js code that cannot benefit from the uglify compress option, I'd recommend disabling compress and only mangling:

require('uglify-js').minify(content, {
    compress: false,
})

Uglify mangle accounts for 98% of the size reduction in minification anyway. And mangle does work on asm.js code.

Should that option not work due to the size of the inputs, you may be forced to disable both mangle and compress so that only whitespace removal is performed:

require('uglify-js').minify(content, {
    compress: false,
    mangle: false,
})
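
The two option sets above can be tried in order, falling back to the lighter one when the first fails. This is a minimal sketch under stated assumptions: `minifyWithFallback` is a hypothetical helper, and the `minify` function is passed in as an argument (with uglify-js@3 it would be `require('uglify-js').minify`, which reports failures through `result.error` rather than throwing). A fatal out-of-memory abort cannot be caught this way, since it terminates the whole process.

```javascript
// Sketch only: try "mangle only" first, then "whitespace removal only".
// `minify` is injected so the helper does not depend on an installed package.
function minifyWithFallback(minify, code) {
    const attempts = [
        { compress: false },                 // mangle only
        { compress: false, mangle: false }   // whitespace removal only
    ];
    let lastError = null;
    for (const options of attempts) {
        let result;
        try {
            result = minify(code, options);
        } catch (error) {
            // Covers implementations that throw instead of returning an error.
            lastError = error;
            continue;
        }
        // uglify-js@3 style: failures are reported on `result.error`.
        if (result && !result.error && typeof result.code === 'string') {
            return { code: result.code, options: options };
        }
        lastError = (result && result.error) || new Error('minify returned no code');
    }
    throw lastError;
}
```

The returned `options` field shows which configuration succeeded, which can be logged per file to see where the lighter setting was needed.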

@kzc
Contributor

kzc commented Jun 16, 2017

Something else you may try: google how to increase node stack size with: node --stack-size

Also be aware that uglify-js@3 uses less memory and takes less time to minify() than uglify-js@2.

@alexlamsl
Collaborator

Unless I misunderstand how promisify() works, everything should still run in a single process/thread.

So it looks to me like uglify-js is processing the files one at a time, and a single file among the many is crashing your process, assuming everything works if you skip uglify-js in your example code above.

Apologies if this is a dumb request, but would you mind sharing the JavaScript input files (presumably the ones created by emscripten) being fed into uglify-js? I do have a big enough box here to investigate this issue.

I know from experience that lib/parse.js would crash on some corner case input (JetStream's mandreel.js was problematic at one point), so having some real world examples to optimise/fix against would be very helpful.

@jslegers
Author

I've been playing around some more with it, and it looks like the 23MB file is the one causing the issue. If I remove that one from my src path, the process does complete, and it does so in less than five minutes.

I'm afraid I can't share that file, however, as it's not open source. I could get in major trouble with my employer.
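
Given that a single very large file appears to be the trigger, one possible workaround is to copy oversized JavaScript files as they are and minify only the rest. The following is a minimal sketch; `chooseAction` and the 5 MB threshold are hypothetical and would need tuning for a real project.

```javascript
// Sketch only: decide what to do with a file from its extension and size.
const path = require('path');

// Hypothetical threshold: .js files larger than this are copied unminified.
const MAX_MINIFY_BYTES = 5 * 1024 * 1024;

function chooseAction(filePath, sizeInBytes, maxMinifyBytes = MAX_MINIFY_BYTES) {
    const ext = path.extname(filePath).toLowerCase();
    if (ext === '.css') {
        return 'minify-css';
    }
    if (ext === '.js') {
        // Skip minification for files that are likely to exhaust the heap.
        return sizeInBytes > maxMinifyBytes ? 'copy' : 'minify-js';
    }
    return 'copy';
}
```

The size is already available from the `stat` call in the `copy` function (`stat.size`), so no extra file system access would be needed.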

@alexlamsl
Collaborator

@jslegers no worries. If you can somehow produce a skeleton test case, that would also help. Otherwise, I'm afraid there are too many possibilities for me to figure out where the potential problem is.

The suggestion by @kzc above would help to identify which stage is crashing, e.g. compress:false or mangle:false would skip lib/compress.js or lib/mangle.js respectively.

@kzc
Contributor

kzc commented Jun 16, 2017

@alexlamsl It's easy to find large emscripten projects or make one of your own from a C/C++ code base.

Here's a decent sized example: https://github.com/kripken/sql.js/

$ npm install sql.js

$ wc -c node_modules/sql.js/js/*.js
   21169 node_modules/sql.js/js/api.js
     130 node_modules/sql.js/js/shell-post.js
      76 node_modules/sql.js/js/shell-pre.js
 8725110 node_modules/sql.js/js/sql-debug.js
 2200500 node_modules/sql.js/js/sql-memory-growth.js
 2184948 node_modules/sql.js/js/sql.js
    1868 node_modules/sql.js/js/worker.js
 2186774 node_modules/sql.js/js/worker.sql.js
 15320575 total

@alexlamsl
Collaborator

@kzc individually, none of them seems to fail with -mc passes=3,unsafe,keep_fargs=0.

I took that directory, removed shell-*.js as they are incomplete/invalid JavaScript, then ran:

$ uglifyjs js/*.js -mc passes=3,unsafe,keep_fargs=0 -o min.js --stats

<--- Last few GCs --->

[644:0000023D5829F920]    87776 ms: Mark-sweep 1403.2 (1463.6) -> 1403.2 (1432.6) MB, 1102.5 / 0.0 ms  last resort
[644:0000023D5829F920]    88881 ms: Mark-sweep 1403.2 (1432.6) -> 1403.2 (1432.6) MB, 1104.5 / 0.0 ms  last resort


<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 00000282C84A9891 <JS Object>
    1: push(this=0000021472EC4FC9 <JS Array[50855]>)
    2: visit [000002B068302311 <undefined>:~3297] [pc=000000C114B258EB](this=0000025C3CF918F1 <a TreeWalker with map 00000099FF44BBB1>,node=00000293BB6B1909 <an AST_SymbolRef with map 00000099FF4526F1>,descend=000001361D388C11 <JS Function noop (SharedFunctionInfo 000000FB2CC42E59)>)
    3: _visit [000002B068302311 <undefined>:~1213] [pc=00...

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory

@kzc
Contributor

kzc commented Jun 16, 2017

Not surprising. Increase node's stack size or try it without mangle and/or compress.

@jslegers
Author

I'll need to do some further testing next week.
It's 9 pm over here, so it's time to start my weekend!

@alexlamsl
Collaborator

This works:

$ node --max-old-space-size=4096 bin/uglifyjs js/*.js -mc passes=3,unsafe,keep_fargs=0 -o min.js --timings
- parse: 14.250s
- scope: 15.640s
- compress: 43.830s
- mangle: 62.486s
- properties: 0.000s
- output: 15.297s
- total: 151.503s

So this is a heap rather than stack issue - and without any further specifics, I think the V8 option would be the best solution.
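
To check the heap limit a build script is actually running with, Node's built-in `v8` module can be queried. This is a minimal sketch; `heapLimitMB` and `nodeArgsWithHeap` are hypothetical helper names, and `build.js` in the usage note is a placeholder for your own script.

```javascript
// Sketch only: inspect the current V8 heap limit and build the arguments
// needed to re-run a script with a larger old-space limit.
const v8 = require('v8');

// Current V8 heap size limit for this process, in megabytes.
function heapLimitMB() {
    return Math.round(v8.getHeapStatistics().heap_size_limit / (1024 * 1024));
}

// Argument list for launching `script` with --max-old-space-size set.
function nodeArgsWithHeap(megabytes, script, scriptArgs) {
    return ['--max-old-space-size=' + megabytes, script].concat(scriptArgs || []);
}
```

For example, `require('child_process').spawn(process.execPath, nodeArgsWithHeap(4096, 'build.js'), { stdio: 'inherit' })` would start the build in a child process with a 4096 MB old-space limit, mirroring the command line above.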

@kzc
Contributor

kzc commented Jun 16, 2017

Okay, it's a node issue. Not much we can do on the uglify side given the file sizes and the memory it'd take to hold the AST.

@alexlamsl
Collaborator

@kzc agreed.

@jslegers I'll close this out for now - will re-open if there is a different error from #2113 (comment)
