Have you seen the "JavaScript Is Weird (EXTREME EDITION)" video?
This one, in case you're wondering "which one of them?".
It's a ~22-minute video that abuses type coercion issues in JavaScript
to create a prototype of a JavaScript transpiler that outputs valid code
consisting of only the symbols ( ) = > { } [ + ] ! / -. As an
example, this "Hello, world!" program in JavaScript:
console.log("Hello world!");
Gets converted to this other valid JavaScript program, which produces the same result but whose source is really weird. Here are the first few bytes of it:
(()=>{})[({}+[])[+!![] + +!![] + +!![] + +!![] + +!![]]+({}+[])[+!![]]+(
(+!![]/+[])+[])[+!![] + +!![] + +!![] + +!![]]+(![]+[])[+!![] + +!![] +
+!![]]+({}+[])[+!![] + +!![] + +!![] + +!![] + +!![] + +!![]]+(!![]+[])[
+!![]]+(!![]+[])[+!![] + +!![]]+({}+[])[+!![] + +!![] + +!![] + +!![] +
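To see why that gibberish is valid code, here's a small sketch that decodes the first few pieces of the snippet above by hand. The variable names are made up for illustration; only the expressions come from the output.

```javascript
// Building blocks used by the snippet above.
const one = +!![];                        // [] is truthy, so !![] is true, and unary + gives 1
const objectText = {} + [];               // "[object Object]"
const trueText = !![] + [];               // "true"
const falseText = ![] + [];               // "false"
const infinityText = (+!![] / +[]) + [];  // 1 / 0 is Infinity, so "Infinity"

// Indexing into those strings yields single characters.
const letters = [
  objectText[one + one + one + one + one],        // "c"
  objectText[one],                                // "o"
  infinityText[one + one + one + one],            // "n"
  falseText[one + one + one],                     // "s"
  objectText[one + one + one + one + one + one],  // "t"
  trueText[one],                                  // "r"
  trueText[one + one],                            // "u"
];

console.log(letters.join('')); // prints constru

// The output starts with (()=>{})[...]. Once the word is complete, that is
// (()=>{})["constructor"], which is the Function constructor.
console.log((() => {})['constructor'] === Function); // prints true
```

So the whole program is built by spelling strings one character at a time and handing them to Function.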
If you haven't seen the video yet, go watch it - this post will be waiting right here.
So a few months ago I watched the video and had a weird idea. If the resulting files are basically a different encoding of the same source code, but using a representation that uses fewer symbols, is there a chance that GZIP'ing those files would result in smaller files than GZIP'ing the original source files? Exchanging GZIP'ed files over the wire has been standard practice for web browsers for a while now, so if the hypothesis (a really weak one, yes) worked, this weird idea might become something useful-ish.
And if this worked, how would the compression compare to JavaScript minifiers? And what about runtime performance? Maybe there were some scenarios in which it's OK to trade some runtime performance for smaller file sizes.
First of all, I downloaded the video's linked repo with the sample code and ran it locally, because knowing that some Node.js code worked a year ago says very little about that same code working now - and I didn't even know if it had worked at all.
$ git clone https://github.com/lowbyteproductions/JavaScript-Is-Weird
Cloning into 'JavaScript-Is-Weird'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 4 (delta 0), reused 0 (delta 0), pack-reused 1
Receiving objects: 100% (4/4), done.
$ cd JavaScript-Is-Weird/
$ node index.js
(()=>{})[({}+[])[+!![] + +!![] + +!![] + +!![] + +!![]]+({}+[])[+!![]]+(
... snip ...
$
Inspecting the code, I found a hard-coded call to the compile function
with the console.log("Hello, world!"); code in it. So I quickly
changed the program to receive the name of the source file as a command
line argument - by asking ChatGPT for that trivial piece of code.
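The change amounts to something like the sketch below. It is not the exact code used here: the compile name comes from the repo, but the stand-in body and the compileFile helper are assumptions made up for illustration.

```javascript
const fs = require('fs');

// Hypothetical stand-in: the real compile function lives in the repo and
// returns the "weird" encoding of its input. This one returns it unchanged.
function compile(source) {
  return source;
}

// Read a source file and hand its contents to the compiler.
function compileFile(path, compileFn = compile) {
  const source = fs.readFileSync(path, 'utf8');
  return compileFn(source);
}

// Usage: node weird.js input.js > input.weird.js
const inputPath = process.argv[2];
if (inputPath) {
  process.stdout.write(compileFile(inputPath));
}
```

Writing to stdout keeps the script composable with shell redirection, which is how it is used in the commands that follow.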
I ran the program against itself, and checked that it kinda worked -
it was producing some output, but I couldn't run that output in turn,
because the transpiler wraps the source code in an anonymous function
call that doesn't support require calls. But, hey - we have some output!
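A likely explanation for the require problem: the output starts with (()=>{})[...], which spells out (()=>{})["constructor"], i.e. the Function constructor. Code compiled that way only sees the global scope, while in a CommonJS module require is a local supplied by Node's module wrapper. The sketch below shows the same effect with an ordinary local variable; the localValue and scopeSeenByFunctionConstructor names are made up for the demo.

```javascript
function scopeSeenByFunctionConstructor() {
  const localValue = 'visible only inside this function';

  // A regular closure can see the enclosing local variable.
  const closure = () => typeof localValue;

  // (()=>{})['constructor'] is Function. Code compiled through it runs
  // against the global scope and cannot see enclosing locals.
  const compiled = (() => {})['constructor']('return typeof localValue');

  return { closure: closure(), compiled: compiled() };
}

console.log(scopeSeenByFunctionConstructor());
// prints { closure: 'string', compiled: 'undefined' }
```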
$ node weird.js weird.js > weird.weird.js
$ gzip < weird.weird.js > weird.weird.js.gzip
$ ls -l weird.*
-rw-r--r-- 1 mgarcia staff 2065 Oct 16 01:28 weird.js
-rw-r--r-- 1 mgarcia staff 7702811 Oct 16 02:44 weird.weird.js
-rw-r--r-- 1 mgarcia staff 85691 Oct 16 02:45 weird.weird.js.gzip
(Yes, I've also renamed the script to weird.js and adopted the
.weird sub-extension since it seemed appropriate and YOLO.)
OK, so we are kind of testing the hypothesis already - that file doesn't feel smaller than the original one. And let's not even talk about minification or gzipping the source file.
$ npx minify weird.js > weird.min.js
$ gzip < weird.js > weird.js.gzip
$ ls -l weird.*
-rw-r--r-- 1 mgarcia staff 2065 Oct 16 01:28 weird.js
-rw-r--r-- 1 mgarcia staff 756 Oct 16 02:49 weird.js.gzip
-rw-r--r-- 1 mgarcia staff 1242 Oct 16 02:48 weird.min.js
-rw-r--r-- 1 mgarcia staff 7702811 Oct 16 02:44 weird.weird.js
-rw-r--r-- 1 mgarcia staff 85691 Oct 16 02:45 weird.weird.js.gzip
So the original file is 2065 bytes long, the weird version is ~7.7MB, and the weird+GZIP version is ~85KB (41.5x the size) - while the minified version is 0.6x the size, and the plain GZIP version is 0.36x.
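The same comparison can be scripted instead of eyeballing ls -l. This is a minimal sketch using Node's built-in zlib module; the sizeReport helper and its field names are made up for this post, and its byte counts may differ slightly from the gzip command's.

```javascript
const zlib = require('zlib');

// Sizes in bytes of a source string, raw and gzipped, plus the gzip/raw ratio.
function sizeReport(source) {
  const raw = Buffer.byteLength(source, 'utf8');
  const gzipped = zlib.gzipSync(source).length;
  return { raw, gzipped, ratio: gzipped / raw };
}

// Ratio of the weird+GZIP file to the original, using the sizes listed above.
console.log((85691 / 2065).toFixed(1)); // prints 41.5

// Example with an arbitrary, very repetitive input.
console.log(sizeReport('console.log("Hello, world!");\n'.repeat(100)));
```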
I didn't even care to check the runtime performance, since there are no benefits to trade off.
So maybe this one script was weird enough that it didn't compress that
well? I had to try with some other files, so I got a few examples of
JavaScript files that weren't too long - past a certain file length, an
Array.join in the transpiler fails with a RangeError, so no
jQuery for our tests.
You can check the files in the sample/ directory. They
all show similar, awful results:
$ ls -l sample/
total 683064
-rw-r--r-- 1 mgarcia staff 19648 Oct 16 01:39 dommy-2.0.js
-rw-r--r-- 1 mgarcia staff 5379 Oct 16 02:57 dommy-2.0.js.gz
-rw-r--r-- 1 mgarcia staff 8762 Oct 16 01:40 dommy-2.0.min.js
-rw-r--r-- 1 mgarcia staff 51015234 Oct 16 01:40 dommy-2.0.weird.js
-rw-r--r-- 1 mgarcia staff 587229 Oct 16 01:42 dommy-2.0.weird.js.gz
-rw-r--r-- 1 mgarcia staff 115023 Oct 16 01:50 lodash-4.17.15.js
-rw-r--r-- 1 mgarcia staff 24398 Oct 16 02:57 lodash-4.17.15.js.gz
-rw-r--r-- 1 mgarcia staff 13542 Oct 16 01:50 lodash-4.17.15.min.js
-rw-r--r-- 1 mgarcia staff 280028085 Oct 16 01:50 lodash-4.17.15.weird.js
-rw-r--r-- 1 mgarcia staff 3226049 Oct 16 01:51 lodash-4.17.15.weird.js.gz
-rw-r--r--@ 1 mgarcia staff 7114 Oct 16 01:44 modernizr-custom.js
-rw-r--r-- 1 mgarcia staff 2689 Oct 16 02:57 modernizr-custom.js.gz
-rw-r--r-- 1 mgarcia staff 1296 Oct 16 01:45 modernizr-custom.min.js
-rw-r--r-- 1 mgarcia staff 14293251 Oct 16 01:45 modernizr-custom.weird.js
-rw-r--r-- 1 mgarcia staff 168443 Oct 16 01:45 modernizr-custom.weird.js.gz
Here's a table (sizes in bytes, with the ratio to the source size in parentheses):
File | Source | Weird | Weird.gz | Minified | Source.gz |
---|---|---|---|---|---|
Dommy | 19648 | 51015234 (2596x) | 587229 (29x) | 8762 (0.44x) | 5379 (0.27x) |
Lodash | 115023 | 280028085 (2434x) | 3226049 (28x) | 13542 (0.11x) | 24398 (0.21x) |
Modernizr | 7114 | 14293251 (2009x) | 168443 (23x) | 1296 (0.18x) | 2689 (0.37x) |
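The roughly 2000x to 2600x expansion in the table also gives a rough idea of where the length limit mentioned earlier kicks in. The sketch below assumes the failure comes from V8's maximum string length, which Node exposes as buffer.constants.MAX_STRING_LENGTH; the maxSourceBytes helper and the 2400x factor (roughly the Lodash row) are back-of-the-envelope choices, not something taken from the transpiler.

```javascript
const { constants } = require('buffer');

// Largest source size (in bytes) whose weird output would still fit in a
// single JavaScript string, assuming a fixed expansion factor.
function maxSourceBytes(expansionFactor, maxStringLength = constants.MAX_STRING_LENGTH) {
  return Math.floor(maxStringLength / expansionFactor);
}

console.log(maxSourceBytes(2400));
```

The exact number depends on the Node build, but with a string limit in the hundreds of megabytes it comes out at a few hundred kilobytes of source at most, which is consistent with larger libraries being out of reach.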
So, yeah - this isn't a good idea. If the Weird transpiler only replaces each character with a really weird, much longer equivalent, it makes a lot of sense that the result doesn't compress better than the source - the best-case scenario would be compressing down to the same size.
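To make that argument concrete, here's a toy experiment. It is not the video's transpiler: toyEncode is a made-up stand-in that replaces each character with a long +!![] expression evaluating to its character code, which is enough to mimic "same information, far more bytes".

```javascript
const zlib = require('zlib');

// Toy stand-in for the weird encoding: each character becomes a parenthesized
// run of "+!![]" terms whose sum is the character code.
function toyEncode(source) {
  return Array.from(source, (ch) => `(${'+!![]'.repeat(ch.codePointAt(0))})`).join('+');
}

const source = 'console.log("Hello, world!");';
const encoded = toyEncode(source);

// Compare raw and gzipped sizes of the original and the toy encoding.
console.log({
  source: source.length,
  encoded: encoded.length,
  sourceGz: zlib.gzipSync(source).length,
  encodedGz: zlib.gzipSync(encoded).length,
});
```

The encoded string carries no information the original lacks, so the compressor has nothing extra to exploit; it can only spend bytes undoing the expansion.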
I really didn't expect to get great results out of this, but on the other hand it was a nice opportunity to satisfy some curiosity - while pushing a bizarre, extreme Internet semi-joke a little bit further still. And no kittens were hurt during the making of it, so...