Added Go and updates #15

davidaf3 · 2022-07-08T15:58:04Z

Hi!

I have added a Go implementation and updated my old Java and C solutions. I ran the benchmark (on WSL) and these are the results I got:

Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,439
java-Main,0,1189
java-Main,0,1870
java-Main,0,3220
java-Main,0,9520
java-Main,0,13760
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,371
java-MainFaster,0,742
java-MainFaster,0,1775
java-MainFaster,0,3197
java-MainFaster,0,1007
java-MainFaster,0,1541
===> ./lisp-phone-encoder
./lisp-phone-encoder,0,1488
./lisp-phone-encoder,0,2707
./lisp-phone-encoder,0,7547
./lisp-phone-encoder,0,13030
./lisp-phone-encoder,0,27022
./lisp-phone-encoder,0,44292
===> ./rust
./rust,0,104
./rust,0,419
./rust,0,1693
./rust,0,3203
./rust,0,8665
./rust,0,13972
===> ./phone_encoder_c
./phone_encoder_c,0,130
./phone_encoder_c,0,254
./phone_encoder_c,0,1053
./phone_encoder_c,0,1551
./phone_encoder_c,0,490
./phone_encoder_c,0,908
===> ./phone_encoder_go
./phone_encoder_go,0,248
./phone_encoder_go,0,512
./phone_encoder_go,0,1549
./phone_encoder_go,0,2813
./phone_encoder_go,0,970
./phone_encoder_go,0,1755

renatoathaydes · 2022-07-08T16:11:20Z

I was looking at the solution you wrote in Java, and it seems really similar to what I had done... I couldn't figure out why your solution is faster... can you explain the difference?

davidaf3 · 2022-07-08T19:39:25Z

The main difference is that I group words by length when building a solution. For example, suppose that the phone number 123456 can be encoded by both "bar foo" and "baz foo". First I find that 123 can be encoded by bar and baz. Then, instead of trying to encode the rest of the number from bar and then from baz, I group the two words and encode 456 only once.

This is done by storing the partial solutions in an irregular matrix, where each row contains a list of words with the same length. To print the encodings I have to recursively build each possible solution using the words from the matrix, but to find the solution count I only need to multiply the lengths of the rows. That's why it saves so much time when only counting encodings.

Also, my solution saves the most steps when a lot of words of the same length can be grouped. The English dictionary contains a lot of short words, so the probability of being able to group them is high. That's another reason why my solution saves a lot of time when counting.
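
To make the grouping idea concrete, here is a minimal Python sketch. It is not the code from this PR: the toy dictionary `WORDS_BY_DIGITS` and the function names are made up for illustration, and it ignores the rule of the original problem that allows a single digit where no word fits. Each way of splitting the number gives one irregular matrix, where a row holds all the words that encode the same digits. The count is the product of the row lengths, summed over all the splits.

```python
from itertools import product
from math import prod

# Hypothetical toy dictionary: digit string -> every word that encodes it.
# Words in the same list have the same length and cover the same digits.
WORDS_BY_DIGITS = {
    "123": ["bar", "baz"],
    "456": ["foo"],
    "12": ["ab"],
    "3456": ["cdef", "cxyz"],
}


def segmentations(digits, start=0):
    """Yield every split of digits[start:] as a list of rows (groups of words)."""
    if start == len(digits):
        yield []
        return
    for end in range(start + 1, len(digits) + 1):
        row = WORDS_BY_DIGITS.get(digits[start:end])
        if row:
            # The rest of the number is encoded once per group, not once per word.
            for rest in segmentations(digits, end):
                yield [row] + rest


def count_encodings(digits):
    """Count encodings by multiplying row lengths instead of building each one."""
    return sum(prod(len(row) for row in rows) for rows in segmentations(digits))


def all_encodings(digits):
    """Build every encoding by picking one word from each row."""
    return [
        " ".join(words)
        for rows in segmentations(digits)
        for words in product(*rows)
    ]


print(count_encodings("123456"))  # 4
print(all_encodings("123456"))
```

With this toy dictionary there are two splits, 12|3456 and 123|456, so the count is 1 * 2 + 2 * 1 = 4 without ever building a string. Printing still has to expand every row combination, which is why the saving is largest in the count case.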

renatoathaydes · 2022-07-08T20:39:11Z

Ah! It didn't even cross my mind to optimise for the "count" case!!

The only reason I introduced that case was to remove the IO overhead because the programs run so fast that after around 100,000 phone numbers, they spend nearly the whole time writing to stdout.

I hadn't really comprehended that you had optimised for count specifically, but that makes a lot of sense now that you explain it.

Even though that's really smart, I have to say that, for the purposes of benchmarking the languages' ability to run the actual algorithm to find all solutions, that would probably count as "cheating" (not saying you're cheating, just that we're not comparing solutions to the original problem anymore), no offense! Anyway, it's really cool to see just how fast the programs can get when you apply intelligent optimisations like that, and thank you very much for the effort.

I ported the original Lisp solution to Zig! I have some work to do to get it working for long inputs, and there are lots of ways to optimise Zig programs, so I am hoping to do that before pushing a PR tomorrow with the results. I will then compare that against the Go/C/Java solutions you've given (the "print" variant at least)... I might even implement the Trie-based solution which apparently is always faster than the arithmetic one (not sure about that yet, that's actually one of the things I want to find out! The Rust solution removed arithmetic because it could do the same kind of thing by just storing bytes in the hashmap keys, which I am planning to do also in Zig).
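
For readers who have not seen the two key styles, here is a rough Python sketch of what "arithmetic" versus "storing bytes in the hashmap keys" could look like. It is an illustration under assumptions, not code from any of the solutions: the function names are hypothetical, the letter-to-digit table is the one I recall from the original problem statement (worth checking against it), and starting the integer at 1 to keep leading zeros is my assumption about how an arithmetic key would be built.

```python
# Letter-to-digit table of the phone encoding task (assumed here, verify
# against the problem statement): the index in the list is the digit.
GROUPS = ["e", "jnq", "rwx", "dsy", "ft", "am", "civ", "bku", "lop", "ghz"]
DIGIT_OF = {ch: digit for digit, letters in enumerate(GROUPS) for ch in letters}


def arithmetic_key(word):
    """Integer key built with multiply-and-add.

    Starts at 1 (an assumption of this sketch) so that words whose
    first digit is 0 do not collide with shorter words.
    """
    n = 1
    for ch in word.lower():
        if ch in DIGIT_OF:
            n = n * 10 + DIGIT_OF[ch]
    return n


def bytes_key(word):
    """Key made of the raw digit values, with no arithmetic at all."""
    return bytes(DIGIT_OF[ch] for ch in word.lower() if ch in DIGIT_OF)


# Both kinds of key can index a dictionary of words in the same way.
by_number = {arithmetic_key("bar"): ["bar"]}
by_bytes = {bytes_key("bar"): ["bar"]}
```

Either key identifies the same digit sequence. The bytes version avoids big-integer multiplication for long words, which is the kind of saving the comment above is hoping for.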

davidaf3 · 2022-07-08T23:04:33Z

I actually wrote the solution before the count option was in the benchmark. When it was added I figured I could just multiply the lengths of the rows instead of counting each solution one by one, haha.

I made a quick change to the Java code (MainFaster) to count each solution individually after generating it, and the runtimes are still a little bit faster in most cases:

===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,419
java-Main,0,893
java-Main,0,1927
java-Main,0,3289
java-Main,0,7390
java-Main,0,13549
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,348
java-MainFaster,0,780
java-MainFaster,0,1749
java-MainFaster,0,3051
java-MainFaster,0,7935
java-MainFaster,0,11133

I guess the results really depend on the input generated. Sometimes it will benefit from grouping and sometimes not.

I will be out for the rest of the month so I won't be able to update the other languages until August.

davidaf3 · 2022-08-04T14:00:22Z

I have updated all the implementations to count the solutions one by one. I have also optimized the code that generates the solutions and now it runs way faster.

These are the results that I got after running the benchmark again:

Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,429
java-Main,0,1225
java-Main,0,1929
java-Main,0,3254
java-Main,0,8380
java-Main,0,16723
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,380
java-MainFaster,0,858
java-MainFaster,0,1847
java-MainFaster,0,2965
java-MainFaster,0,8491
java-MainFaster,0,16004
===> ./lisp-phone-encoder
./lisp-phone-encoder,0,1632
./lisp-phone-encoder,0,3133
./lisp-phone-encoder,0,8733
./lisp-phone-encoder,0,16182
./lisp-phone-encoder,0,29716
./lisp-phone-encoder,0,53965
===> ./rust
./rust,0,109
./rust,0,422
./rust,0,1669
./rust,0,3230
./rust,0,8530
./rust,0,16006
===> ./phone_encoder_c
./phone_encoder_c,0,140
./phone_encoder_c,0,249
./phone_encoder_c,0,769
./phone_encoder_c,0,1408
./phone_encoder_c,0,3917
./phone_encoder_c,0,8076
===> ./phone_encoder_go
./phone_encoder_go,0,231
./phone_encoder_go,0,492
./phone_encoder_go,0,1558
./phone_encoder_go,0,2795
./phone_encoder_go,0,6881
./phone_encoder_go,0,12425
===> ./encoder_zig
./encoder_zig,0,72
./encoder_zig,0,545
./encoder_zig,0,2484
./encoder_zig,0,4669
./encoder_zig,0,10167
./encoder_zig,0,19825

renatoathaydes · 2022-08-23T19:00:33Z

Oh nice, thanks for posting the new results! Looks interesting.

davidaf3 added 4 commits July 8, 2022 17:20

C optimizations and fixes

fa2088f

Java update

ac7a054

Added go

7b39b7c

Updated benchmark

4d0a076

davidaf3 changed the base branch from master to davidaf3 July 8, 2022 15:59

davidaf3 changed the title ~~Davidaf3~~ Added Go and updates Jul 8, 2022

davidaf3 marked this pull request as ready for review July 8, 2022 16:01

davidaf3 mentioned this pull request Jul 8, 2022

Faster Java and added C #12

Closed

davidaf3 added 5 commits August 2, 2022 16:17

Updated solutions

bc1c361

Solved benchmark conflicts

04639de

Merge remote-tracking branch 'upstream/davidaf3' into davidaf3

22208e4

Fixed benchmark

c10c609

Added another color to the chart

ab384df
