Added Go and updates #15

Open
wants to merge 9 commits into
base: davidaf3

@davidaf3 davidaf3 commented Jul 8, 2022

Hi!

I have added a Go implementation and updated my old Java and C solutions. I ran the benchmark (on WSL) and these are the results I got:

Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,439
java-Main,0,1189
java-Main,0,1870
java-Main,0,3220
java-Main,0,9520
java-Main,0,13760
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,371
java-MainFaster,0,742
java-MainFaster,0,1775
java-MainFaster,0,3197
java-MainFaster,0,1007
java-MainFaster,0,1541
===> ./lisp-phone-encoder
./lisp-phone-encoder,0,1488
./lisp-phone-encoder,0,2707
./lisp-phone-encoder,0,7547
./lisp-phone-encoder,0,13030
./lisp-phone-encoder,0,27022
./lisp-phone-encoder,0,44292
===> ./rust
./rust,0,104
./rust,0,419
./rust,0,1693
./rust,0,3203
./rust,0,8665
./rust,0,13972
===> ./phone_encoder_c
./phone_encoder_c,0,130
./phone_encoder_c,0,254
./phone_encoder_c,0,1053
./phone_encoder_c,0,1551
./phone_encoder_c,0,490
./phone_encoder_c,0,908
===> ./phone_encoder_go
./phone_encoder_go,0,248
./phone_encoder_go,0,512
./phone_encoder_go,0,1549
./phone_encoder_go,0,2813
./phone_encoder_go,0,970
./phone_encoder_go,0,1755

[Image: benchmark-result]

@davidaf3 davidaf3 changed the base branch from master to davidaf3 July 8, 2022 15:59
@davidaf3 davidaf3 changed the title Davidaf3 Added Go and updates Jul 8, 2022
@davidaf3 davidaf3 marked this pull request as ready for review July 8, 2022 16:01
@davidaf3 davidaf3 mentioned this pull request Jul 8, 2022
@renatoathaydes
Owner

I was looking at the solution you wrote in Java, and it seems really similar to what I had done... I couldn't figure out why your solution is faster... can you explain the difference?

@davidaf3
Author

davidaf3 commented Jul 8, 2022

The main difference is that I group words by length when building a solution. For example, suppose that the phone number 123456 can be encoded by both "bar foo" and "baz foo". First I find that 123 can be encoded by bar and baz. Then, instead of trying to encode the rest of the number from bar and then from baz, I group the two words and encode 456 only once.

This is done by storing the partial solutions in an irregular matrix, where each row contains a list of words with the same length. To print the encodings I have to recursively build each possible solution using the words from the matrix, but to find the solution count I only need to multiply the lengths of the rows. That's why it saves so much time when only counting encodings.

Also, my solution saves the most steps when a lot of words of the same length can be grouped. The English dictionary contains a lot of short words, so the probability of being able to group them is high. That's another reason why my solution saves a lot of time when counting.
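
A minimal sketch of the counting idea described above, not the code from this PR. It assumes a partial solution is held as a jagged `List<List<String>>` in which each row lists the interchangeable words for one stretch of digits. The class name `GroupedCount` and the method `countEncodings` are hypothetical.

```java
import java.util.List;

public class GroupedCount {
    // Each row holds the alternative words for one stretch of digits.
    // Any word of a row can be combined with any word of every other row,
    // so the number of encodings is the product of the row sizes.
    static long countEncodings(List<List<String>> rows) {
        long count = 1;
        for (List<String> row : rows) {
            count *= row.size();
        }
        return count;
    }

    public static void main(String[] args) {
        // 123 -> {bar, baz}, 456 -> {foo}, as in the example above.
        List<List<String>> rows = List.of(
                List.of("bar", "baz"),
                List.of("foo"));
        System.out.println(countEncodings(rows)); // prints 2
    }
}
```

With this layout the count costs one multiplication per row, regardless of how many full encodings the rows expand to.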

@renatoathaydes
Owner

Ah! It didn't even cross my mind to optimise for the "count" case!!

The only reason I introduced that case was to remove the IO overhead because the programs run so fast that after around 100,000 phone numbers, they spend nearly the whole time writing to stdout.

I hadn't really comprehended that you had optimised for count specifically, but that makes a lot of sense now that you explain it.

Even though that's really smart, I have to say that, for the purposes of benchmarking the languages' ability to run the actual algorithm to find all solutions, that would probably count as "cheating" (not saying you're cheating, just that we're not comparing solutions to the original problem anymore), no offense! Anyway, it's really cool to see just how fast the programs can get when you apply intelligent optimisations like that, and thank you very much for the effort.

I ported the original Lisp solution to Zig! I have some work to do to get it working for long inputs, and there are lots of ways to optimise Zig programs, so I am hoping to do that before pushing a PR tomorrow with the results. I will then compare that against the Go/C/Java solutions you've given (the "print" variant at least)... I might even implement the Trie-based solution which apparently is always faster than the arithmetic one (not sure about that yet, that's actually one of the things I want to find out! The Rust solution removed arithmetic because it could do the same kind of thing by just storing bytes in the hashmap keys, which I am planning to do also in Zig).
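
To make the "arithmetic" versus "bytes as keys" distinction concrete, here is a rough sketch of the two kinds of hashmap key. It is an illustration only: the class `DigitKeys`, the method `arithmeticKey`, and the choice to start the number at 1 so that leading zeros survive are assumptions, not taken from the Lisp, Rust, or Zig code. Java's `String` stands in for a byte sequence, since a raw `byte[]` is not compared by content when used as a `HashMap` key.

```java
import java.math.BigInteger;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DigitKeys {
    // Arithmetic key: fold the digits into one number.
    // Starting from 1 (an assumption) keeps "0123" and "123" distinct.
    static BigInteger arithmeticKey(String digits) {
        BigInteger n = BigInteger.ONE;
        for (int i = 0; i < digits.length(); i++) {
            int d = digits.charAt(i) - '0';
            n = n.multiply(BigInteger.TEN).add(BigInteger.valueOf(d));
        }
        return n;
    }

    public static void main(String[] args) {
        // Variant 1: dictionary keyed by a number computed from the digits.
        Map<BigInteger, List<String>> byNumber = new HashMap<>();
        byNumber.put(arithmeticKey("0123"), List.of("hypothetical-word"));

        // Variant 2: dictionary keyed by the digit sequence itself, no arithmetic.
        Map<String, List<String>> byDigits = new HashMap<>();
        byDigits.put("0123", List.of("hypothetical-word"));

        System.out.println(arithmeticKey("0123")); // prints 10123
        System.out.println(arithmeticKey("123"));  // prints 1123
        System.out.println(byDigits.containsKey("0123")); // prints true
    }
}
```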

@davidaf3
Author

davidaf3 commented Jul 8, 2022

I actually wrote the solution before the count option was in the benchmark. When it was added I figured I could just multiply the lengths of the rows instead of counting each solution one by one, haha.
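
For contrast with multiplying row lengths, counting one by one means actually walking every combination. The sketch below is an assumption about the general shape, not the code in MainFaster: it reuses the jagged-matrix layout (one row of alternative words per stretch of digits) and the hypothetical names `EnumerateEncodings`, `build`, and `allEncodings`.

```java
import java.util.ArrayList;
import java.util.List;

public class EnumerateEncodings {
    // Recursively pick one word from each row, emitting a full encoding
    // whenever every row has contributed a word.
    static void build(List<List<String>> rows, int rowIndex,
                      List<String> current, List<String> out) {
        if (rowIndex == rows.size()) {
            out.add(String.join(" ", current));
            return;
        }
        for (String word : rows.get(rowIndex)) {
            current.add(word);
            build(rows, rowIndex + 1, current, out);
            current.remove(current.size() - 1); // backtrack
        }
    }

    static List<String> allEncodings(List<List<String>> rows) {
        List<String> out = new ArrayList<>();
        build(rows, 0, new ArrayList<>(), out);
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("bar", "baz"),
                List.of("foo"));
        List<String> all = allEncodings(rows);
        all.forEach(System.out::println); // prints "bar foo" then "baz foo"
        System.out.println(all.size());   // prints 2
    }
}
```

The work here grows with the number of encodings rather than the number of rows, which is why the two counting strategies can differ so much in runtime.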

I did a quick change to the Java code (MainFaster) to count each solution individually after generating it, and the runtimes are still a little bit faster in most cases:

===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,419
java-Main,0,893
java-Main,0,1927
java-Main,0,3289
java-Main,0,7390
java-Main,0,13549
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,348
java-MainFaster,0,780
java-MainFaster,0,1749
java-MainFaster,0,3051
java-MainFaster,0,7935
java-MainFaster,0,11133

I guess the results really depend on the input generated. Sometimes it will benefit from grouping and sometimes not.

I will be out the rest of the month so I won't be able to update the other languages until August.

@davidaf3
Author

davidaf3 commented Aug 4, 2022

I have updated all the implementations to count the solutions one by one. I have also optimized the code that generates the solutions and now it runs way faster.

These are the results that I got after running the benchmark again:

Proc,Run,Memory(bytes),Time(ms)
===> java -Xms20M -Xmx100M -cp build/java Main
java-Main,0,429
java-Main,0,1225
java-Main,0,1929
java-Main,0,3254
java-Main,0,8380
java-Main,0,16723
===> java -Xms20M -Xmx100M -cp build/java MainFaster
java-MainFaster,0,380
java-MainFaster,0,858
java-MainFaster,0,1847
java-MainFaster,0,2965
java-MainFaster,0,8491
java-MainFaster,0,16004
===> ./lisp-phone-encoder
./lisp-phone-encoder,0,1632
./lisp-phone-encoder,0,3133
./lisp-phone-encoder,0,8733
./lisp-phone-encoder,0,16182
./lisp-phone-encoder,0,29716
./lisp-phone-encoder,0,53965
===> ./rust
./rust,0,109
./rust,0,422
./rust,0,1669
./rust,0,3230
./rust,0,8530
./rust,0,16006
===> ./phone_encoder_c
./phone_encoder_c,0,140
./phone_encoder_c,0,249
./phone_encoder_c,0,769
./phone_encoder_c,0,1408
./phone_encoder_c,0,3917
./phone_encoder_c,0,8076
===> ./phone_encoder_go
./phone_encoder_go,0,231
./phone_encoder_go,0,492
./phone_encoder_go,0,1558
./phone_encoder_go,0,2795
./phone_encoder_go,0,6881
./phone_encoder_go,0,12425
===> ./encoder_zig
./encoder_zig,0,72
./encoder_zig,0,545
./encoder_zig,0,2484
./encoder_zig,0,4669
./encoder_zig,0,10167
./encoder_zig,0,19825

[Image: benchmark-result]

@renatoathaydes
Owner

Oh nice, thanks for posting the new results! Looks interesting.
