Use vector-of-structs of preds/semi for Lengauer-Tarjan #408

Open
wants to merge 2 commits into develop

Conversation

samolisov
Contributor

Closes #383

@samolisov
Contributor Author

I used the following benchmark: dominator_tree_benchmark.cpp
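
For context, the report below is in Google Benchmark's output format. Here is a minimal sketch of how such a harness might be wired up, assuming Google Benchmark and Boost.Graph's lengauer_tarjan_dominator_tree; the fixture name, the placeholder graph, and the helper make_small_cfg are invented for illustration and are not taken from dominator_tree_benchmark.cpp.

// Minimal sketch, not the actual dominator_tree_benchmark.cpp: a Google
// Benchmark fixture that times Boost.Graph's Lengauer-Tarjan implementation
// on a tiny placeholder CFG.
#include <benchmark/benchmark.h>

#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/dominator_tree.hpp>
#include <boost/property_map/property_map.hpp>

#include <vector>

using Graph  = boost::adjacency_list<boost::listS, boost::vecS, boost::bidirectionalS>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;

// Placeholder CFG; the real benchmark builds the graphs from the papers
// listed in the report below.
static Graph make_small_cfg()
{
    Graph g(5);
    boost::add_edge(0, 1, g);
    boost::add_edge(0, 2, g);
    boost::add_edge(1, 3, g);
    boost::add_edge(2, 3, g);
    boost::add_edge(3, 4, g);
    return g;
}

static void BM_DominatorTree(benchmark::State& state)
{
    const Graph g = make_small_cfg();
    const Vertex entry = *boost::vertices(g).first;

    // The dominator-tree predecessor map is a vector indexed by vertex_index.
    std::vector<Vertex> dom_tree_pred(
        boost::num_vertices(g), boost::graph_traits<Graph>::null_vertex());
    auto dom_map = boost::make_iterator_property_map(
        dom_tree_pred.begin(), boost::get(boost::vertex_index, g));

    for (auto _ : state)
        boost::lengauer_tarjan_dominator_tree(g, entry, dom_map);
}
BENCHMARK(BM_DominatorTree);

BENCHMARK_MAIN();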

On my machine (32 x 1792.7 MHz CPUs with hyper-threading and an almost-zero load average, Ubuntu 20.04), the report is the following (we can use the state after merging #407 as a baseline):

----------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
----------------------------------------------------------------------------------
Tarjan's paper (vertex list)                   934 ns          934 ns       748347
Tarjan's paper  (vertex vector)                845 ns          845 ns       830574
Appel. fig. 19.8 (vertex list)                 960 ns          959 ns       731191
Appel. fig. 19.8  (vertex vector)              860 ns          860 ns       813827
Muchnick. fig. 8.18 (vertex list)              561 ns          560 ns      1248586
Muchnick. fig. 8.18  (vertex vector)           538 ns          538 ns      1302725
Cytron's paper, fig. 9 (vertex list)          1145 ns         1145 ns       613263
Cytron's paper, fig. 9  (vertex vector)       1046 ns         1046 ns       674659
From a code, 186 BBs (vertex list)           12938 ns        12937 ns        54742
From a code, 186 BBs (vertex vector)         11528 ns        11527 ns        62319

After implementing a "vector-of-structs" solution, the numbers are the following:

----------------------------------------------------------------------------------
Benchmark                                        Time             CPU   Iterations
----------------------------------------------------------------------------------
Tarjan's paper (vertex list)                   919 ns          919 ns       768302
Tarjan's paper  (vertex vector)                835 ns          835 ns       838532
Appel. fig. 19.8 (vertex list)                 944 ns          944 ns       739354
Appel. fig. 19.8  (vertex vector)              854 ns          854 ns       825316
Muchnick. fig. 8.18 (vertex list)              527 ns          527 ns      1285818
Muchnick. fig. 8.18  (vertex vector)           488 ns          488 ns      1433765
Cytron's paper, fig. 9 (vertex list)          1101 ns         1101 ns       636063
Cytron's paper, fig. 9  (vertex vector)       1024 ns         1024 ns       685137
From a code, 186 BBs (vertex list)           12754 ns        12753 ns        54584
From a code, 186 BBs (vertex vector)         11623 ns        11622 ns        6116

Here we can see about a 1% speedup for the "large" cases (the CFG with 186 basic blocks) and about 10% for the small ones (Muchnick, fig. 8.18, with 8 vertices).

I'm still thinking about what to do with the semedom_ vector: should we put samedoms into the struct as well? The access pattern is a little different, so some more experiments are required.
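
To illustrate the idea (the struct and field names below are hypothetical and are not the PR's actual code), the change is essentially from parallel per-field containers to one container of per-vertex records:

#include <cstddef>
#include <vector>

// Hypothetical illustration only; names do not necessarily match the PR.
//
// Before ("struct of vectors"): one container per field, indexed by DFS number.
//   std::vector<std::vector<Vertex>> preds_; // preds_[dfnum]
//   std::vector<Vertex>              semi_;  // semi_[dfnum]
//
// After ("vector of structs"): one record per vertex, so the fields of the
// same vertex are adjacent in memory and tend to share cache lines.
template <class Vertex>
struct lt_vertex_state
{
    std::vector<Vertex> preds; // whatever "preds" holds for this vertex
    Vertex semi{};             // semidominator-related field

    // samedom could also be folded in here (the open question above), but
    // its access pattern differs, hence the extra experiments.
};

int main()
{
    using V = std::size_t;
    // One vector of structs, indexed by DFS number, e.g. for an 8-vertex CFG.
    std::vector<lt_vertex_state<V>> state(8);
    state[3].semi = 1;
    state[3].preds.push_back(2);
    return 0;
}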

@samolisov
Contributor Author

Maybe a check on a larger graph (up to 1000 or 2000-3000 nodes) is needed to ensure there is no regression for large inputs.

@jeremy-murphy
Contributor

Thanks for trying this change; it's a pity it didn't yield anything significant. I still think it's a better logical design, so I'm happy to proceed with it, although I'd like to make a few style changes.
For starters, I think we can just drop the set functions on the struct. More later.

@samolisov
Contributor Author

@jeremy-murphy Thank you for the suggestion. I've replaced every set_ method with a direct write to the corresponding field and removed the methods.
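
For illustration (hypothetical names, not the PR's actual code), the change amounts to dropping the setter and writing the field directly:

// Minimal before/after illustration of dropping the setters; names are
// hypothetical and do not necessarily match the PR.
struct vertex_state
{
    int semi = 0;
    // void set_semi(int s) { semi = s; }   // setters like this were removed
};

int main()
{
    vertex_state v;
    v.semi = 42; // direct field write instead of v.set_semi(42)
    return 0;
}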

Also, I added a benchmark for a huge (3000+ node) graph; on such a graph I see the following. The baseline (code from the develop branch):

Huge Inlined Function (vertex list)         275707 ns       275683 ns         2531
Huge Inlined Function (vertex vector)       236892 ns       236878 ns         2969

With the "cache-friendly" solution:

Huge Inlined Function (vertex list)         284871 ns       284855 ns         2495
Huge Inlined Function (vertex vector)       251233 ns       251218 ns         2783

So here we even see some performance degradation, of up to 3-6%.

Successfully merging this pull request may close these issues.

Why the implementation of Lengauer-Tarjan uses std::deque for a bucket?