Improve Performance on PD Instances #559

krypt-n · 2021-08-31T14:44:35Z

Issue

(Do I need one for performance improvements?)

Tasks

Cleanup
Update CHANGELOG.md (remove if irrelevant)
review

Description

This is work I did back in February, which led me to discover #462, now rebased onto master and formatted. During profiling I noticed that PDShift and LocalSearch both contain code to look for a cheap insertion of a pickup and a delivery into a route.
This PR first de-duplicates the code and then improves the search somewhat by pruning based on known costs, leading to improved performance on PD instances.

Additionally, 0e3aa24 reduces the allocation of Amount objects, again, improving performance somewhat.

As far as I am aware, this PR does not in any way change the solutions computed by vroom, I'd consider it a bug if it does. Initial benchmarking on the li_lim_100 PD instances looks promising:

master at 58f7411

,Gaps,Computing times
Min,-20.04,250
First decile,-0.0,411
Lower quartile,0.0,534
Median,0.0,726
Upper quartile,1.77,849
Ninth decile,4.18,1039
Max,9.86,1278

this PR

,Gaps,Computing times
Min,-20.04,235
First decile,-0.0,368
Lower quartile,0.0,439
Median,0.0,577
Upper quartile,1.77,680
Ninth decile,4.18,797
Max,9.86,1052

krypt-n · 2021-08-31T16:18:29Z

A bit more thorough benchmark on 176 instances (li_lim_100, li_lim_200, li_lim_400):

master:

,Gaps,Computing times
Min,-39.68,245
First decile,-16.1,589
Lower quartile,-9.09,870
Median,0.0,3102
Upper quartile,1.21,12602
Ninth decile,3.49,17916
Max,14.61,31036

this PR:

,Gaps,Computing times
Min,-39.68,234
First decile,-16.1,501
Lower quartile,-9.09,710
Median,0.0,2551
Upper quartile,1.21,10462
Ninth decile,3.49,14437
Max,14.61,26141

seems to be almost 20% faster

jcoupey · 2021-08-31T16:25:21Z

Thanks for polishing this work and submitting a PR! I'll look into it soon. Looks like between this and #558 we have some great ongoing improvements on computing times. \o/

jcoupey · 2021-09-02T09:57:08Z

Did not go through the commits yet, but I can confirm a steady reduction in computing times of around 21.5% on average for all PDPTW benchmarks (li_lim_*) across all exploration levels. This is great!

jcoupey · 2021-09-02T17:05:07Z

I've been running a few non-regression tests on that one and found something breaking with the following instance.

pd-perf-problem.txt

Running vroom at 58f7411 (parent commit for the PR on master) provides an expected solution but the current tip of the PR breaks an assert:

$ vroom -i pd-perf-problem.txt 
vroom: structures/vroom/tw_route.cpp:177: void vroom::TWRoute::fwd_update_earliest_from(const vroom::Input&, vroom::Index): Assertion `current_earliest <= latest[i]' failed.
Aborted (core dumped)

This kind of problem is not trivial to debug: it's a consistency check that breaks upon updating earliest times after applying some route change. Usually it's a sign that either 1. the applied move is not valid, or 2. an inconsistency was previously introduced in earliest/latest dates by some other change. I'd go for 1. here since the changes do not touch the TW logic but only "client" code.

krypt-n · 2021-09-02T17:14:31Z

Cool, the instance is pretty small. I'll look into it

jcoupey · 2021-09-03T08:36:09Z

For what it's worth, here is a patch I find useful to log current state of TWRoute objects. It applies to current master but may require adjustments on this PR.

Also, to avoid searching for a needle in a haystack, you can narrow down the error with a single heuristic parameter applied: vroom -i pd-perf-problem.txt -e "0,FURTHEST,0.3" -x 0.

krypt-n · 2021-09-03T09:03:28Z

I think I fixed the problem; it was clearly a mistake in this PR. compute_best_insert would halve any cost, thus turning an Insertion with numeric_limits::max() cost (used to signal that no insertion is possible) into one with numeric_limits::max()/2 cost. This means that try_job_additions would always have mistakenly found a "valid" Insertion.
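To make the failure mode above concrete, here is a minimal sketch. The names Gain, NO_INSERTION, Insertion, normalize_buggy, normalize_guarded and is_valid are hypothetical stand-ins chosen for illustration, not vroom's actual types or functions, and the integer cost type is an assumption.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for illustration only; not vroom's actual types.
using Gain = std::int64_t;
constexpr Gain NO_INSERTION = std::numeric_limits<Gain>::max();

struct Insertion {
  Gain cost = NO_INSERTION; // Sentinel: no insertion is possible.
};

// Halves unconditionally, so the sentinel silently becomes max() / 2.
inline Insertion normalize_buggy(Insertion insertion) {
  insertion.cost /= 2;
  return insertion;
}

// Leaves the sentinel untouched and only halves real costs.
inline Insertion normalize_guarded(Insertion insertion) {
  if (insertion.cost != NO_INSERTION) {
    insertion.cost /= 2;
  }
  return insertion;
}

// The kind of test a caller might use to decide whether something was found.
inline bool is_valid(const Insertion& insertion) {
  return insertion.cost < NO_INSERTION;
}
```

With the unconditional version, a default-constructed Insertion passes is_valid after normalization, which is the "always found a valid Insertion" symptom; the guarded version keeps reporting that nothing was found.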

I'm a bit scared that this didn't come up in any benchmark instance

jcoupey · 2021-09-06T16:00:07Z

Trying to get the big picture for those changes, @krypt-n just let me know if the following summary fits.

On P&D insertion

Before:

1. PDShift had a check for early stop based on pickup insertion cost only and a known threshold; the same was not implemented for the similar code in compute_best_insertion_pd, called in try_job_additions.
2. The version in compute_best_insertion_pd had a check to skip deliveries when their sole insertion is not valid.

Now grouping both implementations and adjusting has the consequence that:

the shortcuts from 1. are now available everywhere, including within try_job_additions;
the skip described in 2. additionally applies from PDShift and, foremost, is extended to skip the whole inner loops whenever it is not valid to include any of the deliveries on their own.
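As a rough illustration of the two shortcuts summarized above, here is a simplified sketch. It is not the code from this PR: the function best_pd_insertion, the PDInsertion struct and the per-rank input vectors are hypothetical, and it assumes that the total cost is just the sum of independent, non-negative pickup and delivery costs, which ignores how the two insertions interact in a real route.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-ins for illustration only; not vroom's actual types.
using Gain = std::int64_t;
constexpr Gain NO_INSERTION = std::numeric_limits<Gain>::max();

struct PDInsertion {
  Gain cost = NO_INSERTION;
  std::size_t pickup_rank = 0;
  std::size_t delivery_rank = 0;
};

// pickup_costs[r] / delivery_costs[r]: cost of inserting at rank r alone.
// valid_delivery[r]: whether the delivery alone can be inserted at rank r.
// All three vectors are assumed to have the same size.
// Only insertions strictly cheaper than cost_threshold are reported.
inline PDInsertion best_pd_insertion(const std::vector<Gain>& pickup_costs,
                                     const std::vector<Gain>& delivery_costs,
                                     const std::vector<bool>& valid_delivery,
                                     Gain cost_threshold) {
  PDInsertion best;

  // Skip the whole search when no delivery is valid on its own.
  if (std::none_of(valid_delivery.begin(),
                   valid_delivery.end(),
                   [](bool v) { return v; })) {
    return best;
  }

  for (std::size_t p = 0; p < pickup_costs.size(); ++p) {
    // Early stop on the pickup cost alone: with non-negative delivery
    // costs, this pickup rank cannot beat the current bound.
    if (pickup_costs[p] >= std::min(best.cost, cost_threshold)) {
      continue;
    }

    // The delivery goes at the same rank as the pickup or later.
    for (std::size_t d = p; d < delivery_costs.size(); ++d) {
      if (!valid_delivery[d]) {
        continue; // Delivery not insertable here on its own.
      }

      const Gain total = pickup_costs[p] + delivery_costs[d];
      if (total < std::min(best.cost, cost_threshold)) {
        best.cost = total;
        best.pickup_rank = p;
        best.delivery_rank = d;
      }
    }
  }

  return best;
}
```

A caller would compare the returned cost against NO_INSERTION to tell whether any pickup/delivery pair beat the threshold.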

On amount-related allocations

Now for 0e3aa24, my understanding is that switching from maintaining modified_delivery all along to recomputing the whole value from the range upon testing does not theoretically reduce the number of amount allocations. In the worst case, this would even be done on each check, so it would be more costly. But since we only check this for potentially better solutions, the calls to is_valid_addition_for_capacity_inclusion are sparse enough that it's cheaper in the end.
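The trade-off described above can be sketched as follows. This is only an illustration under simplifying assumptions: amounts are reduced to a single integer dimension, and search_with_lazy_capacity_check and LazyCheckResult are hypothetical names, not functions from vroom. The point is that the range sum is only recomputed for candidates that would improve on the best known cost.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for illustration only; not vroom's actual types.
using Gain = std::int64_t;

struct LazyCheckResult {
  Gain best_cost;
  std::size_t capacity_checks; // How often the range sum was recomputed.
};

// candidate_costs[d]: cost of the candidate that inserts at rank d.
// deliveries: per-rank delivered amount (single dimension for simplicity),
// assumed to hold at least candidate_costs.size() entries.
inline LazyCheckResult
search_with_lazy_capacity_check(const std::vector<Gain>& candidate_costs,
                                const std::vector<std::int64_t>& deliveries,
                                std::int64_t capacity,
                                Gain initial_best) {
  LazyCheckResult result{initial_best, 0};

  for (std::size_t d = 0; d < candidate_costs.size(); ++d) {
    if (candidate_costs[d] >= result.best_cost) {
      continue; // Cannot improve, so the capacity check is never run.
    }

    // Recompute the load over the range [0, d) only now.
    ++result.capacity_checks;
    const auto last = deliveries.begin() + static_cast<std::ptrdiff_t>(d);
    const std::int64_t load =
      std::accumulate(deliveries.begin(), last, std::int64_t{0});

    if (load <= capacity) {
      result.best_cost = candidate_costs[d];
    }
  }

  return result;
}
```

In the worst case every candidate improves on the previous one and the sum is recomputed each time, which matches the caveat above; the bet is that improving candidates are rare in practice.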

jcoupey

This looks great, both for the speedup related to added early stops and for the fact that this reduces code duplication. For the latter, I see two ways to go even further, by using the new ls::compute_best_insertion_pd:

1. From cvrp::PDShift::compute_gain too.
2. From the heuristics.

Item 1 seems pretty straightforward: I think it would work out-of-the-box if cvrp::PDShift::compute_gain were to hold the new vrptw::PDShift::compute_gain implementation (except for the removal check part) and vrptw::PDShift::compute_gain were to call its parent counterpart. Item 2 may be more touchy so we can always schedule that for another PR.

src/problems/vrptw/operators/pd_shift.cpp

jcoupey

Looks good as is, thanks for the last changes. What's your take on the previous comment? If you don't plan to do anything on the cvrp side (item 1), then I could probably handle it before merging.

krypt-n · 2021-09-10T13:17:03Z

Yep, your summary seems correct to me. I replaced the cvrp version in b88b11e

krypt-n · 2021-09-10T13:18:41Z

At first glance, there are two copies of this pickup/delivery insertion search in heuristics.cpp, with some minor differences from the one in this PR, so I would leave that as is for now.

jcoupey · 2021-09-13T12:51:44Z

I completed the usual Li&Lim checks with runs on real-life instances and also noticed a speedup there (in the 10% to 15% ballpark), so I'm looking forward to merging!

"this PR does not in any way change the solutions computed by vroom, I'd consider it a bug if it does"

I do have a few instances where the solution is actually different with this PR. It's not always better or worse so it really looks like some heuristic choice is simply made differently at some point. I did not dig more into the problem but managed to narrow it down with a small-ish instance.

Results I'm getting with this file using the parent commit 58f7411 and the tip of this PR:

$ vroom_58f7411 -x 4 -i pd-different-solutions.txt -e "1,HIGHER_AMOUNT,2.1" | jq .summary
{
  "cost": 69312,
  "unassigned": 26,
  "service": 33000,
  "duration": 69312,
  "waiting_time": 0,
  "priority": 260,
  "violations": [],
  "computing_times": {
    "loading": 0,
    "solving": 9
  }
}
$ vroom_b88b11e -x 4 -i pd-different-solutions.txt -e "1,HIGHER_AMOUNT,2.1" | jq .summary
{
  "cost": 68332,
  "unassigned": 24,
  "service": 34560,
  "duration": 68332,
  "waiting_time": 0,
  "priority": 260,
  "violations": [],
  "computing_times": {
    "loading": 0,
    "solving": 6
  }
}

krypt-n · 2021-09-13T12:59:36Z

I'll try to figure out the reason for that difference

jcoupey · 2021-10-05T16:04:43Z

I did some debugging and found that this is a case of same-cost-but-different-choice in try_job_additions.

Applying this patch on top of b88b11e

diff --git a/src/algorithms/local_search/local_search.cpp b/src/algorithms/local_search/local_search.cpp
index 61d3a78..8871588 100644
--- a/src/algorithms/local_search/local_search.cpp
+++ b/src/algorithms/local_search/local_search.cpp
@@ -7,6 +7,8 @@ All rights reserved (see LICENSE).
 
 */
 
+#include <iostream>
+
 #include "algorithms/local_search/local_search.h"
 #include "algorithms/local_search/insertion_search.h"
 #include "problems/vrptw/operators/cross_exchange.h"
@@ -247,6 +249,15 @@ void LocalSearch<Route,
     job_added = (best_cost < std::numeric_limits<double>::max());
 
     if (job_added) {
+      bool log =
+        (best_route == 3 and best_job_rank == 30 and
+         best_insertion.cost == 169 and best_insertion.delivery_rank == 4);
+
+      if (log) {
+        std::cout << "best_insertion.pickup_rank = "
+                  << best_insertion.pickup_rank << std::endl;
+      }
+
       _sol_state.unassigned.erase(best_job_rank);
       const auto& best_job = _input.jobs[best_job_rank];

yields:

$ vroom -x 4 -i pd-different-solutions.txt -e "1,HIGHER_AMOUNT,2.1" -o /dev/null
best_insertion.pickup_rank = 3

Now logging the same at 58f7411 results in:

$ vroom -x 4 -i pd-different-solutions.txt -e "1,HIGHER_AMOUNT,2.1" -o /dev/null
best_insertion.pickup_rank = 0

All the rest of the execution path (operators applied, jobs added) looks similar prior to this choice and then diverges, since the solutions differ after this insertion.

Looks like all evaluations are matching, but the order in which ranks are evaluated is changed somehow, so a different option with the same cost is picked.

jcoupey · 2022-01-20T11:53:23Z

Back on this PR, I just noticed that my previous comment was misleading since I reported the same output: best_insertion.pickup_rank = 3 for both commands. The value is actually 3 at b88b11e but 0 at 58f7411 (I edited the above message).

I think I found the reason for this difference by logging the actual costs evaluated within the different versions of compute_best_insertion_pd:

at 58f7411, inserting job 30 in route 3 with respective pickup and delivery ranks 0 and 4 costs 169 and is chosen as best option (see situation from the above log patch);
at b88b11e, when calling compute_best_insertion_pd for job 30 and route 3 at that same point in solving, we actually have two updates of the best insertion option: inserting pickup/delivery at ranks 0/4 costs 339, then inserting pickup/delivery at ranks 3/4 costs 338.

The thing is that normalizing the P&D insertion cost by dividing it by two now happens after the call to compute_best_insertion_pd rather than in its inner loop. This explains the difference in logged costs, but also why the version in this PR is able to pick the 3/4 P&D insertion, which has a cheaper cost (less by 1). The code at 58f7411 is not able to pick the 3/4 insertion over the 0/4 one since the normalized costs are both equal to 169 (339 and 338 halved, with the remainder dropped).

Wrapping this up:

behavior changes should only be noticed in cases where the costs of insertion options differ by 1 (and provided the ordering triggers the above situation);
this PR is doing it right by first picking the best insertion cost, then normalizing the cost.
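The effect of the normalization order can be reproduced with the two costs logged above (339 for ranks 0/4, 338 for ranks 3/4). The sketch below uses hypothetical names (Candidate, pick_normalize_then_compare, pick_compare_then_normalize) and assumes integer costs halved with truncating division, which is consistent with the logged value of 169 but is not taken from vroom's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; not vroom's actual types.
using Gain = std::int64_t;

struct Candidate {
  Gain cost;
  int pickup_rank;
  int delivery_rank;
};

// Old order: halve inside the loop and compare the halved costs.
// Precondition: candidates is not empty.
inline Candidate
pick_normalize_then_compare(const std::vector<Candidate>& candidates) {
  Candidate best = candidates.front();
  best.cost /= 2;
  for (std::size_t i = 1; i < candidates.size(); ++i) {
    const Gain normalized = candidates[i].cost / 2;
    if (normalized < best.cost) {
      best = candidates[i];
      best.cost = normalized;
    }
  }
  return best;
}

// New order: compare the raw costs, then halve the winner once.
// Precondition: candidates is not empty.
inline Candidate
pick_compare_then_normalize(const std::vector<Candidate>& candidates) {
  Candidate best = candidates.front();
  for (std::size_t i = 1; i < candidates.size(); ++i) {
    if (candidates[i].cost < best.cost) {
      best = candidates[i];
    }
  }
  best.cost /= 2;
  return best;
}
```

Fed the two options in the order they are evaluated, the first function keeps pickup rank 0 because 169 is not strictly less than 169, while the second picks pickup rank 3; both report a final cost of 169.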

I did not want to introduce a change of behavior without fully understanding the reason, but now I think we're good! @krypt-n do you think you could resolve conflicts with current master and add a changelog entry?

krypt-n · 2022-01-26T12:40:46Z

Hi, I'll look into updating this PR in the next couple of days. I believe I already resolved some merge conflicts locally a while ago

krypt-n · 2022-01-30T11:24:25Z

Okay, I rebased the changes, added a changelog entry (pushed this to the master branch accidentally, apologies for that), and confirmed that this is still a 20% improvement compared to the current master branch with a few benchmarks.

Ready to merge from my point of view!

jcoupey added this to the v1.11.0 milestone Sep 2, 2021

jcoupey reviewed Sep 9, 2021

jcoupey added enhancement internals PDPTW refactor labels Sep 9, 2021

jcoupey approved these changes Sep 10, 2021

jcoupey removed this from the v1.11.0 milestone Sep 30, 2021

krypt-n added 7 commits January 30, 2022 12:03

Remove unneccessary check

2ad43c5

Start sharing code between pd_shift and insert

6349c87

Use known cost_threshold in shared code

498878a

Remove now duplicated code

5b34b51

Add asserts

4fa858c

Add short-circuiting based on delivery costs

e429815

Reduce amount allocations

957f4be

krypt-n added 4 commits January 30, 2022 12:04

Remove commented out code

b7de725

Fix compute_best_insertion if no insert possible

acde791

Adress review comments

840dec1

Replace cvrp::PDShift::compute_gain()

98234b4

krypt-n force-pushed the enhancement/pd-perf branch from b88b11e to 98234b4 Compare January 30, 2022 11:07

Add changelog entry

bc354ed

krypt-n requested a review from jcoupey January 30, 2022 11:22

jcoupey approved these changes Jan 31, 2022

jcoupey merged commit 9e0840c into VROOM-Project:master Jan 31, 2022
