Branch and Bound for vertex separation #17647

Closed
dcoudert opened this issue Jan 17, 2015 · 60 comments

@dcoudert
Contributor

This patch implements a branch and bound algorithm for computing the vertex separation of directed graphs. It can be significantly faster than the other methods already implemented. Furthermore, it can handle graphs with more than 31 vertices, although the computation time can of course be long.

sage: from sage.graphs.graph_decompositions import vertex_separation as VS
sage: D = digraphs.RandomDirectedGNP(30,.1)
sage: %time w,l = VS.vertex_separation(D); print w
4
CPU times: user 229 ms, sys: 345 ms, total: 574 ms
Wall time: 575 ms
sage: %time w,l = VS.vertex_separation_BAB(D); print w
4
CPU times: user 1.34 ms, sys: 32 µs, total: 1.37 ms
Wall time: 1.35 ms

Depends on #17665

CC: @nathanncohen

Component: graph theory

Author: David Coudert

Branch/Commit: dd12f60

Reviewer: Nathann Cohen

Issue created by migration from https://trac.sagemath.org/ticket/17647

@dcoudert added this to the sage-6.5 milestone Jan 17, 2015
@dcoudert
Contributor Author

Branch: u/dcoudert/17647

@dcoudert
Contributor Author

Changed branch from u/dcoudert/17647 to public/17647

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 17, 2015

Commit: 9b5f33e

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 17, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

487ce4b  17647: add FastDiGraph_bitset data structure
b8a31a5  17647: implement the branch and bound
9b5f33e  17647: add documentation to the module

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 18, 2015

comment:5

A first question: you implement a graph data structure with bitsets... but what about using this instead?

http://www.sagemath.org/doc/reference/graphs/sage/graphs/base/static_dense_graph.html

Nathann

@dcoudert
Contributor Author

comment:6

The binary_matrix_t data structure encodes the neighborhood of a vertex with its own bit storage, that is, without using the bitset_t type. Consequently, it is not easy to reuse the operations already implemented for bitset_t (union, intersection, difference, etc.) with binary_matrix_t. I would have to encapsulate each row into an instance of bitset_t before each operation. This is certainly doable. However, I don't know how safe it is to cast an unsigned long * into an mp_limb_t *. This is the main problem.
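
Concretely, before every single operation I would have to wrap the row into a fake bitset, along these lines (a sketch only: the bitset_s field names are written from memory and may be off, and the cast on the last line is precisely the part I am unsure about):

cdef bitset_s fake_row                       # not a real bitset_t, just a wrapper around the row
fake_row.size  = m.n_cols                    # number of bits in a row (field names are assumptions)
fake_row.limbs = m.width                     # number of machine words per row
fake_row.bits  = <mp_limb_t *> m.rows[i]     # the unsigned long * -> mp_limb_t * cast in question
# only then could the generic helpers, e.g. bitset_len, be applied to the row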

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 18, 2015

comment:7

Hmmmm... Agreed, but we can't keep on adding graph data structures to this graph_decompositions/ folder when we have a base/ directory where everything else already lives.

Let's not think about this now. I'll do a review first.

@dcoudert
Contributor Author

comment:8

Actually, the static dense graph is used only in independent_sets.pyx, right? And in that code, we can see many casts to mp_limb_t *.
I think we could, in another patch, change the static dense graph to use my data structure. This way we could clean up the independent set code (remove all the cast operations) and enable direct access to the bitset operations.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Changed commit from 9b5f33e to 9aaf80e

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

9aaf80e  trac #17647: Reviewer's commit

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 22, 2015

comment:10

Hello David,

Here are some superficial changes. It's faster to implement them than to write a long comment. Fix anything that you don't agree with.

I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. These allocations can cost a lot. Do you see a clean way to avoid that?

Also, I did some profiling of your code and it seems that the most time-consuming operation is the call to popcount. I do not know where it is needed yet, but if you know how to work around it you would probably gain some performance.

Tell me what you think,

Nathann

@dcoudert
Contributor Author

comment:11

Here are some superficial changes. It's faster to implement them than to write a long comment. Fix anything that you don't agree with.

Nice. I have just fixed a small typo.

I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. These allocations can cost a lot. Do you see a clean way to avoid that?

I already thought about it, and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each, and access the bitsets of the current level, since the BAB has depth at most n. This way we avoid multiple allocations.
If you think it is OK, I can implement it.

Also, I did some profiling of your code and it seems that the most time-consuming operation is the call to popcount. I do not know where it is needed yet, but if you know how to work around it you would probably gain some performance.

What's the magic command to profile this code? %prun is not working here.

The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which are expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)

Thanks,
David.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Changed commit from 9aaf80e to 0504e8f

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

0504e8f  trac #17647: small typo

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 22, 2015

comment:13

Nice. I have just fixed a small typo.

Thanks

I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. These allocations can cost a lot. Do you see a clean way to avoid that?

I already thought about it, and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each, and access the bitsets of the current level, since the BAB has depth at most n. This way we avoid multiple allocations.
If you think it is OK, I can implement it.

Using bitsets, that would mean dozens of allocation lines again. What is it that you use in those fast digraphs of yours that is not in the 'binary matrix' type? Possibly the best is to implement it there and use the binary matrices?

What's the magic command to profile this code? %prun is not working here.

Run a long command, then type 'perf top' in a console. You'll love it.

The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which are expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)

I will probably write a "profiling tutorial" soon. It would be cool if you were willing to review it. That will be helpful, along with the "interface C code with Sage" tutorial.

Nathann

@dcoudert
Contributor Author

comment:14

I am not done with the review, but I worry a bit about the fact that you allocate memory in a recursive function. These allocations can cost a lot. Do you see a clean way to avoid that?

I already thought about it, and I don't have a "clean" method to propose. I could for instance allocate, once and for all, 5 arrays of n bitsets each, and access the bitsets of the current level, since the BAB has depth at most n. This way we avoid multiple allocations.
If you think it is OK, I can implement it.

Using bitsets, that would mean dozens of allocation lines again.

It should be something like this:

cdef bitset_t * bitset_pool = <bitset_t *>sage_malloc(5 * n * sizeof(bitset_t))
for i from 0 <= i < 5*n:
    bitset_init(bitset_pool[i], n)
...
...
for i from 0 <= i < 5*n:
    bitset_free(bitset_pool[i])
sage_free(bitset_pool)
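
At recursion depth level, the recursive call would then just work with the five bitsets of its own slice of the pool, without allocating anything. Roughly (a sketch only: in_prefix and out_neigh are illustrative names, not the actual variables):

# slots 5*level .. 5*level+4 of the pool are the scratch bitsets of the current
# recursion level, so no malloc happens inside the search; in_prefix and
# out_neigh stand for whatever sets the real code maintains
bitset_union(bitset_pool[5*level], in_prefix, out_neigh)
cost_here = bitset_len(bitset_pool[5*level])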

What is it that you use in those fast digraphs of yours that is not in the 'binary matrix' type? Possibly the best is to implement it there and use the binary matrices?

I'm extensively using bitset_union, bitset_difference, bitset_discard, bitset_add, bitset_len, and bitset_first.
What you propose is to recode, into the binary matrix type, what has already been properly implemented for bitsets. For instance, in the module static_dense_graph.pyx, we have code like this:

            # The intersection of the common neighbors of i and j is a AND of
            # their respective rows. A popcount then returns its cardinality.                  
            inter = 0
            for l in range(m.width):
                inter += __builtin_popcountl(m.rows[i][l] & m.rows[j][l])

Seriously, it would be much cleaner to use bitsets for rows and to call bitset_len.
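
With bitset rows, the same computation would boil down to something like this (a sketch: it assumes the rows are turned into genuine bitset_t, that the helper is called bitset_intersection, and that tmp is one scratch bitset of length n allocated once before the loops):

            # sketch: same computation with bitset rows and one scratch bitset
            bitset_intersection(tmp, m.rows[i], m.rows[j])  # common neighbors of i and j
            inter = bitset_len(tmp)                         # their cardinality (popcount)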

As I said previously, my suggestion is rather to change the static_dense_graph data structure to use bitset_t directly for the rows.

What's the magic command to profile this code? %prun is not working here.

Run a long command, then type 'perf top' in a console. You'll love it.

Don't have it on OSX :(

The speed of the bitset_len method depends on the system. Since patch #13352, we use the mpn methods for popcount, which are expected to use __builtin_popcount. I don't know how fast it is on my laptop. I'll check as soon as you give me the magic command ;)

I will probably write a "profiling tutorial" soon. It would be cool if you were willing to review it. That will be helpful, along with the "interface C code with Sage" tutorial.

According to some discussions on the mailing lists, we have some profiling experts who could be of more help than me. I can have a look anyway and tell you whether it is at dummy's level ;)

David.

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 22, 2015

comment:15

It should be something like this:

cdef bitset_t * bitset_pool = <bitset_t *>sage_malloc(5 * n * sizeof(bitset_t))
for i from 0 <= i < 5*n:
    bitset_init(bitset_pool[i], n)
...
...
for i from 0 <= i < 5*n:
    bitset_free(bitset_pool[i])
sage_free(bitset_pool)

In a design like that there is a problem if some bitset_init fails. But well. Yeah, then it would probably be cleaner anyway. Good idea.

            # The intersection of the common neighbors of i and j is a AND of
            # their respective rows. A popcount then returns its cardinality.                  
            inter = 0
            for l in range(m.width):
                inter += __builtin_popcountl(m.rows[i][l] & m.rows[j][l])

Seriously, it would be much cleaner to use bitsets for rows and to call bitset_len.

With bitsets you cannot do this very computation without allocating a third bitset.

Okay, let's say that it is not important: could you turn/extend your fast_digraph into a bitset matrix? You can overwrite the current one if you like, nobody but me uses it anyway. This way it would work with bitsets, and you could allocate this matrix of bitsets in one instruction without having to handle the memory allocation problems in your branch and bound code.

As I said previously, my suggestion is rather to change the static_dense_graph data structure to use bitset_t directly for the rows.

We agree, if you are actually talking about changing the 'binary_matrix' type, which is then used by the static dense graph stuff.

Run a long command, then type 'perf top' in a console. You'll love it.

Don't have it on OSX :(

O_o

Try it on a real machine. That will give you a reason to install some real OS :-P

According to some discussions on the mailing lists, we have some profiling experts who could be of more help than me. I can have a look anyway and tell you whether it is at dummy's level ;)

The problem with 'profiling experts' is that they don't review those patches, and if nobody does it everybody loses.

Nathann

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Changed commit from 0504e8f to fdef674

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 22, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

fdef674  trac 17647: use a pool of bitsets to avoid mallocs in recursive function

@dcoudert
Contributor Author

comment:17

Hello,

I'm now using a pool of bitsets. I had to use the type bitset_s *, but it's working and indeed a bit faster (try with G = graphs.MycielskiGraph(5)).

About changing the static_dense_graph structure to use bitsets directly, should I do it in this patch or in another one?

It's a pity I don't have this perf top command on Mac. In fact, it is not installed on my Linux desktop :(

Best,
David.

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 23, 2015

comment:18

Hello,

I'm now using a pool of bitsets. I had to use the type bitset_s *, but it's working and indeed a bit faster (try with G = graphs.MycielskiGraph(5)).

Good news!

About changing the static_dense_graph structure to use bitsets directly, should I do it in this patch or in another one?

Could you do it in another patch, then rebase this ticket on top of it? This way we will not have to discuss code that will be changed afterwards.

It's a pity I don't have this perf top command on Mac. In fact, it is not installed on my Linux desktop :(

Did you try it on a Linux machine? It really tells you all you want to know.

Nathann

@dcoudert
Contributor Author

comment:19

About changing the static_dense_graph structure to use bitsets directly, should I do it in this patch or in another one?

Could you do it in another patch, then rebase this ticket on top of it? This way we will not have to discuss code that will be changed afterwards.

OK, I will work on it.
I will certainly need your help to understand the procedure to "rebase this ticket on top of the other one" ;)

It's a pity I don't have this perf top command on Mac. In fact, it is not installed on my Linux desktop :(

Did you try it on a Linux machine? It really tells you all you want to know.

I don't have it installed on my Linux desktop(s), and since I don't know the package name, yum returns hundreds of possible packages containing the keyword "perf".

D.

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 23, 2015

comment:20

Yo!

OK, I will work on it.
I will certainly need your help to understand the procedure to "rebase this ticket on top of the other one" ;)

Easy. Let us say that this ticket is a branch A. Once the other branch is written, call it B.

git checkout A
git rebase -i B

After this, your branch A will sit on top of B.

I don't have it installed on my Linux desktop(s), and since I don't know the package name, yum returns hundreds of possible packages containing the keyword "perf".

In Debian it is in the linux-tools package.

Nathann

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 27, 2015

Changed commit from fdef674 to 2ad04bc

@nathanncohen mannequin added the s: needs work label Jan 29, 2015
@dcoudert
Contributor Author

comment:37

Replying to @nathanncohen:

Hello,

  • in _my_invert_positions: you cannot do this in C, but in Cython you can do
    a,b = b,a.

I don't see how to use it here.

source_pos = positions[i]      # current position of vertex i
j = prefix[target_pos]         # vertex currently sitting at target_pos
prefix[target_pos],prefix[source_pos] = prefix[source_pos], prefix[target_pos]
positions[i],positions[j] = positions[j],positions[i]

It could be made simpler if the code of _my_invert_positions was taking two positions (or two vertices) as argument instead of 'one of each type'.

I have changed the method and updated the calls to it.

But if it is a Python list, method vertex_separation_BAB_C has to return both the value and that list.

e = [1,2,3]
def invert_two_entries(l):
   l[0],l[1] = l[1],l[0]
invert_two_entries(e)
print e # prints [2, 1, 3]

How would that be more efficient than passing a pointer and updating the content of the array when needed?

It is not about efficiency, it is about not allocating/deallocating something if it can be avoided. This table is updated very rarely, so it does not have to be a C array.

Apparently you care a lot about reducing mallocs.

I changed it.

Replying to @nathanncohen:

Helloooooo again,

  • What is the purpose of lower_bound? The documentation says that the code is
    trying to find an ordering whose cost is lower than lower_bound? On what
    exactly is it a lower bound?

    Do you mean that the code will "return any ordering whose cost is lower than
    lower_bound, even if it has no proof that it is optimal"?

Exactly.
With parameter upper_bound, you ask for a solution with width less than upper_bound if such a solution exists.
With parameter lower_bound, you say that you are satisfied with any solution with width less than this bound. This is useful in particular when you split the input digraph into strongly connected components and solve the problem on each of them. If one component has width k, you don't care if another component has width k-1 since you know that the vertex separation of the entire digraph is at least k.

This will be useful when we start adding pre-processing.
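
To make the intended use concrete, the pre-processing I have in mind would look roughly like this (a sketch only: lower_bound is the keyword of the current branch, vertex_separation_BAB is the function of this module, and the surrounding names are illustrative):

# the vertex separation of D is at least the best width found so far, so for the
# remaining components we are satisfied with any ordering of width below best
best = 0
for scc in D.strongly_connected_components():
    w, _ = vertex_separation_BAB(D.subgraph(scc), lower_bound=best)
    best = max(best, w)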

  • Do you think that you actually need loc_b_neighborhood? It feels like all
    you need is what you store in bits_tmp, i.e. the union of the prefix and the
    neighborhood. And you could drop one of your two tmp bitsets.

I succeeded in removing it. I have changed the names of some variables and updated the size of bm_pool accordingly.

  • Are those two lines needed?
delta_i = max(current_cost, delta_i)
if delta_i < upper_bound:

It seems that you can already suppose that current_cost<upper_bound, and you
checked several lines above that delta_i < upper_bound.

If we reduce the upper bound during the recursive call of the kth element of the list, we want to prune (cut) branches that are no longer useful.
So yes, these lines are needed.

  • I do not agree with this comment
if upper_bound <= lower_bound:
  # Either we have optimal solution, or we are satisfied with
  # current solution.

it seems that this code only deals with the case when 'we are satisfied with
the current solution'.

I changed the sentence to "We are satisfied with current solution".

  • Is there any reason why you did not expose your function in the main
    vertex_separation function? If you do, is there any reason why it should not
    be the default one? (if it is faster?..). Finally, note that if you do make it
    the default some doctests of your functions will have to be updated, as you
    would be comparing the BAB function with vertex_separation (which would also
    call the BAB).

Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.

  1. This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
  2. I plan to add pre-processing rules. The first one is to decompose the input digraph into strongly connected components, but I have other, more complex ones. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.

With those comments and those from my previous message, I would say that my
review is finished. We only have to settle those last remarks.

Thanks for your work!

Nathann

I hope I have answered all your comments ;)

David.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 29, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

6b4a491  trac #17647: simplifies _my_invert_positions
ea87914  trac #17647: change best_seq from int array to list
8d243dd  trac #17647: reduce the number of temporary bitsets
2f0bf7a  trac #17647: update some comments

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 29, 2015

Changed commit from fdfe120 to 2f0bf7a

@dcoudert
Contributor Author

Reviewer: Nathann Cohen

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 31, 2015

comment:40

Hello,

Apparently you care a lot about reducing mallocs.

Having short code that only does what is needed helps a lot when you try to understand somebody else's code.

Do you mean that the code will "return any ordering whose cost is lower than
lower_bound, even if it has no proof that it is optimal"?

Exactly.

Okay, I understand. That helps when you split the initial graph into subproblems. This being said, the term lower_bound seems to indicate that the user can provide a lower bound on the optimal value for the graph, which confused me when I read code that dealt with returning any "solution below the lower bound". What would you think of renaming it to something like "accept_any_below=4"? That's the best keyword I found, but maybe you have in mind something which represents a bit better what it does... I find this name confusing.

  • Are those two lines needed?
delta_i = max(current_cost, delta_i)
if delta_i < upper_bound:

If we reduce the upper bound during the recursive call of the kth element of the list, we want to prune (cut) branches that are no longer useful.
So yes, these lines are needed.

Oh, I see. I had not noticed that upper_bound was updated at every iteration of the loop. Consequently, wouldn't it be better to do something like this instead?

if delta_i >= upper_bound:
    break
<continue exploring>

Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.

  1. This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
  2. I plan to add pre-processing rules. The first one is to decompose the input digraph into strongly connected components, but I have other, more complex ones. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.

Okay, as you like. As long as you plan to do it in the near future, you can manage the patches as you see fit.

Nathann_from_Nantes

@dcoudert
Contributor Author

comment:41

Do you mean that the code will "return any ordering whose cost is lower than
lower_bound, even if it has no proof that it is optimal"?

Exactly.

Okay, I understand. That helps when you split the initial graph into subproblems. This being said, the term lower_bound seems to indicate that the user can provide a lower bound on the optimal value for the graph, which confused me when I read code that dealt with returning any "solution below the lower bound". What would you think of renaming it to something like "accept_any_below=4"? That's the best keyword I found, but maybe you have in mind something which represents a bit better what it does... I find this name confusing.

It is used for both purposes: if a lower bound is known, then the user can provide it to the BAB, which stops if a solution reaches this bound. It can also be used to say "I'm happy with any solution with cost less than k".

I propose to rename it cut_off, as for shortest paths algorithms.
I have updated the doc and the tests accordingly.

  • Are those two lines needed?
delta_i = max(current_cost, delta_i)
if delta_i < upper_bound:

If we reduce the upper bound during the recursive call of the kth element of the list, we want to prune (cut) branches that are no longer useful.
So yes, these lines are needed.

Oh, I see. I had not noticed that upper_bound was updated at every iteration of the loop. Consequently, wouldn't it be better to do something like this instead?

if delta_i >= upper_bound:
    break
<continue exploring>

Right, done.

Yep, I have many reasons for not exposing the BAB in the main vertex_separation function yet.

  1. This patch is already quite long and technical. Since you often complain about patch bombs, I did my best to decompose the work.
  2. I plan to add pre-processing rules. The first one is to decompose the input digraph into strongly connected components, but I have other, more complex ones. I will work on this part only when the BAB is finalized (including memoization of prefixes). I believe it's better to work this way.

Okay, as you like. As long as you plan to do it in the near future, you can manage the patches as you see fit.

Nathann_from_Nantes

Let us know if you meet Lulu ;)

Thank you again for the review.

David.

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 31, 2015

Changed commit from 2f0bf7a to 0bd12bf

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Jan 31, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

0bd12bf  trac #17647: review commit

@nathanncohen
Mannequin

nathanncohen mannequin commented Jan 31, 2015

comment:43

Yoooooooooooo!

Thank you again for the review.

And thank you for the code ;-)

Nathann

@dcoudert
Contributor Author

comment:44

Youhou!

@vbraun
Member

vbraun commented Feb 17, 2015

comment:45

PDF docs don't build

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Feb 17, 2015

Changed commit from 0bd12bf to dd12f60

@sagetrac-git
Mannequin

sagetrac-git mannequin commented Feb 17, 2015

Branch pushed to git repo; I updated commit sha1. New commits:

dd12f60  trac #17647: resolve latex compilation error

@dcoudert
Contributor Author

comment:47

Should be OK now.

@vbraun
Member

vbraun commented Feb 17, 2015

Changed branch from public/17647b to dd12f60
