Skip to content
This repository has been archived by the owner on Oct 8, 2021. It is now read-only.

[Contributing] Wilson's, loop erased walks, and normalization constants (Kirchoff/Tutte) #1571

Open
gjh6 opened this issue Jun 18, 2021 · 8 comments

Comments

@gjh6
Copy link

gjh6 commented Jun 18, 2021

Hello,

In following the contributor guidelines, I am opening an issue to report that I would like to contribute three algorithms to light graphs

  1. Loop erased random walks
  2. Wilson's algorithm to randomly sample a spanning tree according to a measure proportional to the product of edge-weights
  3. An algorithm to count the number of spanning trees, or (more generally) determine the normalization constant associated with the above step via the use of Kirchoff's/Tutte's theorems (e.g.).

The first algorithm will be implemented similarly to randomwalk.jl's self_avoiding_walk, however I would like it to be able to take in an AbstractRNG (i.e. rng = getRNG(seed) if seed >=0 and otherwise uses a passed rng which defaults to GLOBAL_RNG).

I would like to make two implementations: the first which starts at a specified vertex and continues for n steps; the second which specifies a terminating node set and a starting node and then returns a loop_erased_random walk that terminates at the specified set or fails.

For Wilson's algorithm, I would only contribute an algorithm on a directed graph (similar to what is done in all of the mst algorithms); it could later be extended. Wilson's would also take in both a seed and AbstractRNG and decide which one to use.

For the count/normalization function, I would reuse the Laplacian matrix function and then take the determinant of a minor. There would be two implementations: one for the determinant and another for the log of the determinant. Although this is simple to add, I believe it would add a nice feature to LightGraphs and perhaps introduce these algorithms to people who don't know of them.

If you have any questions or concerns about these contributions (or if they are already in progress) please let me know.

Please also let me know if these contributions would likely be accepted (assuming acceptable implementation), and whether any would not be accepted.

@gjh6 gjh6 added the bug confirmed bug producing incorrect results label Jun 18, 2021
@gjh6 gjh6 changed the title [Contributing] Wilson's, loop erased walks, and normalization constants (Krichoff/Tutte) [Contributing] Wilson's, loop erased walks, and normalization constants (Kirchoff/Tutte) Jun 18, 2021
@sbromberger sbromberger removed the bug confirmed bug producing incorrect results label Jun 18, 2021
@sbromberger
Copy link
Owner

Thank you for your interest in contributing. These algorithms sound interesting. A couple of questions/observations:

  1. Q: Do you know offhand the computational complexity of each of these?
  2. O/Q: Any algorithm that is based on linear algebra / matrix math is probably not a great fit for LightGraphs. We do export the graph datastructure as a matrix, but (with I think one exception) we don't actually operate on graphs as matrices. Would your count/normalization algorithm rely on matrix representation of the graph?

@gjh6
Copy link
Author

gjh6 commented Jun 19, 2021

Wilson’s algorithm expected runtime is on the order of the graph’s mean hitting time, which is bounded by the graph’s cover time (see here). For example, in the case of planar graphs, this is bounded by O(nv(g)^2) (see here). Loop-erased random walks have O(nv(g)) complexity in expectation (if we specify a terminating node-set).

Implementing Kirchoff/Tutte does use linear algebra. The complexity is that of computing a determinant which will be (implementation dependent) O(V^(2.373) (see here). The code for this is nearly trivial and would rely on features that are already part of LightGraphs and native Julia operations. It would essentially be something like

function log_nspanning(g)
return logdet(view(laplacian_matrix(g), 2:nv(g), 2:nv(g)))
end

@sbromberger
Copy link
Owner

I think the first two are definitely good to go. The third still gives me pause: not only is it matrix-based, but it's a one-liner, and it's > O(V^2) which means it won't scale well in any case.

I've been toying around for years with pulling all the linear algebra code out of LG - @jpfairbanks has been on the receiving end of a bunch of my rants. So far he's talked me down. I'll defer to him again on this one, especially since Linear Algebra is one of my acknowledged blind spots.

@jpfairbanks
Copy link
Contributor

I think code that uses these fancy matrix multiply algorithms will require a special skill set for the maintainers and don't think that LG can provide that level of expert maintainership without an ongoing commitment from the original author. I haven't followed the recent developments in this area for the last few years, but the last time I looked into it, the matrix algorithms that were being produced out of the theoretical computer science literature aren't particularly practical even though they have good asymptotic performance.

I would be happy to be proven wrong by a simple and performant implementation. But we would definitely need to see the implementation and the benchmarks before signing off ion it.

@gjh6
Copy link
Author

gjh6 commented Jun 19, 2021

@sbromberger - Sounds great; I will send a pull request in the next few days. For Kirchoff/Tutte, it makes no difference for me to normalize the spanning probabilities in my local projects versus LightGraphs. I also fully agree about the simplicity of it making it nearly pointless.

As an alternative, perhaps I could add a note about Kirchoff/Tutte in the comments of Wilson's algorithm. My concern is that many users might not know that you can explicitly write down the probability of drawing a tree from Wilson's algorithm and this has a number of interesting uses, particularly in sampling algorithms. @jpfairbanks - I don't know if you would consider Kirchoff/Tutte to be performant, but I have found them both simple and extremely useful in my own work.

@sbromberger
Copy link
Owner

Having the description of Kirchoff/Tutte – and even a proposed implementation – in the docstring of Wilson would be perfectly fine.

@sbromberger
Copy link
Owner

Also, just to set expectations - my availability over the next few weeks will be very limited, and then it will fall off a cliff until September.

@gjh6
Copy link
Author

gjh6 commented Jun 22, 2021

Sounds good, thanks for the heads up.

I've written the methods, but wanted to check in about unit testing. For the loop erased random walks, I was thinking about only including a check for no loops. Is there anything else you'd like to see for this?

For Wilson's, I was planning on taking a simple unweighted and weighted graph (say 4 nodes) and ensuring the fraction of times each type of spanning tree is sampled matches with an exactly computable answer (within some tolerance and given some number of samples). Is there anything else you'd like to see in this?

FYI, since Wilson's algorithm calls loop_erase_randomwalk, the walk function would be implicitly tested via Wilson's.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants