Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremely slow random_matrix in unimodular form #35664

Closed
2 tasks done
user202729 opened this issue May 22, 2023 · 2 comments
Closed
2 tasks done

Extremely slow random_matrix in unimodular form #35664

user202729 opened this issue May 22, 2023 · 2 comments

Comments

@user202729
Copy link
Contributor

user202729 commented May 22, 2023

Is there an existing issue for this?

  • I have searched the existing issues for a bug report that matches the one I want to file, without success.

Did you read the documentation and troubleshoot guide?

  • I have read the documentation and troubleshoot guide

Environment

- **Sage Version**: 9.8

Steps To Reproduce

import time
start=time.time()
F = GF(2)
m=matrix.random(F, 46, algorithm="unimodular")
print(time.time()-start); start=time.time()

# 1-3 seconds

Expected Behavior

Should be near-instant.

Actual Behavior

As above.

Additional Information

The bottleneck is in the generation of pivot locations which uses a Beta(1.6, 4.3) distribution to generate real values in range [0, 1), multiply by number of columns and get distinct values.

pivots = [0] #Force first column to be a pivot. No harm if no pivots at all.
# Probability distribution for the placement of leading one's.
pivot_generator = pd.RealDistribution("beta", [1.6, 4.3])
while len(pivots) < num_pivots:
pivot_column = int(pivot_generator.get_random_element() * num_col)
if pivot_column not in pivots:
pivots.append(pivot_column)
pivots.sort()

Extracting it out, this snippet

import sage.probability.probability_distribution as pd

num_col=46
num_pivots=num_col

import time
start=time.time()

pivots = [0] #Force first column to be a pivot. No harm if no pivots at all.
pivot_generator = pd.RealDistribution("beta", [1.6, 4.3])
while len(pivots) < num_pivots:
	pivot_column = int(pivot_generator.get_random_element() * num_col)
	if pivot_column not in pivots:
		pivots.append(pivot_column)

print(time.time()-start)

which may take up to a few seconds.

I can see a few possible ways to fix it:

  • hard code the case of full-rank matrix (which is frequent and also the slowest one)
  • sample from the distribution without using a loop
  • change to some other probability distribution that is easier to sample from (backwards incompatible!) e.g. using the uniform distribution only takes O(n log n) steps because of the cup coupon collector problem, and there are algorithms that takes only O(n) steps
@videlec
Copy link
Contributor

videlec commented May 31, 2023

I agree that it is a very bad design to generate a subset of {0, 1, ..., n-1}. I suggest to move this subset generation outside of random_rref_matrix(parent, num_pivots). More precisely

  • having somewhere one (or several) function(s) generating pivots (ie a random subset generator)
  • having a function generating a matrix from a given set of pivots

For the second item one could slightly change the signature of random_rref_matrix to random_rref_matrix(parent, pivots) where pivots is expected to be a subset of {0, 1, ..., n-1}. If it is provided as an integer we could interpret it as a number of pivots and generate a random subset according to some default distribution (first item).

I think it is perfectly fine to change the default distribution as long as it remains reasonable.

@fchapoton fchapoton self-assigned this Sep 22, 2023
vbraun pushed a commit to vbraun/sage that referenced this issue Oct 12, 2023
    
trying to do something for sagemath#35664 ; not yet good

### 📝 Checklist

- [x] The title is concise, informative, and self-explanatory.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [x] I have updated the documentation accordingly.
    
URL: sagemath#36313
Reported by: Frédéric Chapoton
Reviewer(s): David Coudert, Frédéric Chapoton
@fchapoton
Copy link
Contributor

this should be fixed now, please double check and close the issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants