new project_to_dag algorithm #206

Open
wants to merge 4 commits into
base: main

blengerich
Collaborator

Prompted by #205, here is a simplified project_to_dag algorithm. Instead of searching for an edge-weight threshold to cut at, it sorts the nodes into a topological order and then returns only the forward edges. The topological sort is just a simple ordering by the sum of edge weight magnitudes (more and stronger edges => higher priority to keep the edges in this node).
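
For readers following along, the sort-based projection described above can be sketched roughly as follows. This is not the code in the PR: the function name `project_to_dag_sort` is hypothetical, and the sketch assumes `w[i, j]` holds the weight of the edge i -> j and that a node's priority is the sum of its outgoing edge magnitudes.

```python
import numpy as np


def project_to_dag_sort(w):
    """Keep only edges that point forward in a priority ordering of the nodes.

    Assumptions (not confirmed by the PR): w[i, j] is the weight of edge
    i -> j, and priority is the sum of outgoing edge magnitudes.
    """
    w = np.asarray(w, dtype=float)
    # Priority of each node: total magnitude of its outgoing edges.
    priority = np.abs(w).sum(axis=1)
    # Highest priority first; a stable sort keeps ties in index order.
    order = np.argsort(-priority, kind="stable")
    # rank[i] is the position of node i in the ordering.
    rank = np.empty(len(order), dtype=int)
    rank[order] = np.arange(len(order))
    # An edge i -> j is "forward" if i comes before j in the ordering.
    forward = rank[:, None] < rank[None, :]
    return np.where(forward, w, 0.0)
```

Because every kept edge goes from an earlier node to a later one in a single fixed ordering, the result cannot contain a cycle, whatever the ordering rule is.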

A quick test of performance is included in the Jupyter notebook. The topological sort function could (possibly should?) be improved, but the sort-based projection appears to get similar performance to the search-based algorithm:

image

with much reduced runtime:

image

@cnellington
Owner

cnellington commented Jul 15, 2023

The speedup is definitely worth the cost here, but there are a few edge cases to consider.
Here's one failure case for the topological sort:

X -2-> Y 
Y -4-> Z

This graph is already a DAG, but a topological sort by outgoing edge magnitudes ranks Y ahead of X and so removes the first edge. Both algorithms often ignore root nodes with few or small outgoing edges. Just to emphasize that neither is perfect, both sort and score fail on this graph:

X -2-> Y
Y -4-> Z
Z -3-> Y

I wonder if it's even possible to come up with an edge-removal function $f$ that, for any $G \in \mathbb{R}^{n \times n}$, attains
$\max_f \sum_{i,j} |f(G)_{i,j}|$ s.t. $f(G)$ is a DAG?
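
For tiny graphs the objective above can be checked by brute force, which is handy as a reference when comparing heuristics. The sketch below is illustrative only (the name `best_dag_brute_force` is hypothetical, and it assumes `w[i, j]` is the weight of edge i -> j). It relies on the fact that every DAG is consistent with some node ordering, so trying all orderings and keeping the forward edges of each finds the optimum. The cost is n! orderings, and the general problem (weighted maximum acyclic subgraph) is NP-hard, so this is only a testing aid.

```python
from itertools import permutations

import numpy as np


def best_dag_brute_force(w):
    """Exhaustively find the acyclic edge subset with the largest total magnitude.

    Only practical for very small n, since it enumerates all n! orderings.
    """
    w = np.asarray(w, dtype=float)
    n = len(w)
    best_total, best = -1.0, np.zeros_like(w)
    for order in permutations(range(n)):
        # rank[i] is the position of node i in this candidate ordering.
        rank = np.empty(n, dtype=int)
        rank[list(order)] = np.arange(n)
        # Keep every edge that points forward in the ordering.
        kept = np.where(rank[:, None] < rank[None, :], w, 0.0)
        total = np.abs(kept).sum()
        if total > best_total:
            best_total, best = total, kept
    return best
```

On the three-edge cyclic example above (X, Y, Z as nodes 0, 1, 2), the optimum keeps X -> Y and Y -> Z for a total magnitude of 6, dropping only the Z -> Y edge.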

Owner

@cnellington cnellington left a comment

LGTM, feel free to merge.

@cnellington
Owner

Re: #205 (comment)
Is it possible that the binary search implementation is causing the runtime issues? Recursion is slow in Python. The is_dag check may not be the bottleneck, and binary search by itself shouldn't incur serious runtime costs. It might be worthwhile to reimplement the binary search as a while loop to see if we can get the search algorithm to run quickly.
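
As a rough illustration of the while-loop idea, here is a minimal sketch of an iterative threshold search. It is not the repository's implementation: `project_to_dag_search` is a hypothetical name, the `is_dag` shown here is a stand-in written with Kahn's algorithm, and the sketch assumes `w[i, j]` is the weight of edge i -> j and that edges with magnitude at or below the threshold are dropped. The binary search is valid because removing more edges can never introduce a cycle.

```python
import numpy as np


def is_dag(w):
    """Kahn's algorithm: True if the nonzero entries of w form an acyclic graph."""
    adj = np.asarray(w) != 0
    indegree = adj.sum(axis=0)
    stack = [i for i in range(len(adj)) if indegree[i] == 0]
    visited = 0
    while stack:
        node = stack.pop()
        visited += 1
        for child in np.flatnonzero(adj[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                stack.append(child)
    # Every node is visited exactly when there is no cycle.
    return visited == len(adj)


def project_to_dag_search(w):
    """Iteratively binary-search the smallest threshold that yields a DAG."""
    w = np.asarray(w, dtype=float)
    mags = np.abs(w)
    # Candidate thresholds: 0 (keep everything) plus each distinct magnitude.
    thresholds = np.unique(np.concatenate(([0.0], mags.ravel())))
    # The largest threshold removes every edge, so thresholds[hi] always works.
    lo, hi = 0, len(thresholds) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_dag(np.where(mags > thresholds[mid], w, 0.0)):
            hi = mid  # acyclic: try keeping more edges
        else:
            lo = mid + 1  # still cyclic: need a higher threshold
    return np.where(mags > thresholds[lo], w, 0.0)
```

The loop runs O(log k) acyclicity checks for k distinct magnitudes, with no recursion, so timing it against the existing search would show whether recursion overhead or the is_dag check dominates.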

@cnellington cnellington self-requested a review July 15, 2023 16:51
@blengerich blengerich linked an issue Jul 15, 2023 that may be closed by this pull request
@blengerich blengerich removed a link to an issue Jul 15, 2023
@cnellington cnellington changed the base branch from dev to main December 29, 2023 23:14
3 participants