Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality to ignore selfloops #49

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

tlarock
Copy link

@tlarock tlarock commented May 6, 2019

This PR adds the ability to ignore selfloops in path data to the Paths.add_path instance function and Paths.read_file class function. It also removes some redundant code from paths.py by changing read_file to use add_path with the expand_subpaths option.

I implemented this because I have a dataset that includes (apparently) meaningless self-loops that I want to be able to remove/include in my pipeline without preprocessing the data manually. The two main changes are:

  1. Add remove_selfloops option to add_path with functionality that collapses consecutively repeated symbols into a single symbol. For example, the sequence ('a', 'a', 'b', 'b', 'c') will collapse to just ('a', 'b', 'c').
    • Note that since a sanity check in add_path is already looping through the elements of the path to ensure the separator character is safe to use, I only add comparisons and list appends to the loop so the change should not meaningfully impact computational complexity.
  2. Change Paths.read_file to call cls.add_path rather than reproducing the same functionality. Moves expand_subpaths functionality to add_path rather than calling it on the whole object at the end.

In working with this code, I couldn't understand why Paths.read_edges is static and Paths.read_file is class. I think it makes sense for them to be consistent and I think they should both be static, since they both populate and return a new Paths object. I can change the decorator in this PR if desired.

Let me know if you have questions/comments/changes!

Edit: Meant to include a note that this is completely separate from my other open PR (#47).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant