Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement suggestions from joss reviewer #13

Merged
merged 13 commits into from
Nov 15, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -24,11 +24,11 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest
pip install .
pip install pytest nbmake
- name: Run test
run: |
pytest .
pytest --nbmake .


test-windows:
Expand All @@ -47,12 +47,13 @@ jobs:
- name: Install dependencies
shell: bash -l {0}
run: |
pip install -r requirements.txt
pip install pytest
python -m pip install --upgrade pip
pip install .
pip install pytest nbmake
- name: Run test
shell: bash -l {0}
run: |
pytest .
pytest --nbmake .

coverage:
runs-on: ubuntu-latest
Expand Down
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ a0.set_title("Input point cloud") ; a0.axis("off") ; a0.set_aspect("equal")

# plot the persistence diagram, showing a single prominent class
cc = CircularCoords(X, n_landmarks=200)
plot_diagrams(cc.dgms_, title="Persistence diagram", ax=a1)
plot_diagrams(cc._dgms, title="Persistence diagram", ax=a1)

# plot the data colored by the circle-valued map constructed by DREiMac
circular_coordinates = cc.get_coordinates()
Expand All @@ -58,7 +58,7 @@ For Jupyter notebooks with more examples, please check the [documentation](https

## Installing

Make sure you are using Python 3.8 or newer.
Make sure your Python version is >=3.8 and <3.12.
DREiMac depends on the following python packages, which will be installed automatically when you install with pip:
`matplotlib`,
`numba`,
Expand Down
3 changes: 2 additions & 1 deletion docs/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ You can check the :ref:`theory section <theory>` for details and or the examples
Installing
----------

Make sure you are using Python 3.8 or newer.
Make sure your Python version is >=3.8 and <3.12.
DREiMac depends on the following python packages, which will be installed automatically when you install with pip:
`matplotlib`,
`numba`,
Expand Down Expand Up @@ -45,6 +45,7 @@ Contents
notebooks/parameters_n_landmarks_and_cocycle_idx
notebooks/parameter_perc
notebooks/parameter_standard_range
notebooks/parameter_n_samples
notebooks/parameters_prime_and_check_cocycle_condition

Further examples
Expand Down
22 changes: 11 additions & 11 deletions docs/notebooks/bullseye.ipynb

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outputs shouldn't be committed to git. It's otherwise very hard to see if the code changed ore only the outputs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback. The reason for having the outputs in the notebooks is that we are using the notebooks as documentation, so we want to have the user be able to see the output without necessarily running the notebooks. But this is most relevant for the documentation itself, which just includes the notebooks and thus requires the outputs to be there. Do you know if it is possible to have the notebooks without output in the repository, and have the documentation CI action run each cell of each notebook before compiling the documentation and putting it online?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the input here.
This doesn't quite address our workflow with the documentation, so I'd like
to request that we move on from this and leave it be for now.
Thanks,
Chris

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ctralie, I understand your perspective. However, for PR reviews, navigating through binaries takes work. Perhaps something like ReviewNB could be a solution? Just a thought.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick response. I'm curious why you'd look at the binaries directly though. Why not just look at the notebooks themselves, since github renders them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This tool does look nice though @raphaelreinauer. I'm just wondering if, in the interest of time, and in the interest of keeping the toolchain simpler, we can punt this issue for now)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is purely a personal preference (see the Review channel), so please consider it a suggestion to add in the future.

Reviewing a PR, I look at change logs to determine the modifications. The diff in the Jupyter Notebook has both code changes and binaries. However, the binary changes can often overshadow the code modifications, making the PR challenging to review.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another drawback of committing binaries (which could even change often like plots in Jupyter Notebooks) is that the repository size increases significantly as these files remain in the git history.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, understood, and agreed that it is a pain. I do wish that matplotlib saved svgs instead of pngs where possible in notebooks. But thank you for your flexibility here

Large diffs are not rendered by default.

23 changes: 8 additions & 15 deletions docs/notebooks/coil20.ipynb

Large diffs are not rendered by default.

51 changes: 10 additions & 41 deletions docs/notebooks/complexProjectiveOnGenusTwoSurface.ipynb

Large diffs are not rendered by default.

16 changes: 8 additions & 8 deletions docs/notebooks/genusTwoSurface.ipynb

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions docs/notebooks/movingDot.ipynb

Large diffs are not rendered by default.

87 changes: 87 additions & 0 deletions docs/notebooks/parameter_n_samples.ipynb

Large diffs are not rendered by default.

14 changes: 7 additions & 7 deletions docs/notebooks/parameter_perc.ipynb

Large diffs are not rendered by default.

36 changes: 16 additions & 20 deletions docs/notebooks/parameter_standard_range.ipynb

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions docs/notebooks/parameters_n_landmarks_and_cocycle_idx.ipynb

Large diffs are not rendered by default.

28 changes: 14 additions & 14 deletions docs/notebooks/twoSphere.ipynb

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion dreimac/_version.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = "0.3.0"
__version__ = "0.3.1"
1 change: 0 additions & 1 deletion dreimac/circularcoords.py
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,6 @@ def __init__(
maxdim=maxdim,
verbose=verbose,
)
self.type_ = "circ"

def get_coordinates(
self,
Expand Down
25 changes: 12 additions & 13 deletions dreimac/complexprojectivecoords.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,10 +38,9 @@ def __init__(
):
EMCoords.__init__(self, X, n_landmarks, distance_matrix, prime, maxdim, verbose)
simplicial_complex_dimension = 3
self.cns_lookup_table_ = combinatorial_number_system_table(
self._cns_lookup_table = combinatorial_number_system_table(
n_landmarks, simplicial_complex_dimension
)
self.type_ = "complexprojective"

def get_coordinates(
self,
Expand Down Expand Up @@ -101,33 +100,33 @@ def get_coordinates(

# compute boundary matrix
delta1 = CohomologyUtils.make_delta1(
self.dist_land_land_, rips_threshold, self.cns_lookup_table_
self._dist_land_land, rips_threshold, self._cns_lookup_table
)

# lift to integer cocycles
integer_cocycle = CohomologyUtils.lift_to_integer_cocycle(
cocycle, prime=self.prime_
cocycle, prime=self._prime
)

# go from sparse to dense representation of cocycles
integer_cocycle_as_vector = CohomologyUtils.sparse_cocycle_to_vector(
integer_cocycle, self.cns_lookup_table_, self.n_landmarks_, int
integer_cocycle, self._cns_lookup_table, self._n_landmarks, int
)

if check_cocycle_condition:
is_cocycle = _is_two_cocycle(
integer_cocycle_as_vector,
self.dist_land_land_,
self._dist_land_land,
rips_threshold,
self.cns_lookup_table_,
self._cns_lookup_table,
)
if not is_cocycle:
delta2 = CohomologyUtils.make_delta2_compact(
self.dist_land_land_, rips_threshold, self.cns_lookup_table_
self._dist_land_land, rips_threshold, self._cns_lookup_table
)
d2cocycle = delta2 @ integer_cocycle_as_vector.T

y = d2cocycle // self.prime_
y = d2cocycle // self._prime

constraints = LinearConstraint(delta2, y, y, keep_feasible=True)
n_edges = delta2.shape[1]
Expand All @@ -149,7 +148,7 @@ def get_coordinates(
solution = optimizer_solution["x"]
new_cocycle_as_vector = (
integer_cocycle_as_vector
- self.prime_ * np.array(np.rint(solution), dtype=int)
- self._prime * np.array(np.rint(solution), dtype=int)
)
integer_cocycle_as_vector = new_cocycle_as_vector

Expand All @@ -163,9 +162,9 @@ def get_coordinates(
integral,
varphi,
ball_indx,
self.dist_land_land_,
self._dist_land_land,
rips_threshold,
self.cns_lookup_table_,
self._cns_lookup_table,
)

# reduce dimensionality of complex projective space
Expand All @@ -175,7 +174,7 @@ def get_coordinates(
projective_dim_red_mode,
self.verbose,
)
self.variance_ = epca["variance"]
self._variance = epca["variance"]

return epca["X"]

Expand Down
50 changes: 26 additions & 24 deletions dreimac/emcoords.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ def __init__(self, X, n_landmarks, distance_matrix, prime, maxdim, verbose):
"""
assert maxdim >= 1
self.verbose = verbose
self.X_ = X
self._X = X
if verbose:
tic = time.time()
print("Doing TDA...")
Expand All @@ -59,30 +59,31 @@ def __init__(self, X, n_landmarks, distance_matrix, prime, maxdim, verbose):
)
if verbose:
print("Elapsed time persistence: %.3g seconds" % (time.time() - tic))
self.prime_ = prime
self.dgms_ = res["dgms"]
self.idx_land_ = res["idx_perm"]
self.n_landmarks_ = len(self.idx_land_)
self._prime = prime
self._dgms = res["dgms"]
# TODO: the following is kept for backwards compatibility, remove in next interface-breaking version
self.dgms_ = self._dgms
self._idx_land = res["idx_perm"]
self._n_landmarks = len(self._idx_land)
#self.dist_land_data_ = res["dperm2all"]
if distance_matrix is False:
self.dist_land_data_ = res["dperm2all"]
self._dist_land_data = res["dperm2all"]
elif X.shape[0] == X.shape[1]:
self.dist_land_data_ = res["dperm2all"]
self._dist_land_data = res["dperm2all"]
else:
self.dist_land_data_ = X[self.idx_land_,:]
self.coverage_ = np.max(np.min(self.dist_land_data_, 1))
self.dist_land_land_ = self.dist_land_data_[:, self.idx_land_]
self.cocycles_ = res["cocycles"]
self._dist_land_data = X[self._idx_land,:]
self._coverage = np.max(np.min(self._dist_land_data, 1))
self._dist_land_land = self._dist_land_data[:, self._idx_land]
self._cocycles = res["cocycles"]
# Sort persistence diagrams in descending order of persistence
for i in range(1, maxdim+1):
idxs = np.argsort(self.dgms_[i][:, 0] - self.dgms_[i][:, 1])
self.dgms_[i] = self.dgms_[i][idxs, :]
dgm_lifetime = np.array(self.dgms_[i])
idxs = np.argsort(self._dgms[i][:, 0] - self._dgms[i][:, 1])
self._dgms[i] = self._dgms[i][idxs, :]
dgm_lifetime = np.array(self._dgms[i])
dgm_lifetime[:, 1] -= dgm_lifetime[:, 0]
self.cocycles_[i] = [self.cocycles_[i][idx] for idx in idxs]
CohomologyUtils.reindex_cocycles(self.cocycles_, self.idx_land_, X.shape[0])
self._cocycles[i] = [self._cocycles[i][idx] for idx in idxs]
CohomologyUtils.reindex_cocycles(self._cocycles, self._idx_land, X.shape[0])

self.type_ = "emcoords"

def get_representative_cocycle(self, cohomology_class, homological_dimension):
"""
Expand All @@ -102,13 +103,14 @@ def get_representative_cocycle(self, cohomology_class, homological_dimension):
Cohomological birth of the linear combination or single cocycle
cocycle: ndarray(K, homological_dimension+2, dtype=int)
Representative cocycle. First homological_dimension+1 columns are vertex indices,
and last column takes values in finite field corresponding to self.prime_
and last column takes values in finite field corresponding to self._prime.
The number of rows K is the number of simplices on which the cocycle is non-zero.
"""

assert isinstance(cohomology_class, int)

dgm = self.dgms_[homological_dimension]
cocycles = self.cocycles_[homological_dimension]
dgm = self._dgms[homological_dimension]
cocycles = self._cocycles[homological_dimension]
return (
dgm[cohomology_class, 0],
dgm[cohomology_class, 1],
Expand Down Expand Up @@ -144,10 +146,10 @@ def get_cover_radius(self, perc, cohomdeath_rips, cohombirth_rips, standard_rang
raise Exception(
"The cohomology class selected is too short, try setting standard_range to False."
)
self.rips_threshold_ = (1 - perc) * start + perc * end
self.r_cover_ = self.rips_threshold_ / 2
self._rips_threshold = (1 - perc) * start + perc * end
self._r_cover = self._rips_threshold / 2

return self.r_cover_, self.rips_threshold_
return self._r_cover, self._rips_threshold

def get_covering_partition(self, r_cover, partunity_fn):
"""
Expand All @@ -167,7 +169,7 @@ def get_covering_partition(self, r_cover, partunity_fn):
ball_indx: ndarray(n_data, dtype=int)
The index of the first open set each data point belongs to
"""
dist_land_data = self.dist_land_data_
dist_land_data = self._dist_land_data
U = dist_land_data < r_cover
phi = np.zeros_like(dist_land_data)
phi[U] = partunity_fn(dist_land_data[U], r_cover)
Expand Down
7 changes: 3 additions & 4 deletions dreimac/lenscoords.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,6 @@ def __init__(
maxdim=maxdim,
verbose=verbose,
)
self.type_ = "lens"

def get_coordinates(
self,
Expand Down Expand Up @@ -79,7 +78,7 @@ def get_coordinates(

"""

n_landmarks = self.n_landmarks_
n_landmarks = self._n_landmarks

homological_dimension = 1
cohomdeath_rips, cohombirth_rips, cocycle = self.get_representative_cocycle(
Expand All @@ -92,7 +91,7 @@ def get_coordinates(

varphi, ball_indx = EMCoords.get_covering_partition(self, r_cover, partunity_fn)

root_of_unity = np.exp(2 * np.pi * 1j / self.prime_)
root_of_unity = np.exp(2 * np.pi * 1j / self._prime)

cocycle_matrix = np.ones( (n_landmarks, n_landmarks), np.complex_ )
cocycle_matrix[cocycle[:, 0], cocycle[:, 1]] = root_of_unity ** cocycle[:, 2]
Expand All @@ -101,6 +100,6 @@ def get_coordinates(
class_map = np.sqrt(varphi.T) * cocycle_matrix[ball_indx[:], :]

epca = EquivariantPCA.ppca(class_map, complex_dim-1, projective_dim_red_mode, self.verbose)
self.variance_ = epca["variance"]
self._variance = epca["variance"]

return epca["X"]
5 changes: 2 additions & 3 deletions dreimac/projectivecoords.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@ def __init__(self, X, n_landmarks, distance_matrix=False, maxdim=1, verbose=Fals
maxdim=maxdim,
verbose=verbose,
)
self.type_ = "proj"

def get_coordinates(
self,
Expand Down Expand Up @@ -76,7 +75,7 @@ def get_coordinates(

"""

n_landmarks = self.n_landmarks_
n_landmarks = self._n_landmarks

homological_dimension = 1
cohomdeath_rips, cohombirth_rips, cocycle = self.get_representative_cocycle(
Expand All @@ -100,6 +99,6 @@ def get_coordinates(
epca = EquivariantPCA.ppca(
class_map, proj_dim, projective_dim_red_mode, self.verbose
)
self.variance_ = epca["variance"]
self._variance = epca["variance"]

return epca["X"]
Loading