-
-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial CSC/ CSR classes #442
Conversation
I'm okay with this; if you pass the correct compressed dimensions; the
I suggest we document that it can return any |
Codecov Report
@@ Coverage Diff @@
## master #442 +/- ##
==========================================
+ Coverage 95.61% 95.80% +0.19%
==========================================
Files 20 20
Lines 2939 3002 +63
==========================================
+ Hits 2810 2876 +66
+ Misses 129 126 -3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given this a shallow review, deeper one to come later.
@hameerabbasi, what did you mean exactly by this? This is what I was thinking of with the Return types topic above. Does this mean modifying all of:
I'm wondering if these can be split up into separate PRs, since they touching different parts of the codebase. Right now I'm thinking that dimension preserving operations with Could (Added in 8f21503) |
Yes.
Yes, I'm fine with that.
I'd say they should return these types, but I'd prefer not to be locked into that decision. Which is why we return the correct type, but only document we're returning a
That should be done, yes, at the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general question
sparse/_compressed/compressed.py
Outdated
|
||
if not isinstance(self, Compressed2d): | ||
self.compressed_axes = ( | ||
tuple(compressed_axes) | ||
if isinstance(compressed_axes, Iterable) | ||
else None | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this change necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, I don't like it because it makes the superclass depend on the subclass -- That's bad coding practice, usually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise self.compressed_axes
would be mutable for Compressed2d
classes, since we'd have assignment here.
I'm open to other suggestions. Alternatives I've considered are:
- Separate
__init__
methods – my initial solution setattr(csr. "compressed_axes", x)
is a no-op – but it really should be an error- Check that the new value is correct – this is adding moving parts to an immutable property. I don't like making changes to the subclasses based purely on the implementation of super classes.
- Higher level super class. This might be adding too much indirection.
I don't like it because it makes the superclass depend on the subclass
I agree, but think that super classes have to be designed around allowing subclassing. I think this is a perennial python problem (e.g. how easy is it to subclass an ndarray or dataframe?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way to go about this is to make it immutable in this PR. Just rename it to _compressed_axes
and also add a @property
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another is to just have it mutable for now, and make it immutable later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use composition instead of inheritance and only subclass, SparseArray
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer your question: I don't see anyone relying on subclass behaviour too much, just the fact that it has all the operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just the fact that it has all the operations.
Yeah, this is an important part. I would like to think more about the type structure before making any big decisions on it, and would like to get this merged so operations can be defined and test/ benchmark suites can be built. I imagine experience while adding to the classes – as well as getting more eyes on it – will help clarify here.
To that end, what's the minimum fix here? To me it's:
- Leave as is
- Split common initialization into it's own function
I like 1, since it's the smallest amount of code that allows both behaviors. It also has very little indirection. I'd be happy to implement 2 if that's a better solution to you. Either of these would come with an issue to think about type structure.
I don't see anyone relying on subclass behaviour too much
I think it will come up, especially as typing continues to mature. I bet it's going to come up while writing the dispatch methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best fix is to leave it mutable (#442 (comment)) and move mutability to the next PR. I'll mark the issue as a release blocker if immutability is that important to you. GCXS
wasn't released, and so we're fine changing that behavior.
As a side note we may be overthinking the immutability, COO
is most often used, and it holds even data
and coords
as attributes. Hardly anyone I've come across modifies them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a commit where it's kept as immutable, but we don't need the instance check. Just makes sure it's not set to anything other that what it should be.
Added in 8f21503
👍. Where would this have been documented? Is it sufficient that |
We don't have to do it in this PR, but the docstring of |
This is in, thanks @ivirshup! |
Quick comment from the gallery: this is very fun to see :) |
As discussed in #433. The goal of this PR is primarily to get skeletons for these classes that work, while fast implementations of indices and broadcasting can be added in subsequent PRs.
Open questions
Class structure
Current design:
CSC
andCSR
classes subclassGCXS
. Is this a good idea, or should a non-user visible super class be added? It may be strange that:Return types
What is the result type of
CSR + CSC
? What aboutCOO + CSC
?Do we follow
scipy
convention here?