[c++] Column abstraction: SOMADimension, part 1 #3425
base: xan/sc-59427/arrow-helpers
…nt domain checks, replace vector with span when selecting points
I'm only partway through this PR. I'm still struggling with this (now) 1,875-line PR -- it's doing a lot of things. I'll need to pause and work on some other things for a while, and come back to this. (The other 3 PRs you split out that were smaller were nicely self-contained and self-descriptive and I was able to understand and review them earlier today.)
current_domain = std::any_cast<std::pair<std::string, std::string>>(
    _core_current_domain_slot(ctx, array));

if (current_domain.first == "" && (current_domain.second == "\x7f" ||
There was a comment here about why both \x7f and \xff -- it needs to be retained. This is very confusing without that comment. (Which is why I put the comment in.)
TileDB-SOMA/libtiledbsoma/src/soma/soma_array.h
Lines 901 to 920 in a8349a5
// Here is an intersection of a few oddities:
//
// * Core domain for string dims must be a nullptr pair; it cannot be
//   anything else.
// * TileDB-Py shows this by using an empty-string pair, which we
//   imitate.
// * Core current domain for string dims must _not_ be a nullptr pair.
// * In TileDB-SOMA, unless the user specifies otherwise, we use "" for
//   min and "\x7f" for max. (We could use "\x7f" but that causes
//   display problems in Python.)
//
// To work with all these factors, if the current domain is the default
// "" to "\7f", return an empty-string pair just as we do for domain.
// (There was some pre-1.15 software using "\xff" and it's super-cheap
// to check for that as well.)
if (arr[0] == "" && (arr[1] == "\x7f" || arr[1] == "\xff")) {
    return std::pair<std::string, std::string>("", "");
} else {
    return std::pair<std::string, std::string>(arr[0], arr[1]);
}
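One way to keep the explanation attached to the check is to pull the check into a small, documented helper. The following is only a sketch under assumptions: `normalize_string_current_domain` is a hypothetical name, not a function in TileDB-SOMA, and it takes the already-extracted pair instead of reading it from the array.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not part of TileDB-SOMA.
// Maps the default string current domain onto the empty-string pair that is
// reported for the domain. "\x7f" is the default upper bound described in the
// quoted comment; "\xff" is the value that comment attributes to pre-1.15
// software.
inline std::pair<std::string, std::string> normalize_string_current_domain(
    const std::pair<std::string, std::string>& current_domain) {
    const bool default_upper = current_domain.second == "\x7f" ||
                               current_domain.second == "\xff";
    if (current_domain.first.empty() && default_upper) {
        return {"", ""};
    }
    // Any user-specified bounds are passed through unchanged.
    return current_domain;
}
```

With a helper like this, the comment about the two sentinel values lives in one place and the behavior can be unit-tested without constructing an array.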
This looks good for the most part, but it's so big that it's hard to follow everything. Let's do a code walkthrough together next time we are both working.
I also added a couple of small comments, and John's comment about the missing code comments still needs to be addressed.
using namespace tiledb;

class SOMADimension : public virtual SOMAColumn {
Why virtual inheritance? As far as I can tell, nothing inherits from SOMADimension.
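For context on the question: virtual inheritance only changes anything when a base class is reachable through more than one inheritance path. The sketch below uses hypothetical stand-in types (`Column`, `PlainDimension`, `Indexed`, `Nullable`, `IndexedNullable`), none of which are classes from this PR, to contrast the single-path case with a diamond.

```cpp
#include <string>

// Hypothetical stand-ins; these are not the classes from the PR.
struct Column {
    virtual ~Column() = default;
    virtual std::string name() const = 0;
};

// Single inheritance path: there is exactly one Column subobject with or
// without `virtual`, so plain public inheritance is sufficient here.
struct PlainDimension : public Column {
    std::string name() const override { return "dim"; }
};

// Diamond: two intermediate classes share one Column base. Without `virtual`
// on both, IndexedNullable would contain two Column subobjects and the
// conversion to Column& would be ambiguous.
struct Indexed : public virtual Column {
    bool indexed() const { return true; }
};
struct Nullable : public virtual Column {
    bool nullable() const { return true; }
};
struct IndexedNullable final : public Indexed, public Nullable {
    std::string name() const override { return "both"; }
};

// Accepts any column through the shared base.
inline std::string describe(const Column& c) { return c.name(); }
```

If no class is expected to combine SOMADimension with another SOMAColumn subclass, the single-path form is the relevant one.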
#include "soma/soma_column.h"
#include "soma/soma_dimension.h"
Looks like you remove these in a future PR. Might as well just not add them in to begin with.
b9e6046 to 16bb437
SOMAColumn provides an abstraction over TileDB attributes and dimensions and exposes a common interface for all columns regardless of type. Subclasses of SOMAColumn can implement complex indexing mechanisms through additional dimensions, encapsulating that logic in one place and keeping it modular.
Subsequent PRs will add implementations for dimensions and attributes.
Throughout this PR there is extensive use of std::any to enable polymorphism across the different SOMAColumn types while maintaining a templated interface on the abstract SOMAColumn.
Notes for Reviewer:
This PR introduces the abstract SOMAColumn class and the SOMADimension concrete class with basic unit tests.
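The std::any pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `ColumnSketch`, `Int64DimensionSketch`, `domain`, and `domain_slot` are hypothetical names chosen for the example.

```cpp
#include <any>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical sketch; names and signatures are illustrative only.
class ColumnSketch {
   public:
    virtual ~ColumnSketch() = default;

    // Templated, non-virtual entry point: the caller states the expected
    // type. std::any_cast throws std::bad_any_cast if the stored type differs.
    template <typename T>
    std::pair<T, T> domain() const {
        return std::any_cast<std::pair<T, T>>(domain_slot());
    }

   protected:
    // Member templates cannot be virtual, so the virtual hook returns a
    // type-erased std::any that carries the typed pair across the call.
    virtual std::any domain_slot() const = 0;
};

class Int64DimensionSketch : public ColumnSketch {
   public:
    Int64DimensionSketch(std::int64_t lo, std::int64_t hi)
        : lo_(lo), hi_(hi) {}

   protected:
    std::any domain_slot() const override {
        // Stores a std::pair<std::int64_t, std::int64_t>.
        return std::make_pair(lo_, hi_);
    }

   private:
    std::int64_t lo_;
    std::int64_t hi_;
};
```

A caller holding a `const ColumnSketch&` would write `column.domain<std::int64_t>()`. Requesting a mismatched type such as `domain<std::string>()` fails at run time with `std::bad_any_cast`, not at compile time, which is the main trade-off of this approach.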