Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++] Use valid ASCII for wide-as-possible string current domain #3367

Merged
merged 3 commits into from
Nov 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions libtiledbsoma/src/soma/soma_array.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1046,9 +1046,8 @@ void SOMAArray::_set_soma_joinid_shape_helper(
case TILEDB_CHAR:
case TILEDB_GEOM_WKB:
case TILEDB_GEOM_WKT:
// TODO: make these named constants b/c they're shared
// with arrow_adapter.
ndrect.set_range(dim_name, "", "\xff");
// See comments in soma_array.h.
ndrect.set_range(dim_name, "", "\x7f");
break;

case TILEDB_INT8:
Expand Down Expand Up @@ -1427,8 +1426,9 @@ void SOMAArray::_set_domain_helper(
auto lo_hi = ArrowAdapter::get_table_string_column_by_index(
newdomain, i);
if (lo_hi[0] == "" && lo_hi[1] == "") {
// Don't care -> as big as possible
ndrect.set_range(dim_name, "", "\xff");
// Don't care -> as big as possible.
// See comments in soma_array.h.
ndrect.set_range(dim_name, "", "\x7f");
} else {
throw TileDBSOMAError(std::format(
"domain (\"{}\", \"{}\") cannot be set for "
Expand Down
11 changes: 6 additions & 5 deletions libtiledbsoma/src/soma/soma_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -906,13 +906,14 @@ class SOMAArray : public SOMAObject {
// imitate.
// * Core current domain for string dims must _not_ be a nullptr pair.
// * In TileDB-SOMA, unless the user specifies otherwise, we use "" for
// min and "\xff" for max.
// * However, "\xff" causes display problems in Python. It's also
// flat-out confusing to show to users.
// min and "\x7f" for max. (We could use "\x7f" but that causes
// display problems in Python.)
//
// To work with all these factors, if the current domain is the default
// "" to "\xff", return an empty-string pair just as we do for domain.
if (arr[0] == "" && arr[1] == "\xff") {
// "" to "\7f", return an empty-string pair just as we do for domain.
// (There was some pre-1.15 software using "\xff" and it's super-cheap
// to check for that as well.)
if (arr[0] == "" && (arr[1] == "\x7f" || arr[1] == "\xff")) {
return std::pair<std::string, std::string>("", "");
} else {
return std::pair<std::string, std::string>(arr[0], arr[1]);
Expand Down
5 changes: 3 additions & 2 deletions libtiledbsoma/src/utils/arrow_adapter.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1126,8 +1126,9 @@ ArraySchema ArrowAdapter::tiledb_schema_from_arrow_schema(
std::string hi = strings[4];
if (lo == "" && hi == "") {
// These mean "I the caller don't care, you
// libtiledbsoma make it as big as possible"
ndrect.set_range(col_name, "", "\xff");
// libtiledbsoma make it as big as possible".
// See also comments in soma_array.h.
ndrect.set_range(col_name, "", "\x7f");
} else {
ndrect.set_range(col_name, lo, hi);
LOG_DEBUG(std::format(
Expand Down
Loading