[c++] Add helper methods to convert to ArrowSchema (#3424)
* Add helper methods for converting to and from Arrow constructs

* Remove unnecessary `std::string` conversion to `c_str`

Co-authored-by: John Kerl <[email protected]>

* lint fix

* Switch C-style casts to named casts

* [c++] Column abstraction: `SOMADimension`, part 1 (#3425)

`SOMAColumn` provides an abstraction over TileDB attributes and dimensions and exposes a common interface for all columns, regardless of type. Subclasses of `SOMAColumn` can implement complex indexing mechanisms through additional dimensions, encapsulating that logic in one modular place.

---------

Co-authored-by: John Kerl <[email protected]>
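
To illustrate the column abstraction described in the commit message, here is a minimal sketch. It is not the actual `SOMAColumn` interface: `Column`, `DimensionColumn`, `AttributeColumn`, `ColumnKind`, and `indexed_column_names` are hypothetical names invented for this example, and the assumption that only dimension-backed columns are indexable is a simplification.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are NOT the libtiledbsoma declarations.
enum class ColumnKind { dimension, attribute };

// Common interface that callers use regardless of how a column is stored.
class Column {
   public:
    virtual ~Column() = default;
    virtual std::string name() const = 0;
    virtual ColumnKind kind() const = 0;
    // Whether the column can be used to slice the array (assumed semantics).
    virtual bool is_indexed() const = 0;
};

// A column backed by a dimension: indexable in this sketch.
class DimensionColumn : public Column {
   public:
    explicit DimensionColumn(std::string name)
        : name_(std::move(name)) {
    }
    std::string name() const override {
        return name_;
    }
    ColumnKind kind() const override {
        return ColumnKind::dimension;
    }
    bool is_indexed() const override {
        return true;
    }

   private:
    std::string name_;
};

// A column backed by an attribute: not indexable in this sketch.
class AttributeColumn : public Column {
   public:
    explicit AttributeColumn(std::string name)
        : name_(std::move(name)) {
    }
    std::string name() const override {
        return name_;
    }
    ColumnKind kind() const override {
        return ColumnKind::attribute;
    }
    bool is_indexed() const override {
        return false;
    }

   private:
    std::string name_;
};

// Callers work through the base interface and never branch on storage type.
std::vector<std::string> indexed_column_names(
    const std::vector<std::unique_ptr<Column>>& columns) {
    std::vector<std::string> names;
    for (const auto& column : columns) {
        assert(column != nullptr);
        if (column->is_indexed()) {
            names.push_back(column->name());
        }
    }
    return names;
}
```

The point of the design is that code such as `indexed_column_names` depends only on the base class, so a new column subclass can be added without touching its callers.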
XanthosXanthopoulos and johnkerl authored Jan 2, 2025
1 parent fbeee6f commit 9e7a461
Showing 15 changed files with 2,237 additions and 29 deletions.
2 changes: 1 addition & 1 deletion apis/python/setup.py
@@ -254,7 +254,7 @@ def run(self):
CXX_FLAGS.append(f'-Wl,-rpath,{str(tiledb_dir / "lib")}')

if sys.platform == "darwin":
-CXX_FLAGS.append("-mmacosx-version-min=11.0")
+CXX_FLAGS.append("-mmacosx-version-min=13.3")

if os.name == "posix" and sys.platform != "darwin":
LIB_DIRS.append(str(tiledbsoma_dir / "lib" / "x86_64-linux-gnu"))
4 changes: 4 additions & 0 deletions libtiledbsoma/src/CMakeLists.txt
@@ -33,6 +33,8 @@ add_library(TILEDB_SOMA_OBJECTS OBJECT
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_array.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_group.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_object.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_column.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_dimension.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_collection.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_experiment.cc
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_measurement.cc
@@ -206,6 +208,8 @@ install(FILES
${CMAKE_CURRENT_SOURCE_DIR}/soma/column_buffer.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_array.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_group.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_column.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_dimension.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_collection.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_dataframe.h
${CMAKE_CURRENT_SOURCE_DIR}/soma/soma_dense_ndarray.h
14 changes: 14 additions & 0 deletions libtiledbsoma/src/soma/enums.h
@@ -43,4 +43,18 @@ enum class ResultOrder { automatic = 0, rowmajor, colmajor };
/** Defines whether the SOMAGroup URI is absolute or relative */
enum class URIType { automatic = 0, absolute, relative };

typedef enum {
SOMA_COLUMN_DIMENSION = 0,
SOMA_COLUMN_ATTRIBUTE = 1,
SOMA_COLUMN_GEOMETRY = 2
} soma_column_datatype_t;

// This enables some code deduplication between core domain, core current
// domain, and core non-empty domain.
enum class Domainish {
kind_core_domain = 0,
kind_core_current_domain = 1,
kind_non_empty_domain = 2
};

#endif // SOMA_ENUMS
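
As a rough illustration of the deduplication the `Domainish` comment refers to, the sketch below routes three kinds of domain lookup through one function keyed on the enum. The enum is copied from the hunk above; `Range`, `DimensionRanges`, and `select_range` are hypothetical names for this example and do not appear in libtiledbsoma.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Copied from the enums.h hunk above.
enum class Domainish {
    kind_core_domain = 0,
    kind_core_current_domain = 1,
    kind_non_empty_domain = 2
};

// Hypothetical: an inclusive [lo, hi] range for one integer dimension.
using Range = std::pair<std::int64_t, std::int64_t>;

// Hypothetical holder for the three ranges of a single dimension.
struct DimensionRanges {
    Range core_domain;
    Range core_current_domain;
    Range non_empty_domain;
};

// One accessor serves all three kinds, so callers that differ only in
// which domain they want can share the rest of their code.
Range select_range(const DimensionRanges& ranges, Domainish which) {
    switch (which) {
        case Domainish::kind_core_domain:
            return ranges.core_domain;
        case Domainish::kind_core_current_domain:
            return ranges.core_current_domain;
        case Domainish::kind_non_empty_domain:
            return ranges.non_empty_domain;
    }
    // Only reachable if an out-of-range value was cast to Domainish.
    assert(false && "unhandled Domainish value");
    return ranges.core_domain;
}
```

A caller would pass, for example, `Domainish::kind_non_empty_domain` to get the non-empty range without needing a separate function per domain kind.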
8 changes: 0 additions & 8 deletions libtiledbsoma/src/soma/soma_array.h
@@ -90,14 +90,6 @@ using namespace tiledb;

using StatusAndReason = std::pair<bool, std::string>;

// This enables some code deduplication between core domain, core current
// domain, and core non-empty domain.
enum class Domainish {
kind_core_domain = 0,
kind_core_current_domain = 1,
kind_non_empty_domain = 2
};

class SOMAArray : public SOMAObject {
public:
friend class ManagedQuery;
81 changes: 81 additions & 0 deletions libtiledbsoma/src/soma/soma_column.cc
@@ -0,0 +1,81 @@
/**
* @file soma_column.cc
*
* @section LICENSE
*
* The MIT License
*
* @copyright Copyright (c) 2024 TileDB, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*
* @section DESCRIPTION
*
* This file defines the SOMAColumn class.
*/

#include "soma_column.h"

namespace tiledbsoma {

template <>
std::pair<std::string, std::string> SOMAColumn::core_domain_slot<std::string>()
const {
return std::pair<std::string, std::string>("", "");
}

template <>
std::pair<std::string, std::string>
SOMAColumn::core_current_domain_slot<std::string>(
const SOMAContext& ctx, Array& array) const {
// Here is an intersection of a few oddities:
//
// * Core domain for string dims must be a nullptr pair; it cannot be
//   anything else.
// * TileDB-Py shows this by using an empty-string pair, which we
//   imitate.
// * Core current domain for string dims must _not_ be a nullptr pair.
// * In TileDB-SOMA, unless the user specifies otherwise, we use "" for
//   min and "\x7f" for max. (We could use "\xff" but that causes
//   display problems in Python.)
//
// To work with all these factors, if the current domain is the default
// "" to "\x7f", return an empty-string pair just as we do for domain.
// (There was some pre-1.15 software using "\xff" and it's super-cheap
// to check for that as well.)
try {
std::pair<std::string, std::string>
current_domain = std::any_cast<std::pair<std::string, std::string>>(
_core_current_domain_slot(ctx, array));

if (current_domain.first == "" && (current_domain.second == "\x7f" ||
current_domain.second == "\xff")) {
return std::pair<std::string, std::string>("", "");
}

return current_domain;
} catch (const std::exception& e) {
throw TileDBSOMAError(e.what());
}
}
} // namespace tiledbsoma
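
The sentinel handling in `core_current_domain_slot<std::string>` can be isolated as a small pure function, which makes the rule easier to test on its own. This is a minimal sketch under that assumption: `StringRange` and `normalize_string_current_domain` are hypothetical names, and the real method additionally reads the current domain from the array and converts exceptions to `TileDBSOMAError`.

```cpp
#include <cassert>
#include <string>
#include <utility>

using StringRange = std::pair<std::string, std::string>;

// Hypothetical helper, not part of libtiledbsoma. It mirrors the check in
// the diff above: a current domain of "" to "\x7f" (or the older "\xff"
// upper bound) is treated as "unset" and reported as an empty-string pair;
// any other range is returned unchanged.
StringRange normalize_string_current_domain(const StringRange& current_domain) {
    const bool is_default = current_domain.first.empty() &&
                            (current_domain.second == "\x7f" ||
                             current_domain.second == "\xff");
    return is_default ? StringRange("", "") : current_domain;
}
```

Note that both conditions must hold: a range with a non-empty lower bound is passed through even if its upper bound happens to be one of the sentinel values.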