[r] Support enumerations #1558

Closed
eddelbuettel opened this issue Jul 28, 2023 · 1 comment

eddelbuettel commented Jul 28, 2023

Is your feature request related to a problem? Please describe.

TileDB 2.17 will bring enumerations support. The SOMA project is eager to take advantage of this. Parent issue: #866

Describe the solution you'd like

A branch contains development, conditional on 2.17, to read enumerations and return them. Preliminary write support (going via tiledb-r) is also implemented.

Describe alternatives you've considered

N/A

Additional context

#866

@eddelbuettel eddelbuettel self-assigned this Jul 28, 2023
@johnkerl johnkerl changed the title [r] Support Enumerations [r] Support enumerations Jul 31, 2023
@johnkerl johnkerl added r-api and removed r-api labels Sep 11, 2023
@johnkerl
Member

Closing in favor of #866

ihnorton pushed a commit that referenced this issue Sep 15, 2023
As described in #1558 and #866, adding enumeration support is desirable once TileDB Embedded 2.17 is available.

**Changes:**

This PR supports reading columns with enumerations (also known as dictionaries in Arrow, or factor variables in R) directly via Arrow. Preliminary write support is also available, but writes still go through the `tiledb` R package.

**Notes for Reviewer:**

~This PR is now work-in-progress and not ready for a merge while we await TileDB 2.17.~ The branch and PR are ready but should only be merged once the prerequisites have been merged. It likely needs #1519 (C++ side) and #1663 (CI support).

CI is turned off as the TileDB default build is still without support for enumerations.
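
For readers unfamiliar with the terminology, here is a minimal sketch of what "a column with an enumeration" means: the distinct values are stored once, and each row holds an integer code pointing into them. This is plain Python for illustration only. `DictionaryColumn` and `encode` are hypothetical names, not part of the TileDB, tiledbsoma, or Arrow APIs.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class DictionaryColumn:
    """Hypothetical stand-in for a dictionary-encoded (enumerated) column."""
    levels: List[str]            # distinct values, stored once
    codes: List[Optional[int]]   # per-row index into `levels`; None marks a missing value
    ordered: bool = False        # whether the order of `levels` is meaningful

    def decode(self) -> List[Optional[str]]:
        """Expand the codes back into plain values."""
        return [None if c is None else self.levels[c] for c in self.codes]


def encode(values: Sequence[Optional[str]], ordered: bool = False) -> DictionaryColumn:
    """Build a dictionary-encoded column, assigning codes in order of first appearance."""
    levels: List[str] = []
    index = {}
    codes: List[Optional[int]] = []
    for v in values:
        if v is None:
            codes.append(None)
            continue
        if v not in index:
            index[v] = len(levels)
            levels.append(v)
        codes.append(index[v])
    return DictionaryColumn(levels, codes, ordered)


raw = ["B cell", "T cell", "B cell", None, "T cell"]
col = encode(raw)
```

Here `col.levels` holds the two distinct labels and `col.codes` holds one small integer per row, which is roughly the shape an R factor or an Arrow dictionary array takes.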
johnkerl added a commit that referenced this issue Sep 15, 2023
* **Issue and/or context:**

As described in #1558 and #866, adding enumeration support is desirable once TileDB Embedded 2.17 is available.

**Changes:**

This PR supports reading columns with enumerations (also known as dictionaries in Arrow, or factor variables in R) directly via Arrow. Preliminary write support is also available, but writes still go through the `tiledb` R package.

**Notes for Reviewer:**

~This PR is now work-in-progress and not ready for a merge while we await TileDB 2.17.~ The branch and PR are ready but should only be merged once the prerequisites have been merged. It likely needs #1519 (C++ side) and #1663 (CI support).

CI is turned off as the TileDB default build is still without support for enumerations.

* **Issue and/or context:**

This PR adds support for returning Arrow tables with dictionaries that can include `ordered` enumerations.

**Changes:**

Given #1559, on which this PR depends, this is a very small change to just three files in `libtiledbsoma`.

This should become clearer once the dependent PR is merged and can be rebased.

**Notes for Reviewer:**

[SC 34073](https://app.shortcut.com/tiledb-inc/story/34073/c-add-ordered-support-to-arrow-export)
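
As a rough illustration of why the `ordered` flag needs to survive the export: when an enumeration is ordered, values compare by their position in the level list rather than alphabetically. The sketch below is plain Python under that assumption. `less_than` is a hypothetical helper, not an Arrow or tiledbsoma function, and raising an error for unordered enumerations is a choice made for this sketch only.

```python
from typing import List


def less_than(levels: List[str], ordered: bool, a: str, b: str) -> bool:
    """Compare two enumeration values by their position in `levels`.

    Refuses to compare when the enumeration is not flagged as ordered
    (an illustrative choice for this sketch).
    """
    if not ordered:
        raise TypeError("comparison is undefined for an unordered enumeration")
    return levels.index(a) < levels.index(b)


dose_levels = ["low", "medium", "high"]

# With ordered=True, "low" sorts before "high" because of level position,
# even though "high" comes first alphabetically.
by_level = less_than(dose_levels, True, "low", "high")
by_alphabet = "low" < "high"
```

If the flag were dropped on export, a consumer would have no way to tell these two kinds of enumeration apart.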

* **Issue and/or context:**

This PR extends the `schema()` function to return an Arrow schema with enumerations including `ordered`.

**Changes:**

Given #1559, on which this PR depends, this is a very small change to just one file.

This should become clearer once the dependent PR is merged and can be rebased.

**Notes for Reviewer:**

[SC 34074](https://app.shortcut.com/tiledb-inc/story/34074/c-add-ordered-support-to-arrow-export)

* [c++] Test fixes for #1559 (#1684)

* ihn/bugfix

* unit-test update

* lint

---------

Co-authored-by: John Kerl <[email protected]>