Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[c++] Fix bug with sliced Arrow-table writes, with string columns #3433

Merged
merged 1 commit into from
Dec 13, 2024

Conversation

johnkerl
Copy link
Member

@johnkerl johnkerl commented Dec 13, 2024

Issue and/or context: #3432 [sc-60767]

Changes:

As described on #3432.

Also, an existing test-case, that might have caught the problem sooner but did not, did not test string columns. That has been modified to also test string columns.

Notes for Reviewer:

@johnkerl johnkerl force-pushed the kerl/arrow-table-slice branch from 2356f18 to faf5071 Compare December 13, 2024 18:29
@johnkerl johnkerl requested a review from nguyenv December 13, 2024 18:38
@johnkerl johnkerl marked this pull request as ready for review December 13, 2024 18:39
@johnkerl johnkerl requested a review from jp-dark December 13, 2024 18:39
Copy link

codecov bot commented Dec 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.27%. Comparing base (382d691) to head (faf5071).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3433      +/-   ##
==========================================
+ Coverage   86.03%   86.27%   +0.24%     
==========================================
  Files          55       55              
  Lines        6278     6295      +17     
==========================================
+ Hits         5401     5431      +30     
+ Misses        877      864      -13     
Flag Coverage Δ
python 86.27% <ø> (+0.24%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
python_api 86.27% <ø> (+0.24%) ⬆️
libtiledbsoma ∅ <ø> (∅)

if (array->n_buffers == 3) {
buf = (UserType*)array->buffers[2] + array->offset;
} else {
buf = (UserType*)array->buffers[1] + array->offset;
}
uint8_t* validity = (uint8_t*)array->buffers[0];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This would be a good opportunity to move away from the C-style cast to a named cast (static_cast in this case).

Comment on lines +970 to +971
// * Whether the column (array) has 3 buffers (validity, offset, data)
// or 2 (validity, data).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this comment confusing since we throw if the column (array) doesn't have 3 buffers.

Copy link
Collaborator

@jp-dark jp-dark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: There is a lot of C-style casting in this code which isn't a big deal, but is a little harder to read and little more error prone that more specific cast.

@johnkerl
Copy link
Member Author

johnkerl commented Dec 13, 2024

Nit: There is a lot of C-style casting in this code which isn't a big deal, but is a little harder to read and little more error prone that more specific cast.

Thanks @jp-dark ! I'll file a follow-up

#3435

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants