Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-43142: [C++][Parquet] Refactor Encryptor API to use arrow::util::span instead of raw pointers #43195

Merged
merged 12 commits into from
Jul 11, 2024

Conversation

adamreeve
Copy link
Contributor

@adamreeve adamreeve commented Jul 9, 2024

Rationale for this change

See #43142. This is a follow up to #43071 which refactored the Decryptor API and added extra checks to prevent segfaults. This PR makes similar changes to the Encryptor API for consistency and better maintainability.

What changes are included in this PR?

  • Change AesEncryptor::Encrypt and Encryptor::Encrypt to use arrow::util::span instead of raw pointers
  • Replace the AesEncryptor::CiphertextSizeDelta method with a CiphertextLength method that checks for overflow and abstracts the size difference behaviour away from consumer code for improved readability.

Are these changes tested?

  • This is mostly a refactoring of existing code so is covered by existing tests.

Are there any user-facing changes?

No

@adamreeve adamreeve requested a review from wgtmac as a code owner July 9, 2024 02:13
Copy link

github-actions bot commented Jul 9, 2024

⚠️ GitHub issue #43142 has been automatically assigned in GitHub to PR creator.

@adamreeve
Copy link
Contributor Author

adamreeve commented Jul 9, 2024

There's a CI failure due to a segfault running arrow-dataset-file-parquet-encryption-test, but it looks like this is happening on main too, and has been reported in #43057

Given I'm working in this area already I'll take a look and see if I can reproduce this, but I don't think it should be a blocker for this PR.

cpp/src/parquet/encryption/encryption_internal.cc Outdated Show resolved Hide resolved
cpp/src/parquet/encryption/encryption_internal.cc Outdated Show resolved Hide resolved
cpp/src/parquet/encryption/encryption_internal.cc Outdated Show resolved Hide resolved
cpp/src/parquet/encryption/encryption_internal.cc Outdated Show resolved Hide resolved
cpp/src/parquet/encryption/encryption_internal.cc Outdated Show resolved Hide resolved
cpp/src/parquet/metadata.cc Outdated Show resolved Hide resolved
cpp/src/parquet/metadata.cc Outdated Show resolved Hide resolved
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except for one nit (see comment about int vs. unsigned)

@pitrou
Copy link
Member

pitrou commented Jul 11, 2024

@github-actions crossbow submit -g cpp

Copy link

Revision: 20e1aaf

Submitted crossbow builds: ursacomputing/crossbow @ actions-5626e3c393

Task Status
test-alpine-linux-cpp GitHub Actions
test-build-cpp-fuzz GitHub Actions
test-conda-cpp GitHub Actions
test-conda-cpp-valgrind GitHub Actions
test-cuda-cpp GitHub Actions
test-debian-12-cpp-amd64 GitHub Actions
test-debian-12-cpp-i386 GitHub Actions
test-fedora-39-cpp GitHub Actions
test-ubuntu-20.04-cpp GitHub Actions
test-ubuntu-20.04-cpp-bundled GitHub Actions
test-ubuntu-20.04-cpp-minimal-with-formats GitHub Actions
test-ubuntu-20.04-cpp-thread-sanitizer GitHub Actions
test-ubuntu-22.04-cpp GitHub Actions
test-ubuntu-22.04-cpp-20 GitHub Actions
test-ubuntu-22.04-cpp-emscripten GitHub Actions
test-ubuntu-22.04-cpp-no-threading GitHub Actions
test-ubuntu-24.04-cpp GitHub Actions
test-ubuntu-24.04-cpp-gcc-14 GitHub Actions

@pitrou pitrou merged commit 6e438e6 into apache:main Jul 11, 2024
34 checks passed
@pitrou pitrou removed the awaiting committer review Awaiting committer review label Jul 11, 2024
@pitrou
Copy link
Member

pitrou commented Jul 11, 2024

Thanks again for doing this @adamreeve !

@pitrou
Copy link
Member

pitrou commented Jul 11, 2024

@raulcd This is 17.0.0 material if you ever need to cut another RC.

@raulcd
Copy link
Member

raulcd commented Jul 11, 2024

This is 17.0.0 material if you ever need to cut another RC.

I've added the backport-candidate label to the issue

@adamreeve adamreeve deleted the encryptor_span branch July 11, 2024 21:34
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 6e438e6.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants