GH-41329: [C++][Gandiva] Fix gandiva cache size env var #41330

Merged
pitrou merged 13 commits into apache:main from the fix-gandiva-cache branch on May 14, 2024

Conversation

@zanmato1984 (Contributor) commented Apr 22, 2024

Rationale for this change

Gandiva cache size validity checks are not robust enough (the negativity test is broken), and they are not currently tested.

What changes are included in this PR?

  1. Fix the validity check for the Gandiva cache size env var.
  2. Make the cache size static so it is evaluated only once.
  3. Add test cases.
  4. Expand the documentation of this env var.

Are these changes tested?

Yes, unit tests are included.

Are there any user-facing changes?

None.

@zanmato1984
Contributor Author

cc @pitrou

⚠️ GitHub issue #41329 has been automatically assigned in GitHub to PR creator.

namespace internal {
// Only called once by GetCapacity().
// Do the actual work of getting the capacity from env var.
// Also makes the testing easier.
Contributor Author

GoogleTest cannot re-initialize a static variable, so this dedicated function does the actual work eagerly. We can then test it instead of GetCapacity (which holds the static variable).
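
For readers following this thread, the split described here can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the `demo` namespace, the `DEMO_CACHE_SIZE` variable name, the default of 5000, and the `ParseCapacity` helper are all assumptions made for the example, and parsing uses the standard `std::from_chars` rather than any Arrow utility.

```cpp
#include <charconv>
#include <cstdlib>
#include <cstring>
#include <system_error>

namespace demo {

// Hypothetical default, chosen only for this sketch.
constexpr int kDefaultCapacity = 5000;

namespace internal {

// Pure helper: turns the raw env var text into a capacity.
// Falls back to the default for null, empty, malformed, out-of-range,
// or non-positive input.
inline int ParseCapacity(const char* raw) {
  if (raw == nullptr || *raw == '\0') return kDefaultCapacity;
  int value = 0;
  const char* end = raw + std::strlen(raw);
  auto [ptr, ec] = std::from_chars(raw, end, value);
  if (ec != std::errc() || ptr != end || value <= 0) return kDefaultCapacity;
  return value;
}

// Re-reads the environment on every call, so a test can change the
// variable between calls. "DEMO_CACHE_SIZE" is a made-up name.
inline int GetCapacityFromEnvVar() {
  return ParseCapacity(std::getenv("DEMO_CACHE_SIZE"));
}

}  // namespace internal

// The function-local static is initialized exactly once, on first call;
// later changes to the environment are not observed here.
inline int GetCapacity() {
  static const int capacity = internal::GetCapacityFromEnvVar();
  return capacity;
}

}  // namespace demo
```

Because only `GetCapacity()` holds the static, tests can exercise the internal helper repeatedly with different inputs without fighting one-time initialization.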

@github-actions github-actions bot added awaiting committer review and removed awaiting review labels Apr 22, 2024
cpp/src/gandiva/cache.cc (outdated, resolved)
Member

@pitrou pitrou left a comment

Thanks for this. Just some minor comments, otherwise LGTM.

TEST(TestCache, TestGetCacheCapacityEnvVar) {
// Uncleared env var may have side-effect to subsequent tests. Use a structure to help
// clearing the env var when leaving the scope.
struct ScopedEnvVar {
Member

Can use the existing EnvVarGuard from our testing utilities.

Contributor Author

Ah, that's great to know. Will do. Thank you!
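
As background for this thread, a scope guard for an environment variable can be sketched as follows. This is not Arrow's `EnvVarGuard` (its interface is not shown in this conversation); the `ScopedEnvVar` class below is a hypothetical, POSIX-only illustration built on `setenv`/`unsetenv`, and `DEMO_SCOPED_VAR` in the usage note is a made-up name.

```cpp
#include <stdlib.h>  // setenv, unsetenv (POSIX)

#include <cstdlib>
#include <optional>
#include <string>
#include <utility>

// Sets an environment variable for the lifetime of the object and restores
// the previous state (old value, or unset) on destruction.
class ScopedEnvVar {
 public:
  ScopedEnvVar(std::string name, const std::string& value)
      : name_(std::move(name)) {
    // Remember the old value, if any, before overwriting it.
    if (const char* old = std::getenv(name_.c_str())) previous_ = old;
    ::setenv(name_.c_str(), value.c_str(), /*overwrite=*/1);
  }

  ~ScopedEnvVar() {
    if (previous_) {
      ::setenv(name_.c_str(), previous_->c_str(), /*overwrite=*/1);
    } else {
      ::unsetenv(name_.c_str());
    }
  }

  // Non-copyable: two guards restoring the same variable would conflict.
  ScopedEnvVar(const ScopedEnvVar&) = delete;
  ScopedEnvVar& operator=(const ScopedEnvVar&) = delete;

 private:
  std::string name_;
  std::optional<std::string> previous_;
};
```

Declaring `ScopedEnvVar guard("DEMO_SCOPED_VAR", "123");` inside a test body keeps the variable set until the end of the enclosing scope, so a failing assertion or early return cannot leak it into later tests.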

GANDIVA_EXPORT
int GetCapacityFromEnvVar();
} // namespace internal

GANDIVA_EXPORT
int GetCapacity();
Member

Can this be called GetCacheCapacity? The current name is too imprecise.

Contributor Author

Yes, it can. But is this public API?

Member

I have no idea :-) You could deprecate the old API if we want to ensure a smoother migration.

Contributor Author

Addressed by deprecating the old API and adding the renamed one.
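
The rename-plus-deprecation approach can be sketched in plain C++ as below. This only illustrates the general pattern: it uses the standard `[[deprecated]]` attribute rather than whatever macro the project uses, and the `demo` namespace and the placeholder return value are assumptions for the example.

```cpp
namespace demo {

// New, more precise name. The body is a placeholder for this sketch.
inline int GetCacheCapacity() { return 5000; }

// Old name kept as a thin forwarding wrapper so existing callers still
// compile, but get a compiler warning pointing at the replacement.
[[deprecated("Use GetCacheCapacity() instead")]]
inline int GetCapacity() { return GetCacheCapacity(); }

}  // namespace demo
```

Keeping the old function as a forwarder means both names always agree, and the wrapper can be deleted in a later release once callers have migrated.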

Comment on lines +41 to +42
bool ok = ::arrow::internal::ParseValue<::arrow::Int32Type>(
env_value.c_str(), env_value.size(), &capacity);
Contributor Author

Updated the parsing according to the recommendation from #41335 (comment)

cpp/src/gandiva/cache.cc (outdated, resolved)
Member

@pitrou pitrou left a comment

+1, thank you @zanmato1984

@zanmato1984
Contributor Author

Shall we move on with this? @pitrou

@pitrou pitrou force-pushed the fix-gandiva-cache branch from c61ae51 to eaaa659 May 14, 2024 12:15
@pitrou
Member

pitrou commented May 14, 2024

Sorry for forgetting about this PR. I've rebased and will merge if CI is green.

@pitrou pitrou merged commit e6ab174 into apache:main May 14, 2024
33 of 34 checks passed
@pitrou pitrou removed the awaiting committer review label May 14, 2024
After merging your PR, Conbench analyzed the 7 benchmarking runs that have been run so far on merge-commit e6ab174.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 11 possible false positives for unstable benchmarks that are known to sometimes produce them.

vibhatha pushed a commit to vibhatha/arrow that referenced this pull request May 25, 2024
…#41330)

* GitHub Issue: apache#41329

Lead-authored-by: Ruoxi Sun <[email protected]>
Co-authored-by: Rossi Sun <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
JerAguilon pushed a commit to JerAguilon/arrow that referenced this pull request May 29, 2024