Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tiledb_stats_is_enabled to C and C++ APIs #5395

Merged
merged 11 commits into from
Dec 13, 2024

Conversation

rroelke
Copy link
Contributor

@rroelke rroelke commented Dec 3, 2024

The C API has functions tiledb_stats_enable() and tiledb_stats_disable()` which affect global process state.

A library or application running on top of tiledb might want to enable statistics for a span and then restore the previous state. This is not possible without a way of asking whether stats gathering is currently enabled.

This pull request adds tiledb_stats_is_enabled so that users can check whether stats are currently enabled. Use of this API is demonstrated in tests via ScopedStats.


TYPE: C_API | CPP_API
DESC: add tiledb_stats_is_enabled to C and C++ APIs

@rroelke rroelke force-pushed the rr/sc-60139-tiledb-stats-is-enabled branch from bc5d939 to 3050a7e Compare December 3, 2024 18:29
@rroelke rroelke force-pushed the rr/sc-60139-tiledb-stats-is-enabled branch from 3050a7e to eb352cb Compare December 3, 2024 19:10
@rroelke rroelke requested review from ypatia and ihnorton December 3, 2024 20:13
tiledb/sm/c_api/tiledb.cc Outdated Show resolved Hide resolved
test/support/src/stats.cc Outdated Show resolved Hide resolved
test/src/unit-cppapi-stats.cc Show resolved Hide resolved
tiledb/sm/cpp_api/stats.h Show resolved Hide resolved
tiledb/sm/cpp_api/stats.h Outdated Show resolved Hide resolved
Copy link
Member

@teo-tsirpanis teo-tsirpanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer any nontrivial library or application to avoid using the global stats APIs and instead get stats scoped to a context or query with the tiledb_(ctx|query)_get_stats functions (from a quick read these are always enabled).

If we want to double down on global stats, we should instead add a tiledb_stats_enable_v2(uint8_t* previous); function that also returns whether stats were already enabled, which can be implemented atomically.

tiledb/sm/cpp_api/stats.h Outdated Show resolved Hide resolved
tiledb/sm/cpp_api/stats.h Outdated Show resolved Hide resolved
tiledb/sm/c_api/tiledb.cc Outdated Show resolved Hide resolved
@rroelke
Copy link
Contributor Author

rroelke commented Dec 5, 2024

I would prefer any nontrivial library or application to avoid using the global stats APIs and instead get stats scoped to a context or query with the tiledb_(ctx|query)_get_stats functions (from a quick read these are always enabled).

If we don't want to support these existing APIs then they should be deprecated. But if we don't want to deprecate them then this change must be added for completeness.

Global stats may not be the best tool for an application developer but it is a useful tool for a system monitor so in my opinion deprecation does not make sense.

We certainly will use the context/query stats APIs in tables eventually for finer-grained profiling but my short-term goal is to enable collection and introspection into any degree of metrics.

rroelke and others added 3 commits December 5, 2024 08:47
@teo-tsirpanis
Copy link
Member

I'm not arguing to deprecate the global stats APIs, but they should be toggled only "globally" in the top level of an application or script. Quoting the PR's description (emphasis mine):

enable statistics for a span and then restore the previous state

If an application is toggling global stats deep inside its logic, where it cannot be certain whether they are previously enabled, it should not be using global stats in the first place.

it is a useful tool for a system monitor

Can't the application enable them in the very beginning of its lifetime? And if it wants to skip collecting stats for some operations it can disable them temporarily. My previous sentence does not apply in this case; our imaginary application could track on its own whether stats are enabled at any time, and does not need a new C API.

@rroelke
Copy link
Contributor Author

rroelke commented Dec 6, 2024

our imaginary application could track on its own whether stats are enabled at any time

This is specifically what I want to avoid and the reason I am adding this API.

This is for the tables CLI. Users should be able to turn stats on and off, and check whether they are enabled already. This can be done by tracking the state separately but I consider duplicated state such as that to be the wrong thing to do when there is this alternative.

@teo-tsirpanis
Copy link
Member

Okay I understand now the use case, but still think that tracking this state in the app is the best way to go. I am concerned that adding this API would encourage people to use global stats in places they shouldn't, but am fine if other people agree with adding it.

@ihnorton ihnorton requested a review from ypatia December 12, 2024 15:46
Copy link
Member

@ypatia ypatia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. With regards to the concerns on encouraging the "wrong" way to enable/disable stats with the addition of this API, my only idea would be to remove the example (that I suggested to add in the first place 😅) and fix the example as it is today to show the "correct" way of getting stats for a query.

@rroelke rroelke merged commit c045863 into dev Dec 13, 2024
63 checks passed
@rroelke rroelke deleted the rr/sc-60139-tiledb-stats-is-enabled branch December 13, 2024 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants