-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: TestStatusAPICombinedStatements is flaky #69533
Labels
A-sql-observability
Related to observability of the SQL layer
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Comments
Azhng
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
A-sql-observability
Related to observability of the SQL layer
T-sql-observability
labels
Aug 28, 2021
This does not cause any correctness issue in production. This only causes our test to be non deterministic and flaky |
craig bot
pushed a commit
that referenced
this issue
Aug 28, 2021
68831: ui: add date range selector in stmts and txns pages r=xinhaoz a=xinhaoz Resolves #68089, #69474 This commit adds date range state properties and a date range selector component to the DB console's statements and transactions pages. This change is necessary to support surfacing persisted statistics in the DB console. This commit adds date range state properties and a date range selector component to the DB console's statements and transactions pages. This change is necessary to support surfacing persisted statistics in the DB console. The date range query parameters used by the api to fetch data is stored in localSettings on db console and localStorage in cluster-ui. The default date range is set to 1 hour ago, and is used as the value when user's reset the date range. The user is able to select dates as far back as 1 year ago, this does not mean there is data available. Release justification: category 4 low risk, high benefit changes to existing functionality Release note (ui change): New date range selector component to the DB console's statements page with the ability to show historical data. The default date range is set to 1 hour ago, and is used as the value when user's reset the date range. ------------------ ![image](https://user-images.githubusercontent.com/20136951/131007768-9a7bfeb6-a0f0-417a-8160-51f8d636f73b.png) ![image](https://user-images.githubusercontent.com/20136951/131007809-ff98359e-bfcc-4cb1-9b0b-eb34f8d04cd0.png) ![image](https://user-images.githubusercontent.com/20136951/131036176-11b03aa6-16e5-4601-b92e-e13dd0dc03df.png) ![image](https://user-images.githubusercontent.com/20136951/131198830-6b727e68-0983-483e-9385-f23e16cde705.png) Transactions page: ![image](https://user-images.githubusercontent.com/20136951/131222215-bb2b013c-034e-40bf-8d83-7f5d9784065a.png) 69375: tree: redact name in DEALLOCATE in FmtHideConstants r=ajwerner,maryliag a=rafiss This will make it so that all DEALLOCATE statements show up in one row in the SQL Statements UI and in the internal stats data structures. Recently, a workload was updated (accidentally) to start creating more DEALLOCATE statements, and it made the UI unusable because there were so many DEALLOCATE statement keys. The second commit is a rename to stop implying that statements with redacted constants are "anonymized" -- they still have names+identifiers. This also renames the name of a field in a response payload. Created while working on #69317 but doesn't fix it. Release justification: fix a high-priority bugs with a low-risk change. Release note: None 69457: sql, import: add metrics for max_row_size guardrails r=rytaft,yuzefovich a=michae2 **sql: rename max_row_size guardrails to match transaction row limits** Addresses: #67400 Rename sql.mutations.max_row_size.{log|err} to sql.guardrails.max_row_size_{log|err} for consistency with transaction_rows_{read|written}_{log|err} and upcoming metrics. Release justification: Low-risk update to new functionality. Release note (ops change): New variables sql.mutations.max_row_size.{log|err} were renamed to sql.guardrails.max_row_size_{log|err} for consistency with other variables and metrics. **sql/row: remove dependency on sql/execinfra** The next commit needs to make sql/row a dependency of sql/execinfra, so remove two small references to sql/execinfra. (I will squash this into the next commit before merging.) Release justification: Low-risk update to new functionality. Release note: None **sql, import: plumb RowMetrics from ExecutorConfig to row.Helper** Addresses: #67400 Add metrics for sql/row and pass them down from ExecutorConfig and FlowCtx all the way to row.Helper. Like sql.Metrics, there are two copies, one for user queries and one for internal queries. (These were originally part of sql.Metrics, but there are several users of sql/row that do not operate under a sql.Server or connExecutor so we are forced to add these to the ExecutorConfig and FlowCtx instead.) I ran into difficulty passing these through import, as FlowCtx itself is not plumbed through. There are only two metrics at first, corresponding to sql.guardrails.max_row_size_{log|err}. Release justification: Low-risk update to new functionality. Release note (ops): Added four new metrics, sql.guardrails.max_row_size_{log|err}.count{.internal} which are incremented whenever a large row violates sql.guardrails.max_row_size_{log|err}. 69535: server: remove non-deterministic test r=maryliag a=maryliag A new test was added but it would fail depending on the time it was executed. This commit remove this flaky test so in a future PR we can add new set of tests that make more sense to test this functionality. Fixes #69533 Release justification: Category 3 bug fix Release note: None Co-authored-by: Xin Hao Zhang <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Michael Erickson <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>
xinhaoz
added a commit
to xinhaoz/cockroach
that referenced
this issue
Aug 30, 2021
Fixes: cockroachdb#69557 As mentioned in cockroachdb#69533, we have a race condition in tests where we request statements stats with start=now. In these tests we expect to see no results, but because in-memory stats have the aggregated_ts field set on iterator return, depending on the time the test is run we might see results returned. For example, suppose we insert stats at 15:45. If we then request stats at 16:00, requesting only stats aggregated after or at the current time, i.e. start=16:00, the aggregated_ts for in-memory stats will be set to 16:00 and thus we will see results returned. To deflake these tests, we stub the aggregated_ts field to a predetermined value. Release justification: non-production code changes Release note: None
craig bot
pushed a commit
that referenced
this issue
Aug 30, 2021
69591: sql: deflake TestStatusAPIStatements and TestStatusAPICombinedStatements r=xinhaoz a=xinhaoz Fixes: #69557 As mentioned in #69533, we have a race condition in tests where we request statements stats with start=now. In these tests we expect to see no results, but because in-memory stats have the aggregated_ts field set on iterator return, depending on the time the test is run we might see results returned. For example, suppose we insert stats at 15:45. If we then request stats at 16:00, requesting only stats aggregated after or at the current time, i.e. start=16:00, the aggregated_ts for in-memory stats will be set to 16:00 and thus we will see results returned. To deflake these tests, we stub the aggregated_ts field to a predetermined value. Release justification: non-production code changes Release note: None 69592: sql: proper version gate sql stats r=maryliag,ajwerner a=Azhng Previously, SQL Stats's implementation for version gating is faulty. This means that SQL Stats's job monitor would attempt to start sql stats compaction job in an incompatible cluster. This commit fixed the faulty implementation. Resolves #69459 Resolves #69544 Resolves #69565 Release justification: Category 2: Bug fixes and low-risk updates to new functionality Release note: None Co-authored-by: Xin Hao Zhang <[email protected]> Co-authored-by: Azhng <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-sql-observability
Related to observability of the SQL layer
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
It's happening on master. https://teamcity.cockroachdb.com/viewLog.html?buildId=3376012&tab=buildResultsDiv&buildTypeId=Cockroach_UnitTests_Test
My guess is it probably come from here. It's likely a time-related race condition.
In-memory stats gets assigned
aggregated_ts
when it's returned from the iterator API. The timestamp for in-memory is generated by truncating the current timestamp to nearest hour.So concretely, suppose we insert stats at 15:43, it will have
aggregated_ts
= 15:00. If we issue a read with read timestamp at 16:00, and the in-memory stats is not yet flushed to disk, the in-memory stats with be returned withaggregated_ts
= 16:00. Hence the test failure.Solution here would be use the existing testing knobs in SQL stats to stub out the call to timeutil.Now() function to make sure the test is 100% deterministic.
The text was updated successfully, but these errors were encountered: