
workload: log histogram write/encode failures, close output file #70484

Merged: 1 commit, Sep 21, 2021

Conversation

stevendanna
Collaborator

We are currently observing incomplete histograms being output during
nightly roachperf tpccbench runs.

I don't think the changes here are likely to address the cause, as I
would expect write failures to affect a broader range of roachperf
output. But, it is still good to log any failures we do encounter.

Further, we now sync and close the file explicitly.

Informs #70313

Release note: None

@stevendanna stevendanna requested a review from a team September 21, 2021 10:36
@cockroach-teamcity
Member

This change is Reviewable

@stevendanna stevendanna requested review from a team and erikgrinaker and removed request for a team September 21, 2021 10:43
@tbg
Member

tbg commented Sep 21, 2021

This seems to affect only tpccbench, right?
tpccbench is special in that it calls c.Reset which hard-resets the cluster VMs. I've observed somewhere that this also causes trailing null bytes in the logs: https://cockroachlabs.slack.com/archives/C01CDD4HRC5/p1630404151004600

I think the same thing might be happening here. Perhaps the solution is as easy as making sure that we sync the histograms. Are we doing that?

@tbg
Member

tbg commented Sep 21, 2021

I'm seeing that you're adding an explicit Sync() here - so maybe this does indeed fix the problem. Worth a try for sure.

@stevendanna
Collaborator Author

TFTR!

I'm seeing that you're adding an explicit Sync() here - so maybe this does indeed fix the problem. Worth a try for sure.

Definitely possible. I think I had overlooked this because we were able to Get() the file and then decode it. But those reads could all have been served from the cache even though the data hadn't been flushed to disk yet.

@stevendanna
Collaborator Author

bors r=erikgrinaker

@craig
Contributor

craig bot commented Sep 21, 2021

This PR was included in a batch that was canceled; it will be automatically retried.

@craig
Contributor

craig bot commented Sep 21, 2021

Build succeeded:
