roachtest: jepsen run an ntp server on controller node to enable clock skew tests #74401

Closed
aliher1911 opened this issue Jan 4, 2022 · 2 comments
Assignees
Labels
A-kv Anything in KV that doesn't belong in a more specific category. A-testing Testing tools and infrastructure C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv KV Team

Comments

aliher1911 commented Jan 4, 2022

Clock skew tests are currently disabled because of #35599:

// TODO(bdarnell): subcritical-skews nemesis is currently flaky due to ntp rate limiting.
// https://github.com/cockroachdb/cockroach/issues/35599
//{"subcritical-skews", "--nemesis subcritical-skews"},
//{"majority-ring-subcritical-skews", "--nemesis majority-ring --nemesis2 subcritical-skews"},
//{"subcritical-skews-start-kill-2", "--nemesis subcritical-skews --nemesis2 start-kill-2"},

The tests rely directly on ntp.ubuntu.com, which starts throttling requests quickly and makes the tests flaky.

It should be easy to install an NTP package on the controller node only and point the tests at it, since roachtest can pass the controller's IP to the test. We need to check how this would interact with chrony, or else disable chrony on the controller.

Jira issue: CRDB-12064

aliher1911 self-assigned this Jan 4, 2022
aliher1911 commented

This could be done fairly easily by passing the NTP server to the test and then adding an 'allow' directive to the chronyd configuration on the controller. There is no need to run ntpd, since chronyd can act as an NTP server itself; that functionality is just not enabled by default.
A simpler solution, though, could be to use pool.ntp.org instead. We should try that first and only proceed with the controller-node approach if using the pool still fails (see #35599).
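
For reference, a minimal sketch of what the chronyd change on the controller might look like, written in Go since roachtest is Go. This is not existing roachtest code: the helper name `withAllow`, the subnet `10.0.0.0/8`, and the sample config contents are all hypothetical. The only thing taken from chrony itself is the `allow` directive, which lets chronyd answer NTP requests from the given clients.

```go
package main

import (
	"fmt"
	"strings"
)

// withAllow returns conf with an "allow <subnet>" directive appended,
// unless an identical directive is already present.
// Hypothetical helper for illustration; not part of roachtest.
func withAllow(conf, subnet string) string {
	directive := "allow " + subnet
	for _, line := range strings.Split(conf, "\n") {
		if strings.TrimSpace(line) == directive {
			return conf // already configured; leave the config untouched
		}
	}
	// Make sure the new directive starts on its own line.
	if conf != "" && !strings.HasSuffix(conf, "\n") {
		conf += "\n"
	}
	return conf + directive + "\n"
}

func main() {
	// Sample input only; the real chrony config on the controller will differ.
	conf := "pool pool.ntp.org iburst\n"
	// The subnet is an assumption; it should cover the worker nodes.
	fmt.Print(withAllow(conf, "10.0.0.0/8"))
}
```

The resulting text would still have to be written to the controller's chrony config and chronyd restarted; how roachtest would do that, and how the controller's IP reaches the Jepsen test, is left open here.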

aliher1911 commented

This is resolved: we now use an NTP server that doesn't throttle us.
