cdc/bank roachtest pulls 260MB from a 3rd party vendor on every CI run, and fails if upstream is unavailable #51543
Comments
I have marked the 3 roachtests that use this facility as skipped.

cockroachdb#51543

Release note: None
@mwang1026 @dt the KV team meeting concluded that since Bulk I/O owns the CDC product area, the Bulk I/O team is responsible for enhancing the testing infrastructure for CDC tests. So we're pushing this to your plate. Note that the test is currently skipped, which means we have disabled test coverage for CDC. That makes addressing this critical path for the next release.
Thanks @knz. @mwang1026 we should potentially re-enable this for now -- while it'd be nice to have it cached, 262MB once a night is a pretty minimal cost (compared to, say, the VMs), and while I hate flakes due to non-reproducible builds depending on external infra, not testing at all is worse.
It turns out that https://github.com/cockroachdb/cockroach/blob/master/build/teamcity-local-roachtest.sh#L37
Yes, in fact on every CI run there are three (not one) tests that do this, so the archive gets downloaded and extracted 3 times. It's not just our network ingress $$ that this impacts; the upstream server probably blocked us because we were incurring outrageous egress $$ on their side.
cc @cockroachdb/cdc |
Reassigning this to the CDC team, as this has to do with the implementation of the roachtest.
Describe the problem
The cdc/bank roachtest runs the following command every time it runs:

I went and checked and that is a 262MB archive to download (compressed). The archive is not cached, unlike the builder image, so that's a mandatory ingress cost on every CI run.

Moreover, today the upstream HTTP server is saying "no" and is causing all the CI runs to fail.
Expected behavior
The archive should be embedded in the builder image, and/or the fetch should use a cached copy if it was already downloaded earlier on the TC agent.
(At the very least we should be fetching from a proxy cache inside the CRL infra so that the CI downloads are internal to GCP).
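To illustrate the cache-on-agent idea, here is a minimal sketch of a cached fetch helper. This is not the actual roachtest or TeamCity code; the function name, cache directory, and URL are hypothetical, and the demo pre-seeds the cache so it runs without network access:

```shell
#!/usr/bin/env bash
set -euo pipefail

# fetch_cached URL CACHE_DIR: download URL into CACHE_DIR unless a copy
# from an earlier run is already present on the agent.
fetch_cached() {
  local url="$1" cache_dir="$2"
  local name path
  name="$(basename "$url")"
  path="$cache_dir/$name"
  mkdir -p "$cache_dir"
  if [[ -f "$path" ]]; then
    echo "cache hit: $path"
  else
    echo "cache miss: downloading $url"
    # -f fails on HTTP errors so a bad upstream doesn't poison the cache;
    # download to a temp file and move it into place atomically.
    curl -fSL -o "$path.tmp" "$url"
    mv "$path.tmp" "$path"
  fi
}

# Demo with a pre-seeded cache (no network needed): the fetch is a hit.
demo_cache="$(mktemp -d)"
: > "$demo_cache/import.tgz"   # simulate an archive from an earlier run
fetch_cached "https://example.com/import.tgz" "$demo_cache"
```

A real implementation would also want integrity checking (e.g. comparing a known checksum before trusting the cached copy), which this sketch omits.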
cc @jlinder @tbg for triage.
Epic DEVINF-109
Jira issue: CRDB-4033