
release: 19.1.10 #50464

Closed
22 tasks done
asubiotto opened this issue Jun 22, 2020 · 18 comments

asubiotto (Contributor) commented Jun 22, 2020

Candidate SHA: 7101af9
Deployment status: Qualifying
Qualification Suite: https://teamcity.cockroachdb.com/viewType.html?buildTypeId=Cockroach_ReleaseQualification&branch_Cockroach=provisional_202006230815_v19.1.10&tab=buildTypeStatusDiv
Nightly Suite: https://teamcity.cockroachdb.com/viewType.html?buildTypeId=Cockroach_Nightlies_NightlySuite&branch_Cockroach_Nightlies=provisional_202006230815_v19.1.10&tab=buildTypeStatusDiv

Admin UI for Qualification Clusters:

Release process checklist

Prep date: Monday 6/22/2020

  • Pick a SHA
    • fill in Candidate SHA above
    • email thread on releases@
  • Tag the provisional SHA
  • Publish provisional binaries
  • Ack security@ on the generated Stackdriver Alert to confirm these writes were part of a planned release (just reply to the received alert email acknowledging that this was part of the release process)

Release Qualification

One day after prep date:

Release date: Monday 6/29/2020

asubiotto (Contributor, Author) commented Jun 22, 2020

Publish bleeding edge for the selected sha (73a373f at time of writing) seems to have failed: test_driver.go:232: error creating order: HTTP error 500: 500 Internal Server Error. The diff from the last successful run doesn't seem suspicious (fixing opt join simplification), so I suspect this might be a flake. Manually triggered another run: https://teamcity.cockroachdb.com/viewType.html?buildTypeId=Cockroach_MergeToMaster&tab=buildTypeChains&branch_Cockroach=release-19.1#_expand=block_bt133_bt151_bt154_bt17-2027662&hpos=0&vpos=113
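Transient 500s like this are often worth retrying before treating a run as red. A generic retry helper sketches the idea (hypothetical; the actual driver is `test_driver.go`, written in Go):

```python
import time

def with_retries(fn, attempts=3, base_delay_s=1.0):
    """Call fn, retrying on any exception with linear backoff.

    Re-raises the last exception so a persistent failure (as opposed
    to a one-off flake like an HTTP 500) still surfaces to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise
            time.sleep(base_delay_s * attempt)
```

Manually re-triggering the TeamCity build, as done here, is the coarse-grained equivalent of this retry.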

asubiotto (Contributor, Author) commented:

This run failed as well. But I was confused before. Looks like we did have a successful run on May 31 with this SHA (https://teamcity.cockroachdb.com/viewLog.html?buildId=1980371&buildTypeId=Cockroach_MergeToMaster). @rafiss it's possible this failure is due to a test issue. It looks like there were a bunch of updates to ActiveRecord on June 9th (https://github.com/cockroachdb/examples-orms/commits/master), and the first failed test is on June 10th.

I'm going to go ahead with this SHA (73a373f), hoping the latest red build is not a problem.

asubiotto (Contributor, Author) commented:

@jlinder looks like there was also a problem posting a failure for this issue because of an unset GITHUB_API_TOKEN:

[05:32:32]Traceback (most recent call last):
[05:32:32]  File "build/teamcity-post-failures.py", line 175, in <module>
[05:32:32]    post_issue(issue)
[05:32:32]  File "build/teamcity-post-failures.py", line 133, in post_issue
[05:32:32]    headers={'Authorization': 'token {0}'.format(os.environ['GITHUB_API_TOKEN'])})
[05:32:32]  File "/usr/lib/python3.6/os.py", line 669, in __getitem__
[05:32:32]    raise KeyError(key) from None
[05:32:32]KeyError: 'GITHUB_API_TOKEN'
[05:32:32]Process exited with code 1
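The `KeyError` comes from indexing `os.environ` directly. A guarded lookup (a hypothetical helper, not the actual `teamcity-post-failures.py` code) would fail with a readable message instead:

```python
import os
import sys

def require_env(name):
    """Return a required environment variable, exiting with a clear
    error message instead of raising a bare KeyError when unset."""
    value = os.environ.get(name)
    if not value:
        sys.exit("error: required environment variable {0} is not set".format(name))
    return value

# Usage in a posting script might then look like:
# token = require_env('GITHUB_API_TOKEN')
# headers = {'Authorization': 'token {0}'.format(token)}
```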

cockroachdb deleted a comment from blathers-crl bot Jun 22, 2020
asubiotto (Contributor, Author) commented Jun 22, 2020

I think there are going to be problems with the nightlies until we resolve the Examples-ORMs failure since this is a dependency for the roachtests.

asubiotto self-assigned this Jun 22, 2020
rafiss (Collaborator) commented Jun 22, 2020

I'm looking into examples-orms. It has been passing against master, so somehow the problem seems to be specific to the 19.1 branch.

jlinder (Collaborator) commented Jun 22, 2020

> @jlinder looks like there was also a problem posting a failure for this issue because of an unset GITHUB_API_TOKEN:

It looks like the GITHUB_API_TOKEN was mistakenly removed back in December. I've added it back in as an env. parameter.

rafiss (Collaborator) commented Jun 22, 2020

Here is a PR to get examples-orms passing: cockroachdb/examples-orms#106. We'll skip the test for versions <=19.1 until we can debug the underlying issue. (Note, we may decide to leave the skipped test permanently, as our tooling support page specifies that we support ActiveRecord for CockroachDB 20.1+.)
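The version gate described here can be sketched generically (hypothetical helper names; the real change lives in cockroachdb/examples-orms#106):

```python
def parse_version(tag):
    """Turn a tag like 'v19.1.10' into a comparable tuple (19, 1, 10)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))

def should_run_activerecord(cockroach_version):
    """Skip the ActiveRecord suite on release branches older than 20.1,
    since the tooling support page lists ActiveRecord for CockroachDB 20.1+."""
    return parse_version(cockroach_version) >= (20, 1)
```

Tuple comparison makes `(19, 1, 10) < (20, 1)` hold without any string tricks, so `v19.1.x` and `v19.2.x` branches are both skipped.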

asubiotto (Contributor, Author) commented:

Updated sha with security fix: 7101af9

asubiotto (Contributor, Author) commented:

@RaduBerinde looks like the random syntax tests failed in the nightlies. This isn't a blocker but it looks like a query is stuck in optimization. The symptoms look similar to #43076. Was the fix for that backported to 19.1? Here is the link if you want to take a look: https://teamcity.cockroachdb.com/viewLog.html?buildId=2032012&buildTypeId=Cockroach_Nightlies_RandomSyntaxTests&tab=buildResultsDiv&branch_Cockroach_Nightlies=provisional_202006230815_v19.1.10

asubiotto (Contributor, Author) commented Jun 23, 2020

Urgh, looks like we ran into the same quota issue as #50465 (comment). Will have to rerun the Roachtest GCE build: https://teamcity.cockroachdb.com/viewLog.html?buildId=2032556&tab=queuedBuildOverviewTab

asubiotto (Contributor, Author) commented Jun 23, 2020

Starting the test failure checkoff process while the build is still running to save time.

Test Failures List

Roachtest GCE

Failures: https://teamcity.cockroachdb.com/viewLog.html?buildId=2032556&tab=queuedBuildOverviewTab

[storage]

  • clearrange/checks=false

[appdev]

  • hibernate
  • psycopg

[kv]

  • kv0bench/nodes=10/cpu=8/shards=20/sequential
  • ycsb/B/nodes=3/cpu=32
  • tpccbench/nodes=9/cpu=4/chaos/partition

[bulk-io]

  • restore2TB/nodes=10

[sql-schema]

  • schemachange/random-load

Random Syntax Tests

Failures: https://teamcity.cockroachdb.com/viewLog.html?buildId=2032012&buildTypeId=Cockroach_Nightlies_RandomSyntaxTests&tab=buildResultsDiv&branch_Cockroach_Nightlies=provisional_202006230815_v19.1.10

[optimizer]

  • TestRandomSyntaxSQLSmith

rafiss (Collaborator) commented Jun 23, 2020

Signed off on hibernate -- we recently expanded the test coverage, but had missed marking some expected failures for 19.1.

maddyblue (Contributor) commented:

Signed off on sqlsmith. The failure is a statement timeout, which happens frequently.

petermattis (Collaborator) commented:

Signed off on clearrange/checks=false. The failure is a dup of #44845. We ran out of disk space on one node during the IMPORT phase of the test. This has been happening intermittently on the various release branches for a while now.

tbg (Member) commented Jun 24, 2020

Signed off on the kv tests. One of them looked like a straight-up infra problem. The other two had trouble (re)starting CRDB and returned an opaque error from roachprod; in both cases the logs suggest the node actually did start. I would dig deeper into this on master or 20.1, but since this is 19.1 I'm fine chalking this up to an infra problem as well, though I'm less sure.

rafiss (Collaborator) commented Jun 24, 2020

Signed off on psycopg2 -- it's hanging because of a bug in the tests that has been fixed upstream.

spaskob (Contributor) commented Jun 24, 2020

Signed off on schemachange/random-load.

pbardea (Contributor) commented Jun 24, 2020

restore2TB/nodes=10: infra flake during health check, but node appears healthy

monitor task failed: dial tcp 35.226.173.7:26257: connect: connection timed out
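A node's reachability on the SQL port can be double-checked with a plain TCP probe, roughly what the monitor's dial attempt does (a hedged sketch, not roachprod's actual code):

```python
import socket

def node_reachable(host, port, timeout_s=5.0):
    """Best-effort TCP reachability probe against a node's SQL port
    (26257 for CockroachDB); returns False on refusal or timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False
```

A `connection timed out` from the monitor paired with a successful probe afterwards is consistent with a transient network issue rather than a down node.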
