Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TestMigrationPurgeOutdatedReplicas: invalid memory address or nil pointer dereference #59180

Closed
RaduBerinde opened this issue Jan 20, 2021 · 3 comments
Assignees
Labels
C-test-failure Broken test (automatically or manually discovered). skipped-test

Comments

@RaduBerinde
Copy link
Member

https://teamcity.cockroachdb.com/viewLog.html?buildId=2596743&buildTypeId=Cockroach_UnitTests

@RaduBerinde RaduBerinde added the C-test-failure Broken test (automatically or manually discovered). label Jan 20, 2021
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jan 20, 2021

Verified

This commit was signed with the committer’s verified signature.
laverya Andrew Lavery
Informs cockroachdb#59180.

Release note: None
@irfansharif
Copy link
Contributor

I'm not sure how to find the stack trace. Pretty sure this is the same thing as #58523 (comment) (+cc @tbg). Maybe as a stop gap, I'll send a PR out (tomorrow) to temporarily check for the ReplicaState being nil. This test isn't the only that that'll be affected by it, it could be pretty much any and every test.

craig bot pushed a commit that referenced this issue Jan 20, 2021
59166: acceptance: skip TestDocker* flaky tests r=RaduBerinde a=RaduBerinde

These tests all flake with the same failure.

Informs #58955.

Release note: None

59181: server: skip TestMigrationPurgeOutdatedReplicas r=irfansharif a=RaduBerinde

Informs #59180.

Release note: None

Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
@tbg
Copy link
Member

tbg commented Jan 20, 2021 via email

irfansharif added a commit to irfansharif/cockroach that referenced this issue Jan 20, 2021
There's a scary lack of synchronization around how we set the
ReplicaState for a given replica, and how we mark a replica as
"initialized". What this means is that very temporarily, it's possible
for the entry in Store.mu.replicas to be both "initialized" and have an
empty ReplicaState. This was an existing problem, but is now more likely
to bite us given the migrations infrastructure attempts to purge
outdated replicas at start up time (when replicas are being initialized,
and we're iterating through extan replicas in the Store.mu.replicas
map).

This issue has caused a bit of recent instability: cockroachdb#59180, cockroachdb#58489,
\cockroachdb#58523, and cockroachdb#58378. While we work on a more considered fix to the
problem (tracked in cockroachdb#58489), we can introduce stop the bleeding in the
interim (and unskip some tests).

Release note: None
@irfansharif
Copy link
Contributor

Unskipping in #59194, same underlying bug as #58523.

irfansharif added a commit to irfansharif/cockroach that referenced this issue Jan 20, 2021
There's a scary lack of synchronization around how we set the
ReplicaState for a given replica, and how we mark a replica as
"initialized". What this means is that very temporarily, it's possible
for the entry in Store.mu.replicas to be both "initialized" and have an
empty ReplicaState. This was an existing problem, but is now more likely
to bite us given the migrations infrastructure attempts to purge
outdated replicas at start up time (when replicas are being initialized,
and we're iterating through extan replicas in the Store.mu.replicas
map).

This issue has caused a bit of recent instability: cockroachdb#59180, cockroachdb#58489,
\cockroachdb#58523, and cockroachdb#58378. While we work on a more considered fix to the
problem (tracked in cockroachdb#58489), we can introduce stop the bleeding in the
interim (and unskip some tests).

Release note: None
craig bot pushed a commit that referenced this issue Jan 20, 2021
59194: kv: introduce a stopgap for lack of ReplicaState synchronization r=irfansharif a=irfansharif

There's a scary lack of synchronization around how we set the
ReplicaState for a given replica, and how we mark a replica as
"initialized". What this means is that very temporarily, it's possible
for the entry in Store.mu.replicas to be both "initialized" and have an
empty ReplicaState. This was an existing problem, but is now more likely
to bite us given the migrations infrastructure attempts to purge
outdated replicas at start up time (when replicas are being initialized,
and we're iterating through extan replicas in the Store.mu.replicas
map).

This issue has caused a bit of recent instability: #59180, #58489,
\#58523, and #58378. While we work on a more considered fix to the
problem (tracked in #58489), we can introduce stop the bleeding in the
interim (and unskip some tests).

Release note: None

59201:  sql: add telemetry for materialized views and set schema. r=otan a=RichardJCai

 sql: add telemetry for materialized views and set schema.

Release note: None

Resolves #57299 

Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: richardjcai <caioftherichard@gmail.com>
pbardea pushed a commit to pbardea/cockroach that referenced this issue Jan 21, 2021
There's a scary lack of synchronization around how we set the
ReplicaState for a given replica, and how we mark a replica as
"initialized". What this means is that very temporarily, it's possible
for the entry in Store.mu.replicas to be both "initialized" and have an
empty ReplicaState. This was an existing problem, but is now more likely
to bite us given the migrations infrastructure attempts to purge
outdated replicas at start up time (when replicas are being initialized,
and we're iterating through extan replicas in the Store.mu.replicas
map).

This issue has caused a bit of recent instability: cockroachdb#59180, cockroachdb#58489,
\cockroachdb#58523, and cockroachdb#58378. While we work on a more considered fix to the
problem (tracked in cockroachdb#58489), we can introduce stop the bleeding in the
interim (and unskip some tests).

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-test-failure Broken test (automatically or manually discovered). skipped-test
Projects
None yet
Development

No branches or pull requests

3 participants