[WIP] Fix/4849 add cluster stress spec #4899
Looks like I found a small bug in the MNTR here:
This can only find
Not sure why the build failed on Linux here - didn't get much of an error message back.
Force-pushed from 93b3a47 to 67b703f.
…operties (akkadotnet#4902) * close akkadotnet#4901 - replace reflection magic in MNTR with reading of MultiNodeConfig properties * fixed outdated DiscoverySpec
Changed this to make it consistent with the JVM
This format error would cause the StandardOutLogger to throw a `FormatException` internally
…ronontheweb/akka.net into fix/4849-addClusterStressSpec
@@ -348,6 +348,7 @@ Target "MultiNodeTests" (fun _ ->

Target "MultiNodeTestsNetCore" (fun _ ->
    if not skipBuild.Value then
        setEnvironVar "akka.cluster.assert" "on" // needed to enable assert invariants for Akka.Cluster
Asserts gossip invariants when the MNTR is run
Left some comments on these changes thus far - but I have some cleanup to do on the PR.
@@ -175,7 +175,7 @@ protected override void AfterTermination()

//TODO: ExpectedTestDuration?

void MuteLog(ActorSystem sys = null)
public virtual void MuteLog(ActorSystem sys = null)
Made `MuteLog` `public virtual` so that other specs can override this behavior.
@@ -62,13 +62,29 @@ public static Cluster Get(ActorSystem system)
    return system.WithExtension<Cluster, ClusterExtension>();
}

static Cluster()
{
    bool GetAssertInvariants()
Enabled assert invariants inside the `Cluster` when the "akka.cluster.assert" environment variable is set.
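A minimal sketch of how such a toggle could be read once from a static constructor. Only the variable name "akka.cluster.assert" and the helper name `GetAssertInvariants` come from the diff; the class name `ClusterAssertions`, the field name, and the rule that only the value "on" enables the checks are assumptions made for illustration, not the actual Akka.NET implementation.

```csharp
using System;

// Hypothetical stand-in for the static state on Cluster; not the actual Akka.NET code.
public static class ClusterAssertions
{
    // Evaluated once, the first time the type is used.
    public static readonly bool IsAssertInvariantsEnabled;

    static ClusterAssertions()
    {
        IsAssertInvariantsEnabled = GetAssertInvariants();
    }

    private static bool GetAssertInvariants()
    {
        // GetEnvironmentVariable returns null when the variable is not set.
        var value = Environment.GetEnvironmentVariable("akka.cluster.assert");

        // Assumption: "on" (in any casing) enables the checks; anything else disables them.
        return string.Equals(value, "on", StringComparison.OrdinalIgnoreCase);
    }
}
```

Because the value is captured in a static constructor, the variable has to be set before the type is first touched, which fits setting it from the build script before the test run starts.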
@@ -649,11 +655,12 @@ public ImmutableHashSet<UniqueAddress> Receivers(UniqueAddress sender)
{
    if (iter.MoveNext() == false || n == 0)
    {
        iter.Dispose(); // dispose enumerator
We weren't properly disposing the enumerator from the `HeartbeatNodeRing` before.
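For context, a rough sketch of the general pattern: when an enumerator is driven by hand with `MoveNext()` rather than through `foreach`, nothing disposes it automatically. The helper `TakeUpTo` below is hypothetical and is not the `take` function from the diff; it only shows one way to guarantee disposal on every exit path.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorExample
{
    // Hypothetical helper: takes up to n items from a sequence, driving the
    // enumerator manually and disposing it on every exit path.
    public static List<T> TakeUpTo<T>(IEnumerable<T> source, int n)
    {
        var result = new List<T>();
        IEnumerator<T> iter = source.GetEnumerator();
        try
        {
            while (n > 0 && iter.MoveNext())
            {
                result.Add(iter.Current);
                n--;
            }
        }
        finally
        {
            // Runs any pending finally blocks of an iterator method
            // and releases whatever the enumerator holds on to.
            iter.Dispose();
        }
        return result;
    }
}
```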
? slice1
: take(remaining, NodeRing().Until(sender).Where(c => !c.Equals(sender)).GetEnumerator(), slice1).Item2;
? slice1 // or, wrap-around
: take(remaining, NodeRing().TakeWhile(x => x != sender).GetEnumerator(), slice1).Item2;
Cleaned up the LINQ syntax here.
else if (x.Uid == y.Uid) return 0;
else return 1;
return result;
var ha = x.Uid;
Shifts the ring to primarily factor in the UID in the node ring comparisons, rather than the address. Cheaper and faster while remaining stable so long as the membership doesn't change.
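To illustrate the idea, here is a minimal sketch of a comparer that orders by UID first and only falls back to the address on a tie. The types `NodeId` and `UidFirstComparer` are hypothetical simplifications, not Akka.NET's `UniqueAddress` or its ring ordering, and the ordinal string comparison used as the tie-breaker is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical simplified node identity, for illustration only.
public sealed class NodeId
{
    public NodeId(string address, int uid)
    {
        Address = address;
        Uid = uid;
    }

    public string Address { get; }
    public int Uid { get; }
}

// Orders nodes by Uid first; the address is consulted only when two Uids are equal.
public sealed class UidFirstComparer : IComparer<NodeId>
{
    public int Compare(NodeId x, NodeId y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x is null) return -1;
        if (y is null) return 1;

        // A single integer comparison decides the order in the common case.
        var byUid = x.Uid.CompareTo(y.Uid);
        if (byUid != 0) return byUid;

        // Tie-breaker so that distinct nodes never compare as equal.
        return string.CompareOrdinal(x.Address, y.Address);
    }
}
```

Plugged into a `SortedSet<NodeId>`, this yields an ordering that depends only on the members themselves, so the ring layout stays the same as long as the membership does.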
/// Aggregated status of a subject node is defined as (in this order):
/// - Terminated if any observer node considers it as Terminated
/// - Unreachable if any observer node considers it as Unreachable
/// - Reachable otherwise, i.e. no observer node considers it as Unreachable
/// </summary>
internal class Reachability //TODO: ISerializable?
Didn't actually change anything in this file - just ReSharper'd it.
@@ -508,7 +508,6 @@ public int InitialParticipants
/// </summary>
public void RunOn(Action thunk, params RoleName[] nodes)
{
    if (nodes.Length == 0) throw new ArgumentException("No node given to run on.");
Needed to allow the MNTR to call `RunOn` with no nodes.
@@ -30,11 +31,11 @@ public DefaultFailureDetectorRegistry(Func<FailureDetector> factory)

private readonly Func<FailureDetector> _factory;

private AtomicReference<Dictionary<T, FailureDetector>> _resourceToFailureDetector = new AtomicReference<Dictionary<T, FailureDetector>>(new Dictionary<T, FailureDetector>());
private AtomicReference<ImmutableDictionary<T, FailureDetector>> _resourceToFailureDetector = new AtomicReference<ImmutableDictionary<T, FailureDetector>>(ImmutableDictionary<T, FailureDetector>.Empty);
Cleaned up some potential mutability issues here.
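A sketch of the copy-on-write pattern that an immutable dictionary behind an atomic reference makes possible. It uses `Interlocked.CompareExchange` directly rather than Akka.NET's `AtomicReference`, and the type `DetectorRegistry` with its `GetOrAdd` method is hypothetical. The point is that a published snapshot is never modified in place, so readers cannot observe a dictionary that is halfway through an update.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading;

// Hypothetical registry, for illustration only.
public sealed class DetectorRegistry<TKey, TDetector> where TDetector : class
{
    private readonly Func<TDetector> _factory;

    private ImmutableDictionary<TKey, TDetector> _detectors =
        ImmutableDictionary<TKey, TDetector>.Empty;

    public DetectorRegistry(Func<TDetector> factory)
    {
        _factory = factory;
    }

    public int Count => Volatile.Read(ref _detectors).Count;

    public TDetector GetOrAdd(TKey key)
    {
        while (true)
        {
            var snapshot = Volatile.Read(ref _detectors);
            if (snapshot.TryGetValue(key, out var existing)) return existing;

            var created = _factory();

            // Add returns a new dictionary; the snapshot itself is left unchanged.
            var updated = snapshot.Add(key, created);

            // Publish only if no other thread swapped the reference in the meantime.
            var previous = Interlocked.CompareExchange(ref _detectors, updated, snapshot);
            if (ReferenceEquals(previous, snapshot)) return created;

            // Lost the race: loop and retry against the newer snapshot.
        }
    }
}
```

One trade-off of this design is that the factory can run more than once for the same key under contention, with the losing instance simply discarded.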
Left some additional comments - this PR is ready for review now.
@@ -1964,8 +1965,11 @@ public ReceiveGossipType ReceiveGossip(GossipEnvelope envelope)
// for all new joining nodes we remove them from the failure detector
foreach (var node in _latestGossip.Members)
{
    if (node.Status == MemberStatus.Joining && !localGossip.Members.Contains(node))
    if (!localGossip.Members.Contains(node))
This is the major bugfix for #4849
/// </summary>
internal class Reachability //TODO: ISerializable?
Made no real changes to this class - just ReSharper'd it.
@@ -154,7 +154,7 @@ public State(HeartbeatHistory history, long? timeStamp)

private AtomicReference<State> _state;

private State state
internal State state
Made `internal` for debugging purposes.
Replaced via #4940.