[Work in progress] Implement concurrent resync #852
retest this please
@plwhite I think this is worth a first look while I'm on vacation. Some areas that I know still need work:
@@ -47,6 +47,7 @@ def __init__(self, config, ip_version, iptables_updater):
         self.ifaces = set()
         self.programmed_leaf_chains = set()
         self._dirty = False
+        self._datamodel_in_sync = False
I found the name of this variable confusing. It really means "is there a snapshot in progress", but I read it as "the first snapshot has completed". Worth renaming it, I think.
I think your first reading was right: it means "first snapshot complete".
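To illustrate the "first snapshot complete" reading, here is a minimal sketch of a latch that holds back programming until the initial snapshot has been processed. `DispatchState`, `on_iface_added`, `on_datamodel_in_sync`, `_maybe_flush` and `flushes` are hypothetical names used only for illustration; only `ifaces`, `_dirty` and `_datamodel_in_sync` appear in the diff, and the real class may use them differently.

```python
class DispatchState(object):
    """Hypothetical stand-in for the class touched by the diff above."""

    def __init__(self):
        self.ifaces = set()
        self._dirty = False
        # Assumed semantics: latches to True once the first snapshot has
        # been fully processed, and never goes back to False.
        self._datamodel_in_sync = False
        # Records each simulated programming pass (illustration only).
        self.flushes = []

    def on_iface_added(self, name):
        self.ifaces.add(name)
        self._dirty = True
        self._maybe_flush()

    def on_datamodel_in_sync(self):
        self._datamodel_in_sync = True
        self._maybe_flush()

    def _maybe_flush(self):
        # Hold back all programming until the first snapshot is complete,
        # so that partial state is never written out.
        if self._dirty and self._datamodel_in_sync:
            self.flushes.append(sorted(self.ifaces))
            self._dirty = False
```

With this shape, interfaces added during the initial snapshot are batched into a single pass once `on_datamodel_in_sync()` fires, and later changes are flushed immediately.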
Mostly this just looks right, though I had a poke around rather than doing a full review. Some comments.
I need to do more, but the net of this is that I'm comfortable with what I've seen.
@plwhite For the record, SOCK_STREAM is definitionally in-order, regardless of the underlying protocol (AF_UNIX, AF_APPLETALK, whatever), which means your desire to "use something with guaranteed ordering to make the point" is actually already done by this code! This is one of the few good decisions of the socket API: you can ask for "whatever is equivalent to TCP" on your chosen protocol and get it. Specifically, from the man page:
Note also:
Though I have no idea how AF_UNIX and SOCK_DGRAM combine to produce unreliability. That feels like a fun thing to investigate some time: can you actually lose a datagram on an AF_UNIX socket? @matthewdupre might be interested in the possibility of combining AF_UNIX and SOCK_SEQPACKET:
In other words: SCTP you can actually use! Clearly @matthewdupre needs to get this into the Felix code sometime.
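The ordering point can be checked with a small sketch using Python's standard `socket` module. `stream_roundtrip` and `seqpacket_roundtrip` are illustrative helpers, not part of this PR. The sketch assumes a platform where `AF_UNIX` is available; `SOCK_SEQPACKET` over `AF_UNIX` is platform-dependent (Linux supports it).

```python
import socket


def stream_roundtrip(chunks):
    """Send chunks over an AF_UNIX SOCK_STREAM pair; return the bytes read."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with left, right:
        for chunk in chunks:
            left.sendall(chunk)
        # Signal end-of-stream so the reader sees EOF.
        left.shutdown(socket.SHUT_WR)
        received = b""
        while True:
            data = right.recv(4096)
            if not data:
                break
            received += data
    # A stream is an ordered byte sequence: message boundaries are lost,
    # but the bytes come out in the order they went in.
    return received


def seqpacket_roundtrip(messages):
    """Same idea with SOCK_SEQPACKET, which also keeps message boundaries."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
    with left, right:
        for message in messages:
            left.send(message)
        return [right.recv(4096) for _ in messages]
```

For example, `stream_roundtrip([b"one,", b"two,", b"three"])` yields the concatenated bytes in send order, while `seqpacket_roundtrip` returns one entry per message sent.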
@Lukasa I started with
retest this please
I concluded that it probably was OK, but I'm happy with being told that it definitely is! So if that's not the subtle, hard-to-find and yet critical bug that's hidden in the code, where is it?
retest this please
retest this please
retest this please
retest this please
I suggest you add "package the RPM python debs up" to the list too. While we haven't done it for the others, we can at least prevent the problem from growing.
Another required pre-merge item: remove the 1.3.0~~smc version.
* Only stop the watcher for resync when we've finished the previous resync.
* Extra logging around join().
* Fix non-determinism in FV test.
@matthewdupre the debs were cleanly built with py2dsc, so I say we defer the build until there's something for Jenkins to build.
[Work in progress] Implement concurrent resync
This PR is merged, but we should still discuss the queue size issue. That is a small enough detail not to block anything.
Rather than stopping the world to resync with etcd, process the resync in parallel with watching etcd. Improves scalability under high churn.
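One way to reconcile a snapshot that is loaded while watch events keep arriving is to track, per key, the highest etcd index already applied. The sketch below illustrates that idea only; `ResyncMerger` and its methods are hypothetical names, and this is not necessarily how the PR implements it.

```python
class ResyncMerger(object):
    """Hypothetical helper: merge snapshot keys with concurrent watch events."""

    def __init__(self):
        self.state = {}   # key -> value: the merged view of the datastore
        self._hwm = {}    # key -> highest index applied so far for that key

    def on_event(self, key, value, index):
        """Apply a watch event (value=None means delete).

        Returns True if the update was applied, False if it was stale.
        """
        if index <= self._hwm.get(key, -1):
            return False
        # Remember the index even for deletes, so that an older snapshot
        # entry cannot resurrect a key the watcher has already removed.
        self._hwm[key] = index
        if value is None:
            self.state.pop(key, None)
        else:
            self.state[key] = value
        return True

    def on_snapshot_key(self, key, value, index):
        """Apply a snapshot entry unless a newer event already touched the key."""
        return self.on_event(key, value, index)
```

A full resync would also need to delete keys that existed before the snapshot but are absent from it; that step is omitted here for brevity.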
[ ] Update install docs. (Covered by "Remove manual install of python-etcd and posix-spawn from docs." #880.)
Extras?:
Let's do these as follow-ups:
* Optimize incremental parsing perf?
* Use faster json library for parsing events?