Kitsune2 Bootstrap Server--Core #10

neonphog · 2024-11-21T20:24:56Z

This is the core functionality of the kitsune2 bootstrap server code.

To make the PRs more manageable, I've split the work into some preparatory changes before this PR and all the endpoint tests after it.

See the previous Prep #9 PR for that prep work.
See the follow-on Testing #11 PR for API endpoint tests; its PR description also has general notes about this effort.

crates/bootstrap_srv/src/bin/kitsune2-bootstrap-srv.rs

crates/bootstrap_srv/src/server.rs

mattyg · 2024-11-22T16:14:35Z

crates/bootstrap_srv/src/server.rs

+
+        // validate expires_at is not more than 30 min after created_at
+        if info.expires_at - info.created_at
+            > (std::time::Duration::from_secs(60 * 30).as_micros() as i64)


Maybe we should move this value to a const in the config file.

Done: 263d5ab

mattyg · 2024-11-22T16:14:53Z

crates/bootstrap_srv/src/server.rs

+
+        // validate created at is less than 3 min in the future
+        if info.created_at
+            > now + (std::time::Duration::from_secs(60 * 3).as_micros() as i64)


Maybe we should move this value to a const in the config file.

Done: 263d5ab

mattyg · 2024-11-22T16:14:59Z

crates/bootstrap_srv/src/server.rs

+
+        // validate created at is not older than 3 min ago
+        if info.created_at
+            < now - (std::time::Duration::from_secs(60 * 3).as_micros() as i64)


Maybe we should move this value to a const in the config file.

Done: 263d5ab
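For readers following along, here is a minimal sketch of what hoisting the two durations from the three checks above into named constants could look like. The names `MAX_CREATED_AT_SKEW`, `MAX_EXPIRES_AFTER`, `AgentTimes`, and `validate_times`, and the `InvalidCreatedAt` error string, are assumptions made for illustration; they are not taken from commit 263d5ab. Timestamps are treated as `i64` microseconds, as in the diff.

```rust
use std::time::Duration;

/// Hypothetical constant: how far `created_at` may be from "now" in either direction.
const MAX_CREATED_AT_SKEW: Duration = Duration::from_secs(60 * 3);

/// Hypothetical constant: the longest allowed gap between `created_at` and `expires_at`.
const MAX_EXPIRES_AFTER: Duration = Duration::from_secs(60 * 30);

/// Stand-in for the parsed agent info; timestamps are microseconds since the epoch.
struct AgentTimes {
    created_at: i64,
    expires_at: i64,
}

/// Apply the three time checks from the review, given the current time in microseconds.
fn validate_times(info: &AgentTimes, now: i64) -> std::io::Result<()> {
    let skew = MAX_CREATED_AT_SKEW.as_micros() as i64;
    let max_life = MAX_EXPIRES_AFTER.as_micros() as i64;

    // created_at must not be too far in the future
    if info.created_at > now + skew {
        return Err(std::io::Error::other("InvalidCreatedAt"));
    }
    // created_at must not be too far in the past
    if info.created_at < now - skew {
        return Err(std::io::Error::other("InvalidCreatedAt"));
    }
    // expires_at must not exceed the maximum lifetime after created_at
    if info.expires_at - info.created_at > max_life {
        return Err(std::io::Error::other("InvalidExpiresAt"));
    }
    Ok(())
}
```

Keeping the values as `Duration` constants and converting to microseconds at the point of use means the units are stated once, next to the number.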

crates/bootstrap_srv/src/config.rs

mattyg · 2024-11-22T16:20:28Z

crates/bootstrap_srv/src/server.rs

+    }
+}
+
+fn maint_worker(


I read this and thought it was a typo of "main_worker" -- maybe spelling it out fully would be clearer.

Maybe I'll change it to prune_worker? I hate typing maintenance, I always get the e's and a's backwards the first time.

Done: 263d5ab

crates/bootstrap_srv/src/server.rs

mattyg · 2024-11-22T16:32:00Z

crates/bootstrap_srv/src/space.rs

+
+/// A map of spaces.
+#[derive(Clone)]
+pub struct SpaceMap(Arc<Mutex<HashMap<bytes::Bytes, Space>>>);


Is there a reason to use bytes::Bytes instead of the wrapper type you made SpaceId?

Yes. I see this server / protocol being useful for WAN discovery outside of projects using kitsune itself. Perhaps it would actually be better to exist outside this monorepo, but that is more overhead for us. At the very least, I didn't want this to have any dependencies on other kitsune crates.

In that case do you want to define a SpaceId or similar wrapper type in this crate just for readability?

Sure, I can start with just a typedef for code clarity. I don't think there's any need for a newtype yet.

Done: 263d5ab
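As a rough illustration of the "typedef, not newtype" choice discussed above: a type alias gives the key a readable name while staying fully interchangeable with the underlying byte type. In this sketch `Vec<u8>` stands in for `bytes::Bytes` so that it has no external dependencies, and the `Space` fields and the `push`/`len` methods are hypothetical, not the crate's actual API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for `bytes::Bytes`, used only to keep this sketch dependency-free.
type RawBytes = Vec<u8>;

/// A plain alias: a readable name that is still the same type underneath.
type SpaceId = RawBytes;

/// Placeholder for the per-space state.
#[derive(Default)]
struct Space {
    entries: Vec<Vec<u8>>,
}

/// A map of spaces, keyed by the alias rather than the raw byte type.
#[derive(Clone, Default)]
struct SpaceMap(Arc<Mutex<HashMap<SpaceId, Space>>>);

impl SpaceMap {
    /// Append an entry to a space, creating the space if needed.
    fn push(&self, space: SpaceId, entry: Vec<u8>) {
        let mut map = self.0.lock().unwrap();
        map.entry(space).or_default().entries.push(entry);
    }

    /// Number of entries currently held for a space.
    fn len(&self, space: &SpaceId) -> usize {
        let map = self.0.lock().unwrap();
        map.get(space).map(|s| s.entries.len()).unwrap_or(0)
    }
}
```

Because an alias adds no type safety, any `Vec<u8>` can be passed where a `SpaceId` is expected; a newtype would be the next step if that ever became a source of bugs.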

mattyg · 2024-11-22T16:32:23Z

crates/bootstrap_srv/src/space.rs

+
+impl SpaceMap {
+    /// Read the content of a space.
+    pub fn read(&self, space: &bytes::Bytes) -> std::io::Result<Vec<u8>> {


same here - should this be a SpaceId?

see #10 (comment)

Done: 263d5ab

crates/bootstrap_srv/src/space.rs

Co-authored-by: mattyg <[email protected]>

crates/bootstrap_srv/src/store.rs

crates/bootstrap_srv/src/space.rs

zippy · 2024-11-22T17:48:42Z

crates/bootstrap_srv/src/server.rs

+    fn drop(&mut self) {
+        self.cont.store(false, std::sync::atomic::Ordering::SeqCst);
+        for worker in self.workers.drain(..) {
+            let _ = worker.join().expect("Failure shutting down worker thread");


Is there any value in passing the error through here?

What would you like to pass this through? This is a drop impl, we don't have many options. Technically you're not supposed to panic in drop impls, you'll get a stack dump that is not very useful... I'm open to other suggestions... but rust's handling of panics in other threads is pretty poor, I haven't really heard of a better pattern.

Doesn't the main thread = server thread panic here?

It needn't be the main thread if the server were passed to a different thread. But yes, whichever thread the server instance variable is dropped on will panic.

Won't that stop the whole server when it's only about dropping workers?

I'd suggest that this doesn't have to be a drop implementation here. The main program is doing the drop explicitly but you only need to create one instance of this per program, so it could equally be a shutdown method that can return an error?

A drop implementation that calls shutdown().unwrap() would be good, to make sure that the user gets some attempt at shutdown if they forget to call it

Done: ebac6ff
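The pattern the thread converged on can be sketched roughly as follows: an explicit `shutdown` method that reports a failed worker as an error, plus a `Drop` fallback that calls it. The `Server` struct, its `new` constructor, and the error message are assumptions for illustration only; the actual change is in ebac6ff.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::JoinHandle;

/// Hypothetical server handle owning a continue-flag and its worker threads.
struct Server {
    cont: Arc<AtomicBool>,
    workers: Vec<JoinHandle<()>>,
}

impl Server {
    /// Start `count` workers that idle until the continue-flag is cleared.
    fn new(count: usize) -> Self {
        let cont = Arc::new(AtomicBool::new(true));
        let workers = (0..count)
            .map(|_| {
                let cont = cont.clone();
                std::thread::spawn(move || {
                    while cont.load(Ordering::SeqCst) {
                        std::thread::sleep(std::time::Duration::from_millis(1));
                    }
                })
            })
            .collect();
        Self { cont, workers }
    }

    /// Signal the workers to stop and join them, reporting a panicked worker
    /// as an error. A second call is a no-op because the list was drained.
    fn shutdown(&mut self) -> std::io::Result<()> {
        self.cont.store(false, Ordering::SeqCst);
        let mut failed = false;
        for worker in self.workers.drain(..) {
            // join() returns Err if the worker thread panicked
            if worker.join().is_err() {
                failed = true;
            }
        }
        if failed {
            return Err(std::io::Error::other("worker thread panicked"));
        }
        Ok(())
    }
}

impl Drop for Server {
    fn drop(&mut self) {
        // Fallback for callers that forget shutdown(); this still panics on
        // failure, as suggested in the thread.
        self.shutdown().unwrap();
    }
}
```

A caller that wants to handle the failure calls `shutdown()` first and inspects the result; the later drop then finds no workers left to join.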

zippy · 2024-11-22T18:56:28Z

crates/bootstrap_srv/src/store.rs

+/// and having too many file handles open.
+///
+/// Start with 10MiB?
+const MAX_PER_TEMPFILE: u64 = 1024 * 1024 * 10;


Should this also be in the config?

The pressure leading to wanting this larger is the inefficiency of having multiple tempfiles open. This lets us store more than 10_000 entries in one file (because infos are often much less than 1024 bytes long). It doesn't feel like we should need this to be much larger.

The pressure leading to wanting this smaller is the constraints of disk space. On a 16 core system (cloud servers will likely have far fewer cores than that) the default production config will have 64 active write tempfiles, giving 64MiB. Even if we have 10 older generations still around, that's only 640 MiB. Do we think servers will ever have that little disk space? Or to look at it from the opposite perspective, let's say the cloud server provides 10GiB of disk space. We can store 10,485,760 agent records at once, even if many of those are defunct, waiting to be cleaned up.

I don't feel like dev-ops folks are going to understand the consequences of making changes to this number. Providing the option to may just be confusing, and not practically useful. What do you think?

P.S. Re-reading this post, I'm not sure I even described it without being confusing...
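The record counts quoted in this comment can be reproduced with a small calculation. This sketch assumes a worst-case record size of 1024 bytes, which is the figure the comment uses; `ASSUMED_INFO_SIZE` and `records_that_fit` are names made up for the example.

```rust
/// Size cap per tempfile, as in the diff: 10 MiB.
const MAX_PER_TEMPFILE: u64 = 1024 * 1024 * 10;

/// Assumed upper bound on the size of one agent info record, in bytes.
const ASSUMED_INFO_SIZE: u64 = 1024;

/// How many records of the assumed size fit into `bytes` of storage.
fn records_that_fit(bytes: u64) -> u64 {
    bytes / ASSUMED_INFO_SIZE
}
```

Under that assumption, `records_that_fit(MAX_PER_TEMPFILE)` is 10,240 (the "more than 10_000 entries in one file") and `records_that_fit(10 * 1024 * 1024 * 1024)` is 10,485,760 (the 10 GiB figure).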

jost-s

Looks great! Pruning is tested in the other PR - so just a couple of questions.

crates/bootstrap_srv/src/bin/kitsune2-bootstrap-srv.rs

crates/bootstrap_srv/src/parse.rs

jost-s · 2024-11-25T18:52:04Z

crates/bootstrap_srv/src/server.rs

+    fn drop(&mut self) {
+        self.cont.store(false, std::sync::atomic::Ordering::SeqCst);
+        for worker in self.workers.drain(..) {
+            let _ = worker.join().expect("Failure shutting down worker thread");


Doesn't the main thread = server thread panic here?

crates/bootstrap_srv/src/server.rs

Co-authored-by: Jost Schulte <[email protected]>

crates/bootstrap_srv/src/bin/kitsune2-bootstrap-srv.rs

ThetaSinner · 2024-11-25T21:16:51Z

crates/bootstrap_srv/src/config.rs

+    ///
+    /// Defaults:
+    /// - `testing = 32`
+    /// - `production = 32`


Should this be higher in production? Like 10k?

No, this is just for bootstrap. I'd prefer to handle more spaces and fewer agents per space than the other way around. Once they are in the space, they can get additional peers by gossip.

Makes sense!

ThetaSinner · 2024-11-25T21:19:17Z

crates/bootstrap_srv/src/parse.rs

@@ -0,0 +1,83 @@
+/// An entry with known content.


Would be nice to point to where this is documented from here?

Done: 85dffc7

ThetaSinner · 2024-11-25T21:22:45Z

crates/bootstrap_srv/src/server.rs

+use crate::*;
+use tiny_http::*;
+
+/// Don't allow created_at to be greater or less than this far away from now.


Suggested change

/// Don't allow created_at to be greater or less than this far away from now.

/// Don't allow created_at to be greater than this far away from now.

I think the "absolute" is implied by the sentence and the less is a little confusing

Done: b407784

ThetaSinner · 2024-11-25T21:26:24Z

crates/bootstrap_srv/src/server.rs

+    fn drop(&mut self) {
+        self.cont.store(false, std::sync::atomic::Ordering::SeqCst);
+        for worker in self.workers.drain(..) {
+            let _ = worker.join().expect("Failure shutting down worker thread");


I'd suggest that this doesn't have to be a drop implementation here. The main program is doing the drop explicitly but you only need to create one instance of this per program, so it could equally be a shutdown method that can return an error?

ThetaSinner · 2024-11-25T21:27:12Z

crates/bootstrap_srv/src/server.rs

+    fn drop(&mut self) {
+        self.cont.store(false, std::sync::atomic::Ordering::SeqCst);
+        for worker in self.workers.drain(..) {
+            let _ = worker.join().expect("Failure shutting down worker thread");


A drop implementation that calls shutdown().unwrap() would be good, to make sure that the user gets some attempt at shutdown if they forget to call it

ThetaSinner · 2024-11-25T21:33:00Z

crates/bootstrap_srv/src/server.rs

+        let prune_cont = cont.clone();
+        let prune_space_map = space_map.clone();
+        workers.push(std::thread::spawn(move || {
+            prune_worker(config, prune_cont, prune_space_map)


It would be bad if this thread died and we didn't know. Doesn't necessarily seem like a problem for this issue but would that be a good thing to track somewhere?

Since we haven't adopted any tracing in this binary yet (it is less needed with server code, because we probably just want it all going to systemd log anyways) I've just eprintln-d it for now. We can do something different easily in the future: 7923234
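One way to make a dying worker visible without a tracing dependency is to wrap the thread body so that both an error return and a panic end up on stderr. This is a sketch only: `spawn_reported` is a hypothetical helper, and the actual change in 7923234 may be structured differently.

```rust
use std::thread::JoinHandle;

/// Run `worker` on its own thread and log to stderr if it exits with an
/// error or panics. The join handle yields `true` only on a clean exit.
fn spawn_reported<F>(name: &'static str, worker: F) -> JoinHandle<bool>
where
    F: FnOnce() -> std::io::Result<()> + Send + 'static,
{
    std::thread::spawn(move || {
        // catch_unwind turns a panic inside the worker into an Err value
        let outcome = std::panic::catch_unwind(std::panic::AssertUnwindSafe(worker));
        match outcome {
            Ok(Ok(())) => true,
            Ok(Err(err)) => {
                eprintln!("worker {name} exited with error: {err}");
                false
            }
            Err(_) => {
                eprintln!("worker {name} panicked");
                false
            }
        }
    })
}
```

Since everything goes to stderr, the messages would land in the systemd journal when the binary runs as a service, which fits the reasoning given above.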

ThetaSinner · 2024-11-25T21:34:43Z

crates/bootstrap_srv/src/server.rs

+            return Err(std::io::Error::other("InvalidExpiresAt"));
+        }
+
+        // validate signature (do this at the end because it's more expensive


Suggested change

// validate signature (do this at the end because it's more expensive

// validate signature (do this at the end because it's more expensive)

Done: 68c1004

ThetaSinner · 2024-11-25T21:44:23Z

crates/bootstrap_srv/src/store.rs

This module is a little scary to me :) The documentation makes sense, and I can follow the code, but I'm not sure how well I can assess the behavior.

I agree. It would be nice to find a way to test it, at least a little bit. Perhaps if I made the max file size configurable at least internally, and then wrote some tests that explicitly dropped read handles and checked if the files still exist on-disk? I don't know if that would be flaky or not...

I've been through this in detail now and I'm much happier. Thank you for letting me try and fail to pick it apart so I could make sense of it :)

ThetaSinner

Still working through the store, will pick up after meetings

ThetaSinner · 2024-11-26T12:37:48Z

crates/bootstrap_srv/src/store.rs

+//!   open a new tempfile for writing.
+//! - The older read handles will persist the existence of the older tempfiles
+//!   until the last read reference is dropped, at which point the tempfile
+//!   will be cleaned up by the os.


Is the OS guaranteed to do this right away? I thought this would only happen on reboot unless the tempfile implementation in the code has logic to cleanup when dropped

The tempfile drop implementation indeed cleans up the tempfile. If that fails for any reason (panic on the thread) then it is up to the os to clean it up. That's completely system specific.

I'd optionally include that in the documentation here but maybe it's too much detail

Added some details: ea028b0
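The "file lives until the last read handle is dropped" behaviour can be illustrated with standard-library types alone. To be clear, this is not how the `tempfile` crate or the store module is implemented; `TempGuard`, `ReadHandle`, and `create_temp` are hypothetical names, and the sketch only shows the reference-counting idea.

```rust
use std::fs::File;
use std::io::Write;
use std::path::PathBuf;
use std::sync::Arc;

/// Owns a file on disk and removes it when the last owner is dropped.
struct TempGuard {
    path: PathBuf,
}

impl Drop for TempGuard {
    fn drop(&mut self) {
        // Best effort: if removal fails, cleanup is left to the OS or an operator.
        let _ = std::fs::remove_file(&self.path);
    }
}

/// A read handle is a shared reference to the guard.
type ReadHandle = Arc<TempGuard>;

/// Create a file in the system temp directory and return the first handle to it.
fn create_temp(name: &str, content: &[u8]) -> std::io::Result<ReadHandle> {
    let path = std::env::temp_dir().join(name);
    File::create(&path)?.write_all(content)?;
    Ok(Arc::new(TempGuard { path }))
}
```

Cloning the `Arc` hands out another read handle; the file is removed only when the final clone goes out of scope, and if that drop never runs (for example after an abort), the file is left for the system to clean up.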

crates/bootstrap_srv/src/store.rs

neonphog added 2 commits November 21, 2024 13:16

deps and other changes in prep for bootstrap srv PRs

2487361

core kitsune2 bootstrap server code

a5da1ed

neonphog requested a review from a team November 21, 2024 20:24

This was referenced Nov 21, 2024

Kitsune2 Bootstrap Server--Testing #11

Open

Kitsune2 Bootstrap Server--Preparation #9

Merged

wip: bootstrap server #7

Closed

mattyg reviewed Nov 22, 2024


Update crates/bootstrap_srv/src/space.rs

59abeea

Co-authored-by: mattyg <[email protected]>


Update crates/bootstrap_srv/src/store.rs

47296a7


Base automatically changed from bootstrap-prep to main November 22, 2024 17:25

zippy reviewed Nov 22, 2024


neonphog added 3 commits November 22, 2024 13:15

Merge branch 'main' into bootstrap-core

7e548b2

address review comments

263d5ab

Merge branch 'main' into bootstrap-core

c99e568

jost-s reviewed Nov 25, 2024


neonphog and others added 2 commits November 25, 2024 13:00

Update crates/bootstrap_srv/src/server.rs

d265c14

Co-authored-by: Jost Schulte <[email protected]>

review comment

a4ac1e1

ThetaSinner reviewed Nov 25, 2024


ThetaSinner reviewed Nov 26, 2024


neonphog added 6 commits November 26, 2024 13:25

review comment

ebac6ff

review comment

85dffc7

review comment

b407784

review comment

7923234

review comment

68c1004

review comment

ea028b0

