persist: add some unit tests #12826

Merged
danhhz merged 3 commits into MaterializeInc:main from persist_tests
Jun 10, 2022

@danhhz danhhz commented Jun 1, 2022

See commits for details. The first commit isn't a test, but it was small so I snuck it in.

Motivation

  • This PR adds a feature that has not yet been specified.

Testing

  • This PR has adequate test coverage / QA involvement has been duly considered.

Release notes

This PR includes the following user-facing behavior changes:

  • N/A

@danhhz danhhz requested review from aljoscha and ruchirK June 1, 2022 21:38
return Err(Since(self.since.clone()));
}
let mut machine = self.machine.clone();
let () = machine.listen(&as_of).await?;
Contributor Author

this change hangs the open_loop benchmark because it creates listeners as_of zero at startup and immediately awaits them before any writes have come in. any ideas how to proceed? everything I've come up with has been awful.

let listen = reader
.listen(Antichain::from_elem(0))
.await

Contributor

Maybe this is an early sign that the previous behaviour of listen() was more ergonomic? (See my other comment)

@aljoscha aljoscha left a comment

Nice optimization! 🙌

Regarding the change in semantics of listen(): what's the reasoning behind that? It doesn't seem to be necessary for the optimization, and the test for the optimization could also be written with the old semantics. I'm asking because it seemed natural to me that snapshot() would block, because it does "get me the entire data up to as_of" while listen() felt more like an async stream where creating the stream at any legal as_of would not block but then updates would only trickle in once they are available.
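To make the two semantics concrete, here is a toy, synchronous sketch. `Shard`, `Listen`, `ListenError`, and both constructors are hypothetical stand-ins rather than persist's API, and the real code would await instead of returning `NotYetAvailable`. It also assumes a listener emits updates at times strictly greater than `as_of`.

```rust
// Toy model only: none of these names are persist's real types.
#[derive(Debug, Default)]
struct Shard {
    since: u64,                  // as_of values below this are no longer legal
    upper: u64,                  // times below this have been written
    updates: Vec<(String, u64)>, // (data, time), appended in time order
}

#[derive(Debug)]
struct Listen {
    as_of: u64,
    emitted: usize, // how many matching updates were already handed out
}

#[derive(Debug, PartialEq)]
enum ListenError {
    Since(u64),
    NotYetAvailable, // stands in for "the real call would block here"
}

impl Shard {
    fn write(&mut self, data: &str, time: u64) {
        assert!(time >= self.upper, "writes must be at or beyond the upper");
        self.updates.push((data.to_owned(), time));
        self.upper = time + 1;
    }

    /// Proposed (later reverted) semantics: creating the listener waits
    /// until the shard's upper has passed `as_of`.
    fn listen_waiting(&self, as_of: u64) -> Result<Listen, ListenError> {
        if as_of < self.since {
            return Err(ListenError::Since(self.since));
        }
        if self.upper <= as_of {
            return Err(ListenError::NotYetAvailable);
        }
        Ok(Listen { as_of, emitted: 0 })
    }

    /// Original semantics: any legal `as_of` yields a listener right away.
    fn listen_lazy(&self, as_of: u64) -> Result<Listen, ListenError> {
        if as_of < self.since {
            return Err(ListenError::Since(self.since));
        }
        Ok(Listen { as_of, emitted: 0 })
    }
}

impl Listen {
    /// Returns the not-yet-emitted updates with time > as_of (possibly none).
    fn next(&mut self, shard: &Shard) -> Vec<(String, u64)> {
        let fresh: Vec<(String, u64)> = shard
            .updates
            .iter()
            .filter(|(_, t)| *t > self.as_of)
            .skip(self.emitted)
            .cloned()
            .collect();
        self.emitted += fresh.len();
        fresh
    }
}

fn main() {
    let mut shard = Shard::default();
    // Startup case from the benchmark: listen at as_of 0 before any writes.
    println!("waiting: {:?}", shard.listen_waiting(0).err());
    let mut listen = shard.listen_lazy(0).expect("as_of 0 is legal");
    println!("lazy, before write: {:?}", listen.next(&shard));
    shard.write("a", 1);
    println!("lazy, after write: {:?}", listen.next(&shard));
}
```

With the lazy variant, the startup pattern from the open_loop benchmark gets a listener immediately and updates trickle in later, which is the behavior argued for above.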

"{:?}",
client.open::<Vec<u8>, String, u64, i64>(shard_id).await
),
"Err(CodecMismatch { requested: (\"Vec<u8>\", \"String\", \"u64\", \"i64\"), actual: (\"String\", \"String\", \"u64\", \"i64\") })"
Contributor

Why are you using string comparison instead of something like:

            assert_eq!(
                client
                    .open::<Vec<u8>, String, u64, i64>(shard_id)
                    .await
                    .unwrap_err(),
                InvalidUsage::CodecMismatch {
                    requested: tpe("Vec<u8>", "String", "u64", "i64",),
                    actual: tpe("String", "String", "u64", "i64",),
                }
            );

Where tpe() is a helper I made up. Plus I had to add #[cfg_attr(test, derive(PartialEq, Eq))] on InvalidUsage.

The strings seem somewhat hard to maintain, but there probably is a good reason. 😅
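A self-contained sketch of both assertion styles, for comparison. The `InvalidUsage` enum, the `open` function, and `tpe()` here are simplified, hypothetical stand-ins (the real types live in persist), and the derive is unconditional rather than `#[cfg_attr(test, ...)]` so the example runs from `main`.

```rust
// Hypothetical stand-ins; only the shape of the error mirrors the thread.
type CodecTypes = (String, String, String, String);

#[derive(Debug, PartialEq, Eq)]
enum InvalidUsage {
    CodecMismatch {
        requested: CodecTypes,
        actual: CodecTypes,
    },
}

/// The made-up helper from the comment: builds the (K, V, T, D) name tuple.
fn tpe(k: &str, v: &str, t: &str, d: &str) -> CodecTypes {
    (k.to_owned(), v.to_owned(), t.to_owned(), d.to_owned())
}

/// Pretends the shard was registered as (String, String, u64, i64).
fn open(requested: CodecTypes) -> Result<(), InvalidUsage> {
    let actual = tpe("String", "String", "u64", "i64");
    if requested == actual {
        Ok(())
    } else {
        Err(InvalidUsage::CodecMismatch { requested, actual })
    }
}

fn main() {
    let res = open(tpe("Vec<u8>", "String", "u64", "i64"));

    // Style 1: compare the Debug rendering, as the test in the diff does.
    assert_eq!(
        format!("{:?}", res),
        "Err(CodecMismatch { requested: (\"Vec<u8>\", \"String\", \"u64\", \"i64\"), actual: (\"String\", \"String\", \"u64\", \"i64\") })"
    );

    // Style 2: compare structured values, which needs PartialEq on the error.
    assert_eq!(
        res.unwrap_err(),
        InvalidUsage::CodecMismatch {
            requested: tpe("Vec<u8>", "String", "u64", "i64"),
            actual: tpe("String", "String", "u64", "i64"),
        }
    );
    println!("both assertion styles pass");
}
```

The structured form survives changes to the `Debug` output, while the string form pins the exact rendering.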

Contributor Author

I usually lean toward matching on error message in tests because that's often how they're consumed in production. unwrap_err is a good idea, I simply forgot it existed :). I'll switch to that!

@danhhz danhhz left a comment

Regarding the change in semantics of listen(): what's the reasoning behind that? It doesn't seem to be necessary for the optimization, and the test for the optimization could also be written with the old semantics. I'm asking because it seemed natural to me that snapshot() would block, because it does "get me the entire data up to as_of" while listen() felt more like an async stream where creating the stream at any legal as_of would not block but then updates would only trickle in once they are available.

I'm convinced!

@ruchirK ruchirK left a comment

First two commits look good to me. I'm still reading the third commit and it's taking me a bit to internalize because I'm slow this morning, but don't let that block merging!

"Err(CodecMismatch { requested: (\"String\", \"String\", \"i64\", \"i64\"), actual: (\"String\", \"String\", \"u64\", \"i64\") })"
);
// We can't test the D param mismatch currently because i64 is literally
// the only type that implements both Codec64 and Semigroup right now.
Contributor

hmmm this is really surprising to me because Semigroup/Monoid should be implemented for the unsigned integers and i guess it just never was a pressing need

Contributor

opened TimelyDataflow/differential-dataflow#368 which once it merges and we bump differential should let us test the diff param mismatch
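As a rough illustration of why a second diff type is needed for that test, the sketch below uses local traits named `Semigroup` and `Codec64`. These are simplified stand-ins defined here, not the actual differential-dataflow or persist traits, and `check_diff` is a hypothetical helper.

```rust
// Local, simplified stand-ins for the two traits named in the code comment.
trait Semigroup {
    fn plus_equals(&mut self, rhs: &Self);
    fn is_zero(&self) -> bool;
}

trait Codec64 {
    fn codec_name() -> String;
}

impl Semigroup for i64 {
    fn plus_equals(&mut self, rhs: &Self) {
        *self += *rhs;
    }
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

impl Codec64 for i64 {
    fn codec_name() -> String {
        "i64".to_owned()
    }
}

// Once an unsigned integer also implements both traits, a D mismatch
// becomes expressible in a test.
impl Semigroup for u64 {
    fn plus_equals(&mut self, rhs: &Self) {
        *self += *rhs;
    }
    fn is_zero(&self) -> bool {
        *self == 0
    }
}

impl Codec64 for u64 {
    fn codec_name() -> String {
        "u64".to_owned()
    }
}

/// Hypothetical check: does the requested diff type match the registered one?
fn check_diff<D: Semigroup + Codec64>(registered: &str) -> Result<(), String> {
    let requested = D::codec_name();
    if requested == registered {
        Ok(())
    } else {
        Err(format!(
            "diff mismatch: requested {}, actual {}",
            requested, registered
        ))
    }
}

fn main() {
    let mut diff = 2u64;
    diff.plus_equals(&3);
    println!("sum = {}, is_zero = {}", diff, diff.is_zero());
    println!("{:?}", check_diff::<i64>("i64"));
    println!("{:?}", check_diff::<u64>("i64"));
}
```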

Contributor Author

oh because u64 will implement Semigroup now? nice!

Contributor

yup!

Contributor Author

done, thanks for the timely fix!

@danhhz danhhz left a comment

ready for another look!

@@ -644,6 +643,11 @@ mod tests {
let mut snapshot = read.expect_snapshot(2).await;
let mut listen = read.expect_listen(0).await;

// Manually advance the listener's machine so that it has the latest
Contributor Author

I don't love this! some other options:

  • snapshot currently clones the machine so that the methods can be &self instead of &mut self but I don't think that's super important. removing that clone would happen to make this unnecessary because the listen would inherit the state from the snapshot call
  • change the listen call to try fetching updated state once if it's not immediately servable
  • dunno something else?

Contributor

looking right now! sorry about dropping this!

Contributor

I think snapshot mutating the reader makes the most sense to me! intuitively, having a mechanism whereby a reader changes after snapshotting seems to make sense, and it seems like we're doing

 // Hack: Keep this method `&self` instead of `&mut self` by cloning the
 // cached copy of the state, updating it, and throwing it away
 // afterward.

purely as a means to keep that method &self, which I think might have the rationale that it's more like the expectation for the API? I don't know of any stronger reason, and given all of that, I feel like &mut self is a fine way forward!
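The trade-off between the two signatures can be sketched with a toy model. `Machine`, `Consensus`, `Reader`, and `snapshot_mut` are hypothetical names for illustration, not the actual persist types.

```rust
// Toy model of the "clone to stay &self" hack versus a `&mut self` method.
#[derive(Clone, Debug)]
struct Machine {
    seqno: u64, // version of the cached shard state
}

struct Consensus {
    latest: u64, // version of the durable state
}

struct Reader {
    machine: Machine,
}

impl Reader {
    /// `&self` variant: clone the cached state, refresh the clone, and
    /// throw it away afterward. The reader's own cache stays stale.
    fn snapshot(&self, consensus: &Consensus) -> u64 {
        let mut machine = self.machine.clone();
        machine.seqno = consensus.latest;
        machine.seqno
    }

    /// `&mut self` variant: refresh the cache in place, so a later call
    /// (for example a listen) inherits the updated state.
    fn snapshot_mut(&mut self, consensus: &Consensus) -> u64 {
        self.machine.seqno = consensus.latest;
        self.machine.seqno
    }
}

fn main() {
    let consensus = Consensus { latest: 3 };
    let mut reader = Reader {
        machine: Machine { seqno: 0 },
    };

    reader.snapshot(&consensus);
    println!("after &self snapshot: cached seqno = {}", reader.machine.seqno);

    reader.snapshot_mut(&consensus);
    println!("after &mut snapshot: cached seqno = {}", reader.machine.seqno);
}
```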

Contributor

When I was rewriting this test locally, to fit with the previous (and now unchanged!) semantics of listen, I changed this to first do one next() call on listen, asserted against that. Then made it unreliable, and then fetched the rest of the listen events. Also slightly awkward, but doesn't require calling internal methods or changing signatures. 🤷‍♂️

Contributor Author

I think it makes a lot of sense to make snapshot &mut and remove the machine clone (and we should consider doing that independently), but after reverting my change to also make listen wait for as_of to be available, it seems pretty subtle for the test to rely on the fact that we call snapshot first. went with aljoscha's suggestion

danhhz commented Jun 9, 2022

either of you want to take another look at this? if not, I'll resolve these conflicts (and fix my lint issue) and merge

@aljoscha aljoscha left a comment

I had a nit and a comment. But I think this is good to merge!

/// An update was not beyond the expected lower of the batch
UpdateNotBeyondLower {
/// An update was not at or beyond the expected lower of the batch
UpdateNotAtOrBeyondLower {
Contributor

super nit: I think timely (and Frank) already understand "beyond" as "not less than", which means "at or greater", in layman's terms.
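A minimal sketch of that reading of "beyond", assuming totally ordered `u64` times and a frontier modeled as a plain slice of lower bounds. The `beyond` function is a hypothetical helper, not a timely API.

```rust
/// A time is "beyond" a frontier when it is not less than some element of
/// it, i.e. at or greater than at least one lower bound.
/// Nothing is beyond the empty frontier.
fn beyond(frontier: &[u64], time: u64) -> bool {
    frontier.iter().any(|lower| *lower <= time)
}

fn main() {
    let lower = [5u64];
    println!("4 beyond [5]: {}", beyond(&lower, 4));
    println!("5 beyond [5]: {}", beyond(&lower, 5));
    println!("6 beyond [5]: {}", beyond(&lower, 6));
}
```

Under this reading, an update exactly at the batch's lower already counts as beyond it, so the "AtOrBeyond" wording is redundant.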

Contributor Author

TIL (and I confirmed frank shares this interpretation)! reverted

@@ -200,29 +200,48 @@ where
}
}

pub async fn listen(&self, as_of: &Antichain<T>) -> Result<Self, Since<T>> {
@aljoscha aljoscha Jun 10, 2022

This is super nit, but: it feels a bit weird to have these methods on Machine and State, because ReadHandle doesn't really have to call them, they're just an additional layer of verification/assertion. We could maybe put that in a comment here or maybe call this verify_listen or sth.

Contributor Author

did both!

This was started in 12685, but the last mine was left as a TODO to avoid
making 12216 rebase. Now that 12216 is in, finish this work.
@danhhz danhhz left a comment

TFTRs!!

@danhhz danhhz enabled auto-merge June 10, 2022 18:04
This adds a performance optimization where a Listener doesn't fetch the
latest Consensus state if the one it currently has can serve the next
request. A similar thing already was true of SnapshotIter, so also
included is a test that covers both.
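The idea behind the optimization can be sketched with a toy model that counts fetches. `Consensus`, `Listener`, and their fields are hypothetical and much simpler than the real persist types. The assumption here is that the listener serves one time per call and refreshes its cached upper at most once per call, only when the cache cannot serve the request.

```rust
use std::cell::Cell;

/// Toy stand-in for the durable state store; counts how often it is read.
struct Consensus {
    upper: u64,
    fetches: Cell<u32>,
}

impl Consensus {
    fn new(upper: u64) -> Self {
        Consensus { upper, fetches: Cell::new(0) }
    }
    fn fetch_upper(&self) -> u64 {
        self.fetches.set(self.fetches.get() + 1);
        self.upper
    }
    fn fetches(&self) -> u32 {
        self.fetches.get()
    }
}

/// Toy listener: serves times below its cached upper, one per call.
struct Listener {
    cached_upper: u64,
    frontier: u64, // next time to serve
}

impl Listener {
    fn next(&mut self, consensus: &Consensus) -> Option<u64> {
        // Only go to Consensus when the cached state cannot serve the request.
        if self.cached_upper <= self.frontier {
            self.cached_upper = consensus.fetch_upper();
        }
        if self.cached_upper <= self.frontier {
            return None; // nothing new yet, even after refreshing
        }
        let served = self.frontier;
        self.frontier += 1;
        Some(served)
    }
}

fn main() {
    let consensus = Consensus::new(3);
    let mut listener = Listener { cached_upper: 0, frontier: 0 };
    for _ in 0..4 {
        let served = listener.next(&consensus);
        println!("served {:?}, fetches so far: {}", served, consensus.fetches());
    }
}
```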
@danhhz danhhz merged commit fa2dd29 into MaterializeInc:main Jun 10, 2022
@danhhz danhhz deleted the persist_tests branch June 10, 2022 19:06