Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add api version enum for determining runtime behaviour #1362

Merged
merged 3 commits into from
Nov 24, 2020

Conversation

cryptonemo
Copy link
Collaborator

feat: add tests paths for both v1_0 and v1_1 versions
feat: allow api_version to be used as cli argument
fix: renamed ApiVersion and use semver string parsing

@cryptonemo
Copy link
Collaborator Author

Resolves #1357

let api_version = Version::parse(api_version_str)?;
match api_version.major {
1 => match api_version.minor {
0 => Ok(ApiVersion::V1_0),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should check that patch is 0

@@ -37,19 +38,3 @@ pub(crate) const TEST_SEED: [u8; 16] = [
pub const MAX_LEGACY_POREP_REGISTERED_PROOF_ID: u64 = 4;

pub type PoRepID = [u8; 32];
pub fn is_legacy_porep_id(porep_id: PoRepID) -> bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should keep this around and validate that we have the correct porep id for now

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add this back, but I don't think asserting this around too often is the best idea, as the behavior is fully decoupled from the porep_id at the moment, and that would explicitly tie it back to it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, but we should have this check for these two API versions at the high level at least, maybe in filecoin-proofs, just to make sure we maintain the same invariants that we have right now.

Copy link
Collaborator

@porcuquine porcuquine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I made one detailed suggestion about how to potentially structure (with an argument for why something like it might be valuable).

I also noted a few places where what appear to have been unrelated code changes and refactors were introduced. It would really be better not to slip those in amidst enormous procedural refactoring. They raise the burden of review and make it harder to evaluate the changeset later.

storage-proofs/core/Cargo.toml Show resolved Hide resolved
_ => Err(anyhow::format_err!(
"Could not parse API Version from string (major)"
)),
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be much clearer to have something like this (untested — but something more or less like this should work):

match (api_version.major, api_version.minor, api_version.patch) {
    (1, 0, 0) => Ok(ApiVersion::V1_0_0),
    (1, 1, 0) => Ok(ApiVersion::V1_1_0),
    (1, 1, _) | (1, 0, _) => => Err(anyhow::format_err!(
                         "Could not parse API Version from string (patch)"
                     )),
    (1, _, _) => Err(anyhow::format_err!(
                     "Could not parse API Version from string (minor)"
                 )),
    _ => Err(anyhow::format_err!(
                 "Could not parse API Version from string (major)"
             )),
}

I'm also not sure the specificity of the error messages is necessary. If not, you could maximize clarity and concision by just matching the two valid combinations and one fallback.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the simple 2 case distinction, this makes sense. I don't think this simplifies much for the general cases though (i.e. when we're actually using a range of versions across semver), in which case it needs to be broken out further into something like what's already there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated (minus the syntax error)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can match ranges in patterns too. I think if you order the clauses right, you might be able to use that fact to keep it relatively simple and legible. Maybe it will break down sometimes too, but it might be worth at least trying to isolate the complexity so remains mostly clear.

https://doc.rust-lang.org/book/ch18-03-pattern-syntax.html#matching-ranges-of-values-with-

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some more thoughts — nothing concrete to do now. Are you planning to use the semver matching facilities? In that case, the more complicated ranges could be encoded directly.

Example:

match api_version {
    v if VersionReq::parse("^1.2.3").matches(v) =>  todo!(),
    v if VersionReq::parse("^2.0").matches(v) => todo!(),
    v if VersionReq::parse("^1.0.0").matches(v) => todo!(),
}

I didn't think through the actual range matching in detail; this was just to indicate the form. Depending on what kinds of things come up, it might make sense to replace these explicit matches calls with custom predicates that can live in one place and capture the intended partitioning of versions as it evolves. If the same partitions are used in multiple decisions and become at all complicated, having a single source of truth to audit, update, and keep in sync between usages may be useful.

graph_bucket_aux::<H>(legacy_porep_id);
graph_bucket_aux::<H>(new_porep_id);
graph_bucket_aux::<H>(legacy_porep_id, ApiVersion::V1_0_0);
graph_bucket_aux::<H>(new_porep_id, ApiVersion::V1_1_0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the idea that the decision is now keyed only on the ApiVersion? Does it actually make sense to differentiate the porep_id when in this context, or is that just a way the test might pass but using the (previous) mechanism?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a number of tests that have 'new' or 'legacy' porep that can be simplified by removing that distinction completely (they just need to be different). We can revise those further, but I didn't see it as necessary on the first cut through. I agree that 'new' and 'legacy' porep_ids is just confusing, so could see doing that now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine to wait — was just checking to make sure I understand the new world.

let mut challenge_hasher = Sha256::new();
challenge_hasher.update(AsRef::<[u8]>::as_ref(&pub_inputs.randomness));
challenge_hasher.update(&u64::from(sector.id).to_le_bytes()[..]);

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this? Is there supposed to be some changed behavior? It would really make current and historical review of this 45-file changeset simpler if it were restricted to only changes associated with introduction of this versioning scheme. If this is a simple refactor, it would be much better to put it in a distinct PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair, but this brings these parts of the code up to date with the 'new normal' way of how these are done.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dignifiedquire Thoughts here? I can back this out of this PR to make a new PR for just those updates, but it seemed clear enough to me on this pass. Either way, it's pretty straight-forward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put it into its own commit, fine in this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For review, yes, I'll push shortly. When we squash/merge it makes little difference in that case :-)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we squash the rest before merge and merge regularly so we end up with two commits on master

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, updated into 2 commits, although I much prefer the squash/merge to the 'regular' merge.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next time again

storage-proofs/post/src/fallback/vanilla.rs Outdated Show resolved Hide resolved
(0, &mut parents[1..])
let (predecessor_index, other_drg_parents) = match self.api_version {
ApiVersion::V1_0_0 => (m_prime, &mut parents[..]),
ApiVersion::V1_1_0 => (0, &mut parents[1..]),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's an idea. What if instead of doing this kind of ad hoc inline matching, we only enact conditional code by calling methods of the ApiVersion? The downside: loss of local context in the implementations. The upside: it would be very easy to maintain and inspect an exhaustive set of differences between versions. This example would be something like (untested):

trait DRGParentOrdering {
    …
}

impl DRGParentOrdering for ApiVersion::1_0_0 {
    fn parent_ordering(&self, m_prime: usize, parents: &mut[T]) -> (usize, &mut[T]) {
        (m_prime, parents)
    }

impl DRGParentOrdering for ApiVersion::1_1_0 {
    fn parent_ordering(&self, m_prime: usize, parents: &mut[T]) -> (usize, &mut[T]) {
        (0, parents[1..])
    }
}

… and then:

let (predecessor_index, other_drg_parents) = self.api_version.parent_ordering(m_prime, parents);

Add appropriate comments, and this becomes a very clear and self-documenting record of exactly what has changed between versions, as well as for each atomic patch's dependencies.

In addition to the self-documenting property, this would help enforce a discipline to combat the almost inevitable spaghetti-like nature of changes likely to accumulate. By establishing and enforcing such a pattern, you would disable and disarm the temptation to just make the implementation a little more complicated, a little more clever — a complexity which might accumulate in the name of simplicity and be difficult to untangle later. Such accumulations and potential multiplicative complexity are one of the things that worries me most about this path (although I understand why its general shape is probably the right one now).

This is one reason I think the right discipline and a strong set of rules (even if they have to be amended in accordance with need over time) will go a long way toward protection. If something is too difficult to do within the trait-patch method, it might be a good idea to rethink it. Admittedly, this proposal has the disadvantage of moving the actual logic out of line. I think this could be solved with a little tooling. One idea: a macro which inlines the values for the current — i.e. correct version and defers to to trait implementations for the others.

let (predecessor_index, other_drg_parents) = with_version_or!(DRGParentOrdering, (0, &mut [parents[..]));

This assumes there is a concept of a 'current version', and that it is tracked in the code, but I think that may be a good/safe assumption.

Again, I realize this is more work, and introduces a certain amount of awkwardness. At the same time, I think the rigidity can be protective by:

  • Making it harder to do ad hoc things or to introduce unexpected variations or multiplicative complexity into the code.
  • Isolating and forcing centralized documentation of all version deltas.
  • Forcing a higher level of discipline around the form of allowable version-based changes. Nothing prevents gradually extending the complexity of what is allowed, but the need to do that or else view the result of shortcutting it as at least technical debt, might force more deliberate attention around each such change introduced and how they interact.

I realize I this all comes across as very grave, but as we have discussed before, I think the approach we are taking here (however necessary) skirts a dangerous territory — and it's well worth extra forethought and extra discipline to ensure we stay on the right side of the line. Avoiding the problems is easy now, at the first very simple change. It's later, as they stack up, grow in complexity, and need to survive refactors, etc. that it might become hard to deal with. Whatever policies we can put in place to prevent that will be valuable. Enshrining those in code makes it just that much harder to break or change them in the heat of the moment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's nice for simple cases (like the above), but won't work when parameters or return values change. Each behavior change will have to be considered differently at some point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea, but I fear it is not quite practical, as we end up with circular imports. ApiVersion is defined in storage-proofs-core, but if we were to push all these details onto it, we would have to use types in core from storage-proofs-porep and storage-proofs-post, resulting in a circle.

One idea to help reduce the random places with api versions matching, is to at least always put the two different versions into a function call, named with a suffix of _v_1_0_0 or simlar, for example

impl Bar {
  fn foo(&self, version: ApiVersion) {
    match version {
      ApiVersion::V1_0_0 => self.foo_v1_0_0(),
      ApiVersion::V1_1_0 => self.foo_v1_1_0(),
    }
  }

  fn foo_v1_0_0(&self) {
    todo!()
  }
  fn foo_v1_1_0(&self) {
    todo!()
  }
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dignifiedquire This is also a nice re-factor for the simple case, but breaks down when args or return types change (at least the dispatch foo above. It could work if there was no dispatch and just separate versioned foo methods -- but that pushes the dispatch (i.e. match logic) up further if the return type changes, which is exactly what's trying to be avoided, no?). The upshot of these discussions are that they are re-factors, which can be done at any time (and probably will no matter what we choose here), independent of the fact that we've settled on a logic switching mechanism for upgrades.

For this first cut, imposing a formality around something that's known to not work in all cases seems arbitrary, but if it's the will of the team, we can.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we don’t need to figure this out before merging

Copy link
Collaborator

@porcuquine porcuquine Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't let it hold you up. I more wanted to inject some ideas into the mix for as the ongoing refactors @cryptonemo predicts happen. Changes to the API signature are obviously a larger problem that will need to be considered (perhaps soon). That may be somewhat orthogonal.

This is all just continued brainstorming and not meant to be blocking — just as feed for ongoing design and consideration:

@dignifiedquire You could implement the traits in the most local scope they are needed. That would be less centralized, but still easy to inspect and would capture information of itself.

feat: add tests paths for both v1_0_0 and v1_1_0 versions
feat: allow api_version to be used as cli argument
fix: renamed ApiVersion and use semver string parsing
@dignifiedquire dignifiedquire merged commit 59da4c1 into master Nov 24, 2020
@dignifiedquire dignifiedquire deleted the upgrade-graph-only branch November 24, 2020 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants