Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicated endpoints for simulated Crucible Pantry make upgrades tricky #4599

Open
bnaecker opened this issue Dec 1, 2023 · 2 comments
Open
Assignees

Comments

@bnaecker
Copy link
Collaborator

bnaecker commented Dec 1, 2023

In #4497, we added a new optional API endpoint parameter in Nexus. While propagating that to Crucible in oxidecomputer/crucible#1039, we had to make a breaking change to the Crucible Pantry API. Those were due to openapi-lint recommendations, which came along for the ride but were needed to satisfy all the cross-repo Cargo dependencies.

Then during #4571, we made the new parameter required, and also updated the pinned commit for Crucible (among other clients). I made a silly mistake there, updating the package-manifest.toml file but not Cargo.toml. (Really, I should have used ./tools/update_crucible.sh, but I wasn't aware of it.) That meant that the Crucible Pantry client that Nexus was built against was different from the deployed server, leading to issues like #4595.

@leftwo and I attempted to resolve that in #4597, but surprisingly hit some failures in CI. The failures show a failing request from Nexus to the Pantry -- it expected a 204, and instead got a 400.

This was very surprising exactly because we now believed the client and server to be in sync. However, these tests run against a simulated Crucible Pantry. One of those is an API endpoint that had to change to satisfy the OpenAPI lint rules:

path = "/crucible/pantry/0/volume/{id}/bulk_write",

There are a few other changes that need to happen as well, which are going to be part of #4597. It would be great to make this more difficult to break. I am planning to add some kind of sanity check that the simulated and real API match. There may be other and better ways to prevent or catch these changes in the future too.

@bnaecker bnaecker self-assigned this Dec 1, 2023
bnaecker added a commit that referenced this issue Dec 1, 2023
bnaecker added a commit that referenced this issue Dec 1, 2023
bnaecker added a commit that referenced this issue Dec 1, 2023
bnaecker added a commit that referenced this issue Dec 1, 2023
@davepacheco
Copy link
Collaborator

Would it help if the simulated implementation were in the same repo as the real one?

@bnaecker
Copy link
Collaborator Author

bnaecker commented Dec 6, 2023

I think that might help in some ways. That would at least make it easier to see mismatches between routes, or there could even be a cfg directive that chooses a real or simulated implementation of the endpoint, but the actual function decorated with #[endpoint] would be the same.

One downside I think is that sled-agent would need to depend on crucible more directly, whereas now I think the dependency is only on the client types. I'm not sure how big of an issue that is practically.

bnaecker added a commit that referenced this issue Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants