Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

define rules and cleanup dependency tree to enable sustainable releases in the future #289

Closed
MonsieurNicolas opened this issue Jul 29, 2022 · 9 comments
Assignees

Comments

@MonsieurNicolas
Copy link
Contributor

To expedite development we've postponed looking at the dependency tree with "sustainability" in mind.

In particular we have not defined goals (potentially conflicting) for the different parts of the code.

Something like:

  • Code that gets deployed at the core layer (like host functions, and native contracts) have the most extreme restrictions, and have to go through the CAP process, in particular:
    • we have to guarantee backward compat at the "bit level" (cannot change behavior/meta for historical transactions)
    • highest bar for security and performance review (this has implication on dependencies that have to be reviewed every time they get revved)
    • "never panic" (especially relevant to "native contract" type of code that depend on larger parts of the SDK)
    • release cadence is probably at most "quarterly" (with possibility of emergency security release, which adds a layer of "branching")
  • SDK updates that may impact contracts
    • certain changes may be reflected in a "SEP" when defining "APIs" or semantics expected to be maintained across (potentially different versions) contracts
    • dependency monitoring on a case-by-case basis
    • shipping cadence ~monthly
  • Code that gets deployed as part of the SDK to support the "developer experience"
    • Test, fuzzing, performance related for example, or to support the IDE
    • No impact on contracts themselves
    • No particular restriction on shipping (can be nightly)

Those goals need to be reconciled with how the various components are organized, depend on each other (some dependency injection/inversion may be needed) and released.

@MonsieurNicolas
Copy link
Contributor Author

Looking at a recent CVE in Go's bignum package, we may also want to ban certain APIs (that tend to create certain classes or security issues) for code that ends up at the core layer. For example, anything related to marshalling from/to strings as code in that space tends to do "way too much stuff" and it's probably not practical to code review all that code anyways. We'll need a linter of sorts to help ensure that those crates are "pure" in that respect.

@leighmcculloch
Copy link
Member

leighmcculloch commented Oct 19, 2022

Code that gets deployed as part of the SDK to support the "developer experience"

The test experience in the SDK is largely supported by the soroban-env-host crate, because the meaningful logic supporting testing is the Host, so while there is some functionality to support testing in the SDK, it is very minimal, so minimal I don't think it is worth the overhead of engineering a separation for.


The above mostly discusses the distinctions between the core layer and the SDK, however there are layers above the SDK that are much easier to separate from the core concerns. Most functionality in the SDK can be broken up into three parts:

  • A thin layer on-top of unsafe operations. This is very delicate, and an intimate knowledge of the host is required. We should be very careful how we modify this.
  • Generation of contract specs and type conversions. We have to be careful with compatibility with this code.
  • A thin layer of test utilities built on top of the soroban_env_host::Host. This is tighty coupled. The SDK is a consumer of the Host API in a very similar way that Core is a consumer of it. It is extremely important that the SDKs use of the Host is as consistent as possible as Core's because a contracts ability to reliably test using it is dependent on this.

Overall there seems to me to be an inherent tight coupling of the concerns in the SDK and the env crates, and significant coordination between the two, even though there exists solid interfaces between the two.

However, if we step away from the SDK, there are plenty of other projects that very easily distance from the guarantees of these components. For example, soroban-rpc, soroban-cli, dapp development, and all of the client libs are very far removed from the guarantees we need to provide in core or the SDK. It's much easier to iterate independently on those layers.

Here's a diagram illustrating existing dependencies and the types of concerns I see showing up in different areas:

soroban-whats-been-built-and-areas excalidraw

cc @graydon @tsachiherman @ire-and-curses

@graydon graydon self-assigned this Feb 9, 2023
@MonsieurNicolas
Copy link
Contributor Author

The test experience in the SDK is largely supported by the soroban-env-host crate, because the meaningful logic supporting testing is the Host, so while there is some functionality to support testing in the SDK, it is very minimal, so minimal I don't think it is worth the overhead of engineering a separation for.

currently yes (and I hope we can keep it that way), what I envision is that there should be things build on top: that's why I mentioned fuzzers, but I also imagine things like special hooks to facilitate verification.

In any case, what I think we need to do here is actually codify expectations for the different layers that @leighmcculloch identified so that we can properly trim & enforce the type of dependencies we allow in each crate (automation may follow, we first need to define the rules).
I think this has some implications on code review & code pinning strategies and potentially on testing strategies too (core level changes are extremely heavy to implement and test and we don't want this outside of core).

@leighmcculloch
Copy link
Member

leighmcculloch commented Aug 28, 2023

trim & enforce the type of dependencies we allow in each crate

The real challenge with being selective will not be the direct dependencies we've picked, of which there are a few but reasonable number, but the deep tree of transitive dependencies.

For example, these are the direct dependencies we have today:

  • stellar-xdr has base64, serde, serde_with, hex, arbitrary, clap, serde_json, thiserror. About half of these are excluded from builds that stellar-core runs as they are only used for the stellar-xdr CLI.
  • soroban-env-common has wasmi, serde, static_assertions, ethnum, arbitrary, num-traits, num-derive.
  • soroban-env-guest has static_assertions.
  • soroban-env-macros has syn, quote, proc-macro2, itertools, serde, serde_json.
  • soroban-native-sdk-macros has syn, quote, proc-macro2, itertools.
  • soroban-env-host has wasmi, static_assertions, sha2, ed25519-dalek, curve25519-dalek, rand, rand_chacha, hex, num-traits, num-integer, num-derive, log, backtrace, k256, getrandom, sha3, tracy-client, env_logger, itertools, log, tabwriter, thousands, textplots, wasmprinter, expect-test, more-asserts, linregress.
  • soroban-sdk has arbitrary ,ed25519-dalek, rand, ctor, hex, proptest, proptest-arbitrary-interop.

In total that's 40 unique dependencies.

Contrast that with 172 total dependencies in the rs-soroban-env repo.

We might be able to trim some of these. For example, the hex, num-traits, num-integer, num-derive, and itertools libraries could probably be replaced with a small amount of our own logic if necessary. Not confident at all on the remainder though.

@graydon
Copy link
Contributor

graydon commented Sep 1, 2023

That's an exaggeration:

  • 11 of the 27 deps you listed for env-host are dev-deps, they're just for testing.
  • static_assertions is hardly a dep, it's a compile-time-only logical check.
  • several of the deps are just subcrates of a single logical dep: the 3 num crates, the 2 rand crates, the 2 rustcrypto-hashes crates, the 2 dalek crates.
  • getrandom is just a version-pin of a transitive dep, not a direct dep
  • backtrace is already in the stdlib and we're just using an piece of it still feature-gated off on stable
  • log is both released by the rust-lang org (arguably part of the stdlib) and also something we're not even using outside of tests, we can move it to a dev-dep also, it's just the log facade and I hooked it up in case we were going to log to stellar-core from soroban. Seems it was premature.
  • hex should just be a dev-dep too, it's only used in a test.

IOW I would say that the host has 6 significant logical dependencies: wasmi, sha2+sha3, *-dalek, num-*, k256, rand_*.

@graydon
Copy link
Contributor

graydon commented Sep 1, 2023

Also dalek-curve25519 was actually only in there before for calibration -- because we used to define the costs of a signature verification with reference to the number of curve25519 operations it might do in the worst case -- and @jayz22 removed it when changing the calibration code. I actually ought to double check with @jayz22 on whether that's a conservative enough calibration (I'll file another bug) but in any case there should really only be at most one direct non-dev (non-calibration) dalek crate there.

@graydon
Copy link
Contributor

graydon commented Sep 1, 2023

(#1031 trims deps a bit, #1030 is about re-checking calibration)

@graydon
Copy link
Contributor

graydon commented Sep 1, 2023

Going back to the root of this bug though, I am not sure there's much actionable remaining here. We've done the following:

  • Trimmed dependencies quite a lot (exaggerated list above notwithstanding, host and common depend on very little)
  • Reasonably carefully considered each of those deps we have taken; I don't think we're in a position to change our minds about any and I think we've done at least a moderate amount of sniffing around inside their code for weirdness. I'm not going to say I've audited them, but hell, I haven't audited sqlite or ASIO or SOCI either. Maybe I should have, but at some level I think we have to trust that they do what they say they do on the tin (and appear to do in our testing of them).
  • Have a clear understanding of the bit-identical behaviour necessary to ever take an update (this is spelled out in the actual host version-bumping system and docs)
  • Made a ton of ceremony to catch accidental changes / updates. Possibly too much! The baked-in second check of dependencies pointed-to by core ensures we're not going to ship a dep-change unless someone reviews:
    • The Cargo.toml and Cargo.lock change in env-host
    • The Cargo.lock change in core
    • The secondary cargo-lock tree change on core, which someone has to manually run a special regeneration command to even accomplish.
    • Avoids any banned code, banned licenses, CVEs, or duplicate crates as caught by cargo-deny

I don't know what else to do to the dependency tree to make env releases sustainable. The bug asks to "define rules" -- where? About what? If we have to take a new dep in the future surely we'll take it, but not without care and consideration and certainly not accidentally (any more than we carelessly add deps to core normally; we think through whether we can avoid it and only add them when they're necessary and obviously better than writing it ourselves would be). Should I just write that general guideline down somewhere?

@graydon
Copy link
Contributor

graydon commented Sep 18, 2023

Closing, I think this is as done as it's going to get.

@graydon graydon closed this as completed Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants