-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Ability to Run as Full Simulator #165
Conversation
sim-lib/Cargo.toml
Outdated
@@ -15,7 +15,9 @@ expanduser = "1.2.2" | |||
serde = { version="1.0.183", features=["derive"] } | |||
serde_json = "1.0.104" | |||
bitcoin = { version = "0.30.1", features=["serde"] } | |||
lightning = { version = "0.0.116" } | |||
# LDK is on a different version to bitcoin to us, which leads to some issues, import so we can use its types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there reasons why bitcoin_ldk
types are being used as opposed to sticking with types from bitcoin
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Long and painful story that's now resolved, but for prosperity:
We were relying of a different version of the bitcoin
dependency to LDK:
sim-ln
->bitcoin-30
sim-ln
->lightning-117
->bitocin-29
We couldn't downgrade bitcoin-30
, because we're using a feature it it. But to use lightning-117
(specifically in pathfinding) we needed to provide the public key struct type in bitcoin-29
. So this alias allowed me to import both types. This was messy and gross and thankfully solved by another LDK release that updated the bitcoin version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a handful of small comments on code style and a question about using types from LDK's bitcoin library.
This would benefit from documentation comments across several structs and methods but expect they'd be updated before this is ready
oh, my! 🎆 🥇 |
d58c28d
to
8785cf6
Compare
Pushed some additional docs + cleanup, no semantic changes! |
4dc5455
to
3fb2d0d
Compare
Pushed a off by one bug that I caught writing tests! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the design makes sense overall. I may need to give this a second pass down the line (the diff is huge 😅 ) but my current understanding is as follows:
SimNode
is the main class that mimics our current node implementations, and hence it implements LightningNode
. It has a shared reference to a network (Arc<Mutex<SimGraph>>
) that represents the simulated network. The network includes all channels, which have a breakdown in classes all the way down to the channel policies. The construction of a simulation works as usual: we create a list of clients that are instances that implement LightningNode
(in this case SimNode
s), when a payment is made, the simulated node accesses the graph via its shared pointer and triggers all the state transitions, which are managed by the SimGraph
.
I left a bunch of suggestions, most (if not all of them) can be found here: sr-gi@67f810e. Feel free to cherry-pick as desired.
Some general comments:
-
I don't think 604019d should be needed. We can have a different constructor for the simulator when calling it with a simulated graph (
Simulator::with_sim_channels
or something) that receives all the usual but instead of a map of clients, receives a collection of simulated channels. That constructor could have the functionality ofln_node_from_graph
to build the "clients", plus construct (sr-gi@67f810e#diff-f891364aff98268ac6d94109bd61594c23af50e37e547fec9b0ea44a816459a0R390-R421). Doing this gets rids of a bunch of imports, plus having to define the triggers in the CLI and so on. -
I think d2ea936 should be squashed, I don't think it's meaningful enough to be a commit on its own (but up to you in the end)
Thanks for the mega-review @sr-gi 🫶 Not sure if this is helpful, but I've pushed a single fixup for each commit with the suggested changes (each fixup directly follows its original commit) with the hope of making round two a little easier! If it's not useful I'll just go ahead and squash. Added a few responses inline where the big-picture stuff needs some discussion. |
Also, I've started on some unit test coverage for this but going to put it in as a separate PR because this one is already far too big :') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! 🎉
I can confirm most of the comments have been addressed or refuted, just left a few comments on the open discussion plus things that may have been passed unnoticed.
I'm happy with the fixups to be squashed at this point.
sim-lib/src/sim_node.rs
Outdated
let preimage = PaymentPreimage(rand::random()); | ||
let preimage_bytes = Sha256::hash(&preimage.0[..]).to_byte_array(); | ||
let payment_hash = PaymentHash(preimage_bytes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened a PR upstream to add this, feels like a nice utility to have as part of the API: lightningdevkit/rust-lightning#2916
sim-lib/src/sim_node.rs
Outdated
for channel in graph_channels.iter() { | ||
channels.insert(channel.short_channel_id, channel.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterating taking ownership (into_iter
) and moving the insert after the internal for loop will allow you to drop the channel clone
sim-lib/src/sim_node.rs
Outdated
match self.nodes.get(node) { | ||
Some(channels) => Ok((node_info(*node), channels.clone())), | ||
None => Err(LightningError::GetNodeInfoError( | ||
"Node not found".to_string(), | ||
)), | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not addressed, but feel free to disregard it (just keeping track of what was covered)
sim-cli/src/main.rs
Outdated
let (shutdown_trigger, shutdown_listener) = triggered::trigger(); | ||
let sim = Simulation::new( | ||
clients, | ||
validated_activities, | ||
cli.total_time, | ||
cli.expected_pmt_amt, | ||
cli.capacity_multiplier, | ||
write_results, | ||
(shutdown_trigger, shutdown_listener), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be avoided if the additional constructor approach is considered:
sr-gi@67f810e#diff-f891364aff98268ac6d94109bd61594c23af50e37e547fec9b0ea44a816459a0R390-R421
and
sr-gi@67f810e#diff-094f58aee7f261f0f15154547c8e32f0d54ed2fdea0963b6e8b630feff954d54R205-R213
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh very nice, I think I'm going to leave this for a follow up because I'll drop the commit where we actually use the simulator before merge (to be followed up with surfacing the config options to use it).
Squashed + addressed last comments. I've removed the commit which actually uses the simulator, because we need to add a configuration option for an end user to provide that format ( Will follow up with a PR adding tests + that option. |
ACK f199c2f Last comment, feel free to add it as a follow-up if you find this useful: You can add a couple of utility functions to /// Utility function to easily convert from u64 to `ShortChannelID`
impl From<u64> for ShortChannelID {
fn from(value: u64) -> Self {
ShortChannelID(value)
}
}
/// Utility function to easily convert `ShortChannelID` into u64
impl From<ShortChannelID> for u64 {
fn from(scid: ShortChannelID) -> Self {
scid.0
}
} Here are a couple of examples: - short_channel_id: channel.short_channel_id.0,
+ short_channel_id: channel.short_channel_id.into(), - let scid = ShortChannelID(hop.short_channel_id);
+ let scid = hop.short_channel_id.into(); - return Err(ForwardingError::ChannelNotFound(ShortChannelID(
- hop.short_channel_id,
- )))
+ return Err(ForwardingError::ChannelNotFound(
+ hop.short_channel_id.into(),
+ )) |
This commit adds a ChannelState struct which is used to track the policy and state of a channel in the *outgoing* direction. This will be used to check forwards against the node's advertised policy and track the movement of outgoing HTLCs through the channel. Note that we choose to implement this state *unidirectionally*, so a single channel will be represented by two ChannelState structs (one in each direction).
Add a single representation of a simulated lightning channel which uses our state representation to add and remove htlcs. For simplicity, each side of the channel is represented as a separate state, with the only interaction between the two through the changes to local balance that happen when we settle htlcs.
Add an implementation of the LightningNode trait that represents the underlying lightning node. This implementation is intentionally kept simple, depending on some SimNetwork trait to handle the mechanics of actually simulating the flow of payments through a simulated graph.
We want to be able to distinguish between expected and critical payment errors in our simulated payments. An expected error occurs due to a lightning-related failure (such as running out of liquidity), and a critical one happens because something has gone wrong with our simulator (for example, an assertion failure about balances). This commit adds an some utilities to ForwardingError to make this distinction and display errors properly.
Add an implementation of our SimNetwork trait that will do the heavy lifting of propagating htlcs through a simulated network.
Added in last push! |
Thanks for the great review on this @sr-gi 🫶🫶🫶 |
Anytime :D |
This PR is a WIP that adds the ability to run SimLN without any real lightning nodes, instead providing "simulated nodes" that copy the the routing policies and liquidity constraints of real nodes. The last commit hard-codes a graph to run this "full simulation" on as an example, whereas in reality we'd load from a json description of the desired topology/policies.
High Level Overview
Lightning implementations in SimLN are abstracted using the LightningNode trait. The steps that we follow to support a simulated implementation of this trait (SimNode) are as follows:
–from_graph option
, which allows starting the simulator with a graph file (format tbd) that provides channel policies and topology (not included in this PR).SimGraph
struct to manage:SimGraph
struct to manage payment dispatch.Graph Abstraction
Management of payments through the simulated network is separated managed with the following layers:
These structs are primarily used to provide send/track payment APIs, receiving updates on the state of their payments from SimGraph.
Channel State Machine
Implementation of channel state is broken down into three layers:
Each channel state is responsible for handling tracking of liquidity and HTLC limits in one direction, and the simulated channel is responsible for enforcing sanity checks across the channel (eg, that we don’t exceed our capacity).
An example state machine update is provided in the table below:
tl;dr