Add Ability to Run as Full Simulator #165

carlaKC · 2024-02-01T13:52:14Z

This PR is a WIP that adds the ability to run SimLN without any real lightning nodes, instead providing "simulated nodes" that copy the routing policies and liquidity constraints of real nodes. The last commit hard-codes a graph to run this "full simulation" on as an example, whereas in reality we'd load from a JSON description of the desired topology/policies.

High Level Overview

Lightning implementations in SimLN are abstracted using the LightningNode trait. The steps that we follow to support a simulated implementation of this trait (SimNode) are as follows:

1. Run the simulator with a --from_graph option, which allows starting the simulator with a graph file (format TBD) that provides channel policies and topology (not included in this PR).
2. Use a single, top-level SimGraph struct to manage:
   - A hashmap of the simulated channels available
   - Propagation of payments through the simulated channels
3. Implement LightningNode on a SimNode struct which:
   - Handles implementation of send/tracking functionality
   - Points to the coordinating SimGraph struct to manage payment dispatch

Graph Abstraction

Management of payments through the simulated network is separated into the following layers:

- SimGraph: high level coordinator that keeps a hashmap of each simulated channel (see below) in the network, along with their forwarding policies. Responsible for coordinating addition/removal of HTLCs as they propagate through the network.
- SimNode: implementation of the LightningNode trait which relies on the SimGraph to propagate payments on its behalf.

SimNode structs are primarily used to provide send/track payment APIs, receiving updates on the state of their payments from SimGraph.
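
To make the split between the two layers concrete, here is a minimal sketch of a node that owns no channel state and delegates every payment to a shared graph. It is not the code in this PR: the `SimChannel` fields, the `dispatch_payment` and `send_payment` names, the `String` errors and the single-hop transfer are all assumptions made for illustration, and it uses a blocking `std::sync::Mutex` to stay self-contained.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified channel: only tracks each side's spendable balance.
#[derive(Debug, Clone)]
pub struct SimChannel {
    pub node_1_balance_msat: u64,
    pub node_2_balance_msat: u64,
}

/// Coordinator that owns every simulated channel, keyed by short channel id.
#[derive(Debug, Default)]
pub struct SimGraph {
    pub channels: HashMap<u64, SimChannel>,
}

impl SimGraph {
    /// Moves `amount_msat` from node 1 to node 2 over `scid`, failing if the
    /// channel is unknown or node 1 lacks liquidity (assumed single-hop model).
    pub fn dispatch_payment(&mut self, scid: u64, amount_msat: u64) -> Result<(), String> {
        let channel = self
            .channels
            .get_mut(&scid)
            .ok_or_else(|| format!("channel {scid} not found"))?;
        if channel.node_1_balance_msat < amount_msat {
            return Err("insufficient liquidity".to_string());
        }
        channel.node_1_balance_msat -= amount_msat;
        channel.node_2_balance_msat += amount_msat;
        Ok(())
    }
}

/// Simulated node: holds no channel state of its own, only a shared handle
/// to the coordinating graph.
pub struct SimNode {
    graph: Arc<Mutex<SimGraph>>,
}

impl SimNode {
    pub fn new(graph: Arc<Mutex<SimGraph>>) -> Self {
        SimNode { graph }
    }

    /// Delegates the payment to the shared graph.
    pub fn send_payment(&self, scid: u64, amount_msat: u64) -> Result<(), String> {
        self.graph
            .lock()
            .map_err(|e| e.to_string())?
            .dispatch_payment(scid, amount_msat)
    }
}
```

Because every node shares the same `Arc`, a balance change made on behalf of one node is immediately visible to all the others, which is the property the coordinator layer is there to provide.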

Channel State Machine

Implementation of channel state is broken down into three layers:

- Channel Policy: representation of the directional policy of a participant (fees etc.)
- Channel State: tracks the "live" state of the channel in the outgoing direction
- Simulated Channel: tracks state for each direction of the channel, and high level information

Each channel state is responsible for tracking liquidity and HTLC limits in one direction, and the simulated channel is responsible for enforcing sanity checks across the channel (e.g. that we don't exceed our capacity).

An example state machine update is provided in the table below:

| Step | Alice Channel State | Bob Channel State |
|---|---|---|
| Channel opened by Alice | Local_balance: 100_000 / In_Flight: nil | Local_balance: 0 / In_Flight: nil |
| Alice sends 100 sats to Bob | Local_balance: 99_900 / In_Flight: 100 sats | Local_balance: 0 / In_Flight: nil |
| Alice payment succeeds | Local_balance: 99_900 / In_Flight: nil | Local_balance: 100 / In_Flight: nil |
| Bob sends 50 sats to Alice | Local_balance: 99_900 / In_Flight: nil | Local_balance: 50 / In_Flight: 50 sats |
| Bob payment fails | Local_balance: 99_900 / In_Flight: nil | Local_balance: 100 / In_Flight: nil |
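
The same sequence of updates can be sketched in code. This is an illustration under stated assumptions rather than the implementation in this PR: the field and method names (`add_outgoing_htlc`, `remove_outgoing_htlc`, `send`, `resolve`, `sanity_check`) are hypothetical, amounts are in sats, and in-flight HTLCs are collapsed into a single running total instead of being tracked per payment hash. Under this model, sending 100 sats from a 100_000 sat balance leaves 99_900 spendable with 100 in flight.

```rust
/// One direction of a channel: the owner's spendable balance and the total
/// amount currently locked in outgoing HTLCs (both in sats).
#[derive(Debug, Default, PartialEq)]
pub struct ChannelState {
    pub local_balance: u64,
    pub in_flight: u64,
}

impl ChannelState {
    /// Locks `amount` in an outgoing HTLC if the balance allows it.
    pub fn add_outgoing_htlc(&mut self, amount: u64) -> Result<(), String> {
        if amount > self.local_balance {
            return Err(format!("insufficient balance: {} < {amount}", self.local_balance));
        }
        self.local_balance -= amount;
        self.in_flight += amount;
        Ok(())
    }

    /// Releases an in-flight HTLC. On failure the funds return to the sender;
    /// on success the caller is expected to credit the other side.
    pub fn remove_outgoing_htlc(&mut self, amount: u64, success: bool) -> Result<(), String> {
        if amount > self.in_flight {
            return Err(format!("htlc of {amount} is not in flight"));
        }
        self.in_flight -= amount;
        if !success {
            self.local_balance += amount;
        }
        Ok(())
    }
}

/// Both directions of a channel plus the shared capacity.
pub struct SimulatedChannel {
    pub capacity: u64,
    pub alice: ChannelState,
    pub bob: ChannelState,
}

impl SimulatedChannel {
    /// Alice opens the channel, so she starts with the full capacity.
    pub fn new(capacity: u64) -> Self {
        SimulatedChannel {
            capacity,
            alice: ChannelState { local_balance: capacity, in_flight: 0 },
            bob: ChannelState::default(),
        }
    }

    /// Adds an HTLC in the direction given by `from_alice`.
    pub fn send(&mut self, from_alice: bool, amount: u64) -> Result<(), String> {
        let sender = if from_alice { &mut self.alice } else { &mut self.bob };
        sender.add_outgoing_htlc(amount)
    }

    /// Settles or fails an HTLC; a settled amount moves to the receiver.
    pub fn resolve(&mut self, from_alice: bool, amount: u64, success: bool) -> Result<(), String> {
        let (sender, receiver) = if from_alice {
            (&mut self.alice, &mut self.bob)
        } else {
            (&mut self.bob, &mut self.alice)
        };
        sender.remove_outgoing_htlc(amount, success)?;
        if success {
            receiver.local_balance += amount;
        }
        Ok(())
    }

    /// Cross-direction check: balances plus in-flight funds equal capacity.
    pub fn sanity_check(&self) -> bool {
        self.alice.local_balance + self.alice.in_flight
            + self.bob.local_balance + self.bob.in_flight
            == self.capacity
    }
}
```

Keeping the two directions in separate structs means the only point where they interact is `resolve`, which is also a natural place to run the cross-channel sanity check.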

tl;dr

enigbe · 2024-02-09T03:20:27Z

sim-lib/Cargo.toml

@@ -15,7 +15,9 @@ expanduser = "1.2.2"
 serde = { version="1.0.183", features=["derive"] }
 serde_json = "1.0.104"
 bitcoin = { version = "0.30.1", features=["serde"] }
-lightning = { version = "0.0.116" }
+# LDK is on a different version to bitcoin to us, which leads to some issues, import so we can use its types.


Are there reasons why bitcoin_ldk types are being used as opposed to sticking with types from bitcoin?

Long and painful story that's now resolved, but for posterity:

We were relying on a different version of the bitcoin dependency to LDK:

sim-ln -> bitcoin-30

sim-ln -> lightning-117 -> bitcoin-29

We couldn't downgrade bitcoin-30, because we're using a feature in it. But to use lightning-117 (specifically in pathfinding) we needed to provide the public key struct type in bitcoin-29. So this alias allowed me to import both types. This was messy and gross and thankfully solved by another LDK release that updated the bitcoin version.

enigbe

Added a handful of small comments on code style and a question about using types from LDK's bitcoin library.

This would benefit from documentation comments across several structs and methods, but I expect they'll be added before this is ready.

okjodom · 2024-02-09T06:39:37Z

oh, my! 🎆 🥇

carlaKC · 2024-02-09T21:55:26Z

Pushed some additional docs + cleanup, no semantic changes!

carlaKC · 2024-02-16T16:18:26Z

Pushed a fix for an off-by-one bug that I caught while writing tests!

sr-gi

I think the design makes sense overall. I may need to give this a second pass down the line (the diff is huge 😅 ) but my current understanding is as follows:

SimNode is the main class that mimics our current node implementations, and hence it implements LightningNode. It has a shared reference to a network (Arc<Mutex<SimGraph>>) that represents the simulated network. The network includes all channels, which have a breakdown in classes all the way down to the channel policies. The construction of a simulation works as usual: we create a list of clients that are instances that implement LightningNode (in this case SimNodes). When a payment is made, the simulated node accesses the graph via its shared pointer and triggers all the state transitions, which are managed by the SimGraph.

I left a bunch of suggestions, most (if not all of them) can be found here: sr-gi@67f810e. Feel free to cherry-pick as desired.

Some general comments:

- I don't think 604019d should be needed. We can have a different constructor for the simulator when calling it with a simulated graph (Simulator::with_sim_channels or something) that receives all the usual but, instead of a map of clients, receives a collection of simulated channels. That constructor could have the functionality of ln_node_from_graph to build the "clients", plus construct (sr-gi@67f810e#diff-f891364aff98268ac6d94109bd61594c23af50e37e547fec9b0ea44a816459a0R390-R421). Doing this gets rid of a bunch of imports, plus having to define the triggers in the CLI and so on.
- I think d2ea936 should be squashed; I don't think it's meaningful enough to be a commit on its own (but up to you in the end).

carlaKC · 2024-02-27T13:55:25Z

Thanks for the mega-review @sr-gi 🫶

Not sure if this is helpful, but I've pushed a single fixup for each commit with the suggested changes (each fixup directly follows its original commit) with the hope of making round two a little easier! If it's not useful I'll just go ahead and squash.

Added a few responses inline where the big-picture stuff needs some discussion.

carlaKC · 2024-02-27T13:59:10Z

Also, I've started on some unit test coverage for this but going to put it in as a separate PR because this one is already far too big :')

sr-gi

Looks good! 🎉

I can confirm most of the comments have been addressed or refuted. I just left a few comments on the open discussion, plus things that may have gone unnoticed.

I'm happy with the fixups to be squashed at this point.

sr-gi · 2024-02-29T19:56:58Z

sim-lib/src/sim_node.rs

+        let preimage = PaymentPreimage(rand::random());
+        let preimage_bytes = Sha256::hash(&preimage.0[..]).to_byte_array();
+        let payment_hash = PaymentHash(preimage_bytes);


I opened a PR upstream to add this, feels like a nice utility to have as part of the API: lightningdevkit/rust-lightning#2916

sr-gi · 2024-02-29T20:57:32Z

sim-lib/src/sim_node.rs

        for channel in graph_channels.iter() {
            channels.insert(channel.short_channel_id, channel.clone());


Iterating with ownership (into_iter) and moving the insert to after the inner for loop will allow you to drop the channel clone.
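
A rough sketch of what that suggestion could look like follows. The surrounding function is not shown in this thread, so `index_channels`, the `node_1`/`node_2` fields and the per-node map are assumptions standing in for the real code; only the `into_iter` plus "insert last" pattern is the point.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the PR's channel type.
#[derive(Debug, Clone, PartialEq)]
pub struct SimulatedChannel {
    pub short_channel_id: u64,
    pub node_1: String,
    pub node_2: String,
}

/// Builds the channel map and a per-node index without cloning any channel.
pub fn index_channels(
    graph_channels: Vec<SimulatedChannel>,
) -> (HashMap<u64, SimulatedChannel>, HashMap<String, Vec<u64>>) {
    let mut channels = HashMap::new();
    let mut nodes: HashMap<String, Vec<u64>> = HashMap::new();

    // `into_iter` yields owned channels rather than references.
    for channel in graph_channels.into_iter() {
        // The inner loop only borrows the channel to read its endpoints.
        for node in [&channel.node_1, &channel.node_2] {
            nodes
                .entry(node.clone())
                .or_default()
                .push(channel.short_channel_id);
        }
        // With all borrows finished, the channel can be moved into the map.
        channels.insert(channel.short_channel_id, channel);
    }

    (channels, nodes)
}
```

The trade-off is that the caller gives up `graph_channels`; if the vector is still needed afterwards, the clone has to stay.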

sr-gi · 2024-02-29T21:39:27Z

sim-lib/src/sim_node.rs

+        match self.nodes.get(node) {
+            Some(channels) => Ok((node_info(*node), channels.clone())),
+            None => Err(LightningError::GetNodeInfoError(
+                "Node not found".to_string(),
+            )),
+        }
+    }


This was not addressed, but feel free to disregard it (just keeping track of what was covered)

sr-gi · 2024-02-29T21:58:59Z

sim-cli/src/main.rs

+    let (shutdown_trigger, shutdown_listener) = triggered::trigger();
    let sim = Simulation::new(
        clients,
        validated_activities,
        cli.total_time,
        cli.expected_pmt_amt,
        cli.capacity_multiplier,
        write_results,
+        (shutdown_trigger, shutdown_listener),


I think this can be avoided if the additional constructor approach is considered:

sr-gi@67f810e#diff-f891364aff98268ac6d94109bd61594c23af50e37e547fec9b0ea44a816459a0R390-R421

and

sr-gi@67f810e#diff-094f58aee7f261f0f15154547c8e32f0d54ed2fdea0963b6e8b630feff954d54R205-R213

Oh very nice, I think I'm going to leave this for a follow up because I'll drop the commit where we actually use the simulator before merge (to be followed up with surfacing the config options to use it).

carlaKC · 2024-03-11T20:37:03Z

Squashed + addressed last comments. I've removed the commit which actually uses the simulator, because we need to add a configuration option for an end user to provide that format (--simulate-network, or something like that). Only major change is that I added a ShortChannelID and went with a struct so that we can impl Display on it.
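
For context on why a struct helps here: Rust's orphan rule does not let a crate implement a foreign trait such as Display for a foreign type like u64 (or for a plain type alias of it), while a newtype is a local type. A minimal sketch follows. The block:tx:output layout is the BOLT 7 encoding of a short channel id (3 bytes block height, 3 bytes transaction index, 2 bytes output index), but the `:` separator and the exact output format are assumptions, not necessarily what this PR prints.

```rust
use std::fmt;

/// Newtype over the raw u64 so that we can attach our own trait impls.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ShortChannelID(pub u64);

impl fmt::Display for ShortChannelID {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // BOLT 7 layout: 3 bytes block height, 3 bytes tx index, 2 bytes output index.
        let block_height = self.0 >> 40;
        let tx_index = (self.0 >> 16) & 0xFF_FFFF;
        let output_index = self.0 & 0xFFFF;
        // The ':' separator is an assumption made for this sketch.
        write!(f, "{block_height}:{tx_index}:{output_index}")
    }
}
```

With this in place, error messages and logs can interpolate the id directly with `{}` rather than printing an opaque integer.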

Will follow up with a PR adding tests + that option.

sr-gi · 2024-03-12T19:44:16Z

ACK f199c2f

Last comment, feel free to add it as a follow-up if you find this useful:

You can add a couple of utility functions to ShortChannelID to convert back and forth from/to u64, that way you can simply call x.into() in the code when converting, without having to even use the wrappers/inner values:

/// Utility function to easily convert from u64 to `ShortChannelID`
impl From<u64> for ShortChannelID {
    fn from(value: u64) -> Self {
        ShortChannelID(value)
    }
}

/// Utility function to easily convert `ShortChannelID` into u64
impl From<ShortChannelID> for u64 {
    fn from(scid: ShortChannelID) -> Self {
        scid.0
    }
}

Here are a couple of examples:

- short_channel_id: channel.short_channel_id.0,
+ short_channel_id: channel.short_channel_id.into(),

- let scid = ShortChannelID(hop.short_channel_id);
+ let scid = hop.short_channel_id.into();

- return Err(ForwardingError::ChannelNotFound(ShortChannelID(
-                    hop.short_channel_id,
-                )))
+ return Err(ForwardingError::ChannelNotFound(
+                   hop.short_channel_id.into(),
+              ))

This commit adds a ChannelState struct which is used to track the policy and state of a channel in the *outgoing* direction. This will be used to check forwards against the node's advertised policy and track the movement of outgoing HTLCs through the channel. Note that we choose to implement this state *unidirectionally*, so a single channel will be represented by two ChannelState structs (one in each direction).
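
As an illustration of what "checking forwards against the node's advertised policy" can involve, here is a hedged sketch. The struct fields, the `PolicyError` variants and `check_forward` are hypothetical names, not the ones in this commit; the fee formula (base fee plus a proportional part in millionths of the forwarded amount) is the standard Lightning forwarding fee calculation.

```rust
/// Hypothetical subset of a node's advertised forwarding policy.
#[derive(Debug, Clone)]
pub struct ChannelPolicy {
    pub min_htlc_msat: u64,
    pub max_htlc_msat: u64,
    pub cltv_expiry_delta: u32,
    pub base_fee_msat: u64,
    /// Proportional fee in millionths of the forwarded amount.
    pub fee_rate_prop: u64,
}

/// Reasons a forward can be rejected by policy (names are assumptions).
#[derive(Debug, PartialEq)]
pub enum PolicyError {
    BelowMinimum,
    AboveMaximum,
    InsufficientFee { expected_msat: u64, got_msat: u64 },
    InsufficientCltvDelta { expected: u32, got: u32 },
}

impl ChannelPolicy {
    /// Fee this policy charges for forwarding `outgoing_amt_msat`.
    pub fn expected_fee_msat(&self, outgoing_amt_msat: u64) -> u64 {
        // Widen to u128 so the multiplication cannot overflow.
        let proportional =
            (outgoing_amt_msat as u128 * self.fee_rate_prop as u128 / 1_000_000) as u64;
        self.base_fee_msat + proportional
    }

    /// Validates a forward against the amount, fee and CLTV requirements.
    pub fn check_forward(
        &self,
        incoming_amt_msat: u64,
        outgoing_amt_msat: u64,
        incoming_cltv: u32,
        outgoing_cltv: u32,
    ) -> Result<(), PolicyError> {
        if outgoing_amt_msat < self.min_htlc_msat {
            return Err(PolicyError::BelowMinimum);
        }
        if outgoing_amt_msat > self.max_htlc_msat {
            return Err(PolicyError::AboveMaximum);
        }
        // The fee offered is the difference between what comes in and goes out.
        let got_msat = incoming_amt_msat.saturating_sub(outgoing_amt_msat);
        let expected_msat = self.expected_fee_msat(outgoing_amt_msat);
        if got_msat < expected_msat {
            return Err(PolicyError::InsufficientFee { expected_msat, got_msat });
        }
        let got = incoming_cltv.saturating_sub(outgoing_cltv);
        if got < self.cltv_expiry_delta {
            return Err(PolicyError::InsufficientCltvDelta {
                expected: self.cltv_expiry_delta,
                got,
            });
        }
        Ok(())
    }
}
```

Since the state is unidirectional, each of the two structs for a channel would carry its own policy, and a forward is checked only against the policy of the direction it leaves through.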

Add a single representation of a simulated lightning channel which uses our state representation to add and remove htlcs. For simplicity, each side of the channel is represented as a separate state, with the only interaction between the two through the changes to local balance that happen when we settle htlcs.

Add an implementation of the LightningNode trait that represents the underlying lightning node. This implementation is intentionally kept simple, depending on some SimNetwork trait to handle the mechanics of actually simulating the flow of payments through a simulated graph.

We want to be able to distinguish between expected and critical payment errors in our simulated payments. An expected error occurs due to a lightning-related failure (such as running out of liquidity), and a critical one happens because something has gone wrong with our simulator (for example, an assertion failure about balances). This commit adds some utilities to ForwardingError to make this distinction and display errors properly.
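
A minimal sketch of how such a distinction might be expressed is below. The two variants mirror the two examples in the commit message (running out of liquidity, a failed balance assertion), but their names, fields and message text, as well as the `is_critical` method name, are assumptions rather than the actual ForwardingError definition.

```rust
use std::fmt;

/// Simplified, hypothetical version of the simulator's forwarding error.
#[derive(Debug, PartialEq)]
pub enum ForwardingError {
    /// Expected: an ordinary lightning failure, e.g. running out of liquidity.
    InsufficientBalance { requested_msat: u64, available_msat: u64 },
    /// Critical: the simulator's own accounting no longer adds up.
    SanityCheckFailed { capacity_msat: u64, tracked_msat: u64 },
}

impl ForwardingError {
    /// Returns true when the error indicates a simulator bug rather than a
    /// normal payment failure.
    pub fn is_critical(&self) -> bool {
        matches!(self, ForwardingError::SanityCheckFailed { .. })
    }
}

impl fmt::Display for ForwardingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ForwardingError::InsufficientBalance { requested_msat, available_msat } => write!(
                f,
                "insufficient balance: requested {requested_msat} msat, available {available_msat} msat"
            ),
            ForwardingError::SanityCheckFailed { capacity_msat, tracked_msat } => write!(
                f,
                "sanity check failed: capacity {capacity_msat} msat, tracked {tracked_msat} msat"
            ),
        }
    }
}

impl std::error::Error for ForwardingError {}
```

A caller could then record expected errors as failed payments and keep running, while treating a critical error as a reason to shut the simulation down.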

Add an implementation of our SimNetwork trait that will do the heavy lifting of propagating htlcs through a simulated network.

carlaKC · 2024-03-13T14:15:41Z

You can add a couple of utility functions to ShortChannelID to convert back and forth from/to u64, that way you can simply call x.into() in the code when converting, without having to even use the wrappers/inner values:

Added in last push!

carlaKC · 2024-03-13T14:20:09Z

Thanks for the great review on this @sr-gi 🫶🫶🫶

sr-gi · 2024-03-13T14:36:28Z

Anytime :D

carlaKC force-pushed the sim-node branch from 67de925 to 87996e0 Compare February 6, 2024 14:27

enigbe reviewed Feb 9, 2024

carlaKC force-pushed the sim-node branch 2 times, most recently from d58c28d to 8785cf6 Compare February 9, 2024 21:54

carlaKC requested a review from sr-gi February 12, 2024 14:20

carlaKC mentioned this pull request Feb 12, 2024

Feature: Simulation Time #81

Open

carlaKC force-pushed the sim-node branch 2 times, most recently from 4dc5455 to 3fb2d0d Compare February 16, 2024 16:17

sr-gi reviewed Feb 16, 2024

carlaKC force-pushed the sim-node branch from 3fb2d0d to 5737338 Compare February 27, 2024 13:49

carlaKC marked this pull request as ready for review February 27, 2024 13:52

carlaKC changed the title ~~[draft]: Add Ability to Run as Full Simulator~~ Add Ability to Run as Full Simulator Feb 27, 2024

carlaKC requested a review from sr-gi February 27, 2024 13:58

sr-gi reviewed Feb 29, 2024

sr-gi reviewed Feb 29, 2024

cargo: update to lightning 121

2447510

carlaKC force-pushed the sim-node branch from 5737338 to f199c2f Compare March 11, 2024 20:22

carlaKC requested review from sr-gi and enigbe March 11, 2024 20:46

sr-gi mentioned this pull request Mar 12, 2024

Refactor: Use From<PaymentPreimage> when constructing PaymentHash #172

Open

sr-gi approved these changes Mar 12, 2024

carlaKC added 6 commits March 13, 2024 09:50

sim-lib: add graph implementation of SimNetwork trait

6ea47bb

Add an implementation of our SimNetwork trait that will do the heavy lifting of propagating htlcs through a simulated network.

sim_node: add helper functions to produce simulation graph and nodes

f155904

carlaKC force-pushed the sim-node branch from f199c2f to f155904 Compare March 13, 2024 14:14

carlaKC merged commit b66a84a into bitcoin-dev-project:main Mar 13, 2024
2 checks passed
