Lightning Routing Replayer
22 Nov 2024

In Lightning, whenever you send a payment, you must first pick a path over which to send it. If you pick that path well, the payment will complete immediately and for (hopefully) a low fee. If you pick poorly, a node in the path may not have enough funds available for your payment and will have to fail the payment back to you, at which point your node will need to pick a new path and try the payment again, resulting in a slow payment. If you pick poorly several times, the payment may time out and ultimately fail, even if sufficient capacity for your payment existed over some path.
In practice, most pathfinding algorithms in Lightning lean heavily on memory of past payment or probe attempts in order to select paths which are likely to succeed. This allows a Lightning node to learn which channels in the network are usually saturated (and in which direction), letting it avoid common logjams in the network. Pathfinding algorithms without memory of past attempts appear to perform quite poorly, especially when paying a handful of hard-to-pay nodes which operate as “sinks” - generally receiving more than they send.
Testing new Lightning pathfinding algorithms is quite easy when you aren’t concerned with memory - you simply write your algorithm and try to pay many nodes on the network. The more nodes you manage to pay, the better your algorithm is. However, when you’re dealing with memory, you have to test an algorithm over a long time horizon, running lots of payments and comparing the success rate over time. This makes evaluating pathfinding changes difficult.
Today, I’m announcing a simple pathfinding evaluator (and accompanying dataset) which allows researchers and developers to test a pathfinding algorithm by replaying many historical probes from the perspective of my Lightning node, comparing the real results with the path probabilities their pathfinding model predicts.
While the use of paths LDK’s router picked biases the framework somewhat, it should be general enough with enough data to still provide a good indication of whether a model change is positive or negative.
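To give a sense of what “comparing the real results with the path probabilities their pathfinding model predicts” could look like, here is a minimal sketch of one common way to score such predictions: average negative log-likelihood over replayed probes. The record format and function names here are illustrative assumptions, not the replayer’s actual API or scoring metric - see its README for the real interface.

```python
# Hypothetical sketch of scoring a pathfinding model against historical
# probe outcomes. All names and the record layout are illustrative.
from dataclasses import dataclass
from math import log
from typing import Callable, List, Tuple


@dataclass
class ProbeResult:
    """One historical probe: the hops attempted (channel id, direction,
    amount in millisatoshis) and whether the probe succeeded."""
    hops: List[Tuple[int, int, int]]  # (channel_id, direction, amount_msat)
    succeeded: bool


def average_log_loss(
    probes: List[ProbeResult],
    hop_success_prob: Callable[[int, int, int], float],
) -> float:
    """Score a model by the average negative log-likelihood it assigns to
    the observed probe outcomes. Lower is better."""
    total = 0.0
    for probe in probes:
        # Treating hops as independent, the model's predicted probability
        # that the whole path succeeds is the product of its per-hop
        # success probabilities.
        p = 1.0
        for channel_id, direction, amount_msat in probe.hops:
            p *= hop_success_prob(channel_id, direction, amount_msat)
        p = min(max(p, 1e-9), 1.0 - 1e-9)  # clamp to avoid log(0)
        total += -log(p) if probe.succeeded else -log(1.0 - p)
    return total / len(probes)
```

With a metric like this, two candidate models can be replayed over the same probe dataset and compared directly: the one assigning higher likelihood to what actually happened wins.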
You can find the replayer, and learn more from its README, in its git repo.