Skip to content

Release held LSPS2 HTLCs when abandoning JIT opens#4703

Open
joostjager wants to merge 3 commits into
lightningdevkit:mainfrom
joostjager:lsps2-abandon-fail-intercepted-htlcs
Open

Release held LSPS2 HTLCs when abandoning JIT opens#4703
joostjager wants to merge 3 commits into
lightningdevkit:mainfrom
joostjager:lsps2-abandon-fail-intercepted-htlcs

Conversation

@joostjager

Copy link
Copy Markdown
Contributor

When an LSPS2 JIT open is abandoned after the initial payment has already been intercepted, the service now unwinds both sides of the pending flow: it fails held HTLCs back through LDK and removes the stale intercept SCID lookup.

The tests exercise this through a real intercepted-payment path, so the abandon case now verifies the HTLC release, stale-SCID cleanup, and a fresh follow-up JIT attempt.

@ldk-reviews-bot

ldk-reviews-bot commented Jun 16, 2026

Copy link
Copy Markdown

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@joostjager joostjager marked this pull request as ready for review June 16, 2026 14:08
@joostjager joostjager requested a review from tnull June 16, 2026 14:08
Comment thread lightning-liquidity/src/lsps2/service.rs Outdated
@ldk-claude-review-bot

ldk-claude-review-bot commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

I've reviewed the current state of the code against my prior review notes. The substantive issues were already covered in my previous review pass (the inner-else not failing the HTLC at service.rs:1080/1085, the persist error masking the original failure reason at service.rs:1074, and the stale doc comment at service.rs:990). The earlier deadlock claims were already retracted and remain correctly disregarded.

I examined the full diff again for anything new — the channel_open_abandoned rewrite (lines 1306-1364), the remove_outbound_channel helper, the htlc_intercepted Err/else branches, lock ordering, persist ordering, and the new tests — and found no additional issues beyond what was already flagged.

Review Summary

No new issues found beyond those already posted in the prior review pass.

For reference, the previously posted findings still standing:

  • lightning-liquidity/src/lsps2/service.rs:1080/1085 — inner else branch (reachable via stale SCID left by prune_expired_request_state) returns an error without calling fail_intercepted_htlc, leaving the held HTLC stuck until CLTV timeout, unlike the sibling Err path.
  • lightning-liquidity/src/lsps2/service.rs:1074 — a failed persist_peer_state masks the original htlc_intercepted failure reason (err), degrading diagnostics.
  • lightning-liquidity/src/lsps2/service.rs:990 — doc comment is stale w.r.t. the HTLC now being failed back / unknown-scid HTLC being failed in the recognized-but-stale path.

Cross-cutting (non-blocking, consistent with existing file patterns): in-memory maps are mutated before persist_peer_state is awaited in both new paths, but the state reconstructs consistently from persisted outbound_channels_by_intercept_scid on reload, so it is self-healing.

@joostjager joostjager removed the request for review from tnull June 16, 2026 14:55
// channel map, so pruning can leave the SCID pointing here after
// the channel entry was removed. Fail the HTLC instead of leaving
// it held.
let _ = self.channel_manager.get_cm().fail_intercepted_htlc(intercept_id);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do this - the LSPS2 flow might not be the only consumer of intercept SCIDs necessarily. Previously the API was even idempotent and not returning an error, so that many handlers could consume LDK's events without interfering with this flow/each other. That changed by now (for better or worse), but I still don't think we should just arbitrary fail something if we don't know we don't own it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Claude also mentioned something along those lines in its review above. Will make another pass on this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did the bookkeeping cleanup, no fail back anymore. Also reworked the tests to be independent. Not as much need anymore to reuse existing code.

@joostjager joostjager force-pushed the lsps2-abandon-fail-intercepted-htlcs branch from 8bf1242 to 2165cde Compare June 17, 2026 09:34
Comment thread lightning-liquidity/src/lsps2/service.rs
Comment thread lightning-liquidity/src/lsps2/service.rs
@joostjager joostjager force-pushed the lsps2-abandon-fail-intercepted-htlcs branch 2 times, most recently from 79a6533 to bfae928 Compare June 17, 2026 10:05
@joostjager

Copy link
Copy Markdown
Contributor Author

Seemed there was also a persist call missing. Added in separate commit.

// the channel entry was removed.
drop(peer_state);
drop(outer_state_lock);
self.peer_by_intercept_scid.write().unwrap().remove(&intercept_scid);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two notes on this branch:

  1. Correction to my earlier deadlock comment: I previously flagged peer_by_intercept_scid.write() here as deadlocking against the if let read guard. That is not the case in this code — the guard is created in the separate let statement at line 1002 (...read().unwrap().get(&intercept_scid).copied()), so it is dropped at that statement's ;, well before this block runs. There is no held read guard here. Please disregard the deadlock concern.

  2. Genuine (minor) issue: this branch is reachable — prune_expired_request_state removes entries from outbound_channels_by_intercept_scid but leaves the SCID in the global peer_by_intercept_scid, so an HTLC can arrive for a stale SCID and land here. Unlike the Err path above (which calls fail_intercepted_htlc), this branch returns an error without failing the held HTLC. The HTLC then stays stuck in ChannelManager until CLTV timeout. Since LSPS2 did recognize the SCID, consider failing it back here too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnull what do you think of 2, should we fail back because it did recognize the scid?

Fail queued intercepted HTLCs before removing pending LSPS2
JIT-channel state in channel_open_abandoned. Remove the
abandoned intercept SCID from the global lookup and centralize
per-peer channel cleanup so the state is marked dirty.

Add a real-interception test that verifies abandon releases the
held HTLC from ChannelManager before removing LSPS2 state.
When an LSPS2-owned intercepted HTLC is rejected, remove its
per-peer channel state and global SCID lookup entry before
returning the API error. Copy the SCID owner out before taking the
global write lock so cleanup does not deadlock in Rust 2021.

Keep unknown SCIDs non-destructive so applications can multiplex
interception events across handlers, and cover that with a real
interception test.
@joostjager joostjager force-pushed the lsps2-abandon-fail-intercepted-htlcs branch from bfae928 to 8623f32 Compare June 17, 2026 10:13
Comment thread lightning-liquidity/src/lsps2/service.rs Outdated
When htlc_intercepted rejects an LSPS2-owned HTLC, it fails
the intercept and prunes the per-peer and global SCID mappings
before returning an API error. Persist the peer state after
releasing locks so the cleanup survives restarts.

If persistence also fails, include the original HTLC interception
error in the returned diagnostic instead of hiding it.

@tnull tnull left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, here's a lot going on here, also seemingly unrelated refactorings and the scope of the PR is now different from the original issue, but a rather substantial change now (~300 LoC). Can't say I'm the biggest fan of slipping that in last-minute, but if we still intend to:

a) We should check we actually cover all the cases mentioned in #3468 (and link that issue, given we now seem to address it)
b) Would be good to keep the individual commits less invasive, especially if there's no need to rewrite the pre-existing patterns that are also present in other methods of the service (not saying it's the most beautiful code, but consistency is important).

peer_state.remove_outbound_channel(intercept_scid);
let err = e.err;
drop(peer_state);
drop(outer_state_lock);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these explicit drops?

if let Some(jit_channel) =
peer_state.outbound_channels_by_intercept_scid.get_mut(&intercept_scid)
{
should_persist = Some(*counterparty_node_id);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need all these changes from reference to copy suddenly?

.remove(&intercept_scid);
// TODO: cleanup peer_by_intercept_scid
return Err(APIError::APIMisuseError { err: e.err });
if let Err(e) = self.persist_peer_state(counterparty_node_id).await

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we need to eagerly persist here?

},
}
} else {
// `peer_by_intercept_scid` is a separate index from the per-peer

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the 'per-peer channel map'? In which case would pruning leave an entry in peer_by_intercept_scid?

OutboundJITChannelState::PendingInitialPayment { .. }
| OutboundJITChannelState::PendingChannelOpen { .. }
);
let intercepted_htlcs = match &mut jit_channel.state {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need all this seemingly unrelated refactoring?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants