bug: transaction with no input/output states was not delegated properly #502

Closed
awrichar opened this issue Jan 6, 2025 · 4 comments · Fixed by #505
Labels
bug Something isn't working

Comments

@awrichar
Contributor

awrichar commented Jan 6, 2025

What happened?

I was testing some changes to the Noto domain, part of which involved building a transaction that had only "read" and "info" states, and no "input" or "output" states. The transaction flow seemed to get stuck when it should have delegated to the notary. While I saw the first few steps of the transaction flow happen on node 2 (the sender node), I saw no record at all of the transaction on node 1 (the notary node). When I changed the flow to have "inputs" and "outputs" after assembly, it seemed to work.

What did you expect to happen?

Transactions should always be delegated and submitted properly, regardless of what types of states they contain.

How can we reproduce it (as minimally and precisely as possible)?

See attached logs. Transaction was 046a84d1-bf4e-4ab2-8fa7-74844f4a3306.

paladin1.log
paladin2.log
paladin3.log

Anything else we need to know?

I'm not positive that this is related to read/info states, but that seems to be the only obvious difference between the flows that worked and those that did not.

OS version

No response

awrichar added the bug label Jan 6, 2025
@awrichar
Contributor Author

awrichar commented Jan 7, 2025

After looking into the logs, I see node2 attempting to delegate to node1 (notary) at this point:

[2025-01-06T23:23:27.300Z]  INFO GRPC sending message id=df849ab4-f5a8-4fbe-85b9-59f85effe654 cid=<nil> component=private-tx-manager messageType=DelegationRequest replyTo=node2 to peer node1 TRANSPORT=81db5764-37ed-4c42-9f68-c060c258b57c plugin=81db5764-37ed-4c42-9f68-c060c258b57c

But the first line in the node1 logs is this (note the later timestamp):

[2025-01-06T23:23:28.648Z]  INFO debug server listening on 127.0.0.1:6060 pid=1

This looks like node1 crashed and came back up, and apparently it does not remember to keep processing transactions afterward. That leaves two actions:

  1. Reproduce and determine the reason for the crash
  2. Make nodes more resilient to keep processing transactions if they crash and come back up
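
For the second action, one sender-side mitigation would be to re-send the delegation request until the peer acknowledges it, so a notary that restarts mid-flow still receives it. The following is only a minimal sketch under stated assumptions: `delegateWithRetry` and `flakySender` are hypothetical names for illustration and are not part of Paladin, and the actual recovery design is up to the maintainers.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// delegateWithRetry (hypothetical helper) calls send until it succeeds or
// maxAttempts is reached, doubling the wait between attempts.
// maxAttempts is assumed to be at least 1.
func delegateWithRetry(send func() error, maxAttempts int, initialBackoff time.Duration) (int, error) {
	backoff := initialBackoff
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = send(); lastErr == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return maxAttempts, fmt.Errorf("delegation not acknowledged after %d attempts: %w", maxAttempts, lastErr)
}

// flakySender (hypothetical) simulates a peer that is unavailable for the
// first `failures` calls, for example while it is restarting.
func flakySender(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("peer unavailable")
		}
		return nil
	}
}

func main() {
	// The simulated notary is down for the first two attempts.
	attempts, err := delegateWithRetry(flakySender(2), 5, 10*time.Millisecond)
	fmt.Println(attempts, err) // 3 <nil>
}
```

Retrying only helps if the receiving node can accept the same delegation twice, so a real implementation would also need the request to be idempotent on the notary side.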

@awrichar
Contributor Author

awrichar commented Jan 7, 2025

The crash:

[2025-01-07T19:55:07.336Z] ERROR Error getting state distributions: PD011831: Invalid transaction state for state distribution pid=1 role=pctm-loop-0xad3c780a19ec30d6d2f49b89acc457855c3be4cf
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0xffff46f270a8]

goroutine 1058 [running]:
github.com/kaleido-io/paladin/core/internal/privatetxnmgr.(*transactionFlow).applyTransactionAssembledEvent(0x400096f380, {0xffff47693798, 0x40008b9680}, 0x4000a3cdc0)
        /app/core/go/internal/privatetxnmgr/transaction_flow_mutators.go:125 +0x1b8
github.com/kaleido-io/paladin/core/internal/privatetxnmgr.(*transactionFlow).ApplyEvent(0x400096f380, {0xffff47693798, 0x40008b9680}, {0xffff47693b50, 0x4000a3cdc0})
        /app/core/go/internal/privatetxnmgr/transaction_flow_mutators.go:43 +0x294
github.com/kaleido-io/paladin/core/internal/privatetxnmgr.(*Sequencer).handleTransactionEvent(0x4000cb0600, {0xffff47693798, 0x40008b9680}, {0xffff47693b50, 0x4000a3cdc0})
        /app/core/go/internal/privatetxnmgr/sequencer_event_loop.go:101 +0x1b8
github.com/kaleido-io/paladin/core/internal/privatetxnmgr.(*Sequencer).evaluationLoop(0x4000cb0600)
        /app/core/go/internal/privatetxnmgr/sequencer_event_loop.go:48 +0x270
created by github.com/kaleido-io/paladin/core/internal/privatetxnmgr.(*Sequencer).Start in goroutine 1150
        /app/core/go/internal/privatetxnmgr/sequencer.go:325 +0x108
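
For context on why this takes down the whole node: in Go, a panic that is not recovered inside the goroutine where it occurs terminates the entire process, which is consistent with node1 restarting. Below is a minimal sketch of isolating a per-event panic with `defer`/`recover`. The names `event`, `handleEvent`, and `safeHandle` are hypothetical stand-ins, not Paladin code, and whether recovering is appropriate here (rather than fixing the nil dereference alone) is a judgment call.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a transaction event.
type event struct {
	TxID    string
	Payload *string
}

// handleEvent dereferences Payload without a nil check, so it panics with a
// nil pointer dereference when Payload is nil.
func handleEvent(ev event) int {
	return len(*ev.Payload)
}

// safeHandle converts a panic raised while handling one event into an error,
// so the calling loop can continue with the next event.
func safeHandle(ev event) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("transaction %s: recovered from panic: %v", ev.TxID, r)
		}
	}()
	return handleEvent(ev), nil
}

func main() {
	payload := "assembled"
	events := []event{
		{TxID: "tx-1", Payload: &payload},
		{TxID: "tx-2"}, // nil Payload triggers the panic path
		{TxID: "tx-3", Payload: &payload},
	}
	for _, ev := range events {
		if n, err := safeHandle(ev); err != nil {
			fmt.Println("error:", err)
		} else {
			fmt.Println(ev.TxID, "handled, payload length", n)
		}
	}
}
```

With this shape, the failing transaction is reported as an error and the events after it are still handled.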

@awrichar
Contributor Author

awrichar commented Jan 7, 2025

I can't quite figure out what the nil pointer is on that line.

However, this logic does appear to have a hole in it, if you have info states but no output states:

if tf.transaction.PostAssembly.OutputStatesPotential != nil && tf.transaction.PostAssembly.OutputStates == nil {
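
The gap can be shown in isolation. This is a minimal sketch using simplified stand-in types: `state`, `postAssembly`, `originalCheck`, and `broaderCheck` are hypothetical, and the `InfoStatesPotential`/`InfoStates` fields are assumed counterparts of the output fields in the quoted condition. `broaderCheck` is one possible alternative for illustration, not the actual fix.

```go
package main

import "fmt"

// state is a hypothetical stand-in for a potential or resolved state.
type state struct{ ID string }

// postAssembly is a simplified, hypothetical stand-in for the post-assembly
// part of a transaction. The Output* field names come from the quoted
// condition; the Info* fields are assumed counterparts.
type postAssembly struct {
	OutputStatesPotential []*state
	OutputStates          []*state
	InfoStatesPotential   []*state
	InfoStates            []*state
}

// originalCheck reproduces the quoted condition.
func originalCheck(pa *postAssembly) bool {
	return pa.OutputStatesPotential != nil && pa.OutputStates == nil
}

// broaderCheck is one possible alternative (an assumption): it also fires
// when only potential info states are waiting to be resolved.
func broaderCheck(pa *postAssembly) bool {
	outputsPending := len(pa.OutputStatesPotential) > 0 && pa.OutputStates == nil
	infoPending := len(pa.InfoStatesPotential) > 0 && pa.InfoStates == nil
	return outputsPending || infoPending
}

func main() {
	// A transaction with info states but no output states.
	infoOnly := &postAssembly{InfoStatesPotential: []*state{{ID: "info-1"}}}
	fmt.Println(originalCheck(infoOnly)) // false: the branch is skipped
	fmt.Println(broaderCheck(infoOnly))  // true
}
```

For an info-only transaction the original condition is false because `OutputStatesPotential` is nil, so whatever the branch does for potential states is skipped entirely.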

@peterbroadhurst
Contributor

> This looks like node1 crashed and came back up, and apparently it does not remember to keep processing transactions afterward.

Yes. This is a significant TODO that @hosie owns, and is intertwined with engineering work on #463 currently.
