You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
We've deployed a net from scratch, with 4 validator nodes. We started out at 5 sols per node, and delegated a billion sols after. Recently, we reached 4 billion sols active stake (for comparison, solana mainnet has yet to reach 400 million as of writing), and our net crashed. We can't boot it back up either, our last snapshot is shortly before the epoch change and subsequent 4 billion stake milestone.
This is our error log:
[2024-11-14T01:26:25.974747961Z INFO solana_runtime::bank] new epoch stakes, epoch: 119, total_stake: 4000000019999973760
thread 'thread 'thread '<unnamed><unnamed><unnamed>' panicked at ' panicked at ' panicked at /opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs/opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs/opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:::436436436:::101010:
:
:
called `Option::unwrap()` on a `None` valuecalled `Option::unwrap()` on a `None` valuecalled `Option::unwrap()` on a `None` value
stack backtrace:
[2024-11-14T01:26:26.057024521Z INFO solana_metrics::metrics] datapoint: tpu-verifier recv_batches_us_90pct=0i recv_batches_us_min=0i recv_batches_us_max=0i recv_batches_us_mean=0i verify_batches_pp_us_90pct=0i verify_batches_pp_us_min=0i verify_batches_pp_us_max=0i verify_batches_pp_us_mean=0i discard_packets_pp_us_90pct=0i discard_packets_pp_us_min=0i discard_packets_pp_us_max=0i discard_packets_pp_us_mean=0i dedup_packets_pp_us_90pct=0i dedup_packets_pp_us_min=0i dedup_packets_pp_us_max=0i dedup_packets_pp_us_mean=0i batches_90pct=0i batches_min=0i batches_max=0i batches_mean=0i packets_90pct=0i packets_min=0i packets_max=0i packets_mean=0i num_deduper_saturations=0i total_batches=0i total_packets=0i total_dedup=0i total_excess_fail=0i total_valid_packets=0i total_discard_random=0i total_shrinks=0i total_dedup_time_us=0i total_discard_time_us=0i total_discard_random_time_us=0i total_verify_time_us=0i total_shrink_time_us=0i
[2024-11-14T01:26:26.104300166Z INFO solana_metrics::metrics] datapoint: connection_cache_tpu_quic cache_hits=2i cache_misses=0i cache_evictions=0i eviction_time_ms=0i get_connection_ms=0i get_connection_lock_ms=0i get_connection_hit_ms=0i get_connection_miss_ms=0i make_connection_ms=0i total_connections=0i connection_reuse=2i connection_errors=0i zero_rtt_accepts=0i zero_rtt_rejects=0i congestion_events=0i streams_blocked_uni=0i data_blocked=0i acks=0i num_packets=2i total_batches=2i batch_failure=0i send_timeout=0i average_send_packet_us=0i successful_packets=0i average_prepare_connection_us=0i
[2024-11-14T01:26:26.229872130Z INFO solana_metrics::metrics] datapoint: serve_repair-requests_received total_requests=0i dropped_requests_outbound_bandwidth=0i dropped_requests_load_shed=0i dropped_requests_low_stake=0i whitelisted_requests=0i total_dropped_response_packets=0i handle_requests_staked=0i handle_requests_unstaked=0i processed=0i total_response_packets=0i total_response_bytes_staked=0i total_response_bytes_unstaked=0i self_repair=0i window_index=0i request-highest-window-index=0i orphan=0i serve_repair-request-ancestor-hashes=0i pong=0i window_index_misses=0i ping_cache_check_failed=0i pings_sent=0i decode_time_us=0i handle_requests_time_us=0i err_time_skew=0i err_malformed=0i err_sig_verify=0i err_unsigned=0i err_id_mismatch=0i
0: rust_begin_unwind
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
1: core::panicking::panic_fmt
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
2: core::panicking::panic
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
3: solana_stake_program::stake_state::redeem_rewards
[2024-11-14T01:26:26.309572830Z INFO solana_metrics::metrics] datapoint: ancestor_hashes_responses total_packets=0i processed=0i dropped_packets=0i invalid_packets=0i ping_count=0i ping_err_verify_count=0i
4: solana_runtime::bank::Bank::redeem_rewards::{{closure}}::{{closure}}
5: rayon::iter::plumbing::Folder::consume_iter
6: rayon::iter::plumbing::bridge_producer_consumer::helper
[2024-11-14T01:26:26.334952317Z INFO solana_core::window_service] num addresses: 0, top packets by source: []
[2024-11-14T01:26:26.334956735Z INFO solana_metrics::metrics] datapoint: blockstore-insert-shreds num_shreds=0i total_elapsed_us=0i insert_lock_elapsed_us=0i insert_shreds_elapsed_us=0i shred_recovery_elapsed_us=0i chaining_elapsed_us=0i commit_working_sets_elapsed_us=0i write_batch_elapsed_us=0i num_inserted=0i num_repair=0i num_recovered=0i num_recovered_inserted=0i num_recovered_failed_sig=0i num_recovered_failed_invalid=0i num_recovered_exists=0i num_recovered_blockstore_error=0i num_repaired_data_shreds_exists=0i num_turbine_data_shreds_exists=0i num_data_shreds_invalid=0i num_data_shreds_blockstore_error=0i num_coding_shreds_exists=0i num_coding_shreds_invalid=0i num_coding_shreds_invalid_erasure_config=0i num_coding_shreds_inserted=0i
[2024-11-14T01:26:26.334964019Z INFO solana_metrics::metrics] datapoint: recv-window-insert-shreds handle_packets_elapsed_us=0i run_insert_count=0i num_packets=0i num_repairs=0i num_shreds_received=0i shred_receiver_elapsed_us=0i prune_shreds_elapsed_us=0i num_shreds_pruned_invalid_repair=0i num_errors=15i num_errors_blockstore=0i num_errors_other=0i num_errors_try_crossbeam_send=0i num_errors_cross_beam_recv_timeout=15i
7: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
8: rayon_core::registry::WorkerThread::wait_until_cold
9: rayon_core::registry::ThreadBuilder::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
0: rust_begin_unwind
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
1: core::panicking::panic_fmt
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
2: core::panicking::panic
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
3: solana_stake_program::stake_state::redeem_rewards
4: solana_runtime::bank::Bank::redeem_rewards::{{closure}}::{{closure}}
5: rayon::iter::plumbing::Folder::consume_iter
6: rayon::iter::plumbing::bridge_producer_consumer::helper
7: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
8: rayon_core::registry::WorkerThread::wait_until_cold
[2024-11-14T01:26:26.396545175Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="?" one=1i message="panicked at /opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10:
called `Option::unwrap()` on a `None` value" location="/opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10" version="1.17.27 (src:00000000; feat:3746964731, client:SolanaLabs)"
9: rayon_core::join::join_context::{{closure}}
10: rayon_core::registry::in_worker
11: rayon::iter::plumbing::bridge_producer_consumer::helper
[2024-11-14T01:26:26.424312324Z INFO solana_metrics::metrics] datapoint: memory-stats total=16768888832i swap_total=2148528128i free_percent=1.6338187386464036 used_bytes=9155133440i avail_percent=45.40405430722817 buffers_percent=0.4778983318624698 cached_percent=38.08877795057947 swap_free_percent=100
12: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
13: rayon_core::registry::WorkerThread::wait_until_cold
14: rayon_core::registry::ThreadBuilder::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
stack backtrace:
[2024-11-14T01:26:26.452182033Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="?" one=1i message="panicked at /opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10:
called `Option::unwrap()` on a `None` value" location="/opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10" version="1.17.27 (src:00000000; feat:3746964731, client:SolanaLabs)"
0: rust_begin_unwind
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
1: core::panicking::panic_fmt
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
2: core::panicking::panic
at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:117:5
3: solana_stake_program::stake_state::redeem_rewards
4: solana_runtime::bank::Bank::redeem_rewards::{{closure}}::{{closure}}
5: rayon::iter::plumbing::Folder::consume_iter
6: rayon::iter::plumbing::bridge_producer_consumer::helper
7: <rayon_core::job::StackJob<L,F,R> as rayon_core::job::Job>::execute
8: rayon_core::registry::WorkerThread::wait_until_cold
9: rayon_core::registry::ThreadBuilder::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2024-11-14T01:26:26.452475536Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="?" one=1i message="panicked at /opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10:
called `Option::unwrap()` on a `None` value" location="/opt/bins/solana/1.17.27/solana-1.17.27/programs/stake/src/stake_state.rs:436:10" version="1.17.27 (src:00000000; feat:3746964731, client:SolanaLabs)"
Since we're seeing a None there right after a checked_mul() and are dealing with ...slightly oversized numbers, we're suspecting an integer overflow here. Then again, u128 should leave ample room to work with on that multiplication, so we're unsure how it could possibly overflow.
Software version is v1.17.34.
We had previously tracked the affected line down expecting a fix or something in the latest versions, so for convenience's sake, here it is: It got moved to rewards.rs following a refactor, albeit with little change to the underlying logic, and we haven't seen any indication that this behavior should vary across versions.
(Note: reopening this here after realizing the old repo is no longer in use, so the links above won't quite match)
Proposed Solution
None, I'm afraid.
The text was updated successfully, but these errors were encountered:
Problem
Hello,
We've deployed a net from scratch, with 4 validator nodes. We started out at 5 sols per node, and delegated a billion sols after. Recently, we reached 4 billion sols active stake (for comparison, solana mainnet has yet to reach 400 million as of writing), and our net crashed. We can't boot it back up either, our last snapshot is shortly before the epoch change and subsequent 4 billion stake milestone.
This is our error log:
Since we're seeing a None there right after a
checked_mul()
and are dealing with ...slightly oversized numbers, we're suspecting an integer overflow here. Then again, u128 should leave ample room to work with on that multiplication, so we're unsure how it could possibly overflow.Software version is v1.17.34.
We had previously tracked the affected line down expecting a fix or something in the latest versions, so for convenience's sake, here it is: It got moved to rewards.rs following a refactor, albeit with little change to the underlying logic, and we haven't seen any indication that this behavior should vary across versions.
(Note: reopening this here after realizing the old repo is no longer in use, so the links above won't quite match)
Proposed Solution
None, I'm afraid.
The text was updated successfully, but these errors were encountered: