Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overflow when subtracting duration from instant #1805

Closed
kevholder opened this issue Jul 23, 2020 · 5 comments · Fixed by #1806
Closed

overflow when subtracting duration from instant #1805

kevholder opened this issue Jul 23, 2020 · 5 comments · Fixed by #1806
Assignees

Comments

@kevholder
Copy link

Do you want to request a feature or report a bug?

bug

What is the current behavior?

panic

If the current behavior is a bug, please provide the steps to reproduce and if possible a minimal demo of the problem.

    Finished release [optimized] target(s) in 0.30s
     Running `target/release/graph-node --postgres-url 'postgresql://user@localhost:5432/graph-node' --ethereum-rpc 'mainnet:http://localhost:8333' --ipfs '127.0.0.1:5001'`
Jul 23 17:40:17.334 INFO Graph Node version: 0.18.0 :: fraction+4 (dc826116b 2020-07-20) dirty 1 modification
Jul 23 17:40:17.335 INFO Starting up
Jul 23 17:40:17.335 INFO Trying IPFS node at: http://127.0.0.1:5001/
Jul 23 17:40:17.335 INFO Creating transport, url: http://localhost:8333, network: mainnet
Jul 23 17:40:17.335 INFO Connecting to Postgres, conn_pool_size: 10, url: postgresql://user@localhost:5432/graph-node
Jul 23 17:40:17.335 INFO Successfully connected to IPFS node at: http://127.0.0.1:5001/
Jul 23 17:40:17.395 INFO Connected to Postgres, url: postgresql://user@localhost:5432/graph-node, component: Store
Jul 23 17:40:17.402 INFO Connecting to Ethereum..., network: mainnet
Jul 23 17:40:17.412 INFO Connected to Ethereum, network_version: 1515, network: mainnet
Jul 23 17:40:17.413 INFO Waiting for other graph-node instances to finish migrating, component: Store
Jul 23 17:40:17.413 INFO Running migrations, component: Store
Jul 23 17:40:17.414 INFO Migrations finished, component: Store
Jul 23 17:40:17.414 INFO Completed pending Postgres schema migrations, component: Store
Jul 23 17:40:17.415 INFO Creating LoadManager in disabled mode, component: LoadManager
thread 'tokio-runtime-worker' panicked at 'overflow when subtracting duration from instant', src/libstd/time.rs:374:9
stack backtrace:
   0:        0x10a8454ce - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he3a6eecb2201efb0
   1:        0x10a86bc0c - core::fmt::write::h8ef98027ac1df1be
   2:        0x10a83e4f7 - std::io::Write::write_fmt::hed52d76b85d70756
   3:        0x10a8474c5 - std::panicking::default_hook::{{closure}}::h013242fa408d10b7
   4:        0x10a847202 - std::panicking::default_hook::h9777ceea989331ac
   5:        0x10a847a7d - std::panicking::rust_panic_with_hook::h845f7f3fcb3a3323
   6:        0x10a847652 - rust_begin_unwind
   7:        0x10a8da2ef - core::panicking::panic_fmt::h60d70f2d73364aac
   8:        0x10a8da17a - core::option::expect_failed::hf9b785662d036c95
   9:        0x10a844f19 - <std::time::Instant as core::ops::arith::Sub<core::time::Duration>>::sub::hbd31f6699f9ae13c
  10:        0x10a18b3a2 - graph::data::graphql::effort::LoadManager::new::h1aa05dcbf988a5b8
  11:        0x1098be4df - graph_node::main::{{closure}}::{{closure}}::h123330bafa15cc7b
  12:        0x109888a17 - futures::future::chain::Chain<A,B,C>::poll::h84161ac93583aa73
  13:        0x1095b0d98 - futures::task_impl::std::set::ha3144a263a9f12e0
  14:        0x1096bc7ab - <futures_util::future::try_future::unwrap_or_else::UnwrapOrElse<Fut,F> as core::future::future::Future>::poll::h67450d6bc701ec96
  15:        0x1095adbc7 - tokio::runtime::task::core::Core<T,S>::poll::h6ec408dbb75190ab
  16:        0x10974a60c - <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::h9abf196fcc8f2e54
  17:        0x10982f009 - tokio::runtime::task::harness::Harness<T,S>::poll::he175c130153063fc
  18:        0x10a7a49e8 - std::thread::local::LocalKey<T>::with::hc21c62c4880aabad
  19:        0x10a7ad976 - tokio::runtime::thread_pool::worker::Context::run_task::h8ef1a1f41062fb83
  20:        0x10a7ad0be - tokio::runtime::thread_pool::worker::Context::run::hab70af7a5f5e42aa
  21:        0x10a7b6093 - tokio::macros::scoped_tls::ScopedKey<T>::set::h452e2db85536cc57
  22:        0x10a7acbc4 - tokio::runtime::thread_pool::worker::run::hbcc4820362ed3c70
  23:        0x10a7b130c - tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut::h87c9010611192b97
  24:        0x10a7b942b - tokio::runtime::task::harness::Harness<T,S>::poll::ha5044858cf73012a
  25:        0x10a7b2626 - tokio::runtime::blocking::pool::Inner::run::h8fd199bf2a4eb989
  26:        0x10a7a29e0 - tokio::runtime::context::enter::hc9fb6582dd04dfba
  27:        0x10a79e076 - std::sys_common::backtrace::__rust_begin_short_backtrace::h44f355ea4f18ed28
  28:        0x10a7a8869 - core::ops::function::FnOnce::call_once{{vtable.shim}}::h3343eb32514d6f5c
  29:        0x10a84ac3d - std::sys::unix::thread::Thread::new::thread_start::h0031c4f1bba2584f
  30:     0x7fff72333109 - _ZL12preoptimized
Panic in tokio task, aborting!
./run.sh: line 7: 31030 Abort trap: 6           cargo run -p graph-node --release -- --postgres-url postgresql://user@localhost:5432/graph-node --ethereum-rpc mainnet:http://localhost:8333 --ipfs 127.0.0.1:5001

What is the expected behavior?

No panic.

Any idea what is causing this panic? Strangely, I am fairly sure this error was not happening yesterday with the same command-line arguments. I tried resetting the database and IPFS to no avail. I am testing with a heavily modified version of Ethereum API so that might be the culprit but still it would help to have a pointer in the right direction.

@Jannis Jannis assigned Jannis and lutter and unassigned Jannis Jul 23, 2020
@kevholder
Copy link
Author

Except for the postgresql://user@localhost:5432/graph-node , does graph-node keep some state anywhere else, e.g. on the file system? I am trying to start it from scratch again because I swear it was working yesterday.

@kevholder
Copy link
Author

I'm using macOS 10.15.5 and rustc 1.45.0

Possibly related issue: rust-lang/rust#48980

@kevholder
Copy link
Author

Oh found something interesting: rust-lang/rust#48980 (comment)

Maybe it fails because I rebooted my computer less than a day ago?

@kevholder
Copy link
Author

Yep, changing let long_ago = Duration::from_secs(86400); to let long_ago = Duration::from_secs(1); fixes the issue.

@lutter
Copy link
Collaborator

lutter commented Jul 24, 2020

Oh wow .. thanks for doing such a thorough investigation! Using time since boot is just mean ;) I'll change it so it doesn't fail no matter when the computer was booted.

lutter added a commit that referenced this issue Jul 24, 2020
On Mac OSX, Instant::now() is the time since the last boot, which might
have been less than a day ago, in which case KillState::new would panic. We
now use 'now - 60s' as a time in the past for initialization, and if that
fails, we'll use just 'now'. There's not much danger in using 'now', it
might just lead to spurious logging and a too-soon update of the kill_rate
if the node becomes immediately overloaded after starting.

Thanks to kevholder for root-causing the issue and demonstrating a fix.

Fixes #1805
lutter added a commit that referenced this issue Jul 24, 2020
On Mac OSX, Instant::now() is the time since the last boot, which might
have been less than a day ago, in which case KillState::new would panic. We
now use 'now - 60s' as a time in the past for initialization, and if that
fails, we'll use just 'now'. There's not much danger in using 'now', it
might just lead to spurious logging and a too-soon update of the kill_rate
if the node becomes immediately overloaded after starting.

Thanks to kevholder for root-causing the issue and demonstrating a fix.

Fixes #1805
lutter added a commit that referenced this issue Jul 24, 2020
On Mac OSX, Instant::now() is the time since the last boot, which might
have been less than a day ago, in which case KillState::new would panic. We
now use 'now - 60s' as a time in the past for initialization, and if that
fails, we'll use just 'now'. There's not much danger in using 'now', it
might just lead to spurious logging and a too-soon update of the kill_rate
if the node becomes immediately overloaded after starting.

Thanks to kevholder for root-causing the issue and demonstrating a fix.

Fixes #1805
@lutter lutter closed this as completed in f2f6ec8 Jul 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants