The Rain server panics when a task becomes ready. The relevant part of the log seems to be the following:
...
DEBUG 2018-03-17T15:31:49Z: librain::server::scheduler: Scheduler: New ready task (1,23092)
... [many New ready task info lines, various IDs]
DEBUG 2018-03-17T15:31:49Z: librain::server::scheduler: Scheduler: New ready task (1,23092)
thread 'main' panicked at 'assertion failed: r', src/server/scheduler.rs:148:17
stack backtrace:
0: std::sys::unix::backtrace::tracing::imp::unwind_backtrace
DEBUG 2018-03-17T15:31:49Z: tokio_reactor: loop process - 1 events, 0.000s
at libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at libstd/sys_common/backtrace.rs:59
at libstd/panicking.rs:207
3: std::panicking::default_hook
at libstd/panicking.rs:223
4: std::panicking::rust_panic_with_hook
at libstd/panicking.rs:402
5: std::panicking::begin_panic
6: librain::server::scheduler::ReactiveScheduler::schedule
7: librain::server::state::State::run_scheduler
8: librain::server::state::<impl librain::common::wrapped::WrappedRcRefCell<librain::server::state::State>>::turn
9: rain::main
10: std::rt::lang_start::{{closure}}
11: std::panicking::try::do_call
at libstd/rt.rs:59
at libstd/panicking.rs:306
12: __rust_maybe_catch_panic
at libpanic_unwind/lib.rs:102
13: std::rt::lang_start_internal
at libstd/panicking.rs:285
at libstd/panic.rs:361
at libstd/rt.rs:58
14: main
15: __libc_start_main
16: _start
DEBUG 2018-03-17T15:31:49Z: tokio_reactor: loop process - 1 events, 0.000s
DEBUG 2018-03-17T15:31:49Z: tokio_reactor::background: shutting background reactor down NOW
...
However, a small test with multiple identical inputs passes, even with subsequent submits. The benchmark only fails with >500 tasks per layer. See the attached benchmark. It was run as python3 scalebench.py net -l 256 -w 1024 -s 0; the error happens around layer 10.
The debug checks with RAIN_DEBUG_MODE=1 do not find any consistency problems.
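One possible reading of the log, offered only as a guess: task (1,23092) is announced as ready twice, and the panic message 'assertion failed: r' is what `assert!(r)` prints when a boolean named `r` is false. The sketch below assumes, without having checked scheduler.rs, that `r` is the result of inserting the task into a set of ready tasks. `ReadySet`, `mark_ready` and `TaskId` are hypothetical names for illustration, not the actual Rain types.

```rust
use std::collections::HashSet;

/// Hypothetical task identifier, mirroring the (session, task) pairs in the log.
type TaskId = (i32, i32);

/// Hypothetical stand-in for the scheduler's bookkeeping of ready tasks.
#[derive(Default)]
struct ReadySet {
    ready: HashSet<TaskId>,
}

impl ReadySet {
    /// Records a task as ready.
    /// Returns true if the task was newly recorded, false if it was already present.
    fn mark_ready(&mut self, id: TaskId) -> bool {
        // HashSet::insert returns false when the value is already in the set.
        self.ready.insert(id)
    }
}

fn main() {
    let mut set = ReadySet::default();
    let first = set.mark_ready((1, 23092));
    let second = set.mark_ready((1, 23092));
    println!("first insert: {}, second insert: {}", first, second);
    // Assumption: if the scheduler did `let r = ...insert(...); assert!(r);`,
    // the second notification for the same task would panic with
    // 'assertion failed: r', matching the message in the log.
}
```

If this guess is right, the question is why the same task is reported ready twice only at larger scale, rather than whether the assertion itself is wrong.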