Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Scheduler Memory Leaks #712

Merged

Conversation

benjchristensen
Copy link
Member

The NewThreadScheduler, CurrentThreadScheduler and ExecutorScheduler all had memory leaks when doing recursion with the Func2 method signature. This pull request fixes that along with improving the unit test coverage.

The fix involved treating "outer" and "inner" schedulers differently, with "inner" being the place where recursion happens.

The memory behavior can be tested using TestRecursionMemoryUsage.

This fixes the problems reported in #643 and #648 but does not change the Scheduler or Subscription interfaces or public implementation details.

NewThreadScheduler is working, the other two are not so commented out for now until fixed.
- the Action0 method did not have a leak
- the Func2 method on inner scheduler recursion did have a leak
- passing for all but ExecutorScheduler
- new InnerExecutorScheduler and childSubscription
- improvements to unit tests
- Current/Immediate/NewThread/Executor Schedulers are passing unit tests
- Current/NewThread/Executor Schedulers do not leak memory on the recursion test (Immediate can’t be used for recursion otherwise it stack overflows)
- use Long instead of Int so we don’t overflow
- migrate from deprecated method
- outer/inner scheduling so nested order is correct while not deadlocking on certain nested use cases as found in previous testing
- introduced a bug during refactor, caught it while updating unit tests
- merged all scheduler tests into the same package
- using inheritance so that the same tests are applied to each of the different Scheduler implementations
- manual test (too slow for normal execution) can be run to test memory leaks (TestRecursionMemoryUsage.java)
@benjchristensen
Copy link
Member Author

@headinthebox working on this made me think our work on Outer/Inner schedulers may still be worthwhile as the distinction is important in usage, particularly for recursion. The branch is at https://github.com/benjchristensen/RxJava/blob/scheduler-refactor-inner-outer/rxjava-core/src/main/java/rx/Scheduler.java#L104

Many if not all of the issues I found were related to treating outer and inner schedulers the same.

@cloudbees-pull-request-builder

RxJava-pull-requests #629 FAILURE
Looks like there's a problem with this pull request

@benjchristensen
Copy link
Member Author

Apparently I have some non-deterministic tests:

rx.schedulers.ExecutorSchedulerTests.recursionUsingFunc2
rx.schedulers.ExecutorSchedulerTests.recursionUsingAction0

I'll spend some time looking at them a little later before I merge this.

If anyone else wants to review or provide guidance on improving this please jump in.

}));

// replace the InnerExecutorScheduler child subscription with this one
childSubscription.set(s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible that when the execution gets here, another recursive scheduling has taken place on another thread (due to a multi-threaded pool and L172) and childSubscription might contain a newer subscription. This will however set it back to an older one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I'm aware of that. Note the comments I wrote about this right below (starting at line 182):

 /*
  * TODO: Consider what will happen if `schedule` is run concurrently instead of recursively
  * and we lose subscriptions as the `childSubscription` only remembers the last one scheduled.
  * 
  * Not obvious that this should ever happen. Can it?
  * 
  * benjchristensen => Haven't been able to come up with a valid test case to prove this as an issue
  * so it may not be.
 */

Can you provide a use case where this issue actually occurs? I can't think of one that is legit. I tried writing a few forced ones where I called the inner schedule twice but they were very wrong uses of a Scheduler.

An Observable is single-threaded, thus when it is recursing on an inner scheduler it should only ever have a single childSubscription scheduled and only schedule a new task at the end of completing the current one.

That said I think ExecutorService is fundamentally flawed as it currently stands.

  1. It allows concurrent execution if used in a pattern other than sequential recursion. In other words, it doesn't protect against misuse.
  2. It jumps threads which could theoretically cause problems (though this doesn't seem to cause immediate issues in the tests I've done: https://github.com/benjchristensen/RxJava/blob/scheduler-memory-leak/rxjava-core/src/test/java/rx/schedulers/ExecutorSchedulerTests.java#L42)
  3. Performance is very bad when recursing (probably due to jumping threads): Recursion on ExecutorService with >1 Thread is Slow #711

I still think it's valid to have a "thread pool" model so every scheduled action isn't launching a thread, and so we don't have unbounded threads for computational work, but I'm considering that maybe we should have a pool of NewThreadScheduler type event loops where the work is performed. It would be a nice to find a way of doing that though that didn't allow a single event-loop being saturated to cause an Observable to become slow/blocked when other threads (event-loops) are idle. That is one benefit of using the executor thread-pool. This assumes there is not a strong reason to never leave the thread once chosen. Is there a reason to prevent onNext calls on a single Observable from moving between threads as long as they are sequential? For performance reasons we want to prefer staying on a single thread, but is there any reason not to move across threads?

The problem would remain though if someone used their own Executor. Perhaps the ExecutorScheduler should not exist and raw thread-pools are inappropriate for Rx Schedulers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I can't provide such test case; I heard of concurrency tools which can affect thread scheduling and expose bugs, but I don't have any concrete tips. I could "manually" trigger the case where I set a breakpoint on L181 and let the second invocation go through first. My IncrementalSubscription was designed to prevent this out-of-order case. Beyond that, maybe it is worth asking the concurrency-interest list for advice on ExecutorService.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I too can artificially trigger the problem, just can't think of a real Observable use case.

I'm going to explore a pool of event-loops for the 'computation' scheduler and think more about whether ExecutorScheduler can be used differently to avoid these problems.

Happy New Year by the way! It's been awesome having you involved in this project.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to proceed with the merge as this issue with ExecutorScheduler is an edge case and less of a problem than the issue of recursion having a memory leak.

I've created a new issue for fixing the ExecutorScheduler: #713

- Increasing timeout by a lot to handle slow machines such as this: https://netflixoss.ci.cloudbees.com/job/RxJava-pull-requests/629/testReport/junit/rx.schedulers/ExecutorSchedulerTests/recursionUsingFunc2/
- The timeout is only there if a deadlock or memory leak occurs (which is what this PR is fixing) so when everything is healthy it does not timeout
@cloudbees-pull-request-builder

RxJava-pull-requests #630 FAILURE
Looks like there's a problem with this pull request

- this test does a flatMap which uses merge which has non-deterministic ordering since the Observable.from can be on a new thread each time
@cloudbees-pull-request-builder

RxJava-pull-requests #631 SUCCESS
This pull request looks good

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants