Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support clean shutdown of TaskRunner threads and documentation #6173

Closed
damien-urruty-sonarsource opened this issue Jul 7, 2020 · 28 comments
Labels
bug Bug in existing code

Comments

@damien-urruty-sonarsource

Hello,

I am precisely in a situation where I need to shutdown a client manually.

We are developing an IntelliJ plugins that uses OkHttp (https://sonarlint.org/). Since latest versions of IntelliJ, plugins can be installed and uninstalled without restarting the IDE. It means that when plugin is uninstalled we need to unload all the classes we use, and OkHttp is part of that.

I followed the guidelines here: https://square.github.io/okhttp/4.x/okhttp/okhttp3/-ok-http-client/#shutdown-isnt-necessary, and I use the latest version (4.7.2).

It appears that the steps described in this page are not sufficient. I can still see an instance of TaskRunner and RealBackend, that retains an ThreadPoolExecutor. I had to call also:

((TaskRunner.RealBackend)TaskRunner.INSTANCE.getBackend()).shutdown();

Which is a bit hacky, but this correctly released the executor.

I think this should be at least documented, or maybe improved with a better API. The shutdown method could possibly be defined in the Backend interface.

@damien-urruty-sonarsource damien-urruty-sonarsource added the bug Bug in existing code label Jul 7, 2020
@yschimke
Copy link
Collaborator

yschimke commented Jul 7, 2020

We probably do need some public API to do this, but it's awkward because it's a singleton. So you might kill another user of OkHttp in the same process.

Throwing out some ideas

  1. TaskRunner.shutdown() or TaskRunner.INSTANCE.executor()...
  2. Avoid any singleton e.g. use per root client?
  3. Allow OkHttpClient.Builder to opt out of using the Singleton, and avoid building in that case.

@yschimke
Copy link
Collaborator

@swankjesse Are any of the solutions above workable for you? Public shutdown API, avoid a singleton TaskRunner completely, or only start the TaskRunner singleton on first use and provide opt out?

@swankjesse
Copy link
Collaborator

I don't like exposing implementation details like these because they change. For example, when Loom comes we'll use virtual threads and this won't work the same way.

Lemme try to find something. In the interim, use an internal API?

@damien-urruty-sonarsource
Copy link
Author

Hey, thanks for your response!

As I said, I already have a workaround with the one-liner I provided above, so there is no urgency. A similar issue in Okio I referenced above is more problematic because I have no workaround: square/okio#738

I agree that having an API exposing implementation details would not be the right approach.

As this TaskRunner is triggered when I use the HTTP client API, I would expect to have a way to shut it down from the same API, e.g. on OkHttpClient, or maybe be able to pass a specific executor in the builder API ? But in the end it might mean not relying on a singleton

@swankjesse
Copy link
Collaborator

Both Okio’s watchdog and TaskRunner do release their threads after 60 seconds. How do you feel about a system property to shorten that lifetime? Maybe something like this?

System.setProperty("okio.threadpool.keepalivemillis", "3000")

We’d use this property for both Okio and OkHttp. If it’s set we honor it, otherwise we do 60 seconds as today.

This is a bit loose and it’s potentially something we could break! But its a singleton which lets us share threads between OkHttpClient instances.

@yschimke
Copy link
Collaborator

@swankjesse That's why I favour this approach

Allow OkHttpClient.Builder to opt out of using the Singleton, and avoid building in that case.

Maybe in combination with above, allow a client to be configured not to share the thread pool. But the system property seems like the minimal fix here.

@damien-urruty-sonarsource
Copy link
Author

Also the advantage of having an API is that each user can choose which settings to apply. If we have a global system property, then it is shared by every user.

Anyway your decision will be mine!

@yschimke
Copy link
Collaborator

Either way suggest two PRs. So submit the one that Jesse suggested first.

@swankjesse
Copy link
Collaborator

The Okio watchdog is an process-wide singleton and there's no place to configure it.

@damien-urruty-sonarsource
Copy link
Author

Ok, let me try, I will open a PR

@yschimke
Copy link
Collaborator

yschimke commented Aug 8, 2020

What about a public API

/**
 * Attempt to close any resources held by OkHttp outside client instances.  After cleanly closing all calls and clients
 * in the JVM, this method will succeed. However if any connections are still held open this may fail due to threads that are still in use.
 */
fun OkHttp.tryShutdown(): boolean

Basically non destructive but releases idle resources and returns a boolean whether it was successful? We could use it as a strong hint, and avoid either corrupting OkHttp if it's still in use, or configuring OkHttp suboptimally.

@yschimke yschimke changed the title Shutdown Isn’t Necessary doc page: missing instructions about TaskRunner Support clean shutdown of TaskRunner threads and documentation Aug 8, 2020
@JakeWharton
Copy link
Collaborator

What is the behavior with the executor service in dispatcher in that method?

@yschimke
Copy link
Collaborator

yschimke commented Aug 8, 2020

That one is already public. Frameworks controlling their own OkHttp client instances can already provide their own implementation of the executor service, or access the default (not linked to TaskRunner). So nothing, not behaviour on that executor service.

They can also close their clients, connections and calls.

But TaskRunner is private and a singleton and this shutdown will likely fail if you haven't cleanly closed everything else first, at which point you could look at what is still open.

@JakeWharton
Copy link
Collaborator

I see, you're proposing an unstable API for only internal things?

I would much prefer a public shutdown API that does everything. I'm not convinced the current position on shutdown is a good one.

@yschimke
Copy link
Collaborator

yschimke commented Aug 8, 2020

Sorry, I meant a simple single public singleton method to try shutdown and report whether it was successful.

I'm throwing out ideas, mainly because there are a few cases where it's hard to easily and cleanly meeting various needs. But ideally I'd still prefer that we make the defaults generally work, and having a logical additional step to cleanup e.g. override your OkHttp instance to not use the TaskRunner singleton.

@davidfrickert
Copy link

Any news on this?
I'm currently running on a GraalVM Native Image this lib https://github.com/minio/minio-java that uses OkHttp.
And I would really like OkHttp not to hang my Isolates for 60s with TaskRunner and OkIO Watchdog threads...

@yschimke
Copy link
Collaborator

yschimke commented Jun 8, 2021

I thought this particular issue you raised is fixed now after 4.5 #5832

I don't have issues with OkHttp clients + GraalVM + shutdown now that client is using daemon threads

@davidfrickert
Copy link

Hmm, in my case my Isolates stay up for 60s due to TaskRunner and OkIO Watchdog not exiting...

Example code:

private static String run(MinioClient minioClient, int seed, byte[] buffer) throws (...) {
		InputStream stream = minioClient.getObject(GetObjectArgs.builder()
																.bucket("files")
																.object(String.format("file-%d.dat", seed))
																.build());
		for (int bytesread = 0;
			bytesread < size;
			bytesread += stream.read(buffer, bytesread, size - bytesread));
		stream.close();
		return DatatypeConverter.printHexBinary(MessageDigest.getInstance("MD5").digest(buffer));
	}

Could I not be closing something?
Or the library that uses OkHttp underneath (Minio java), not be closing something?

Is the behaviour of having daemon threads wait for 60s normal?
This is the core problem that I'm having.

@swankjesse
Copy link
Collaborator

@davidfrickert these are daemon threads. They shouldn't keep your process alive on their own.

@davidfrickert
Copy link

davidfrickert commented Jun 8, 2021

@swankjesse Thanks for your response.

Well, from what I'm seeing it seems to be blocking, due to the logic implemented on Native Image isolate destruction code.

oracle/graal#2617 (comment)

They try to interrupt all threads currently executing (including daemon), and wait for them to exit to be able to successfuly close the Isolate and free all the memory.
Daemon threads block the successful cleanup if they don't exit on the interrupt - there is no distinction here between Daemon / Normal threads, all must stop, so, the Isolate memory stays alive for 60s - the configured max idle time on these daemon threads of OkHttp/OkIO, which is quite unfortunate.
Not sure what to do here to fix this, but I guess I might have to drop OkHttp if the cause is really the daemon threads not cleaning up.

@yschimke
Copy link
Collaborator

yschimke commented Jun 9, 2021

Is isolates different than processes here? We want to support GraalVM here, so it's worth seeing how we can fix this, so I want to understand the difference here.

@davidfrickert
Copy link

Yeah @yschimke
I'm still a bit new in the concepts so I'm not sure I can adequately explain the insides of it.

They are a concept that enables a program to allocate memory in a strictly isolated heap (from the main program heap, and other Isolates' heap), but they run in the same process afaik.

This Medium post is quite nice explaining it: https://medium.com/graalvm/isolates-and-compressed-references-more-flexible-and-efficient-memory-management-for-graalvm-a044cc50b67e

GraalVM native images now support isolates (multiple independent VM instances in the same process)

@yschimke
Copy link
Collaborator

yschimke commented Jun 9, 2021

Thanks for links. They seem close enough to a concept from HHVM (Facebook PHP). So are you reusing the same process with multiple isolates? Or it's still 1:1 with process lifetime?

@davidfrickert
Copy link

davidfrickert commented Jun 9, 2021

No worries!
Yeah reusing the same process, afaik, just allocating a Thread to the Isolate for the code that runs in its context.

@swankjesse
Copy link
Collaborator

Fix will be to exit on interrupt, then launch a new thread if new work is enqueued.

@davidfrickert
Copy link

Thanks @swankjesse, created issue #6702 to address the possible discussion on it.

@yschimke
Copy link
Collaborator

Closing as low priority. #6702 covers the native image use case.

@koppor
Copy link

koppor commented Jul 10, 2024

I have a similar issue with Eclipse Temurin when using langchain4j:

grafik

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug in existing code
Projects
None yet
Development

No branches or pull requests

7 participants
@JakeWharton @swankjesse @yschimke @koppor @davidfrickert @damien-urruty-sonarsource and others