Need PredictionFunction to provide object-pooling or to be thread-safe #1718
@eerhardt @TomFinley @Zruty0 could you please discuss a concrete action path for this issue so it can be triaged?
Another option is to use a `[ThreadStatic]` field. However, you need to be careful not to cache the variable across async boundaries, or in other places where you may jump between threads. If you did that, you could end up with multiple threads using the same PredictionFunction.
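A minimal sketch of this option, written against the 0.x-era API referenced in this thread; the `SampleInput`/`SampleOutput` types and the startup initialization of the model are hypothetical placeholders:

```csharp
public static class PerThreadPrediction
{
    // Assumed to be initialized once at startup; the loaded model
    // (ITransformer) is thread-safe, so sharing it across threads is fine.
    private static MLContext _mlContext;
    private static ITransformer _model;

    // One prediction function per thread. Never cache it in a local across
    // an await: the continuation may resume on a different pool thread, and
    // two threads could end up sharing one non-thread-safe instance.
    [ThreadStatic]
    private static PredictionFunction<SampleInput, SampleOutput> _function;

    public static SampleOutput Predict(SampleInput input)
    {
        // Lazy per-thread creation: [ThreadStatic] fields cannot rely on a
        // static initializer, since that would run on only one thread.
        if (_function == null)
            _function = _model.MakePredictionFunction<SampleInput, SampleOutput>(_mlContext);

        return _function.Predict(input);
    }
}
```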
Hi @shmoradims and @CESARDELATORRE, and thank you @CESARDELATORRE for such a complete description of your problem. Stateful objects in libraries that are not somehow intrinsically thread-safe generally should not handle their own synchronization, because what constitutes "thread safe" is very much an application-specific question. Complicating this further, this is especially true in ML.NET, where these stateful objects can themselves sometimes be quite substantial in terms of memory/compute footprint (e.g., the intermediate buffers needed for something like evaluating a DNN). There's also, of course, the issue of the prediction function's own internal state.

This is why so few stateful items in .NET handle their own synchronization. Indeed, the only exceptions I am aware of are those basic primitive structures that exist to enable multithreading scenarios; but as far as a simple object itself being thread-safe? Would anyone ever try to make an ordinary collection synchronize itself by default?

This is quite evident in this specific scenario: I think what might serve the interests of this project would be to register a singleton object pool that creates/holds prediction functions, e.g., make the object pool a singleton, and have the serving method (1) fetch an instance, (2) use it, (3) return it.

...Unless the model in question is somehow a time-series model that is stateful (e.g., the scenario we're trying to unblock with the work in #1618), in which case a singleton instance of a function with a global lock might be appropriate, maybe.

...Unless ASP.NET eventually gets the ability to run something akin to Django worker servers (assuming I remember how that works correctly; it's been some years), which would be even better than what ASP.NET does now, since then there could be some "worker" with a bunch of assets it personally owns, and it could just natively do setup and teardown itself.

...Unless what you're doing has nothing whatsoever to do with ASP.NET or web services at all, e.g., batch processing of a large file with many instances, in which case, even if you wanted multithreading, you could just spawn a bunch of separate objects and hand them to separate tasks.

...Unless the model happens to be something like evaluating DNNs, where it would be more harmful than helpful to try to achieve throughput through multithreading, and batch prediction would be a better approach.

The point, of course, is that this sort of object is not made thread-safe because what constitutes thread safety is very, very application- and model-dependent. This is quite obvious in this case, as we see from the list above. (And since writing it, I see @eerhardt has come up with yet another way to do it.) You'll note that a significant part of the issue is steeped, both implicitly and explicitly, in the precise method by which ASP.NET itself divvies up work across multiple threads. This is a huge red flag for anyone designing an API that they'd best not get involved, since if things were to change just a little bit, or we had a different scenario, a totally different solution would be appropriate. Indeed, the .NET team itself learned this lesson with their collections once upon a time: they used to ship collections that handled their own synchronization, and came to regret it.

Even now I occasionally run across a library that, with the best of intentions, tried to handle synchronization for the user. E.g., an implementation of object pooling at some level of their stack, for everyone's "convenience" -- followed immediately, inevitably, by a lengthy description of how and why to turn it off for the 75% of scenarios where such a simplistic approach is insufficient or leads to bad perf. We did in fact make precisely this same mistake in ML.NET v0.1, where we made our "prediction models" thread-safe. What a disaster that was.

So, the action item might be one of education. Maybe we explain which objects are and are not thread-safe, maybe even go into why. (Though, on the other hand, this seemed to be known in this case?) So maybe something for the documentation folks to take up? I'm not sure.
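A minimal sketch of that singleton-pool idea, assuming the same hypothetical `SampleInput`/`SampleOutput` types used elsewhere in this thread; `ConcurrentBag` is just one convenient container for the idle instances:

```csharp
using System;
using System.Collections.Concurrent;

public class PredictionFunctionPool
{
    private readonly ConcurrentBag<PredictionFunction<SampleInput, SampleOutput>> _idle =
        new ConcurrentBag<PredictionFunction<SampleInput, SampleOutput>>();

    // The factory wraps the "expensive" MakePredictionFunction() call.
    private readonly Func<PredictionFunction<SampleInput, SampleOutput>> _factory;

    public PredictionFunctionPool(Func<PredictionFunction<SampleInput, SampleOutput>> factory)
    {
        _factory = factory;
    }

    public SampleOutput Predict(SampleInput input)
    {
        // (1) Fetch an instance, creating one only when the pool is empty.
        if (!_idle.TryTake(out var function))
            function = _factory();

        try
        {
            // (2) Use it on this thread only.
            return function.Predict(input);
        }
        finally
        {
            // (3) Return it to the pool for the next request.
            _idle.Add(function);
        }
    }
}
```

Registered as a DI singleton, a pool like this amortizes the creation cost across requests without putting a global lock around `Predict()`.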
@eerhardt - Agree, but the problem is not just to protect/lock the prediction function. The other problem is "how to re-use" the prediction function so the expensive MakePredictionFunction() call doesn't have to happen on every Http request. My point is that in most of the cases, for most Http requests, with either [ThreadStatic] or Scoped in DI/IoC, the MakePredictionFunction() method will still need to be called. What we need is a way to share PredictionFunction objects across Http requests, and the two ways that I see are these: an object pool of PredictionFunction objects, or making PredictionFunction itself thread-safe.
About the ...Unless cases mentioned by Tom: sure, there might be cases like those where you don't need it, but a large % of .NET applications and developers will be scoring the model in .NET web apps or Web API services, where our main guidance is ASP.NET Core, so we need a prescriptive way of doing it efficiently. Additionally, it could also be improved from an "easier to use" point of view by simplifying it: if the model object and the object pooling were under a single object that can be a singleton, it would be a lot easier for the web developer to score a model by interacting with a single object. Right now, the developer has to differentiate between the model object and the prediction-function object, which have different recommended lifetimes (singleton vs. scoped/thread-safe). For sure, the current implementation to be done in ASP.NET Core is not straightforward. PRIORITY?
ASP.NET pools threads by default. A new thread isn't created to service each Http request. That would be a massive overhead.
I imagine it's pretty clear by now just from this discussion, @CESARDELATORRE, why we won't be doing this, but just to be explicit: you and @eerhardt are having a discussion about how best to synchronize this resource for just this one library, ASP.NET. And that's ignoring the fact, again, that this particular scheme would not be appropriate for the different classes of models in the set I just enumerated. (In particular: I could see how either of your approaches might be situationally correct, depending on the circumstances. Or situationally incorrect.) Could we imagine a one-size-fits-all, or even one-size-fits-most, solution to this problem?

So, at the risk of stating the obvious, ML.NET absolutely did not invent the concept of an object that would benefit from pooling and sharing. ML.NET is certainly not the only library we'd expect ASP.NET to call. If ASP.NET handles this sort of poolable, sharable resource as awfully as your message seems to indicate (does it really?), this is something for them to fix, not something for every library that might conceivably be called through them to fix for them.
@eerhardt - We can have a discussion off-line about [ThreadStatic] and the async-boundary caveats. I'm saying that because in an ASP.NET Core app the thread pool has, by default, up to 32,767 worker threads available. Effectively, until the app has been used quite a lot, the probability of getting a re-used thread that has already created the PredictionFunction object is kind of low. In many cases MakePredictionFunction() will be called when an Http request arrives, because with 32,767 threads available in the pool, the chances of getting a re-used thread that already created a prediction function stay low until the application has been running for a significant time. If we had an object pool of PredictionFunction objects, it would be vice versa: most of the threads would take an already available PredictionFunction object from the pool.

There's an existing case in ASP.NET Core with HttpClientFactory, which uses an object pool of HttpMessageHandler objects. It is a comparable case because of the issues that arise when HttpClient is used as a Singleton or as Transient, and the best choice there was object pooling.

@TomFinley - About "just this one library ASP.NET": well, it is not about whether it is a single library (which it is not; ASP.NET is a higher-level framework on top of .NET). The important point is the % of .NET developers using ASP.NET, and ASP.NET means a very large portion of development in .NET, and an even larger % of new/modern .NET development. About "ASP.NET handles this sort of poolable sharable resources as awfully", I don't understand why you say that. Both thread pooling and DI/IoC work very well in ASP.NET Core. The issue is how to efficiently reuse the non-thread-safe PredictionFunction across Http requests, and I think this problem will happen with any web framework in .NET, whether ASP.NET Core, Nancy, etc. I don't believe the built-in DI lifetimes alone solve it.

In any case, this feature might not be a priority; I just want to highlight the possibilities for improvement. :)
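For reference, the HttpClientFactory pattern mentioned above is enabled with a single registration in ASP.NET Core 2.1+; the pooling of `HttpMessageHandler` instances happens behind this call:

```csharp
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // IHttpClientFactory pools and recycles the underlying
        // HttpMessageHandler instances, so handing out HttpClient
        // objects per request stays cheap and safe.
        services.AddHttpClient();
    }
}
```

Controllers then take an `IHttpClientFactory` dependency and call `CreateClient()`; the analogy here would be a factory/pool that hands out prediction functions the same way.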
Great discussions everyone. Based on all the comments, two conclusions seem clear to me:

1. Making PredictionFunction thread-safe by itself is not the right design, for the reasons @TomFinley describes.
2. We still need a prescriptive, documented pattern for reusing prediction functions across Http requests, and object pooling looks like the best candidate.

I suggest the following action items:

1. Implement the object-pooling approach in our sample applications.
2. Document it on docs.microsoft.com and gather user feedback.
@shmoradims - I was just chatting with Eric about this. When talking with the ASP.NET team, they also told me some time ago that object pooling was the approach they ended up taking for comparable cases such as HttpClientFactory.

In any case, this issue is not a blocking issue. It is just an area I think we should investigate more in order to get the best developer experience in the future for one of the most important workloads in .NET, and certainly the most important workload in .NET Core (ASP.NET Core means 90% of apps in .NET Core). I agree that for now the best action is to implement the object-pooling approach in our sample applications, document it on docs.microsoft.com, and get feedback from users.

I personally think it is not a very straightforward experience that, in order to just score a model in a web app or Http service, you need to understand multiple concepts that can be complex and even confusing when exploring for the first time: the model object vs. the prediction-function object, which of them is thread-safe, the DI/IoC lifetimes (singleton vs. scoped), and object pooling.
I think the experience should be simplified in the future, but for now, I agree: let's implement sample code with the object-pooling approach, document it in docs.microsoft.com, and when we get further feedback from customers, we can evolve accordingly. 👍
@CESARDELATORRE I created a concrete work item #1789. I assigned it to you and project 0.9, but feel free to change those. Based on the feedback we get, we can revisit this issue. @TomFinley @eerhardt thanks for all your comments.
@shmoradims - I just implemented sample code on how to use object pooling for the PredictionFunction / PredictionEngine objects in this PR for the ASP.NET Core web app eShopDashboardML, here:

However, after creating the code, which I find very generic and usable for any end-user app while at the same time being significant code to create/understand, I still think this kind of implementation/code should be part of the ML.NET API, not really part of the user's code. We need to have a further discussion about it, reviewing the code I wrote for this approach, and since I still think this should be part of the ML.NET API, I'm re-opening the issue, ok? 👍
@eerhardt does the PredictionEnginePool resolve this issue?
Yes. I believe this can now be closed.
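For readers arriving later: the `PredictionEnginePool` mentioned above ships in the `Microsoft.Extensions.ML` package. A minimal sketch of its usage (the type names and the model path are placeholders):

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.ML;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        // Registers a pool of prediction engines built from the model file;
        // callers borrow an engine per call and the pool takes it back.
        services.AddPredictionEnginePool<SampleInput, SampleOutput>()
                .FromFile("MLModels/model.zip");
    }
}

// In a controller, inject PredictionEnginePool<SampleInput, SampleOutput>
// and call pool.Predict(input); pooling and thread safety are handled inside.
```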
This issue is suggesting an enhancement, since the current approaches/workarounds both have drawbacks, meaning we're not getting scalability and performance as good as they could be in multithreaded scenarios like ASP.NET Core web apps or Web API services.
Context:
In multiple scenarios, but especially in ASP.NET Core web apps, the model object (ITransformer) and the prediction-function object (PredictionFunction) should be re-used, because they are "expensive" objects to initialize, and that cost will impact the app when there are many concurrent users.
In the case of the model object (ITransformer), it is a thread-safe object, so the model object loaded from the .ZIP file can be registered as a singleton. That way it'll be re-used by any thread or Http request in the ASP.NET Core app. See this code as an example: https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L53
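A minimal sketch of that singleton registration (the model path is a placeholder, and the `mlContext.Model.Load(Stream)` overload is assumed from the ML.NET version current at the time of this issue):

```csharp
using System.IO;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.ML;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        var mlContext = new MLContext();

        // Load the trained model once at startup. ITransformer is thread-safe,
        // so this single instance can be shared by every thread / Http request.
        ITransformer model;
        using (var stream = File.OpenRead("MLModels/model.zip"))
        {
            model = mlContext.Model.Load(stream);
        }

        services.AddSingleton(model);
    }
}
```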
Main issue with the Prediction Function:
The prediction function is also "expensive", as it takes significant milliseconds to create the prediction-function object from the model object with the `model.MakePredictionFunction()` method. Ideally, because it is "expensive", it should be re-used across executions in the same app. But since it is not thread-safe, the current options in an ASP.NET Core web app are the following:
- OPTION 1: Register the PredictionFunction object as Scoped (`AddScoped()`) for its DI/IoC object lifetime, as in this code: https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L61
The problem with this approach is that in many cases you don't get much benefit unless you make multiple calls to the `.Predict()` method within the same Http request. Since the Scoped lifetime is only re-used within a single Http request, most Http requests will still need to call the `model.MakePredictionFunction()` method. The only way to share an object across Http requests in .NET Core DI/IoC is with a singleton, but that requires the object to be thread-safe. The possible service/object lifetimes in .NET Core IoC/DI are Transient, Scoped, and Singleton; see the available object lifetimes here:
https://docs.microsoft.com/en-us/aspnet/core/fundamentals/dependency-injection?view=aspnetcore-2.1#service-lifetimes
https://blogs.msdn.microsoft.com/cesardelatorre/2017/01/26/comparing-asp-net-core-ioc-service-life-times-and-autofac-ioc-instance-scopes/ (Autofac does support a thread scope, though.)
- OPTION 2: Register the PredictionFunction object as Singleton (`AddSingleton()`) for its DI/IoC object lifetime, as in this code (commented line): https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Startup.cs#L60
Benefits: If you register the prediction-function object as Singleton for its ASP.NET Core DI/IoC object lifetime, then, since it is a Singleton, any new Http request (except the first one since the app started) will just use the prediction function by calling the `Predict()` method. Therefore the performance of a single Http request will be significantly better.
Problem: However, since the prediction function is not thread-safe, you need a mechanism such as a critical section to lock the prediction-function object so that only a single thread uses it when executing the `Predict()` method, as in this code (commented line): https://github.com/dotnet/machinelearning-samples/blob/master/samples/csharp/end-to-end-apps/Regression-SalesForecast/src/eShopDashboard/Controllers/CountrySalesForecastController.cs#L57
The issue with this approach is that, since you are limiting the use of `Predict()` to a single thread, the lock becomes a bottleneck when scaling. When there are many concurrent Http requests, many of them trying to run the `Predict()` method, scalability won't be as good as it could be, because only one thread at a time is able to run `Predict()`. Basically, with this approach you might be significantly limiting the scalability of the app (in regards to the model execution/scoring) across threads when handling many Http requests. A minimal sketch of both registrations follows below.
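To make the two options concrete, here is a minimal registration sketch. The `SampleInput`/`SampleOutput` types are hypothetical placeholders, and the `MakePredictionFunction` call follows the 0.x-era API referenced in this issue (with the `MLContext` passed as the environment, as the linked samples do); this is a sketch, not the samples' exact code:

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.ML;

public class Startup
{
    // Placeholder: created once at startup; the loaded ITransformer model is
    // assumed to be registered as a singleton, as shown earlier in this issue.
    private readonly MLContext _mlContext = new MLContext();

    public void ConfigureServices(IServiceCollection services)
    {
        // OPTION 1: Scoped -- a fresh prediction function per Http request,
        // so the "expensive" MakePredictionFunction() runs on nearly every request.
        services.AddScoped(sp =>
            sp.GetRequiredService<ITransformer>()
              .MakePredictionFunction<SampleInput, SampleOutput>(_mlContext));

        // OPTION 2: Singleton -- created once and shared by all requests.
        // Since it is not thread-safe, every call site must serialize access:
        //
        //     lock (_predictionLock)
        //     {
        //         var prediction = _predictionFunction.Predict(input);
        //     }
        //
        // which turns Predict() into the scalability bottleneck described above.
    }
}
```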
Workarounds: Currently, use either of the two methods above, being aware of the drawbacks of each.
Long term solutions:
I see two possible solutions:
- Create an object pool of "prediction function objects". That way, most Http requests won't need to call the "expensive" `.CreatePredictionFunction()` method, while remaining scalable, since many threads could use the multiple objects available in the pool.
- Make the prediction function thread-safe. If the prediction function were thread-safe while still scalable enough, it could simply be registered as Singleton in the DI/IoC system in ASP.NET Core, without needing a critical section or comparable approaches.
Is there any other possible approach available?
Once these scenarios are further discussed in this thread or in our API reviews, I'll document the current possible approaches for consuming/running an ML.NET model in an ASP.NET Core web app or Web API service.
Related issues:
#421