Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix single row prediction performance in a multi-threaded environment #6021

Closed
Ten0 opened this issue Aug 4, 2023 · 2 comments · Fixed by #6024
Closed

Fix single row prediction performance in a multi-threaded environment #6021

Ten0 opened this issue Aug 4, 2023 · 2 comments · Fixed by #6024

Comments

@Ten0
Copy link
Contributor

Ten0 commented Aug 4, 2023

Summary

Single row prediction API takes a write lock on the Booster. This breaks performance when using the same booster in a multi-threaded environment.

Motivation

The use case is a complex algorithm processing millions of rows and sometimes needing to run a prediction.
The computation is already parallelized at the "millions-of-rows" level.
It needs to use the booster to (sometimes) generate a row and run the prediction on it.
However, currently that performs very poorly due to global contention here:

UNIQUE_LOCK(mutex_)

Description

It seems that this write lock is required due to the fact that the Predictor writes inside itself as it predicts (#3771), and that the Predictor for single row predictions is stored inside the Booster, unlike the Predictor for batch prediction functions, which is instantiated locally by the Predict function, allowing locking to be shared.

Now the predictor seems to be stored inside the booster in order to be re-usable across calls, to make the single predictions through LGBM_BoosterPredictForMatSingleRow reasonably inexpensive: this was implemented before it became recommended to make that a two-step process by using LGBM_BoosterPredictForMatSingleRowFastInit instead.
HOWEVER, as of the introduction of LGBM_BoosterPredictForMatSingleRowFastInit in #2992, we now have a "SingleRowPredictor" object that is supposed to be instantiated and preserved across calls to predict single rows with LGBM_BoosterPredictForMatSingleRowFast: that is FastConfigHandle. (Probably would have been better to name it SingleRowPredictorHandle as part of the public API but I guess that might be a little late now?)

This means that at the moment, resources necessary for LGBM_BoosterPredictForMatSingleRowFast is are split between two places:

The idea is: put all the resources that need to be initialized once before running single row predictions behind the FastConfigHandle object (aka SingleRowPredictorHandle), and take a write lock on that object, instead of the full Booster, when running predictions.

This way, when processing on multiple threads, each thread can make its call to LGBM_BoosterPredictForMatSingleRowFastInit, then work without contention:

  • Removes the need for taking a global write lock when predicting in single row mode
  • Simplifies the booster by removing the single_row_predictor_ array (should it have been named single_row_predictors instead btw? In any case, on SingleRowPredictorHandle aka FastConfigHandle we only need a single Predictor object), and removing the SetSingleRowPredictor, PredictSingleRow functions
    • Specifically for this, there are two approaches:
      1. Make the LGBM_BoosterPredictForMatSingleRow instantiate its FastConfig at the beginning of the call. This is basically what it used to do but it didn't hold the predictor, so that would probably decrease performance for users that currently should use LGBM_BoosterPredictForMatSingleRowFastInit but use LGBM_BoosterPredictForMatSingleRow instead.
      2. Make the LGBM_BoosterPredictForMatSingleRow keep using a single row predictor stored in the booster object, but that would be distinct from the one used by LGBM_BoosterPredictForMatSingleRowFast. This has the downside of being less clean, because we would need to leave the current single_row_predictor_, SetSingleRowPredictor, PredictSingleRow functions in Booster, however that maintains the current performance of LGBM_BoosterPredictForMatSingleRow for all users.

cc @AlbertoEAF

@Ten0
Copy link
Contributor Author

Ten0 commented Aug 4, 2023

@jameslamb Wouldn't the appropriate label be efficiency (and maybe feature request) rather than question ?

@jameslamb
Copy link
Collaborator

Sure.

Ten0 added a commit to Ten0/LightGBM that referenced this issue Aug 6, 2023
Fixes microsoft#6021

- Store all resources that are reused across single-row predictions in the dedicated `SingleRowPredictor` (aka `FastConfig`)
- Use that instead of resources in the `Booster` when doing single row predictions to avoid having to lock the `Booster` exclusively.
- A FastConfig being alive now takes a shared lock on the booster (it was likely very incorrect to mutate the booster while this object was already built anyway)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants