As promised in #73, I'd like to share some sample code I've been working on and the results I've obtained. Please note that this is not reproducible elsewhere yet: it depends on work-in-progress branches I have not just in Dask-GLM, but also in Dask, CuPy and NumPy, so it will take at least a few more weeks until it is publicly reproducible. We're actively working with all of these projects to have the fixes and features integrated as soon as possible.
import cupy
import dask_glm.estimators

# x from 0 to 300
x = 300 * cupy.random.random((10000, 1))

# y = a*x + b with noise
y = 0.5 * x + 1.0 + cupy.random.normal(size=x.shape)

# create a linear regression model
est = dask_glm.estimators.LinearRegression(fit_intercept=False, solver='proximal_grad')
est.fit(x, y)
Here's the timing output of the last line alone (proximal_grad solver, CuPy backend):
CPU times: user 357 ms, sys: 99.2 ms, total: 456 ms
Wall time: 468 ms
Using gradient_descent as solver:
CPU times: user 985 ms, sys: 417 ms, total: 1.4 s
Wall time: 1.4 s
And using the lbfgs solver with NumPy (not CuPy!), which was the fastest solver with NumPy as the backend:
CPU times: user 3.53 ms, sys: 311 µs, total: 3.84 ms
Wall time: 2.35 ms
And finally, using sklearn.linear_model.LinearRegression with all default arguments except for fit_intercept (set to False, just like in the previous examples):
CPU times: user 2.31 ms, sys: 333 µs, total: 2.64 ms
Wall time: 1.59 ms
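The timings above look like IPython `%time` output. For anyone who wants to collect comparable wall-clock numbers outside IPython, here is a minimal sketch of a timing helper. The `timed` function and its `sync` parameter are hypothetical names introduced for illustration, not part of Dask-GLM or CuPy. The `sync` hook is there because GPU kernels are launched asynchronously from the host, so it may be worth waiting for pending device work before reading the clock; whether that changes the numbers reported here is an open question.

```python
import time


def timed(fn, *args, sync=None, **kwargs):
    """Run fn(*args, **kwargs) and return (result, wall_seconds).

    `sync` is an optional zero-argument callable (hypothetical hook) that is
    called right before the clock starts and right before it stops, e.g. to
    wait for outstanding GPU work.
    """
    if sync is not None:
        sync()  # make sure earlier asynchronous work is not billed to fn
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()  # wait for work launched by fn before stopping the clock
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # CPU-only demonstration; no synchronization hook is needed here.
    total, seconds = timed(sum, range(1_000_000))
    print(f"sum={total}, wall time: {seconds * 1e3:.2f} ms")
```

With CuPy, one could presumably pass `sync=cupy.cuda.Stream.null.synchronize` and call `timed(est.fit, x, y, sync=...)`, assuming the default stream is the one in use.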
I believe sklearn may differ in its default algorithm or default parameters. Does anyone know if that's the case, and how we could get more of an apples-to-apples comparison?
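One likely source of the gap: sklearn's LinearRegression solves ordinary least squares with a direct solver, while the Dask-GLM solvers used above are iterative. The sketch below is a pure-Python stand-in, not the code of either library. It fits the same kind of data (y = 0.5*x + 1.0 + noise, no intercept) once in closed form and once by fixed-step gradient descent. The function names, the step size and the tolerance are assumptions picked for this illustration.

```python
import random


def make_data(n=10000, seed=0):
    """Generate y = 0.5*x + 1.0 + noise with x uniform in [0, 300)."""
    rng = random.Random(seed)
    xs = [300.0 * rng.random() for _ in range(n)]
    ys = [0.5 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_direct(xs, ys):
    """Closed-form least-squares slope without intercept: sum(x*y) / sum(x*x)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


def fit_gradient_descent(xs, ys, step=1e-5, tol=1e-10, max_iter=1000):
    """Minimize (1/2n) * sum((a*x - y)**2) over the slope a.

    Every iteration makes one full pass over the data to compute the
    gradient. Returns (slope, iterations_used).
    """
    n = len(xs)
    a = 0.0
    for it in range(1, max_iter + 1):
        grad = sum((a * x - y) * x for x, y in zip(xs, ys)) / n
        a_new = a - step * grad
        if abs(a_new - a) < tol:  # stop once the update is negligible
            return a_new, it
        a = a_new
    return a, max_iter


if __name__ == "__main__":
    xs, ys = make_data()
    a_direct = fit_direct(xs, ys)
    a_gd, iters = fit_gradient_descent(xs, ys)
    print(f"direct slope: {a_direct:.6f}")
    print(f"gradient descent slope: {a_gd:.6f} after {iters} passes over the data")
```

Both fits reach essentially the same slope, but the iterative one needs many passes over the data where the direct one needs a single pass, so timing them against each other measures the algorithm at least as much as the backend. Note also that without an intercept the constant 1.0 is absorbed into the slope, so the fitted value ends up slightly above 0.5. A fairer comparison would probably pin the same solver family, tolerance and iteration limit on both sides.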
It's worth mentioning that I intentionally used fit_intercept=False, because setting it to True still depends on fixes for existing issues in Dask. I didn't present output for the other Dask-GLM solvers with the CuPy backend because they are not working yet, see #73.