
Force dtype np.float64 for optimization while data dtype is np.float32 #843

Open · mlondschien opened this issue Sep 20, 2024 · 1 comment


@mlondschien (Contributor)

We're running into quite a few problems with optimization in float32. On small batches, these go away if we .astype(np.float64) our data before calling glum.GeneralizedLinearRegressor.fit; this also makes the algorithm much faster for some reason. However, we cannot afford the float32 -> float64 conversion of the entire dataset due to memory constraints.

Is there an option to do the optimization in glum (i.e., probably, coef and the current hessian estimate) in float64 even if the data itself is float32?
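
For concreteness, a minimal sketch of the workaround described above (shapes are made up for illustration):

import numpy as np
from glum import GeneralizedLinearRegressor

# Data is kept in float32 to fit in memory
X = np.random.rand(1_000_000, 100).astype(np.float32)
y = np.random.rand(1_000_000)

model = GeneralizedLinearRegressor()
# Upcasting fixes the optimization problems and is faster,
# but temporarily doubles the memory footprint of X
model.fit(X.astype(np.float64), y)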

@stanmart (Collaborator) commented Sep 20, 2024

I don't think tabmat can handle a mismatch in dtypes. For example,

import tabmat as tm
import numpy as np

# float32 matrix
X = tm.DenseMatrix(
    np.random.rand(1000, 10).astype(np.float32),
)
# float64 vector: dtype does not match X
d = np.random.rand(1000).astype(np.float64)

# X.T @ diag(d) @ X -- raises because of the dtype mismatch
X.sandwich(d)

fails with

[...]
File src/tabmat/ext/dense.pyx:29, in tabmat.ext.dense.dense_sandwich()
ValueError: Buffer dtype mismatch, expected 'double' but got 'float'
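
For comparison, the same call goes through once the dtypes match (a sketch continuing the snippet above; it assumes sandwich only requires the vector dtype to equal the matrix dtype):

# Casting d down to X's dtype makes the sandwich product work
out = X.sandwich(d.astype(np.float32))
print(out.shape)  # (10, 10)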

So while it would be a nice feature, I'm afraid it would require non-trivial changes to tabmat -- in particular, to the parts written in C++.


Edit: fix dimensions
