Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Restructuring #303

Merged
merged 26 commits into from
Oct 30, 2024
Merged

[FIX] Restructuring #303

merged 26 commits into from
Oct 30, 2024

Conversation

elephaint
Copy link
Contributor

@elephaint elephaint commented Oct 25, 2024

  • Put source files in src folder
  • Restructure methods adding Sparse methods to docs
  • Update CI to improve speed of tests
  • Update model performance tests to test more models
  • Docs only update on release or workflow dispatch, not on push to main
  • Move Numba functions to utils, perform eager compilation and micro optimization of lasso function.
  • Use double dtype everywhere; reconcilers are sensitive to low(er) precision, for now unify to double and maybe in the future turn everything to single

Lasso (ERM-reg / ERM-reg_bu) micro optimization before and after (the method is still slow as **** ):

Screenshot 2024-10-29 104009

Screenshot 2024-10-29 103932

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

.github/release-drafter.yml Show resolved Hide resolved
.github/workflows/python-publish.yml Outdated Show resolved Hide resolved
action_files/test_models/src/data.py Show resolved Hide resolved
action_files/test_models/src/evaluation.py Show resolved Hide resolved
action_files/test_models/src/models.py Outdated Show resolved Hide resolved
hierarchicalforecast/core.py Outdated Show resolved Hide resolved
@elephaint
Copy link
Contributor Author

Test for lasso equivalence of methods (removed it from the code to keep it clean):

#| hide
# Lasso test old vs new
@njit
def _lasso_old(X: np.ndarray, y: np.ndarray, 
          lambda_reg: float, max_iters: int = 1_000,
          tol: float = 1e-4):
    # lasso cyclic coordinate descent
    n, feats = X.shape
    norms = (X ** 2).sum(axis=0)
    beta = np.zeros(feats, dtype=np.float32)
    beta_changes = np.zeros(feats, dtype=np.float32)
    residuals = y.copy()

    for it in range(max_iters):
        for i, betai in enumerate(beta):
            # is feature is close to zero, we 
            # continue to the next.
            # in this case is optimal betai= 0
            # print(beta)

            if abs(norms[i]) < 1e-8:
                continue
            xi = X[:, i]
            #we calculate the normalized derivative
            rho = betai + xi.flatten().dot(residuals) / norms[i] #(norms[i] + 1e-3)
            #soft threshold
            beta[i] = np.sign(rho) * max(np.abs(rho) - lambda_reg * n / norms[i], 0.)#(norms[i] + 1e-3), 0.)
            beta_changes[i] = np.abs(betai - beta[i])
            if beta[i] != betai:
                residuals += (betai - beta[i]) * xi
        if max(beta_changes) < tol:
            break
    #print(it)
    return beta


S = np.array([[1., 1., 1., 1.],
              [1., 1., 0., 0.],
              [0., 0., 1., 1.],
              [1., 0., 0., 0.],
              [0., 1., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])
h = 2
_y = np.array([10., 5., 4., 2., 1.])
y_bottom = np.vstack([i * _y for i in range(1, 5)])
y_hat_bottom_insample = np.roll(y_bottom, 1)
y_hat_bottom = np.vstack([i * np.ones(h) for i in range(1, 5)])
idx_bottom = [3, 4, 5, 6]

y_hat=S @ y_hat_bottom
y_insample=S @ y_bottom
idx_bottom=idx_bottom

n_hiers, n_bottom = S.shape

for method in ["reg", "reg_bu"]:
    for lambda_reg in [None, 0.1, 10, 1000]:
        for with_nans in [True, False]:
            y_hat_bottom_insample = np.roll(y_bottom, 1)
            if with_nans:
                y_hat_bottom_insample[:, 0] = np.nan
            y_hat_insample=S @ y_hat_bottom_insample
            y_insample=S @ y_bottom

            nan_idx = np.isnan(y_hat_insample).any(axis=0)
            y_insample = y_insample[:, ~nan_idx]
            y_hat_insample = y_hat_insample[:, ~nan_idx]
            h = min(y_hat.shape[1], y_hat_insample.shape[1])
            y_hat_insample = y_hat_insample[:, -h:] # shape (h, n_hiers)
            y_insample = y_insample[:, -h:]

            if method == 'reg':
                X = np.kron(S, y_hat_insample.T)
                z = y_hat_insample.reshape(-1)

                if lambda_reg is None:
                    lambda_reg = np.max(np.abs(X.T.dot(z)))
                else:
                    lambda_reg = lambda_reg            

                beta_old = _lasso_old(X, z, lambda_reg)
                beta_new = _lasso(X, z, lambda_reg, max_iters=1000, tol=1e-4) 
                np.testing.assert_allclose(beta_new, beta_old, atol=1e-5, rtol=1e-3)

            if method == 'reg_bu':
                X = np.kron(S, y_hat_insample.T)
                Pbu = np.zeros_like(S)
                Pbu[idx_bottom] = S[idx_bottom]
                z = y_hat_insample.reshape(-1) - X @ Pbu.reshape(-1)
                    
                if lambda_reg is None:
                    lambda_reg = np.max(np.abs(X.T.dot(z)))
                else:
                    lambda_reg = lambda_reg             

                beta_old = _lasso_old(X, z, lambda_reg)
                beta_new = _lasso(X, z, lambda_reg, max_iters=1000, tol=1e-4) 
                np.testing.assert_allclose(beta_new, beta_old, atol=1e-5, rtol=1e-3)

Copy link
Member

@jmoralez jmoralez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non blocking comments.

action_files/test_models/src/data.py Show resolved Hide resolved
action_files/test_models/src/evaluation.py Show resolved Hide resolved
action_files/test_models/src/models.py Show resolved Hide resolved
@elephaint elephaint merged commit bf944f2 into main Oct 30, 2024
17 checks passed
@elephaint elephaint deleted the docs-restructuring branch October 30, 2024 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants