[FEAT] Add support for exogenous variables in utils.aggregate #297
Conversation
…exogenous variables within the hierarchical forecast aggregate function. This supports either a single aggregation function or multiple aggregation functions being applied to the same column. The output column will have the format column_aggfunc.
Examples:
aggregate(df, spec, exog_vars = {'avg_price':'mean'})
aggregate(df, spec, exog_vars = {'avg_price':['mean','sum']})
This is a great and valuable contribution. I'd like to first merge #296 so that we can have all the correct tests running. In the meantime, I'll have a look at the PR.
Thanks - nice and simple!
Could you include a unit test at the bottom of utils.ipynb that verifies correct behaviour? For example, checking that it returns the correct aggregations for a small toy dataset over a set of inputs.
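A minimal sketch of such a test, using plain pandas to emulate the expected total-level output of `aggregate` with `exog_vars` (the column names and the `groupby` emulation here are assumptions for illustration, not the PR's actual implementation):

```python
import pandas as pd

# Toy dataset: two series, two dates, one exogenous variable.
df = pd.DataFrame({
    "store": ["s1", "s1", "s2", "s2"],
    "ds": pd.to_datetime(["2024-01-01", "2024-01-02"] * 2),
    "y": [1.0, 2.0, 3.0, 4.0],
    "avg_price": [10.0, 20.0, 30.0, 40.0],
})

# Expected behaviour at the total level: y is summed across series per
# timestamp, while avg_price gets the requested 'mean' aggregation and
# is renamed to avg_price_mean (the column_aggfunc format).
total = (df.groupby("ds")
           .agg(y=("y", "sum"), avg_price_mean=("avg_price", "mean"))
           .reset_index())

assert list(total.columns) == ["ds", "y", "avg_price_mean"]
assert total["y"].tolist() == [4.0, 6.0]
assert total["avg_price_mean"].tolist() == [20.0, 30.0]
```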
Thank you for implementing the recommendations from the old PR.
As @elephaint said, a simple unit test is all it needs.
Going forward, considering the potential scale of hierarchical forecasting problems, we should probably consider swapping out pandas in these sorts of tasks for a library like Polars in single-node environments, or for Ibis, which would give us the flexibility to switch easily between execution engines such as Polars or DuckDB on a single node and PySpark in distributed environments.
Thanks @elephaint and @christophertitchen, I'll have some time later this week to wrap this up. On Polars/DuckDB/PySpark, this would have to be a repository-level change, right? Or would we want to consider modular, self-contained conversions for tasks that are likely to require a fair amount of compute, e.g. aggregations?
I'd first like to move towards Polars support on a repo level, that should be relatively doable. Distributed support seems harder because of all the computations involved with Numpy arrays at the moment.
…ring if you only want to apply one agg_func. Added a unit test to verify aggregate functionality
Apologies, the commit somehow ended up being very messy after merging and syncing all the changes that had happened since. All I was resolving were:
Let me know if this inadvertently caused other issues.
Thanks, I think the PR is fine, but I observed some merge conflict issues in
@KuriaMaingi Thank you for your contribution! We appreciate it very much.
This change to the utility function will assist in instances where you need to generate your summing matrix and Y_df but also want to retain any exogenous variables required for your forecast.
You will need to pass in a dictionary mapping your exogenous variables to the pandas agg functions you want applied to them. The latter can be either a single string or a list of strings if you want to apply multiple agg functions to the same column.
Each output column will have the format column_aggfunc.
Examples:
`aggregate(df, spec, exog_vars = {'avg_price':'mean'})` → returns a new column called "avg_price_mean"
`aggregate(df, spec, exog_vars = {'avg_price':['mean','sum']})` → returns 2 new columns called "avg_price_mean" & "avg_price_sum"
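The column_aggfunc naming described above can be emulated with plain pandas; this sketch is an illustration of the intended behaviour (the `exog_vars` normalisation and MultiIndex flattening shown here are assumptions, not necessarily how the PR implements it):

```python
import pandas as pd

df = pd.DataFrame({
    "store": ["s1", "s1", "s2", "s2"],
    "y": [1.0, 2.0, 3.0, 4.0],
    "avg_price": [10.0, 20.0, 30.0, 40.0],
})

# exog_vars maps each exogenous column to one agg function or a list of them.
exog_vars = {"avg_price": ["mean", "sum"]}

# Normalise single strings to lists so both call styles are handled uniformly.
agg_spec = {col: (fns if isinstance(fns, list) else [fns])
            for col, fns in exog_vars.items()}

# Aggregate per hierarchy node; pandas returns MultiIndex columns, which we
# flatten into the column_aggfunc format.
out = df.groupby("store").agg(agg_spec)
out.columns = [f"{col}_{fn}" for col, fn in out.columns]

print(list(out.columns))  # ['avg_price_mean', 'avg_price_sum']
```

Passing `{'avg_price': 'mean'}` through the same normalisation would yield the single column `avg_price_mean`, matching the first example above.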