leading and trailing def #417

Closed
vergauwenthomas opened this issue Jan 23, 2024 · 1 comment · Fixed by #467
Labels
enhancement New feature or request

Comments

@vergauwenthomas
Owner

Proposal by @amberJ99

Problem: The leading and trailing periods are selected by a fixed number of observations (debias_pref_sample_size_leading) rather than by time deltas. This can give poor results when a large (seasonal) gap falls within the leading or trailing period.

Proposal:
Treat the leading/trailing min/max settings as time deltas (so at most 30 days from the gap), and keep the minimum criterion as a minimum number of observations.

    # Select all leading and all trailing obs
    leading_period = obs[obs["datetime"] < gap.startgap]
    trailing_period = obs[obs["datetime"] > gap.endgap]
    logger.debug(f'   {leading_period.shape[0]} leading records, {trailing_period.shape[0]} trailing records.')

    # some derived integers
    poss_shrinkage_leading = leading_period.shape[0] - debias_min_sample_size_leading
    poss_shrinkage_trailing = trailing_period.shape[0] - debias_min_sample_size_trailing
    poss_extention_leading = leading_period.shape[0] - debias_pref_sample_size_leading
    poss_extention_trailing = (
        trailing_period.shape[0] - debias_pref_sample_size_trailing
    )

    # check if desired sample sizes for leading and trailing are possible
    if (leading_period.shape[0] >= debias_pref_sample_size_leading) & (
        trailing_period.shape[0] >= debias_pref_sample_size_trailing
    ):
        logger.debug("leading and trailing periods are both available for debiassing.")
        # both periods are OK
        leading_df = leading_period[-debias_pref_sample_size_leading:]
        trailing_df = trailing_period[:debias_pref_sample_size_trailing]
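
For comparison, the proposal could be sketched roughly as follows. This is only an illustration, not the code that ended up in the linked pull request: the helper name `select_debias_periods`, its parameter names, and the minimum of 5 records are assumptions, and only the 30-day window comes from the proposal text. The window is bounded by a time delta, while the minimum criterion remains a record count.

```python
import pandas as pd


def select_debias_periods(
    obs,
    gap_start,
    gap_end,
    max_leading=pd.Timedelta(days=30),   # window size taken from the proposal
    max_trailing=pd.Timedelta(days=30),
    min_leading_obs=5,                   # assumed value, for illustration only
    min_trailing_obs=5,
):
    """Hypothetical helper: select leading/trailing records by time window."""
    dt = obs["datetime"]
    # Keep only records within the allowed time delta around the gap.
    leading = obs[(dt < gap_start) & (dt >= gap_start - max_leading)]
    trailing = obs[(dt > gap_end) & (dt <= gap_end + max_trailing)]
    # The minimum criterion is still expressed as a number of observations.
    ok = len(leading) >= min_leading_obs and len(trailing) >= min_trailing_obs
    return leading, trailing, ok


# Toy data: two days in January, then a seasonal hole, then one day on
# each side of a one-day gap in June (hourly records).
hour = pd.Timedelta(hours=1)
times = (
    pd.date_range("2023-01-01", periods=48, freq=hour)
    .append(pd.date_range("2023-06-01", periods=24, freq=hour))
    .append(pd.date_range("2023-06-03", periods=24, freq=hour))
)
obs = pd.DataFrame({"datetime": times})

leading, trailing, ok = select_debias_periods(
    obs,
    gap_start=pd.Timestamp("2023-06-02 00:00"),
    gap_end=pd.Timestamp("2023-06-02 23:00"),
)
print(len(leading), len(trailing), ok)
```

With a count-based preferred sample size of 48, the leading sample in this toy data would reach back into January; the time-delta window keeps only the June records.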

@vergauwenthomas
Owner Author

  • increase the default leading and trailing size

@vergauwenthomas vergauwenthomas added the enhancement New feature or request label Jan 23, 2024
@vergauwenthomas vergauwenthomas linked a pull request Mar 11, 2024 that will close this issue
@vergauwenthomas vergauwenthomas linked a pull request Aug 2, 2024 that will close this issue