Add subscan operations to preprocess #1028

Merged · 10 commits merged into master · Dec 4, 2024

Conversation

earosenberg (Contributor):

Add basic statistics, PSDs, and glitches on subscans.

@earosenberg marked this pull request as ready for review November 21, 2024 16:56
@earosenberg requested a review from msilvafe November 21, 2024 16:56

msilvafe (Contributor) left a comment:

Thanks @earosenberg, it'll be really nice to have this infrastructure for subscan operations. I put inline comments reflecting the conversation we had on Friday, plus a few more things as I went through it more closely.

Comment on lines 281 to 291
if (ii == 0) and isinstance(out[k], RangesMatrix): # Treat like dets
if a in full._axes:
_, fs, ns = full[a].intersection(new[a], return_slices=True)
else:
fs = range(new[a].count)
ns = range(new[a].count)
oidx.append(fs)
nidx.append(ns)
else: # Treat like samps
oidx.append(slice(None))
nidx.append(slice(None))

msilvafe (Contributor):

Can you expand the inline comment to explain what this is for and why you added it, so that when something breaks and I come back here, I can remind myself why this looks so weird?
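
A possible expansion of that comment, purely as a sketch; this is my reading of the snippet above, not a confirmed description of its intent:

# RangesMatrix fields can only be row-indexed along their leading (dets-like)
# axis, so for ii == 0 we build explicit index lists from the intersection of
# the full and new axes; every other axis keeps slice(None), and the Ranges
# objects carry the samps restriction themselves.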

Comment on lines 720 to 723
subscan_aman.wrap('start_time', tt[start_inds], [(0, 'subscans')])
stop_time = aman.timestamps[ss_ind[:-1,1]]
last_time = [2*tt[-1] - tt[-2] if end_inds[-1] == tt.size else tt[end_inds[-1]]]
subscan_aman.wrap('stop_time', np.concatenate([stop_time, last_time]), [(0, 'subscans')])

msilvafe (Contributor):

It doesn't seem like you use these (start/stop_time) anywhere, and I don't think we want to wrap them: in general they can change if you restrict samples (which will be reflected in subscan_flags), and the corresponding start/stop times can always be computed from those flags and the timestamps, since both have axes aligned with samps.
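
For reference, a minimal sketch of that computation, assuming subscan_info.subscan_flags is a RangesMatrix shaped (subscans, samps) and that each Ranges exposes its intervals via .ranges(); the names and shapes are assumptions, not taken from this PR:

import numpy as np

def subscan_start_stop(aman):
    # Hypothetical helper: derive per-subscan start/stop times on the fly from
    # the flags and timestamps instead of wrapping them into the AxisManager.
    tt = aman.timestamps
    flags = aman.subscan_info.subscan_flags     # assumed (subscans, samps) RangesMatrix
    starts, stops = [], []
    for i in range(flags.shape[0]):
        intervals = flags[i].ranges()           # (n, 2) array of [start, stop) sample indices
        starts.append(tt[intervals[0, 0]])
        stops.append(tt[intervals[-1, 1] - 1])
    return np.array(starts), np.array(stops)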

@@ -791,13 +858,20 @@ def calc_and_save(self, aman, proc_aman):
calc_aman = core.AxisManager(aman.dets, aman.samps)
calc_aman.wrap('turnarounds', ta, [(0, 'dets'), (1, 'samps')])

self.save(proc_aman, calc_aman)
if ('merge_subscans' not in self.calc_cfgs) or (self.calc_cfgs['merge_subscans']):
subscan_aman = aman.subscan_info

msilvafe (Contributor):

Can this just go in calc_aman? It seems like all other methods look for subscan_info inside of aman, not proc_aman, since FlagTurnarounds.process has a call to get_turnaround_flags that repopulates aman.subscan_info.

name = "stats"
def __init__(self, step_cfgs):
self.signal = step_cfgs.get('signal', 'signal')
self.wrap = step_cfgs.get('wrap', 'stats')

msilvafe (Contributor):

Can we make the default tod_stats or something similar, since we already have hwpss_stats and azss_stats?

calc:
stat_names: ["median", "std"]
split_subscans: False # optional
mask: # optional, for cutting a power spectrum in frequency

msilvafe (Contributor):

How about we call this psd_mask, and rename TODStats to just GetStats or something similar that doesn't imply it only operates on the TOD?
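
Something like the following, with the suggested renames; the surrounding structure simply extends the snippet above and is assumed, not copied from the PR:

# hypothetical config sketch with the suggested names
- name: "tod_stats"            # GetStats step, wrapped as tod_stats by default
  calc:
    stat_names: ["median", "std"]
    split_subscans: False      # optional
    psd_mask:                  # optional, for cutting a power spectrum in frequency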

@@ -353,6 +390,8 @@ def fit_noise_model(
f_max=100,
merge_name="noise_fit_stats",
merge_psd=True,
freq_spacing=None,
approx_fit=False,

msilvafe (Contributor):

Let's get rid of approx_fit and just have the fit behavior in the preprocess method do what approx_fit is doing there. This is essentially what's in calc_wn without the fknee and alpha estimates, so perhaps we expand calc_wn to include those, call that function to set p0, and when you want "approx_fit" you just call that function.


earosenberg (Author):

Sounds good.
It's a little confusing that calc_wn returns the square root of median(psd) whereas fit_noise_model doesn't take the root; do we want to change that? Also, it seems cleaner to add a new function estimate_fknee, but I can mash it all into calc_wn if you prefer.


msilvafe (Contributor):

Yes, it bothers me quite a lot that we (probably I) decided to change the convention between calc_wn and fit_noise_model. I think they should both be in squared units (like fit_noise_model is now). I see what you're saying about wanting another function like estimate_fknee, but my preference is to have one function called estimate_noise_model that returns output shaped exactly like fit_noise_model's, i.e. an axis shaped (dets x noise_coeffs). If we want to keep the calc_wn function alive for backwards compatibility, we could just have it call estimate_noise_model (so that we don't have to update the definitions of median/mean/etc. in multiple places) and return only the white noise coefficients, but I would keep everything in squared units for consistency.
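
A minimal sketch of the shape such a function could take, in squared units and with a (dets, n_params) output like fit_noise_model; the frequency band and the crude fknee/alpha estimates below are placeholders, not the actual implementation:

import numpy as np

def estimate_noise_model(f, pxx, f_white=(10., 100.)):
    """Hypothetical sketch: rough (fknee, white, alpha) per detector, returned
    in squared PSD units with shape (dets, 3), matching fit_noise_model."""
    band = (f >= f_white[0]) & (f <= f_white[1])
    white = np.median(pxx[:, band], axis=-1)                 # squared units
    # crude fknee: lowest frequency where the PSD drops to 2x the white level
    fknee = np.array([f[np.argmax(p <= 2 * w)] for p, w in zip(pxx, white)])
    alpha = np.full(len(white), 3.0)                         # placeholder 1/f slope
    return np.column_stack([fknee, white, alpha])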

Another related issue was brought up by @ykyohei in #995: we should always be taking the PSDs with overlap = 0 and storing the nperseg in the calc_psd output, so that calc_wn (or any function that takes a median over PSD bins) can correct for this "median bias".

It's maybe OK to address all of that here, but it's also fine to open a new PR for these fixes and review that separately; I'll leave it up to you.
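
For reference, the correction in question follows from the chi-squared distribution of Welch PSD bins; a sketch assuming non-overlapping segments, with nseg derived from nperseg and the data length:

from scipy import stats

def median_debias(psd_median, nseg):
    # Each Welch PSD bin averaged over nseg non-overlapping segments is
    # distributed as (true PSD) * chi2(2*nseg) / (2*nseg), so the median of the
    # bins sits below the mean by this factor.
    dof = 2 * nseg
    bias = stats.chi2.median(dof) / dof    # -> ln(2) ~ 0.69 for nseg = 1
    return psd_median / bias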


earosenberg (Author):

After some consideration I think estimate_noise_model either needs to be more robust than just the initial guess of fit_noise_model, or at least have some error quantification. Otherwise you can have catastrophic mis-fits -- this is also a problem with fit_noise_model, but I assume it could be caught with the covariance. I've just deleted approx_fit for now and will open a new PR (or at least an issue) to deal with these other problems.

"""
Returns an axis manager with information about subscans.
This includes direction, start time, stop time, and a ranges matrix (subscans samps)
True inside each subscan. Subscans are defined excluding turnarounds.

msilvafe (Contributor):

Let's follow the docstring format, adding:

Arguments
-----------
arg: type
    description

Returns
--------
return_variable: type
    description

return arr[ndslice]

def wrap_info(aman, info_aman_name, info, info_names, merge=True):
"""Wrap info into an aman. Assumed to be (dets,) or (dets, subscans)"""

msilvafe (Contributor):

This function at first glance seems quite general, but I think it's somewhat specific to wrapping in the stats info and appropriately choosing whether or not to include subscan axes. Can you call it wrap_stats or wrap_and_check_subscans, or something similarly specific, and give the docstring a bit more description?

stat_names : list
List of strings identifying which statistics to run.
split_subscans : bool
(Optional). If True statistics will be computed on subscans. Assumes aman.subscans exists already.

msilvafe (Contributor):

aman.subscan_info (not aman.subscans)

signal = aman[signal]
if split_subscans:
if mask is not None:
raise ValueError("Cannot mask samples and split subscans")

msilvafe (Contributor):

It should be possible to do a union of the subscan flags + a secondary flag if they're both Ranges aligned along samps. OK to leave for later, but I'd just create an issue to track it.
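
For when that gets implemented, one low-tech option is to combine the boolean masks; a sketch assuming both flags are samps-aligned so3g Ranges objects exposing .mask(), with the helper name and the exclusion logic being mine rather than from this PR:

def subscan_samples_mask(subscan_flag, secondary_flag):
    # Hypothetical sketch: boolean samps mask that is True inside the subscan
    # and outside a secondary samps-aligned cut (e.g. a glitch flag).
    return subscan_flag.mask() & ~secondary_flag.mask()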


earosenberg (Author), Nov 28, 2024:

Added support for discontinuous ranges to apply_rng but didn't implement the full masking. Added #1046.


earosenberg (Author):

Addressed all the above points except where noted. Also added new features:

  • CalcPsd with subscan=True now creates psd.Pxx_ss with a subscans axis and psd.Pxx from the mean of the subscans (see the sketch below).
  • Added plots for signal and psd.
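
A hypothetical illustration of the first item, assuming the subscans axis of Pxx_ss is the last one (the axis ordering and field names are assumptions):

import numpy as np

# Recover the full-observation PSD from the per-subscan PSDs by averaging over
# the subscans axis, as described above.
Pxx = np.mean(aman.psd.Pxx_ss, axis=-1)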

@earosenberg requested a review from msilvafe December 2, 2024 16:00

msilvafe (Contributor) left a comment:

Thanks for addressing all of my comments and testing all of this.

@earosenberg merged commit cfc268f into master Dec 4, 2024
4 checks passed
@earosenberg deleted the subscan-preproc branch December 4, 2024 16:10