H4C IDR2.2 #20

r-pascua · 2023-11-14T23:31:59Z

Given that there are so many moving parts in the analysis, I figured it would be a good idea for the pipeline to go through an informal PR before we re-run the H4C analysis. I am hoping that this PR serves as a useful reference for completing the H4C IDR2.2 memo, and as an additional point of reference for understanding how the pipeline works. I have tagged as reviewers the people I tagged in the Slack thread where we started discussing this, just to put it on the radar for interested parties. The discussion is also open to other interested folks in the collaboration.

Here is a list of things that I think are important and should be discussed and generally approved by the group. I do not expect that this is a complete list, so please suggest things to add or remove.

Structure of the analysis pipeline code.
- I have initialized an idr2.3 directory with the intent that this is where the final version of the pipeline will live. I simply copied over what looked like the most up-to-date pieces of the pipeline and organized them to emulate the structure used in previous iterations (e.g., H1C IDR3).
- This is currently split up into four major sections:
  - rtp broadly contains calibration, flagging, smoothing, and imaging, all performed on a per-night basis. I just copied over files from rtp/v2.
  - pre_lstbin contains in-painting, "crosstalk" filtering (i.e., the notch filter step), coherent time averaging, and per-night power spectrum estimation. The files were copied over from post_processing/v7_interleave/pre_lstbin. Note that time averaging and per-night power spectrum estimation products are only meant to be used in a diagnostic sense and files produced from this step should not be propagated further down the pipeline. I will update the in-painting parameters to reflect the new spectral window choices based on @Kai-FengChen's work.
  - lstbin contains two pipelines: one for LST-binning the sum files, and another for LST-binning the diff files. The files were copied over from post_processing/v7_interleave/lstbin.
  - post_lstbin_frate contains everything downstream of LST-binning: "main-lobe" fringe-rate filtering, coherent time averaging, power spectrum estimation, and "auto errors" (which I assume is estimating thermal noise error bars?). I will be modifying things related to fringe-rate filtering to use the new filter parameters.
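
The four-stage layout described above can be written down as data, which makes it easy to confirm that a checkout contains every stage directory. This is only a rough sketch: the task summaries paraphrase this comment rather than the actual toml files, and `STAGES` and `missing_stages` are hypothetical names, not part of hera_pipelines.

```python
from pathlib import Path

# Stage directories in pipeline order. The task lists summarize the
# description in this comment; they are not read from the toml files.
STAGES = {
    "rtp": ["calibration", "flagging", "smoothing", "imaging"],
    "pre_lstbin": [
        "in-painting",
        "crosstalk (notch) filtering",
        "coherent time averaging (diagnostic only)",
        "per-night power spectra (diagnostic only)",
    ],
    "lstbin": ["LST-binning sum files", "LST-binning diff files"],
    "post_lstbin_frate": [
        "main-lobe fringe-rate filtering",
        "coherent time averaging",
        "power spectrum estimation",
        "auto errors",
    ],
}


def missing_stages(idr_dir):
    """Return the stage directories absent under idr_dir, in pipeline order."""
    root = Path(idr_dir)
    return [name for name in STAGES if not (root / name).is_dir()]
```

Calling `missing_stages` on the idr2.3 directory should return an empty list if all four sections were copied over.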
Details of the pipeline tasks.
- I'm not sure where we ended up with regard to trying to use data that was potentially affected by "jumpy gains", so it's possible that I copied over the wrong set of files.
- Do we still want to generate the diagnostic per-night files?
- Do we still want to generate per-night images?
- Do we still want to generate nightly notebooks?
Environment version control.
- I think everyone is on board with tracking the environment through an environment yaml that can be parsed by conda. We just need to agree on the versions of the various packages and save the environment yaml in the relevant pipeline directory.
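
Once the versions are agreed on, a small check like the one below could confirm that the active environment matches the pins. It is a sketch under assumptions: the pins are passed in as conda-style `name=version` strings (as they would appear under `dependencies` in the yaml), `check_pins` and `parse_pin` are hypothetical helpers, and `importlib.metadata` only sees Python distributions, whose names can differ from conda package names.

```python
from importlib import metadata


def parse_pin(spec):
    """Split a conda-style 'name=version[=build]' pin into (name, version).

    version is None when the spec carries no pin.
    """
    parts = spec.strip().split("=")
    name = parts[0]
    version = parts[1] if len(parts) > 1 else None
    return name, version


def check_pins(pins, get_version=metadata.version):
    """Return {name: (wanted, found)} for every pinned package that differs.

    found is None when the package is not installed. Unpinned specs are
    skipped. get_version is injectable so the check can be tested offline.
    """
    mismatches = {}
    for spec in pins:
        name, wanted = parse_pin(spec)
        if wanted is None:
            continue
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

An empty result means every pinned package is installed at the agreed version.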
Division of labor.
- Who is responsible for running the various parts of the pipeline?
- Are there other parts of the pipeline that need to be updated aside from what I've listed above?
- Who is in charge of writing the IDR memo? Are we breaking this up by section?
General tidying up.
- There are a lot of files with very long and descriptive names, and some of the files have code which wraps in some editors. I think this is fine when the pipeline is under active development and is still in the experimental phase, but I think we should tidy things up a bit before crystallizing the pipeline.

r-pascua · 2023-11-16T20:51:32Z

At the last pspec call, I was tasked with divvying up the work for this last push, so here's what I would like to propose:

- As I understand it, @lisaleemcb and @JianrongTan will primarily be in charge of re-running the pipeline, potentially with help from @Kai-FengChen. Given that, I think it would be a good idea for the two (or three) of you to decide who is going to be responsible for re-running which section. Additionally, I would like to request that you do a high-level review of the code (i.e., just check to make sure that the things we've changed look OK, and check that all the hard-coded paths are sensible) prior to merging this PR.
- It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?
- @JianrongTan can you look into the error estimation step of the pipeline and update things there if necessary?
- @lisaleemcb can you get the environment version control stuff set up and make sure that the environment names in the toml files are correct?
- @Kai-FengChen can you update the inpainting-related things?
- I will update the "main lobe" fringe-rate filtering things.
- @acliu can you make an executive decision about who will manage writing the memo?

I think those are the main items for now, and I hope this is a fair distribution of labor. Please let me know if there are any issues with what I've proposed.
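
For the hard-coded path review requested above, something along these lines might save some eyeballing. It is a minimal sketch, not an existing tool: it treats any double-quoted string starting with `/` in a toml file as a path, which is an assumption about how the pipeline tomls are written, and `find_missing_paths` and `audit_tomls` are hypothetical names.

```python
import re
from pathlib import Path

# Matches a double-quoted string that starts with "/" (an absolute path).
ABS_PATH = re.compile(r'"(/[^"\n]*)"')


def find_missing_paths(toml_text):
    """Return absolute paths quoted in toml_text that do not exist locally."""
    paths = ABS_PATH.findall(toml_text)
    return [p for p in paths if not Path(p).exists()]


def audit_tomls(pipeline_dir):
    """Map each toml file under pipeline_dir to its list of missing paths.

    Files whose quoted paths all exist are left out of the report.
    """
    report = {}
    for toml_file in sorted(Path(pipeline_dir).rglob("*.toml")):
        missing = find_missing_paths(toml_file.read_text())
        if missing:
            report[str(toml_file)] = missing
    return report
```

This only makes sense when run on the machine where the pipeline will execute, since the paths are checked against the local filesystem.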

jsdillon · 2023-11-16T20:55:41Z

> It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?

The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.


aewallwi · 2023-11-30T15:18:44Z

> > It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?
>
> The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.

This is the correct toml. There have been a few bug fixes since this PR was introduced (though no updates to the params for frf and spws) so I think we should rebase on main.

aewallwi · 2023-11-30T15:23:32Z

I think that if we want to incorporate the interleaving machinery for postprocessing, we should be copying the files from v7, but it looks like we are using v6?

r-pascua · 2023-11-30T17:26:05Z

> > > It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?
> >
> > The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.
>
> This is the correct toml. There have been a few bug fixes since this PR was introduced (though no updates to the params for frf and spws) so I think we should rebase on main.

Great, thank you for the clarification! Are the bug fixes already on main?

r-pascua · 2023-11-30T17:27:32Z

> I think that if we want to incorporate the interleaving machinery for postprocessing, we should be copying the files from v7, but it looks like we are using v6?

I copied over the post-processing stuff from v7. Here's a snippet from the post_lstbin_frate toml:

path_to_do_scripts = "/users/aewallwi/lustre/hera_pipelines/pipelines/h4c/post_processing/v7_interleave/post_lstbin_frate/task_scripts"
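
As a quick sanity check on which version a toml points at, the version directory can be pulled out of `path_to_do_scripts`. This is just an illustrative sketch; `postprocessing_version` is a hypothetical helper, and it assumes the version directory is the path component immediately after `post_processing/`.

```python
import re


def postprocessing_version(path):
    """Return the directory that follows 'post_processing/' in path, or None."""
    match = re.search(r"post_processing/([^/]+)/", path)
    return match.group(1) if match else None


# Value copied from the post_lstbin_frate toml snippet above.
path_to_do_scripts = "/users/aewallwi/lustre/hera_pipelines/pipelines/h4c/post_processing/v7_interleave/post_lstbin_frate/task_scripts"
print(postprocessing_version(path_to_do_scripts))  # v7_interleave
```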


copy over pipeline files

3fd9975

r-pascua requested review from lisaleemcb, acliu, jsdillon, aewallwi, Kai-FengChen and JianrongTan November 14, 2023 23:31

rename h4c idr2.2 -> idr2.3

b1456ec

r-pascua added 4 commits November 22, 2023 14:11

frf: add code for using filter param file

ae0b96a

frf: reformat do_FR_FILTER.sh

5c0226a

also fix so that we're using the right case (param_file)

frf: add filter param files

b819302

update: use new spectral windows

8ba5653

Kai-FengChen and others added 5 commits January 31, 2024 19:32

Change from two delay windows to three

f9cd85b

Turn on apply_flag_to_nsample in do_DELAY.sh

2fc293e

Switch on --weight_only_by_flags in do_LSTBIN.sh

4810a03

Merge pull request #24 from HERA-Team/h4c_nsamples_changes

3be3aa8

H4C Nsamples Changes

add conjugates to filter parameter files

1ad29ce

mwilensky768 mentioned this pull request May 21, 2024

Add frf noise cov #27

Open

lisaleemcb and others added 6 commits May 30, 2024 18:05

first pass at h4c idr3.2 rerun yaml

45a23ae

Merge branch 'h4c_rerun' of https://github.com/HERA-Team/hera_pipelines into h4c_rerun

f03607f

updated yaml file and changed env in toml file

f08c843

Merge branch 'main' of https://github.com/HERA-Team/hera_pipelines into h4c_rerun

092b057

updated pipeline tomls for h4c rerun. standardised direcotries and set new directories for rerun outputs separate from h4c old date.

9702b67

Merge branch 'h4c_rerun' of https://github.com/HERA-Team/hera_pipelines into h4c_rerun

421aa55

lisaleemcb added 5 commits June 9, 2024 10:13

fixed typo and added do_CLEANUP.sh clean up script to the validation pipeline that deletes all interim data files and saves only the mock data and smoothed calfiles.

c10125a

added validation version of the idr2.3 dir pipeline, changes some directory paths for tidiness

60f0197

copied validation scripts to idr2.3 folder and updated toml to match

e476981

updated all h4c analysis pipeline toml files to use the new h4c_idr3 conda env. rip hera3dev :(

75f137c

temporarily turned off imaging step in the rtp state because I am tired of seeing the error logs.

c87a167
