Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H4C IDR2.2 #20

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

H4C IDR2.2 #20

wants to merge 22 commits into from

Conversation

r-pascua
Copy link
Contributor

@r-pascua r-pascua commented Nov 14, 2023

Given that there are so many moving parts in the analysis, I figured it would be a good idea for the pipeline to go through an informal PR before we re-run the H4C analysis. I am hoping that this PR serves as a useful reference for completing the H4C IDR2.2 memo, and also serves as an additional point of reference for trying to understand how the pipeline works. I have tagged as reviewers people who I tagged in the Slack thread where we started discussing this, just for the sake of putting it on the radar for interested parties—this discussion should also be open to other interested folks in the collaboration.

Here is a list of things that I think are important and should be discussed and generally approved by the group. I do not expect that this is a complete list, so please suggest things to add or remove.

  • Structure of the analysis pipeline code.
    • I have initialized a idr2.3 directory with the intent that this is where the final version of the pipeline will live. I simply copied over what looked like the most up-to-date pieces of the pipeline and organized them in an attempt to emulate the structure used in previous iterations (e.g., H1C IDR3).
    • This is currently split up into four major sections:
      • rtp broadly contains calibration, flagging, smoothing, and imaging, all performed on a per-night basis. I just copied over files from rtp/v2.
      • pre_lstbin contains in-painting, "crosstalk" filtering (i.e., the notch filter step), coherent time averaging, and per-night power spectrum estimation. The files were copied over from post_processing/v7_interleave/pre_lstbin. Note that time averaging and per-night power spectrum estimation products are only meant to be used in a diagnostic sense and files produced from this step should not be propagated further down the pipeline. I will update the in-painting parameters to reflect the new spectral window choices based on @Kai-FengChen's work.
      • lstbin contains two pipelines: one for LST-binning the sum files, and another for LST-binning the diff files. The files were copied over from post_processing/v7_interleave/lstbin.
      • post_lstbin_frate contains everything downstream of LST-binning: "main-lobe" fringe-rate filtering, coherent time averaging, power spectrum estimation, and "auto errors" (which I assume is estimating thermal noise error bars?). I will be modifying things related to fringe-rate filtering to use the new filter parameters.
  • Details of the pipeline tasks.
    • I'm not sure where we ended up with regard to trying to use data that was potentially affected by "jumpy gains", so it's possible that I copied over the wrong set of files.
    • Do we still want to generate the diagnostic per-night files?
    • Do we still want to generate per-night images?
    • Do we still want to generate nightly notebooks?
  • Environment version control.
    • I think everyone is on board with tracking the environment through an environment yaml that can be parsed by conda. We just need to agree on the versions of the various packages and save the environment yaml in the relevant pipeline directory.
  • Division of labor.
    • Who is responsible for running the various parts of the pipeline?
    • Are there other parts of the pipeline that need to be updated aside from what I've listed above?
    • Who is in charge of writing the IDR memo? Are we breaking this up by section?
  • General tidying up.
    • There are a lot of files with very long and descriptive names, and some of the files have code which wraps in some editors. I think this is fine when the pipeline is under active development and is still in the experimental phase, but I think we should tidy things up a bit before crystallizing the pipeline.

@r-pascua
Copy link
Contributor Author

At the last pspec call, I was assigned with divvying out tasks to do related to this last push, so here's what I would like to propose:

  • As I understand it, @lisaleemcb and @JianrongTan will primarily be in charge of re-running the pipeline, potentially with help from @Kai-FengChen. Given that, I think it would be a good idea for the two (or three) of you to decide who is going to be responsible for re-running which section. Additionally, I would like to request that you do a high-level review of the code (i.e., just check to make sure that the things we've changed look OK, and check that all the hard-coded paths are sensible) prior to merging this PR.
  • It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?
  • @JianrongTan can you look into the error estimation step of the pipeline and update things there if necessary?
  • @lisaleemcb can you get the environment version control stuff set up and make sure that the environment names in the toml files are correct?
  • @Kai-FengChen can you update the inpainting-related things?
  • I will update the "main lobe" fringe-rate filtering things.
  • @acliu can you make an executive decision about who will manage writing the memo?

I think those are the main items for now, and I hope this is a fair distribution of labor. Please let me know if there are any issues with what I've proposed.

@jsdillon
Copy link
Member

  • It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?

The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.

@aewallwi
Copy link
Contributor

  • It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?

The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.

This is the correct toml. There have been a few bug fixes since this PR was introduced (though no updates to the params for frf and spws) so I think we should rebase on main.

@aewallwi
Copy link
Contributor

I think that if we want to incorporate the interleaving machinery for postprocessing we should be copying the files from v7 while it looks like we are using v6?

@r-pascua
Copy link
Contributor Author

  • It's still unclear to me which set of rtp-related files were used for the latest round of analysis. Could some combination of @aewallwi and @jsdillon confirm or deny that I've chosen the correct set of rtp-related files to copy over? If I've chosen the wrong set of files, then can you tell me which files I should have copied over?

The key question boils down to whether https://github.com/HERA-Team/hera_pipelines/blob/b1456ec411793d2b97d58dab042438bf028925ed/pipelines/h4c/idr2.3/rtp/h4c_rtp_stage_2_throw_away_flagged_baselines_keep_fluctuating_ants.toml is the right TOML. I think so, but @aewallwi should confirm.

This is the correct toml. There have been a few bug fixes since this PR was introduced (though no updates to the params for frf and spws) so I think we should rebase on main.

Great, thank you for the clarification! Are the bug fixes already on main?

@r-pascua
Copy link
Contributor Author

I think that if we want to incorporate the interleaving machinery for postprocessing we should be copying the files from v7 while it looks like we are using v6?

I copied over the post-processing stuff from v7. Here's a snippet from the post_lstbin_frate toml:

path_to_do_scripts = "/users/aewallwi/lustre/hera_pipelines/pipelines/h4c/post_processing/v7_interleave/post_lstbin_frate/task_scripts"

@mwilensky768 mwilensky768 mentioned this pull request May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants