Use filter files #916

Merged
merged 8 commits into from
Nov 7, 2023

Conversation

r-pascua
Contributor

This PR adds the ability to use the tophat filtering code with yaml files containing pre-computed filter parameters. The core assumption here is that there is one filter file per spectral window, and the filter files have the following structure:

filter_centers:
    (ant1, ant2): blah
filter_half_widths:
    (ant1, ant2): blah

where (ant1, ant2) here is a placeholder for all of the baselines in the array (or a superset of baselines in the array). I am open to changing the file format slightly, since yaml.SafeLoader doesn't know how to interpret python tuples; we would just need to agree on a format to use, document it, and stick with it. I have written simple unit tests for the code functionality, but I haven't yet tried a more sophisticated test (i.e., on something that looks like real data with many baselines).

I also added some line breaks here and there because the code was wrapping in my terminal.

Many of the lines were clipping in my text editor, so I inserted some
line breaks to make it more readable.
Also some more cleaning up of the filtering code.
@r-pascua r-pascua requested a review from jsdillon October 30, 2023 22:34
@codecov

codecov bot commented Oct 30, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (959420a) 97.18% compared to head (d6de897) 97.17%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #916      +/-   ##
==========================================
- Coverage   97.18%   97.17%   -0.01%     
==========================================
  Files          23       23              
  Lines       10446    10486      +40     
==========================================
+ Hits        10152    10190      +38     
- Misses        294      296       +2     
Flag Coverage Δ
unittests 97.17% <98.46%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
hera_cal/frf.py 97.39% <98.46%> (-0.14%) ⬇️


Member

@jsdillon jsdillon left a comment

Looks pretty good. Just a handful of minor issues, mostly in documentation and explanations.

For future reference, I'd really prefer if unrelated whitespace changes are put in a separate PR. Putting it all together makes it a lot more time consuming to review because I have to carefully scan for the substantive changes.

hera_cal/frf.py Outdated
@@ -1572,24 +1574,29 @@ def tophat_frfilter_argparser(mode='clean'):
"and apply independent fringe-rate filters. Default is 1 (no interleaved filters).",
"This does not change the format of the output files but it does change the nature of their content.")
filt_options.add_argument("--ninterleave", default=1, type=int, help=desc)
filt_options.add_argument(
"--param_file", default="", type=str, help="File containing filter parameters"
Member

Can you say more about the type of file expected?

Contributor Author

Yeah, definitely. I kept it terse because I wanted to wait until we agreed on how to format the files.

hera_cal/frf.py Outdated
have_bl_info.append(
(bl[:2] in filter_antpairs) or (bl[:2][::-1] in filter_antpairs)
)
if not all(have_bl_info):
Member

Why track have_bl_info and then only raise the error at the end, instead of just raising it as soon as you encounter it? Also, the error message should say which baseline it couldn't find.

Contributor Author

How about I update this so that when it errors it tells the user all of the baselines that couldn't be found?

Member

that's fine
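
As a rough illustration of the approach agreed on above (collect every missing baseline, then raise once), a minimal sketch might look like the following. The function names `find_missing_baselines` and `check_filter_coverage` are hypothetical and are not taken from hera_cal; the sketch only assumes that data baselines are tuples whose first two entries are antenna numbers, and that a filter entry may list the pair in either order.

```python
def find_missing_baselines(data_bls, filter_antpairs):
    """Return the antenna pairs that appear in neither orientation.

    Hypothetical helper for illustration; not the hera_cal implementation.
    """
    available = set(filter_antpairs)
    missing = []
    for bl in data_bls:
        antpair = tuple(bl[:2])  # drop any polarization entry
        if antpair not in available and antpair[::-1] not in available:
            missing.append(antpair)
    return missing


def check_filter_coverage(data_bls, filter_antpairs):
    """Raise a single error naming every baseline without filter parameters."""
    missing = find_missing_baselines(data_bls, filter_antpairs)
    if missing:
        raise ValueError(
            f"Filter parameters not found for baselines: {missing}"
        )
```

Reporting all missing baselines in one message lets the user fix the parameter file in a single pass rather than rerunning after each failure.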

@r-pascua
Contributor Author

r-pascua commented Nov 2, 2023

Thank you for the review @jsdillon! I will make the recommended changes. I apologize for adding a bunch of fluff in the form of whitespace changes, but I develop in vim and it was practically impossible to read the code in the terminal with how much line wrapping there was. I will keep this in mind if I make future additions to the code.

Before addressing your comments, I would like to get your opinion on the file format. Ideally, I would access the filter parameters with an antenna pair tuple; however, python tuples are not supported by yaml.SafeLoader (which is what I figure we would want to use). I have two proposals, and would like your input on which you would prefer.

  1. Use baseline integers for the dictionary keys, assuming 350 antennas. This would require an extra step of converting the baseline integers to antenna pair tuples after reading in the file, but this feels like a very easy conversion to the format expected by the filtering code.
  2. Use strings for the dictionary keys, formatted like "(0,1)" for example. This feels like the worse of the two options, since this is susceptible to whitespace issues (e.g., assuming this format would cause the code to miss keys formatted like "(0, 1)"), and generally feels kind of clunky.
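
For concreteness, the conversion step in option 1 could be sketched as below. The encoding used here (`ant1 * 350 + ant2`) is a hypothetical one chosen only for illustration and is not claimed to match the pyuvdata baseline-number convention; the function names are likewise assumptions.

```python
NANTS = 350  # assumed number of antennas, per the proposal above


def antpair_to_int(ant1, ant2, nants=NANTS):
    """Encode an antenna pair as one integer (hypothetical encoding)."""
    if not (0 <= ant1 < nants and 0 <= ant2 < nants):
        raise ValueError(f"Antenna numbers must be in [0, {nants})")
    return ant1 * nants + ant2


def int_to_antpair(bl_int, nants=NANTS):
    """Invert antpair_to_int, returning the (ant1, ant2) tuple."""
    return divmod(bl_int, nants)


def convert_keys(params, nants=NANTS):
    """Turn {baseline_int: value} into {(ant1, ant2): value}."""
    return {int_to_antpair(key, nants): val for key, val in params.items()}
```

The round trip is cheap, but the integer keys in the file itself give no visual hint of which antennas they refer to.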

Thank you again for the review! I can make all of the requested changes once we settle on a final format for the filter files.

@jsdillon
Member

jsdillon commented Nov 2, 2023

What do you mean by baseline integers? Like the ones in pyuvdata? Those aren't human readable...

@r-pascua
Contributor Author

r-pascua commented Nov 2, 2023

Yeah, like the ones from pyuvdata. The main problem I'm butting up against is that yaml.SafeLoader doesn't work with python tuples as keys, and using tuples as keys gives really wonky looking yaml files. Here's an example:

filter_centers:
  ? &id001 !!python/tuple
  - 0
  - 1
  : 0.123
  ? &id002 !!python/tuple
  - 0
  - 2
  : 0.173
filter_half_widths:
  *id001: 0.07
  *id002: 0.08

@r-pascua
Contributor Author

r-pascua commented Nov 2, 2023

OK, I have updated the code so that filter parameter files use strings for the antenna pairs, and there is an extra bit of code that converts those strings into tuples. I have also fleshed out the documentation regarding the parameter files and added an example parameter file. Here is what the sample file looks like:

filter_centers:
  (0, 1): 0.1234
  (0, 2): 0.173
filter_half_widths:
  (0, 1): 0.05
  (0, 2): 0.08

Please let me know what additional changes you would like to see when you have time to go through the changes.
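
For readers following along, the string-to-tuple conversion described above could be sketched roughly as follows. This is not the code merged in the PR; `parse_antpair_keys` is a hypothetical name, and the sketch assumes the loaded YAML section is a plain dict whose keys are strings such as "(0, 1)".

```python
import ast


def parse_antpair_keys(params):
    """Convert keys like "(0, 1)" or "(0,1)" into (0, 1) tuples.

    Hypothetical helper for illustration; not the hera_cal implementation.
    """
    parsed = {}
    for key, value in params.items():
        try:
            # literal_eval only accepts Python literals, so arbitrary
            # expressions in the file are rejected rather than executed.
            antpair = ast.literal_eval(key.strip())
        except (ValueError, SyntaxError):
            raise ValueError(f"Cannot interpret key {key!r} as an antenna pair")
        is_pair = (
            isinstance(antpair, tuple)
            and len(antpair) == 2
            and all(isinstance(ant, int) for ant in antpair)
        )
        if not is_pair:
            raise ValueError(f"Key {key!r} is not a pair of integers")
        parsed[antpair] = value
    return parsed
```

Parsing the key as a literal rather than matching an exact string sidesteps the whitespace concern raised earlier, since "(0,1)" and "(0, 1)" both map to the same tuple.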

Member

@jsdillon jsdillon left a comment

One last request, but after that, good to go!

Comment on lines +959 to +960
with open(tmpdir / "filter_info.yaml", "w") as f:
yaml.dump(filter_info, f)
Member

One final suggestion: let's use the example_filter_params.yaml in the test (that ensures it's not accidentally deleted in the future)

Contributor Author

Hm. How about I use the example filter file in the test for the missing baseline error message? The one that tests the actual performance of the code has two cases to make sure it does things correctly when the antennas are flipped in the filter parameter file, so I'm partial to using a temporary file for those tests.

Member

That's fine

@jsdillon jsdillon merged commit 0f1b175 into main Nov 7, 2023
7 of 8 checks passed
@jsdillon jsdillon deleted the use_filter_files branch November 7, 2023 20:32