Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor file discovery #1839

Merged
merged 16 commits into from
Aug 26, 2024
Merged

Refactor file discovery #1839

merged 16 commits into from
Aug 26, 2024

Conversation

kbwestfall
Copy link
Collaborator

Prompted by @rcooke-ast's changes to the file search functionality, I've done a couple of things:

  • The pypeit code will now be exclusive (not inclusive) of compression signatures. I.e., the file in the pypeit file, and the allowed extensions for each spectrograph must now be given explicitly and exactly match the file name on disk. To me, it's bad practice to allow the file name to be different from what is given in the pypeit file.
  • This means that I had to explicitly define the allowed extensions for each spectrograph. This defaults to either .fits or .fits.gz, but is overwritten for a few spectrographs.

There's also an associated dev-suite PR that adds a unit test for this. Both PRs are directed into the relevant AAT/UHRF PRs.

Copy link
Collaborator

@rcooke-ast rcooke-ast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much cleaner, and a great way to search through potential extensions and paths 👍 Thanks for doing this!

@@ -587,6 +587,7 @@ class GNIRSIFUSpectrograph(GeminiGNIRSSpectrograph):
# * Need to store a wavelength solution for different grating options (Note, the Holy Grail algorithm works pretty well, most of the time)
name = 'gemini_gnirs_ifu'
pypeline = 'SlicerIFU'
allowed_extensions = ['.fits', '.fits.bz2']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens is the user has a set of fits files in their Raw/ directory, and also the exact same files but compressed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is consistent with the previous behavior: the code would find both. At least now, users can limit the search to one or the other using the --extension command-line argument. I added some text in the setup doc about this.


if isinstance(raw_path, list):
return numpy.concatenate([files_from_extension(p, extension=extension) for p in raw_path]).tolist()
files = numpy.concatenate([files_from_extension(p, extension=extension) for p in raw_path])
return numpy.unique(files).tolist()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we possibly need to check if there are strings present in this list that are identical, except for the extension? I would be doubtful that anyone would carry around identical compressed and uncompressed files in the same directory, but then again, it's possible.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not over-complicate it. I think it's enough to warn people that the code will potentially find both compressed and uncompressed files if they're in the same directory, which has always been true.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - sounds good to me! 👍

@codecov-commenter
Copy link

codecov-commenter commented Aug 13, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 72.34043% with 13 lines in your changes missing coverage. Please review.

Project coverage is 38.25%. Comparing base (66ac429) to head (49d2233).

Files Patch % Lines
pypeit/spectrographs/spectrograph.py 63.63% 4 Missing ⚠️
pypeit/io.py 85.71% 2 Missing ⚠️
pypeit/scripts/obslog.py 0.00% 2 Missing ⚠️
pypeit/pypeitsetup.py 87.50% 1 Missing ⚠️
pypeit/scripts/chk_for_calibs.py 0.00% 1 Missing ⚠️
pypeit/scripts/setup_gui.py 0.00% 1 Missing ⚠️
pypeit/spectrographs/aat_uhrf.py 0.00% 1 Missing ⚠️
pypeit/spectrographs/keck_kcwi.py 0.00% 1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff              @@
##           aat_uhrf    #1839      +/-   ##
============================================
+ Coverage     38.23%   38.25%   +0.02%     
============================================
  Files           209      209              
  Lines         48544    48562      +18     
============================================
+ Hits          18560    18578      +18     
  Misses        29984    29984              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kbwestfall
Copy link
Collaborator Author

I've started a dev-suite run for this branch.

@rcooke-ast
Copy link
Collaborator

rcooke-ast commented Aug 19, 2024

The tests are passing (except the HIRES failures), fixed in #1833.
image

@kbwestfall kbwestfall merged commit 72bf70f into aat_uhrf Aug 26, 2024
23 checks passed
@kbwestfall kbwestfall deleted the filesearch branch August 26, 2024 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants