Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XraySpectrum1D class with loaders #427

Open
wants to merge 94 commits into
base: main
Choose a base branch
from

Conversation

eblur
Copy link

@eblur eblur commented Dec 28, 2018

New XraySpectrum1D class with loaders for Chandra HETG data. Support for more observatories or Chandra data types can come later.

I also switched AreaResponse and ResponseMatrix back to ARF and RMF. I did this because the true acronyms are "Auxiliary Response File" and "Redistribution Matrix File". These words are long and don't have any real meaning at first glance. Because the entire X-ray community defaults to calling them "arfs and rmfs" or simply "the response", I don't see any good reason to fully spell out these acronyms.

This PR replaces #178

In addition to basic initialization tests (provided in test_xrayspectrum1d.py), example code and verification for loading files is available here:
https://github.com/eblur/pyxsis/blob/add_specutils/tests/notebooks/test_xrayspectrum1d.ipynb

@eblur eblur mentioned this pull request Dec 28, 2018
SaOgaz pushed a commit that referenced this pull request Mar 25, 2019
Converted the unicode double quote to normal person double quote
@eteq
Copy link
Member

eteq commented Apr 23, 2019

Sorry this has been sitting for so long. TBH I didn't realize this was in a to-be-reviewed state, @eblur. A couple quick high-level thoughts on this (I can try to do a full review once we know what's up with the tests...)

  • I see bin_lo and bin_hi are implemented as attributes. This is because that's traditionally how x-ray spectra are treated, right? That definitely makes sense as an input, but I'm concerned about whether they stay consistent when e.g. a spectrum is sliced. There's a bunch of specutils machinery that makes sure the spectral_axis stays consistent, but that's not being triggered in the current scheme. It's in the plan to extend the spectral_axis machinery to track bins, but I understand you shouldn't have to wait for this... So the question is: is it acceptable that bin_lo/bin_hi might end up inconsistent if the user does something like xrayspec[10:20]?
  • Relatedly, the ARM and RMFs seem to get assigned just once, and therefore wouldn't be aware of slicing and the like. Do they need to be or is it OK if they just get carried around as-they-are? (Bearing in mind I might be totally misunderstanding since I don't really know what they are!)

Also I see the tests are failing, and it looks like the test failure is real in that it's somewhere in the SpectralFrame machinery. @nmearl, do you have any idea what's going on here?

@eblur
Copy link
Author

eblur commented Apr 24, 2019

  • I see bin_lo and bin_hi are implemented as attributes. This is because that's traditionally how x-ray spectra are treated, right? That definitely makes sense as an input, but I'm concerned about whether they stay consistent when e.g. a spectrum is sliced. There's a bunch of specutils machinery that makes sure the spectral_axis stays consistent, but that's not being triggered in the current scheme. It's in the plan to extend the spectral_axis machinery to track bins, but I understand you shouldn't have to wait for this... So the question is: is it acceptable that bin_lo/bin_hi might end up inconsistent if the user does something like xrayspec[10:20]?

In theory, if the xrayspec[10:20] slice is being used, then all a user would need to do to get bin_lo and bin_hi would be xrayspec.bin_lo[10:20] and xrayspec.bin_hi[10:20], right? So no, I don't see it as an issue.

Due to low signal-to-noise, X-ray spectra are always binned and they often use irregular binning. This is taken care of in my XBinSpectrum class in pyxsis. This PR is simply to build zeroth order machinery for that behavior, not address all binning issues here.

  • Relatedly, the ARM and RMFs seem to get assigned just once, and therefore wouldn't be aware of slicing and the like. Do they need to be or is it OK if they just get carried around as-they-are? (Bearing in mind I might be totally misunderstanding since I don't really know what they are!)

Nope, you can't slice ARFs and RMFs in the same way you would with the counts histogram. So that all gets taken care of again by higher level codes.

@eteq
Copy link
Member

eteq commented Apr 29, 2019

Thanks for the clarifications, @eblur !

In theory, if the xrayspec[10:20] slice is being used, then all a user would need to do to get bin_lo and bin_hi would be xrayspec.bin_lo[10:20] and xrayspec.bin_hi[10:20], right? So no, I don't see it as an issue.

Ah, that's not quite what I was thinking about. The issue is instead because if the user does xspecslice = xrayspec[10:20], xspecslice.spectral_axis and xspecslice.bin_lo are no longer consistent - the slicing machinery doesn't automatically know to populate the new object's bin_lo/bin_hi with the sliced versions. So the information about how to line up the sliced spectrum and the bins gets lost in any function that makes use of slicing. So if the user (or any of the other functions we will hopefully make that are bin-aware) is later using that spectrum they will end up with something that they can't trust to have the correct bins relative to the spectrum itself.

The good news for that is that the WCS machinery Should (™️ ...) be able to take care of that if we can resolve #176 (which we need to be able to do anyway!), but hat isn't yet solved and it would be great if we can get your PR here in without having to fully fix that. Maybe we should just say we block the user from slicing the spectrum directly (until there's a way to properly represent the bins the way you need to - e.g. #176)? Or maybe there's a way to just "hack it" to create the correct bin_lo/bin_hi when slicing happens?

you can't slice ARFs and RMFs in the same way you would with the counts histogram

The ARFs and RMFs then have a similar problem that any slicing doesn't work, so if the user/other functions do slicing the spectrum becomes inconsistent. But maybe that's a bigger issue if it requires complicated code in pyxsis? Is that something that's really complex and therefore it's a lot of work/requires user intervention, or is it something we could imagine putting into the XRaySpectrum1D later?

@eblur
Copy link
Author

eblur commented Apr 30, 2019

I see you're point, and I can write a solution (for XraySpectrum1D, not for ARFs and RMFs). The ARFs, on the best of days, easy to slice because they have the same binning as the counts histogram. In some situations, they don't, so slicing them along with the counts histogram is not feasible. The RMF matrices are stored in a weird way (that I don't understand, but Daniela does). Either way, it's just a lot more work to slice them and it's not the norm to cut up spectra in that way. So overall you're going to have to live with X-ray spectra behaving differently from other types of spectra.

However, that's not why Travis is failing. It's breaking on XraySpectrum1D.__init__, a problem I have not been able to reproduce locally.

@eteq
Copy link
Member

eteq commented May 1, 2019

So overall you're going to have to live with X-ray spectra behaving differently from other types of spectra.

Gotcha - I think I'm fine with that in a general sense. My chief concerns right now are to 1) Make sure we don't confuse users who don't know about this by having them getting a nonesense-sliced XRaySpectrum, and 2) don't want to in the process break internal code like the line-flux measurement code which assumes you can slice up spectra this way basically as an implementation detail.

Longer-term, 3) There are domains I know of where people want to keep track of the binning like an XRay Spectrum and also do the slicing. So it would be great to solve the more general binning problem (i.e., for cases where the spectral_axis is expressed as a parametrized WCS instead of, say a list of wavelengths) and use the same solution in XraySpectrum1D. But again I don't want this to be a blocker for getting this in right now.

However, that's not why Travis is failing

Gotcha. I'll try to take a look when I can if @nmearl doesn't have ideas.

@nmearl
Copy link
Contributor

nmearl commented May 2, 2019

On a preliminary glance, it seems that gwcs can't parse the unit of the spectral axis. The current released version of gwcs doesn't have the extra cases for the SpectralFrame (e.g. speed or energy) from the PRs I've put in, so it can't currently parse any spectral axis that's not a frequency or length.

@nden do you have plans for a release coming up? If not, we might need to update travis to pull from the current master.

@nden
Copy link
Contributor

nden commented May 2, 2019

I was planning to release after the astropy release. Is this too late?

@hamogu
Copy link
Member

hamogu commented May 4, 2019

What's common in X-ray space is instead of slicing the data (which typically is small in terms of memory use), you carry an extra array that tell use which bins are "noticed" (used) and which ones are "ignored". That may be a nonconsecutive pattern! For the user experience, you would have to override slicing such that it sets that "ignore-array" and returns a shallow copy of your object with the different ignore array.
The spec.wavelength etc. then become properties that return only the wavelength / flux / ... of the active bin. The advantage is that you don't have the slice up the ARF and RMF (which is a bit of work as @eblur ) said, you don't even need to make a copy of the that in memory.

I can actually see slicing to be a good interface to that, in particular for users who are not used to X-rays and I could see some application to that even in non-X-ray spectra.

However, there are a lot of corner cases to think about when trying to implement that - the most important problem is clearly that bins need to be used instead of a wavelength mid-point array, because with non-consequtive slicing you don't want wave =[ 1,2,3,4, 15, 16, 17] to be interpreted as having one bin that's 11 Ang wide in the middle.

I'm not saying that that's how it should be implemented in this specific PR, I'm just saying that there might be different ways to do it; there might even be space to have two implementations in specutils with different performance characteristics.


def chandra_hetg_identify(origin, *args, **kwargs):
"""Check whether given file contains Chandra HETG spectral data."""
with fits.open(args[0]) as hdu:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail catastrophically if the file it's called on it not a fits file, but e.g. a txt file. Yes, Chandra files will always be fits file, but the "identify" function should work on any file to identify if this is a Chandra file. See discussion in #404

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, what if I have a txt file that for some weird reason ends on "fit", let's say "best_model.fit"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you write me a solution for that?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I think https://github.com/astropy/specutils/blob/master/specutils/io/default_loaders/hst_cos.py#L13 would work. Essentially, that just wraps your check on HDU keywords in the try-except block, which will take care of non-fits files.

However, I now wonder, why you you want to identify HETG spectra? Since ARF, RMF and PHA (TYPE I or II) are all OGIP standardized, would it not make more sense to have an "identify OGIP complient spectrum" identifier and then get all X-ray spectra (XMM, Chandra, ACIS, HETG, SUSAku,...) for the price of one?

Base automatically changed from master to main March 22, 2021 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants