Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3347: Improve spectral outlier detection #8828

Merged
merged 12 commits into from
Oct 17, 2024
1 change: 1 addition & 0 deletions changes/8828.outlier_detection.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
For slit spectra, threshold outliers with a median error across exposures instead of input error from the exposure itself.
2 changes: 1 addition & 1 deletion docs/jwst/outlier_detection/arguments.rst
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ that control the behavior of the processing:
The value to assign to resampled image pixels that have zero weight or
do not receive any flux from any input pixels during drizzling.
Any floating-point value, given as a string, is valid.
A value of 'INDEF' will use the last zero weight flux.
The default value of 'NAN' sets NaN values.

``--maskpt``
The percent of maximum weight to use as lower-limit for valid data;
Expand Down
56 changes: 34 additions & 22 deletions docs/jwst/outlier_detection/outlier_detection_spec.rst
Original file line number Diff line number Diff line change
Expand Up @@ -13,38 +13,50 @@ Specifically, this routine performs the following operations (modified from the

#. Extract parameter settings from input model and merge them with any user-provided values

- the same set of parameters available to:
- The same set of parameters available to:
ref:`Default Outlier Detection Algorithm <outlier-detection-imaging>`
also applies to this code
also applies to this code.

#. Convert input data, as needed, to make sure it is in a format that can be processed

- A :py:class:`~jwst.datamodels.ModelContainer` serves as the basic format
for all processing performed by
this step, as each entry will be treated as an element of a stack of images
to be processed to identify bad pixels, cosmic-rays and other artifacts
- If the input data is a :py:class:`~jwst.datamodels.CubeModel`, convert it into a
:py:class:`~jwst.datamodels.ModelContainer`.
This allows each plane of the cube to be treated as a separate 2D image
for resampling (if done) and for combining into a median image.
#. Resample all input images into a :py:class:`~jwst.datamodels.ModelContainer` using
:py:class:`~jwst.resample.resample_spec.ResampleSpecData`

- Resampled images are written out to disk with suffix "outlier_s2d"
to be processed to identify bad pixels, cosmic rays and other artifacts.

#. If the ``resample_data`` parameter is set to True, resample all input images
using :py:class:`~jwst.resample.resample_spec.ResampleSpecData`.

- Error images are resampled alongside the science data to create
approximate error arrays for each resampled exposure.
- The resampled data are written out to disk with suffix "outlier_s2d"
if the ``save_intermediate_results`` parameter is set to `True`.
- **If resampling is turned off**, the original unrectified inputs are used to create
the median image for cosmic-ray detection.
#. Create a median image from (possibly) resampled :py:class:`~jwst.datamodels.ModelContainer`

- The median image is written out to disk with suffix "median"
if the ``save_intermediate_results`` parameter is set to `True`
#. Blot median image to match each original input image
#. Create a median image from the resampled exposures, or directly from
the input exposures if ``resample_data`` is set to False.

- The error images for each exposure are also median-combined.
- The median datamodel, with ``data`` and ``err`` extensions, is
written to disk with suffix "median" if the ``save_intermediate_results``
parameter is set to `True`.

#. If ``resample_data`` is set to True, blot the median image to match
each original input image.

- The median error image is also blotted to match the input.
- Resampled/blotted images are written out to disk if the ``save_intermediate_results``
parameter is set to `True`
- **If resampling is turned off**, the median image is used for comparison
with the original input models for detecting outliers
#. Perform statistical comparison between blotted image and original image to identify outliers
#. Update input data model DQ arrays with mask of detected outliers
parameter is set to `True`.

#. Perform statistical comparison between the median image and the original image
to identify outliers.

- Signal-to-noise thresholding uses the median error image, rather than the
input error image, in order to better identify outliers that have both
high data values and high error values in the input exposures.

#. Update input data model DQ arrays with mask of detected outliers.
Update the SCI, ERR, and VAR arrays with NaN values at the outlier
locations.


.. automodapi:: jwst.outlier_detection.spec
98 changes: 88 additions & 10 deletions jwst/outlier_detection/_fileio.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,45 +4,123 @@


def save_median(median_model, make_output_path):
'''
Save median if requested by user
"""Save a median model.

Output suffix is 'median'.

Parameters
----------
median_model : ~jwst.datamodels.ImageModel
The median ImageModel or CubeModel to save
'''

make_output_path : function
A function to be used to create an output path from
an input path and an optional suffix.

"""
_save_intermediate_output(median_model, "median", make_output_path)


def save_drizzled(drizzled_model, make_output_path):
"""Save a drizzled model.

Input model is expected to have a filename that ends with
either 'outlier_i2d.fits' or 'outlier_s2d.fits'.

Parameters
----------
drizzled_model : ~jwst.datamodels.ImageModel
The median ImageModel or CubeModel to save.

make_output_path : function
A function to be used to create an output path from
an input path and an optional suffix.

"""
expected_tail = "outlier_?2d.fits"
suffix = drizzled_model.meta.filename[-len(expected_tail):-5]
_save_intermediate_output(drizzled_model, suffix, make_output_path)


def save_blot(input_model, blot, make_output_path):
blot_model = _make_blot_model(input_model, blot)
def save_blot(input_model, blot, blot_err, make_output_path):
"""Save a blotted model.

Output suffix is 'blot'.

Parameters
----------
input_model : ~jwst.datamodels.ImageModel
An input model corresponding to the blotted data,
containing metadata to copy.

blot : array-like
The blotted science data.

blot_err : array-like or None
The blotted error data, if available.

make_output_path : function
A function to be used to create an output path from
an input path and an optional suffix.

"""
blot_model = _make_blot_model(input_model, blot, blot_err)
_save_intermediate_output(blot_model, "blot", make_output_path)


def _make_blot_model(input_model, blot):
def _make_blot_model(input_model, blot, blot_err):
"""Assemble a blot model.

Parameters
----------
input_model : ~jwst.datamodels.ImageModel
An input model corresponding to the blotted data,
containing metadata to copy.

blot : array-like
The blotted science data.

blot_err : array-like or None
The blotted error data, if available.

Returns
-------
blot_model : ~jwst.datamodels.ImageModel
An image model containing the blotted data.

"""
blot_model = type(input_model)()
blot_model.data = blot
if blot_err is not None:
blot_model.err = blot_err
blot_model.update(input_model)
return blot_model


def _save_intermediate_output(model, suffix, make_output_path):
"""
Ensure all intermediate outputs from OutlierDetectionStep have consistent file naming conventions

"""Save an intermediate output from outlier detection.

Ensure all intermediate outputs from OutlierDetectionStep have
consistent file naming conventions.

Parameters
----------
model : ~jwst.datamodels.ImageModel
The intermediate datamodel to save.

suffix : str
A suffix to add to the output file name.

make_output_path : function
A function to be used to create an output path from
an input path and an optional suffix.

Notes
-----
self.make_output_path() is updated globally for the step in the main pipeline
to include the asn_id in the output path, so no need to handle it here.
"""

"""
# outlier_?2d is not a known suffix, and make_output_path cannot handle an
# underscore in an unknown suffix, so do a manual string replacement
input_path = model.meta.filename.replace("_outlier_", "_")
Expand Down
2 changes: 1 addition & 1 deletion jwst/outlier_detection/outlier_detection_step.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ class OutlierDetectionStep(Step):
weight_type = option('ivm','exptime',default='ivm')
pixfrac = float(default=1.0)
kernel = string(default='square') # drizzle kernel
fillval = string(default='INDEF')
fillval = string(default='NAN')
emolter marked this conversation as resolved.
Show resolved Hide resolved
maskpt = float(default=0.7)
snr = string(default='5.0 4.0')
scale = string(default='1.2 0.7')
Expand Down
20 changes: 11 additions & 9 deletions jwst/outlier_detection/spec.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
"""
Submodule for performing outlier detection on spectra.
"""
"""Perform outlier detection on spectra."""

from jwst.datamodels import ModelContainer, ModelLibrary
from jwst.stpipe.utilities import record_step_status

Expand Down Expand Up @@ -36,8 +35,7 @@ def detect_outliers(
in_memory,
make_output_path,
):
"""
Flag outliers in spec data.
"""Flag outliers in spec data.

See `OutlierDetectionStep.spec` for documentation of these arguments.
"""
Expand Down Expand Up @@ -68,20 +66,22 @@ def detect_outliers(
good_bits=good_bits,
)

median_data, median_wcs = median_with_resampling(
median_data, median_wcs, median_err = median_with_resampling(
library,
resamp,
maskpt,
save_intermediate_results=save_intermediate_results,
make_output_path=make_output_path,)
make_output_path=make_output_path,
return_error=True)
else:
median_data, median_wcs = median_without_resampling(
median_data, median_wcs, median_err = median_without_resampling(
library,
maskpt,
weight_type,
good_bits,
save_intermediate_results=save_intermediate_results,
make_output_path=make_output_path,
return_error=True
)

# Perform outlier detection using statistical comparisons between
Expand All @@ -96,11 +96,13 @@ def detect_outliers(
scale1,
scale2,
backg,
median_err=median_err,
save_blot=save_intermediate_results,
make_output_path=make_output_path
)
else:
flag_crs_in_models(input_models,
median_data,
snr1)
snr1,
median_err=median_err)
return input_models
1 change: 1 addition & 0 deletions jwst/outlier_detection/tests/test_outlier_detection.py
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ def test_flag_cr(sci_blot_image_pair):
_flag_resampled_model_crs(
sci,
blot.data,
None,
5.0,
4.0,
1.2,
Expand Down
Loading
Loading