Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continuum BG Estimation #235

Merged
merged 38 commits into from
Nov 1, 2024
Merged

Continuum BG Estimation #235

merged 38 commits into from
Nov 1, 2024

Conversation

ckarwin
Copy link
Contributor

@ckarwin ckarwin commented Aug 21, 2024

This is an initial method for estimating the background for continuum sources. It is still highly preliminary, and so I am opening this as a PR draft, in order to share the example notebook which has the initial algorithm hard-coded (available here). I plan to develop the ContinuumEstimation class leading up to the October meeting, in order to at least have a working version that can then be further develop. I think the ContinuumEstimation class should be part of a background_estimation base class, which will also contain a LineEstimation class and a BurstEstimation class. But we should discuss this and figure out what makes the most sense.

Here are slides summarizing the initial results from the method presented in the example tutorial:
continuum_bg_estimation.pdf. Below is a summary of the next steps:

Improving algorithm:

  • Optimize cut for mask selection (so far only tested cumdist < 0.4)

  • Use actual ARM measurement for mask selection

  • Use more sophisticate inpainting method:

    • Inpainting Galactic Foreground Intensity and Polarization Maps Using Convolution Neural Networks (Puglisi+20, link)
      • Tested for small angular scales, but what about large scales?
    • SMILE inpainting algorithm: (Li 2014, link)
      • Based on solving the Laplace equation for each missing pixel by using the values of the surrounding pixels. Might not work so well for our purposes.
      • Also used in Macias+19 (link).
    • We used CNN for a somewhat similar problem: Deep learning models of the discrete component of the Galactic interstellar gamma-ray emission (link).
    • Develop our own, optimized for our specific problem (maybe employing some ML algorithm).
    • Other methods?
  • Give more freedom to fit (first need to fix unrelated convergence problem).

  • Vectorize code

More tests:

  • Test method on extended/diffuse source.
  • Test Crab fit with GDCE included in total BG.
  • Test Crab fit with other point source(s) included in BG.
  • Test Crab fit will all other astrophysical sources included in BG.

@ckarwin ckarwin marked this pull request as draft August 21, 2024 20:39
Copy link

codecov bot commented Aug 21, 2024

Codecov Report

Attention: Patch coverage is 64.28571% with 50 lines in your changes missing coverage. Please review.

Project coverage is 74.25%. Comparing base (0ea3e15) to head (b3877c3).
Report is 39 commits behind head on develop.

Files with missing lines Patch % Lines
...osipy/background_estimation/ContinuumEstimation.py 63.76% 50 Missing ⚠️
Files with missing lines Coverage Δ
cosipy/__init__.py 100.00% <100.00%> (ø)
cosipy/background_estimation/__init__.py 100.00% <100.00%> (ø)
...osipy/background_estimation/ContinuumEstimation.py 63.76% <63.76%> (ø)

@ckarwin ckarwin requested a review from hirokiyoneda August 21, 2024 20:43
@ckarwin ckarwin added the background Background fitting and modeling label Aug 21, 2024

return

def laod_full_data(self, data_file, data_yaml):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load (typo?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, thanks! Fixed now.

@ckarwin ckarwin marked this pull request as ready for review October 25, 2024 17:07
Copy link
Contributor

@hiyoneda hiyoneda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the code. I added some suggestions about the code.

cosipy/background_estimation/ContinuumEstimation.py Outdated Show resolved Hide resolved

class ContinuumEstimation:

def calc_psr(self, ori_file, detector_response, coord, output_file, nside=16):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future (not in this PR), we can move this part outside this class because the point source calculation is also performed in several different classes, e.g., point source injector.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense.

"""

# Orientatin file:
sc_orientation = SpacecraftFile.parse_from_file(ori_file)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use sc_orientation directly instead of the path to an orientation file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done.

self.psr = response.get_point_source_response(coord = coord, scatt_map = scatt_map)

# Save:
self.psr.write(output_file + ".h5")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you return psr here? I think it would be better than saving a PSR file outside this function: users receive psr from this function and save it by themselves if they want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done.

Full path to precomputed response file (.h5 file).
"""

logger.info("...loading the pre-computed image response ...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> "...loading the pre-computed point source response ..." ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

search_right = True
while search_right == True:

if this_index+j >= len(self.psr.axes['PsiChi'].centers)-1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this_index+j >= self.psr.axes['PsiChi'].nbins-1:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More elegant, thanks! Changed.

plt.show()
plt.close()

return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to return arm_mask and sorted_indices? Assigning them as member parameters might be confusing because it will be hard to know from which sliced data self.arm_mask and sorted_indices are generated. My suggestion is that arm_mask and sorted_indices are explicitly generated for each loop in continuum_bg_estimation and they are used as input parameters of simple_inpainting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made your suggested changes.


return

def simple_inpainting(self, m_data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the above comment, can you add arm_mask and sorted_indices as input parameters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


self.full_data = BinnedData(data_yaml)
self.full_data.load_binned_data_from_hdf5(data_file)
self.estimated_bg = self.full_data.binned_data.project('Em', 'Phi', 'PsiChi')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move the line 92 inside continuum_bg_estimation because the background model is not yet generated at this point? And, maybe, estimated_bg can be just an output of the function continuum_bg_estimation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

self.estimated_bg.write(output_file,overwrite=True)
logger.info("Finished!")

return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you return estimated_bg here? It would be better that estimated_bg is saved by users outside this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ckarwin
Copy link
Contributor Author

ckarwin commented Oct 31, 2024

Thanks again for your review, @hiyoneda! It was very helpful. I implemented all your comments, and I also added unit tests. I think it can be merged if you think it's ok.

@hiyoneda
Copy link
Contributor

hiyoneda commented Nov 1, 2024

@ckarwin Thanks. It looks good. I have two more comments on the notebook. After they are reflected, I think that it can be merged.

  1. On the 3rd cell, the response file is downloaded, but the file is not consistent with the one we load later. Should it be SMEXv12.Continuum.HEALPixO3_10bins_log_flat.binnedimaging.imagingresponse.nonsparse_nside8.area.good_chunks_unzip.earthocc.zip?
  2. At the last 2nd cell, the background file is generated, but only 1 Em bin and 2 Phi bins are considered. It is already mentioned, but can you add more information? Especially, how the values in other bins are filled (I suppose they are the same as the actually events), and how long it takes to run through all of bins roughly.

@ckarwin
Copy link
Contributor Author

ckarwin commented Nov 1, 2024

@hiyoneda Ok, it's been updated.

  1. Yes, the correct file is SMEXv12.Continuum.HEALPixO3_10bins_log_flat.binnedimaging.imagingresponse.nonsparse_nside8.area.good_chunks_unzip.earthocc.zip
  2. I added the following text:

Note that the current code has not yet been optimized for speed, as it uses a simple nested for loop. The time required to generate the estimated background using all bins is roughyly 4 hours. The option to use a subset of the Em bins and/or Phi bins may be useful for analyses that also use a given subset, but at this point the main motivation for this option is for demonstrational purposes, and when using this option, nothing is done with the other bins.

Copy link
Contributor

@hiyoneda hiyoneda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It can be merged.

@hiyoneda hiyoneda merged commit f3a7477 into cositools:develop Nov 1, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
background Background fitting and modeling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants