First photutils notebook: background estimation #1

laurenmarietta · 2018-06-28T19:46:57Z

Added the first photutils tutorial, for background estimation, as well as some supporting files (the photutils logo and the matplotlib style file). Updated the .gitignore to exclude the other notebooks, which are still in development.

Please provide feedback, comments, critique, etc.!

…till incomplete)

…the background estimation notebook; playing with segmentation

eteq

@laurenmarietta - A great start on this sequence of notebooks!

Note that I could not actually use the inline commenting for the notebook itself, because of the first item I mention below. So for now to not slow us any further I'll just leave comments as separate paragraphs below. But in the future it's good to do the first thing there so that we can use inline github comments. Or if it'll be easier for back-and-forth commenting, you can do that change first and then I can move those comments to be inline.

Initial: In the notebook as you've checked it in, you've executed all the cells. That means the notebook is 8MB, primarily because of all the images. This is not desirable because git isn't meant for tracking content changes like that. Instead, you should check into git the unevaluated notebook (i.e. do a "clear outputs" before the git add). That's a much better file both for reviewing on github and for storing in git. Then when the notebooks are actually hosted somewhere, we automatically run them. That's also a good check because it determines if the notebook actually works as written, rather than only working because you did something in a cell you then later deleted (for example). Going forward we'll need to rebase this so that it doesn't take up a bunch of stuff in the repo, but not urgently (i.e. we can talk about that later).

Intro: Not sure how much we care about this... but it might be good to say just a bit about what the image is - that is, a "this is an ACS image of ". That both serves the purpose of making it clear that these data are space telescope data while at the same time attracting interest by the average astronomer who likely wants to know of a relevant science context.

Cell [1]: I don't think the import matplotlib is needed, nor the LogNorm, as it doesn't look like you use it elsewhere?

Cell [2]: This is good, but FYI I think in the future you'll want that to say something like "a shared style file over all the notebooks". But it's good as-is for now since there's currently jsut the one.

Words above Cell [3]]: this might also be a good place to give the science context (instead of the "intro" part above) - it also might be good to point out that this is a High Level Science Product rather than a raw image from the archive (otherwise we'd suggest using astroquery since that's more robust to changes in the URLs, but since it's a HLSP it makes sense to do it this way).

Cell In[4]: Is there a specific reason you're using for-loops here? I think this will work in it's place (and a lot faster):

mask = data == 0
n_data_pixels = len(data[~mask])
background = np.linspace(-1e-4, 5e-4, num=n_data_pixels)

modified_data = np.copy(data)
modified_data[~mask] = background[:]

Cell In[6], and others below: The plt.show() at the end isn't necessary - that happens automatically with the inline backend (and in fact the show sometimes causes the plot to show up twice depending on how your mpl is setup).

Above Cell In[11]: you might want to define what you mean by "scalar" - I know you mean "in contrast to 2D", but the reader who's only made it this far might not quite get that distinction yet. Just a sentence or two would be fine, perhaps referencing the upcoming section.

Cell In[15]: Instead of re-writing the data, I think this does what you want: xdf_scalar_bkgdsub = xdf_image.subtract(mean*u.ct/u.s) - I know it's a bit weird that this isn't the - operation, but that's intentional (I don't recall the motivation for sure, but I think it's got to do with different ways to propagate uncertainties).

Cell In[16]: There's a subtlety that's not obvious here: that the reason why they look different is that your stretch is a fixed range instead of "minmax" or the like. It might be worth mentioning something about this - when I first saw that I thought "shouldn't they look the same since the range is the same, just offset by a fixed amount?". Then again, maybe the fact that I saw that and then figured it out on my own is good. So your call if you want to modify the text at all, but thought I'd mention it.

Cell In[19]: I love this plot! Don't change a thing about it. 😉

Cell In[20]: Same as In[15].

The rest looks great! One possible suggestion for something to add at the end: do another sigma_clipped_stats on the 2D-subtracted one (or show the histogram). I speculate the new std will be significantly lower, demonstrating the whole point. Or alternatively, this could be a third exercise.

eteq · 2018-08-01T20:41:58Z

.gitignore

@@ -1,3 +1,7 @@
 .pyc
 .ipynb_checkpoints/
 *.nbconvert.ipynb
+photutils/dev/*
+photutils/02_photutils_source_detection.ipynb


the bottom 3 of these should not be in the .gitignore file - I'm guessing you're ignoring them temporarily, but eventually we want them in there. Or imagine if you're bought out by google but then someone else later wants to implement this - they won't know they're supposed to delete this.

Instead, I suggest you have 3 other branches of this repo, and implement the 3 separately on those branches. That way you can keep track of their version history completely separately, and not worry about ignoring it in one place but not others. (maybe you're doing that already, in which case you can just drop this from the .gitignore

(copying from Slack conversation)

What I’ve been doing so far is I have one branch, just called photutils, where I have been developing all of the notebooks. I just created this PR branch, photutils-background-estimation, off of that development branch to ultimately merge the first notebook into master.

You suggest having one branch per notebook, but I’ve found that pretty frequently I need to make changes to multiple or all of the notebooks, not just one at a time. Do you think it sounds like a feasible workflow for me to keep this main development branch, then I can just branch off and delete the irrelevant notebooks whenever I submit a PR?

Probably not many people will be looking at this other than us, but just in case... we concluded in an out-of-band discussion that we should change to a separate branch per notebook and make separate commits in the individual branches.

laurenmarietta · 2018-08-02T14:59:24Z

Intro: Not sure how much we care about this... but it might be good to say just a bit about what the image is - that is, a "this is an ACS image of ". That both serves the purpose of making it clear that these data are space telescope data while at the same time attracting interest by the average astronomer who likely wants to know of a relevant science context.

Would you say that the description I provide before downloading the data (before [3]) isn't sufficiently clear?

otherwise we'd suggest using astroquery since that's more robust to changes in the URLs, but since it's a HLSP it makes sense to do it this way

Just to be clear, we could use astroquery to get the raw data from MAST, but since this file is a HLSP, it isn't available on MAST and we have to go straight to the archive.stsci.edu URL. Which astroquery cannot do. Is that correct?

modified_data[~mask] = background[:]

This is so much more clever than what I had - thank you!

do another sigma_clipped_stats on the 2D-subtracted one (or show the histogram). I speculate the new std will be significantly lower, demonstrating the whole point.

I had actually tried to do this before, but I found that the results weren't as exciting as I would have thought. The data before background subtraction has a standard deviation of 7.51e-4, and after it is 7.31e-4. I was unconvinced that this was a big enough difference to make any claims about, but if you think it is worth it, I can add it as an exercise.

eteq · 2018-08-07T22:31:20Z

@laurenmarietta

Would you say that the description I provide before downloading the data (before [3]) isn't sufficiently clear?

I think it was clear, but might just need a sentence or two of motivation along the lines of "... of exposure time. While some of the faintest and most distant galaxies known can be found in this image, background subtraction is essential for science like that the later notebooks demonstrate." (or the last clause could be tweaked a bit to be more specific but hopefully that gets the idea across). That said, I guess the biggest thing would be just be for it to come earlier rather than at the time of download.

Just to be clear, we could use astroquery to get the raw data from MAST, but since this file is a HLSP, it isn't available on MAST and we have to go straight to the archive.stsci.edu URL. Which astroquery cannot do. Is that correct?

That is correct as far as I know. It may be that MAST has some API (or upcoming API) to discover HLSP's - if so we might consider switch to that to find the URL (since URLs sometimes change). I'll ask someone to make sure but I think that doesn't exist right now.

I had actually tried to do this before, but I found that the results weren't as exciting as I would have thought. ...

Hmm, ok, I see your point. I guess the background variation isn't enough to be very statistically strong relative to the actual real sources. You might get around this by just increasing your "inserted" background? The exercise could even be "try it. notice it's small. does this make sense? try to make it big." We're "leaving it as an exercise for the reader" 😉

laurenmarietta · 2018-08-15T15:07:41Z

Okay, I tried to implement all the suggestions you gave!

The main part I had trouble with was the scientific motivation - mainly because we are using an HLSP, and thus it isn't /really/ necessary to perform background estimation on it (right?). I actually don't include any sort of background subtraction in subsequent notebooks after downloading the data, because the background has already been removed. So the sentence that I added, "Background subtraction is essential for accurate photometric analysis of astronomical data like the XDF," feels a little contrived. But maybe I'm just overthinking it?

eteq · 2018-09-06T15:01:48Z

@laurenmarietta - looks great now aside from some very minor things (eb15247) , so merging!

Lauren Chambers added 9 commits May 3, 2018 10:17

Initial commit; created 4 notebook outlines (PSF photometry outline s…

a2e4ca1

…till incomplete)

Added brief introductions, updated tables of contents

7084b83

Update photutils notebooks

b026bb5

First draft of background estimation notebook

31fe846

Updates to source detection and background estimation

b811d81

Edits from Erik's feedback

2aa1e84

Updates to source detection

56e5db7

Update notebooks to download data with URL; add 2D background within …

d275b73

…the background estimation notebook; playing with segmentation

Remove other notebooks from cache for notebook 1 PR

692be6b

eteq self-requested a review July 19, 2018 15:13

eteq requested changes Aug 1, 2018

View reviewed changes

Edits from Erik

365adda

Remove other notebooks from .gitignore

7c1b19c

laurenmarietta pushed a commit to laurenmarietta/csi-stsci-notebooks that referenced this pull request Aug 15, 2018

Implement changes from eteq#1

f312059

laurenmarietta pushed a commit to laurenmarietta/csi-stsci-notebooks that referenced this pull request Aug 15, 2018

Update .gitignore and implement changes from eteq#1

c7fdc1b

Lauren Chambers and others added 5 commits August 23, 2018 11:18

Fix logo embed to render in GitHub

ee713b7

Trying again with photutils logo

d52edbc

One more time, with feeling. Removing the link to see if that does it.

2c6baa4

Again

f979e5f

some last-bits of changes

eb15247

eteq merged commit 80103c5 into eteq:master Sep 6, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

First photutils notebook: background estimation #1

First photutils notebook: background estimation #1

laurenmarietta commented Jun 28, 2018 •

edited

Loading

eteq left a comment •

edited

Loading

eteq Aug 1, 2018

laurenmarietta Aug 2, 2018

eteq Aug 7, 2018

laurenmarietta commented Aug 2, 2018

eteq commented Aug 7, 2018

laurenmarietta commented Aug 15, 2018

eteq commented Sep 6, 2018

First photutils notebook: background estimation #1

First photutils notebook: background estimation #1

Conversation

laurenmarietta commented Jun 28, 2018 • edited Loading

eteq left a comment • edited Loading

Choose a reason for hiding this comment

eteq Aug 1, 2018

Choose a reason for hiding this comment

laurenmarietta Aug 2, 2018

Choose a reason for hiding this comment

eteq Aug 7, 2018

Choose a reason for hiding this comment

laurenmarietta commented Aug 2, 2018

eteq commented Aug 7, 2018

laurenmarietta commented Aug 15, 2018

eteq commented Sep 6, 2018

laurenmarietta commented Jun 28, 2018 •

edited

Loading

eteq left a comment •

edited

Loading