Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple Identification of Discrete Piles #560

Merged
merged 27 commits into from
Apr 23, 2024
Merged

Conversation

wilsonbb
Copy link
Collaborator

This provides a simple approach of identifying discrete piles of images that are all within a given arcsecond radius. While too costly for a large dataset, this sort of approach is nice for a simplified end-to-end test on small test data. This is a streamlined version the discrete piles method added by @ColinOrionChandler in Region Searching Workbook

This PR also provides an example notebook which uses a butler repo with inserted fakes constructed by @DinoBektesevic and gives an example of choosing some discrete piles and then performing reprojection and kbmod search using a butler repo.

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@wilsonbb wilsonbb marked this pull request as ready for review April 17, 2024 19:01
@wilsonbb
Copy link
Collaborator Author

Included @ColinOrionChandler since there is a fair bit of streamlining from his previous approach and would be interested in his feedback on the notebook.

So rather than the whole PR rather just a focus on src/kbmod/region_search.py and notebooks/region_search/discrete_piles_e2e.ipynb (reviewnb link for convenience)

Copy link
Member

@DinoBektesevic DinoBektesevic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok to me. I am mostly worried about scalability of that inner loop and introducing yet another term ("piles", "pointing groups" etc.),. Minor complain regarding URI handling.

Otherwise the notebook is great and runs pretty good, there's a lot of very big images that do kind of pushed my kernel memory, maybe we can opt to plot fewer. Interface is nice and I like the dataset_type_in_collection_frequency functionality.

src/kbmod/region_search.py Outdated Show resolved Hide resolved
overlapping_sets = []
for i in range(len(all_ra_dec) - 1):
coord = all_ra_dec[i]
if i not in processed_data_ids:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a note since Im not sure it's obvious: this is a for loop in the backend, so we're double looping for datasets with a lot of ids when we get towards the end.

src/kbmod/region_search.py Outdated Show resolved Hide resolved
src/kbmod/region_search.py Outdated Show resolved Hide resolved
@wilsonbb
Copy link
Collaborator Author

This looks ok to me. I am mostly worried about scalability of that inner loop and introducing yet another term ("piles", "pointing groups" etc.),. Minor complain regarding URI handling.

Otherwise the notebook is great and runs pretty good, there's a lot of very big images that do kind of pushed my kernel memory, maybe we can opt to plot fewer. Interface is nice and I like the dataset_type_in_collection_frequency functionality.

Yeah, I'm still a bit unsure about terminology here myself (I initially pulled "discrete piles" from @ColinOrionChandler for reference) and am personally open to changing it. Maybe a discussion to bring up at one of our meetings because adding even more terminology confusion would be a pain.

I definitely agree that this sort of brute force approach won't be that scalable, and I'm interested in clustering approaches. In addition to options from sk-learn, I wonder if an approach taking more direct advantage of lsst/sphgeom might be helpful here.

@wilsonbb wilsonbb merged commit d10590b into main Apr 23, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants