Simple Identification of Discrete Piles #560

wilsonbb · 2024-04-17T18:46:22Z

This provides a simple approach for identifying discrete piles of images that are all within a given arcsecond radius. While too costly for a large dataset, this sort of approach is nice for a simplified end-to-end test on small test data. This is a streamlined version of the discrete piles method added by @ColinOrionChandler in the Region Searching Workbook.
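To make the idea concrete, here is a minimal brute-force sketch of grouping pointings into piles. It is not the code from src/kbmod/region_search.py: the function names `angular_separation_arcsec` and `find_discrete_piles`, the choice to measure distance from the first member of each pile, and the choice to drop single-image piles are all assumptions made for illustration.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine) between two (RA, Dec) points.

    Inputs are in degrees; the result is in arcseconds.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating point overshoot before asin.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def find_discrete_piles(all_ra_dec, radius_arcsec):
    """Hypothetical helper: group indices of pointings within a radius.

    Each unprocessed pointing seeds a pile, and every later unprocessed
    pointing within `radius_arcsec` of the seed joins it. Piles with a
    single member are dropped (an assumption of this sketch).
    """
    processed = set()  # indices already assigned to a pile
    piles = []
    for i, (ra_i, dec_i) in enumerate(all_ra_dec):
        if i in processed:
            continue
        processed.add(i)
        pile = [i]
        for j in range(i + 1, len(all_ra_dec)):
            if j in processed:
                continue
            ra_j, dec_j = all_ra_dec[j]
            if angular_separation_arcsec(ra_i, dec_i, ra_j, dec_j) <= radius_arcsec:
                pile.append(j)
                processed.add(j)
        if len(pile) > 1:
            piles.append(pile)
    return piles


pointings = [
    (10.0, -5.0),
    (10.0001, -5.0),
    (50.0, 20.0),
    (10.0, -5.0001),
    (50.0002, 20.0),
]
print(find_discrete_piles(pointings, radius_arcsec=1.0))  # [[0, 1, 3], [2, 4]]
```

The nested loop is O(n²) in the number of pointings, which is why this style of search only suits small test data.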

This PR also provides an example notebook which uses a butler repo with inserted fakes constructed by @DinoBektesevic and gives an example of choosing some discrete piles and then performing reprojection and kbmod search using a butler repo.

Conflicts: src/kbmod/region_search.py tests/test_region_search.py tests/utils/mock_butler.py

review-notebook-app · 2024-04-17T18:46:28Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

wilsonbb · 2024-04-17T19:28:43Z

Included @ColinOrionChandler since there is a fair bit of streamlining from his previous approach, and I would be interested in his feedback on the notebook.

So rather than reviewing the whole PR, just focus on src/kbmod/region_search.py and notebooks/region_search/discrete_piles_e2e.ipynb (ReviewNB link for convenience).

DinoBektesevic

This looks OK to me. I am mostly worried about the scalability of that inner loop and about introducing yet another term ("piles", "pointing groups", etc.). Minor complaint regarding URI handling.

Otherwise the notebook is great and runs pretty well. There are a lot of very big images that kind of pushed my kernel memory, so maybe we can opt to plot fewer. The interface is nice and I like the dataset_type_in_collection_frequency functionality.

DinoBektesevic · 2024-04-22T18:09:59Z

src/kbmod/region_search.py

+        overlapping_sets = []
+        for i in range(len(all_ra_dec) - 1):
+            coord = all_ra_dec[i]
+            if i not in processed_data_ids:


Just as a note, since I'm not sure it's obvious: this is a for loop in the backend, so we're double looping for datasets with a lot of ids when we get towards the end.
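A small sketch of the cost being described, assuming `processed_data_ids` is a Python list (the diff excerpt does not show its type). `in` on a list scans element by element, so the check adds a hidden inner loop, while the same check on a set is a hash lookup. The function names below are hypothetical.

```python
def count_unprocessed_with_list(n):
    """Membership test against a list: each `in` is a linear scan."""
    processed_data_ids = list(range(0, n, 2))  # pretend even ids are done
    return sum(1 for i in range(n) if i not in processed_data_ids)


def count_unprocessed_with_set(n):
    """Same check against a set: each `in` is an average O(1) hash lookup."""
    processed_data_ids = set(range(0, n, 2))
    return sum(1 for i in range(n) if i not in processed_data_ids)


# Both variants agree on the answer; only the cost of each check differs.
print(count_unprocessed_with_list(2000) == count_unprocessed_with_set(2000))
```

Swapping the container type leaves the surrounding loop unchanged, so it is a low-risk change if the ordering of processed ids is not needed.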

wilsonbb · 2024-04-23T23:04:32Z

> This looks ok to me. I am mostly worried about scalability of that inner loop and introducing yet another term ("piles", "pointing groups" etc.). Minor complaint regarding URI handling.
>
> Otherwise the notebook is great and runs pretty good, there's a lot of very big images that did kind of push my kernel memory, maybe we can opt to plot fewer. Interface is nice and I like the dataset_type_in_collection_frequency functionality.

Yeah, I'm still a bit unsure about terminology here myself (I initially pulled "discrete piles" from @ColinOrionChandler for reference) and am personally open to changing it. Maybe a discussion to bring up at one of our meetings because adding even more terminology confusion would be a pain.

I definitely agree that this sort of brute force approach won't be that scalable, and I'm interested in clustering approaches. In addition to options from scikit-learn, I wonder if an approach taking more direct advantage of lsst/sphgeom might be helpful here.

wilsonbb added 23 commits March 27, 2024 12:56

Collect VDR data from butler

a202953

Prevent importing LSST for unit tests

7d7aae1

Lint fix

d873ed7

Change table representation

f0d433b

lint fixes

156f00c

Update doc strings

224d0ed

Configure max_workers

045308b

Simplify LSST mocking

b31f0d9

[deploy_alpha] docs formatting

123ccb6

Merge branch 'main' into region_search_init

345e2c5

Adds support for finding discrete piles

bf327fa

Remove unused import and fix indentation

cd88a64

Merge branch 'main' into region_search_init

b724899

lint fix

d8255b4

Merge branch 'region_search_init' into region_search_discrete

56c0ce4

Added tests and example notebook

4af7b7b

Merge branch 'main' into region_search_discrete

1d18849

Conflicts: src/kbmod/region_search.py tests/test_region_search.py tests/utils/mock_butler.py

Remove caching for now

d66e8fb

Clean up mock registry datatype filtering

f31f53b

Merge branch 'main' into region_search_discrete

c00603c

Clear notebook output

625d3c9

Remove ImageCollection helper

c847788

Comments and reorganization

56f1157

wilsonbb added 2 commits April 17, 2024 11:48

Remove testing copy of notebook

6ae6fd2

Fix accidental delete

4f30fe9

wilsonbb requested review from ColinOrionChandler and DinoBektesevic April 17, 2024 19:00

wilsonbb marked this pull request as ready for review April 17, 2024 19:01

DinoBektesevic approved these changes Apr 22, 2024

wilsonbb added 2 commits April 23, 2024 14:28

Merge branch 'main' into region_search_discrete

48b4dee

Simplify inner for loop and use urllib

3d676d6

wilsonbb merged commit d10590b into main Apr 23, 2024
2 checks passed
