Improve centroiding connected component detection #20

Open
markasoftware opened this issue Jul 1, 2022 · 1 comment · May be fixed by #59

Comments

@markasoftware (Member)

Right now centroiding uses a big recursive function that can stack overflow in certain cases. Ideally it'd be switched to a "worklist"-style algorithm (the standard way to convert a recursive function into a loop), and we should also put some limits in place to reject unrealistic results (such as a centroid taking up half the screen).
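For illustration, here is a hypothetical Python sketch (not LOST's actual code) of the worklist idea: the recursion is replaced by an explicit stack of pixels to visit, and a `max_pixels` cap rejects unrealistically large blobs.

```python
from collections import deque

def flood_fill(image, width, height, start, threshold, max_pixels):
    """Collect the connected component containing `start` using an explicit
    worklist instead of recursion, so deep components can't overflow the
    call stack. Returns None if the component exceeds `max_pixels`.
    `image` is a row-major list of brightness values."""
    worklist = deque([start])
    component = set()
    while worklist:
        x, y = worklist.pop()
        if (x, y) in component:
            continue
        if not (0 <= x < width and 0 <= y < height):
            continue
        if image[y * width + x] < threshold:
            continue
        component.add((x, y))
        if len(component) > max_pixels:
            return None  # unrealistically large blob; reject it
        # Push the 4-connected neighbors onto the worklist.
        worklist.extend([(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)])
    return component
```

Using `pop()` (a stack) gives depth-first order, matching what the recursive version would visit; `popleft()` would make it breadth-first, which works equally well here.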

@markasoftware (Member, Author)

This paper is a good example of what we can do. An added bonus is that it only needs to read the file from top to bottom, which can be helpful on embedded systems:

https://www.cis.rit.edu/class/simg782.old/finding_connected_components.pdf

The algorithm is actually quite typical and simple for this sort of problem. Loop through the pixels from top left to bottom right. Whenever we encounter an "on" pixel, check whether the pixel immediately above or to the left is also "on"; if so, merge the current pixel into the partial centroid(s) above and/or to the left. The only complication arises when there are centroids both above and to the left of the current pixel, but they are different centroids. In that case we need to merge them into one. There are lots of simple ways to do this; the paper identifies centroids by integers, and when centroids X and Y with X > Y (WLOG) need to be merged, it assigns H[X] = Y in a hashmap (this is basically a lightweight implementation of the union-find data structure).

At the end of the iteration, loop through all the integers that were used as IDs and add each one's pixels to the list belonging to the "canonical" ID it was merged into (found by following the hashmap links all the way down; this is the same procedure as in a union-find data structure). Then you can calculate the weighted average or whatever of each cluster, just like we do now.
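The two passes above can be sketched like this (a hypothetical Python sketch, not LOST's implementation; `image` is a row-major list of brightness values and 0 is the background label):

```python
def label_components(image, width, height, threshold):
    """Two-pass connected-component labeling, scanning top-left to
    bottom-right. Label equivalences are recorded in `parent`, a dict
    acting as a lightweight union-find."""
    labels = [0] * (width * height)   # 0 = background
    parent = {}                       # maps a label to an equivalent smaller label
    next_label = 1

    def find(label):
        # Follow hashmap links all the way down to the canonical label.
        while label in parent:
            label = parent[label]
        return label

    # First pass: assign provisional labels, recording merges as we go.
    for y in range(height):
        for x in range(width):
            if image[y * width + x] < threshold:
                continue
            left = labels[y * width + x - 1] if x > 0 else 0
            up = labels[(y - 1) * width + x] if y > 0 else 0
            neighbors = [l for l in (left, up) if l]
            if not neighbors:
                labels[y * width + x] = next_label
                next_label += 1
            else:
                canonical = min(find(l) for l in neighbors)
                labels[y * width + x] = canonical
                for l in neighbors:
                    root = find(l)
                    if root != canonical:
                        parent[root] = canonical  # H[X] = Y with X > Y

    # Second pass: resolve every pixel to its canonical label.
    components = {}
    for i, l in enumerate(labels):
        if l:
            components.setdefault(find(l), []).append((i % width, i // width))
    return components
```

A U-shaped blob exercises the merge case: its two arms get different provisional labels until the bottom row joins them, at which point `parent` records the equivalence and the second pass folds them into one component.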

A further improvement is to not even store the full list of pixels for each cluster. Instead, we can just store the weighted averages, and the total weight for each one. That's enough information to merge two weighted averages together.
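Merging two running weighted averages, as described above, needs only each cluster's current mean and total weight (hypothetical sketch):

```python
def merge_centroids(a, b):
    """Combine two partial centroids, each a (mean_x, mean_y, total_weight)
    tuple, into one, without revisiting any pixels. The combined mean is the
    weight-proportional blend of the two means."""
    ax, ay, aw = a
    bx, by, bw = b
    w = aw + bw
    return ((ax * aw + bx * bw) / w, (ay * aw + by * bw) / w, w)
```

Adding a single pixel is just the special case where one side is a centroid of weight equal to that pixel's brightness, so the whole first pass can run with O(1) memory per partial centroid.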

@nguy8tri nguy8tri self-assigned this Aug 3, 2022
@nguy8tri nguy8tri linked a pull request Aug 19, 2022 that will close this issue