Right now it uses a big recursive function which can overflow the stack in certain cases. Ideally it'd be switched to a "worklist" style algorithm (the standard way to convert a recursive function into a loop), and it should also have some limits put in place to reject unrealistic results (such as a centroid taking up half the screen).
This paper is a good example of what we can do. An added bonus is that it only needs to read the file from top to bottom, which can be helpful on embedded targets.
The algorithm is actually quite typical and simple for this sort of problem. Loop through the pixels from top left to bottom right. Whenever we encounter an "on" pixel, we check whether the pixel immediately above or to the left is also "on". If so, we merge the current pixel into the partial centroid(s) to the left and/or above. The only complication is if there are centroids both above and to the left of our current pixel, but they're different centroids. In this case we need to merge them into the same centroid. There are lots of simple ways to do this; the paper uses integers to identify the centroids, and when centroids X and Y with X > Y (WLOG) need to be merged, it assigns H[X] = Y in a hashmap (this is basically a lightweight implementation of the union-find data structure).
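As a rough sketch of that merge bookkeeping (the fixed-size table, `MAX_LABELS`, and the `uf_*` names are assumptions for illustration; the paper itself uses a hashmap):

```c
/* Lightweight union-find over integer centroid IDs, mirroring the
 * paper's H[X] = Y scheme. The fixed-size array stands in for the
 * hashmap; MAX_LABELS is an assumed bound. */
#define MAX_LABELS 256

static int parent[MAX_LABELS]; /* parent[x] == x means x is canonical */

static void uf_init(void) {
    for (int i = 0; i < MAX_LABELS; i++) parent[i] = i;
}

/* Follow links all the way to the bottom to get the canonical ID. */
static int uf_find(int x) {
    while (parent[x] != x) x = parent[x];
    return x;
}

/* Merge two IDs; the smaller one becomes canonical, matching the
 * "assign H[X] = Y with X > Y" convention. */
static void uf_merge(int x, int y) {
    x = uf_find(x);
    y = uf_find(y);
    if (x == y) return;
    if (x < y) parent[y] = x;
    else       parent[x] = y;
}
```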
At the end of the iteration, you just loop through all the integers that were used as IDs, and add each one's pixels to the list of pixels belonging to the "canonical" ID it was merged into (found by following the hashmap links all the way to the bottom; this is basically the same procedure as in a union-find data structure). Then you can calculate the weighted average or whatever of each cluster, just like we do now.
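Here's a sketch of what the whole labeling pass might look like, building on the `uf_*` helpers above. The image layout (one byte per pixel, row-major) and all names here are assumptions, not the actual codebase:

```c
#include <stdint.h>

/* Single top-to-bottom pass over a binary image, using the uf_*
 * helpers from the sketch above. img, W, H, and labels are all
 * illustrative names; bounds checking on next_label is omitted. */
void label_pass(const uint8_t *img, int W, int H, int *labels) {
    int next_label = 1; /* 0 means background */
    uf_init();
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            int i = y * W + x;
            labels[i] = 0;
            if (!img[i]) continue;       /* skip "off" pixels */
            int left = (x > 0) ? labels[i - 1] : 0;
            int up   = (y > 0) ? labels[i - W] : 0;
            if (!left && !up) {
                labels[i] = next_label++;      /* start a new centroid */
            } else if (left && up && left != up) {
                uf_merge(left, up);            /* bridge two centroids */
                labels[i] = uf_find(left);
            } else {
                labels[i] = left ? left : up;  /* extend the neighbor */
            }
        }
    }
    /* Final sweep: collapse every label to its canonical ID. */
    for (int i = 0; i < W * H; i++)
        if (labels[i]) labels[i] = uf_find(labels[i]);
}
```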
A further improvement is to not even store the full list of pixels for each cluster. Instead, we can just store the weighted averages, and the total weight for each one. That's enough information to merge two weighted averages together.
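A minimal sketch of that merge, assuming we track the weighted mean position and the total weight per cluster (the struct and function names are invented):

```c
/* Running accumulator per cluster: the weighted mean position plus
 * the total weight is enough to merge two clusters exactly, with no
 * per-pixel storage. All names here are invented for the sketch. */
typedef struct {
    float cx, cy;  /* weighted mean position */
    float weight;  /* total weight, e.g. summed pixel brightness */
} Centroid;

/* Fold cluster b into cluster a: the merged mean is the
 * weight-weighted average of the two means. */
void centroid_merge(Centroid *a, const Centroid *b) {
    float w = a->weight + b->weight;
    if (w == 0.0f) return; /* both empty; nothing to merge */
    a->cx = (a->cx * a->weight + b->cx * b->weight) / w;
    a->cy = (a->cy * a->weight + b->cy * b->weight) / w;
    a->weight = w;
}
```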