Feature Request: Automatic Good/Bad Image Filter #171
Comments
I had some comments that relate to this issue here: #168 (comment). In summary, I think I prefer a less aggressive approach to filtering images than what you (we) propose here. Even though I did suggest it as a potential solution, on reflection, and having looked at the PSNR and RMSE scores of a lot of imagery from different locations this week, I think it would be very hard to determine a threshold that worked well. I think we would end up throwing out a lot of good images and keeping a lot of bad images, unless we tweaked the PSNR (or RMSE) threshold quite a bit for each site. Additionally, it would not be useful for short time-series, because it relies on a stable average image that would ideally be drawn from at least several tens of relatively good quality images. Those can sometimes be hard to come by, for example when limited to Landsat only, or to short time periods. Instead, I think I would prefer the following:
The downside is that we have to compute the label for a lot of bad images, but we can devote some time to speeding up the model calls, which is tracked in Doodleverse/doodleverse_utils#31. Also, it still adds xarray and rioxarray as dependencies. We can discuss.
Hi Dan, I forgot to update this issue after I tested the
However, I do think there are some good ideas in the original workflow #154 (comment) for filtering out glitchy images. I use the term 'glitch' to refer to sensor errors. They typically involve a completely different colorspace; I saw some examples at a site I was looking at today. (I am pulling lots of examples of different types of noise together to form the basis of an ML training data set. Yes, a new attempt!)
I think I will work on researching a new type of filter that uses ideas from that workflow. After some testing, we can see whether it should be included in coastseg. So I am proposing we still implement this idea, but as a low-key filter that detects the really rare glitches. I would do this by finding the dominant colors of each image and throwing the image out if they fall in a certain range.
Another idea I had to adapt this workflow was to throw out images smaller than the requested ROI. In this scope, xarray would be useful with dask to speed up reading the shape of each image. Partial images are a common thing: yesterday I had 178 partial images out of a total of 920, or about one in five!
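To make the two low-key filters above more concrete, here is a minimal sketch using plain NumPy. It is not the workflow from #154; the function names, the `tolerance` value, and the `max_fraction` threshold are all hypothetical, and "dominant color" is simplified here to the share of total intensity held by a single RGB channel.

```python
import numpy as np


def is_partial(image_shape, expected_shape, tolerance=0.98):
    """Return True if the image covers less than `tolerance` of the
    pixel area expected for the requested ROI (hypothetical threshold)."""
    rows, cols = image_shape[:2]
    exp_rows, exp_cols = expected_shape[:2]
    return (rows * cols) < tolerance * (exp_rows * exp_cols)


def channel_fractions(rgb):
    """Share of the total image intensity carried by each of R, G, B."""
    rgb = np.asarray(rgb, dtype=np.float64)
    totals = rgb.reshape(-1, 3).sum(axis=0)
    grand_total = totals.sum()
    if grand_total == 0:
        # An all-black image has no dominant channel.
        return np.zeros(3)
    return totals / grand_total


def looks_glitchy(rgb, max_fraction=0.6):
    """Flag an image when one channel dominates the others.

    `max_fraction` is an assumed value and would need tuning on real
    examples of sensor glitches.
    """
    return bool(channel_fractions(rgb).max() > max_fraction)
```

Only the image shape is needed for the partial-image check, so it could be run before any pixel data is read. Whether a channel-share test is enough to catch real glitches is something the testing would have to show.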
Good Bad Image Filter
Description:
Users need a way to automatically get rid of images that are not usable.
Proposed Solution:
Using rioxarray, we can create a dataset of all the downloaded images. Then, by utilizing the time-averaged images, the RMSE (Root Mean Squared Error) and PSNR (Peak Signal to Noise Ratio) for each image can be determined. Good images are characterized by a low RMSE (indicating that the pixel values don't differ much from the time-averaged image) and a high PSNR (measuring how much the image differs from the time-averaged image, with a higher value indicating a better quality image).
Benefits:
Drawbacks:
xarray as a dependency.
rioxarray as a dependency.
Additional Context:
Checklist:
rioxarray.
Peak Signal to Noise Ratio Explanation
PSNR stands for Peak Signal-to-Noise Ratio. It's a metric used primarily in the fields of image and video processing to measure the quality of a reconstructed or compressed image (or video) as compared to the original one. Essentially, it quantifies how much the reconstructed image differs from the original image. The higher the PSNR, the closer the reconstructed image is to the original, and hence the better the quality.
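For reference, the standard definition is PSNR = 10 * log10(MAX^2 / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images, so RMSE and PSNR always rank images in opposite order. Below is a minimal sketch of how the per-image scores in the proposed solution could be computed against the time-averaged image. It uses plain NumPy rather than rioxarray to stay self-contained, and the function name and the `max_value=255.0` default (8-bit imagery) are assumptions.

```python
import numpy as np


def rmse_psnr_against_mean(stack, max_value=255.0):
    """Score every image in a time series against the time-averaged image.

    stack: array shaped (time, rows, cols) or (time, rows, cols, bands),
           with all images already on the same grid.
    Returns (rmse, psnr), each with one value per time step.
    """
    stack = np.asarray(stack, dtype=np.float64)
    mean_image = stack.mean(axis=0)            # time-averaged image
    diff = stack - mean_image                  # broadcast over the time axis
    pixel_axes = tuple(range(1, stack.ndim))   # every axis except time
    mse = (diff ** 2).mean(axis=pixel_axes)
    rmse = np.sqrt(mse)
    # An image identical to the mean has MSE 0 and therefore infinite PSNR.
    with np.errstate(divide="ignore"):
        psnr = 10.0 * np.log10(max_value ** 2 / mse)
    return rmse, psnr


# Three tiny 2x2 images: the third differs most from the average.
stack = np.stack([
    np.full((2, 2), 100.0),
    np.full((2, 2), 100.0),
    np.full((2, 2), 130.0),
])
rmse, psnr = rmse_psnr_against_mean(stack)
```

Any cut-off applied to these scores would still have to be chosen per site, and the average is only meaningful when it is built from enough reasonably good images.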