-
Notifications
You must be signed in to change notification settings - Fork 334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change DEMs from mask to image (is_image=True) #1776
Conversation
@microsoft-github-policy-service agree |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally agree with this distinction. DEMs and weather data are more likely to be used as inputs to a model than as outputs.
@@ -473,7 +473,7 @@ def plot( | |||
# Add masks | |||
if show_feats in {"masks", "both"} and "masks" in sample: | |||
mask = masks[i] | |||
contours = find_contours(mask, 0.5) # type: ignore[no-untyped-call] | |||
contours = find_contours(mask, 0.5) # _ type: ignore[no-untyped-call] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was just trying to get it to pass validation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should pass fine in CI without modification. It may not pass locally if you don't have pyvista installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. I'll revert the change and install pyvista locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, I still get this error from mypy when I run the suggested validation checks: torchgeo/datasets/vhr10.py:476: error: Unused "type: ignore" comment [unused-ignore]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That suggests you still don't have pyvista installed in the same environment where mypy is installed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, this is blocking my commit.
mypy.....................................................................Failed
- hook id: mypy
- exit code: 1
torchgeo/datasets/vhr10.py:476: error: Unused "type: ignore" comment [unused-ignore]
torchgeo/datasets/vhr10.py:528: error: Unused "type: ignore" comment [unused-ignore]
Found 2 errors in 1 file (checked 292 source files)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why I don't use precommit 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can fix this by adding pyvista to precommit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by #1781. You can undo this change now.
Co-authored-by: Adam J. Stewart <[email protected]>
There's an entire area of research for height estimation of a DSM from an image. Weather data is also commonly used as a target. I think this may be too dependent on the user. There's nothing stopping anyone from subclassing the base dataset and changing |
I think the underlying issue is that we are using "mask" instead of "target" as the dict keys. So maybe we should be modifying the attribute to be |
I like this idea better (e.g. in autoencoders you might even want an image to be a target). |
I like the idea of is_target as well. However, I would like to draw attention to the |
I think we actually had a an open issue (maybe even a PR?) to fix the forcing of the dtype based on the is_image attr but I'll have to do dig for it. |
I'm willing to look for it and do whatever fixes are decided upon, if y'all are cool with it. |
I found PR #1149 that references the As far as I can tell from reading the various linked PRs, So, I would suggest:
|
I would prefer to keep the naming scheme matching with Kornia for now, at least until we decide to switch from Kornia to Torchvision transforms v2. We actually went out of our way to call it |
This is simply the default behavior. If you would like to change this in your dataset, you can add: dtype = torch.float32 |
Going back through old PRs now. It seems like we still haven't reached a consensus on this PR. Is the original motivation for this PR related to the default dtype behavior (which is easy to override) or the name itself (which might be confusing)? Or is there some fundamental reason why this change is necessary? |
To me, the issue is more in terms of documentation. I would just like to see more clarification about what is the purpose of this property and its relationship to the The real answer to what needs to be done lies in the answer to, "What is the purpose of What's the impact of setting Why does the default behavior of While the EU DEM v1.1 is no longer available, I have a copy of the entire dataset. It's natively Float32 GeoTIFFs, not integers, yet the The ASTER DEM data is still available for download, and the TorchGeo Dataset is fine how it is, since it's delivered as 16-bit signed integers. Does this make it any clearer? BTW, FWIW, next time, I'll start this off as an issue instead of just doing a pull request. |
Oh, and I'm willing to make the changes, whatever you guys decide as the maintainers. |
Thanks for the clarification. Let me try to clarify how I envision the is_imageControls two different things. The first is the name of the key that we use in the sample returned by The second is to change the default value of dtypeDefaults to float32 for Kornia and torchmetrics are picky about dtypes, and unfortunately disagree about these. If I recall (correct me if I'm wrong), Kornia augmentations require ALL inputs to be floats, while torchmetrics metrics require all inputs to be long. We have some hacky code scattered about to handle this so that users don't need to think about this. |
Thanks! I'll make the changes to reflect that and will let you know when they're done. |
I've updated the code to reflect what you wrote, @adamjstewart. I pretty much used what you wrote but tweaked the wording a little bit for different contexts. The asterdem dataset's |
torchgeo/datasets/geo.py
Outdated
#: purposes the same names as Kornia are used. When multiple datasets with different | ||
#: keys are combined and the same key is used for multiple datasets, for example 2 | ||
#: "image" and 1 "mask", the channels will be stacked so that there's still a single | ||
#: value for that key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can make this more succinct like the other attributes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work?
Why is one DEM long and the other is float32? To further clarify, P.S. Did you mean to close this PR? |
OK, I understand what you're saying now, and I've mad both DEMs float32. I am, however, a little confused about what case you meant when you say "integer images" in the following:
I just want to make sure the documentation is as clear as possible, and I think what I have for |
I guess I can't really think of an application that requires integer images as input. Let's just remove that part of the sentence. |
Ok, thanks for the quick response. I've removed that part of the sentence and committed the changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think most of the documentation changes are uncontroversial. The change from mask to image is more controversial, and may be unwanted for many users. I suggest splitting this into separate PRs so we can merge the first sooner and the latter later.
I changed the documentation table to "DEM". I'll separate the documentation changes to a new, separate PR. |
The documentation changes have been submitted as PR #1811. |
The documentation for the
datasets.RasterDataset.is_image
property is vague and has not clearly stated what one should do when it comes to a field data model (raster surface data). In the context of Computer Vision and Deep Learning, raster surfaces should be considered "images" and not "masks" because they are not a segmentation mask where pixels have been classified but are instead a range of values akin to DNs in an image.In the case of raster surfaces, like DEMs, Digital Surface Models (DSMs), Digital Terrain Models (DTMs), or even another raster surface, such as temperature, atmospheric pressure, soil moisture content, etc. the
is_image
property should be set to True and treated like an image when the class inherits fromdatasets.RasterDataset
.The ASTER and EU Digital Elevation Model datasets have previously been set as masks instead of images when they are not segmentation masks but raster surfaces. This pull request clarifies the documentation for
RasterDataset.is_image
and rectifies the Aster Global DEM and EU-DEM datasets.