Train a starter model for Sentinel in Mexico #34
Model Training Strategy

We are labeling the segmentation masks from scratch, and given the complexity of differentiating between the classes of interest to RM, it is taking us quite some time to generate the chips. Within the allocated budget of ~125 hours, we can generate approximately 1,000 chips of size 256x256 as ground truth for our model. This is not sufficient to build a decent segmentation model for all eight corridors. As a workaround, we are trying a weakly supervised training approach, in this case:
After we have a model that is pre-trained with weakly supervised labels, we can then fine-tune it on chips generated by our data team, i.e. chips that are more precise and designed for Sentinel imagery.

Data Distribution

0: "other",
1: "Bosque" (forest),
2: "Selvas" (tropical forest),
3: "Pastos" (pasture),
4: "Agricultura" (agriculture),
5: "Urbano" (urban),
6: "Sin vegetación aparente" (no apparent vegetation),
7: "Agua" (water),
8: "Matorral" (shrubland),
9: "Suelo desnudo" (bare soil),
10: "Plantaciones" (plantations),
11: "Otras coberturas" (other cover types),
12: "Vegetación caducifolia" (deciduous vegetation)

The numbers in the diagram represent the number of pixels for each LULC class in that particular corridor. As the figure shows, there is severe class imbalance across all the corridors.

A few things to consider while training:
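One standard way to handle class imbalance like this is to weight the loss by inverse class frequency. The snippet below is a minimal sketch of that idea; the pixel counts are made-up placeholders, not the real corridor statistics from the diagram.

```python
# Sketch: inverse-frequency class weights for a weighted cross-entropy loss.
# The pixel counts below are illustrative placeholders, NOT the real corridor stats.
pixel_counts = {
    0: 5_000_000,   # other
    1: 1_200_000,   # Bosque
    2: 800_000,     # Selvas
    3: 2_500_000,   # Pastos
    4: 1_800_000,   # Agricultura
    5: 300_000,     # Urbano
    6: 150_000,     # Sin vegetación aparente
    7: 90_000,      # Agua
    8: 600_000,     # Matorral
    9: 70_000,      # Suelo desnudo
    10: 40_000,     # Plantaciones
    11: 25_000,     # Otras coberturas
    12: 110_000,    # Vegetación caducifolia
}

def inverse_frequency_weights(counts):
    """Return per-class weights proportional to 1/frequency, normalized to mean 1."""
    total = sum(counts.values())
    raw = {c: total / n for c, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {c: w / mean for c, w in raw.items()}

weights = inverse_frequency_weights(pixel_counts)
# Rare classes (e.g. "Otras coberturas") receive the largest weights.
```

The resulting weight vector can then be passed to a weighted loss, e.g. `torch.nn.CrossEntropyLoss(weight=...)`.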
Initial PEARL Model for Reforestamos

The PEARL models for NAIP imagery were built on top of PyTorch and used segmentation architectures such as UNet, FCN, and DeepLab. I am building the baseline model using PyTorch and PyTorch Lightning, which takes care of both the science and engineering sides of things: we write less boilerplate, and features such as model checkpointing, logging of loss curves, metrics, etc. come for free. We can also easily scale the model to run on single or multiple CPUs/GPUs/TPUs without any additional effort.

Update as of 30 Jan 2023

We have a segmentation model that is trained on a single corridor with weakly supervised labels coming from the RM team.
Here are some sample results, shown as a table with columns: Image (color corrected), Ground Truth Mask, Predicted Mask, Image overlaid with mask.
Model Update - 13-03-23

We have a baseline model, a DeepLabv3+ with a timm-efficientnet-b5 backbone, which has a weighted F1 score of 0.78 and is currently deployed as "Mexico LULC pre alpha" in the PEARL backend. This model also handles the issues mentioned in #47 by using color-based augmentations.

Issues with the current baseline model
A few ways to handle this:
Next steps in order of priority
@developmentseed/pearl
@srmsoumya What are your thoughts about closing this ticket? I think we managed to achieve most of what you outlined as improvements. We can revise/reopen based on feedback.
For our Sentinel release, we'll create a starter model based on priority AOIs for Reforestamos.