Skip to content

GloSoFarID: Global multispectral dataset for Solar Farm IDentification in satellite imagery

Notifications You must be signed in to change notification settings

yzyly1992/GloSoFarID

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 

Repository files navigation

GloSoFarID: Global multispectral dataset for Solar Farm IDentification in satellite imagery

Zhiyuan Yang, Ryan Rad

Download the dataset from https://zenodo.org/records/10939100
Dataset paper published in arXiv:2404.05180

This dataset comprises a total of 13,703 samples of multispectral satellite images captured across continents from 2021 to 2023 in various solar panel farms. The detailed metadata for this dataset is provided in the following table:

Dataset Positive Sample Negative Sample Sum
2023 US 706 969 1675
2022 US 723 968 1691
2023 Global 1774 1689 3463
2022 Global 1803 1655 3458
2021 Global 1787 1629 3416
Total 6793 6910 13703

Input sample dimension (H, W, C): (256, 256, 13) Mask label dimension (H, W, C): (256, 256, 1) Dataset resolution: 10 meters / pixel Dataset spectral names: [B1, B2, B3, B4, B5, B6, B7, B8, B8A, B9, B10, B11, B12] Dataset spectral detail information: Sentinel-2 Bands Info Label classes: {0: "background", 1: "solar_panel"}

Dataset Previews

Examples from our GloSoFarID dataset. Each row represents one sample with all 13 bands along with the ground truth mask where the gray color indicates the solar farm area. showcase

Benchmarks

We trained the FCN, Half-UNet and UNet models on this dataset for 50 epochs with a batch size of 16. The following presents the outcomes:

Model IoU FScore
FCN 71.81 82.87
Half-UNet 71.48 82.17
UNet 79.31 87.80

How to Use this Dataset

Tensorflow

For TensorFlow users, you can load the dataset using import load_solar_dataset from dataset_converter, and execute tf_data = load_solar_dataset() to obtain a dataset object with the tf.data.Dataset class.

Convert to Images

If you are working with frameworks other than TensorFlow, you may wish to convert the dataset into images for use as input data. By running the command python dataset_converter.py, the original dataset can be transformed into multispectral .tif images and .png mask annotations.

The dataset_converter supports the following optional arguments:

  • --path, type=str, default=".", help='path to the dataset folder'
  • --file_name, type=str, default=None, help='name of a specific tfrecord.gz file to convert, if not specified, convert all dataset files in the path'
  • --include_jpg, type=bool, default=False, help='include jpg previews when converting'
Requirements

In order to execute dataset_converter.py, ensure that the following libraries are installed in your environment: pip install tensorflow pillow numpy tifffile

Note

After converting the dataset, the .tif files will take about 24 GB space. Please make sure you have enough space on your disk before runing convertion

About

GloSoFarID: Global multispectral dataset for Solar Farm IDentification in satellite imagery

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages