synthetic_data_mimicking_CA_and_AZ.zip (296.2 MB)
Synthetic Overhead Images of Wind Turbines Made to Mimic California and Arizona
Overview
This is a set of synthetic overhead imagery of wind turbines created with CityEngine and made to qualitatively match overhead images of wind turbines from California and Arizona. Corresponding labels provide the class, x and y coordinates, and height and width (YOLOv3 format) of the ground-truth bounding box for each wind turbine in the images. Labels are named to match their images (e.g. wnd_xview_bkg_sd0_1.png has the label wnd_xview_bkg_sd0_1.txt). The images and labels are both contained in syn_CA_AZ_xview_bkg_shdw_scatter_uniform_50_wnd_v1.
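Since the labels use the YOLOv3 convention, a minimal sketch of reading one label file may be useful. The helper name is hypothetical, and the assumption that coordinates are normalized to [0, 1] follows the standard YOLO convention; the 608x608 image size comes from the Method section below.

```python
# Hypothetical helper for reading one YOLOv3-format label file.
# Assumes the standard YOLO convention: one line per box, holding a
# class id followed by center-x, center-y, width, and height, all
# normalized to [0, 1]. The 608x608 default matches this dataset.

def parse_yolo_labels(label_path, img_w=608, img_h=608):
    """Return (class_id, x_min, y_min, x_max, y_max) tuples in pixels."""
    boxes = []
    with open(label_path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 5:
                continue  # skip blank or malformed lines
            cls = int(parts[0])
            xc, yc, w, h = (float(p) for p in parts[1:])
            boxes.append((
                cls,
                (xc - w / 2) * img_w,   # x_min
                (yc - h / 2) * img_h,   # y_min
                (xc + w / 2) * img_w,   # x_max
                (yc + h / 2) * img_h,   # y_max
            ))
    return boxes
```

Each returned tuple is a pixel-space corner box, which is often the more convenient form for plotting ground truth over the imagery.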
Use
This dataset is meant to supplement the training data for an object detection model on overhead images of wind turbines. Adding it to the training set can potentially improve the model's performance, especially when the model is tested on small wind turbines or in desert regions.
Why
This dataset was created to study the use of synthetic imagery for cross-domain testing (training on one geographic region and then testing on a very different one). When training a YOLOv3 model on the Overhead Imagery of Wind Turbines dataset, we noticed qualitatively that the model struggled much more on the images from California and Arizona. These states contain many small wind turbines, which occupy few pixels and therefore give the model little information to work with; often the only human-noticeable cue is the turbine's shadow. This dataset was designed to improve the model's performance on these regions and this type of turbine. In our experiment, the baseline training set (real overhead images from the Overhead Imagery of Wind Turbines dataset) contained all of the images not from California and Arizona, and the testing set contained all of the images from California and Arizona. The model was trained and evaluated, then this synthetic imagery was added to the training set and performance was evaluated again on the same California and Arizona testing set.
Method
The process for creating the dataset involved selecting background images from https://figshare.com/articles/dataset/Power_Plant_Satellite_Imagery_Dataset/5307364 that were not contained in the Overhead Imagery of Wind Turbines dataset and did not contain infrastructure that would make the scene look unrealistic. A script then selected these backgrounds at random, uniformly generated 3D models of both small and large wind turbines over each image, and positioned the virtual camera to save four 608x608-pixel images. This process was repeated with the same random seed, but with no background image and the wind turbines colored black. Finally, these black-and-white images were converted into ground-truth labels by grouping the black pixels in the images.
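The final step above (grouping black pixels into boxes) can be sketched as a connected-components pass over the mask. This is an illustrative reconstruction, not the authors' actual script; the function name, the 4-connected BFS, and the single-class assumption are my own choices.

```python
# Sketch of the mask-to-label step: given a binary mask where truthy
# values mark turbine (black) pixels, group 4-connected pixels into
# components and emit one YOLO-style (class, x_center, y_center, w, h)
# tuple per component, normalized by the image size. Illustrative
# reconstruction only; not the dataset authors' exact script.
from collections import deque

def mask_to_yolo_boxes(mask, cls=0):
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # BFS over one 4-connected component, tracking its extent
                q = deque([(y, x)])
                seen[y][x] = True
                y0 = y1 = y
                x0 = x1 = x
                while q:
                    cy, cx = q.popleft()
                    y0, y1 = min(y0, cy), max(y1, cy)
                    x0, x1 = min(x0, cx), max(x1, cx)
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
                # convert the pixel extent to normalized YOLO center/size
                bw, bh = (x1 - x0 + 1) / w, (y1 - y0 + 1) / h
                xc, yc = (x0 + x1 + 1) / 2 / w, (y0 + y1 + 1) / 2 / h
                boxes.append((cls, xc, yc, bw, bh))
    return boxes
```

In practice a library routine such as a connected-component labeler would replace the hand-rolled BFS, but the idea is the same: each cluster of turbine pixels becomes one bounding box.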