<p dir="ltr"><b>physioDL: </b>A dataset for geomorphic deep learning that represents a scene classification task (predict the physiographic region in which a hillshade occurs).</p><p dir="ltr"><b>Purpose: </b>These datasets support geomorphic deep learning; the task is to predict the physiographic region of an area from a hillshade image. Terrain data were derived from the 30 m (1 arc-second) 3DEP product across the entirety of CONUS. Each chip has a spatial resolution of 30 m and 256 rows and columns of pixels, so each chip measures 7,680 m by 7,680 m. Two datasets are provided: chips in the hs folder represent a multidirectional hillshade, while chips in the ths folder represent a tinted multidirectional hillshade. Data are stored as 8-bit integers (0 to 255 scale). Data are projected to the Web Mercator projection relative to the WGS84 datum. Data were split into training, test, and validation partitions using stratified random sampling by region: 70% of the samples per region were selected for training, 15% for testing, and 15% for validation. There are a total of 16,325 chips. The following 22 physiographic regions are represented: "ADIRONDACK", "APPALACHIAN PLATEAUS", "BASIN AND RANGE", "BLUE RIDGE", "CASCADE-SIERRA MOUNTAINS", "CENTRAL LOWLAND", "COASTAL PLAIN", "COLORADO PLATEAUS", "COLUMBIA PLATEAU", "GREAT PLAINS", "INTERIOR LOW PLATEAUS", "MIDDLE ROCKY MOUNTAINS", "NEW ENGLAND", "NORTHERN ROCKY MOUNTAINS", "OUACHITA", "OZARK PLATEAUS", "PACIFIC BORDER", "PIEDMONT", "SOUTHERN ROCKY MOUNTAINS", "SUPERIOR UPLAND", "VALLEY AND RIDGE", and "WYOMING BASIN". Input digital terrain models and hillshades are not provided due to the large file size (> 100 GB). 
</p><p dir="ltr"><b>Files</b></p><p dir="ltr"><u>physioDL.csv</u>: Table listing all image chips and their associated physiographic regions (id = unique ID for each chip; region = physiographic region; fnameHS = file name of associated chip in hs folder; fnameTHS = file name of associated chip in ths folder; set = data split: train, test, or validation).</p><p><u>chipCounts.csv</u>: Number of chips in each data partition per physiographic province.</p><p><u>map.png</u>: Map of data.</p><p><u>makeChips.R</u>: R script used to process the data into image chips and create the CSV files.</p><p><b>inputVectors</b></p><p><u>chipBounds.shp</u>: square extent of each chip</p><p dir="ltr"><u>chipCenters.shp</u>: center coordinate of each chip</p><p dir="ltr"><u>provinces.shp</u>: physiographic provinces</p><p dir="ltr"><u>provinces10km.shp</u>: physiographic provinces with a 10 km negative buffer</p>
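<p dir="ltr">The chip footprint and the 70/15/15 stratified split described above can be illustrated with a minimal Python sketch. This is not the procedure in makeChips.R (which is written in R); the function names, the random seed, and the rounding rule are assumptions made for illustration, and the rows are synthetic stand-ins that use the id, region, and set column names from physioDL.csv.</p>

```python
import random
from collections import Counter, defaultdict

# Chip footprint: 256 pixels per side at 30 m per pixel.
CHIP_PIXELS = 256
CELL_SIZE_M = 30
CHIP_SIDE_M = CHIP_PIXELS * CELL_SIZE_M  # 7680 m per side


def stratified_split(rows, train=0.70, test=0.15, seed=42):
    """Label each row 'train', 'test', or 'validation', stratified by region.

    Hypothetical helper: the seed and the use of round() are assumptions,
    not details taken from the dataset's own processing script.
    """
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for row in rows:
        by_region[row["region"]].append(row)

    labeled = []
    for region in sorted(by_region):
        chips = list(by_region[region])
        rng.shuffle(chips)  # random sampling within each region
        n_train = round(len(chips) * train)
        n_test = round(len(chips) * test)
        for i, chip in enumerate(chips):
            if i < n_train:
                label = "train"
            elif i < n_train + n_test:
                label = "test"
            else:
                label = "validation"  # remainder goes to validation
            labeled.append({**chip, "set": label})
    return labeled


def chip_counts(rows):
    """Count chips per (region, set) pair, similar in spirit to chipCounts.csv."""
    return Counter((row["region"], row["set"]) for row in rows)


# Synthetic demo: 20 chips in each of two regions (not real dataset rows).
demo_rows = [
    {"id": str(i), "region": region}
    for i, region in enumerate(["BLUE RIDGE"] * 20 + ["PIEDMONT"] * 20)
]
counts = chip_counts(stratified_split(demo_rows))
print(CHIP_SIDE_M)
for key in sorted(counts):
    print(key, counts[key])
```

<p dir="ltr">With the real table, the same counting step could be applied to rows read from physioDL.csv with Python's csv.DictReader, since that file already carries the set column.</p>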
Funding
National Science Foundation (Federal Award ID No. 2046059: “CAREER: Mapping Anthropocene Geomorphology with Deep Learning, Big Data Spatial Analytics, and LiDAR”)