Plant RNA-Image Repository.zip (275.3 MB)

Plant RNA-Image Repository

dataset

posted on 2023-11-19, 06:02 authored by Muhammad Shoaib

Together with the Agriculture University of Peshawar, we compiled a database of plant images and omics data. The dataset contains images of four distinct plant diseases (powdery mildew, rust, leaf spot, and blight) together with gene expression and metabolite data.

Using a high-resolution camera in a controlled environment at the university's facility, we captured 8,000 images of plants, 2,000 for each disease type. Each image was labeled with its disease type. The images were preprocessed by resizing them to 224x224 pixels and standardizing the pixel values. The dataset was divided into training, validation, and testing sets in a 70:15:15 ratio, respectively.

In addition to the images, we collected gene expression and metabolite data from the same plants. We extracted RNA from the plant leaves using a commercial reagent and sequenced it on an Illumina HiSeq 4000 platform. Sequencing yielded 100 million paired-end reads with an average length of 150 base pairs. The raw reads were trimmed with Trimmomatic and aligned against the reference genome with STAR. We counted the reads mapped to each gene using featureCounts, then identified differentially expressed genes between healthy and diseased plants using the DESeq2 package in R.

Metabolite data were gathered by gas chromatography-mass spectrometry (GC-MS). We extracted metabolites from the plant leaves using a methanol-water extraction protocol and analyzed the extracts by GC-MS, obtaining 500 metabolite features, including amino acids, organic acids, and sugars.
If you use this dataset, please credit the researchers by citing their paper, 'Deep Learning for Plant Bioinformatics: An Explainable Gradient-Based Approach for Disease Detection.'
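
For readers who want to reproduce a comparable image split, the following is a minimal sketch of a 70:15:15 partition and of pixel standardization. It is not the authors' code. The class label strings, the per-class (stratified) splitting, the fixed seed, and the choice to standardize each image on its own are all assumptions made for illustration; the description above states only the ratio and that pixel values were standardized.

```python
import random
from collections import defaultdict

# Hypothetical label names; the archive's actual folder or label names may differ.
CLASSES = ["powdery_mildew", "rust", "leaf_spot", "blight"]


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists, splitting each class separately.

    Assumption: the split is stratified by disease type. The source only
    gives the overall 70:15:15 ratio.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def standardize(pixels):
    """Scale a flat list of pixel values to zero mean and unit variance.

    Assumption: per-image standardization; dataset-wide statistics are
    an equally plausible reading of the description.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = var ** 0.5 or 1.0  # avoid dividing by zero on a constant image
    return [(p - mean) / std for p in pixels]


# 2,000 images per disease type, 8,000 in total, as in the description.
labels = [c for c in CLASSES for _ in range(2000)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 5600 1200 1200
```

With 2,000 images per class, this yields 1,400 training, 300 validation, and 300 testing images for each disease type.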

Reference

Shoaib, M., Shah, B., Sayed, N., Ali, F., Ullah, R., & Hussain, I. (2023). Deep learning for plant bioinformatics: an explainable gradient-based approach for disease detection. Frontiers in Plant Science, 14(October), 1–17. https://doi.org/10.3389/fpls.2023.1283235

Keywords

plant bioinformatics OMICS biology

Licence

CC BY 4.0