figshare

MAKVEN: A Visible Spectral Dataset for Colour Science and Machine Learning

Download (9.25 MB)
dataset
posted on 2025-10-30, 00:43 authored by Marcio Mello
<p dir="ltr">MAKVEN is a spectral dataset covering the visible range (380–780 nm, sampled at 5 nm intervals). It integrates both measured and reconstructed spectra and is designed to provide efficient coverage of the CIE XYZ colour space for applications in colour science, bioinformatics, and machine learning.</p>
<p dir="ltr">The dataset was created to address limitations of existing collections. Classical references such as the Munsell Matt (1269 spectra) and the Macbeth ColorChecker (24 spectra) provide calibration anchors but do not cover the entire colour space. Natural datasets (e.g., Southern Cone, 916 spectra) contribute ecological diversity, while hyperspectral imagery (Foster et al., 2002, Scene 5) adds natural variability but is strongly redundant at the pixel level. To complement these resources, synthetic and reconstructed spectra were generated to fill sparsely represented areas of the colour space, based on spectral reconstruction from a CIELAB grid (Kang, 2006).</p>
<p dir="ltr">All spectra were interpolated to a common support (380–780 nm in 5 nm steps, i.e. 81 samples per spectrum). The metadata indicate the origin of each spectrum, with the following distribution:</p>
<ul><li><b>synthetic:</b> 7395 spectra</li><li><b>reconst:</b> 1105 spectra</li><li><b>refs/natural:</b> 916 spectra</li><li><b>natural:</b> 3118 spectra</li><li><b>hiper/natural:</b> 1124 spectra</li></ul>
<p dir="ltr">This combination provides a reproducible, efficient, and accessible resource for training and evaluating models in colourimetric prediction, luminous transmittance estimation, and portable sensor design. Users are free to define training and testing splits according to their research goals (e.g., training on reconstructed spectra and validating on measured spectra).</p>
<p dir="ltr">The dataset is released under a Creative Commons Attribution 4.0 International (CC-BY 4.0) license.</p>
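
The two operations mentioned in the description, resampling a spectrum onto the common 380–780 nm / 5 nm support and splitting the data by origin label, might be sketched as follows. This is only an illustration under stated assumptions: linear interpolation, the endpoint clamping, the function names `resample` and `split_by_origin`, the `origin` field name, and the toy spectrum are all hypothetical and are not taken from the dataset files or from the author's pipeline.

```python
from bisect import bisect_right

# Common support described in the text: 380-780 nm in 5 nm steps (81 samples).
GRID_NM = list(range(380, 781, 5))


def resample(wavelengths, values, grid=GRID_NM):
    """Linearly interpolate one spectrum onto `grid`.

    `wavelengths` must be strictly increasing. Grid points outside the
    measured range are clamped to the nearest endpoint value (an assumption;
    the dataset description does not say how edges were handled).
    """
    out = []
    for w in grid:
        if w <= wavelengths[0]:
            out.append(values[0])
        elif w >= wavelengths[-1]:
            out.append(values[-1])
        else:
            # Index of the first measured wavelength strictly greater than w.
            i = bisect_right(wavelengths, w)
            w0, w1 = wavelengths[i - 1], wavelengths[i]
            t = (w - w0) / (w1 - w0)
            out.append(values[i - 1] + t * (values[i] - values[i - 1]))
    return out


def split_by_origin(records, train_origins):
    """Partition records into (train, test) by their origin label."""
    train = [r for r in records if r["origin"] in train_origins]
    test = [r for r in records if r["origin"] not in train_origins]
    return train, test


# Toy spectrum sampled every 10 nm from 400 to 700 nm (made-up values).
wl = list(range(400, 701, 10))
refl = [0.1 + 0.001 * (w - 400) for w in wl]
resampled = resample(wl, refl)

# Hypothetical records; the labels mirror the origin categories listed above.
records = [
    {"origin": "reconst", "spectrum": resampled},
    {"origin": "natural", "spectrum": resampled},
    {"origin": "synthetic", "spectrum": resampled},
]
train, test = split_by_origin(records, {"reconst", "synthetic"})
```

Here the generated spectra go into the training set and the measured one is held out, which mirrors the example split suggested in the description; any other grouping of origin labels works the same way.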
