
results_Expert (Ex) and Non-Expert (NE) raters' scores and Inter-Rater Agreement (IRA)_Highest agreement among pairs

Dataset posted on 2025-06-22, 07:00, authored by Dimitrije Curcic, Teema Muekthong, Petar Mirković, Jasna Gulan Ružić

This pilot study examines the reliability and clarity of a rubric-based framework for assessing digital freehand drawing in higher education. Two rubric types—a Generic Visual Rubric (GVR) and an Exemplar-Based Visual Rubric (EVR)—were applied to 22 student drawings by expert (n = 4) and non-expert (n = 6) raters. Abductively derived criteria grounded in art education and studio practice guided the rubric design. Inter-rater agreement (IRA) was measured using percent agreement, Spearman's rank correlation, and Cronbach's alpha. While agreement remained modest on some criteria (e.g., line sensitivity), the EVR condition showed greater consistency across rater groups. Both Spearman's ρ and Cronbach's α indicated improved internal alignment, particularly with EVR. Survey responses further confirmed the rubric's usability and diagnostic value. Results highlight the potential of illustrated, task-specific rubrics to support formative assessment and rater calibration—especially for non-experts—and emphasize the importance of pilot testing in developing reliable evaluation tools for visual disciplines.

Both rater groups awarded higher overall scores under EVR, but experts benefited most: their score dispersion contracted sharply, and their inter-rater agreement increased by more than ten percentage points. Non-experts displayed the same direction of change, though gains were more modest.
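For readers who want to recompute the agreement statistics named above from the deposited score tables, the sketch below shows one way to obtain pairwise percent agreement, Spearman's ρ, and Cronbach's α from a raters-by-drawings score matrix. It is a minimal illustration rather than the authors' analysis script: the example scores, variable names, and the treatment of raters as the "items" for Cronbach's α are assumptions.

```python
# Minimal sketch (hypothetical data) of the three agreement statistics
# described in the dataset abstract.
import numpy as np
from scipy.stats import spearmanr
from itertools import combinations

# scores[i, j] = score given by rater i to drawing j (illustrative values only)
scores = np.array([
    [4, 3, 5, 2, 4, 3],
    [4, 3, 4, 2, 4, 3],
    [5, 3, 5, 2, 3, 3],
])

def percent_agreement(a, b):
    """Share of drawings on which two raters assign identical scores."""
    return np.mean(a == b)

def cronbach_alpha(matrix):
    """Cronbach's alpha, treating raters (rows) as items scoring the same drawings."""
    k = matrix.shape[0]                          # number of raters
    item_vars = matrix.var(axis=1, ddof=1)       # variance of each rater's scores
    total_var = matrix.sum(axis=0).var(ddof=1)   # variance of summed scores per drawing
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Pairwise percent agreement and Spearman's rho; the "highest agreement among
# pairs" reported in the dataset would be the maximum over these pairs.
for i, j in combinations(range(scores.shape[0]), 2):
    pa = percent_agreement(scores[i], scores[j])
    rho, _ = spearmanr(scores[i], scores[j])
    print(f"raters {i}-{j}: percent agreement = {pa:.2f}, Spearman rho = {rho:.2f}")

print(f"Cronbach's alpha across raters: {cronbach_alpha(scores):.2f}")
```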

Funding

This research received no external funding.
