Preview of results: Expert (Ex) and Non-Expert (NE) raters' scores and Inter-Rater Agreement (IRA); highest agreement among pairs.
This pilot study examines the reliability and clarity of a rubric-based framework for assessing digital freehand drawing in higher education. Two rubric types, a Generic Visual Rubric (GVR) and an Exemplar-Based Visual Rubric (EVR), were applied to 22 student drawings by expert (n = 4) and non-expert (n = 6) raters. Abductively derived criteria grounded in art education and studio practice guided the rubric design. Inter-rater agreement (IRA) was assessed using percent agreement, Spearman’s rank correlation, and Cronbach’s alpha. Although agreement remained modest on some criteria (e.g., line sensitivity), the EVR condition yielded greater consistency across rater groups, and both Spearman’s ρ and Cronbach’s α indicated improved alignment among raters. Survey responses further confirmed the rubric’s usability and diagnostic value. The results highlight the potential of illustrated, task-specific rubrics to support formative assessment and rater calibration, especially for non-experts, and underscore the importance of pilot testing when developing reliable evaluation tools for visual disciplines.
Both rater groups awarded higher overall scores under EVR, but experts benefited more: their score dispersion contracted sharply, and their inter-rater agreement increased by more than ten percentage points. Non-experts showed the same direction of change, though their gains were more modest.
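For readers who wish to see how the agreement statistics named above can be obtained in practice, the following is a minimal sketch in Python. The rater scores shown are illustrative placeholders, not the study's data, and the helper functions (percent_agreement, cronbach_alpha) are assumptions introduced for demonstration; only scipy.stats.spearmanr is a standard library call.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical rater scores (rows = drawings, columns = raters) on a 1-5 scale.
# Illustrative values only; not the scores collected in this study.
scores = np.array([
    [4, 4, 3],
    [3, 3, 3],
    [5, 4, 5],
    [2, 2, 3],
    [4, 5, 4],
])

def percent_agreement(a, b):
    """Proportion of drawings on which two raters gave the identical score."""
    return np.mean(a == b)

def cronbach_alpha(ratings):
    """Cronbach's alpha, treating each rater (column) as an 'item'."""
    k = ratings.shape[1]
    item_vars = ratings.var(axis=0, ddof=1)      # variance of each rater's scores
    total_var = ratings.sum(axis=1).var(ddof=1)  # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Pairwise statistics for the first two raters (columns 0 and 1).
r1, r2 = scores[:, 0], scores[:, 1]
print("Percent agreement:", percent_agreement(r1, r2))
rho, p = spearmanr(r1, r2)
print("Spearman's rho:", rho, "(p =", p, ")")

# Group-level internal consistency across all raters.
print("Cronbach's alpha:", cronbach_alpha(scores))
```

In a setup like this, percent agreement and Spearman's ρ would typically be computed for each rater pair within a group and then averaged, while Cronbach's α summarizes consistency across the whole rater panel for a given rubric condition (GVR or EVR).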