To Add or Not to Add Game Elements? Exploring the Effects of Different Cognitive Task Designs Using Eye Tracking

Research on instructional design provides inconsistent results on the use of game elements in cognitive tasks or learning. Cognitive load theory suggests that game elements increase extraneous cognitive load and, thus, may distract the users. In contrast, from an emotional design perspective, the use of game elements is argued to increase performance by providing a more interesting and motivating task environment. To contribute to this debate, the current study investigated the effect of game elements on behavioral performance, attention, and motivation. We designed two versions of the number line estimation task—one with game elements and one without. Participants completed both versions of the task while their eye-fixation behavior was recorded. Results indicated that participants paid attention to game elements, that is, they fixated them, although they were not necessary to complete the task. However, no difference in estimation accuracy was observed between the two task versions. Moreover, the task version with game elements was rated to be more attractive, stimulating, and novel, and participants reported experiencing greater flow. In sum, these data indicate that game elements seem to capture attention but also increase motivational aspects of learning tasks rather than decreasing performance.


I. INTRODUCTION
GAME-BASED learning and assessment have become increasingly popular over the past years due to their potential to increase performance and motivation [1]. Consequently, researchers and practitioners alike often attempt to augment conventional tasks with game elements to enhance performance of young students [2]. However, designing a task so that it becomes gamified or even gamelike requires the addition of details, game mechanics, or game elements that might not be strictly relevant for achieving the overall objective of a given task (e.g., adding a fictitious narrative or virtual incentives). Plass et al. [3] consider game mechanics as design elements that can be used to influence cognitive, behavioral, affective, and social engagement. Furthermore, core game mechanics are activities that players repeat over and over throughout the game [4]. Pawar et al. [5] distinguish game mechanics from learning mechanics. While game mechanics constitute major elements of play activities, learning mechanics describe major elements of learning activities. Accordingly, to design effective and engaging educational games, the core game mechanics should be well integrated with learning content, learning mechanics, and instructional aspects [5]-[7]. Inadequate integration may lead to gameplay in which players' interaction with the game mechanics and related game elements does not support learning but instead causes unnecessary extraneous cognitive load, undermining opportunities to learn.
According to cognitive theories of game-based learning, players' information processing capacity is limited and is allocated among extraneous processing, essential processing, and generative processing [8]. Extraneous processing refers to cognitive processing that does not support the instructional objective of the game. In contrast, essential processing is needed to represent the visual and verbal material of the game in working memory, and generative processing is needed for making sense of the material. More practically, the poor use of game elements may lead to situations in which players have to use most of their cognitive capacity for extraneous processing and, thus, do not have enough cognitive resources for the essential and generative processing needed for learning [8]. That is, poor educational game design may actually decrease performance or learning, as irrelevant details, often referred to as seductive details in the domain of multimedia learning [9], may distract players' attention from the relevant features of the task. In contrast, it is argued that game elements aim at increasing cognitive and emotional engagement and thereby increase motivation and performance [3], [10]. The disparity between these two instructional approaches might also explain the rather heterogeneous but still overall positive effects found for game-based learning [11], [12]. As a consequence, the use of game elements in instructional materials and cognitive assessments is still a matter of debate [13] and needs to be investigated further. More importantly, the question remains in which way game elements influence attentional processes as well as motivational states such as flow experience.
Therefore, in the current exploratory study, we evaluated participants' eye-fixation behavior to investigate attentional processes in a math task. We employed a number line-based fraction task and its game-based equivalent to experimentally investigate the effects of game elements on three different levels: 1) behavioral performance; 2) attention distribution as reflected by participants' eye-fixation behavior; 3) motivational aspects. Based on these levels, the following three research questions were formulated. 1) Does performance in a task with and without game elements differ? 2) Do game elements alter selective attention as indicated by eye-fixation behavior in a task? 3) Do the experienced flow and user experience in a task with and without game elements differ? Accordingly, in the following, we will first give a brief overview of the different rationales arguing for and against augmenting cognitive tasks with game elements to alter performance. After that, we briefly elaborate on the consideration of eye-fixation behavior in multimedia research and discuss studies utilizing eye tracking to investigate attentional effects of game elements, or the lack thereof, before addressing motivational aspects such as flow.

A. Theoretical Contradictions-Minimalistic or Augmented Learning Materials
The use of games or gamelike experiences in education is the opposite of what some research works in the learning sciences would recommend [14]. For instance, the cognitive theory of multimedia learning [15] would postulate that embellishing game elements would impede educational efficacy as they distract users from the actual content or task objective. According to the coherence principle, any illustrations or content that is not fundamentally important or relevant should be removed to facilitate reaching the instructional goal [15], [16]. In this vein, research in the field of seductive details indicated that adding illustrations to instructional material such as textbooks that are not relevant for reaching the instructional goal can lead to poorer performance or outcomes [9], [17]-[19]. Following this theory, instructional material should be minimalistic to avoid or reduce extraneous cognitive load [20], as these irrelevant details occupy working memory with unnecessary information. In line with this, the distraction hypothesis predicts that seductive details direct selective attention away from relevant information [19], as they are selectively processed and remembered at the expense of more important but less interesting information. Consequently, adding rather irrelevant but interesting and, thus, attention-capturing elements, such as a game narrative or visually appealing game graphics, might reduce performance of users as they distract them from more relevant task elements [19], [21], [22].
Recently, the seductive details approach has been criticized. Critics argued that studies on the seductive details effect showed rather unconvincing and contradictory evidence. It has been argued that seductive details do not always impede understanding and that they can even motivate users [23]-[26]. Park et al. [24] aimed at clarifying this controversy by linking positive/negative effects of seductive details to cognitive load theory. High school students learned about biology using a multimedia environment with and without seductive details while cognitive load was also varied (i.e., by altering the modality of verbal information). Results showed that students' performance in the multimedia environment was significantly higher when seductive details were present in a low cognitive load environment as compared with all other conditions [24].
Similarly, beyond a purely cognitive perspective, Schnotz et al. [27] further assumed that seductive details may increase interest in the instructional material, leading to higher persistence and increased performance. Augmenting instructional material with interesting details might, therefore, affect users' willingness to invest cognitive effort. Consequently, performance might be affected by the users' perception of the task. This argument is based on the assumption that cognitive resource allocation is flexible and positively affected by users' motivational state [28]-[30]. In line with this, different frameworks and theories such as the social-cognitive control value theory of achievement emotions [31], the cognitive-affective theory of learning with media [32], or the integrated cognitive affective model of learning with multimedia [33] highlight the interdependence of emotion, motivation, and cognitive performance. Accordingly, in contrast to rather minimalistic instructional design perspectives, the emotional design perspective suggests that seductive details might be helpful for increasing performance in instructional settings by increasing motivation of users [30], [34]. For instance, Um et al. [30] observed increased motivation and invested mental effort when applying emotional design principles to instructional materials as compared to a neutral condition with presumably lower extraneous cognitive load.
Accordingly, it has been hypothesized that game elements, such as a narrative, are useful for improving motivation and performance [35], [36]. In fact, recent meta-analyses on game-based learning identified increased performance as compared to conventional or nongame-based instructional conditions [11], [12]. Similar effects can be found for game-based cognitive assessment and training [37], [38]. Unfortunately, underlying mechanisms on the integration of game elements in a regular cognitive task have not been researched intensively so far. Hence, it is hard to pinpoint particular effects on why game-based learning seems to be superior to conventional approaches [12] even though it directly opposes assumptions posed by the cognitive load theory or cognitive theory of learning with multimedia.

B. Attention and Eye Tracking
Seductive details, such as pictures or emotional materials, have been shown to attract users' attention [39]-[41]. In this context, eye tracking offers a direct way of identifying whether seductive details or game elements catch attention, as reflected by being looked at in an instructional task, and thereby probably divert attention away from more fundamental content or task objectives [42]. Other ways of assessing attention toward game elements or seductive details are neurofunctional measures, which can only be used in specific and highly controlled scenarios, or self-reports, which can be affected, for instance, by inaccurate recall or memory effects [43], [44]. Eye tracking provides an objective and rather unobtrusive way to investigate cognitive processes in learners with high temporal resolution. Consequently, over the past decade, the number of eye-tracking studies in multimedia learning [45], [46] and educational game research has increased [47]-[53].
Previous research showed that eye-fixation behavior, especially fixations representing areas on the computer monitor where the eye dwells, is a valid indicator of visual attention during gaming [46], [54]-[56]. This is in line with the eye-mind hypothesis, which predicts that the focus of human gaze indicates the focus of attention [55], [57]. It is assumed that information at the point a person is currently looking at is attended to and processed cognitively [48], [55], [57]. Hyönä [57] points out that this assumption holds only if the visual information is relevant to the task at hand. Accordingly, fixations are the main measures in eye-tracking research, and they allow a number of useful metrics to be derived, for instance, the number of fixations and the (mean) fixation duration [58]. For example, Tsai et al. [47] employed eye tracking to study users' cognitive processes in a dynamic and highly interactive game-based learning environment. They showed that fixation patterns may reflect users' metacognitive control over visual attention in game-based environments.
Interestingly, fixation measures seem to be associated with task performance [46], [54], [55]. There is some evidence that high performers or experts show fewer fixations than low performers [59], [60], which might indicate more efficient and focused processing. Moreover, it was argued that longer fixation durations indicate higher cognitive effort, reflecting higher task difficulty and cognitive workload but also deeper processing [61]-[63]. Therefore, eye tracking might be a suitable method to evaluate the underlying attentional demands and affordances of popular game elements, such as narrative, virtual incentives, and appealing visual aesthetics [3].
Empirical studies comparing game-based to nongame-based equivalent tasks are still sparse [64], and studies investigating whether game elements indeed divert attention away from fundamentally important task elements are even rarer. For instance, Kiili et al. [48] used eye-tracking measures to explore the perception of user interfaces of four different educational games, but did not compare them to a nongame-based equivalent. They found that extraneous elements in the educational games (especially animated content displayed concurrently with feedback) indeed distracted users and consequently disturbed the instructional process. Moreover, it was also suggested that low performers and inattentive players might be more distracted by such extraneous gaming elements than high performers and highly attentive players because the former are probably less able to distinguish between important and irrelevant information [48], [52].

C. Motivation and Flow
Besides effects on attention and performance, game elements reportedly affect motivational states, for instance by including a narrative or game fiction [10], [65]. In fact, several meta-analyses indicated that game elements improve motivation [11], [66]. In the domain of game-based learning, the concept of flow is one of the most popular motivational constructs assessed [67], [68], but in the context of seductive details, the research on flow is sparse. To our knowledge, only one study examined differences in flow experience between an interactive simulation on beer brewing with and without gamelike elements [69]. The authors, however, did not identify differences between the groups on flow and performance.
Flow is usually considered to be a positive emotional state [70], [71] and a holistic approach to motivation [72]. However, Landhäußer and Keller [73] noted that flow is not the same as having fun, but rather a state characterized by a combination of aspects, such as concentration, a merging of action and awareness, reduced self-consciousness, a sense of control, a transformation of time, and an intrinsically rewarding experience. Moreover, Landhäußer and Keller [73] distinguish flow preconditions (i.e., skills-demands fit, clear goals, and clear feedback) that foster involvement and enjoyment in the activity. These preconditions provide meaningful recommendations for augmenting tasks with game elements. Game elements, for instance, might be used to support formation of self-regulated goals, provide feedback about players' progress toward goals, and adapt challenges of the game to players' skill level.
Importantly, flow experience also seems to be positively related to learning outcomes and playing performance in game-based learning [68], [74]-[76] and is often related to positive user experience [68], [77]. As such, good (educational) games are often described as being able to hit the "sweet spot" of flow [3], and users experiencing flow are usually willing to extend the interaction with the game in the future and rate games better [70]. Regarding the latter and in the context of seductive details, it is important to assess whether the added seductive details or game elements are experienced by the users in a positive way (e.g., regarding the attractiveness of the design [78]). This is particularly important as seductive details are supposed to draw users' attention toward the instructional material [79]. Hence, the relation between flow, user experience, and seductive details needs to be further investigated.

A. Participants
Forty-two university students (mean age 21.26 years, SD = 1.93; 18 male) participated in this study. All participants had normal or corrected-to-normal vision (contact lenses but no glasses). Participants gave written informed consent. The study was approved by the Ethics Committee of the University of Graz, Austria (reference number GZ. 39/86/63 ex 2017/18) and is in accordance with the ethical standards of the Declaration of Helsinki.

B. Task Versions-Game Versus No-Game
We employed the so-called number line estimation task, which relies on the concept of a mental number line, an often used metaphor to describe our mental representation of number magnitude [80]. All participants performed number line estimation tasks in both game and no-game conditions. In the number line estimation tasks, participants had to locate fractions on a number line ranging from 0 to 1 while their eye-fixation behavior was recorded. In each estimation condition (game versus no-game), 48 proper fractions involving single- (e.g., 3/7) and double-digit (e.g., 19/25) numerators and denominators were used. In the game condition, number line estimation was embedded in an educational game called Semideus [81], [82], whereas the no-game condition reflected a stripped version of the task without any game elements. Game and no-game conditions were comparable regarding task difficulty as they both involved the same target numbers as well as the same number of levels (i.e., six levels with eight tasks/fractions each). The order of the conditions was counterbalanced across all participants.
The implementation of the no-game version of the estimation task was minimalistic (see Fig. 1), as no game elements were present; it thus corresponds to the conventional number line estimation task. A target fraction to be located on the number line was displayed in the upper left part of the screen, and the number line was displayed at the bottom of the screen. The participants could move a cursor (white bar) on the number line with the left and right arrow keys and confirm their estimates by pressing the space bar on the keyboard. Feedback was provided after each estimation. Positive feedback was given by showing a green check mark above the cursor if the target fraction was estimated accurately enough (i.e., estimated location no more than ±5% away from the correct location), and at the same time, the correct location was shown by a green marker on the number line. Negative feedback was given in the form of a red cross in case the estimated location was more than ±5% away from the correct location. The same negative feedback mechanic was shown if an estimation took too long (>10 s per estimate). In case the estimation was incorrect or too slow, participants again had 10 s to indicate the location of the fraction. The number of trials on each task was not limited, and participants had to estimate the location of the fraction correctly before they could proceed to the next task. After correct estimates, participants were asked to press the Enter key to proceed to the next task.
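The feedback rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and return labels are hypothetical, and the ±5% tolerance is interpreted here as an absolute deviation of 0.05 on the 0-to-1 number line, which the text does not spell out.

```python
def classify_estimate(estimate, target, elapsed_s,
                      tolerance=0.05, time_limit_s=10.0):
    """Classify one estimation attempt on a 0-to-1 number line.

    Assumption: the +/-5% tolerance is an absolute deviation of 0.05
    on the number line. Names and labels are illustrative only.
    """
    if elapsed_s > time_limit_s:
        # Too slow: treated like an error, participant gets another 10 s.
        return "timeout"
    if abs(estimate - target) <= tolerance:
        # Green check mark plus marker at the correct location.
        return "correct"
    # Red cross; the participant has to try again.
    return "incorrect"
```

For example, with the target fraction 3/7, an estimate of 0.45 given after 4 s would count as correct under these assumptions, whereas an estimate of 0.60 would not.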
The game version of the task was based on the same core number line estimation mechanic as the no-game version, but it was extended with visual game elements, such as a narrative and corresponding scenery, an avatar, coins, progress bar, energy bar, level goals (stars), and gesture-based feedback. As it is argued that game elements disconnected from the actual task seem to distract more from task objectives or learning goals [7], we aimed at integrating the added game elements to the core task in a way that all the elements have a meaningful role. In terms of flow theory, the game elements were included to facilitate goal formation, sense of control, and interpretation of feedback with respect to goals. With respect to the signaling principle of multimedia learning theory [83], the avatar was walking on the number line highlighting the most essential part of the game, the number line.
Specifically, in the game condition, participants controlled an avatar called Semideus, who tries to collect gold coins, which a goblin had stolen from Zeus. That is, the core mechanic of the game was to locate coins from the trails of the mountain with help of symbolic fraction numbers. The avatar was controlled on the number line in the same way as in the no-game condition (arrow keys of the keyboard). A target fraction, the location of hidden gold coins, was displayed in the upper left part of the game display (see Fig. 2).
The task of the participants was to locate the position of this fraction on the number line to dig up the hidden coins. To confirm that Semideus should dig up the coins at the estimated location, the participant had to press the space bar on the keyboard, and Semideus dug into the ground (i.e., the number line). For inaccurate estimates (i.e., estimates more than ±5% away from the correct location), the avatar was struck by lightning (see Fig. 2) and the player lost 10 units of virtual energy (displayed by an orange bar on the right-hand side of the screen) on the first error in a trial, 5 units of energy for the second error on the same trial, and 2.5 units of energy for any further errors on that same trial/item. The same negative feedback mechanic (loss of energy) was used if players took too long to respond by pressing the space bar (>10 s). This time limit was visualized by a cloud getting darker with passing time and a numerical countdown within the cloud. Similar to the no-game version, participants again had 10 s to indicate the location of the fraction in case the estimation took too long or was incorrect. For correct estimates, the avatar was rewarded with coins added to the reward counter placed at the upper left part of the screen, and the correct location was shown by a green marker on the number line. Additionally, positive feedback was provided through gestures of Semideus (e.g., lifting hands up and cheering; see Fig. 2). After correct estimates, participants were asked to press the Enter key to proceed up to the next platform of a mountain, so that the next task/target fraction was presented. Participants could acquire 100 to 500 coins depending on the degree of estimation accuracy (i.e., over 98% = 500 coins; 97%-98% = 300 coins; 95%-96% = 100 coins). When all eight tasks of one level were completed, Semideus reached the top of the mountain where Zeus was waiting for him to bring back the stolen gold coins.
After completing a level (reaching the top of the mountain), participants got additional feedback about their overall performance in the level: Stars and earned coins were shown (i.e., one star for completing the level and reaching the mountain top, one star for collecting more than 2000 coins, and one star when more than 80% of virtual energy was left). Although participants could run out of virtual energy in a level (100 units energy in the beginning of each level), they were still able to complete the level. However, at the mountain top participants did not earn the bonus points awarded according to remaining virtual energy.
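The reward rules of the game condition (coins, energy loss, and stars) can be summarized in a short sketch. It is not the Semideus implementation; function names are hypothetical, accuracy is assumed to be expressed as a proportion between 0 and 1, and the treatment of accuracies falling between the stated coin bands (e.g., between 96% and 97%) is an assumption, as the text does not specify it.

```python
def coins_for(accuracy):
    """Coins for a correct estimate, given accuracy as a proportion.

    Assumption: band edges not stated in the text are resolved by
    treating 0.95 <= accuracy < 0.97 as the 100-coin band.
    """
    if accuracy > 0.98:
        return 500
    if accuracy >= 0.97:
        return 300
    if accuracy >= 0.95:
        return 100
    return 0  # below the +/-5% tolerance: counted as an error


def energy_penalty(error_number):
    """Energy lost for the n-th error (1-based) on the same item."""
    if error_number == 1:
        return 10.0
    if error_number == 2:
        return 5.0
    return 2.5  # any further error on that item


def level_stars(total_coins, energy_left, start_energy=100.0):
    """Stars shown at the mountain top after a completed level."""
    stars = 1  # one star for completing the level
    if total_coins > 2000:
        stars += 1
    if energy_left > 0.8 * start_energy:
        stars += 1
    return stars
```

Under these assumptions, a level finished with 2500 coins and 85 units of energy left would yield three stars, whereas exactly 2000 coins and exactly 80 units would yield only the completion star.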
To sum up, the game condition employed typical characteristics or building blocks of a game [3], such as narrative elements, appealing visual aesthetics, virtual incentives in the form of points and stars earned according to the performance of users, as well as positive/negative feedback.

C. Eye Tracking
To objectively assess user behavior during task execution, eye-fixation behavior was tracked using a Tobii 1750 Eye Tracker (Tobii Technology). This eye tracker is integrated into a 19'' computer monitor and was connected to a conventional computer. Above and below the monitor, there are infrared light-emitting diodes. The eye tracker collects binocular eye-tracking data at a rate of 50 Hz. Participants were seated at a maximum distance of 55 cm in front of the monitor. With this eye tracker, no fixation of the head was necessary, which enabled natural behavior of participants. Each eye-tracking measurement started with a short calibration phase. The number and duration of fixations (i.e., eye gaze remaining at the same point on the screen for longer than 100 ms [84]) were recorded.
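To illustrate how the fixation criterion relates to the sampling rate, the following sketch groups raw gaze samples into fixations. It is a simple dispersion-style filter written for illustration, not the algorithm used by the Tobii software; the 30-pixel radius and the convention that n samples at 50 Hz span n × 20 ms are assumptions.

```python
from math import hypot

SAMPLE_INTERVAL_MS = 20  # 50 Hz sampling rate


def _centroid(points):
    n = len(points)
    return sum(p[0] for p in points) / n, sum(p[1] for p in points) / n


def detect_fixations(samples, radius_px=30.0, min_duration_ms=100):
    """Group (x, y) gaze samples into fixations.

    A run of consecutive samples counts as a fixation if every sample
    stays within radius_px (hypothetical value) of the running centroid
    and the run lasts longer than min_duration_ms.
    Returns a list of (center_x, center_y, duration_ms).
    """
    fixations = []
    group = []
    for x, y in samples:
        if group:
            cx, cy = _centroid(group)
            if hypot(x - cx, y - cy) > radius_px:
                # Gaze moved away: close the current group.
                duration = len(group) * SAMPLE_INTERVAL_MS
                if duration > min_duration_ms:
                    fixations.append((cx, cy, duration))
                group = []
        group.append((x, y))
    if group:
        duration = len(group) * SAMPLE_INTERVAL_MS
        if duration > min_duration_ms:
            fixations.append((*_centroid(group), duration))
    return fixations
```

With these conventions, at least six consecutive samples (120 ms) are needed for a run to exceed the 100 ms criterion.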
For the analysis of the eye-tracking data, the display of the game and the no-game condition was subdivided into different areas of interest (AOIs). The screen of the game condition was divided into 14 AOIs, the screen of the no-game condition into 9 AOIs (see Fig. 3). The average number of fixations per AOI was analyzed. AOIs 1 to 3 covered the number line in both conditions. AOI1 corresponded to the left third of the number line, AOI2 to the middle third of the number line, and AOI3 to the right third. AOI5 comprised the target fraction, which was to be located on the number line (see Fig. 3). AOI4 covered the area in which the Semideus avatar moved along the number line in the game condition and in which the red cross or green check mark was presented in the no-game condition. AOI6 covered the text "Press Enter To Proceed" shown after a task was solved correctly.
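Assigning fixations to AOIs and computing the percentage of fixations per AOI can be sketched as follows. The rectangle coordinates are invented for illustration (the actual AOI geometry is shown only in Fig. 3), and only the four task-relevant AOIs are included.

```python
# Hypothetical AOI rectangles (x_min, y_min, x_max, y_max) in pixels.
# The real AOI geometry is not given in the text.
AOIS = {
    "AOI1": (0, 900, 426, 1024),     # left third of the number line
    "AOI2": (426, 900, 853, 1024),   # middle third
    "AOI3": (853, 900, 1280, 1024),  # right third
    "AOI5": (0, 0, 300, 200),        # target fraction (upper left)
}


def aoi_of(x, y, aois=AOIS):
    """Return the name of the AOI containing (x, y), or None."""
    for name, (x0, y0, x1, y1) in aois.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None


def fixation_share(fixations, aois=AOIS):
    """Percentage of all fixations that fall into each AOI."""
    counts = {name: 0 for name in aois}
    for x, y in fixations:
        name = aoi_of(x, y, aois)
        if name is not None:
            counts[name] += 1
    total = len(fixations)
    if total == 0:
        return {name: 0.0 for name in aois}
    return {name: 100.0 * c / total for name, c in counts.items()}
```

Fixations outside all listed rectangles still count toward the total, so the shares need not sum to 100%.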

D. Questionnaires
Questionnaires were used to assess motivational aspects after each condition. Flow experience was measured with the flow short scale (FKS; [85]), which measures components of flow with 10 items. Six of the items measure "fluency of performance" (e.g., "I have no difficulty concentrating") and four items measure "absorption by activity" (e.g., "I do not notice time passing") on a 7-point scale (1 = "not at all" to 7 = "very much"). The mean of each component was used in the analyses. User experience was assessed using the user experience questionnaire (UEQ; [86]), which has been widely used to assess interaction quality of certain design variations of products [87]. Accordingly, this questionnaire assesses conventional usability aspects (efficiency, transparency, controllability), user experience (originality, stimulation), as well as attractiveness. The rather general attractiveness subscale consists of six bipolar ratings, such as "unpleasant-pleasant" or "appealing-repelling." Efficiency is measured by four items, such as "quick-slow" or "efficient-inefficient." Perspicuity is assessed with four items, such as "complicated-simple" or "clear-confusing." Dependability is assessed with four items, such as "predictable-unpredictable." The subscale novelty is measured by three items, such as "creative-dull," and, finally, the subscale stimulation is measured by three items, such as "boring-exciting."
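Scoring the two FKS components as item means can be sketched as below. The assignment of item numbers to the two components is an assumption made for illustration and should be checked against the scale documentation [85]; the function name is hypothetical.

```python
from statistics import mean

# Assumed item-to-component mapping (1-based item numbers); verify
# against the FKS documentation before use.
FLUENCY_ITEMS = (2, 4, 5, 7, 8, 9)   # six "fluency of performance" items
ABSORPTION_ITEMS = (1, 3, 6, 10)     # four "absorption by activity" items


def score_fks(ratings):
    """Return the mean rating per flow component.

    ratings: dict mapping item number (1..10) to a rating from 1 to 7.
    """
    for item, value in ratings.items():
        if not 1 <= value <= 7:
            raise ValueError(f"item {item}: rating {value} outside 1-7")
    return {
        "fluency": mean(ratings[i] for i in FLUENCY_ITEMS),
        "absorption": mean(ratings[i] for i in ABSORPTION_ITEMS),
    }
```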

E. Procedure
Participants were tested individually. Half of the participants (n = 21) started with the game condition and the other half (n = 21) started with the no-game condition, which both included practice trials in the respective condition. After completing the first condition (six levels with eight tasks each), participants completed the FKS and UEQ questionnaires to assess their subjective experience during the previously completed condition. Then, the second condition had to be performed before completing the FKS and UEQ questionnaires again. Overall, the experiment took about 1 h.

F. Analyses
1) Behavioral Performance Level: To test performance differences between the two conditions (game versus no-game), we ran paired t-tests on error rate, mean accuracy, and mean duration per task. The error rate corresponded to the absolute number of errors committed by each participant averaged for each condition. The accuracies of participants' estimation attempts were pooled to calculate the mean accuracy for each participant. Mean duration was defined as the average time needed to correctly solve a given item (i.e., fraction), measured from item onset to the last space bar press, that is, until the estimate was correct. As these effects are crucial to address our primary research questions regarding the seductive detail effect of game elements, we validated null effects by a Bayesian model selection approach, which investigates whether the null hypothesis or the alternative hypothesis is more supported by the data [88]. Accordingly, we calculated the posterior probability that the data favor the null hypothesis and the alternative hypothesis, respectively.
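One common way to obtain such posterior probabilities from a paired t-test is the BIC approximation to the Bayes factor. The sketch below assumes this is the kind of procedure meant by the Bayesian model selection approach [88]; the text does not give the exact computation, so the functions and the verbal labels (following the usual reading of Raftery's categories [89]) are illustrative.

```python
from math import exp, log, sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two conditions."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def posterior_null(t, n):
    """Posterior probability of H0 via the BIC approximation.

    Assumes equal prior odds. For a paired t-test, the ratio of error
    sums of squares is SSE1 / SSE0 = 1 / (1 + t^2 / (n - 1)), and H1
    has one additional free parameter (the mean difference).
    """
    delta_bic = -n * log(1 + t * t / (n - 1)) + log(n)  # BIC(H1) - BIC(H0)
    bf01 = exp(delta_bic / 2)  # Bayes factor in favor of H0
    return bf01 / (1 + bf01)


def evidence_label(p):
    """Verbal label for a posterior probability of the favored model."""
    if p < 0.75:
        return "weak"
    if p < 0.95:
        return "positive"
    if p < 0.99:
        return "strong"
    return "very strong"
```

Under this approximation, the posterior probability of the null hypothesis cannot exceed sqrt(n) / (1 + sqrt(n)), which is about 0.87 for n = 42 and is reached when t = 0.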
2) Attentional Level: We first analyzed general differences in absolute fixations between the two conditions using a paired t-test. We then specifically analyzed differences in the percentage of fixations on task-relevant AOIs. Accordingly, we employed a 2 × 4 ANOVA with the within-subject factors condition (game versus no-game) and task-relevant AOIs (AOI1, AOI2, AOI3, and AOI5). Next, we used correlational analyses to investigate associations between the total number of fixations across all AOIs and task accuracy as well as errors.
For analyzing the duration of fixations, we followed a similar approach. We first ran a paired t-test on the mean duration of the fixations averaged over all AOIs per condition and correlational analyses between fixation duration and task accuracy and errors. We then compared mean duration of fixations between the game and no-game condition for task-relevant AOIs. Accordingly, mean duration of fixations was analyzed using a 2 × 4 ANOVA with the within-subject factors condition (game versus no-game) and task-relevant AOIs (AOI1, AOI2, AOI3, and AOI5).
3) Motivational Level: Due to technical problems, ratings for the flow questionnaire (FKS) of one participant and ratings for the user experience questionnaire (UEQ) of 20 participants were not recorded in both conditions of the experiment. Questionnaires were analyzed using separate paired t-tests for each subscale of the UEQ and FKS.

A. Behavioral Results-Performance Level
Performance in number line estimation was comparable between the game and no-game conditions (see Table I). Although participants made descriptively more errors in the no-game condition, this difference was not statistically significant. These null effects were further investigated using a Bayesian model selection approach. As regards the errors, the posterior probability for the null hypothesis was 0.63, providing weak evidence for the null hypothesis (i.e., no difference in error rate between game and no-game condition) according to Raftery [89]. As regards the mean accuracy, the posterior probability for the null hypothesis was 0.82, providing positive evidence for the null hypothesis (i.e., no difference in estimation accuracy between game and no-game condition). As regards the mean duration per task, the posterior probability for the null hypothesis was 0.78, providing positive evidence for the null hypothesis (i.e., no difference in mean duration per task between game and no-game condition). We need to note that only 2.42% (game version) and 3.92% (no-game version) of all provided estimates took participants longer than 10 s, indicating that the employed time limit per estimate did not particularly pressure participants.
In the game condition, the middle and right part of the number line (AOI2, AOI3) were more frequently fixated than the left part (AOI1), and participants fixated the task (AOI5) more often than the number line (AOI1, AOI2, AOI3). In the no-game condition, the middle part of the number line (AOI2) was more frequently fixated than the left and right part (AOI1, AOI3), and the right part (AOI3) was more frequently fixated than the left part (AOI1). The task (AOI5) was more frequently fixated than the left part of the number line (AOI1), but the number of fixations on the task (AOI5) in the no-game condition was comparable with the number of fixations on the middle and right part of the number line (AOI2, AOI3).
The patterns of fixations in both the game and the no-game condition were rather similar and resulted from the fraction selection in the current experiment (see density plot in Fig. 6); that is, correct positions on the number line were more frequently found in the middle (AOI2) and on the right side (AOI3) of the number line than on the left side (AOI1). This pattern can also be seen in the heat map of eye fixations in Fig. 7.
Additionally, we found a significant relationship between the total number of fixations across all AOIs (i.e., the whole screen) and performance in the number line estimation task in the no-game condition. The more fixations a participant made in the no-game condition, the higher her/his error rate (r = 0.50, p < 0.01) and the worse her/his estimation accuracy (r = −0.49, p < 0.01). In the game condition, the correlation results were not significant (absolute number of fixations × errors: r = 0.20, p = 0.21; absolute number of fixations × accuracy: r = −0.22, …).

2) Fixation Durations: The mean duration of the fixations averaged over all AOIs per condition was longer in the no-game condition (mean = 209.30 ms, SE = 12.28) than in the game condition (mean = 199.14 ms, SE = 10.63; t(41) = 3.45, p < 0.01, d = 0.53). However, fixation duration did not correlate with performance (error rate and accuracy) in number line estimation, neither in the game nor in the no-game condition (all p > 0.71).
When comparing the mean duration of fixations between the game and no-game condition only for AOI 1, 2, 3, and 5, which were directly comparable between conditions, an ANOVA model with the within-subject factors game condition (game versus no-game) and AOI (AOI1, AOI2, AOI3, and AOI5) revealed only a significant main effect of AOI [F(3, 54.95) = 13.90, p < 0.01, η² = 0.26]. Post-hoc testing indicated that participants fixated the middle and right part of the number line longer than the left part and the task (see Fig. 8). All other effects were not significant. Hence, the mean duration of fixations did not differ between the game and no-game condition when analyzing AOI 1, 2, 3, and 5.
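The two eye-tracking metrics reported in this subsection, the number of fixations and the mean fixation duration per AOI, can be illustrated with a short Python sketch. The rectangular AOI coordinates and the `(x, y, duration_ms)` fixation format are hypothetical placeholders chosen for illustration; they do not reproduce the AOI layout or the recording software used in the study.

```python
from collections import defaultdict

# Hypothetical AOIs as (left, top, right, bottom) rectangles in screen pixels.
AOIS = {
    "AOI1": (0, 900, 640, 1080),      # left part of the number line
    "AOI2": (640, 900, 1280, 1080),   # middle part of the number line
    "AOI3": (1280, 900, 1920, 1080),  # right part of the number line
    "AOI5": (760, 0, 1160, 200),      # task (target fraction)
}


def aoi_of(x, y, aois=AOIS):
    """Return the name of the first AOI containing point (x, y), else None."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None


def fixation_metrics(fixations, aois=AOIS):
    """Map each fixated AOI to (number of fixations, mean duration in ms).

    `fixations` is an iterable of (x, y, duration_ms) tuples.
    Fixations outside every AOI are ignored.
    """
    durations = defaultdict(list)
    for x, y, duration_ms in fixations:
        name = aoi_of(x, y, aois)
        if name is not None:
            durations[name].append(duration_ms)
    return {
        name: (len(values), sum(values) / len(values))
        for name, values in durations.items()
    }
```

Per-participant values computed this way could then be entered into the within-subject comparisons described above.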

C. Questionnaire Results-Motivational Level
To examine the hypothesized differences in user experience between the two conditions, we conducted paired t-tests for each subscale (attractiveness, perspicuity, efficiency, novelty, stimulation, and dependability) of the UEQ with the within-subject factor condition (game versus no-game). The ratings on each scale were used as the dependent variable. Paired t-tests (Benjamini-Hochberg corrected for multiple comparisons) revealed that participants rated the game condition to be more attractive [game: mean = 0.89, …].

Finally, we examined differences in the perceived flow of participants in both conditions. To this end, we performed paired t-tests for each FKS subscale (absorption and fluency) with the within-subject factor condition (game versus no-game). We used participants' ratings as the dependent variable.
Paired t-tests (Benjamini-Hochberg corrected for multiple comparisons) revealed that participants felt the experience in the game condition to be more fluent or automatic [game: mean = 4.68, SE = 0.22; no-game: mean = 4.18, SE = 0.23; t(40) = 2.44, p < 0.05, d = 0.38]. Participants did not differ in their experienced absorption in the game and no-game condition [game: mean = 5.17, SE = 0.16; no-game: …].

V. DISCUSSION
In the current study, we investigated the effects of game elements on user behavior in a numerical learning task on three different levels: 1) behavioral performance (performance in the number line estimation task); 2) attention distribution (eye-fixation behavior); and 3) motivational aspects (flow and user experience). To this end, participants performed a number line estimation task with and without game elements. Results indicated differences in selective attention as reflected by eye-fixation behavior but no differences in performance between the game and no-game condition. Additionally, flow as well as user experience was higher in the task condition with game elements. In the following, the results on the respective levels are discussed in more detail.

A. Performance Level
Importantly, our results indicated that the game elements in the game condition did not disturb participants' estimation performance, as behavioral performance was comparable between both conditions. Neither error rates, estimation accuracy, nor time to solve the tasks were affected by the inclusion of game elements. These results were further substantiated by Bayesian analyses. Therefore, our data do not support the assumption that adding game elements that were not mandatory for solving the task decreases performance, as predicted by the seductive details effect [9]. Although Rey [9] identified overall small-to-medium-sized negative effects on performance, he acknowledged, among others, the learning domain as well as the kind of seductive details as important mediating factors. Research on game elements in this regard is particularly sparse, which emphasizes the importance of the current results. In line with Rey [9] and Habgood and Ainsworth [7], we argue that seductive details or game elements, respectively, that interrupt the coherence of the instructional material, rather than being well (intrinsically) integrated with learning mechanics, are particularly prone to impede processing. As described earlier, in the present game condition, we took great care to integrate all game elements with the core number line estimation task in a coherent and meaningful way, which may account for the lack of negative effects. However, the eye-tracking results also revealed that participants allocated only a very limited amount of attention to the level progress bar and the virtual energy bar in the game condition (see Fig. 7). This, together with the relatively low challenge level and the lack of consequences when reaching zero virtual energy, might have undermined the meaning of these elements in the game version.
Rey [9] further observed that the presence of time limits led to poorer performance in the seductive detail condition. He argued that users in this condition might need longer for processing larger amounts of instructional material. The current study, however, did not find performance differences even though a time limit of 10 s per estimate was employed. It should be noted that in both conditions only a very small number of estimates actually exceeded the employed time limit, indicating that participants might not have felt particularly pressured. Moreover, the comparable performance can also be explained by the relatively low challenge level of the game. It is possible that even though the game elements induced more extraneous cognitive load, the overall cognitive load remained within participants' cognitive capacity limits. Thus, processing of game elements did not disturb performance. In line with this, we also did not find differences in the time needed to solve tasks between the game and no-game condition. The current results indicate that participants were able to efficiently differentiate between the relevant task elements and the rather "irrelevant" game elements in the game condition. This is consistent with recent research on effects of irrelevant pictorial information indicating that students learned to ignore irrelevant information as they gained experience with the task and thus no longer experienced negative effects [90], [91]. Compared with similar studies in this line of research, our study employed a task with rather simple mechanics and limited relevant information, providing only a target number and the number line. Therefore, users might have become proficient with the task quite quickly and learned to differentiate between the fundamental task and the game elements in the game condition.
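The Bayesian analyses referred to in this subsection report posterior probabilities for the null hypothesis, interpreted with Raftery's [89] verbal categories. The text does not state how these probabilities were computed; one common route, assumed here purely for illustration, is the BIC approximation to the Bayes factor with equal prior odds. The function names below are hypothetical.

```python
import math


def posterior_prob_null(t, n):
    """Approximate p(H0 | data) for a paired t-test via the BIC approximation.

    t: paired t statistic; n: number of participants (pairs).
    Assumes equal prior odds for H0 and H1 and one extra parameter under H1.
    """
    df = n - 1
    eta_sq = t * t / (t * t + df)         # variance explained by the condition effect
    delta_bic_10 = n * math.log(1.0 - eta_sq) + math.log(n)
    bf_01 = math.exp(delta_bic_10 / 2.0)  # Bayes factor in favor of H0
    return bf_01 / (1.0 + bf_01)


def raftery_label(p_h0):
    """Verbal strength of evidence for H0 following Raftery's categories."""
    if p_h0 < 0.50:
        return "favors alternative"
    if p_h0 < 0.75:
        return "weak"
    if p_h0 < 0.95:
        return "positive"
    if p_h0 < 0.99:
        return "strong"
    return "very strong"
```

With these thresholds, the reported posterior probabilities of 0.63, 0.82, and 0.78 map to "weak", "positive", and "positive" evidence, matching the labels given in the results.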

B. Attentional Level
Although there were no performance differences between the game and no-game condition, we observed differential eye-movement patterns. Overall, participants made more fixations in the game condition than in the no-game condition, which is in line with other studies investigating user interface attractiveness [92] or seductive details [93], [94]. However, task-relevant AOIs (AOI1, AOI2, AOI3 reflecting the number line, and AOI5 representing the to-be-solved fraction) were fixated less often in the game condition than in the no-game condition. This might indicate that participants' attention was drawn away from task-relevant features more often in the game condition. This is not particularly surprising, as more task-irrelevant features were present in this condition, and it is in line with previous studies investigating seductive details [93], [94]. Importantly, however, this did not affect performance, indicating users' ability to pay attention to both relevant and rather irrelevant task elements. In the game condition, participants also frequently looked at the Semideus avatar (AOI 4), not present in the no-game condition, as it might have served as a visual extension of the cursor indicating the current position on the number line. Nevertheless, the avatar Semideus partly reflected the same information as the cursor visible on the number line at the bottom of the screen (AOI 1-3). Accordingly, looking at the number line at the bottom of the screen (AOI 1-3) and the target number (AOI 5) was sufficient to solve the estimation tasks adequately in the no-game as well as in the game condition. Other game elements, such as the counter of reward coins (AOI 7), the bars showing the virtual energy and level progression (AOI 11 & 14), and the cloud (AOI 9), did not seem to disturb players, but provided additional information about the state of the game and performance feedback. The respective areas of the game screen were even largely ignored, as revealed by the eye-tracking data (see Fig. 4).
On the one hand, these results support the notion that users in the game condition might have learned not to pay particular attention to task-irrelevant game elements, as indicated by the low number of fixations on, for instance, the counter of reward coins. On the other hand, users in the game condition paid particular attention to the avatar (AOI 4). This might be due to the fact that 1) the avatar served as a visual extension indicating the current position on the number line and 2) the avatar provided gesture-based feedback in terms of jubilant gestures (correct response) or getting hit by lightning (incorrect response) as well as rewarded coins to the user. In contrast, the respective AOI in the no-game condition (AOI 4), which also provided feedback but in a less engaging way (i.e., showing a green check mark for correct responses and a red cross for incorrect responses), received less attention from the learners. Direct comparison of AOI 4 between the game and no-game condition was difficult. Even though AOI 4 provided positive and negative feedback in both conditions, the size of the respective AOI in the game condition had to be considerably larger due to the avatar and his animations. Therefore, we only report the results on a descriptive level, which indicated that more attention was paid to this area in the game condition (mean number of fixations = 150.90; SE = 17.52) than in the no-game condition (mean number of fixations = 34.67; SE = 6.85). However, this result needs to be treated with caution, as the comparability between these two areas is not straightforward due to their difference in size. Nevertheless, AOI 4 including the avatar was one of the most frequently fixated screen areas in the game condition, which is in line with previous studies indicating that players tend to focus on their avatar [48]. Consequently, an avatar seems to be an effective channel to provide crucial information, such as performance feedback, to the player.
Previous research already showed a relationship between the number of fixations and performance in different tasks (e.g., expert versus novice problem solving: [59], [60]; puzzle-like task: [54]; reading: [55]; watching a video with geographical content: [95]). In line with this, we observed that performance was lower for users who made more fixations while solving the task. However, we only found this correlation to be significant in the no-game condition, but not in the game condition. To our knowledge, only one other study investigated eye-tracking metrics similar to the current study in a number line estimation task. Schneider et al. [96] found eye fixations to be scattered more in worse performers or younger children, respectively. The authors assumed that younger children have not yet developed appropriate solution strategies and, thus, are less direct in their approach, requiring more orientation and reorientation along the number line in order to locate a specific target number [96]. This general pattern is consistent with our finding because lower estimation performance was associated with more fixations in the no-game condition. This association was not found in the game condition, which might indicate that the inclusion of game elements undermines the predictive power of fixation behavior [56], [97].
Mean fixation duration over all AOIs was longer in the no-game condition than in the game condition. However, the mean fixation duration did not correlate with number line estimation performance (error rate and accuracy), neither in the game nor in the no-game condition. In eye-tracking research, longer mean fixation duration is usually associated with higher processing difficulty and cognitive load, but also with deeper processing [46], [98], [99]. The duration of fixations in AOI 1, 2, 3, and 5 (number line at the bottom of the screen and task/fractions), which were the most central and task-relevant AOIs for solving the estimation task in both conditions, did not differ between the game and no-game condition. Moreover, in both conditions, the middle and right part of the number line were fixated longer than the left part and the task area, which corresponded to the fraction distribution. Therefore, even though participants in the game condition made more fixations overall, processing difficulty as well as the level of processing depth of task-relevant areas seemed to be comparable in both conditions. This further emphasizes the notion that users were successful in differentiating between relevant and irrelevant task elements in the game condition.

C. Motivational Level
One of the reasons for including seductive details in a task is to increase the interest of the user, which may enhance motivation and flow [25], [26]. We, indeed, found increased flow experience in the game condition as compared to the no-game condition. However, this was only observed for the subscale "fluency," but not for "absorption." We assume that the game elements, especially narrative, level goals, virtual incentives, progress bar, and virtual energy bar, contributed to the clear goals and sense of control dimensions of flow included in the fluency subscale. Hence, participants felt the task in the game condition to run more smoothly and more fluently, felt a greater sense of control and, thus, had fewer problems concentrating. We can only speculate as to why absorption did not differ between the two conditions. The game condition employed only a few game elements, which might not have been enough to affect the immersive dimension of flow in the absorption subscale. Even though flow is often associated with increased performance and learning gains [74]- [76], we did not find performance differences between the game and no-game condition. In both conditions, performance was rather high (accuracy at about 95%), indicating that the task, even though we used single- and double-digit fractions, was rather easy for the adult participants. Hence, the results might be explained by the Yerkes-Dodson law [100], which postulates an inverted U-shaped relationship between arousal and performance. Game elements might have influenced the users' arousal level, which helped them to better concentrate on the task and to experience it as more fluent. However, since performance in the task was already rather high, no performance difference between the conditions was observable. In line with the results for flow, we also identified increased stimulation ratings in the game condition as compared to the no-game condition. Moreover, participants rated the game condition to be more attractive and novel. As such, we might expect that if users were given the chance to choose between the two task environments, the game-based version of the number line estimation would be favored [78]. Importantly, however, we did not find any difference on the UEQ subscales perspicuity, efficiency, and dependability. This demonstrates that the game elements did not affect the task in a more pragmatic way, indicating that we established two comparable conditions, the lack of which often poses a problem in media comparison studies [101].

D. Limitations and Future Perspective
The main limitation of the current study was the limited interaction time with the actual task of about 15 min with each version. Consequently, generalizing the effects of game elements on performance, attention, and motivation to longer-lasting interactions is not possible. Moreover, missing transfer tests and longer retention intervals or follow-up tests, respectively, limit the conclusions that can be drawn from the absence of performance differences between the game and no-game condition. This is particularly important as previous research showed that positive effects of more rewarding instructional experiences might be more pronounced with longer retention intervals [102], [103] due to memory consolidation processes. Moreover, for the university student sample, the task might have been rather easy. One might assume that a university student sample has greater cognitive resources and would not get as easily overloaded by game elements. Previous research demonstrated that students with lower cognitive capacity were significantly more distracted by seductive details (illustrations). Their attention was drawn to seductive details more often and for longer time intervals [17], [28]. Moreover, potential pre-existing knowledge differences were not evaluated in the current study. Hence, future studies investigating effects of game elements should recruit more heterogeneous samples and employ comprehensive pretests. Finally, further research is needed to distinguish between a coherent and an incoherent augmentation of tasks with game elements. That is, game elements disconnected from the actual learning content are hypothesized to be more distracting and, thus, to hamper learning [7]. In the current study, we paid particular attention to a coherent and meaningful integration of game elements. Accordingly, task mechanic, narrative, feedback, and visual appearance were carefully matched and embodied the learning material within the structure of the game world and the learners' interactions with it. Consequently, in order to be able to generalize the current results, the way of integrating game elements needs to be studied systematically with different types of games, in different contexts, and for various learning domains [9]. Despite the aforementioned limitations, the current study makes a strong case against negative effects of game elements on performance, at least during rather short interactions with instructional material.

VI. CONCLUSION
The present findings showed that game elements did not impair performance in number line estimation, an often-used task for assessment and training of number magnitude understanding. Consequently, from a theoretical perspective, we did not observe the seductive details effect elicited by augmenting a task with game elements. We did, however, find different eye-fixation behavior for the game and the no-game condition. Eye-tracking data revealed that participants' selective attention was captured by game elements in the game condition. However, no significant performance difference was observable between the game and the no-game condition. The observed qualitative differences in fixation behavior might also originate from increased user and flow experience. Participants indicated that they felt the game task to be more attractive, stimulating, and novel, and that they had fewer problems concentrating as the task felt smoother and more fluent. On the practical side, we demonstrated that eye tracking is a valuable tool for exploring users' attention toward game elements, which would not have been possible, or not in as much detail, with conventional post-hoc questionnaires. Accordingly, developers of educational technologies and games can use the information gathered with eye tracking to enhance the design of learning materials. Finally, we would argue that game elements should be included in tasks rather than excluded, but only when game elements and learning mechanics are appropriately integrated and the game does not overstrain players' cognitive resources.