Why No Difference? A Controlled Flipped Classroom Study for an Introductory Differential Equations Course

ABSTRACT
Flipped classrooms have the potential to improve student learning and metacognitive skills as a result of increased time for active learning and group work and student control over pacing, when compared with traditional lecture-based courses. We are currently running a 4-year controlled study to examine the impact of flipping an Introductory Differential Equations course at Harvey Mudd College. In particular, we compare flipped instruction with an interactive lecture with elements of active learning rather than a traditional lecture. The first two years of this study showed no differences in learning, metacognitive, or affective gains between the control and flipped sections. We believe that contextual factors, such as a strong group-work culture at Harvey Mudd College, contribute to the similar performance of both sections. Additionally, to maintain a rigorous experimental design, we kept the content identical across the control and flipped sections; relaxing this requirement in a non-study setting would allow us to take further advantage of educational opportunities afforded by flipping, and may therefore improve student learning.


INTRODUCTION
Most classrooms we call "flipped" or "inverted" replace traditional lectures in the classroom with active learning, a mode of instruction that places the responsibility for learning on learners through meaningful activities [7,11]. Often lectures are temporally displaced out of class time through videos or other resources. Since there is strong evidence that active learning improves student achievement [4,5,8], we wondered what benefits flipping would bring relative to an interactive, active-learning style of lecture that is common at our institution (Harvey Mudd College) and many others.
To try to answer this question, we designed a 4-year cross-disciplinary study. Each of us would simultaneously teach a flipped and a control section of a course and look for differences in student achievement and attitudes, while controlling for as many variables as possible. Here we discuss preliminary results of our work in an Introductory Differential Equations course between 2013 and 2014. We have not yet seen appreciable differences between our control and experimental groups. In this paper we will explore possible reasons why this has occurred.
Professors in Engineering and Chemistry were also involved with the design of the study, along with consultants responsible for data collection and analysis. Preliminary results from the engineering course have been submitted for publication to the American Society for Engineering Education. Our study has been supported by the National Science Foundation (TUES 1244786) and by the Harvey Mudd College Dean of Faculty.

BACKGROUND AND MOTIVATION
As evidenced by this special issue of PRIMUS, there is a growing interest in flipped classrooms. Our contribution to this conversation about flipped classrooms is a focus on the kinds of interactions that are possible as a result of the class time saved from removed lectures. Since there is strong evidence that active learning improves student achievement and reduces student misconceptions [5,8], it seems natural to flip courses to allow more time for active learning in class.
Although the term "active learning" often stands for a wide variety of instructional strategies, from here on we restrict our attention to two specific forms of active learning: problem-based learning (PBL) and collaborative learning. PBL is an active learning modality in which students solve ill-defined, ill-structured problems. It gives students the experience of using course material in a manner closer to that encountered in a "real-world" setting, thus increasing transference of course material outside of "textbook" examples. Published studies of PBL have shown that it results in more positive student attitudes, a deeper conceptual understanding of course material, the ability to apply material in new situations (transference), and improved retention of knowledge as compared with traditional instruction [8,9]. Collaborative learning, where students work in small groups with a common goal, not only improves learning, but also improves student attitudes, interpersonal interactions, and perception of social support [6,8].
Our study attempts to measure the effect of delivering instruction via videos while keeping the course content uniform across both treatment (flipped) and control (traditional) groups. This type of study is important because we still do not know why some inverted classrooms are effective [2]. Are inverted classrooms effective because more instruction takes place, because the instructors conducting the research are particularly effective, because active learning is employed, or purely due to the inversion of instruction? We have attempted to control our study as much as possible to measure the effect of inverting the course. In addition to measuring student achievement and attitudinal data through formative and summative assessments, our study attempts to measure the effects of flipped instruction on student achievement in subsequent classes.

About Harvey Mudd College
Harvey Mudd College (HMC) is a private, residential, liberal arts college of science, mathematics and engineering, with a total enrollment of nearly 800 undergraduates. It is a member of The Claremont Colleges, a collective of five undergraduate and two graduate schools, located in Southern California.
One distinctive feature of the HMC curriculum is a core set of classes that all students, regardless of major, must take. Mathematics has the largest footprint within this core curriculum: all students take six half-semester courses that span single-variable calculus, multi-variable calculus, linear algebra, differential equations, and probability and statistics. Our study involves Math 45, an Introductory Differential Equations course that is taken at the end of the first year at HMC.
There are a few things about HMC that will become more relevant when interpreting our findings. First, there is a healthy culture of student cooperation at HMC. Students spend a lot of time working together in groups in and out of class, and we encourage them to do so from the moment they arrive on campus. It is easy for students to work together outside of class since nearly all students live on campus. Second, our students generally have high self-efficacy and positive attitudes about learning science and mathematics. Third, many faculty at HMC currently use formative assessment, active learning, and group work in their classes (minute papers, iClickers, think-pair-share, and other similar strategies). Students are used to these kinds of interactions in our classes.
Finally, HMC students have many different ways to get help if they have questions about what they are learning. Students here are very comfortable visiting professors during their office hours. Since all students take the same set of core math classes, all upperclass students can and do serve as resources for first-year students. In addition, HMC offers a drop-in evening peer-tutoring program called Academic Excellence that provides targeted help and a good study environment for all core courses.

EXPERIMENTAL DESIGN AND METHODS
The guiding principle behind our study is that videos can be used to replace the less-interactive portions of our classes with more opportunities for active learning, to allow our students to work even more with each other in class, and to create more time for students to work on more complex tasks in the presence of an instructor.
We first developed some hypotheses about how these different interactions in the classroom might lead to potential student learning outcomes by creating the logic model shown below. (See Figures 1 and 2.) The items in the white boxes (on the left) are instructional opportunities afforded by our flipped classroom design; items in green boxes (in the middle) are hypothesized consequences of those opportunities; items in blue boxes (on the right) are measurable outcomes. We hypothesize that students may increase their metacognitive skills in the flipped class because they have the ability to review and control the pace of the lecture videos. This ability is related to regulation of cognition (as opposed to knowledge of cognition), which is one of the two major components of metacognition [10].
Based on these logic models, we hypothesize that inverting the classroom will lead to the following improvements (measurable outcomes) over control sections: 1. Higher learning gains. 2. Increased ability to apply material in new situations (transfer). 3. Increased interest in and positive attitudes towards science, technology, engineering, and mathematics (STEM) fields (affective gains). 4. Increased awareness of how students learn and strategies that support their learning (metacognitive gains).
We made every attempt to design our treatment and control class sections so that students in both would encounter the same mathematical topics, tasks, and assessments. The purpose of this study is therefore to determine the extent to which student outcomes are affected by the use of classroom inversion to increase the amount of time that students have with instructors on meaningful tasks.
To measure learning gains, we administered a pre-test and post-test on differential equations topics. The pre- and post-tests were identical and contained items that involved these skills and concepts: categorizing differential equations by order and linearity; mass-spring systems; and solving a first-order differential equation using separation of variables and an integrating factor. Because of exam time constraints, we were unable to measure students' mathematical modeling skills more thoroughly. However, two of the five test items required students to derive a differential equation for a physical scenario. We also used homework scores and quiz scores to measure student performance.
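For readers less familiar with the techniques named above, the following is a standard textbook illustration of the integrating factor method. It is not one of the actual test items, which are not reproduced in this article; the particular equation and the symbols p, g, and \mu are chosen here only for illustration.

```latex
% General first-order linear equation and its integrating factor
\begin{align*}
  y' + p(t)\,y &= g(t), &
  \mu(t) &= \exp\!\left(\int p(t)\,dt\right), &
  \frac{d}{dt}\bigl[\mu(t)\,y\bigr] &= \mu(t)\,g(t).
\end{align*}

% Illustrative instance: p(t) = 2, g(t) = e^{-t}
\begin{align*}
  y' + 2y &= e^{-t}, \qquad \mu(t) = e^{2t}, \\
  \frac{d}{dt}\bigl[e^{2t}y\bigr] &= e^{2t}e^{-t} = e^{t}, \\
  e^{2t}y &= e^{t} + C, \\
  y(t) &= e^{-t} + C e^{-2t}.
\end{align*}
```

Substituting back confirms the result: y' + 2y = (-e^{-t} - 2Ce^{-2t}) + (2e^{-t} + 2Ce^{-2t}) = e^{-t}.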
To measure whether students are transferring their knowledge to other contexts, we plan to collect student achievement data from downstream courses. These data have not yet been collected and analyzed, so we will not discuss knowledge transfer in this preliminary report.
To measure student attitudes towards STEM and metacognitive gains, we administered a survey before and after the course. The survey contained selected items from three established instruments: Research on the Integrated Science Curriculum (RISC), Motivated Strategies for Learning Questionnaire (MSLQ), and the STEM Questionnaires developed by the STEM team at the Higher Education Research Institute (HERI). The pre-course survey contained nine items from RISC and the remaining items were from the MSLQ (18 items). The post-course survey contained the same items and added 27 survey items from the HERI questionnaires (for a total of 54). The survey items used from the MSLQ contained constructs for self-efficacy for learning, metacognitive self-regulation, peer learning, and help-seeking. The survey items used from the RISC and HERI were related to learning gains and attitudes about engagement, preparedness, and the course in general. Select survey items from the RISC and HERI were used to answer research questions regarding interest in and attitudes about STEM.

COURSE LOGISTICS AND DESIGN
As discussed above, this study involved an introductory differential equations course (Math 45) at HMC. Regardless of major, all students are required to take this course after taking a half course each in calculus, probability/statistics, and linear algebra, unless they place out of these courses by examination. As a result, students in this course have very similar mathematical preparation, and at this point in the school year students are familiar with the expectations that mathematics faculty have about how students should learn mathematics.
Math 45 is a half-semester course that meets three times a week (Mondays, Wednesdays, and Fridays) over seven weeks. There are a total of 19 50-minute class meetings. The roughly 200 students who take this course are divided into six sections. Every year, three instructors each teach two sections of the class at two different times. In 2013, the three instructors of Math 45 (two of whom are authors on this paper) each taught one flipped and one traditional "active-lecture" section. To control for differences in learning due to time of day, we made sure that not all of the treatment or control sections of the class were at the same time.
Students signed up for their preferred section without knowledge of whether sections would be flipped or not. After the first day of class, when the study was explained to students and consent forms were passed out, we determined which sections to invert by a coin toss. Students were not allowed to switch sections, whether or not they decided to participate in the study.
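The assignment procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how the coin toss and the time-of-day constraint were combined, so the re-toss loop below is a guess, and the function name `assign_sections`, the instructor labels, and the time-slot strings are all hypothetical.

```python
import random


def assign_sections(sections_by_instructor, rng=None):
    """Pick, by coin toss, which of each instructor's two sections is flipped.

    sections_by_instructor maps an instructor label to a pair of time slots.
    Assumption (not from the article): if every flipped section, or every
    control section, lands in the same time slot, all coins are tossed again.
    """
    if len(sections_by_instructor) < 2:
        raise ValueError("need at least two instructors to vary time of day")
    for first, second in sections_by_instructor.values():
        if first == second:
            raise ValueError("each instructor needs two different time slots")

    rng = rng or random.Random()
    while True:
        assignment = {}
        for instructor, (first, second) in sections_by_instructor.items():
            if rng.random() < 0.5:  # the coin toss
                flipped, control = first, second
            else:
                flipped, control = second, first
            assignment[instructor] = {"flipped": flipped, "control": control}

        flipped_slots = {a["flipped"] for a in assignment.values()}
        control_slots = {a["control"] for a in assignment.values()}
        # Accept only if neither condition is confined to a single time slot.
        if len(flipped_slots) > 1 and len(control_slots) > 1:
            return assignment
```

For example, `assign_sections({"A": ("9am", "10am"), "B": ("9am", "10am"), "C": ("9am", "10am")})` returns one flipped and one control slot per instructor. Note that re-tossing until the constraint holds means the result is no longer a set of fully independent coin tosses, which is one reason to treat this as a sketch rather than a description of the study's procedure.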
During the six months prior to the start of this course, we developed a common set of PowerPoint slides based on lecture notes from prior years. We also created a series of video lectures using a video capture and editing software package called Camtasia. These videos are screen captures of the PowerPoint slides with live virtual ink annotations. Our voices were captured, but not our faces. The average video length is 15 minutes. The three of us who taught Math 45 in 2013 recorded the videos. We maintained a high degree of consistency in the videos by recording the first few videos together. All of us generated roughly the same amount of video. The videos and slides were made available to both the flipped and control sections, because the course does not have a required text.
We note that many aspects of the design and delivery of Math 45 are ones that Bressoud and Rasmussen identified in their broad survey of successful calculus programs [3]. (One notable exception is that HMC does not have a graduate program, so students do not interact with graduate teaching assistants. Upperclass mathematics majors may serve in an analogous role.) In particular, those of us who taught Math 45 coordinated with each other regularly. One important side effect of the study is that it forced us to sit down together to agree on everything from homework policies and course goals to nomenclature and choice of examples. We discussed the daily plan for class and debriefed afterwards. As a result, students in Math 45 had a very consistent experience despite being taught by different instructors.
We also developed a common set of homework problems and gave weekly quizzes. Students in the control sections completed all of the homework assignments out of class, whereas the flipped sections completed and discussed selected homework problems in class and completed the remainder outside of class. As a result, students in both control and flipped sections were exposed to the same material. No mathematical task was presented to one set of students that the other set of students did not encounter.
In addition to teaching the theory and solution methods for elementary differential equations, the course has always had a strong modeling component. One challenge is to present mathematical modeling as it would be encountered in practice, rather than just presenting predetermined models such as the ubiquitous mass-spring system. In the first year of the study, we assigned a number of modeling tasks, but they mostly had to do with setting up the equations from a prescribed physical situation (such as water filling a tank of a particular shape). In 2014, we shifted closer to a PBL approach by devoting a significant amount of class time for students to collaboratively work on more ambitious, open-ended problems in class. These mathematical modeling tasks were designed to help students experience mathematical modeling in a more authentic way by engaging them in all parts of the mathematical modeling process (see Figure 3). We carefully scaffolded the mathematical modeling tasks so that they would include more and more parts of the mathematical modeling process so as to more closely approximate authentic modeling. Many of the mathematical modeling tasks focused on sustainability and the environment. For example, students were asked to come up with profitable, yet sustainable, fishery management strategies; they also created models for the 2014 chemical spill in Charleston, West Virginia.
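As one illustration of where a fishery-management task of this kind might lead, the standard logistic growth model with constant harvesting is sketched below. This is a textbook model chosen here for illustration; the article does not state which models students actually built, and the symbols P (fish population), r (intrinsic growth rate), K (carrying capacity), and H (constant harvest rate) are assumptions.

```latex
\begin{align*}
  \frac{dP}{dt} &= rP\left(1 - \frac{P}{K}\right) - H
    && \text{(logistic growth minus constant harvest)} \\[4pt]
  P^{*} &= \frac{K}{2}\left(1 \pm \sqrt{1 - \frac{4H}{rK}}\right)
    && \text{(equilibria, real only if } H \le \tfrac{rK}{4}\text{)} \\[4pt]
  H_{\max} &= \frac{rK}{4} \quad \text{at } P = \frac{K}{2}
    && \text{(largest harvest the population can sustain)}
\end{align*}
```

In this simplified setting, any constant harvest above rK/4 leaves no equilibrium and the modeled population collapses, which is the kind of trade-off between profit and sustainability that an open-ended task can ask students to explore and then refine.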
Most of the class in the control sections of Math 45 was devoted to lectures using the same PowerPoint slides that were featured in the videos. We generally allowed for questions at any point during the lecture. Every lecture usually contained at least one practice problem that students would be asked to work on, either independently or in small groups. We would walk around the class during these times to formatively assess whether students were understanding the material being covered.
Students in flipped sections of Math 45 were assigned to watch one or two videos corresponding to the lecture for the day. To determine if students in Math 45 were watching the assigned videos, we created an online survey at the end of the video that students had to fill out. This data was not always accurate because of problems with the video-hosting website, but we estimate that upwards of 85% of students watched the video every time one was assigned.
We began each flipped class by spending a few minutes answering questions about the video(s) that students were assigned to watch. During the rest of the class time in the flipped sections, students worked on selected homework problems either independently or in groups and we walked around asking and answering questions. Occasionally, we asked students to raise their hands when they finished a particular part of a problem so that we could check in with them.
We selected in advance the homework problems that students worked on in the flipped sections. Usually, we chose problems that would reinforce the skills and concepts in the video students had just watched, focusing in particular on ones that might lend themselves to common mistakes or misconceptions. We would walk around the class during these times to look for these common mistakes and misconceptions. We also reserved class time to work on mathematical modeling tasks that were part of the homework assignments. We often asked students to work in small groups on these tasks. There was generally not enough time for students to finish the mathematical modeling homework problems in class, but they were able to get a head start on those problems. Students often have a difficult time knowing how to proceed on open-ended tasks that involve mathematical modeling. We hoped that by starting these problems in groups in the presence of an instructor, students would more successfully complete these types of tasks.

PRELIMINARY RESULTS
To date, Math 45 has been taught twice as part of this study, once in the Spring of 2013 and once in the Spring of 2014. Only two of the three instructors who taught in 2014 participated in the study, whereas all three instructors who taught in 2013 participated in the study. Two of the authors of this paper (Levy and Yong) taught in both years.
All of the data was collected and analyzed by Cobblestone Applied Research and Evaluation, Inc. so as to reduce the chances of bias in the management and interpretation of the data. Cobblestone's analyses appear in the next few sections. The instructors of the course never knew which students consented to have their data shared with Cobblestone.
The data from 2014 is currently being analyzed by Cobblestone, so we only report on our analysis of the 2013 data here. Preliminary analyses of the 2014 data by the instructors show that student composition, pre-test scores, post-test scores, and other class performance scores are distributed similarly to the 2013 data. Some of the technical details of these analyses are omitted here, but appear in an online supplement to this article.

Participants
In the Spring of 2013, a total of 196 students took Math 45 and 176 agreed to participate in the study. Of those, 86 students were in an inverted (treatment) section and 90 students in a control section. A statistical analysis of students' gender, ethnicity, household income level, high school GPA, level of preparedness, and first generation college student status showed no unexpected differences between inverted sections and control sections in terms of sub-group participation. That is, each of the conditions (i.e., inverted and control) had statistically equivalent students from each of the sub-groups analyzed. Overall, these findings suggest that the students in the inverted sections and the students in the control sections, while not randomly assigned, were well-matched in terms of theoretically relevant demographic and background information. (A table with percentages of participating students' gender and ethnicity appears in the online supplement to this article.)

Student Achievement Data
Analysis of the 2013 Math 45 data revealed that student achievement was nearly indistinguishable between the control and treatment groups (see Figure 4). Analysis of the Math 45 pre-test and post-test assessments showed no differences between the inverted sections and traditional sections at pre-test or at post-test. In addition, there were no significant differences in homework composite scores or quiz composite scores between the traditional and inverted sections. Cobblestone noted that all students but one who scored 80% or lower on their homework (n = 9) were in the inverted sections of Math 45. This suggests that participation in the inverted section may impair performance on the homework assignments for a certain subgroup of students. However, many of these students performed well on the exams and quizzes. Since half of these students mentioned struggles with procrastination, motivation, and time management in the open-ended comments of the student survey, the poor performance on the homework assignments may have been due to study habits more than aptitude. Also, we assume that students in the inverted sections were working more collaboratively on the homework in class, which may contribute to some "clumping" of homework composite scores that we observed, as opposed to a fairly normal distribution of scores that was observed in the traditional sections.
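To make concrete what a comparison of two groups of scores can involve, here is a minimal sketch of Welch's unequal-variance t statistic using only the Python standard library. It is an assumption that a test of this kind is relevant: the article does not say which statistical procedures Cobblestone used (those details are in the online supplement), and the function name `welch_t` is hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's two-sample t-test.

    t is the difference in sample means divided by its estimated standard
    error; df is the Welch-Satterthwaite approximation to the degrees of
    freedom. Illustrative only; not the analysis used in the study.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")

    # Squared standard error contributed by each group.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")

    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(
        se2_a + se2_b
    )
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

A call such as `welch_t(flipped_scores, control_scores)`, where both arguments are hypothetical lists of composite scores, yields a statistic that would then be compared against a t distribution with `df` degrees of freedom. A value of t close to zero is consistent with the "no significant difference" pattern reported above, although it does not by itself show that the two groups are equivalent.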

Student Attitudinal Data
Students completed a survey at the end of the course that contained three items related to their attitudes toward STEM. Students rated how excited they felt about learning new concepts, their attitudes about taking more mathematics courses, and how prepared they felt to take them. There were no statistically significant differences between the students in the traditional sections and the inverted sections. Also, the original MSLQ contains four main constructs (theoretical concepts or ideas that are generally established through the combination of three or more survey items), one of which is associated with metacognition. Eight MSLQ items from this construct were used to measure students' metacognition for this study. The analyses of these eight items did not show any significant difference from pre-test to post-test. The online supplement to this article contains more summary statistics on these attitudinal data.

Student Satisfaction Data
Although we did not make any predictions on whether students would rate their experience in the course differently, we looked at the open-ended responses to the end-of-course satisfaction surveys to see if there were any noticeable differences. There was a mix of opinions regarding the flipped classroom experience; positive and negative feedback seemed balanced.
Some of the most positive responses to the inverted format involved students' ability to learn at their own pace: "The videos really helped my learning, most likely because of the opportunity to try the practice problems at my own pace. In class, if we had practice problems, I would not even have a chance to try the problems and the class would have moved on, but the videos let me pause and take as much time as needed."
An unexpected response came from English Language Learners who reported that the videos helped them review material they might have missed in an English lecture setting.
Several students were explicit in their preference for the traditional classroom structure due to the ability to ask questions in real-time and for the greater sense of focus on and engagement with the lecture and their instructor.
One of the more common grievances among the inverted section students involved not being able to ask questions in real-time during the lecture and not being able to follow up adequately during class time. This was either because they could not keep track of what they found confusing in video lectures enough to articulate questions for class, or because they found it too difficult to get the professor's attention when they needed help. However, some students in a control section also found it "Difficult to form questions during class as I didn't have many problems to apply it to." Almost all students reported an appreciation for the ability to pause, rewind, and fast-forward through the video lectures: "The videos were awesome. Incredibly helpful when I was confused about an idea; I could re-watch that snippet again and again." Others felt they could not learn from the video lectures, and found themselves unable to overcome their frustration and confusion even with repeated viewing. There were some complaints about the pacing of the video lectures and some found it difficult to maintain their focus while watching on their own. As one student noted, "Personally, learning from listening to a professor lecture is more helpful to me than online videos." The inverted section students reported having difficulty scheduling uninterrupted time to watch the videos before class.
A few students in the control sections did have some difficulty keeping up with note-taking during in-class lectures, but because all students were granted access to the online videos and lecture notes, students in the control sections found these to be effective supplemental resources. As one student commented, "I liked having the option to watch the videos while also having lectures." Another noted, "Complete lecture notes allowed me to review material by myself, which really helped me study and understand topics." The online lecture notes (based on the PowerPoint slides) were found to be essential, with one student claiming "They were how I learned the material, so they were the most valuable to me." Regardless of condition, students most appreciated the real-life applications and connections to other science subjects offered in this course. One student commented, "I really liked the modeling aspect of the class . . . it was interesting to see how DEs could be used to model real-world situations."

DISCUSSION
That flipping seemed to have no statistically significant effect on student achievement and attitudes appears to contradict our initial logic models. However, there are important contextual factors (mentioned in the section "About Harvey Mudd College") that may help to explain this discrepancy and suggest ways that we can modify our instruction to improve student outcomes.
As pointed out earlier, the videos that were created for the flipped class were made available to the students in the control sections. It is possible that if students in the control sections extensively used these resources, they might blur the differences between treatment and control groups and that might explain the lack of statistically significant differences between the two groups. However, student surveys and data from the video-hosting web site suggest that students in the control sections accessed these videos very infrequently. Therefore, it is unlikely that the availability of the extra materials to students in control sections explains the lack of significant difference between the two groups.
One of the hypotheses in our logic model is that students in the flipped classroom would work with each other in groups in class more frequently, and that would result in better performance. However, this effect was probably significantly mitigated by the fact that almost all HMC students work with each other outside of class.
Although we were able to more quickly identify students' misconceptions in the flipped classroom, that may not have led to measurable differences in student achievement because students also have many different ways of getting help outside of class.
We also hypothesized that students' ability to review and control the pace of the videos might lead to increased student metacognition and learning. This potential increase may have been mitigated by the fact that students in the control sections were able to ask questions at any time during class and that we provided feedback to students on mathematical tasks during control classes.
Another important factor that should be mentioned is that all of the instructors in this study are relatively new to flipped instruction. (One had done it for a few courses, and the rest were completely new to it.) In contrast, all of us had years of experience teaching interactive lecture-based courses. Therefore, it is possible that as we improve our instructional methods in the flipped classes, we may see statistically significant differences in student learning and attitudes.
Students (like us faculty) hate change. It is possible that some students who dislike changes in instructional styles would have less favorable attitudes toward flipped sections of Math 45. These students may put in less effort and therefore perform at a lower level. A natural question to ask is: what would happen if we allowed students to choose between a traditional and a flipped section? Would this increased agency improve student outcomes and attitudes?
We conceive of this study on flipped classrooms at HMC as design-based research [1] and, as such, we will attempt to use the lessons learned through early stages of implementing flipped classrooms to inform changes to our research design.
One of the most restrictive aspects of this study on our teaching has been the design to keep the content (mathematical topics, tasks, assignments, exams) identical between flipped and control classes. Since we want students in control sections to encounter the same tasks as students in the flipped sections, students are limited to working on homework assignments in flipped classes. There are obviously many other ways to make use of class time that has been freed up through videos. We will continue to look for innovative ways to use class time under this restriction, but another alternative is to relax this design constraint and use the data from 2013 and 2014 as a baseline for comparison.
Helping students develop their mathematical modeling skills and assessing those skills has been and will be a challenge. Having students use standard models is simpler, but we would like to provide more realistic experiences to practice the iterative modeling process depicted in Figure 3. We would also like to develop better ways to give students feedback on their mathematical modeling attempts, and continue building our library of mathematical modeling tasks for Math 45.
A blended approach of lecturing and active learning in class through flipping may be another interesting avenue for us to explore. It might be better for us to be more strategic in our choices to free up class time using videos (rather than flipping every class) because some lecture topics may be more suitable to displace from class using video.
Finally, we urge others to be cautious of extrapolating from our results (and others in this issue of PRIMUS) to their own contexts. We suspect that there are many contextual factors and aspects of our research design that may explain why we did not see a statistically significant improvement in student learning gains and attitudes as a result of flipping our instruction. Much more research needs to be done to interrogate the contexts and conditions under which classroom inversion leads to the best outcomes for all students.

SUPPLEMENTAL MATERIALS
Supplementary data for this article can be found on the publisher's website.