Beyond Normal: Preparing Undergraduates for the Work Force in a Statistical Consulting Capstone

In this article we chronicle the development of the undergraduate statistical consulting course at Miami University, from canned to client-based projects, and argue that if the course is well designed with suitable mentoring, students can perform remarkably sophisticated analyses of real-world data problems that require solutions beyond the methods encountered in previous classes. We review the historical context in which the consulting class evolved, describe the logistics of implementing it, and review assessment and student reaction to the course. We also illustrate the types of challenging projects the students are confronted with via two case studies and relate the skills learned and reinforced in this consulting class model to the skills demanded in the modern statistical work force. This course also provides an opportunity to strengthen and nurture key points from the new American Statistical Association guidelines for undergraduate programs: namely, communicating analyses of real and complex data that require the application of diverse statistical models and approaches. Supplementary materials for this article are available online. [Received December 2014. Revised July 2015.]


Byran Smucker is Assistant Professor, Department of Statistics, Miami University, Oxford, OH 45056 (E-mail: smuckerb@miamioh.edu). A. John Bailer is University Distinguished Professor and Chair, Department of Statistics, Miami University, Oxford. The students in the Data Practicum classes in the spring semesters of 2010 and 2013 conducted the analyses summarized in the case studies reported in this article, and the figures included in this article and supplementary materials were extracted from their client reports. The clients whose projects are featured in this article were The Oxford Parking & Transportation Advisory Board and Professor William Renwick from the Department of Geography at Miami University. Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/r/tas.

INTRODUCTION
In recent years, there has been a growing realization that a statistical consulting experience is critical to the undergraduate statistics curriculum. This is evidenced by the American Statistical Association (ASA) 2014 Curriculum Guidelines for Undergraduate Programs in Statistics (American Statistical Association Undergraduate Guidelines Workgroup 2014), which reflect the increasing need of statistics undergraduates to be fluent not only in the mathematical underpinnings of the discipline, but also in computational, data management, and practical data analysis aspects of statistics. Early consulting courses (e.g., Spurrier 2001; Miami University until approximately 2010) focused on canned projects, which allowed the instructor to control the structure of the class as well as the range of statistical methods encountered. Use of such labs promotes the development of writing, presentation, and teamwork skills, but implies a known solution that the instructor possesses. Moreover, the preplanned model avoids unleashing undergraduates on real data (with all of the associated complications and subtleties in both the analysis and its communication) and denies them the opportunity to experience statistical consulting in its native state.
More recent literature develops this critique of the canned approach to student consulting by emphasizing the importance of exposing students to the challenging reality associated with formulating a statistical problem, cleaning the data, and dealing with the inevitable difficulties that do not tend to arise in projects that have already been analyzed. Mackisack and Petocz (2002) described a capstone course for seniors and discussed some of the accompanying difficult issues, such as keeping students engaged and handling messy data. Jersky (2002) characterized an undergraduate consulting course that includes both statistics majors and other undergraduates who may not have any more than an introductory statistics course in their background. Jersky's approach is similar to, though less extreme than, Taplin (2003) who endorsed a model in which a mix of real and canned projects are used to expose students to statistical consulting during the early part of their training in statistics, in hopes that these students can be recruited to the discipline. Boomer et al. (2007) and Hooks and Malone (2012) described real undergraduate consulting experiences at several institutions of varying sizes, with prerequisites varying from one to three courses in statistics. Legler et al. (2010) described a nonclass-based model in which students are selected to be a part of a year-long program that allows them ample time to explore and solve a substantial consulting problem. Similarly, Kim et al. (2014) discussed an undergraduate consulting program in which students apply to be a part of a consulting team, overseen by senior student consultants and faculty members. Included in the Kim et al. article is some feedback collected from the student consultants regarding research and professional development skills.
In the present article, we add to the chorus in favor of requiring undergraduates to work on real consulting projects, where these projects may require methods and approaches for which they have not had formal training. We first give an overview of undergraduate consulting at Miami, including a brief history, a description of the class, and a discussion of its benefits and challenges. We also include a characterization of the class's larger context within the undergraduate statistics curriculum and within the university more broadly. Then, we report on assessment and student feedback on the course, to understand the skills being developed as well as student perception of learning. Subsequently, we discuss two specific examples of challenging projects that the students navigated exceptionally well. Finally, we include some summary and concluding remarks.

HISTORY AND BACKGROUND: UNDERGRADUATE CONSULTING IN STATISTICS AT MIAMI

History and Context
Miami University includes about 15,000 undergraduate students and a total enrollment of around 17,000. In 2009, the Department of Mathematics and Statistics partitioned, and the Department of Statistics was born. The new department has substantially increased the profile of statistics on campus, as evidenced in part by the steady growth in the size of the Statistics major, from only a handful five years ago to approximately 60 now. In the same time period, the size of the Actuarial Science minor has tripled to 90, and students have begun to pursue combined bachelor's-master's degrees. A new co-major in Analytics, a joint program with the Information Systems & Analytics Department in the Farmer School of Business at Miami, has been launched and in its two years of existence has grown to almost 70.
Within the context described in the preceding paragraph, we chronicle the development and current implementation of the statistical consulting class at Miami. The graduate version of the Data Practicum class began in 1973, a recognition that consulting is often a part of a working statistician's job description, and an exposure to "real world" problems would provide valuable experience. The quotation marks indicate that despite this motivation, the early versions of the course were labs derived from consulting problems that had been solved previously by staff in Miami's statistical consulting center.
The undergraduate version of the course was developed around 1994, inspired by and modeled after its graduate cousin. In the spring of 2010, the second author transitioned the undergraduate consulting class from canned labs to real projects after having done the same with the graduate version of the class years previously. It currently serves as a required senior capstone for Statistics majors as well as a possible elective for Math & Statistics majors, Statistics minors, and Statistical Methods minors. Students from the minors come from a variety of majors (e.g., psychology). The mix of students can be challenging but is on balance beneficial because often the nonstatistics majors have more mature perspectives on data based on their experience within their discipline, while Statistics majors tend to be stronger in statistical computation and/or methods. The minimum prerequisite is an introductory statistics course and an introduction to statistical modeling. Statistics majors will far exceed this minimum background requirement, and will often include a number of methods courses as well as a statistical computing course.
The consulting course satisfies the quantitative literacy (QL) requirement of the College of Arts & Science (though the regression prerequisite for the nonmajors that take the consulting class satisfies the QL requirement, the prerequisite for the statistics majors does not) and fulfills a capstone course requirement as well as an experiential learning requirement of the Global Miami Plan for Liberal Education.

The Course
A critical first step that occurs in the months before the consulting class is a call for interested clients. An invitation is distributed electronically that invites colleagues from around the university and local community to "put our students to work." An example of the invitation letter is included in the online supplementary material A. Since the resulting projects are integral to the training of our students, and most of the clients have been drawn from unfunded projects within the university, we charge no client fees. The number of projects can vary (often around three), though as suggested in the online supplementary material A there could be more if the proposed projects were relatively simple in structure and scope. A substantial proportion of our current clients are repeats, returning on the strength of prior work in either the graduate or undergraduate consulting class.
The course begins with some introductory material about statistical consulting, and an initial, simple, canned project is assigned for which the students are to write a draft report. Feedback is given on the first draft, to set their expectations and convey important ideas for future reports, including the expected structure and the importance of revising, critical reading, and graphical displays. The students find that the course instructor is likely to be the worst client they will encounter during the semester. As an aside, the class continues to evolve. In a recent offering of the course, research integrity has been emphasized by requiring a reading of the ASA guidelines for ethical practice and by having the campus director for research compliance visit the class and discuss expectations for research conduct. Recently, the department has set up an RStudio server for use in classes, and R Markdown (RStudio 2013) documents have been used in the regression and graduate consulting classes, with plans to incorporate this reproducible analysis tool into the undergraduate consulting class, as has been done elsewhere (Baumer et al. 2014).
Each client comes to class and presents their project to the entire group of 10-20 students. The students ask questions and work to understand the scope of the problem. Particularly early in the semester, the instructor serves as a model consultant by ensuring that as much pertinent information as possible is extracted. After the initial meeting, the students typically handle additional interactions with the client, up until the arrangements for the final presentation. The best students are engaged and interactive during the initial meeting, and measures can be taken to encourage even quiet students to take part. For instance, each student might be required to submit a brief report after the initial client meeting that includes the client's contact information, describes the research question, summarizes the available data and how it is stored, and requires an indication of their interest in leading a team on this client project.
Following the initial client meeting, the students are typically divided into several teams of three to four students, with one student appointed as the leader of each team. Depending on the project, groups may all work on the same client objectives, or each on distinct objectives. The assignment of students to teams in ways that balance majors and statistical expertise while rotating team leadership and membership among projects is critical. For instance, each team typically includes a student who has training in statistical programming since computing is so important to most projects. Throughout the semester, each student serves as a lead at least once. Ideally, the leader coordinates and delegates according to the strengths of the various members (see the Benefits and Challenges section for more discussion of this point). Often, more than one project is ongoing simultaneously, since initial client meetings are usually not spread uniformly across the semester.
A major role of the instructor is of the experienced senior consultant who can suggest analysis strategies, since many projects include parts that require new statistical or computational methods. Sometimes the introduction to a topic is fairly extensive (e.g., a guest lecture on a statistical topic or method), but often it is brief, only pointing the students in a particular direction with few details (e.g., a description of the basic functionality of R's mixed model function nlme). The extent to which the instructor is involved depends upon the group of students.
The students then begin work on the problem, and periodically share their progress with the instructor and the rest of the students during class. Often, students are given concrete direction to guide their report. For example, teams might be directed to answer three questions: What was accomplished since the last class meeting? What problems/challenges are you currently facing? What are you planning to accomplish before the next class meeting? The best groups make excellent progress (see the subsequent case studies); the poorer groups may need more detailed oversight. Additional, out-of-class meetings with the instructor or client are scheduled as needed. It is unusual, though not unprecedented, for a group of undergraduate students to define the next steps in a project without the input of the instructor.
The amount of time spent on a project depends upon its size and difficulty. Once the steps to solving the problem are clear, and the group has made progress on its implementation, a deadline is set for an initial draft report. Once the report is submitted, the instructor provides extensive feedback and returns it to the students. Since the quality of the consulting product is paramount, this iterative feedback procedure may continue for several rounds until the instructor is satisfied that the report is of adequate quality for dissemination to the client. The client then returns to the class to listen to an oral presentation by the students.

Learning Outcomes and Assessment
In 2012, the Department of Statistics established Learning Outcomes for its Statistics B.S. major. The third one, which was to be assessed in the statistical consulting course, is: Students shall be able to effectively communicate, both orally and in written form, the results of statistical analyses to both the expert and the layperson.
All three learning outcomes are listed in the online supplementary material B. In addition to reinforcing the third learning outcome, the consulting course clearly addresses the first one as well, which has to do with the analysis and interpretation of data using statistical methods and programming.
As noted previously, the consulting course is formally a capstone, which denotes a course that requires the synthesis and integration of material from across the student's experience at Miami. Capstone courses include specialized technical material like that taught in statistics courses, as well as liberal educational skills like critical thinking, teamwork, and communication. In addition, capstones require students to take initiative in the investigation of problems. The consulting course is also designated as a service-learning course, which means that it focuses on experiential student learning in ways that lead to benefit for both student and client.
Assessment in the consulting course takes several forms. First, we specifically evaluate one project during the semester using rubrics (see supplementary materials C and D, available online) that reflect writing, oral presentation, and overall communication to both the statistical expert and layperson. The evaluation can be complicated at times by the fact that most of the work in the course is within the context of a mixed group of Statistics majors and other students, and the assessment calls for the individual appraisal of each Statistics major. This complication is handled by focusing on groups that have a majority of Statistics majors, or at least are led by a Statistics major. Oral presentations are easier to assess individually.
For instance, in the spring of 2013, three of the four groups assessed (on the project described in Section 4) produced final reports that met or exceeded expectations, while one group failed to meet expectations. All groups were led by statistics majors, but none were composed solely of them. Note that none of the groups, in any of the categories, produced acceptable reports in their first draft, though two groups were fairly close. Oral presentation skills can be more easily assessed individually, and four of the six statistics majors performed adequately while the other two were not too far from acceptable.
Another way to assess the effectiveness of the course is to hear from the students themselves via student evaluations or reflection papers. In surveying student comments regarding the class in spring 2011, recurring themes involved an appreciation of the real-world experience gained in the course and the problem-solving skills that were required and honed. A sample of these comments was extracted and then grouped into categories. Frequencies of responses in the categories were tabulated and are displayed in Table 1. We emphasize that our reporting is only an informal and rough indication of student sentiment.
Students felt, as indicated in Table 1 by the first and most common category, that their communication skills improved with the practice that they gained over the course of the semester. The second most common category, labeled "background" in Table 1, reflected the sense that many students believed they needed more statistical content or programming and computational skills coming in, not surprising given the diversity of backgrounds among the students. Programming skills were mentioned as particularly important. Many students mentioned the team experience in the "group" category, mostly in a positive light although there were some problems with team members not pulling their weight or project leaders failing to lead effectively. The next two categories, "process" and "experience," were strong positive categories that summarized students' belief that they developed a better understanding of the consulting/collaboration exchange, and that they had a positive response to the class itself, respectively. That this consulting class was helping to prepare them for the work force was reflected in the obviously named "work force preparation" category. The students recognized that the collaborative and project-oriented nature of the class mimicked a real-world work environment. Some students expressed frustration at the repetitiveness of multiple teams reporting on the same client project (last row of Table 1). Note that other comments were provided but with lower frequency. For instance, some students included superlatives indicating that this was the best statistics course they had taken at Miami. There was also some sentiment indicating an appreciation of the freedom of the students in doing the analyses, and the value of learning new methods. Interestingly, the typical student concern in a statistical consulting class, that it takes too much time, did not surface as a predominant concern.
The student self-assessment we describe makes clear that the course promotes the "real applications" portion of the 2014 Curriculum Guidelines for Undergraduate Programs in Statistics.

Benefits and Challenges
A move to a real, client-based class has obvious benefits over the traditional consulting course: it exposes the students to realistic and meaningful statistical work and hones their interpersonal skills toward peers, superiors, and clients alike. But the challenges are real and substantial as well, and they center primarily on the difficulties associated with organizing and motivating groups of undergraduates to pull their weight all in the same direction.
Projects may vary substantially in complexity and undergraduate students may vary substantially in their motivation and conscientiousness, so workload balancing can be a difficult issue. How do you ensure that everyone receives a fair grade, when individual assessment is difficult given the group setting? What can be done about free riders? Or, what if group dynamics become unhealthy? There are no obvious and easy answers to these questions, of course, but one effective ameliorative approach is to require each student to independently and anonymously evaluate the contribution of their team members. A helpful question for evaluating contributions is "Would you recommend this person to another team?" [scored 1 (never) to 10 (enthusiastically)]. After smoothing extreme ratings, student grades can be reduced in proportion to their rating. This peer-evaluation approach might be combined with a self-assessment in which the student argues for the grade they believe they deserve. These measures do not necessarily eliminate the problem, but help to identify high achievers and free riders and consequently allow the grade to be matched more closely with each student's effort and contribution.
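The peer-rating adjustment described above might be sketched as follows. The article does not say how extreme ratings are smoothed or how the reduction is scaled, so the clamping rule, the `max_dev` and `full_credit_at` parameters, and the function names are all illustrative assumptions rather than the procedure used in the course.

```python
from statistics import mean, median

def smoothed_rating(ratings, max_dev=2.0):
    """Average the 1-10 peer ratings after pulling outliers toward the median.

    ASSUMPTION: "smoothing" is implemented here by clamping each rating to
    within max_dev points of the median; the article does not give a rule.
    """
    mid = median(ratings)
    clamped = [min(max(r, mid - max_dev), mid + max_dev) for r in ratings]
    return mean(clamped)

def adjusted_grade(team_grade, ratings, full_credit_at=8.0):
    """Reduce a team grade in proportion to the smoothed peer rating.

    ASSUMPTION: ratings at or above full_credit_at carry no penalty, and
    lower ratings scale the grade by rating / full_credit_at.
    """
    factor = min(1.0, smoothed_rating(ratings) / full_credit_at)
    return team_grade * factor

# Hypothetical ratings received by one student from three teammates.
print(adjusted_grade(90.0, [9, 8, 1]))
```

Clamping toward the median keeps one unusually harsh or generous teammate from dominating the result, which matters when a team of three or four yields only two or three ratings per student.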
Another important measure is to vary the composition of the groups throughout the semester. Much like in experimental design where we ensure that levels of various factors are not simultaneously changed in a way that results in a high correlation between the factors, we can learn about the quality of individual students by shuffling students into different groups.
One of the students' challenges is managing multiple projects at the same time. Interestingly, though they multitask across many classes every day, the idea of contributing to more than one project at the same time within a particular class sometimes troubles students. Connecting this experience to the workplace (where they will face competing and overlapping pressures) and to their other academic experience helps bring them to an appropriate perspective.
Finally, the role of the instructor in the consulting class is very different from traditional classes or even from the role played in a traditional canned lab consulting class. The instructor needs to solicit projects for the class and needs to gauge the difficulty of projects up front. For instance, the challenging project discussed in Section 4 was welcomed as the first of the semester! Ideally, the difficulty of the projects would ramp up throughout the semester, which suggests that the instructor might work to understand the projects ahead of time. On the other hand, too much project curation militates against the philosophy of the class, which is to charge students to address whatever project they are presented with. In addition, we believe it is useful for the students and the instructor to work together to understand the problem, and this can happen most effectively if no one knows too much about the project in advance.
Overall, the change to a client-based class has allowed critical engagement of the students with clients and the client-defined tasks, and required that the students work to understand the client's subject matter, which is often key to understanding the underlying statistical questions. The new model also mimics the collaborative and iterative environment in which most working statisticians find themselves, which is important experience that can be used to great advantage in an increasingly competitive statistics and analytics job market. Indeed, students who have completed the consulting course often highlight their consulting projects in their résumés.
In the following two sections, we present an overview of two challenging projects that undergraduates in this class have undertaken. We highlight the challenges the students confronted and surmounted, while including material taken from the student reports that reflects the work that they did. A more detailed version of these case studies can be found in the online supplementary materials E and F accompanying this article. We note that in addition to these case studies, the class has included projects over the last few years such as studying the relationship between fitness and academic performance in children (kinesiology and health), pollution in the Great Miami River sediment (geology), tadpole swim velocity (biology), as well as projects from clients in mechanical engineering, nursing, accounting, and gerontology.

CASE STUDY: PARKING IN A SMALL COLLEGE TOWN
The consulting class was asked to conduct an occupancy study of the various parking areas in Oxford, Ohio, so that the Oxford Parking & Transportation Advisory Board (OPTAB) could make data-based decisions regarding meter rates, hours, fines, etc. The problem for the class was to conduct an analysis of a sample of the 790 metered spots in Oxford as well as the parking garage. The goal was to investigate different meter rates based on location to spread occupancy from "hot spots" to outer locations as well as into the unoccupied city garage. Detecting the occurrence of "meter feeding" (occupants of parking spaces extending their time beyond the maximum by adding coins to the meter before the time expires) was also of interest. An additional issue of interest was extending meter hours past the current 6 p.m. deadline, which could force many vehicles to park away from the High and Main intersection as well as produce additional income for the City of Oxford.
The class restricted attention to a total of 377 parking spaces. The students developed a data-collection plan that was implemented in March of 2010. Four days (Monday, Thursday, Friday, and Saturday) within a 1-week period of the semester were sampled hourly (11 a.m. to 3 p.m.; 4 p.m. to 8 p.m.). Each of the 14 students collected data on at least two occasions. For each metered spot, occupancy was recorded along with whether the meter was in violation.
Results reported to the OPTAB clients included an animated heat map of occupancy, a presentation originated fully from the students; this proved to be a powerful depiction of how the occupancy varied over time (see Figure 1 for one map that was included in the animated set). Other tables and graphs included occupancy/violation rate per day in block, occupancy/violation rate per day in garage, and length of stay. Plots faceted by day and block face provided insight into underutilized spaces (e.g., Figure 2).
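As a rough illustration of how the occupancy and violation rates per day and block could be tabulated from the hourly spot checks, consider the sketch below. The record layout, the block names, and the choice to compute the violation rate among occupied spots only are assumptions made for this example; the students' actual data format and definitions are not given in the article.

```python
from collections import defaultdict

def rates_by_day_and_block(records):
    """Return {(day, block): (occupancy_rate, violation_rate)}.

    Each record is (day, block, occupied, in_violation) for one spot at one
    hourly check. ASSUMPTION: the violation rate is taken among occupied
    spots; it is reported as 0.0 when no spot in the cell was occupied.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # checks, occupied, violations
    for day, block, occupied, in_violation in records:
        cell = counts[(day, block)]
        cell[0] += 1
        cell[1] += int(occupied)
        cell[2] += int(occupied and in_violation)
    return {
        key: (occ / n, (viol / occ) if occ else 0.0)
        for key, (n, occ, viol) in counts.items()
    }

# Hypothetical observations (not data from the study).
records = [
    ("Mon", "High St", True, False),
    ("Mon", "High St", True, True),
    ("Mon", "High St", False, False),
    ("Mon", "Garage", False, False),
]
for key, (occ_rate, viol_rate) in sorted(rates_by_day_and_block(records).items()):
    print(key, round(occ_rate, 2), round(viol_rate, 2))
```

Keying the counts by (day, block) mirrors the faceting by day and block face used in the plots; adding the hour to the key would give the time-indexed rates that an animated map needs.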
Students presented the results of their data collection and analysis effort to the OPTAB. The board was very impressed with the depth, quality, and insight provided by the students, and mentioned that these data would be relevant for setting parking meter rates that might include differential rates for underutilized spaces. The OPTAB members also commented that the level of work displayed by the students may have exceeded the value of a previous report that cost over $20,000 to conduct. The students were flabbergasted and made to understand in a new way the value of their work.

CASE STUDY: ANALYSIS OF RESERVOIR SEDIMENTATION RATES
Dr. Bill Renwick, Professor of Geography at Miami University, was the client for a project in early 2013 whose objective was to analyze the rate at which sediment accumulates in reservoirs in the United States. The publicly available dataset of interest included about 3900 observations on roughly 1900 different reservoirs, collected between 1755 and 1992. Within the dataset, five variables were of interest (Table 2). A measurement in this dataset specified a particular reservoir in a particular region and included an estimated sedimentation rate that was calculated based upon estimated reservoir volumes measured at two different times. A sedimentation rate was associated with the midpoint between the beginning and ending measurements. The goal of the project, as communicated by Dr. Renwick, was to model the sedimentation rate as a function of year, to determine if there was a significant change in sedimentation rates across different regions within the United States.
The challenges presented were considerable. The dataset was messy, including duplicate and conflicting observations that had to be resolved. There were also at least four critical complications that precluded a straightforward, standard analysis of sedimentation rate regressed on time, for each region. First, the response was highly skewed due to the natural bound of 0 on the sedimentation rate. This issue was largely remediated by a log transformation. The second difficulty had to do with how the sedimentation rate of a reservoir was measured: At two different times, sometimes years apart, the volume of a particular reservoir was calculated and the two measurements were used to estimate the sedimentation rate. Intuitively, one might expect the rates to be more volatile when the two measurements were taken close together, and less variable when more time had elapsed between the two measurements. This is, indeed, what was observed (Figure 3), suggesting that observations should be weighted as a function of the duration between measurements. The third important complication was that many reservoirs were measured multiple times, suggesting that the correlation between observations on the same reservoir should be incorporated into the model. None of the students in the class had ever fit a model with such structure. Finally, multiple comparisons were an issue that needed to be addressed, since there were 18 regions for which inference was desired.
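Two of the four remedies described above, the log transformation and the duration-based weighting, can be illustrated with a small weighted least squares fit for a single region. This is only a sketch: it assumes weights proportional to the number of years between the two volume surveys (the article does not state the weight function the students used), the function name and data are hypothetical, and it deliberately ignores the within-reservoir correlation, which would call for a mixed model.

```python
import math

def weighted_log_linear_fit(years, rates, durations):
    """Fit log(rate) = intercept + slope * year by weighted least squares.

    ASSUMPTION: each observation's weight equals the number of years between
    the two volume surveys, so longer (less volatile) intervals count more.
    Rates must be strictly positive for the log transformation.
    """
    y = [math.log(r) for r in rates]  # log transform reduces the right skew
    w = list(durations)
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, years)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, years))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, years, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical observations for one region (midpoint year, rate, interval).
years = [1935, 1948, 1960, 1971, 1983]
rates = [0.8, 1.1, 0.9, 1.6, 1.9]
durations = [4, 12, 7, 20, 15]
intercept, slope = weighted_log_linear_fit(years, rates, durations)
print(f"estimated change in log rate per year: {slope:.4f}")
```

A positive slope on the log scale corresponds to a sedimentation rate that grows multiplicatively over time, which is the quantity tested region by region in the analysis.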
Once the above complications were accounted for, the resulting fitted model produced a standardized residuals versus fitted plot that showed no obvious patterns and gave a reasonable level of confidence in the subsequent inference. Upon fitting the model, there were eight regions with p-values less than 0.05, indicating possible changes in sedimentation rates over time. However, once the p-values were adjusted using the procedure due to Holm (1979) to account for the multiple hypothesis tests, four regions (Mid-Atlantic, South-Atlantic Gulf, Rio Grande, California) exhibited a slope parameter significantly different from 0. Interestingly, each of the four significant slope parameters was positive, implying that the sedimentation rate was increasing across time, in contrast to the initial hypothesis.
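The Holm (1979) adjustment mentioned above is simple enough to sketch directly. The function below is a minimal standard-library version of the step-down procedure; the p-values in the usage example are invented for illustration and are not the values obtained by the students.

```python
def holm_adjust(p_values):
    """Return Holm (1979) step-down adjusted p-values in the input order."""
    m = len(p_values)
    # Visit the hypotheses from the smallest raw p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values for 18 regional slope tests (illustration only).
raw = [0.0004, 0.001, 0.0015, 0.002, 0.02, 0.03, 0.04, 0.045]
raw += [0.20 + 0.04 * i for i in range(10)]
adjusted = holm_adjust(raw)
print(sum(p < 0.05 for p in raw), "tests below 0.05 before adjustment")
print(sum(p < 0.05 for p in adjusted), "tests below 0.05 after adjustment")
```

Because the smallest p-value is multiplied by the full number of tests and later ones by progressively smaller factors, the procedure controls the familywise error rate while being less conservative than a plain Bonferroni correction.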

DISCUSSION AND CONCLUSIONS
In 2014, the American Statistical Association Undergraduate Guidelines Workgroup released curriculum guidelines for undergraduate programs in statistics. Primary takeaways included an increasing emphasis on computation and data science, the importance of exposing students to real applications as well as to a variety of methods and approaches, and the necessity of strong communication skills. These are among the competencies developed in the consulting course at Miami, which consists primarily of real applications that commonly include messy data and require that students learn new methods and computational skills. Both verbal and written communication, to both experts and clients, is emphasized as well. We believe this is a modern course that aligns strongly with modern recommendations for undergraduate curriculum in statistics.
Employers consistently indicate that career success factor skills such as leadership, teamwork, and written and verbal communication (particularly communication aimed at a nontechnical audience), as promoted by recent ASA presidential initiatives and affirmed in the 2014 curriculum guidelines, are in great demand in graduating students. These hirers often hasten to add that there is a baseline of technical skills that is expected, but their manner clearly indicates an emphasis upon those qualities that are more intangible and general-purpose. The consulting class plays an indispensable role in Miami's undergraduate program in developing these nontechnical skills. Most of these characteristics could be fostered in the old-style consulting class environment, but the element that pushes this course to the next level, beyond normal, is the presentation of messy, unstructured, undefined statistical problems by content-specialist clients. Consequently, students report that potential employers are often more interested in the experience working on consulting projects than other course experience because it demonstrates an ability to perform useful statistical analysis in a real-world, nonacademic setting. Furthermore, since the clients care about their projects, the students gain satisfaction from knowing that their work is relevant and valued.

Supplementary Materials
Section A: Example invitation letter
Section B: Learning outcomes
Sections C and D: Evaluation rubrics
Sections E and F: Case studies