posted on 05.12.2017, 23:18 by Lorena A. Barba, Natalia C. Clementi
Engineering Computations
—Original material written as Jupyter Notebooks for an undergraduate engineering course, Fall 2017

Module 2: Take off with stats

The first module of this course, "Get data off the ground," assumed no coding experience and created a foundation with Python programming constructs and data structures. You learned to play with strings, lists and NumPy arrays, using indexing, slicing, for- and if-statements, and functions. The second course module explores practical statistical analysis with Python.

Lesson 1: Cheers! Stats with beers

Exploratory analysis using a data set of canned craft beers in the US. Introduces the pandas library and its data types: DataFrame and Series. Uses pandas to read a data file, extract selected columns, and remove null values. Descriptive statistics: measures of central tendency and variability. Distribution plots: histograms with Matplotlib. Comparing with a normal distribution.
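As a rough illustration of the Lesson 1 workflow, here is a minimal sketch that reads a small CSV, extracts one column as a Series, removes null values, and computes measures of central tendency and variability. The data values and the column names (name, abv, ibu) are invented for this example and are not the course's actual data file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the beers data file: the values and the
# column names below are invented for illustration only.
csv_text = """name,abv,ibu
Beer A,0.050,35
Beer B,0.065,
Beer C,0.045,20
Beer D,,60
Beer E,0.070,45
"""

# Read the data into a DataFrame (here from a string instead of a file path).
beers = pd.read_csv(io.StringIO(csv_text))

# Selecting a single column returns a Series.
abv = beers["abv"]

# Remove null values before computing statistics.
abv_clean = abv.dropna()

# Measures of central tendency.
abv_mean = abv_clean.mean()
abv_median = abv_clean.median()

# Measure of variability: pandas computes the sample standard deviation (ddof=1).
abv_std = abv_clean.std()

print(f"n = {len(abv_clean)}, mean = {abv_mean:.4f}, "
      f"median = {abv_median:.4f}, std = {abv_std:.4f}")
```

With real data, the string buffer would be replaced by the path to the data file passed to pd.read_csv().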

Lesson 2: Seeing stats in a new light

Continuing with the data set of canned craft beers, this lesson focuses on visualizing statistics. For quantitative data: histograms and box plots; for categorical data: bar plots. Visualizing multiple variables with scatter plots and bubble charts.
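The plot types named in Lesson 2 could be sketched with Matplotlib as follows. The data values and column names (abv, ibu, style, ounces) are invented for this example, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: assumption so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Invented values and hypothetical column names, for illustration only.
beers = pd.DataFrame({
    "abv": [0.050, 0.065, 0.045, 0.070, 0.055, 0.060],
    "ibu": [35, 50, 20, 60, 30, 45],
    "style": ["IPA", "IPA", "Lager", "IPA", "Lager", "Stout"],
    "ounces": [12, 16, 12, 16, 12, 12],
})

# Categorical data: count how many beers fall in each style.
style_counts = beers["style"].value_counts()

fig, axes = plt.subplots(1, 4, figsize=(16, 4))

# Quantitative data: histogram and box plot.
axes[0].hist(beers["abv"].to_numpy(), bins=5)
axes[0].set_title("Histogram of abv")

axes[1].boxplot(beers["ibu"].to_numpy())
axes[1].set_title("Box plot of ibu")

# Categorical data: bar plot of the counts per style.
axes[2].bar(list(style_counts.index), style_counts.to_numpy())
axes[2].set_title("Beers per style")

# Two variables plus a third encoded as marker size: a bubble chart.
axes[3].scatter(beers["abv"], beers["ibu"], s=beers["ounces"] * 10, alpha=0.5)
axes[3].set_xlabel("abv")
axes[3].set_ylabel("ibu")
axes[3].set_title("Bubble chart")

fig.tight_layout()
```

In a Jupyter Notebook the backend line is unnecessary and the figure displays inline.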

Lesson 3: Lead in lipstick

A full worked example using what you learned in lessons 1 and 2: using data from studies by the US Food and Drug Administration on the lead content in lipstick, we fact-check alarming news headlines. Based on Prof. Kristin Sainani's lecture, "Exploring real data: lead in lipstick," of her Stanford Online course "Statistics in Medicine."

Lesson 4: Life expectancy and wealth

Deeper dive into pandas, using data for life expectancy and per-capita income over time, across the world. Inspired by the work of Hans Rosling. Pandas methods: head(), info(), value_counts(), groupby(), describe(), groupby.first(), groupby.get_group(), idxmin(). Categorical data type. Bubble plots, spaghetti plots, and interactive widgets.
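The pandas methods listed for Lesson 4 might be exercised on a toy table like the one below. The countries, continents, values, and column names are all invented for this sketch and do not come from the course data.

```python
import pandas as pd

# Invented toy data with hypothetical column names, for illustration only.
data = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C"],
    "continent": ["X", "X", "X", "X", "Y", "Y"],
    "year": [1950, 2000, 1950, 2000, 1950, 2000],
    "life_exp": [40.0, 60.0, 55.0, 75.0, 35.0, 50.0],
})

# Store the repeated labels using the categorical data type.
data["continent"] = data["continent"].astype("category")

# Quick inspection of the table.
first_rows = data.head(3)   # first three rows
data.info()                 # prints column dtypes and non-null counts

# How many rows belong to each continent.
continent_counts = data["continent"].value_counts()

# Summary statistics of a numeric column.
summary = data["life_exp"].describe()

# Group rows by country, then query the groups.
by_country = data.groupby("country")
first_per_country = by_country.first()    # first row of each group
country_b = by_country.get_group("B")     # all rows for country "B"

# idxmin() returns the index label of the smallest value.
lowest_row = data.loc[data["life_exp"].idxmin()]
print(lowest_row["country"], lowest_row["year"])
```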


Note—If you have suggestions for changes or improvements to this material, please open an issue on the GitHub repository.


