Area-level grocery purchases

2020-02-04T12:11:21Z (GMT) by Luca Maria Aiello
For each geographic aggregation (LSOA, MSOA, Ward, Borough) we provide a file containing the aggregated information on food purchases, enriched with information coming from the census.

Files are comma-separated and contain 202 columns in total. Fields include:

area_id: identifier of the area

weight: Weight of the average food product, in grams

volume: Volume of the average drink product, in liters

energy: Nutritional energy of the average product, in kcals

energy_density: Concentration of calories in the area's average product, in kcals/gram

{nutrient}: Weight of {nutrient} in the average product, in grams. Possible nutrients are: carbs, sugar, fat, saturated fat, protein, fibre. The count of carbs includes sugars, and the count of fats includes saturated fats.

energy_{nutrient}: Amount of energy from {nutrient} in the average product, in kcals
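As a rough illustration of how energy_{nutrient} relates to {nutrient}, the sketch below converts grams into kcals using the general Atwater factors (4 kcal/g for carbs and protein, 9 kcal/g for fat). The factors and the function name are assumptions made for this example; the description does not state which conversion the dataset uses.

```python
# Assumed general Atwater factors (kcal per gram).
# These are NOT taken from the dataset description.
KCAL_PER_GRAM = {"carbs": 4.0, "protein": 4.0, "fat": 9.0}


def energy_from_nutrient(nutrient: str, grams: float) -> float:
    """Approximate kcals contributed by `grams` of `nutrient` (hypothetical helper)."""
    return KCAL_PER_GRAM[nutrient] * grams


# Hypothetical average product containing 10 g of fat and 20 g of carbs.
energy_fat = energy_from_nutrient("fat", 10.0)
energy_carbs = energy_from_nutrient("carbs", 20.0)
```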

h_nutrients_weight: Diversity (entropy) of nutrients weight

h_nutrients_weight_norm: Diversity (entropy) of nutrients weight, normalized in [0,1]

h_nutrients_calories: Diversity (entropy) of energy from nutrients

h_nutrients_calories_norm: Diversity (entropy) of energy from nutrients, normalized in [0,1]
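The entropy fields above can be sketched as follows. This is a minimal illustration that assumes Shannon entropy in base 2 and normalization by the maximum entropy log2(n); the description does not specify the logarithm base or the normalization, and the input values are hypothetical.

```python
import math


def entropy(values):
    """Shannon entropy (base 2, assumed) of the distribution obtained by normalizing `values`."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_entropy(values):
    """Entropy divided by its maximum, log2(n), so the result lies in [0, 1]."""
    n = len(values)
    if n < 2:
        return 0.0
    return entropy(values) / math.log2(n)


# Hypothetical nutrient weights (grams) in one area's average product.
nutrient_grams = [20.0, 10.0, 5.0, 5.0]  # e.g. carbs, fat, protein, fibre
h = entropy(nutrient_grams)
h_norm = normalized_entropy(nutrient_grams)
```

The same two functions would apply to energy from nutrients by passing kcals instead of grams.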

f_{category}: Fraction of products of type {category} purchased. Possible categories are: beer, dairy, eggs, fats & oils, fish, fruit & veg, grains, red meat, poultry, readymade, sauces, soft drinks, spirits, sweets, tea & coffee, water, and wine.

f_{category}_weight: Fraction of total product weight given by products of type {category}

h_category: Diversity (entropy) of food product categories

h_category_norm: Diversity (entropy) of food product categories, normalized in [0,1]

h_category_weight: Diversity (entropy) of weight of food product categories

h_category_weight_norm: Diversity (entropy) of weight of food product categories, normalized in [0,1].

representativeness_norm: The ratio between the number of unique customers in the area and the number of residents as measured by the census; values are min-max normalized in [0,1] across all areas
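The definition of representativeness_norm can be illustrated with a short sketch: compute the customers-to-residents ratio for each area, then min-max normalize the ratios across areas. The area identifiers and counts below are made up for the example.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all areas share the same ratio.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical (unique customers, census residents) per area.
areas = {"A": (200, 1000), "B": (300, 1000), "C": (600, 1000)}

# Ratio of customers to residents for each area.
ratios = {area: customers / residents for area, (customers, residents) in areas.items()}

# Min-max normalization across all areas.
normalized = dict(zip(ratios, min_max_normalize(list(ratios.values()))))
```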

transaction_days: Number of unique dates on which at least one purchase was made by a resident of the area.

num_transactions: Total number of products purchased by Clubcard owners who are resident in the area.

man_day: Cumulative number of man-days of purchase (the number of distinct days on which a customer purchased something, summed over all individual customers)
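To make the difference between transaction_days, num_transactions, and man_day concrete, here is a minimal sketch over a toy list of purchase records. The record layout (one row per purchased product, with a customer identifier and a date) is an assumption for illustration, not the format of the underlying data.

```python
from datetime import date

# Hypothetical purchase records for one area: (customer_id, purchase_date),
# one row per product bought.
purchases = [
    ("c1", date(2015, 1, 5)),
    ("c1", date(2015, 1, 5)),
    ("c1", date(2015, 1, 6)),
    ("c2", date(2015, 1, 5)),
]


def area_counts(records):
    """Compute the three activity counts from (customer_id, date) rows."""
    unique_dates = {d for _, d in records}          # dates with at least one purchase
    customer_days = {(c, d) for c, d in records}    # distinct (customer, day) pairs
    return {
        "transaction_days": len(unique_dates),
        "num_transactions": len(records),
        "man_day": len(customer_days),
    }


counts = area_counts(purchases)
```

In this toy example, two customers shop on the same date, so that date counts once towards transaction_days but twice towards man_day.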

population: Total population of residents in the area according to the 2015 census.

male: Total male population in the area.

female: Total female population in the area.

age_0_17: Total number of residents between 0 and 17 years old

age_18_64: Total number of residents between 18 and 64 years old.

age_65+: Total number of residents aged 65 years or more.

avg_age: Average age of residents according to the 2015 census

area_sq_km: Surface of the area (km^2)

people_per_sq_km: Population density per km^2

Where applicable, measures are accompanied by their standard deviation (fields with suffix _std), the 95% confidence interval for the mean (suffix _ci95), and the values of the 2.5th, 25th, 50th, 75th, and 97.5th percentiles (suffix _perc{value}).
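When working with the 202 columns, it can help to group each measure with its statistic columns. The sketch below does this from a list of column names using the suffixes described above. The sample header is hypothetical, and the exact spelling of the percentile value (for example "perc2.5") is an assumption.

```python
import re

# Matches the statistic suffixes _std, _ci95, and _perc{value}.
# The numeric format of {value} (e.g. "2.5") is assumed.
SUFFIX_RE = re.compile(r"^(?P<base>.+)_(?P<stat>std|ci95|perc\d+(?:\.\d+)?)$")


def group_columns(columns):
    """Map each base measure to the list of statistic suffixes found for it."""
    groups = {}
    for col in columns:
        match = SUFFIX_RE.match(col)
        if match:
            groups.setdefault(match.group("base"), []).append(match.group("stat"))
        else:
            # A plain column (identifier or mean value) with no statistic suffix.
            groups.setdefault(col, [])
    return groups


# Hypothetical excerpt of a file header.
header = ["area_id", "energy", "energy_std", "energy_ci95", "energy_perc2.5", "energy_perc50"]
groups = group_columns(header)
```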