Integrating spatial and time sensitive data to monitor social patterns: A dynamic methodology for studying social issues

This research introduces a dynamic methodology that can be used to monitor social issues using spatial and time sensitive data. The methodology was used in a project funded to assess the Highway Watch® program administered by the American Trucking Associations, United States Department of Homeland Security. The application of the methodology is of interest to a much wider social science audience, and is the focus of this manuscript. The Highway Watch® program was established to be America's ‘eyes on the road’ in post-9/11 society. The program was implemented to train trucking professionals to be on alert and report suspicious activity as a means of public safety. In an assessment of this program, our research team combined two unique datasets to measure the density and frequency of freight truck traffic in a sample of urban and interstate roadways, and thereby to estimate the coverage of potential ‘eyes on the road.’ Of interest to social scientists and spatial analysts is the use of social, spatial, and industry segment data, integrated with interactive data visualization software, to analyze at-risk places and populations. The method serves as a useful tool for monitoring social issues in time and space.


Introduction
This manuscript introduces a method of measuring public safety in the United States using an innovative mapping methodology. Funded by the American Trucking Associations, US Department of Homeland Security, the research was conducted as an assessment of the 'National Highway Watch® Analysis and Improvement Program.' Of interest to a broader social science and spatial audience is the methodology used to measure public safety, clearly an important social issue. A sound methodology for identifying risk is crucial, as it potentially influences public policy and the allocation of resources to areas of need (Porter, 2010; Porter, 2011). The methodology introduced in this manuscript combines spatial and time sensitive data with freight flow and industry segment data to identify at-risk places and populations during every hour of every day of the week. The data, methods, and software used to complete this innovative mapping and monitoring technique are discussed.
Highway Watch® was a safety and security program intended to use the skills of transportation workers to protect the nation's critical infrastructure (Anonymous 2008). In 2008, funding was provided for social scientists to assess the effectiveness of certain components of the Highway Watch® program, including the identification of larger trends in the transportation community, and to measure Highway Watch® participation and decay over time. The results of a portion of the project are particularly relevant to public safety in the United States, and are potentially useful for measuring and monitoring a wide range of other social issues. We argue that public safety is in itself a crucial social issue, as the well-being of society depends largely on the individuals, organizations, and institutions that monitor such issues to eliminate situations dangerous to the public at large. Our innovative mapping technique measures the volume of freight traffic in space and time in two major metropolitan areas, Memphis, TN and Jackson, MS, and the interstate corridor connecting the two cities, as seen in the Main Map. The combination of unique datasets and visualization techniques proves to be a creative and powerful tool of assessment.

Methods
Two unique datasets are integrated with GIS software and the Google Earth interface to visualize the data and confirm at-risk places during various times of the day and week. The objective of this analysis is to assess the volume of Highway Watch®-trained truck drivers along particular road segments in reference to important points of interest, and the daily and weekly coverage of these areas. Visualizing the data provides a clear assessment of where and when particular areas are considered at risk or unsafe. The data presented represent a sample of the possible coverage using these data and this method. As prior research has stated, traffic map visualizations are best created using logical models describing underlying and observed patterns (Xiao, Gerth, & Hanrahan, 2006), which we do to the best of our ability in the Main Map presented here. Our maps are an interactive method of traffic visualization, and patterns are evident in specific grids during specific times of the day and week in our time-sensitive data.

ATRI GIS data description
As part of the Federal Highway Administration's Freight Performance Measures initiative, the American Transportation Research Institute (ATRI) maintains an anonymized multi-year truck position database which contains billions of location records for more than 500,000 unique vehicles (Institute ATR 2008). Within this database, individual truck positions are reported multiple times per day, with a typical 'operating' truck reporting a position every 5 minutes to 1 hour. The database represents an estimated 15% of all truck traffic in the continental United States.
Our research team contacted ATRI to incorporate parts of this database into a pilot GIS study. The pilot study focused on two Metropolitan Statistical Areas (MSAs): the Memphis, Tennessee metropolitan area and the Jackson, Mississippi metropolitan area. Additionally, the Interstate 55 corridor between the two cities is included. Truck locations are aggregated into 1-hour time blocks and are plotted in one-mile square grids within each MSA, and in one-mile by three-mile grids on I-55 connecting the two MSAs. The truck location data are merged with the Department of Homeland Security's 'Homeland Security Infrastructure Program Gold' GIS database (Anonymous 2007), which includes geographic features, road networks and critical infrastructure locations.

HSIP gold data description
The Homeland Security Infrastructure Program (HSIP), compiled by the National Geospatial-Intelligence Agency and the Department of Homeland Security, is a dynamic dataset that aids multiple entities in keeping America safe and informed. The main goal of HSIP is to provide a unified homeland infrastructure geospatial data inventory for use by the Homeland Security community. The program coordinates 311 spatial data layers within 13 infrastructure sectors and more than 10 million records in one location, drawn from state and national sources such as the United States Geological Survey (USGS), Census Bureau, EPA, Corps of Engineers, DHS, US Coast Guard, National Guard, FCC and GeoTel. The national data in HSIP include automobile traffic, power facilities, emergency response centers (by type), and petrochemical data. The dataset also has information regarding highways, population, schools, law enforcement and other demographic data for pertinent areas of interest (Jackson, 2010; Domestic Preparedness Branch, 2007). HSIP Gold contains a subset of the HSIP vector data holdings that is released to the federal community for official use in the homeland security mission. The current dataset is continuously expanding with updated information. The main function of HSIP is to open the lines of communication and information between agencies, across states and bureaucratic levels.

Description of the maps
Before describing the spatial patterns in each of the maps presented, we provide an estimate of the number of Highway Watch® trained drivers. For example, suppose a particular road segment has 100 trucks passing within a 24-hour period, according to the ATRI freight performance measure data. We estimate that this sample (assuming that it is a representative sample of truck movement in the United States) represents 15% of actual truck traffic on that road segment. Thus, the true number is estimated to be 667 trucks (100 / 0.15) using that road segment during a 24-hour period. We use the 2007 percentage of Highway Watch® trained individuals in the transportation sector (7.4%) to calculate the potential number of trained drivers on the road segment during a 24-hour period (again, assuming that Highway Watch® trainees are randomly distributed). This suggests that 49 trucks contain a Highway Watch® trained driver. Of those trainees, only 20% report having called in a report to either 911 or the Highway Watch® Call Center in the last 12 months. Thus, we estimate that about 10 Highway Watch® trainees on the road segment are likely (based on past performance) to report an incident.
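The arithmetic in the preceding paragraph can be reproduced with a short sketch. The function name and parameter names below are hypothetical and introduced only for illustration; rounding to whole trucks at each step is an assumption made to match the figures quoted in the text.

```python
def estimate_likely_reporters(observed_trucks, sample_share=0.15,
                              trained_share=0.074, reporting_share=0.20):
    """Scale an observed truck count to an estimate of likely reporters.

    Rounding at each step is an assumption chosen to match the worked
    example in the text; it is not a documented part of the method.
    """
    # Scale the ATRI sample (an estimated 15% of traffic) up to total traffic.
    total_trucks = round(observed_trucks / sample_share)
    # Apply the 2007 share of trained individuals in the transportation sector.
    trained_drivers = round(total_trucks * trained_share)
    # Apply the share of trainees who reported a call in the last 12 months.
    likely_reporters = round(trained_drivers * reporting_share)
    return total_trucks, trained_drivers, likely_reporters


total, trained, reporters = estimate_likely_reporters(100)
print(total, trained, reporters)  # 667 49 10
```

Both scaling steps rest on the assumptions stated in the text: that the ATRI sample is representative and that trainees are randomly distributed across trucks.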
The animated video embedded within Map 1 shows the average freight flow by 1 × 1 mile grids in the city of Jackson, MS from Monday at 12:00 AM through Sunday at 12:00 PM. The darkest shade of red indicates the highest volume of traffic, and the orange shade represents the next highest. Yellow represents the lowest volumes of traffic. The southern part of the metropolitan area contains the most significant cluster of high truck volume in the city throughout an average week. The average number of trucks per year at this location ranges from 300 to more than 500 during midday on weekdays, meaning that there is potentially a high volume of trained "eyes on the road" in this place and time. Other relatively high volumes of traffic are evident on the major roadways that run through Jackson, namely the Interstate 55 corridor, as indicated by the yellow running through the middle of Map 1. Here, between 100 and 300 trucks, on average, pass through these grids during the afternoon and evening hours of weekdays. While this is important information for the ATA Highway Watch® program to use in assessing the distribution of potentially trained drivers, the spatial distribution exhibited in the map also illustrates the informative nature of time and space data.
The animated video embedded within Map 2 shows the spatial distribution of trucks in 1 × 1 mile grids in Memphis, TN from Monday at 12:00 AM to Sunday at 12:00 PM. Here, there are two locations of high traffic volume in the city: downtown (the red grids on the left side of Map 2) and southeast Memphis (the red grids in the bottom right-hand corner of Map 2). In these locations, an average of upwards of 500 trucks pass through the city during afternoon hours throughout the week. As indicated in the previous paragraph, the spatial distribution of data, combined with the element of time, provides a new dimension to data visualization that can be tremendously useful in research. In terms of this project, the volume of traffic during each hour of the day is a crucial element given the goals of the Highway Watch® program: it suggests that these areas of Jackson and Memphis may be at the least risk of a threat to public safety, or the most prepared to report an incident.
As indicated in the legends for Maps 1 and 2, the red and orange bars indicate the highest volumes of traffic and are thus the tallest in the three-dimensional depiction illustrated in the videos. The yellow grids represent significant traffic volumes as well, but to a lesser degree than the darker colors. To relate these findings to our particular project, it is informative to visualize the changing geographic distribution of "eyes on the road" at every hour of every day of the week, i.e. full coverage of "eyes on the road" given existing data. This analysis allows for an assessment of places at risk in time and space, two crucial elements for projects such as this and for any other research incorporating these elements. The creative use of Google Earth with spatially anchored data results in a unique, interactive, and dynamic spatial assessment that we have never seen implemented in previous social science research.

Conclusions
By combining freight flows and infrastructure data in Geographic Information Systems and Google Earth, we have successfully identified the highest volumes of traffic by hour and day in two American metropolitan areas. The use of these unique datasets and data visualization techniques provides a reliable and valid estimate of the number of 'eyes on the road' at all times. We have shown, using a small sample of two US cities and an interstate corridor, how effective this method of data visualization is for safety and security-related research.
The implications of this research stretch beyond the findings presented in this manuscript. Data are available to assess 'at-risk' places across the entire United States, and in relation to a wide variety of critical infrastructure and points of interest. For instance, it would be quite simple to determine high and low periods of surveillance in and around schools, manufacturing plants, public venues and stadiums, airports, emergency services, banks, government entities, and more. This method can also be taken another step forward. Adding Census data at the county or sub-county level will allow for a demographic analysis of 'at-risk' populations by race, socioeconomic status, age, and other characteristics. Understanding and assessing places and people that are at risk is certainly important for sociologists, other social scientists, and society in general. It leads to questions about the societal role of protection and about who should receive resources, which resources, and when and where they should be deployed.

Software
The most important aspect of the ATRI/Google Earth visual is to show the possibilities of analyzing and visualizing freight flow data, and the implications of this type of research. The following section briefly describes the 2007 average daily and hourly patterns of freight truck density found in the ATRI/Google Earth visual of the MSAs of Jackson, MS and Memphis, TN, and the I-55 corridor connecting the two cities. The visuals show freight flow density using Greenwich Mean Time (GMT), but are interpreted according to Central Standard Time, which is six hours behind GMT. Jackson and Memphis are located in the Central Time Zone.
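The GMT-to-Central-Standard-Time reading described above amounts to shifting each hour-of-week slot back by six hours, wrapping around the week. The following is a minimal sketch of that shift; the function and constant names are hypothetical, and a fixed six-hour offset is assumed throughout (daylight saving time is not modeled, since the text refers only to Central Standard Time).

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
CST_OFFSET_HOURS = -6   # Central Standard Time is six hours behind GMT
HOURS_PER_WEEK = 7 * 24


def gmt_to_cst(day_index, hour_gmt):
    """Convert a (day-of-week, hour) slot from GMT to CST.

    day_index: 0 for Monday through 6 for Sunday.
    hour_gmt: hour of day in GMT, 0 through 23.
    Returns the (day_index, hour) pair in CST, wrapping around the week.
    """
    hour_of_week = day_index * 24 + hour_gmt
    shifted = (hour_of_week + CST_OFFSET_HOURS) % HOURS_PER_WEEK
    return divmod(shifted, 24)


day, hour = gmt_to_cst(0, 3)   # Monday 03:00 GMT
print(DAYS[day], hour)         # Sunday 21
```

Because the shift wraps around the week, the first six GMT hours of Monday are read as Sunday evening in local time.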
In order to visualize the ATRI data, we first had to convert the tabular data exported from their database into a KML (Keyhole Markup Language) file, the native file format of Google Earth. The data provided by ATRI consisted essentially of two flat text files: one containing geographic information on the location of the one-mile square grid cells in the MSAs (three-mile-long cells along the I-55 corridor) and the other containing the time period, date, corresponding grid cell, and the truck count. Because we had data for the entire year, we first imported the data into a database so that they could be aggregated, normalized, and statistically examined on the basis of such things as 'day of the week.' Generating the visualization required a custom script that created, in KML, the one-mile square grid cells overlaying Memphis, Jackson, and the I-55 corridor in between. The final product includes 168 individual grids (7 days × 24 hours per day) for Memphis, TN, Jackson, MS, and the I-55 corridor.
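The aggregation by day of the week and hour described above might look roughly like the sketch below. The record layout (ISO date, hour, cell identifier, truck count) and all names are hypothetical, since the actual ATRI file format and the database used are not specified here; averaging over the records present for each slot is likewise an assumption.

```python
from collections import defaultdict
from datetime import date


def average_by_weekday_hour(records):
    """Average truck counts per (cell, weekday, hour) slot.

    records: iterable of (iso_date, hour, cell_id, truck_count) tuples.
    This layout is a hypothetical stand-in for the ATRI flat files.
    Returns a dict keyed by (cell_id, weekday, hour), with Monday == 0.
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of counts, n records]
    for iso_date, hour, cell_id, truck_count in records:
        weekday = date.fromisoformat(iso_date).weekday()
        key = (cell_id, weekday, hour)
        totals[key][0] += truck_count
        totals[key][1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


sample = [
    ("2007-01-01", 12, "cell_A", 40),  # a Monday
    ("2007-01-08", 12, "cell_A", 60),  # the following Monday
    ("2007-01-02", 12, "cell_A", 30),  # a Tuesday
]
averages = average_by_weekday_hour(sample)
print(averages[("cell_A", 0, 12)])  # 50.0
```

With a full year of records, each grid cell would end up with up to 168 averaged values, one per hour of the week.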
After mapping the locations of the grid cells, it is possible to use the time component features built into the KML standard to create what amounts to an animation of the truck flow data over a period of time.
In this case, we assigned each grid cell a height (z-value) attribute representing the number of trucks in that cell at a given time, and shaded the cells with a color based on that height to accentuate the differences. This created the three-dimensional geographic visualization of the ATRI truck data. A combination of OGR, an open-source geographic programming library, and custom scripting converted the HSIP Gold dataset features from shapefiles into KML, so they could be overlaid onto the map along with the ATRI data. KML files were then uploaded to ArcGIS Explorer's OpenStreetMap base layer, projected, and exported as picture files to create the .pdf with all maps included in the map file. Each map shows a snapshot of freight flow density in each cell at a specified day and time, overlaid onto the ArcGIS street map layer. An animated video of the hourly traffic flow for an average week in both Jackson and Memphis was then embedded into the maps, resulting in the interactive maps seen in the Main Map.
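As an illustration of the time-stamped, extruded grid cells described above, the following sketch builds one KML Placemark per cell and time block using standard KML 2.2 elements (TimeSpan, PolyStyle, and an extruded Polygon). It is not the custom script used in the project: the function names, the height scale, and the color thresholds are all hypothetical and do not reproduce the class breaks of the published maps.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def cell_placemark(corners, truck_count, begin, end, metres_per_truck=10.0):
    """Build a KML Placemark for one grid cell and one time block.

    corners: list of (lon, lat) pairs around the cell (hypothetical input).
    begin, end: ISO 8601 timestamps bounding the time block.
    metres_per_truck: illustrative scale from truck count to bar height.
    """
    height = truck_count * metres_per_truck  # z-value encodes truck count
    # KML colors are aabbggrr hex strings; thresholds are illustrative only.
    if truck_count >= 300:
        color = "ff0000ff"  # red
    elif truck_count >= 100:
        color = "ff00a5ff"  # orange
    else:
        color = "ff00ffff"  # yellow

    pm = ET.Element("Placemark")
    span = ET.SubElement(pm, "TimeSpan")  # drives the Google Earth time slider
    ET.SubElement(span, "begin").text = begin
    ET.SubElement(span, "end").text = end
    style = ET.SubElement(pm, "Style")
    ET.SubElement(ET.SubElement(style, "PolyStyle"), "color").text = color
    poly = ET.SubElement(pm, "Polygon")
    ET.SubElement(poly, "extrude").text = "1"  # extend the cell to the ground
    ET.SubElement(poly, "altitudeMode").text = "relativeToGround"
    ring = ET.SubElement(ET.SubElement(poly, "outerBoundaryIs"), "LinearRing")
    closed = list(corners) + list(corners[:1])  # repeat first point to close
    ET.SubElement(ring, "coordinates").text = " ".join(
        f"{lon},{lat},{height}" for lon, lat in closed)
    return pm


def build_kml(placemarks):
    """Wrap Placemark elements in a KML Document and return it as a string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    doc.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


cell = [(-90.20, 32.30), (-90.18, 32.30), (-90.18, 32.31), (-90.20, 32.31)]
pm = cell_placemark(cell, 350, "2007-01-01T12:00:00Z", "2007-01-01T13:00:00Z")
kml_text = build_kml([pm])
```

Emitting one Placemark per cell and hour, each with its own TimeSpan, is what lets Google Earth play the 168 hourly grids back as an animation.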