figshare
tract_pops_1990_2014_ny. (30.23 MB)

Census Tract population estimates by age and sex, New York State, by year, 1990-2014, using 2010 census tract definitions

dataset
posted on 2016-03-16, 18:51 authored by Francis P. Boscoe

In my work I am often asked to calculate disease rates for small areas over time periods that fall between censuses or span multiple censuses, such as 1995-2013. In the past I have used population estimates from private vendors, but these have two important limitations: first, they are proprietary and cannot be shared, and second, they often contain significant omissions and errors. I decided instead to calculate my own populations using publicly available data and established interpolation methods.

To generate the data here, I began with the census tract populations by age (5-year age groups) and sex published in the 1990, 2000, and 2010 federal censuses (citations to exact tables to be added). These were converted to 2010 census tract definitions using the Longitudinal Tract Data Base (LTDB), available here: http://www.s4.brown.edu/us2010/Researcher/Bridging.htm. The LTDB provides precise conversions between the tract definitions of different censuses. For example, 45.4% of the population of 1990 Bronx census tract 50 is assigned to 2010 tract 50.01, while 54.6% is assigned to tract 50.02. Census tracts with zero population in all three census years, consisting of water and certain parks and cemeteries in New York City, were omitted. The resulting file has data for 4,893 tracts.
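The conversion step can be illustrated with a minimal sketch. It assumes the crosswalk has already been loaded into a dictionary of shares per source tract; the function name, the dictionary layout, and the population of 1,000 are hypothetical and are not taken from the LTDB files or the dataset. Only the 45.4% and 54.6% shares come from the example above.

```python
def reallocate(source_counts, weights):
    """Redistribute counts from old tract definitions to new ones.

    source_counts: {source_tract: population}
    weights: {source_tract: {target_tract: share}}, shares summing to 1
             for each source tract (hypothetical layout, not the LTDB format).
    """
    target_counts = {}
    for source, population in source_counts.items():
        for target, share in weights[source].items():
            # A target tract may receive population from several source tracts.
            target_counts[target] = target_counts.get(target, 0.0) + population * share
    return target_counts


# Hypothetical population of 1,000 for 1990 Bronx tract 50,
# split with the shares quoted in the text.
counts_1990 = {"50": 1000.0}
weights = {"50": {"50.01": 0.454, "50.02": 0.546}}
counts_2010_defs = reallocate(counts_1990, weights)
```

Because the shares for each source tract sum to 1, the total population is preserved by the conversion.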

Each age-sex group was summed to the county level and compared with the county total published by the National Cancer Institute’s SEER program. The SEER counts incorporate adjustments by race and ethnicity, are shifted to reflect totals as of July rather than April, and include other small enhancements, all of which are documented on the SEER web page, http://seer.cancer.gov/popdata/. The census tract counts were then proportionally adjusted to match the SEER totals. For example, if the census tracts in a particular county added to 127 males aged 5-9, and the SEER total for this county was 131, then the count in each tract was multiplied by 131/127. This resulted in fractional populations, which were retained. Users who prefer whole numbers can simply round the values given here.
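The proportional adjustment amounts to multiplying every tract count in a county by a single factor. A minimal sketch follows; the function name and the three tract values are hypothetical, chosen only so that they sum to the 127 used in the example above, and the handling of a zero tract sum is an assumption rather than something described here.

```python
def scale_to_control(tract_counts, control_total):
    """Scale tract counts for one county, age group, and sex so that
    they sum to the control total (here, the SEER county total)."""
    tract_sum = sum(tract_counts.values())
    if tract_sum == 0:
        # Assumption: with no tract population there is nothing to scale.
        return dict(tract_counts)
    factor = control_total / tract_sum
    return {tract: count * factor for tract, count in tract_counts.items()}


# Hypothetical tracts whose males aged 5-9 sum to 127; the SEER total is 131.
males_5_9 = {"tract_a": 50.0, "tract_b": 40.0, "tract_c": 37.0}
adjusted = scale_to_control(males_5_9, 131.0)
```

Each tract keeps its share of the county total, which is why the adjusted values are generally fractional.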

Next, geometric interpolation between census years was used to estimate tract-level counts for all of the non-census years, using the Das Gupta method, which has been used extensively by the Census Bureau and is described here: https://www.census.gov/popest/methodology/intercensal_nat_meth.pdf. For census tracts that are growing in population, this method results in more of the growth occurring later in the period. For census tracts that are shrinking, it results in more of the shrinkage occurring earlier in the period. For the relatively small numbers seen in individual census tracts by age and sex, the results are not very different from those that would have been obtained from linear interpolation. (For the years after 2010, this step was skipped because the 2020 census does not yet exist.) These interpolated counts were then proportionally adjusted to match the SEER totals by year and county, using the same procedure as above.
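The behaviour described above can be seen in a sketch of plain geometric interpolation between two census counts. This is a simplification and not the full Das Gupta procedure documented by the Census Bureau: it ignores the April versus July reference dates, and the linear fallback for zero counts is an assumption made here so that the function is defined everywhere. The function name and the example counts are hypothetical.

```python
def geometric_interpolate(p_start, p_end, year_start, year_end, year):
    """Estimate a count for `year` by assuming a constant growth ratio
    between the counts at year_start and year_end."""
    if not year_start <= year <= year_end:
        raise ValueError("year must lie between the two census years")
    fraction = (year - year_start) / (year_end - year_start)
    if p_start <= 0 or p_end <= 0:
        # Assumption: a geometric path is undefined through zero,
        # so fall back to linear interpolation in that case.
        return p_start + (p_end - p_start) * fraction
    return p_start * (p_end / p_start) ** fraction


# Hypothetical group growing from 100 to 400 between 2000 and 2010.
growing_mid = geometric_interpolate(100.0, 400.0, 2000, 2010, 2005)
# The same counts in reverse, for a shrinking group.
shrinking_mid = geometric_interpolate(400.0, 100.0, 2000, 2010, 2005)
```

At the midpoint both cases fall below the linear value of 250: the growing group has gained less than half of its eventual increase, and the shrinking group has already lost more than half of its eventual decrease.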

Data dictionary

The data file is a comma-separated file containing the following variables:

Year

Geoid10 – 11-digit code consisting of state FIPS code (36 for New York), county FIPS code (001-123 for New York), and census tract (6 digits, with leading and trailing zeroes as needed). These are the same values used in many Census tables.

M0 – male population aged 0

M1 – male population aged 1-4

M2 – male population aged 5-9

M17 – male population aged 80-84

M18 – male population aged 85+

F0 – female population aged 0

F18 – female population aged 85+
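The variable layout can be decoded with a short sketch. The function names are hypothetical. The age ranges for codes 3 through 16 are not listed above; the sketch assumes they follow the same 5-year pattern as the listed codes (M2 is 5-9 and M17 is 80-84), which should be checked against the file itself.

```python
def split_geoid10(geoid10):
    """Split an 11-digit Geoid10 into state, county, and tract codes."""
    geoid10 = str(geoid10)
    if len(geoid10) != 11 or not geoid10.isdigit():
        raise ValueError("Geoid10 must be an 11-digit code")
    return geoid10[:2], geoid10[2:5], geoid10[5:]


def age_label(column):
    """Describe a population column such as 'M17' or 'F0'."""
    sex = {"M": "male", "F": "female"}[column[0]]
    code = int(column[1:])
    if code == 0:
        ages = "0"
    elif code == 1:
        ages = "1-4"
    elif code == 18:
        ages = "85+"
    elif 2 <= code <= 17:
        # Assumption: 5-year groups, consistent with M2 (5-9) and M17 (80-84).
        low = 5 * (code - 1)
        ages = f"{low}-{low + 4}"
    else:
        raise ValueError(f"unexpected age code in {column!r}")
    return f"{sex} population aged {ages}"


# Bronx County is FIPS 005; tract 50.01 is written as 005001.
state, county, tract = split_geoid10("36005005001")
label = age_label("M17")
```

Reading Geoid10 as a string rather than a number avoids losing leading zeroes in states whose FIPS code begins with 0, which would matter if additional states were added.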

Future work

Future versions of these data may add some or all of the following:

- Additional states

- Counts by race and ethnicity

- Incorporation of a method to capture abrupt changes in census tract populations, such as when a new retirement community is constructed. The idea is to use American Community Survey population estimates to identify such instances.

- Incorporation of post-censal corrections. Here, I have used the official tables published after each census. They do not incorporate the various small corrections that were made as a result of appeals and identification of errors. These corrections are mainly given in narrative form rather than in tables, and so incorporating them may be somewhat involved.

 - Francis Boscoe

University at Albany

Department of Epidemiology and Biostatistics

 Send questions, comments to fboscoe@albany.edu
