ECSupplementalInformationMovie.mov (16.12 MB)

An animated explanation of average linkage hierarchical clustering

posted on 2014-10-24, 23:27 authored by Stacy Rebich-Hespanha

This animated explanation of average linkage hierarchical clustering was created as supplemental information for:

Rebich-Hespanha, S., Rice, R.E., Montello, D.R., Retzloff, S., Tien, S., and Hespanha, J.P. (2015) Image Themes and Frames in U.S. Print News Stories about Climate Change, Environmental Communication, 9(4), 491-519. doi:10.1080/17524032.2014.983534.

We first coded a set of 118 themes in a set of 350 images that appeared with 200 US print news stories about climate change (Rebich-Hespanha & Hespanha, 2014). We then used the clustering method described in this animation to identify dominant image frames based on theme co-occurrence.

This clustering approach treats two themes as related when they frequently co-occur in the same images. To perform the analysis, coding data are arranged in a table in which each image corresponds to a row and each theme to a column. Each cell records the relationship between one image and one theme: a value of 1 indicates that the image contains the theme, and a value of 0 indicates that it does not. Each image may be associated with more than one theme, and each theme may appear in one or more images.

Agglomerative clustering begins with each theme as its own cluster, then repeatedly identifies the most closely related clusters (based on patterns of co-occurrence in the image set) and joins them to form larger clusters. Grouping proceeds until all themes belong to a single large cluster with a hierarchical “tree” structure. This tree is then segmented by applying a threshold for the maximum distance allowed between members of the same cluster. Because hierarchical clustering approaches lack goodness-of-fit tests that can identify the number of clusters that significantly represent the patterns in the data, the distance threshold used to segment the clustering tree was chosen based on researcher intuition and implemented using a partitioning algorithm (Prosperi et al., 2011).
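The procedure described above can be illustrated with a minimal Python sketch. It is not the code used in the study. Several things here are assumptions made for illustration: the description does not name the measure of co-occurrence, so Jaccard distance between theme columns is used; the theme names and the seven-image table are invented toy data; and a plain distance cut-off stands in for the partitioning algorithm of Prosperi et al. (2011).

```python
from itertools import combinations

def jaccard_distance(col_a, col_b):
    """1 - (images with both themes) / (images with either theme).
    Assumed measure; the source does not specify one."""
    both = sum(1 for a, b in zip(col_a, col_b) if a and b)
    either = sum(1 for a, b in zip(col_a, col_b) if a or b)
    return 1.0 - both / either if either else 1.0

def average_linkage(table):
    """Cluster the columns (themes) of a 0/1 image-by-theme table.
    Returns the merge history as (cluster_a, cluster_b, distance) tuples."""
    columns = list(zip(*table))  # one tuple of 0/1 values per theme
    n = len(columns)
    dist = {(i, j): jaccard_distance(columns[i], columns[j])
            for i, j in combinations(range(n), 2)}

    def cluster_distance(a, b):
        # Average linkage: mean of all pairwise distances between members.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(i,) for i in range(n)]  # every theme starts alone
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(*pair))
        merges.append((a, b, cluster_distance(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(tuple(sorted(a + b)))
    return merges

def cut_tree(merges, n_themes, threshold):
    """Keep only merges at or below the threshold (a simple stand-in
    for the partitioning step) and return the resulting clusters."""
    clusters = {(i,) for i in range(n_themes)}
    for a, b, d in merges:
        if d <= threshold:
            clusters -= {a, b}
            clusters.add(tuple(sorted(a + b)))
    return sorted(clusters)

# Hypothetical toy data: rows are images, columns are themes.
themes = ["polar_bear", "ice", "smokestack", "traffic"]
table = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
]

merges = average_linkage(table)
groups = cut_tree(merges, len(themes), threshold=0.5)
for group in groups:
    print([themes[i] for i in group])
```

With this toy table and a threshold of 0.5, the first two themes end up in one cluster and the last two in another, because each pair co-occurs often while the pairs never appear together. Raising the threshold to 1.0 collapses everything into a single cluster, which shows how strongly the resulting frames depend on the threshold chosen.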

Prosperi, M. C. F., et al. (2011). A novel methodology for large-scale phylogeny partition. Nature Communications, 2, 321. doi:10.1038/ncomms1325

Rebich-Hespanha, S. & Hespanha, J.P. (2014) Metadata table for 350 images associated with 200 randomly-selected US print news stories about climate change. figshare. http://dx.doi.org/10.6084/m9.figshare.1213654

Please contact the author if you would like a .pptx version of this animation so that you can modify it to suit your needs.
