File(s) under permanent embargo

Discovery of latent subcommunities in a blog's readership

journal contribution
posted on 2010-07-01, 00:00 authored by B Adams, Quoc-Dinh Phung, Svetha Venkatesh
The blogosphere has grown into a mainstream forum for social interaction as well as a commercially attractive source of information and influence. Tools are needed to better understand how the communities that form around individual blogs are constituted, both to facilitate new personal, socially focused browsing paradigms and to understand how blog content is consumed, which is of interest to blog authors, big media, and search. We present a novel approach to blog subcommunity characterization: we model individual blog readers using mixtures of an extension to the LDA family that jointly models phrases and time, Ngram Topic over Time (NTOT), and cluster the readers with Affinity Propagation under a number of similarity measures. We experiment with two datasets: a small set of blogs whose authors provide feedback, and a set of popular, highly commented blogs, which provide indicators of algorithm scalability and interpretability without prior knowledge of a given blog. The results offer useful insight to the blog authors about their commenting community, and are observed to offer an integrated perspective on the topics of discussion and the members engaged in those discussions for unfamiliar blogs. Our approach also holds promise as a component of solutions to related problems, such as online entity resolution and role discovery.
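
The clustering step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each reader has already been reduced to a topic mixture (here, made-up three-topic vectors standing in for NTOT output), and it assumes negative Jensen-Shannon divergence as the similarity measure, since the abstract does not name the measures used. The names `readers`, `js_divergence`, and `affinity_propagation` are hypothetical; the update rules follow the standard Affinity Propagation message-passing scheme, with the preference set to the median pairwise similarity.

```python
import math
import statistics

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def affinity_propagation(S, damping=0.7, iterations=200):
    """Return, for each point, the index of its chosen exemplar.

    S is an n x n similarity matrix whose diagonal holds the preferences.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k) for i' not in {i,k})
        # a(k,k) = sum of positive r(i',k) for i' != k
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [max(range(n), key=lambda k: A[i][k] + R[i][k]) for i in range(n)]

# Hypothetical per-reader topic mixtures (illustrative values, not data from the paper).
readers = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.2, 0.7],
    [0.1, 0.3, 0.6],
]

n = len(readers)
# Assumed similarity measure: negative Jensen-Shannon divergence.
S = [[-js_divergence(readers[i], readers[j]) for j in range(n)] for i in range(n)]
# Preference (diagonal) set to the median off-diagonal similarity.
preference = statistics.median(S[i][j] for i in range(n) for j in range(n) if i != j)
for i in range(n):
    S[i][i] = preference

labels = affinity_propagation(S)
print(labels)  # exemplar index chosen by each reader
```

Readers that share an exemplar index form one candidate subcommunity. Swapping `js_divergence` for another measure only changes how `S` is built, which is one way to compare several similarity measures under the same clustering procedure.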

History

Journal

ACM transactions on the web

Volume

4

Issue

3

Pagination

1 - 30

Publisher

Association for Computing Machinery

Location

New York, N.Y.

ISSN

1559-1131

eISSN

1559-114X

Language

eng

Publication classification

C1.1 Refereed article in a scholarly journal

Copyright notice

2010, ACM