Semantic Similarity Measurement Using Historical Google Search Patterns

Martinez Gil, Jorge

doi:10.6084/m9.figshare.6652373.v1

journal contribution

posted on 2018-06-22, 07:37, authored by Jorge Martinez Gil

Computing the semantic similarity between terms (or short text expressions) that have the same meaning but are not lexicographically similar is an important challenge in the information integration field. Techniques for textual semantic similarity measurement often fail on words not covered by synonym dictionaries. In this paper, we address this problem by determining the semantic similarity of terms using the knowledge inherent in the search history logs of the Google search engine. To do this, we designed and evaluated four algorithmic methods for measuring the semantic similarity between terms using their associated historical search patterns: a) frequent co-occurrence of terms in search patterns, b) computation of the relationship between search patterns, c) outlier coincidence on search patterns, and d) forecasting comparisons. Our experiments show that some of these methods correlate well with human judgment on general-purpose benchmark datasets, and significantly outperform existing methods on datasets containing terms that do not usually appear in dictionaries.
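
As a rough illustration of method b), the relationship between two search patterns can be approximated by the Pearson correlation of their search-volume time series. This is only a minimal sketch under stated assumptions, not the measure defined in the paper: the function name `pearson_similarity`, the clipping of negative correlations to 0, and the sample series are all hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equally long series, clipped to [0, 1].

    Assumption: each series holds the search volume of one term over the
    same time periods. Clipping negative correlation to 0 is a choice made
    for this sketch only.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A constant series carries no pattern to compare against.
        return 0.0
    r = cov / sqrt(var_a * var_b)
    return min(1.0, max(0.0, r))


# Hypothetical monthly search volumes for two terms (made-up numbers).
term_x = [10, 20, 30, 40]
term_y = [20, 40, 60, 80]
score = pearson_similarity(term_x, term_y)
```

Two terms whose search volumes rise and fall together score close to 1, while unrelated or opposite patterns score close to 0.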

Licence

CC BY 4.0
