A data-dependent dissimilarity measure: An effective alternative to distance measures
thesis
posted on 2017-12-11, 22:30 authored by SUNIL ARYAL

In data mining, the task-specific performance of conventional distance-based similarity measures varies significantly across data distributions because these measures are data-independent and sensitive to the units or scales of measurement. This thesis investigates a measure in which the similarity of two instances is determined by the distribution of the data. It introduces a new (dis)similarity measure that is data-dependent and robust to units and scales of measurement. An empirical evaluation across a wide range of datasets shows that the new measure produces better, or at least more consistent, task-specific performance than widely used distance-based measures, particularly on high-dimensional datasets.
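
The sensitivity of distance measures to units of measurement, mentioned in the abstract, can be illustrated with a small sketch. It is not taken from the thesis and does not implement the proposed measure. The points, the attribute names (height, weight) and the helper functions `euclidean`, `nearest` and `to_cm` are all hypothetical, chosen only to show that the Euclidean nearest neighbour of a query can change when one attribute is re-expressed in a different unit.

```python
import math


def euclidean(x, y):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def nearest(query, candidates):
    """Return the name of the candidate closest to the query."""
    return min(candidates, key=lambda name: euclidean(query, candidates[name]))


def to_cm(point):
    """Re-express the first attribute (height) in centimetres instead of metres."""
    height_m, weight_kg = point
    return (height_m * 100.0, weight_kg)


# Hypothetical (height in metres, weight in kilograms) records.
query = (1.60, 60.0)
candidates = {"a": (1.95, 61.0), "b": (1.62, 64.0)}

# With height in metres, the weight difference dominates the distance.
nearest_in_metres = nearest(query, candidates)

# With height in centimetres, the height difference dominates instead.
nearest_in_cm = nearest(
    to_cm(query), {name: to_cm(p) for name, p in candidates.items()}
)

print(nearest_in_metres, nearest_in_cm)
```

The same three records yield different nearest neighbours under the two unit choices, which is the kind of inconsistency a measure robust to units and scales is meant to avoid.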