SR-hyperSMURF.revision1.pdf (990.59 kB)

journal contribution
posted on 2017-11-25, 10:00, authored by Giorgio Valentini
Most state-of-the-art ML-based methods do not adopt specific imbalance-aware learning techniques to deal with the imbalanced data that naturally arise in several genome-wide variant scoring problems, which results in a significant reduction of sensitivity and precision. We present a novel method that adopts imbalance-aware learning strategies based on resampling techniques and a hyper-ensemble approach, and that outperforms state-of-the-art methods in two different contexts: the prediction of non-coding variants associated with Mendelian diseases and with complex diseases. We show that imbalance-aware ML is a key issue for the design of robust and accurate prediction algorithms, and we provide a method and an easy-to-use software tool that can be effectively applied to this challenging prediction task.
