
Kernel Density Networks

Poster posted on 2024-10-17 by Ashwin De Silva, Jayanta Dey, and Joshua T. Vogelstein

A desirable property of a machine learning model is the ability to know when it does not know, i.e., to make appropriately confident predictions when it encounters out-of-distribution (OOD) inputs that lie far from the training (in-distribution) data. While this property is readily observed in human and animal intelligence, deep neural networks (ReLU-type networks in particular), despite achieving state-of-the-art performance in many learning tasks, are known to produce overconfident predictions on OOD data. Deep networks partition the input feature space into convex polytopes and learn affine functions over them. Since the outer polytopes at the boundary of the training data extend to infinity, they produce high-confidence predictions for test samples far away from the training data. Most methods proposed to mitigate this problem rely heavily on specific network architectures, loss functions, and training routines that incorporate explicit OOD data. Kernel Density Networks (KDNs) overcome this problem by taking an already trained deep network and fitting Gaussian kernels over its polytopes instead of affine functions. KDNs yield a class-conditional density estimate for each class, which is used during inference to compute the posterior class probabilities (confidences) and produce the final prediction.
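
The abstract outlines the core recipe: group training points by the polytope (ReLU activation pattern) they fall in, replace the affine function over each polytope with a Gaussian kernel, and turn the resulting class-conditional densities into posteriors via Bayes' rule. The sketch below illustrates this idea under simplifying assumptions; it is not the authors' released implementation (the published KDN includes further details, such as how kernels are weighted across polytopes), and the network, toy data, and regularization value are placeholders:

# Minimal sketch of the KDN idea (not the authors' released code): take a
# trained ReLU network, group training points by the polytope (ReLU
# activation pattern) they land in, fit one Gaussian per polytope, and form
# class-conditional densities as class-count-weighted mixtures of those
# Gaussians. Posteriors follow from Bayes' rule: P(y | x) ∝ P(y) * f(x | y).

import numpy as np
import torch
import torch.nn as nn
from collections import defaultdict
from scipy.stats import multivariate_normal

# A small ReLU network; assumed to be already trained in practice.
net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(),
                    nn.Linear(32, 32), nn.ReLU(),
                    nn.Linear(32, 2))

def activation_pattern(x):
    """Binary ReLU on/off pattern; identifies the polytope a point falls in."""
    pattern, h = [], x
    for layer in net:
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            pattern.append((h > 0).int())
    return tuple(torch.cat(pattern, dim=-1).tolist())

def fit_kdn(X_train, y_train, n_classes, reg=1e-3):
    """Fit one Gaussian per occupied polytope and count class membership."""
    groups = defaultdict(list)
    with torch.no_grad():
        for x, y in zip(X_train, y_train):
            key = activation_pattern(torch.tensor(x, dtype=torch.float32))
            groups[key].append((x, y))
    polytopes = []
    for members in groups.values():
        pts = np.array([m[0] for m in members])
        labels = np.array([m[1] for m in members])
        d = pts.shape[1]
        mean = pts.mean(axis=0)
        # Regularize the covariance so single-point polytopes stay usable.
        cov = np.cov(pts, rowvar=False) + reg * np.eye(d) if len(pts) > 1 \
              else reg * np.eye(d)
        class_counts = np.bincount(labels, minlength=n_classes)
        polytopes.append((mean, cov, class_counts))
    return polytopes

def kdn_posterior(x, polytopes, n_classes):
    """Class-conditional density per class, then posterior via Bayes' rule."""
    cond = np.zeros(n_classes)
    total_per_class = sum(p[2] for p in polytopes)
    for mean, cov, counts in polytopes:
        dens = multivariate_normal.pdf(x, mean=mean, cov=cov)
        cond += dens * counts            # weight kernel by class counts in that polytope
    cond /= np.maximum(total_per_class, 1)            # f(x | y)
    prior = total_per_class / total_per_class.sum()   # P(y)
    joint = cond * prior
    return joint / joint.sum()                        # P(y | x)

# Toy usage (in practice the network would be trained on X_train first):
X_train = np.random.randn(200, 2).astype(np.float32)
y_train = (X_train[:, 0] > 0).astype(int)
polys = fit_kdn(X_train, y_train, n_classes=2)
print(kdn_posterior(np.array([0.5, -0.2]), polys, n_classes=2))

Because the confidences come from local Gaussian kernels rather than affine functions on unbounded outer polytopes, the estimated densities, and hence the posteriors, decay for test points far from the training data, which is the behavior the poster targets.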
