ONLINE LEARNING APPROACHES FOR ANOMALY DETECTION IN CONTINUOUS DATA STREAMS

thesis
posted on 2025-04-24, 19:42 authored by Gregg D Puttkammer, Mark Daniel Ward, Bruce A. Craig, David Gleich

Anomaly detection is the task of identifying events that deviate from the normal or expected behavior of a system. Traditional anomaly detection methods are often poorly suited to the increasingly complex systems in use today. The ever-increasing volume of high-velocity, dynamic data requires new methods to ensure that unknown events can be detected quickly, especially in critical infrastructure systems. This research explores online machine learning, in which models learn incrementally as data arrive in real time. We demonstrate that online machine learning models reduce the need for labeled data and lower the computational burden of traditional batch learning while maintaining comparable accuracy. One drawback of online machine learning models is that they struggle to capture larger trends in the data and do not align well with the train-test split commonly used in machine learning. We explore several testing environments in which online machine learning models can be applied. We also demonstrate that online models applied without pretraining achieve performance metrics comparable to those of batch models while requiring as little as 1% of the labeled data needed by traditional machine learning models.
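To make the idea of incremental learning concrete, the following is a minimal sketch of a test-then-train loop, in which each observation is scored before the model is updated with it, so no separate train-test split is needed. It is an illustration only and is not taken from the thesis: the class `RunningZScoreDetector`, the function `prequential_run`, the warm-up length, the threshold, and the synthetic stream are all assumptions chosen for the example.

```python
import math
import random


class RunningZScoreDetector:
    """Hypothetical online detector: running mean/variance via Welford's update."""

    def __init__(self, threshold=3.0, warmup=30):
        self.threshold = threshold  # z-score above which a point is flagged
        self.warmup = warmup        # points to observe before scoring starts
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def score(self, x):
        """Return |x - mean| / std using only the data seen so far."""
        if self.n < self.warmup:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) / std if std > 0 else 0.0

    def learn(self, x):
        """Update the running statistics with one observation, O(1) memory."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)


def prequential_run(stream, detector):
    """Test-then-train: score each point first, then learn from it."""
    flags = []
    for x in stream:
        flags.append(detector.score(x) > detector.threshold)
        detector.learn(x)
    return flags


# Synthetic stream (assumed for illustration): Gaussian noise with one injected spike.
random.seed(0)
stream = [random.gauss(0.0, 1.0) for _ in range(500)]
stream[400] = 15.0
flags = prequential_run(stream, RunningZScoreDetector(threshold=6.0))
```

In this sketch the detector never sees labels and stores only three running quantities, which is one simple way an online model can avoid the labeling and memory cost of batch training. The models studied in the thesis may differ substantially.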

Funding

40004597

Degree Type

  • Master of Science

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Mark Daniel Ward

Additional Committee Member 2

Bruce A. Craig

Additional Committee Member 3

David Gleich