figshare

Fail Prediction

dataset
posted on 2025-05-05, 08:57 authored by Ömer Özdemir

Continuous Integration (CI) is a development practice in which developers regularly merge their code changes into a central repository, enabling simultaneous collaboration on a shared codebase. Frequent integration and automated builds help to detect and resolve conflicts or errors early in development. However, in large-scale systems the build process can be costly: each build incurs expenses, while skipping builds can increase the risk of undetected failures. This paper presents an empirical study in an industrial setting that investigates the use of machine learning techniques to predict build failures. Accurate predictions can identify builds that can be safely skipped, reducing CI costs. We evaluate various models and feature combinations on a dataset derived from real-world industrial projects. We observe high precision but low recall in predicting failed builds, which allows hundreds of successful builds to be correctly skipped while around a dozen failures are potentially missed.
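The trade-off described above can be made concrete with a small sketch. It is not the study's evaluation code. It assumes a simple policy in which every build predicted to pass is skipped, and it computes precision and recall for the "failed" class along with the number of correctly skipped builds and missed failures. The names `SkipReport` and `evaluate_skip_policy` and the toy labels are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class SkipReport:
    precision: float      # of builds predicted to fail, the share that really failed
    recall: float         # of builds that really failed, the share predicted to fail
    skipped_ok: int       # passing builds correctly skipped (true negatives)
    missed_failures: int  # failing builds wrongly skipped (false negatives)


def evaluate_skip_policy(actual_failed, predicted_failed):
    """Score a 'skip every build predicted to pass' policy (an assumed policy)."""
    if len(actual_failed) != len(predicted_failed):
        raise ValueError("label and prediction lists must have equal length")

    tp = fp = fn = tn = 0
    for actual, predicted in zip(actual_failed, predicted_failed):
        if predicted and actual:
            tp += 1   # predicted fail, did fail: build is run and catches the failure
        elif predicted and not actual:
            fp += 1   # predicted fail, passed: build is run unnecessarily
        elif actual:
            fn += 1   # predicted pass, failed: build is skipped, failure is missed
        else:
            tn += 1   # predicted pass, passed: build is skipped safely

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return SkipReport(precision, recall, skipped_ok=tn, missed_failures=fn)


# Made-up toy labels, for illustration only (True means the build failed).
actual = [True, True, False, False, False]
predicted = [True, False, False, False, False]
report = evaluate_skip_policy(actual, predicted)
print(report)
```

In this toy case every predicted failure is a real one (high precision), yet one of the two real failures is skipped (low recall). That is the same pattern the abstract reports at a larger scale.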

Funding

Vestel
