tinytocs-submit.pdf (84.7 kB)

Exact and Near-miss Clone Detection in Spreadsheets

Version 2 2012-08-06, 15:51
Version 1 2012-08-09, 08:54
journal contribution
posted on 2012-08-06, 15:51 authored by Felienne Hermans

Spreadsheets are used extensively in business, across many domains. The applicability of software engineering methods to spreadsheets has been a topic of research for several years, but the main focus has been on analyzing the formulas, not the data in the spreadsheets. One of the factors that plays a role in spreadsheet data quality is the occurrence of clones in the spreadsheet data.

 

Clones in data are caused by copy-pasting. This is a very common practice in spreadsheet use; however, it can have a negative impact on the spreadsheet's quality, since 1) edits to the copied data must be made in multiple places, increasing maintenance effort, and 2) when editing, some copies might be forgotten, leading to errors.

 

Clone detection has proven useful in the realm of source code analysis, in two different forms: exact clones, and clones that differ slightly, called near-miss clones.

Because of the success of clone detection and removal in source code, it seems feasible to research the applicability of both techniques to clones in spreadsheet data. Our work shows that this is a promising avenue.
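
To make the distinction between exact and near-miss clones concrete, here is a minimal Python sketch. It is not the method from the paper: it assumes, purely for illustration, that a clone is a whole row of cell values, and that a near-miss clone is a row of the same length differing in at most `max_diff` cells. The function names, the `max_diff` threshold, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_exact_clones(rows):
    """Return groups of row indices whose cell values are identical."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[tuple(row)].append(index)
    # Only groups with more than one member are clones.
    return [indices for indices in groups.values() if len(indices) > 1]


def find_near_miss_clones(rows, max_diff=1):
    """Return pairs of equal-length rows that differ in 1..max_diff cells.

    max_diff is an illustrative threshold, not a value from the paper.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if len(a) != len(b):
            continue
        diff = sum(1 for x, y in zip(a, b) if x != y)
        if 0 < diff <= max_diff:
            pairs.append((i, j))
    return pairs


# Hypothetical spreadsheet data, one list per row.
rows = [
    ["Q1", 100, 250, 75],
    ["Q2", 120, 260, 80],
    ["Q1", 100, 250, 75],  # copy-pasted duplicate of row 0
    ["Q1", 100, 255, 75],  # copy of row 0 with one edited cell
]

exact = find_exact_clones(rows)
near = find_near_miss_clones(rows)
print(exact)  # [[0, 2]]
print(near)   # [(0, 3), (2, 3)]
```

In this sample, rows 0 and 2 form an exact clone, while row 3 is a near-miss clone of both: the kind of copy that was edited in one place but not the others. A practical detector would likely compare rectangular cell ranges rather than whole rows and avoid the quadratic pairwise comparison.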
