Topes: Reusable Abstractions for Validating Data

journal contribution
posted on 2008-05-01, authored by Chris Scaffidi, Brad Myers, Mary Shaw

Programmers often omit input validation when inputs can appear in many different formats or when validation criteria cannot be precisely specified. To enable validation in these situations, we present a new technique that puts valid inputs into a consistent format and that identifies “questionable” inputs which might be valid or invalid, so that these values can be double-checked by a person or a program. Our technique relies on the concept of a “tope”, which is an application-independent abstraction describing how to recognize and transform values in a category of data. We present our definition of topes and describe a development environment that supports the implementation and use of topes. Experiments with web application and spreadsheet data indicate that using our technique improves the accuracy and reusability of validation code and also improves the effectiveness of subsequent data cleaning such as duplicate identification.
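
The idea of recognizing values in several formats, flagging "questionable" ones, and putting accepted values into a consistent format can be illustrated with a minimal Python sketch. It is not the authors' implementation or development environment. The names `Format`, `PHONE_FORMATS`, and `validate`, the three phone-number formats, the numeric scores, and the "area code starts with 0 or 1" rule are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Format:
    """One accepted format in a category of data (hypothetical structure)."""
    name: str
    # Returns 0.0 (invalid), 1.0 (valid), or a value in between (questionable).
    recognize: Callable[[str], float]
    # Rewrites a recognized value into the single consistent format.
    to_canonical: Callable[[str], str]


def _digits(value: str) -> str:
    """Keep only the digit characters of a value."""
    return re.sub(r"\D", "", value)


def _scorer(pattern: str) -> Callable[[str], float]:
    """Build a recognizer for one regular-expression pattern."""
    def recognize(value: str) -> float:
        if not re.fullmatch(pattern, value.strip()):
            return 0.0
        # Assumed soft rule: an area code starting with 0 or 1 is unusual,
        # so the value is flagged as questionable rather than rejected.
        return 0.5 if _digits(value)[0] in "01" else 1.0
    return recognize


def _canonical(value: str) -> str:
    """Rewrite any recognized phone number as ddd-ddd-dddd."""
    d = _digits(value)
    return f"{d[0:3]}-{d[3:6]}-{d[6:10]}"


# Illustrative formats for a "phone number" category.
PHONE_FORMATS: List[Format] = [
    Format("dashed", _scorer(r"\d{3}-\d{3}-\d{4}"), _canonical),
    Format("parenthesized", _scorer(r"\(\d{3}\) ?\d{3}-\d{4}"), _canonical),
    Format("dotted", _scorer(r"\d{3}\.\d{3}\.\d{4}"), _canonical),
]


def validate(value: str,
             formats: List[Format] = PHONE_FORMATS) -> Tuple[str, Optional[str]]:
    """Classify a value and, if it is recognized, return its canonical form."""
    best = max(formats, key=lambda f: f.recognize(value))
    score = best.recognize(value)
    if score == 0.0:
        return ("invalid", None)
    status = "valid" if score == 1.0 else "questionable"
    return (status, best.to_canonical(value))
```

Under these assumptions, a caller would accept values reported as valid, reject invalid ones, and route questionable values to a person or another program for double-checking. Because the format definitions do not depend on any particular application, the same list could be reused for a web form and for spreadsheet cleanup.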

History

Date

2008-05-01
