figshare
2 files

Supporting datasets for Comparison of Multi-locus Sequence Typing software for next generation sequencing data

dataset
posted on 2017-02-01, 14:12, authored by Andrew Page
To test the accuracy of MLST applications, we have constructed two datasets of simulated reads in FASTQ format. The first contains perfect reads over the MLST genes plus a flanking region (based on the Salmonella Typhi CT18 reference), at coverage levels varying from 1 to 30. This allows us to see at what coverage each software application can accurately detect an allele.
The second dataset is similar to the first, but contains two Salmonella samples, Salmonella Typhi CT18 and Salmonella Weltevreden, mixed in varying ratios. This allows us to see at what mixing ratio each software application detects a mixed allele or contamination.
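
As a rough illustration of how a coverage level and a mixing ratio translate into read counts, the sketch below uses the standard relation reads = coverage x region length / read length. The region length, read length, and mixing fraction are hypothetical values chosen for the example; they are not taken from the datasets, and this is not necessarily how the simulated reads were generated.

```python
import math
from typing import Tuple


def reads_for_coverage(region_length: int, read_length: int, coverage: float) -> int:
    """Reads needed so that the mean depth over the region equals `coverage`."""
    return math.ceil(coverage * region_length / read_length)


def split_by_ratio(total_reads: int, fraction_first: float) -> Tuple[int, int]:
    """Split a read budget between two samples; the first gets `fraction_first`."""
    first = round(total_reads * fraction_first)
    return first, total_reads - first


# Hypothetical values for illustration only (not from the datasets).
REGION_LENGTH = 5000  # bases covered: gene plus flanking region
READ_LENGTH = 100     # bases per simulated read

if __name__ == "__main__":
    for coverage in (1, 10, 30):
        n_reads = reads_for_coverage(REGION_LENGTH, READ_LENGTH, coverage)
        print(f"coverage {coverage}: {n_reads} reads")

    # A hypothetical 90:10 mix of the two samples at coverage 30.
    total = reads_for_coverage(REGION_LENGTH, READ_LENGTH, 30)
    print(split_by_ratio(total, 0.9))
```

Under these assumed values, low coverage levels correspond to only a few dozen reads per region, which is why allele calls can become unreliable at the bottom of the range.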

Funding

This work was supported by the Wellcome Trust (grant WT 098051).
