<p>A database of over 10,000 colicin sequences from over 50 species of bacteria was collated from the ENA, together with some isolates from previously published sources. A multi-FASTA file containing the collated colicin sequences was used to generate a custom database with the prepareref command of ARIBA v2.14.6, which removes erroneous data and runs cd-hit to cluster the sequences at a user-defined similarity threshold (90% in our case). ARIBA was then run on the FASTQ files of each isolate against the colicin database to report which sequences were observed in each isolate. </p>
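<p>The two ARIBA steps described above can be sketched as command lines assembled in Python. This is an illustrative sketch only, not the exact invocation used in the study: the file and directory names are hypothetical, paired-end reads are assumed, and setting <code>--all_coding yes</code> assumes that every collated colicin sequence is a complete coding sequence. The 90% clustering threshold is passed through ARIBA's <code>--cdhit_min_id</code> option.</p>

```python
from pathlib import Path


def prepareref_cmd(fasta, ref_dir, min_id=0.90, all_coding="yes"):
    """Build the `ariba prepareref` command for a custom multi-FASTA database.

    min_id is the cd-hit clustering identity (0.90 = 90%, as in the text).
    all_coding="yes" is an assumption that all sequences are coding genes.
    """
    return [
        "ariba", "prepareref",
        "-f", str(fasta),
        "--all_coding", all_coding,
        "--cdhit_min_id", str(min_id),
        str(ref_dir),
    ]


def run_cmd(ref_dir, reads_1, reads_2, out_dir):
    """Build the `ariba run` command for one isolate (paired-end reads assumed)."""
    return ["ariba", "run", str(ref_dir), str(reads_1), str(reads_2), str(out_dir)]


def plan(fasta, ref_dir, isolates, out_root):
    """Return the full list of commands: one prepareref, then one run per isolate.

    isolates maps a (hypothetical) isolate name to its pair of FASTQ paths.
    """
    commands = [prepareref_cmd(fasta, ref_dir)]
    for name, (reads_1, reads_2) in isolates.items():
        commands.append(run_cmd(ref_dir, reads_1, reads_2, Path(out_root) / name))
    return commands


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    example = plan(
        "colicins.fa",
        "colicin_db",
        {"isolate_1": ("isolate_1_1.fastq.gz", "isolate_1_2.fastq.gz")},
        "ariba_out",
    )
    for cmd in example:
        print(" ".join(cmd))
```

<p>Each command list can then be passed to <code>subprocess.run(cmd, check=True)</code> on a machine where ARIBA is installed. Building the commands separately from running them makes it easy to inspect the plan before any reads are processed.</p>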
Funding
Convergent evolution of Enterobacteriaceae in epidemiological networks with high antimicrobial use
Biotechnology and Biological Sciences Research Council