Scripts.PDF (103.04 kB)
Exome Scripts
dataset
modified on 2015-12-14, 10:58
The advent of next generation sequencing (NGS) technologies has revolutionised the way biologists produce, analyse and
interpret data. Although NGS platforms provide a cost-effective way to discover genome-wide variants from a single
experiment, variants discovered by NGS need follow up validation due to the high error rates associated with various
sequencing chemistries. Recently, whole exome sequencing has been proposed as an affordable option compared to whole
genome runs but it still requires follow up validation of all the novel exomic variants. Customarily, a consensus approach is
used to overcome the systematic errors inherent to the sequencing technology, alignment and post alignment variant
detection algorithms. However, this approach requires multiple sequencing chemistries, multiple
alignment tools and multiple variant callers, which may not be viable in terms of time and money for individual investigators
with limited informatics know-how. Biologists often lack the requisite training to deal with the huge amount of data
produced by NGS runs and face difficulty in choosing from the list of freely available analytical tools for NGS data analysis.
Hence, there is a need to customise the NGS data analysis pipeline to preferentially retain true variants by minimising the
incidence of false positives, and to make the choice of the right analytical tools easier. To this end, we have sampled different freely
available tools used at the alignment and post alignment stages and suggest the most suitable combination,
as determined by a simple framework of pre-existing metrics to create significant datasets.
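
The consensus approach described above can be illustrated with a minimal sketch. It assumes each variant caller's output has already been reduced to a set of (chromosome, position, reference allele, alternate allele) tuples. The function name consensus_variants, the min_callers threshold and the toy call sets are hypothetical and are not taken from the scripts in this dataset.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def consensus_variants(call_sets: Dict[str, Set[Variant]],
                       min_callers: int = 2) -> Set[Variant]:
    """Return the variants reported by at least `min_callers` call sets."""
    # Each call set is a set, so a caller contributes at most one vote per variant.
    counts = Counter(v for calls in call_sets.values() for v in calls)
    return {v for v, n in counts.items() if n >= min_callers}


# Toy call sets: caller names and variants are made up for illustration only.
calls = {
    "caller_a": {("chr1", 12345, "A", "G"), ("chr2", 500, "C", "T")},
    "caller_b": {("chr1", 12345, "A", "G"), ("chr3", 42, "G", "A")},
    "caller_c": {("chr1", 12345, "A", "G"), ("chr2", 500, "C", "T")},
}

for variant in sorted(consensus_variants(calls, min_callers=2)):
    print(variant)
```

Raising min_callers makes the filter stricter, trading sensitivity for fewer false positives; setting it to the number of call sets keeps only the variants on which every caller agrees.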