Additional file 6: Figure S4. of A high-quality annotated transcriptome of swine peripheral blood

Flowchart for genome-guided transcriptome assembly, annotation and filtering. The diagram shows the steps involved in constructing and filtering the genome-guided assembly and, where appropriate, the number of PTs that resulted from each step. Refer to the Methods section for details. The raw RNA-seq reads, preprocessed as described in the legend of Additional file 4: Figure S2, were mapped to the USMARCv1.0 reference genome with STAR and then assembled into PTs with Cufflinks. PTs of 200 bases or shorter were removed from the resulting genome-guided transcriptome assembly before further analysis. The splicing status of the 162,294 PTs was determined, and unspliced intronic PTs were discarded. Among the remaining PTs, those with significant BLAST hits in the NCBI NT and NR databases were identified with DC-megaBLAST and BLASTX, using E-value cutoffs of 10⁻²⁰ and 10⁻⁶, respectively. The potential biotypes of the PTs were assigned based on the biotypes of their significant DC-megaBLAST hits. To complete the filtering, we removed from the remaining PTs: (i) PTs whose top DC-megaBLAST hits were to sequences originating from mitochondrial genomes; and (ii) PTs mapped to genomic regions with a maximal CPB lower than 50× (low-CPB regions). The final filtered genome-guided transcriptome consisted of 32,702 PTs. (PDF 360 kb)
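
The filtering criteria in this legend can be summarized as a single predicate applied to each PT. The following is only an illustrative sketch, not the authors' pipeline: the record type `PT`, its field names, and the helper functions are hypothetical, and it assumes that the splicing status, the top DC-megaBLAST hit and the maximal CPB of the mapped region have already been computed upstream. It also assumes that PTs without significant BLAST hits are retained, since the legend describes the BLAST step as annotation rather than as a filter.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the legend.
MIN_LENGTH = 200      # PTs of 200 bases or shorter are removed
MIN_MAX_CPB = 50.0    # regions with maximal CPB below 50x count as low-CPB


@dataclass
class PT:
    """Hypothetical per-PT record; all field names are illustrative."""
    pt_id: str
    length: int                   # length in bases
    spliced: bool                 # True if the PT has at least one splice junction
    intronic: bool                # True if the PT lies within an intron
    top_hit_mitochondrial: bool   # top DC-megaBLAST hit is mitochondrial
    region_max_cpb: float         # maximal CPB of the mapped genomic region


def keep(pt: PT) -> bool:
    """Return True if the PT passes every filter named in the legend."""
    if pt.length <= MIN_LENGTH:
        return False              # too short
    if not pt.spliced and pt.intronic:
        return False              # unspliced intronic PT
    if pt.top_hit_mitochondrial:
        return False              # mitochondrial origin
    if pt.region_max_cpb < MIN_MAX_CPB:
        return False              # low-CPB region
    return True


def filter_pts(pts: List[PT]) -> List[PT]:
    """Apply the filters to a list of PTs, preserving input order."""
    return [pt for pt in pts if keep(pt)]
```

Expressing the criteria as one predicate keeps the thresholds in a single place; the per-step PT counts shown in the flowchart would instead require applying the filters one after another and counting the survivors of each.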