A comparative study of commercial real-time reverse transcription PCR kits for forensic body fluid identification

ABSTRACT Identifying the type(s) of body fluid present in forensic casework exhibits can assist with scene reconstruction and indicate potential activity types. Confirmatory body fluid identification can be achieved using mRNA-based profiling assays. This commonly involves endpoint reverse-transcription PCR (RT-PCR) with capillary electrophoresis (CE) detection. In comparison, real-time RT-qPCR is more sensitive and quantitative, and does not require post-PCR processing. We have developed real-time RT-qPCR assays for forensic body fluid identification. This study compared the performance (sensitivity, PCR efficiency, precision) of five real-time RT-qPCR kits across circulatory blood, buccal, semen, and vaginal fluid samples with normal and extended (3-step) PCR cycling. An objective scoring system for the experimental performance parameters was considered along with other features of the commercial kits. Statistical analysis by ANOVA and post-hoc Tukey of the slope estimates, which relate to PCR efficiency, revealed that many observed differences were not statistically significant (p > 0.05). Sensitivity and precision were also similar across most kits and PCR cycling protocols. Using the scoring system, the five highest performing kit and cycling combinations were: TaqMan-extended, followed by the TaqPath-normal, QuantiNova-extended, QuantiNova-extended, then UltraPlex-normal. Based upon high performance, room temperature set-up, and multiplexing capability, the UltraPlex kit was selected for our body fluid identification assays.


Introduction
Identification of body fluid(s) present in a stain is important in a forensic context. This information may be used to corroborate alleged events, indicate potential types of activity that have occurred, and prioritize samples for further analysis such as DNA profiling. 1,2 Conventional methods for body fluid identification can lack sensitivity and specificity. 3 Most confirmatory methods apply to only one type of body fluid, are complex, or are destructive. mRNA profiling is a confirmatory method for body fluid identification, which addresses some limitations of conventional methods. [4][5][6][7][8] Various mRNA-based multiplex assays have been developed, and are utilized in casework in New Zealand and the Netherlands. [5][6][7][8][9][10][11][12] Quantitative real-time PCR (qPCR) utilizes fluorescent markers to infer the quantity of DNA produced during cycles of PCR. qPCR combined with a reverse transcriptase step (RT-qPCR) allows for the detection of RNA by transcribing it into cDNA, which is the input to the PCR reaction. 13,14 RT-PCR can be a one-step or two-step reaction. Two-step reactions involve separately reverse-transcribing the RNA, then transferring an aliquot of the cDNA to be amplified into a second tube. For one-step reactions, the PCR reaction is included in the tube with the reverse transcription step. Although it is less sensitive than two-step RT-PCR, one-step RT-PCR is less complex, quicker, reduces risk of contamination, and is therefore ideal for high throughput screening. 15 Compared with endpoint PCR, RT-qPCR detects PCR amplification during the reaction, which is quicker, increases the dynamic range of detection, and has shown 5- to 40-fold greater sensitivity. 13,16 Different reverse transcription and DNA polymerase enzymes, in addition to different cycling conditions employed by commercial kits, can alter performance. Some kits also include genomic DNA removal steps and internal controls.
The viscosity of the solution can also affect mixing and reproducibility. Therefore, the various kits were compared to determine the optimal kit(s) for this specific research application.
This study investigated the suitability of five real-time RT-qPCR kits and one reverse transcription kit for use in our newly developed body fluid identification assay. The reaction efficiency, limit of detection (sensitivity), precision, and features of the real-time RT-PCR kits were evaluated.

Body fluid collection
All samples were collected with full informed consent, and collection was approved by the University of Auckland Human Participants Ethics Committee (reference #021082). Body fluid samples (circulatory blood, semen, buccal, and vaginal material) were collected in triplicate from three individuals per body fluid (to ensure reproducibility in marker expression across individuals). Circulatory blood samples were obtained by pricking the fingertip with a lancet and depositing 10 µL of blood on to sterile cotton swabs. Vaginal swabs were taken by each participant herself using a sterile swab. Semen samples were placed in a suitable container post-ejaculation and 10 µL aliquots were added to sterile cotton swabs. Buccal samples were collected by swabbing the inside of the mouth. Swabs were air-dried and stored at room temperature for no longer than 6 months prior to extraction of RNA.

Extraction of RNA
RNA was extracted from swab heads using the DNA IQ™ System (Promega Corporation, Madison, WI, USA), in which RNA is present in the fraction that is usually discarded. This co-extraction method is used in casework at the Institute of Environmental Science and Research Ltd. as it allows for the simultaneous isolation of RNA and DNA from a single sample. The isolated RNA was then purified using the ReliaPrep™ RNA Cell Miniprep Kit (Promega Corporation, Madison, WI, USA) according to the manufacturer's instructions, with minor alterations as described in Albani. 17 Purified RNA samples were treated with Turbo DNA-free™ by Ambion® (Thermo Fisher Scientific Incorporated, Waltham, Massachusetts, USA) to remove DNA from the sample. The samples were then quantified by qPCR using the Quantifiler® Human DNA Quantification Kit supplied by Applied Biosystems® (Thermo Fisher Scientific Incorporated, Waltham, Massachusetts, USA). If any DNA was still present, as determined by the Quantifiler® Human DNA Quantification Kit, the sample was re-treated with the Turbo DNA-free™ kit and quantified again.
The kits tested were selected from a survey of commercially available real-time RT-PCR kits. Both one-step and two-step kits were chosen to compare sensitivity. Kits were selected based on various features, such as their passive reference dye compatibility with the intended multiplex design of the assay, inclusion of an internal control, complexity of set-up, presence of RNase inhibitors, and whether multiplexing was supported.

cDNA synthesis
In the two-step RT-PCR reactions, 10 µL RNA in a 20 µL total volume was reverse-transcribed into cDNA. cDNA was synthesized from RNA with random primers using the Applied Biosystems™ High-Capacity cDNA Reverse Transcription Kit (Life Technologies Corporation, Carlsbad, CA, USA) according to the manufacturer's instructions.

Real-time quantitative RT-qPCR
Singleplex real-time RT-qPCRs were carried out in Applied Biosystems™ MicroAmp™ Optical 96-Well Reaction Plates on an Applied Biosystems™ 7500 Real-Time PCR System. The manufacturer's instructions were followed for each of the one- and two-step real-time PCR kits. Testing of an extended cycling PCR protocol (3-step) was achieved by adding a 60-second step at 72°C. For all reactions, 40 cycles were used with an annealing temperature of 60°C. The extraction negative control was a blank swab processed alongside samples in every extraction batch and used in all RT-qPCR runs. The fluorescence threshold was set at 0.1 relative fluorescence units (RFU).
Custom TaqMan hydrolysis probes were used for specific detection across all body fluids (Table 1). mRNA targets were selected based on previous research into degraded RNA transcript stable regions (StaRs), and massively parallel sequencing data for endpoint RT-PCR body fluid identification assays. 11,12,18 The forward and reverse primers for CYP2B7P were obtained from Albani et al. 12 Probe sequences for real-time PCR for these were generated based on the existing forward and reverse primers for each target using Primer3 Plus v2.4.2 software. 19,20 For the other targets (HTN3, SLC4A1, PRM1, and KLK2), the StaR was used as the input sequence and Primer3 Plus selected the optimal forward, reverse, and probe sequences. An assay for the menstrual fluid marker STC1 was unavailable for the testing of commercial kits; however, work is underway in this area and future publications will include data using this marker.
Table 1. Details regarding the mRNA body fluid markers used in real-time RT-qPCR. All assays are custom designs and labelled with FAM-MGB. Note that STC1 was only used for the determination of optimal CT cut-offs.
The following parameters were specified: the amplicon must span the stable region, ideally also spanning an exon-exon junction; the melting temperatures (Tm) of the forward and reverse primers must be within 5°C of each other (but ideally within 1°C); primer Tm between 55 and 65°C; primer length between 18 and 24 nucleotides (nts); primer and probe GC content between 20% and 80%; probe Tm between 65 and 75°C; probe length between 15 and 25 nts; no G base on the 5' end of the probe; no more than two G and/or C bases in the last 5 nts of the 3' end of the probe; and amplicon length between 80 and 200 nts. In addition, tandem repeats, homopolymers, and high-end complementarity of primers were avoided.
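The design constraints listed above can be expressed as a simple rule check. The following is a minimal sketch only, not the Primer3 Plus configuration used in the study: the function names, the example sequences, and the Tm values are hypothetical, and Tm is passed in by the caller rather than calculated, because the Tm model used by the design software is not reproduced here.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(seq, tm):
    """Return the list of design rules that a primer violates."""
    problems = []
    if not 18 <= len(seq) <= 24:
        problems.append("primer length outside 18-24 nts")
    if not 55 <= tm <= 65:
        problems.append("primer Tm outside 55-65 C")
    if not 0.20 <= gc_fraction(seq) <= 0.80:
        problems.append("primer GC content outside 20-80%")
    return problems


def check_probe(seq, tm):
    """Return the list of design rules that a probe violates."""
    seq = seq.upper()
    problems = []
    if not 15 <= len(seq) <= 25:
        problems.append("probe length outside 15-25 nts")
    if not 65 <= tm <= 75:
        problems.append("probe Tm outside 65-75 C")
    if not 0.20 <= gc_fraction(seq) <= 0.80:
        problems.append("probe GC content outside 20-80%")
    if seq[0] == "G":
        problems.append("G base on 5' end of probe")
    if sum(base in "GC" for base in seq[-5:]) > 2:
        problems.append("more than two G/C in last 5 nts of probe")
    return problems


def check_pair(forward_tm, reverse_tm, amplicon_length):
    """Return the list of rules violated by a primer pair and its amplicon."""
    problems = []
    if abs(forward_tm - reverse_tm) > 5:
        problems.append("primer Tm values differ by more than 5 C")
    if not 80 <= amplicon_length <= 200:
        problems.append("amplicon length outside 80-200 nts")
    return problems


# Made-up sequences and Tm values, for illustration only.
primer_problems = check_primer("ATGCTAGCTAGGCTAACGTA", tm=58.0)
probe_problems = check_probe("GACCTTGAGCATTCAGGC", tm=70.0)
pair_problems = check_pair(forward_tm=58.0, reverse_tm=64.5, amplicon_length=120)
```

In this illustration the primer passes every rule, whereas the probe is flagged both for its 5' G and for having four G/C bases in its last five nucleotides. Criteria such as spanning the stable region or an exon-exon junction need transcript annotation and are not covered by the sketch.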
Three participants donated per body fluid: buccal (22-year-old male, 25- and 27-year-old females), circulatory blood (22-year-old male, 25- and 27-year-old females), semen (two 22-, and one 24-year-old males), and vaginal material (25-, 27-, and 27-year-old females). RNA was then pooled in each body fluid for RT-qPCRs. The one-step kits tested in the study across body fluids (UltraPlex, TaqPath, TaqMan, and QuantiNova) used 2 µL RNA input across a dilution series. For the two-step kit (Perfecta), the High Capacity cDNA synthesis kit was used (10 µL RNA input in a 20 µL total reaction). Two microlitres of input cDNA was used for the qPCR reaction, across a dilution series. Where appropriate, the MIQE guidelines were adhered to (see supplementary data Table S2). 14 RNA concentration and quality were not determined as forensic samples are usually of low quality. Standard RNA quantification methods do not necessarily lead to improved RNA profiles. 21 Ribosomal RNA (rRNA), which makes up approximately 80% of the total RNA, is used for quality assessment. However, mRNA is only 2-5% of the total RNA, and the quality and integrity of rRNA is not a reliable indicator for the target mRNA. In addition, differential degradation of mRNA has been demonstrated. 22 We decided this quality assessment was unnecessary given the circumstances. For this application, normalization by reference genes is not appropriate or necessary, due to the varied expression of reference genes across the body fluid panel independent of RNA input. [23][24][25]

Data analysis
All data processing, graphing, and statistical analysis were done in R. 26 PCR efficiency was calculated using the slope of the standard curve (Equation 1). When two groups were compared, Student's t-test was conducted, and for more than two groups, an ANOVA with post-hoc Tukey test was undertaken using the emmeans R package. 27 Assumptions of normality and equal variance of residuals were confirmed prior to means testing. The significance level (α) was 0.01.
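As an illustration of how a standard-curve slope translates into PCR efficiency, the sketch below uses the conventional relationship E = (10^(-1/slope) - 1) × 100, which is assumed here to correspond to Equation 1. The analysis in this study was performed in R; this Python version, its function names, and the CT values are hypothetical and serve only as a worked example.

```python
import math


def fit_standard_curve(log10_dilution, ct):
    """Ordinary least-squares fit of CT against log10(dilution).

    Returns (slope, intercept).
    """
    n = len(ct)
    mean_x = sum(log10_dilution) / n
    mean_y = sum(ct) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilution)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dilution, ct))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pcr_efficiency(slope):
    """Percentage PCR efficiency from the standard-curve slope."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Hypothetical ten-fold dilution series (neat, 1/10, 1/100, 1/1000)
# with made-up CT values; not data from this study.
log10_dilution = [0, -1, -2, -3]
ct = [24.0, 27.3, 30.7, 34.0]

slope, intercept = fit_standard_curve(log10_dilution, ct)
efficiency = pcr_efficiency(slope)
```

With these made-up values the fitted slope is -3.34 cycles per ten-fold dilution, which corresponds to an efficiency of roughly 99%. A slope of about -3.32 corresponds to exactly 100% (perfect doubling each cycle).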
Sensitivity analysis employed the use of receiver operating characteristic (ROC) curves to find the optimal cut-off point (see supplementary data). This approach is used for the analysis of RT-qPCR assays, and also for other body fluid identification methods (such as FTIR). [28][29][30] Smith et al. recently recommended the use of ROC curves in forensic science to measure the discriminability of a method. 31 The R packages plotROC, pROC, and OptimalCutpoints were used. The Youden index was calculated to identify the optimal cut-off point. 32
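To make the cut-off selection concrete, the following minimal sketch computes the Youden index, J = sensitivity + specificity - 1, at each candidate CT cut-off and returns the cut-off that maximizes it. It is not the implementation in the pROC or OptimalCutpoints packages, which may differ in how candidate thresholds and ties are handled. The CT values are made up, and recording undetermined wells as 40 is an assumption for illustration.

```python
def youden_cutoff(ct_target, ct_nontarget):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A reaction is called positive when its CT is at or below the cutoff.
    Ties are resolved in favour of the lowest cutoff.
    """
    candidates = sorted(set(ct_target) | set(ct_nontarget))
    best_cutoff, best_j = None, -1.0
    for cutoff in candidates:
        sensitivity = sum(v <= cutoff for v in ct_target) / len(ct_target)
        specificity = sum(v > cutoff for v in ct_nontarget) / len(ct_nontarget)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up CT values; undetermined wells recorded as 40 (assumption).
target_ct = [29.5, 31.2, 33.8, 36.4, 37.9]
nontarget_ct = [38.8, 39.5, 40.0, 40.0]

cutoff, j = youden_cutoff(target_ct, nontarget_ct)
```

In this toy example the two groups separate completely, so the selected cut-off is 37.9 with J = 1. Real data usually overlap, and J then falls below 1.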

Results and discussion
Following preliminary testing on circulatory blood and a commercial probe assay (see Supplementary Data Figure S1), testing across all body fluids was undertaken using custom probe assays. This was in combination with testing an extended PCR protocol (3-step cycling for one-step kits) and in the case of the UltraPlex kit, an extended extension time (to 90 seconds), as recommended by the manufacturer. Note that the two-step Perfecta kit was not tested with an extended cycling protocol. For practicality, only target body fluids were tested, as sensitivity and reaction efficiency were of more interest at this stage of the research project than specificity, which is unlikely to be greatly affected by altering the qPCR kit used. No amplification was observed for the extraction negative or no-template control samples. Figure 1 shows the standard curve for each assay, condition, and kit. CYP2B7P is detected at the most dilute point (1/1000) across all kits and conditions. PRM1 is detected in all dilutions except for the most dilute point for both UltraPlex conditions. HTN3 is detected in all dilutions except for the most dilute points when using the Perfecta and QuantiNova kits. SLC4A1 is only detected in more than one replicate at 1/1000 in QuantiNova-extended, both TaqPath conditions, and UltraPlex-normal. KLK2 sensitivity is poor across all kits and conditions, with 1/1000 detection in TaqPath-extended, 1/100 detection (>1 replicate) in QuantiNova-normal, TaqMan-extended, TaqPath-normal, and 1/10 detection in the remaining kits and conditions. The more dilute points were detected above the commonly used (arbitrary) cut-off (CT of 35). 30,31 This prompted further testing with receiver operating characteristic curves using only the UltraPlex kit, to identify a higher cut-off value maximizing sensitivity for each target assay (see Supplementary data Figure S2). The CT cut-offs were: HTN3: 38.2, SLC4A1: 39.9, PRM1: 38.1, KLK2: 38.6, CYP2B7P: 38.7.
The dilution points detected below these cut-offs were used in the scoring system when calculating sensitivity, to give weight in the context of specificity. The sensitivity results are shown in Table 2. Most PCR efficiency estimates were within or close to the ideal range of 90-110% (see Table 3). Those which were outside the ideal range were: KLK2 (Perfecta); PRM1, KLK2, CYP2B7P (UltraPlex-normal); KLK2, SLC4A1 (UltraPlex-extended); HTN3, SLC4A1 (TaqPath-normal); HTN3, KLK2, CYP2B7P (TaqPath-extended); SLC4A1, KLK2 (TaqMan-normal); SLC4A1 (TaqMan-extended); and HTN3 (QuantiNova-normal).
To examine the uncertainty in slope estimates, an analysis of covariance (ANCOVA) of normal versus extended PCR conditions was undertaken. This showed a significant interaction between the slope (efficiency) and the kit used, although when the pairwise comparisons were examined individually (using the R package emmeans, which performs a post-hoc Tukey HSD test), most were not significant. When comparing between extended and normal protocols, only one marker per kit showed a statistically significant change in slope (see Figure 2). Similarly, given the marker grouping and PCR protocol, comparison between kits revealed few significant differences in slope.
There is a significant change in efficiency (slope) for HTN3 (QuantiNova and TaqPath), and SLC4A1 (TaqMan and UltraPlex) when changing the qPCR protocol to extended 3-step cycling (Figure 2). Within each marker and condition, most slopes are similar across different kits, as indicated by overlapping confidence intervals (see Figure 2). The kit or PCR protocol which showed improved performance with one marker was not consistently high-performing when amplifying other markers. For example, the extended protocol with the UltraPlex kit improved efficiency for PRM1; however, all other markers showed a decline, which was substantial for some (for example, from 92.06% to 49.79% for SLC4A1). It is also important to note that this decline was statistically significant, whereas the apparent improvement for PRM1 was not. This is likely due to the minor change in mean slope for PRM1 between the two conditions and overlapping confidence intervals. Therefore, when choosing an 'optimal' kit and protocol for further development, many factors need to be considered in addition to sensitivity, efficiency, and precision, including statistical significance of comparisons, inclusion of internal controls, and complexity of set-up. An additional consideration is to aim for similar PCR efficiency values of separate markers if the intention is to create a multiplex assay. To compare experimental performance, the sensitivity, efficiency, and precision parameters were combined and scored in a ranking system, where a lower score indicates better performance (Table 4).
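One simple way to combine several performance parameters into a single score where lower is better is a rank sum. The sketch below illustrates that general idea only. It is an assumption and does not reproduce the scoring rules of Table 4; the combination names, metric names, and values are hypothetical.

```python
def rank(values, lower_is_better=True):
    """Competition ranks (1 = best); tied values share the same rank."""
    ordered = sorted(values, reverse=not lower_is_better)
    return [ordered.index(v) + 1 for v in values]


def total_scores(metrics):
    """Sum the per-metric ranks for each kit/cycling combination.

    metrics maps a metric name to (values, lower_is_better), where values
    holds one number per combination, always in the same order.
    """
    n = len(next(iter(metrics.values()))[0])
    totals = [0] * n
    for values, lower_is_better in metrics.values():
        for i, r in enumerate(rank(values, lower_is_better)):
            totals[i] += r
    return totals


# Hypothetical combinations and made-up values, for illustration only.
combinations = ["kit A-normal", "kit A-extended", "kit B-normal"]
metrics = {
    "dilution points detected": ([3, 4, 4], False),          # more is better
    "distance of efficiency from 100%": ([12.0, 3.0, 8.0], True),
    "mean %CV": ([1.5, 0.9, 1.1], True),
}

scores = total_scores(metrics)
best = combinations[scores.index(min(scores))]
```

Here the summed ranks are 9, 3, and 5, so "kit A-extended" has the lowest (best) score. A rank sum weights every parameter equally, which is a design choice that a real scoring system may not share.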
Precision is consistent, apart from a few kit/condition combinations (Table 5). The TaqPath kit is highly viscous and despite extensive mixing and centrifuging of the plates, it appears this may have influenced the precision under extended conditions. The TaqMan kit under normal conditions shows poor precision, particularly when compared to the same kit under extended protocols. This improvement in performance with 3-step cycling for the TaqMan kit is seen across all experimental parameters. The precision of the Perfecta kit is consistently poor across all markers, as demonstrated by the average coefficient of variation (%CV) (Table 5), and the standard error associated with the slope estimate (Figure 2). This may be due to the cDNA synthesis kit used (High Capacity cDNA Reverse Transcriptase). Previous work using this kit produced non-reproducible results (results not shown), which may be due to the lack of RNase inhibitors in the reaction mix. However, despite the TaqPath kit being the only other kit to not contain an RNase inhibitor, its precision does not seem to be greatly affected, suggesting this is not the only factor that causes non-reproducible results. Nevertheless, the presence of an RNase inhibitor is highly favourable given the obvious application of this assay to 'non-ideal' forensic casework samples, which are likely to contain RNases as they are ubiquitous. Based on this scoring alone, the highest performing kit is the TaqMan under the extended protocol, followed by the TaqPath-normal, QuantiNova-extended, QuantiNova-extended, then UltraPlex-normal (scoring results are shown in Table 6).
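For reference, the coefficient of variation used as the precision measure is the standard deviation expressed as a percentage of the mean. The short sketch below assumes that %CV is calculated on replicate CT values using the sample standard deviation; both points are assumptions, and the values are made up.

```python
import statistics


def percent_cv(values):
    """Coefficient of variation (%): sample SD divided by the mean, times 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


# Made-up triplicate CT values for one dilution point.
replicate_ct = [30.1, 30.4, 30.2]
cv = percent_cv(replicate_ct)
```

For these three replicates the %CV is about 0.5%, which would indicate tight agreement between wells.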
While the basic scoring system gives an indication of general performance, it is also limited as it does not consider all factors. For example, extremely poor performance with one marker (when performance of that marker is improved with other kits or conditions) should rule that kit/condition combination out, such as SLC4A1 PCR efficiency when using the normal TaqMan protocol. The precision and sensitivity of KLK2 were consistently poor in this research across all kits. For further development and optimization of these assays, an alternative KLK2 assay will be utilized.
Other kit features were also considered when selecting a kit to move forward with (Table 7). The reference dye is important, as it defines which other dye channels may be used and therefore influences multiplexing. All kits except TaqPath use ROX dye. Although only the Mustang Purple format of TaqPath was tested, there is the option to include ROX as the reference dye for that kit, which has the same reaction components (TaqPath™ 1-Step RT-qPCR Master Mix).
Hot-start technology inhibits the activity of DNA polymerase at room temperature, which promotes stable reactions by reducing non-specific amplification and primer-dimer formation, and is conducive to automated workflows. While the inclusion of an internal control in the QuantiNova assay is advantageous, it requires the separate purchase of the IC RNA assay to detect it. A separate synthetic RNA control and assay kit could also be purchased and used with any kit to achieve inter-run calibration. The separate purchase of an RNase inhibitor would be required for the TaqPath kit. Visual pipetting controls, while theoretically helpful, add extra pipetting steps, increasing the reaction set-up time and sample handling. Similarly, single-tube (including all reaction components) or dual-tube master mix set-up reduces handling and time required. Cost is not a major consideration as most of the commercial kits were priced similarly (at the time of purchase), although it is still important, as it is a barrier to the adoption of qPCR technologies. 33 Ease of multiplexing is also ideal as the intention is to move to multiplex reactions. The QuantiNova Multiplex kit is the multiplex version of the QuantiNova Probe RT-PCR kit, which is only intended for singleplex/duplex reactions. Due to supply issues, the multiplex kit was only available during the preliminary testing, and therefore the singleplex version was used for the main portion of testing. Hence, if this kit was used for further development, retesting and potentially re-optimizing of the new kit would need to occur when developing the multiplex assay.

Conclusion
This research utilized an objective scoring method to assess the performance of a range of commercial RT-qPCR kits and two different cycling protocols for the purpose of mRNA-based body fluid identification. While the scoring indicates a discrepancy between the TaqMan-extended and the other three high-performing conditions, statistical analysis showed that the differences in efficiency, given by the slope estimate, were not significant as the confidence intervals overlapped. Therefore, based upon high performance, ability to set up at room temperature, and multiplexing capability, the UltraPlex 1-step kit was chosen for further development with the real-time PCR assay. Future work will involve multiplexing and validation of these body fluid identification assays for use in forensic casework.