An in-house multilocus SNP genotyping assay for evaluation of complex genetic diseases.

Abstract Background: With an increase in the discovery of newer genetic loci/polymorphisms in complex multifactorial diseases, there is also an increased need for methods that can simultaneously genotype multiple loci in a cost-effective manner. Using coronary artery disease (CAD) as a model, the study aimed to develop an in-house multilocus assay for simultaneous detection of 17 genetic variants in 11 genes implicated in CAD. Methods: A multiplex polymerase chain reaction (PCR)-based reverse line blot hybridization (MPCR-RLBH) approach was used, where each DNA sample was amplified using two separate MPCRs, and the alleles were genotyped using covalently immobilized, amino-linked sequence-specific oligonucleotide probes using an enhanced chemiluminescence system. The assay performance was tested on 75 healthy controls and 75 angiographically proven CAD cases. Validation was done by automated Sanger sequencing. Results: The assay could successfully discriminate both the alleles at CETP (I405V), LPL (D9N), NOS3 (T-786G and E298D), LIPC (C-514T), FGB (G-455A), ITGB3 (L33P), AGT (M235T), and MTR (A2756G) loci. Certain mutations included in this assay such as ins242G, ins397G, E387K, L393K in the LDLR; N291S in the LPL; D442G in the CETP; and T833C in the CBS genes were found to be absent. The genotype results obtained using this assay showed 100% concordance with sequencing. Conclusion: The study demonstrated development and validation of a multiplex SNP genotyping assay that can be used to assess genetic risk factors in CAD. The assay provides a cost-effective alternative to expensive high throughput genotyping systems in common molecular research laboratories.


Introduction
The last decade has witnessed the emergence of newer high-throughput technologies, which has led to a greater understanding of the genetic factors involved in complex multifactorial diseases such as atherosclerosis, cancer, and diabetes mellitus. Single nucleotide polymorphism (SNP)-based genetic association studies with candidate gene approaches have been widely used for the study of complex multifactorial diseases [1,2]. However, since complex traits are more likely to be caused by several SNPs, and even numerous genes, this approach is limited in its ability to include all possible causative genes and polymorphisms, each of which may have a small-to-moderate, but significant relative risk contribution. With the advent of several high-throughput genotyping platforms, genome-wide association studies (GWAS) are being commonly performed for complex multifactorial diseases with much success [3,4]. GWAS provides a more comprehensive and unbiased approach to study markers encompassing the entire genome. However, the major limitations of these technologies are the need for large population-based samples to achieve genome-wide significance, their high cost, and the technical expertise required, which might not be readily available in common molecular laboratories or in clinical hospital settings. Also, a candidate gene approach may still be needed in the post-GWAS phase (replication studies in the same or different populations), to study SNPs with low allele frequencies, and in situations where GWAS has not yet been performed or is not publicly available. In such cases a candidate gene study would be cost-effective, which emphasizes the need for candidate gene assays that can evaluate multiple markers of choice [5].
Development of a multilocus assay would greatly facilitate genetic association studies encompassing multiple markers that are critical to assess information on candidate genes for multifactorial disease. Thus, using coronary artery disease (CAD) as a model, the present study aimed at developing a robust, rapid and cost-effective in-house multilocus assay for simultaneous genotyping of 17 genetic variants in 11 genes representing pathways implicated in the development of coronary atherosclerosis including: familial hypercholesterolemia (LDLR), decreased HDL levels (LIPC, CETP), hypertriglyceridemia and low HDL levels (LPL), hyperhomocysteinemia (NOS3, CBS, MTR), increased fibrinogen levels/binding (FGB, ITGB3), and blood pressure regulation and renin angiotensin system (AGT, AGTR1). The variants in these genes have been previously reported to be associated with CAD and its related risk factors across different populations, including Indians, and hence were the focus of this assay [6][7][8][9][10][11].

Study design and setting
This study, conducted from 2012 through 2014 at the Research Laboratories, P. D. Hinduja National Hospital and Medical Research Center, Mumbai, India, aimed to develop an in-house multiplex genotyping assay and to validate it in a case-control cohort. The study adhered to the tenets of the Declaration of Helsinki for research involving humans and was approved by the Institutional Review Board and Ethical Committee. All participants signed an informed consent form.

Target genes and their genetic variants
As shown in Table 1, a total of 17 genetic variants representing 11 genes, which are implicated in the progression and development of atherosclerotic plaques as well as thrombosis, were selected.

Multiplex polymerase chain reaction-reverse line blot hybridization technique
A multiplex polymerase chain reaction-based reverse line blot hybridization (MPCR-RLBH) technique was used for developing the multilocus assay for simultaneous genotyping of multiple loci [12]. A maximum of 21 biallelic sites can be genotyped using this approach. In this technique, amino-linked allele-specific oligonucleotide probes are covalently bound to a negatively charged membrane. These probes are then allowed to hybridize with biotin-labeled MPCR products, which hybridize depending upon the sequence homology. This hybridization is detected using streptavidin labeled with peroxidase, followed by enhanced chemiluminescence (ECL) detection on a light-sensitive film.

Primers
The primer sequences were either designed manually or selected from the published literature. HPLC-purified primers (50 nmole scale) were commercially synthesized (Operon, Bangalore, India). The 5′ end of one primer of each pair (forward or reverse) was biotin labeled. Each primer pair was designed to amplify a PCR product of a different size so as to enable detection on gel. The primer sequences and the PCR product lengths for all the genetic variants are listed in Supplementary Table S1 (available online only).

Multiplex PCR amplification
Amplification was performed in an MJ Research PTC-200 thermal cycler (Bio-Rad Laboratories, CA). Multiplex PCR master mix (QIAGEN, GmbH, Steinheim, Germany) containing HotStarTaq® DNA polymerase, multiplex PCR buffer containing 6 mM MgCl2, and dNTP mix was used for amplification. Each DNA sample was amplified using two separate MPCR reactions, A and B, amplifying nine and eight genetic variants, respectively. Master Mix A (nine variants) consisted of primer pairs for LDLR (E387K, L393R, [...] The preparation of MPCR master mix for A and B reactions along with the PCR amplification conditions are shown in Supplementary Table S2. All the additions were performed using molecular grade barrier filter tips and reagents throughout the assay. The PCR amplified products were then electrophoresed on a 4% agarose gel (Sigma Aldrich, Steinheim, Germany) and visualized under UV light after ethidium bromide staining.

Allele-specific oligonucleotide probes
Two probes (wild-type and mutant type) of 16-24 mer each were designed for each biallelic site/genetic variant to detect and distinguish between the variant sequences. All the probe sequences (HPLC purified, 200 nmole scale) were synthesized with a 5′-terminal amino group (Operon, Bangalore, India), by which they could be covalently linked to an activated negatively charged Biodyne C membrane (Pall Life Sciences, FL). Candidate probe sequences were designed manually or selected from published GenBank sequences. For manual probe design, 'Oligo Calc', a free online oligonucleotide properties calculator (http://biotools.nubic.northwestern.edu/OligoCalc.html), was used to avoid probe sequences with potential hairpin formation, 3′-complementarity and self-annealing sites. Careful design of the probes allowed hybridization of all the probes at a single temperature. No probe redesign was required for any of the genotypes in this assay. The concentration of each probe required to detect an intense signal was standardized for all the loci. The probe concentrations tested varied from 1-2000 pmole for different variants. Final concentrations of the probes were chosen so as to achieve signal balance between alleles at each variable site in a multiplex format, and comparably clear signal intensities across all the variants. The oligonucleotide probe sequences for all the genetic variants and the probe concentrations standardized for optimal hybridization are listed in Supplementary Table S3.
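The probe design constraints described above (16-24 mer length, avoidance of self-annealing sequences) can be illustrated with a minimal screening sketch. This is not the Oligo Calc algorithm: the function names and the `max_self_run` threshold of 4 bases are assumptions introduced here for illustration, and the longest stretch shared between a probe and its own reverse complement is used only as a rough proxy for self-annealing or hairpin potential.

```python
# Illustrative probe pre-screen; names and thresholds are hypothetical,
# and this is NOT the method implemented by Oligo Calc.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_self_complementary_run(seq: str) -> int:
    """Length of the longest substring that also occurs in the reverse complement.

    A long shared stretch means one part of the probe can base-pair with
    another part (hairpin) or with a second copy of the probe (self-annealing).
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = len(seq)
    best = 0
    prev = [0] * (n + 1)
    # Classic longest-common-substring dynamic programme over seq and rc.
    for i in range(1, n + 1):
        cur = [0] * (n + 1)
        for j in range(1, n + 1):
            if seq[i - 1] == rc[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def screen_probe(seq: str, min_len: int = 16, max_len: int = 24,
                 max_self_run: int = 4) -> list[str]:
    """Return a list of design warnings; an empty list means no warning."""
    issues = []
    if not min_len <= len(seq) <= max_len:
        issues.append(f"length {len(seq)} outside {min_len}-{max_len} mer")
    run = longest_self_complementary_run(seq)
    if run > max_self_run:  # assumed threshold, tune per laboratory
        issues.append(f"self-complementary stretch of {run} bases")
    return issues


if __name__ == "__main__":
    # Hypothetical sequence, not one of the probes in Supplementary Table S3.
    print(screen_probe("AAAACCCCAAAACCCCAA"))
```

A pre-screen of this kind only flags candidates for closer inspection; melting temperature and 3′-complementarity would still need to be checked with a dedicated tool.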

Probe array preparation
A miniblotter apparatus (Immunetics, MA, USA) was used for labeling the negatively charged membrane with the amino-linked oligonucleotide probes. The miniblotter allows the application of up to 43 probes/samples through dedicated slots. The probes (wild-type and mutant type) were diluted to an optimized concentration in 0.5 M NaHCO3 (pH 8.4) (BDH, Mumbai, India). The Biodyne C membrane was then activated using freshly prepared 16% 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDAC; Sigma Aldrich, Steinheim, Germany) in demineralized water. The membrane was then fitted tightly into a clean miniblotter system on a support cushion (Immunetics, MA). The slots in the miniblotter were filled with diluted probe solution. The first and the last slots were used to mark the membrane by adding diluted drawing pen ink. The membrane was allowed to incubate for 2 min at room temperature (RT), after which the oligonucleotide probes were removed by aspiration. The apparatus was then dismantled, and the membrane was removed using forceps and placed in 0.1 M NaOH (BDH, Mumbai, India) for 10 min to inactivate the membrane. After rinsing with demineralized water, the membrane was placed in 20 mM EDTA (pH 8.0) (Sigma Aldrich, Steinheim, Germany) at RT for 15 min, saran-wrapped, and stored at 4 °C until use.

Hybridization and detection of alleles
In order to achieve specific hybridization for all the genetic variants included in this study on a single array, we tested several parameters in low to high stringency conditions. These included different hybridization temperatures (55-68 °C), different oligonucleotide probe concentrations (1-2000 pmole), washing temperatures (60-68 °C), salt (SSPE) concentrations (1-4×), SDS concentrations (0.5-1%), and exposure time during chemiluminescence (10-20 min). The multilocus assay was eventually standardized at a hybridization temperature of 63 °C, followed by washing in 2× SSPE/0.5% SDS buffer at 65 °C with varying probe concentrations (listed in Supplementary Table S3) and an exposure time of 20 min to achieve comparable signal intensity for all the target variants. Hybridization buffers were prepared from 20× SSPE (sodium chloride, sodium phosphate, EDTA) containing 3.0 M NaCl, 0.2 M NaH2PO4, and 0.02 M EDTA at pH 7.4 (Invitrogen, CA) and 10% sodium dodecyl sulphate (SDS) (Sigma Aldrich, Steinheim, Germany). A 10 µL aliquot of biotin-labeled PCR product from each of MPCR-A and -B was mixed with 140 µL of 1× SSPE/0.1% SDS, heat denatured at 95 °C for 10 min and immediately cooled on ice. The probe-labeled membrane was washed at 60 °C in 1× SSPE/0.1% SDS inside a Shake 'n' Stack hybridization oven (Thermo Scientific, OH) for 5 min. The membrane was then placed tightly into the miniblotter using a support cushion, in such a way that the slots were perpendicular to the line pattern of the applied oligonucleotide probes. Denatured biotin-labeled MPCR products were applied into the slots (avoiding air bubbles) and allowed to hybridize for 60 min at 63 °C on a horizontal surface (with no shaking) in the hybridization oven. Care was taken to avoid contamination of the neighboring slots. After the incubation, the samples were aspirated from the miniblotter and the membrane was removed from the miniblotter using forceps.
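Preparing the working buffers from the 20× SSPE and 10% SDS stocks is a simple C1V1 = C2V2 dilution. The sketch below works through the 2× SSPE/0.5% SDS wash buffer and the 160 µL hybridization mix; the 500 mL batch volume and the function names are assumptions chosen for illustration, since the batch sizes used in the study are not stated.

```python
# Dilution arithmetic (C1 * V1 = C2 * V2); batch volume and names are hypothetical.

def stock_volume(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock needed so that final_volume has concentration final_conc."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume / stock_conc


def sspe_sds_buffer(final_volume_ml: float, sspe_x: float, sds_percent: float,
                    sspe_stock_x: float = 20.0, sds_stock_percent: float = 10.0) -> dict:
    """Volumes (mL) of 20x SSPE, 10% SDS and water for one batch of buffer."""
    sspe = stock_volume(sspe_stock_x, sspe_x, final_volume_ml)
    sds = stock_volume(sds_stock_percent, sds_percent, final_volume_ml)
    return {
        "sspe_stock_ml": sspe,
        "sds_stock_ml": sds,
        "water_ml": final_volume_ml - sspe - sds,
    }


def hybridization_mix_volume_ul(pcr_a_ul: float = 10.0, pcr_b_ul: float = 10.0,
                                buffer_ul: float = 140.0) -> float:
    """Total volume per slot: MPCR-A product + MPCR-B product + 1x SSPE/0.1% SDS."""
    return pcr_a_ul + pcr_b_ul + buffer_ul


if __name__ == "__main__":
    # Stringent wash buffer (2x SSPE / 0.5% SDS), assumed 500 mL batch.
    print(sspe_sds_buffer(500.0, sspe_x=2.0, sds_percent=0.5))
    # Hybridization diluent (1x SSPE / 0.1% SDS), assumed 500 mL batch.
    print(sspe_sds_buffer(500.0, sspe_x=1.0, sds_percent=0.1))
    print(hybridization_mix_volume_ul())
```

For the assumed 500 mL batch of wash buffer this gives 50 mL of 20× SSPE, 25 mL of 10% SDS and 425 mL of water.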
The membrane was washed twice in 2× SSPE/0.5% SDS buffer for 10 min at 65 °C. After washing, the membrane was placed in a rolling bottle and allowed to cool in order to prevent the inactivation of the peroxidase enzyme in the next step. The membrane was then incubated with streptavidin-peroxidase conjugate (Roche Diagnostics GmbH, Mannheim, Germany) at 42 °C for 60 min, followed by washing twice in 2× SSPE/0.5% SDS buffer for 10 min at 42 °C and twice with 2× SSPE for 5 min at RT. Chemiluminescent detection of hybridizing DNA was carried out using ECL detection reagents as recommended by the manufacturer (Amersham Biosciences, Buckinghamshire, UK). The membrane was wrapped in a transparent plastic sheet or saran-wrap and kept in a hypercassette (Amersham Pharmacia Biotech, Buckinghamshire, UK), followed by exposure to a light-sensitive film (Roche Diagnostics, Indianapolis) for 20 min. After development of the X-ray film in an automated X-ray film processor (Promax, Bangalore, India), the results were visualized as dark spots. Each spot represented a specific hybridization between a wild-type and/or mutant probe and the PCR product, indicating the genotype at each variant site. In order to re-use the membrane, the hybridized PCR products were stripped off from the membrane by washing twice with 1% SDS solution at 80 °C for 30 min each, followed by rinsing with 20 mM EDTA solution (pH 8.0) at RT for 15 min. The membrane can be re-developed at this stage to check for any carry-over signals, or saran-wrapped and stored at 4 °C.
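The read-out rule for the array, in which a spot at the wild-type probe, the mutant probe, or both determines the genotype, can be written as a small lookup. The sketch below is illustrative only: the names are hypothetical, spots are assumed to have already been scored by eye as present or absent, and the `NO_CALL` outcome for a lane with neither spot is an assumption added here (for example, to flag a failed amplification), not something described in the study.

```python
from enum import Enum


class Genotype(Enum):
    WILD_HOMOZYGOUS = "wild-type homozygous"
    HETEROZYGOUS = "heterozygous"
    MUTANT_HOMOZYGOUS = "mutant homozygous"
    NO_CALL = "no call"  # assumption: neither probe gave a spot


def call_genotype(wild_spot: bool, mutant_spot: bool) -> Genotype:
    """Translate presence/absence of the two probe spots into a genotype."""
    if wild_spot and mutant_spot:
        return Genotype.HETEROZYGOUS
    if wild_spot:
        return Genotype.WILD_HOMOZYGOUS
    if mutant_spot:
        return Genotype.MUTANT_HOMOZYGOUS
    return Genotype.NO_CALL


if __name__ == "__main__":
    # Hypothetical spot scores for one sample lane: variant -> (wild, mutant).
    lane = {"CETP I405V": (True, True), "AGT M235T": (False, True)}
    for variant, (wild, mutant) in lane.items():
        print(variant, call_genotype(wild, mutant).value)
```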

Testing and validation phase
The multilocus assay performance was tested by genotyping DNA samples from 75 angiographically proven CAD cases and 75 healthy controls. DNA was prepared from EDTA blood using a salting out method [13]. The inclusion and exclusion criteria for the CAD and ethnically matched healthy control groups have been described elsewhere [11]. The multilocus assay results were validated by commercial automated Sanger DNA sequencing (Xcelris Genomics, Gujarat, India).

Results
Multiplex PCR was standardized to amplify all 17 genetic variants, with amplified products ranging from 160-577 bp that could be easily distinguished on a 4% agarose gel, as shown in Figure 1. Although some products showed weak intensities as compared to others, in both reactions A and B, the yield was sufficient to be detected on the array by the immobilized oligonucleotide probes. A 198 bp APOC3 fragment was included in MPCR mix A to see if the presence of any additional unrelated fragment (a scenario where non-specific amplification may occur) would affect specific allele identification using the array.
A representative example of the detection of the amplified alleles by the array is shown in Figure 2. The array could discriminate each of the three possible genotypes successfully at CETP (I405V), NOS3 (T-786G and E298D), LIPC (C-514T), FGB (G-455A), ITGB3 (L33P), AGT (M235T), and MTR (A2756G) loci. However, certain mutations included in this assay such as ins242G, ins397G, E387K, L393K in the LDLR; N291S in the LPL; D442G in the CETP; and T833C in the CBS genes were found to be absent in this cohort (as confirmed by Sanger sequencing of representative samples) and hence the identification of mutant alleles for these could not be demonstrated. A total of 150 DNA samples, which included 75 CAD cases and 75 healthy controls, were tested using this array. The total number of genotypes identified in these DNA samples is listed in Table 2.
A total of 59 DNA samples were subjected to Sanger sequencing for validation of the array results. The chromatogram data were matched with GenBank sequences. No discrepancies were observed between sequencing and the array results. A representative chromatogram for each genetic variant is shown in Supplementary Figure S4.
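The comparison between array and sequencing calls amounts to a concordance calculation over the genotypes typed by both methods. A minimal sketch is given below, assuming each method's calls are stored in a dictionary keyed by (sample, variant); the function name, data layout, and example records are hypothetical and do not reproduce the study data.

```python
def concordance(array_calls: dict, sequencing_calls: dict):
    """Fraction of shared (sample, variant) keys with identical genotype calls.

    Returns (concordance, sorted list of discordant keys). Only genotypes
    typed by both methods are compared.
    """
    shared = array_calls.keys() & sequencing_calls.keys()
    if not shared:
        raise ValueError("no genotypes were typed by both methods")
    discordant = sorted(k for k in shared if array_calls[k] != sequencing_calls[k])
    return 1 - len(discordant) / len(shared), discordant


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    array = {("S01", "CETP I405V"): "IV", ("S01", "AGT M235T"): "TT",
             ("S02", "CETP I405V"): "II"}
    sanger = {("S01", "CETP I405V"): "IV", ("S01", "AGT M235T"): "TT",
              ("S02", "CETP I405V"): "II"}
    rate, mismatches = concordance(array, sanger)
    print(f"concordance = {rate:.0%}, discordant = {mismatches}")
```

With no discordant keys the function returns a concordance of 1.0, which corresponds to the 100% agreement reported here.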

Discussion
In this study, we report the development of a prototype multilocus genotyping assay with a focus on a multifactorial disease such as CAD. The assay described here can be used to genotype 17 biallelic sites that were previously reported by our group in Indian CAD patients. In the past, multilocus assays have been developed [14] that were used to study the population-based Bezafibrate Infarction Prevention (BIP) cohort, and the elderly populations of France (Nancy) and China (Hong Kong), to assess cardiovascular associated risk [15,16]. Although the broad utility of both these assays is similar, there are certain important technical differences besides the number of alleles that can be genotyped. As opposed to the use of bovine serum albumin conjugated probes, our assay used amino-linked probes, which make the membrane reusable. And in contrast to using a colorimetric-based detection system, our method uses a more sensitive chemiluminescence assay.
There are newer high-throughput genotyping technologies that have been described for genetic association studies using a candidate-gene approach. These include the SNaPshot® Multiplex Kit (Applied Biosystems™) that can detect up to 10 SNPs using a primer-extension method with capillary electrophoresis, SNPlex® Genotyping (Applied Biosystems™) that can genotype up to 48 SNPs using an oligonucleotide ligation assay followed by PCR and capillary electrophoresis, and the Sequenom MassARRAY system, which is also based on primer-extension and uses a mass spectrometer for detection to genotype 36 SNPs on 384 samples per assay [17][18][19]. The instrumentation and the technical requirements for all these assays and the one described in this study vary, and it is up to the investigator to make a relevant choice depending on the specific aims of the study.
The multilocus assay described in this study demonstrated genotyping of 43 samples for 17 variants per assay in a single day. However, it should be noted that identification of certain mutations (as mentioned in Results) could not be demonstrated. This was because these mutations were absent in the limited cohort included in this proof-of-concept study. In addition, no positive controls were available for these mutations. However, there was no discrepancy between the genotypes achieved by the multilocus assay and sequencing of representative samples for these mutations, indicating that the assay could unambiguously identify a wild-type genotype without missing a mutant genotype. The assay can be expanded to identify 21 variants, and the probe-labeled membrane can be re-used up to 12 times after simple washing steps to strip off the PCR products, thereby allowing genotyping of at least 516 samples (43 × 12) on a single membrane, at an approximate cost of less than 30 USD per patient sample, making it truly cost-effective. In this study we have re-used the membrane up to 5 times without any carry-over contamination problem. Large cohorts can thus be genotyped rapidly using this assay. On the other hand, since the assay uses sequence-specific probes, it cannot be used to detect variable number of tandem repeat polymorphisms or novel polymorphisms. Newer genetic variants identified in the future can be incorporated into the assay with little optimization.
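The throughput figures quoted above follow from simple multiplication, sketched below with hypothetical function names. Following the arithmetic in the text, the number of membrane uses is taken as 12 for the projected case; the value of 5 corresponds to the re-use actually demonstrated in this study.

```python
# Throughput arithmetic for one probe-labeled membrane; names are hypothetical.

def samples_per_membrane(samples_per_run: int = 43, uses: int = 12) -> int:
    """Samples genotyped on one membrane over its usable lifetime."""
    return samples_per_run * uses


def genotypes_per_run(samples_per_run: int = 43, variants: int = 17) -> int:
    """Individual genotype calls produced by a single hybridization run."""
    return samples_per_run * variants


if __name__ == "__main__":
    print(samples_per_membrane())           # projected: 43 x 12
    print(samples_per_membrane(uses=5))     # demonstrated in this study: 43 x 5
    print(genotypes_per_run())              # current 17-variant panel
    print(genotypes_per_run(variants=21))   # panel expanded to 21 variants
```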
In conclusion, the study demonstrated the development and validation of a multiplex SNP genotyping assay using RLBH that can be used to assess genetic risk factors in CAD. Although the method is limited by the number of SNPs that can be genotyped and involves multiple steps, it provides a cost-effective alternative to expensive high-throughput genotyping systems in common molecular research laboratories and serves as a research tool for genetic predisposition studies of complex multifactorial diseases in large cohorts.

Figure legend: Each dark spot represents the specific hybridization between the polymerase chain reaction (PCR) product and the allele-specific oligonucleotide probes (wild-type and mutant type). Hybridization seen only for the wild-type or the mutant probe indicates a wild-type or mutant homozygous genotype, respectively, whereas binding to both probes indicates heterozygosity.