Characterization of Aspergillus niger DNA by Surface-Enhanced Raman Spectroscopy (SERS) with Principal Component Analysis (PCA) and Partial Least Square Discriminant Analysis (PLS-DA) with Application for the Production of Cellulase

Abstract A new method based on surface-enhanced Raman spectroscopy (SERS) has been developed to monitor mutations in the DNA of Aspergillus niger to enhance the production of cellulase enzyme. The DNA of A. niger fungus was mutated at specific points by CRISPR-Cas9 and subsequently isolated prior to SERS analysis. Specific spectral features have been identified which are associated with the DNA bases where the mutations have taken place. Those mutations ultimately led to the enhancement of cellulase production. The SERS spectra have been differentiated by multivariate data analysis, including principal component analysis (PCA) and partial least square discriminant analysis (PLS-DA). The results indicate that the changes at 567 cm−1, 951 cm−1, 1300 cm−1, 1314 cm−1, 1393 cm−1, and 1442 cm−1 are associated with DNA mutations. PCA is able to differentiate mutated from unmutated A. niger DNA. Moreover, the PLS-DA model identified and discriminated samples with an accuracy of 100%. These findings open up new avenues to rapidly and accurately characterize industrially-relevant biological samples.


Introduction
Aspergillus niger (A. niger) is commonly used in genetic research. The Aspergillus genus includes hundreds of fungal species and has a vast range of industrial applications. For instance, it produces specific enzymes that are capable of lysing the complex mesh and highly tensile barrier of polysaccharides in cellulose, which is a main component of cell walls in plants. The extraction of cellulase enzymes from A. niger is one of the most economical routes of production of these enzymes. A. niger is ubiquitous in the atmosphere and is therefore easy to handle for cellulase production (Rahman 2017).
Temperature and pH play significant roles in the production of cellulase enzymes, where the optimal temperature and pH are around 30 °C and 5.0 for A. niger. Nevertheless, these species are relatively thermotolerant and produce active enzymes under a wide range of temperatures. Enzyme production also depends on nutrient availability. For instance, as the concentration of sugar increases, so does the cellulase production (Gautam et al. 2011).
The cellulase enzymes have significance in multiple industries, such as conversion of biomass into biofuels (da Silva Delabona et al. 2012), beverage production (Bhati, Kumar, and Shreya 2021), laundry (Østergaard and Olsen 2011), food, textiles (Adrio and Demain 2014), wine production (Ul-Islam et al. 2020), fruit juice and beer (Chakraborty et al. 2016), wastewater treatment (Jahangeer et al. 2005), and pulp and paper production (Sanchez and Demain 2017). Moreover, cellulases are utilized in pharmaceutics and in the treatment of several diseases, such as phytobezoars (Kramer and Pochapin 2012). However, all of these uses are rather small in comparison to the use of cellulases for the biotransformation of lignocellulosic biomass into fuel ethanol (Himmel, Baker, and Overend 1994; Minty et al. 2013).
Cellulase enzymes, which catalyze the decomposition of cellulose, are produced by bacteria, protozoans, and fungi. Cellulases account for about 20% of total enzyme trade in the world and are the third largest class of commercial enzymes (Singhania et al. 2010; Chandel et al. 2012). These enzymes are industrially produced through fungal and bacterial fermentation. Compared to bacteria, most fungi are able to produce a full system of cellulases (Cen and Xia 1999). Commercial cellulases are primarily manufactured by two strains of the soft rot fungi, namely A. niger and Trichoderma reesei, through submerged fermentation (Pandey, Soccol, and Mitchell 2000). Nevertheless, submerged fermentation is limited by its low productivity. Thus, solid state fermentation has been developed as an alternative for cellulases and other industrial enzymes (Couto and Sanromán 2005). Solid state fermentation is advantageous over submerged fermentation as its environment resembles the natural habitat of the microorganisms, its yield is higher, and its costs are lower (Hölker, Höfer, and Lenz 2004; Singhania et al. 2009; Amore and Faraco 2012).
Although production costs of cellulase enzymes are continuously increasing, cellulase production may be efficient if large-scale production is achieved (Saini et al. 2015). Therefore, there is a need to improve the production of cellulase enzymes. One approach to enhance enzyme production is to induce random mutations in specific species of fungus, such as A. niger, through chemical and physical means. For example, if A. niger is irradiated with ultraviolet light under specific conditions, including distance and exposure time, its cellulolytic activities are enhanced (Vu, Pham, and Kim 2011; Jafari et al. 2017) due to oxidative mutations/variations.
Although inducing random mutations is a powerful approach to improve cellulase production, proper characterization of those genetic modifications is necessary. Different spectroscopic techniques have been used for the characterization of mutations in Aspergillus species, such as one-dimensional nuclear magnetic resonance spectroscopy (1D-NMR) (Hua et al. 2020) and ultraviolet-visible spectroscopy (UV-Vis) (Williamson et al. 1997; Naqvi et al. 2013). Raman spectroscopy may be used to acquire information about important biochemical features related to the molecular and internal structure of fungi and fungal mass. However, conventional Raman spectroscopy suffers from a low signal-to-noise ratio and low sensitivity, which severely limit its analytical applications (Li, Yang, and Lin 2012).
To overcome these issues, surface-enhanced Raman spectroscopy (SERS), where the Raman scattering of molecules is strongly enhanced by the localized surface plasmon resonances of noble metal nanostructures, may be used. An obvious advantage of SERS compared to traditional Raman spectroscopy is the enhancement of the signal by several orders of magnitude, which enables the quantification of low analyte concentrations that may not be measurable by traditional approaches (Bonifacio, Cervo, and Sergo 2015).
In this study, a CRISPR-Cas9 based genome engineering system was used to induce mutations within A. niger. The enzymatic activities of the mutated colonies of fungus were screened, and two colonies were selected for further study. The DNA of these A. niger colonies was extracted and analyzed by SERS to detect the genomic changes. Furthermore, SERS spectral features were correlated to the levels of enzyme production by the fungus.

Synthesis of silver nanoparticles
The silver nanoparticles (Ag-NPs) were prepared by chemical reduction with silver nitrate (AgNO3) (Sigma Aldrich; CAS number: 7761-88-8) as a precursor. Trisodium citrate (Sigma Aldrich; CAS number: 6132-04-3) was used both as a capping ligand and reducing agent as described previously (Kashif et al. 2020; Fang et al. 2015). The prepared Ag-NPs were centrifuged and characterized by transmission electron microscopy (TEM) and scanning electron microscopy (SEM) as shown in Figure S1 in the supplementary material. The nanoparticles were oval-shaped with dimensions of approximately 65 × 45 nm, as previously reported (Kashif et al. 2020). Ag-NPs of this size and shape have been reported to provide high SERS activity (Stamplecoskie et al. 2011; Kashif et al. 2020).

Fungal growth and total genomic DNA extraction
The parent and genetically engineered A. niger strains were cultured in Vogel's media at pH 5.0 (Javed et al. 2018) and 30 ± 1 °C for 48 to 60 h in an orbital shaking incubator at 120 rpm. The cell mass (mycelia) from the fungal growth culture was collected into 2 mL sterilized microcentrifuge tubes, while the supernatant was separated by centrifugation at 6,000 rpm (4430 × g) for 10 min at 4 °C. Genomic DNA from the harvested fungal mycelia was extracted by a literature procedure (Cenis 1992) and analyzed by electrophoresis using a 1% (w/v) agarose gel.
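The relation between rotor speed and relative centrifugal force quoted above can be checked with the standard formula RCF = 1.118 × 10^-5 · r · rpm², with r in cm. The short sketch below is only illustrative: the rotor radius of 11 cm is an assumption (it is not stated in the text) chosen because it roughly reproduces the reported 4430 × g at 6,000 rpm.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in multiples of g) for a given rotor speed.

    Uses the standard conversion RCF = 1.118e-5 * r[cm] * rpm**2.
    """
    return 1.118e-5 * radius_cm * rpm ** 2


# Assumed rotor radius (hypothetical, not given in the text).
ASSUMED_RADIUS_CM = 11.0

rcf = rcf_from_rpm(6000, ASSUMED_RADIUS_CM)
print(f"6000 rpm at r = {ASSUMED_RADIUS_CM} cm is about {rcf:.0f} x g")
```

Because the force scales with the rotor radius, the rpm value alone does not define the centrifugation conditions; the × g value is the transferable quantity.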

Sample preparation and SERS spectral acquisition
The DNA extracted from mutated A. niger was analyzed by SERS. The spectra were acquired with a Peak Seeker Pro 785 (Agiltron) spectrometer with a diode laser using a 785 nm excitation wavelength with a 40× objective and a power of 50 mW. The integration period was 10 s from 500 to 1600 cm−1. 50 µL of each sample were mixed with an equal volume of RNase-free water in an Eppendorf tube. Following a 30-min incubation period, 40 µL were mixed with an equivalent volume of Ag-NPs, and the SERS spectra were acquired on an aluminum slide. Fifteen spectra were recorded at room temperature for each sample.

Preprocessing of SERS spectra
The acquired SERS spectra were preprocessed to reduce baseline changes and contributions by the substrate with MATLAB 7.8 using in-house developed chemometric codes. Preprocessing included spectral range selection, substrate removal, smoothing, baseline correction, and vector normalization. The baseline of all spectra was corrected with rubber band and polynomial methods. The smoothing of data was done by a Savitzky-Golay approach (Butler et al. 2018). The same treatment was applied to all spectra in the matrix.
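The preprocessing chain described above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the in-house MATLAB code used in the study: the window length, polynomial orders, and iteration count are assumed values, only the polynomial baseline method is shown (the rubber band method and substrate removal are omitted), and the spectrum is synthetic.

```python
import numpy as np


def savgol_smooth(y, window=9, order=3):
    """Savitzky-Golay smoothing via a local least-squares polynomial fit."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    design = np.vander(offsets, order + 1, increasing=True)
    # First row of the pseudo-inverse gives the fitted value at the window centre.
    coeffs = np.linalg.pinv(design)[0]
    padded = np.pad(y, half, mode="edge")  # repeat edge values at the borders
    return np.convolve(padded, coeffs[::-1], mode="valid")


def polynomial_baseline(x, y, degree=3, n_iter=50):
    """Iterative polynomial baseline: points above the fit are clipped each pass."""
    xs = (x - x.mean()) / x.std()  # rescale for a well-conditioned fit
    work = y.copy()
    for _ in range(n_iter):
        fit = np.polyval(np.polyfit(xs, work, degree), xs)
        work = np.minimum(work, fit)
    return fit


def vector_normalize(y):
    """Scale the spectrum to unit Euclidean norm."""
    return y / np.linalg.norm(y)


def preprocess(x, y):
    """Smoothing, baseline correction, then vector normalization."""
    smoothed = savgol_smooth(y)
    corrected = smoothed - polynomial_baseline(x, smoothed)
    return vector_normalize(corrected)


# Synthetic spectrum (not measured data): one band on a sloping baseline.
rng = np.random.default_rng(0)
x = np.linspace(500, 1600, 551)  # Raman shift, cm^-1
y = (1.0 + 0.002 * (x - 500)
     + 5.0 * np.exp(-0.5 * ((x - 1314) / 8) ** 2)
     + rng.normal(0.0, 0.02, x.size))
processed = preprocess(x, y)
print("Strongest band after preprocessing near", x[np.argmax(processed)], "cm^-1")
```

Applying one `preprocess` function to every row of the spectral matrix is a simple way to guarantee that all spectra receive identical treatment.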

SERS data analysis
Principal component analysis (PCA) and partial least square discriminant analysis (PLS-DA) were used to analyze the SERS spectra. PLS-DA was used to distinguish unmutated and mutated Aspergillus niger because it is a supervised classification method suited to high-dimensional spectral data.

Results and discussion
To compare the SERS spectra of unmutated and mutated A. niger DNA, the samples were categorized into three classes: unmutated A. niger (3 samples), mutated-I A. niger with high enzyme production (3 samples), and mutated-II A. niger with low enzyme production (3 samples).

Mean SERS spectra
Figure 1 compares the mean SERS spectra of unmutated DNA samples extracted from untreated A. niger fungus (blue), mutated-I with high enzyme production (cyan), and mutated-II with low enzyme production (pink). The most prominent SERS features are marked by solid and dotted lines. The solid lines represent characteristic features that differentiate specific groups, while dotted lines represent intensity-based differences among the groups. Peak assignments were performed based upon the literature as shown in Table 1.
There were also SERS characteristics shared among samples that displayed different intensities. Those are highlighted in the spectra with dotted lines. The SERS bands at 650 cm−1 (guanine), 788 cm−1 (phosphodiester bond of DNA), 951 cm−1 (ring vibrations of DNA), and 1092 cm−1 (asymmetric stretching of glycosidic link) are associated with DNA. The intensities of these features were lower in the mutated DNA compared to the unmutated samples. Interestingly, SERS may also be used to identify mutations in DNA bases, since replacing a base decreased the peaks associated with the former and increased the signal related to the latter. For example, while guanine intensities showed almost no change, a strong decrease was observed for the adenosine peak. Moreover, some changes suggest the replacement of cytosine by thymine. The stacking area of neighboring base planes changes as a result of the substitution of cytosine by thymine (Zeng et al. 2021).
The SERS bands at 670 cm−1, 999 cm−1, 1393 cm−1, and 1442 cm−1 were higher in the mutated DNA compared to their unmutated counterparts. The band at 670 cm−1 is associated with thymine. This single nucleotide change indicates mutational events in that region with conversion from cytosine to thymine (Zhang et al. 2022). The appearance of a weak band at 999 cm−1 was due to C=CH deformation associated with DNA and C-C stretching of an aromatic ring that was likely caused by cycloaddition between the C=C double bonds of two pyrimidines and dimerization between the adjacent pyrimidine bases (Braga et al. 2015). Moreover, the peak at 1393 cm−1 was assigned to C-H rocking of DNA, and 1442 cm−1 corresponds to the C-H bending of DNA.
Furthermore, strong SERS peaks at 567 cm−1, 596 cm−1, 702 cm−1, 1300 cm−1, and 1314 cm−1 were observed in DNA extracted from mutated-I and mutated-II A. niger. The peak at 702 cm−1 is induced by a phosphodiester bond in DNA. The prominent peaks at 567 cm−1 and 596 cm−1 are due to glycosidic ring deformation of DNA. The peak at 1300 cm−1 was assigned to CH2 deformation in purine. The prominent peak at 1314 cm−1 was linked to P=O of the DNA. These results are consistent with the high level of genes associated with purine and pyrimidine metabolism in mutated DNA compared to control samples (Jiang et al. 2020).
Some SERS features were solely associated with unmutated DNA, including the bands at 586 cm−1, 686 cm−1, 727 cm−1, 1072 cm−1, and 1300 cm−1. The 586 cm−1 peak is due to C-C out-of-plane bending of nucleic acid, while the band at 686 cm−1 was correlated to C-H in-plane bending of DNA. Moreover, the band at 727 cm−1 was assigned to adenine, and the band at 1072 cm−1 was associated with C-H bond in-plane bending and out-of-plane deformation of DNA. The shift of the band at 727 cm−1 after mutation may be caused by the interaction between adenosine and other entities through hydrogen bonds or electrostatic interactions, as similar observations have been reported (Singh et al. 2020).
Hence, the SERS screening identified many variations between the genomes of unmutated and mutated A. niger fungus which may lead to the enhanced production of cellulase. These differences included changes in the sequences of nitrogenous bases that are an integral part of the nucleic acids and are used as structural and functional units.

Principal component analysis (PCA)
PCA transforms potentially correlated variables into a smaller number of uncorrelated variables, the principal components (PCs), and thereby reduces the dimensionality while retaining most of the variability (Saade et al. 2008). The first principal component (PC-1) explains the largest share of the variability in the data, while each subsequent principal component explains part of the remaining variability. The PC loadings may be regarded as orthogonal dimensions of variability, and the scores of the SERS spectra along these dimensions separate the groups of spectra (Nawaz et al. 2017).
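The decomposition into scores, loadings, and explained variance described above can be sketched with a singular value decomposition of the mean-centred spectral matrix. The example below is an illustrative Python/NumPy sketch on synthetic data; the two groups, the band position at 1314 cm−1, and the noise level are assumptions made for the demonstration and are not the study's spectra.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA by SVD of the mean-centred data matrix (rows are spectra)."""
    centred = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    loadings = Vt[:n_components]                      # orthonormal directions
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, loadings, explained


# Synthetic stand-in for two groups of 15 spectra (not the study's data).
rng = np.random.default_rng(0)
shift = np.linspace(500, 1600, 200)                   # Raman shift, cm^-1
band = np.exp(-0.5 * ((shift - 1314) / 25) ** 2)      # band present in group B only
group_a = rng.normal(0.0, 0.05, (15, shift.size))
group_b = band + rng.normal(0.0, 0.05, (15, shift.size))
X = np.vstack([group_a, group_b])

scores, loadings, explained = pca(X)
print("Fraction of variance explained by PC-1:", round(float(explained[0]), 3))
print("Largest |PC-1 loading| near", shift[np.argmax(np.abs(loadings[0]))], "cm^-1")
```

In this toy case the two groups fall on opposite sides of zero along PC-1, and the largest PC-1 loading sits at the band that distinguishes them, which mirrors how the loading plot is read in the pairwise analysis.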
The SERS spectra of the samples were analyzed by PCA to highlight minor but relevant changes using all of the spectral information. Here, PCA showed a good separation of the SERS spectra of unmutated (pink), mutated-I (purple), and mutated-II (yellow) samples (Figure 2). Principal component-1 explained 86.6% of the variance. Figure 3a compares SERS spectra of DNA extracted from unmutated and mutated-I A. niger fungus. The unmutated A. niger fungus DNA spectra are clustered as yellow dots on the negative axis of PC-1, while those of mutated-I, with more enzyme production, are clustered as cyan dots on the positive axis of PC-1. These SERS spectra were therefore differentiated by PCA.
The PC-1 loadings of the SERS spectra of A. niger are shown in Figure 3b, which illustrates the differences among the samples. These PCA loadings describe the variables used for the differentiation of the SERS spectra. The loadings are both positive and negative and represent the differentiating SERS bands of DNA extracted from mutated-I and unmutated samples. The variations in the PCA loading plot of the A. niger samples are similar to the previously identified SERS features. Notably, for a better understanding of the PCA loadings responsible for the differentiation of the spectra, pairwise PCA scatter analysis is preferred, as it relates the differentiation to the biochemical changes. In pairwise PCA, the two groups of spectra are separately clustered on the positive and negative axes of PC-1, so that the positive and negative loadings can each be attributed to one group.
The characteristic SERS features were further confirmed by observing the positive loadings at 567 cm−1 (glycosidic bond), 670 cm−1 (thymine), 702 cm−1 (phosphodiester bond), 1092 cm−1 (asymmetric stretching of glycosidic link), 1314 cm−1 (P=O bond), 1393 cm−1 (C-H rocking), and 1442 cm−1 (C-H bending), which are all related to DNA. These features were observed as positive loadings (above the zero line) and associated with SERS spectra of the mutated-I DNA of A. niger that produced more enzyme. Furthermore, differentiating SERS features observed as negative loadings included 650 cm−1 (guanine), 686 cm−1 (C-H in-plane bending), 727 cm−1 (adenine), 788 cm−1 (phosphodiester bond in DNA), 999 cm−1 (C=CH deformation), and 1072 cm−1 (C-H bond deformation). All of these features were associated with DNA extracted from unmutated A. niger.

Partial least square discriminant analysis (PLS-DA)
Although PCA visually differentiated SERS data sets of all A. niger samples, it did not provide quantitative information. To overcome this issue, PLS-DA was applied to the SERS spectra. An advantage of this supervised model is that biases may be avoided by compiling all spectra into a single matrix. Forty-five spectra were used as a training set. The unmutated and mutated DNA samples were split independently into two sets, with 60% used for calibration and 40% for validation. Eighteen optimal LVs were used to cross-validate the model. The fungus DNA was clearly differentiated based on the score plot (Figure 4). The model presented a sensitivity of 0.97, a specificity of 0.98, and an accuracy of 100%, which corroborates the suitability of the PLS-DA model.
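For reference, the 60/40 calibration/validation split and the definitions of sensitivity, specificity, and accuracy can be sketched as below. This is an illustrative Python/NumPy example under stated assumptions: the per-class split, the random seed, the class labels (15 spectra in each of three classes, 45 in total), and the small prediction vector are all hypothetical and do not reproduce the study's data or results.

```python
import numpy as np


def stratified_split(labels, calib_fraction=0.6, seed=0):
    """Split indices per class into calibration and validation sets."""
    rng = np.random.default_rng(seed)
    calib, valid = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_cal = int(round(calib_fraction * idx.size))
        calib.append(idx[:n_cal])
        valid.append(idx[n_cal:])
    return np.sort(np.concatenate(calib)), np.sort(np.concatenate(valid))


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for labels coded 1 (positive) / 0."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / y_true.size,
    }


# Hypothetical labels: 0 = unmutated, 1 = mutated-I, 2 = mutated-II.
labels = np.repeat([0, 1, 2], 15)
calib_idx, valid_idx = stratified_split(labels)
print(len(calib_idx), "calibration spectra,", len(valid_idx), "validation spectra")

# Hypothetical predictions for five validation spectra (1 = mutated).
print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 0]))
```

All three figures of merit should be computed on the validation set only, since values computed on calibration spectra tend to be optimistic.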
Moreover, PLS-DA was used to discriminate unmutated and mutated A. niger fungus. PLS-DA is a supervised predictive modeling technique for classification. It proceeds in two stages: the first (PLS) stage performs dimension reduction, whereas the second builds the prediction model. PLS-DA encodes the categorical class variable numerically (i.e., +1 and 0) and then computes latent variables (LVs) that exploit the covariance between the spectra and the encoded classes. Theoretically, PLS-DA combines discriminant analysis and dimensionality reduction into a single approach that is particularly useful for modeling high-dimensional (HD) data. Additionally, PLS-DA is more adaptable than other discriminant algorithms, such as Fisher's linear discriminant analysis, because it does not presuppose that the data fit a specific distribution. The performance of the model and the potential of the SERS method were evaluated using the receiver operating characteristic (ROC) curve for the classification of SERS data sets.
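The two stages described above, dimension reduction onto latent variables followed by prediction of the encoded class, can be sketched with a single-response PLS fitted by the NIPALS algorithm. The code below is an illustrative Python/NumPy sketch, not the model used in the study: the synthetic spectra, the choice of two latent variables, and the 0.5 decision threshold are assumptions made for the example.

```python
import numpy as np


def plsda_fit(X, y, n_lv=2):
    """Fit a two-class PLS-DA model (PLS1 via NIPALS) with classes coded 0/1."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_lv):
        w = E.T @ f                    # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                      # scores on this latent variable
        tt = t @ t
        p = E.T @ t / tt               # X loadings
        c = f @ t / tt                 # regression of y on the scores
        E = E - np.outer(t, p)         # deflate X
        f = f - c * t                  # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in spectral space
    return coef, x_mean, y_mean


def plsda_predict(X, model, threshold=0.5):
    """Return predicted class (0/1) and the continuous PLS response."""
    coef, x_mean, y_mean = model
    y_hat = (X - x_mean) @ coef + y_mean
    return (y_hat >= threshold).astype(int), y_hat


def make_spectra(rng, n_per_class=15, n_points=200):
    """Synthetic spectra (not study data): class 1 carries one extra band."""
    shift = np.linspace(500, 1600, n_points)
    band = np.exp(-0.5 * ((shift - 1314) / 25) ** 2)
    X0 = rng.normal(0.0, 0.05, (n_per_class, n_points))
    X1 = band + rng.normal(0.0, 0.05, (n_per_class, n_points))
    y = np.repeat([0.0, 1.0], n_per_class)
    return np.vstack([X0, X1]), y


rng = np.random.default_rng(0)
X_cal, y_cal = make_spectra(rng)
X_val, y_val = make_spectra(rng)
model = plsda_fit(X_cal, y_cal, n_lv=2)
pred_val, y_hat_val = plsda_predict(X_val, model)
print("Validation accuracy:", np.mean(pred_val == y_val))
```

The number of latent variables is the main tuning parameter; in practice it is usually chosen by cross-validation on the calibration set rather than fixed in advance as in this sketch.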
Lastly, the receiver operating characteristic (ROC) curve was prepared to confirm the efficiency of the PLS-DA model for the classification of the unmutated and mutated fungal DNA. The area under the ROC curve (AUROC) was 0.93 (Figure 5), indicating excellent efficiency. A value of one corresponds to perfect discrimination. Thus, the PLS-DA model showed high accuracy for the discrimination of unmutated and mutated A. niger DNA.
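The area under the ROC curve can be computed directly from the continuous model outputs as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The following Python/NumPy sketch illustrates this on a small hypothetical set of scores; the numbers are invented for the example and are unrelated to Figure 5.

```python
import numpy as np


def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    y_true, scores = np.asarray(y_true), np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]          # every positive/negative pair
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (pos.size * neg.size)


# Hypothetical class labels (1 = mutated) and continuous model outputs.
labels = [0, 0, 1, 1]
outputs = [0.10, 0.40, 0.35, 0.80]
print("AUROC:", auroc(labels, outputs))   # 3 of 4 pairs ranked correctly -> 0.75
```

An AUROC of 0.5 corresponds to chance-level ranking, and 1.0 is reached only when every positive sample scores above every negative sample.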

Conclusions
A rapid, reliable, and sensitive SERS method employing Ag-NPs as the substrate has been established to characterize mutated and unmutated A. niger DNA. The method provides information about the biochemical components, which allows the identification of mutations. CRISPR-Cas9 was used to induce site-specific mutations in the fungal DNA for enhanced production of cellulases. The results indicated that changes at 567 cm−1, 951 cm−1, 1300 cm−1, 1314 cm−1, 1393 cm−1, and 1442 cm−1 were associated with DNA mutations. Mutated and unmutated DNA were differentiated by PCA, in which the discrimination was illustrated by cluster formation. A PLS-DA model was employed to discriminate and identify the mutated and unmutated A. niger with 100% accuracy. These results open novel avenues for the characterization of industrially-relevant biological samples and may provide new opportunities in biotechnology and microorganism engineering.

Figure 1. Mean SERS spectra of DNA extracted from A. niger fungus: unmutated, mutated-I with higher enzyme production, and mutated-II with lower enzyme production. The solid lines represent characteristic features that differentiate specific groups, while the dotted lines represent intensity-based differences among the groups.

Figure 3. Pairwise PCA: (a) scatter plot and (b) loading of SERS spectra of DNA extracted from unmutated and mutated-I A. niger.

Figure 4. PLS-DA score plot for the test data of DNA extracted from unmutated, mutated-I, and mutated-II A. niger.

Figure 5. Receiver operating characteristic (ROC) curve to evaluate the performance of the model for DNA extracted from unmutated and mutated A. niger.

Table 1. SERS spectral features of A. niger fungus with identification using the literature.