Molecular-level exploration of properties of dissolved organic matter in natural and engineered water systems: A critical review of FTICR-MS application

Abstract Dissolved organic matter (DOM) contains complex molecular compounds that dominate its heterogeneous dynamics and behaviors in aquatic environments. Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) with ultra-high resolution has proven effective in characterizing aquatic DOM. However, a systematic summary of the molecular-level compositions and behaviors of DOM in natural and engineered water systems is still lacking. This study provides a critical review of DOM characterization by FTICR-MS, with emphasis on composition diversity, chemical properties, transformation, and dynamics in natural and engineered water systems. First, FTICR-MS strategies for DOM characterization are introduced, covering data interpretation and collaborative analysis of complementary datasets (e.g. spectroscopic data). Second, DOM characteristics in natural water environments, including spatiotemporal distribution, photochemical activity, microbial modification, and interface adsorption, are comprehensively summarized based on current FTICR-MS findings. Third, DOM molecular changes caused by different engineered treatment methods are reviewed to highlight molecular variation, reaction, and transformation, focusing on FTICR-MS results. Finally, we summarize the current limitations, biases, and future directions of FTICR-MS, as well as future extended studies of the behavior of natural and engineered-derived DOM. This review of FTICR-MS applications provides useful strategies for understanding the molecular chemistry and behaviors of aquatic DOM.


Introduction
Dissolved organic matter (DOM) is a complex and heterogeneous mixture of diverse molecular compounds that is widely distributed in natural and engineered water systems (Berg et al., 2019; Wu, Tanoue, et al., 2003). In natural water environments, the photo-reactivity, microbial availability, and interface fractionation of DOM are closely related to its molecular composition, resulting in different distribution characteristics with depth (Lv, Zhang, Wang, et al., 2016; Wu et al., 2018). In engineered water systems, DOM molecules undergo complex transformations during treatment processes, which affect the interaction between DOM and pollutants and the generation of by-products (Lavonen et al., 2015). Therefore, exploring the compositions, properties, and dynamics of DOM in natural and engineered water systems is of great significance for understanding its geochemical behaviors and important roles in these systems.
Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS), with its ultra-high mass resolution and mass accuracy, has been developed to identify DOM molecular compositions in various water environments (Schmidt et al., 2009). However, understanding the molecular and chemical properties of DOM remains a challenge without detailed interpretation of the original FTICR-MS datasets. Visualization and index-based interpretation [e.g. double-bond equivalents (DBE), aromaticity index (AI), and nominal oxidation state of carbon (NOSC)] of FTICR-MS molecular formulas further describe the composition, unsaturation, and aromaticity of DOM (Hockaday et al., 2006; Kim et al., 2003; Koch & Dittmar, 2006). More effort is still required to improve the discussion of FTICR-MS results, for example through uniform molecular classification and index fusion (Hockaday et al., 2009).
FTICR-MS combined with a variety of complementary techniques, such as liquid chromatography (LC), fluorescence excitation-emission matrix (EEM) spectroscopy, nuclear magnetic resonance (NMR), and gas chromatography-mass spectrometry (GC-MS), can comprehensively characterize DOM molecular features (McKee & Hatcher, 2010; Minor et al., 2014; Wu, Evans, et al., 2003). A combination of FTICR-MS and EEM spectroscopy can determine correlations between DOM fluorescent components and molecular compounds (Herzsprung et al., 2012). NMR and GC-MS analyses combined with FTICR-MS can provide complementary molecular information on DOM structures (Hertkorn et al., 2013; McKee & Hatcher, 2010). Correspondingly, statistical methods can link these heterogeneous datasets for an in-depth understanding of the structural and chemical characteristics of DOM (Sleighter et al., 2010). Notably, processing the huge FTICR-MS datasets together with supplementary data still requires more advanced statistical methods to improve the interpretation of the compositions, properties, and dynamics of DOM in aquatic systems.
In natural environments, the complex and diverse molecular properties of DOM have been determined through FTICR-MS, but a systematic summary explaining the molecular changes of DOM induced by different chemical reactions is still lacking. FTICR-MS has been used to understand the detailed molecular dynamics of DOM during photoreaction, microbial modification, and adsorption reactions. DOM molecules with high aromaticity are considered to be easily degraded under sunlight (Gonsior et al., 2009). The microbial community and its activities in natural water environments are important for the structural reworking of DOM. In addition, the adsorption of DOM molecules on minerals and nanomaterials results in molecular fractionation of DOM, which can alter their distribution at the water-sediment interface (Wu & Tanoue, 2002). Although FTICR-MS has advanced the understanding of DOM behavior in natural environments, the spatiotemporal distribution, photochemical activity, microbial modification, and interface adsorption of DOM molecules still need to be systematically reviewed.
In engineered water systems, the molecular diversity and complex transformation of DOM in various treatment processes have also been successfully investigated using FTICR-MS datasets (Tang et al., 2021). DOM from municipal and industrial wastewaters shows clear differences in the unsaturation degree and molecular weights of its molecular compounds. Moreover, FTICR-MS can track the dynamic molecular changes of DOM during different treatment processes, which provides molecular information for understanding the interaction between DOM molecules and engineered pollutants (Maizel & Remucal, 2017a). Traditional and advanced treatment techniques affect the molecular fractionation, blocking, and oxidation of DOM, which are important in evaluating DOM removal efficiency during different treatment processes (Gonsior et al., 2011). However, information on the molecular diversity and heterogeneous changes of DOM caused by different engineered treatments is still insufficient because of the complexity of engineered water systems. Therefore, a critical review of the molecular diversity, transformation, and dynamic changes of DOM in different treatment stages is necessary for understanding the important roles of DOM in engineered systems.
Although several relevant reviews have summarized the ionization methods and data utilization of FTICR-MS for DOM characterization, this study makes several distinct contributions to DOM research. Zhang et al. (2020) and Bahureksa et al. (2021) reviewed the sample preparation, ionization methods, and data exploitation of FTICR-MS used on its own. However, complementary characterization methods for FTICR-MS have not been adequately summarized. Therefore, the recent advances and application advantages of FTICR-MS and its complementary characterization methods are summarized and discussed in this review, which can guide researchers in choosing the most appropriate technical solutions for the comprehensive characterization of DOM molecules. Shi et al. (2021) and Anaraki et al. (2021) reviewed the identification of thousands of DOM compounds in wastewater and drinking water using FTICR-MS. However, a summary of FTICR-MS applications to the molecular characteristics, dynamics, and transformations of DOM in the different processing stages of engineered systems is still lacking. Therefore, another important contribution of this review is its summary of the characteristics and transformation of DOM in engineered water systems, which provides valuable information for assessing DOM removal and optimizing treatment technology. Cooper et al. (2022) and Qi et al. (2022) reviewed FTICR-MS developments in identifying the molecules of complex DOM mixtures in natural waters. This review summarizes more comprehensively the molecular compositions, properties, distribution, and dynamics of DOM in natural waters, deepening the understanding of the complexity and diversity of DOM properties.
This review based on the recent FTICR-MS findings of DOM aims to (1) systematically introduce FTICR-MS strategies for the characterization of DOM with an emphasis on DOM data interpretation and collaborative statistical analysis with supplementary datasets; (2) comprehensively summarize the distribution and reactions of DOM in natural water environments focusing on the spatiotemporal distribution, photochemical activity, microbial modification, and interface adsorption at the molecular level using FTICR-MS; (3) fully review the molecular dynamic changes of DOM in engineered water systems with highlights of molecular variation, reaction, and transformation under different treatment processes based on the FTICR-MS datasets; and (4) highlight the current limitations and future directions of FTICR-MS applications, and the extended work on the heterogeneous behavior of DOM in natural/engineered water at the molecular level.

Principles, advantages, and functions of FTICR-MS
The core principles of FTICR-MS are to measure the m/z values of ions and to obtain the mass spectrum by plotting signal intensity against m/z. Specifically, after the ionization of DOM, the ions are transferred into a cyclotron cell with a uniform magnetic field (B) and an electric field. An ion of mass m and charge q is subjected to a Lorentz force in the radial direction and undergoes cyclotron motion, whose frequency ω_c is described by Equation (1) (Marshall et al., 1998). A trapping electric field confines the ions in the axial direction so that they remain within a defined space for frequency measurement. In the presence of the electric field, the center of the ion cyclotron motion shifts and the ions oscillate axially, parallel to the magnetic field, with a frequency ω_z, resulting in a reduced cyclotron frequency (Equation (2)) and a magnetron frequency (Equation (3)) (Cho et al., 2015). The resolution of FTICR-MS is described by the resolving power at full width at half maximum (m/Δm_50) (Equation (4)), where T is the data acquisition time for a given time-domain signal and C is a constant (Qi & O'Connor, 2014).
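As a rough illustration of the cyclotron relation cited as Equation (1), the sketch below evaluates the unperturbed cyclotron frequency in its standard textbook form, f_c = qB/(2πm). The equation itself is not reproduced in this text, so the form used here is an assumption, as are the 9.4 T field strength, the chosen m/z values, and the function name, which are for illustration only.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs (exact SI value)
DALTON_KG = 1.66053906660e-27          # kilograms per dalton (CODATA 2018)


def cyclotron_frequency_hz(mz, b_tesla):
    """Unperturbed cyclotron frequency f_c = qB / (2*pi*m), in hertz.

    mz is the mass-to-charge ratio in Da per elementary charge.
    Trapping-field effects (reduced cyclotron and magnetron motion)
    are deliberately ignored in this sketch.
    """
    if mz <= 0 or b_tesla <= 0:
        raise ValueError("mz and b_tesla must be positive")
    omega_c = ELEMENTARY_CHARGE_C * b_tesla / (mz * DALTON_KG)  # rad/s
    return omega_c / (2.0 * math.pi)


if __name__ == "__main__":
    # 9.4 T is an assumed, illustrative field strength.
    for mz in (200.0, 400.0, 800.0):
        f_khz = cyclotron_frequency_hz(mz, 9.4) / 1e3
        print(f"m/z {mz:6.1f} at 9.4 T: {f_khz:8.1f} kHz")
```

Because the frequency is inversely proportional to m/z, doubling m/z halves the frequency, which is why a precise frequency measurement translates directly into a precise m/z value.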
FTICR-MS has several powerful advantages and unique functions for deciphering the "black box" of aquatic DOM, helping researchers overcome technical challenges in DOM research. First, FTICR-MS has high resolution and mass accuracy, which allows peaks separated by extremely small mass differences to be resolved and overcomes the insufficient molecular information obtained by traditional low-resolution spectrometry. A low-resolution mass spectrum cannot separate the peaks of molecules whose masses differ by less than one mass unit (Sleighter & Hatcher, 2007). In contrast, the mass resolving power of FTICR-MS can exceed 30000, and the error in the m/z value of a single DOM molecule can be less than 0.5 ppm (Qi et al., 2022). Mass differences of a few millidaltons can be distinguished by FTICR-MS, which improves the ability to assign unique DOM molecules (Hsu et al., 2011). Second, electrospray ionization (ESI) in FTICR-MS preserves molecular integrity as far as possible and simplifies the injection process, which allows aqueous solutions to be infused into the mass spectrometer and mass spectrometry to be coupled with liquid chromatography (Reemtsma, 2009). This avoids destroying or derivatizing DOM molecules and allows the direct analysis of complex DOM. Third, FTICR-MS reveals the composition and chemical reactions of DOM mixtures at the molecular level. This overcomes the technical challenge of identifying the molecular diversity in the composition and transformation of aquatic DOM during various environmental processes. The abundance of molecular formulas it provides makes FTICR-MS currently one of the few techniques able to observe the majority of individual DOM components (Minor et al., 2014). Compared with spectroscopy, specific FTICR-MS molecular formulas and indicators support a deeper understanding of DOM properties.
However, several drawbacks and limitations of FTICR-MS also need to be considered (see details in Section 6.1).
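The resolution and mass-accuracy figures discussed in this section can be made concrete with a small calculation. The sketch below is illustrative only: it uses the well-known C3 versus SH4 mass split (about 3.4 mDa, an example not taken from this review) to estimate the resolving power m/Δm needed to separate two such peaks, and it computes a mass error in ppm. The function names and the m/z of 400 are assumptions.

```python
# Monoisotopic masses in Da (standard values).
MASS = {"C": 12.0, "H": 1.00782503207, "S": 31.97207100}


def required_resolving_power(mz, delta_m):
    """Resolving power m/dm needed to separate two peaks dm apart at mz."""
    if delta_m <= 0:
        raise ValueError("delta_m must be positive")
    return mz / delta_m


def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Replacing C3 by SH4 changes the exact mass by only a few millidaltons.
delta_c3_sh4 = (MASS["S"] + 4 * MASS["H"]) - 3 * MASS["C"]

if __name__ == "__main__":
    print(f"C3 vs SH4 mass difference: {delta_c3_sh4 * 1e3:.3f} mDa")
    print(f"Resolving power needed at m/z 400: "
          f"{required_resolving_power(400.0, delta_c3_sh4):.0f}")
    print(f"Error for 400.0002 vs 400.0000: {ppm_error(400.0002, 400.0):.2f} ppm")
```

The required resolving power scales linearly with m/z, so pairs of formulas that are separable at low mass may overlap at higher mass for the same instrument settings.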

DOM pretreatment before FTICR-MS analysis
DOM samples need to be extracted from the environmental medium (e.g. water and sediment) and purified before FTICR-MS analysis because of their low concentration and the interference of inorganic ions (Mao et al., 2016). DOM in sediments can be extracted using neutral deionized water or alkaline water (Hur et al., 2014). Two common methods are used to enrich and purify DOM before FTICR-MS analysis. Specifically, XAD-resin adsorption can separate the hydrophobic and hydrophilic components of DOM by multiple cyclic adsorption/desorption treatments (Thurman & Malcolm, 1981). Solid phase extraction (SPE) relies on the selectivity of the adsorbent for DOM over inorganics (Supporting information). SPE enrichment using stepwise elution can divide DOM into several fractions according to chemical properties such as polarity, which improves FTICR-MS measurements. The number of confirmed DOM molecular formulas obtained using stepwise elution is 50% higher than that obtained using one-step elution. LC pretreatment before FTICR-MS helps to explore molecular characteristics in more depth by separating DOM into sub-fractions (Supporting information).

Molecular parameters of DOM compounds from FTICR-MS
The molecular compounds in DOM, classified on the van Krevelen diagram with axes of H/C and O/C ratios, mainly include lipid-like, carbohydrate-like, protein-like, lignin-like, tannin-like, and condensed aromatic molecules (Supporting Information, Figure 1a, Table 1). The modified AI (AI_mod) was proposed to correct the contribution of O atoms in the AI calculation (Table S1) (Koch & Dittmar, 2006). The van Krevelen diagram combined with AI_mod has been widely used to effectively identify molecular compounds with different aromatic degrees in DOM, which is closely related to the unsaturation degree of molecules indicated by DBE (Supporting Information) (Figure 1b, d). AI_mod > 0.67 is considered an unequivocal criterion for the existence of condensed aromatic structures, and AI_mod > 0.5 indicates the existence of aromatic structures (Kellerman et al., 2014). Highly unsaturated compounds are defined as those with AI_mod < 0.50 and H/C < 1.5 (Figure 1b), and mainly include degradation products of lignin (e.g. phenols) (Seidel et al., 2014). Moreover, the NOSC calculation facilitates the estimation of DOM oxidation using FTICR-MS molecular data (Figure 1e). The range of DOM compounds that can plausibly be decomposed becomes increasingly constrained under low redox conditions, and NOSC provides a thermodynamic reference for determining whether compounds can be decomposed (Boye et al., 2017). For example, protein-like compounds with low NOSC accumulate under anaerobic conditions, while carbohydrate-like and aromatic compounds with high NOSC are reduced (Boye et al., 2017). The Kendrick mass defect (KMD) provides insights into the chemical reactions (e.g. methylation/demethylation, hydrogenation/dehydrogenation, and oxidation/reduction) of compounds and their resultant products (Figure 1c) (Kim et al., 2003; Wilson et al., 2017).
A series of homologues (e.g. alkyl, carboxylate groups, and O homologues) align along horizontal lines when the KMD values (Table S1) of molecules are plotted against their integer mass or carbon number (Figure 1f) (Song et al., 2021; Waggoner et al., 2017). DOM molecular connections and transformation networks can be investigated by combining KMD with networking models, such as the data-processing software Cytoscape equipped with the MetaNetter plug-in. By connecting m/z values (treated as nodes) with edges in a network analysis, KMD-based interpretation has been used to reveal the common chemical transformations of DOM molecules with ocean depth (Figure 1g) (Longnecker & Kujawinski, 2016). Therefore, KMD is an important FTICR-MS parameter that provides valuable information on molecular compounds in homologous series, highlighting the chemical transformations of DOM in terms of composition and reactivity in various aquatic systems.
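The indices discussed in this section can be computed directly from an assigned molecular formula. The following is a minimal sketch under stated assumptions: the review's own definitions are in Table S1, which is not reproduced here, so the code uses commonly cited forms of DBE, AI_mod, NOSC, and the CH2-based KMD, and these forms should be treated as assumptions. Only the thresholds (AI_mod > 0.67, AI_mod > 0.5, and AI_mod < 0.50 with H/C < 1.5) come from the text above. All function names are hypothetical.

```python
import re

# Monoisotopic masses in Da (standard values).
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048,
        "O": 15.9949146196, "S": 31.97207100, "P": 30.97376163}
CH2_MASS = MONO["C"] + 2 * MONO["H"]


def parse_formula(formula):
    """Return element counts for a neutral formula such as 'C7H6O2'."""
    counts = dict.fromkeys(MONO, 0)
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in counts:
            raise ValueError(f"unsupported element: {element}")
        counts[element] += int(number) if number else 1
    if counts["C"] == 0:
        raise ValueError("formula must contain carbon")
    return counts


def exact_mass(c):
    return sum(MONO[el] * n for el, n in c.items())


def dbe(c):
    # Assumed form: DBE = 1 + C - H/2 + N/2 + P/2
    return 1 + c["C"] - c["H"] / 2 + c["N"] / 2 + c["P"] / 2


def ai_mod(c):
    # Assumed form of the modified aromaticity index (half of O counted).
    num = 1 + c["C"] - 0.5 * c["O"] - c["S"] - 0.5 * (c["N"] + c["P"] + c["H"])
    den = c["C"] - 0.5 * c["O"] - c["S"] - c["N"] - c["P"]
    return num / den if num > 0 and den > 0 else 0.0


def nosc(c):
    # Assumed form for a neutral molecule.
    electrons = (4 * c["C"] + c["H"] - 3 * c["N"] - 2 * c["O"]
                 + 5 * c["P"] - 2 * c["S"])
    return 4 - electrons / c["C"]


def kmd_ch2(mass):
    """CH2-based Kendrick mass defect (nominal Kendrick mass minus Kendrick mass)."""
    kendrick = mass * 14.0 / CH2_MASS
    return round(kendrick) - kendrick


def aromaticity_class(ai, h_to_c):
    """Classify using only the thresholds quoted in the text."""
    if ai > 0.67:
        return "condensed aromatic"
    if ai > 0.5:
        return "aromatic"
    if h_to_c < 1.5:
        return "highly unsaturated"
    return "other"


if __name__ == "__main__":
    for f in ("C7H6O2", "C8H8O2", "C18H36O2"):
        c = parse_formula(f)
        print(f, dbe(c), round(ai_mod(c), 3), round(nosc(c), 3),
              round(kmd_ch2(exact_mass(c)), 4),
              aromaticity_class(ai_mod(c), c["H"] / c["C"]))
```

Formulas that differ only by CH2 units, such as C7H6O2 and C8H8O2 here, share the same KMD, which is why homologous series fall on horizontal lines in a KMD plot.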

Fluorescence, NMR, and GC-MS methods
The types, advantages, and application aims of FTICR-MS and its complementary methods are summarized in Table S2, which helps in choosing an optimal method to characterize the chemical components of aquatic DOM (Figure 2a-d). The chemical components in DOM identified by fluorescence, NMR, and GC-MS methods can be found in Table 1; for example, protein-like compounds (H/C = 1.0-2.2; O/C = 0.1-0.67) are mainly composed of peptides with 3-6 amino acid residues (Hockaday et al., 2009). EEM-PARAFAC (parallel factor analysis) is advantageous for separating overlapping components in EEM detection, giving a clearer classification of individual protein-like, humic-like, and fulvic-like components (Herzsprung et al., 2012). The widely used EEM and EEM-PARAFAC approaches allow a meaningful statistical description but offer only limited insight into DOM composition, while FTICR-MS offers a wealth of qualitative molecular information with limited means to systematically elucidate the primary factors responsible for observed dynamics (Wünsch et al., 2018). Therefore, considerable analytical advances in combining fluorescence spectroscopy and FTICR-MS aid the understanding of structural and dynamic information on DOM (Figure 2b) (Stubbins et al., 2014). Moreover, fluorescent-molecular correlations can be established statistically using a combination of FTICR-MS and EEM-PARAFAC (Herzsprung et al., 2012; Wilske et al., 2020). According to the extensive correlation data between FTICR-MS formulas and EEM peaks of DOM, humic-like peaks are highly related to individual oxygen-rich/unsaturated molecules (Herzsprung et al., 2012). In addition, the contributions of molecules to fluorescent components can be further established based on the clearer component classification obtained from a combination of FTICR-MS and EEM-PARAFAC.
FTICR-MS formulas assigned to PARAFAC components of DOM are reported to represent 39% of the total number of formulas identified and 59% of total FTICR-MS peak intensities (Stubbins et al., 2014). In more detail, 32% of highly unsaturated molecules in DOM can contribute to the terrigenous, autochthonous, microbial humic-like components (Stubbins et al., 2014).
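The two percentages reported for formulas assigned to PARAFAC components differ because one is a share of the formula count and the other is a share of the summed peak intensity. The sketch below shows that distinction on a small, entirely hypothetical peak list; the function name and the data are assumptions for illustration and do not reproduce the cited results.

```python
def assigned_fractions(peaks):
    """Return (fraction by formula count, fraction by summed intensity).

    peaks: iterable of (formula, intensity, is_assigned) tuples, where
    is_assigned marks formulas linked to any PARAFAC component.
    """
    peaks = list(peaks)
    total_intensity = sum(intensity for _, intensity, _ in peaks)
    if not peaks or total_intensity <= 0:
        raise ValueError("need at least one peak with positive intensity")
    n_assigned = sum(1 for _, _, assigned in peaks if assigned)
    i_assigned = sum(intensity for _, intensity, assigned in peaks if assigned)
    return n_assigned / len(peaks), i_assigned / total_intensity


if __name__ == "__main__":
    # Hypothetical peak list: (formula, relative intensity, assigned?)
    toy_peaks = [
        ("C20H22O10", 30.0, True),
        ("C18H18O9", 30.0, True),
        ("C16H30O4", 20.0, False),
        ("C12H22O11", 10.0, False),
        ("C9H11NO4", 10.0, False),
    ]
    by_number, by_intensity = assigned_fractions(toy_peaks)
    print(f"assigned: {by_number:.0%} of formulas, {by_intensity:.0%} of intensity")
```

When the assigned formulas are also the more intense peaks, the intensity-based share exceeds the number-based share, as in the pattern reported above.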
One-dimensional NMR spectroscopy coupled with FTICR-MS has been used to obtain complementary information about molecular structures, isomers, and functional groups by combining the resonance of active nuclei with molecular mass information (Figure 2b) (Minor et al., 2014). Hertkorn et al. (2006) reported that the condensed alicyclic six-/five-membered rings or completely substituted aromatic rings determined by NMR strongly represented the common features of carboxyl-rich alicyclic molecules (CRAM). The high resolution of two-dimensional NMR further reduces the interference caused by the magnetization loss of complex compounds during fast relaxation, which is favorable for characterizing DOM molecules in combination with FTICR-MS (Hertkorn et al., 2013). Moreover, GC-MS provides complementary information to FTICR-MS by detecting typical DOM structures with low molecular weight and polarity. GC-MS combined with FTICR-MS can also achieve simultaneous identification of both nonpolar and polar components in DOM (Wang et al., 2014). The identification of ion fragments of DOM through GC-MS and FTICR-MS confirmed that the corresponding CxHyNO homologues were long-chain alkyl amide compounds (McKee & Hatcher, 2010). The polar and non-polar/weakly polar organic compounds in acrylic fiber wastewater have been distinguished by GC-MS and FTICR-MS, with the suggestion that NSO heterocycles may be related to poor biological treatment performance (Wang et al., 2014).

Statistical methods for data interpretation
Statistical techniques, including rank correlation, clustering analysis, and data fusion models, are needed to deal with the multiple datasets produced by FTICR-MS and other supplementary methods (Figure 2c) (Sleighter et al., 2010). Rank correlations between fluorescence and MS parameters of DOM show that coagulation processes selectively remove DOM with abundant O-bearing functional groups (Lavonen et al., 2015). Moreover, clustering methods, including principal component analysis (PCA) and hierarchical clustering analysis (HCA), are used to analyze differences in DOM components based on dimension reduction and summarization of the main features (Figure 2c). The relationships between groundwater DOM variables and FTICR-MS molecules established by PCA indicate that groundwater DOM in high-rainfall areas shows higher molecular weight and aromaticity than DOM in semi-arid areas. Furthermore, a more advanced method, advanced coupled matrix and tensor factorization (ACMTF), has been developed to interpret heterogeneous DOM datasets (Figure 2c) (Wünsch et al., 2018). The ACMTF model provides more intuitive chemical results for DOM molecules by imposing non-negativity constraints and the degree of molecular formula loadings. Wünsch et al. (2018) used the ACMTF model to establish the connection between the molecular formulas and fluorescence information of DOM and described the attributes of ACMTF components, including Stokes shift, O/C, H/C, m/z, and DBE. The outcomes obtained through ACMTF are more specific than those of traditional correlation analysis, which helps in yielding broader insight into DOM composition.
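Of the statistical techniques named in this section, rank correlation is the simplest to illustrate. The sketch below computes a Spearman rank correlation (Pearson correlation of average ranks) between the intensity of one molecular formula and the score of one fluorescence component across a set of samples. It is a minimal standard-library sketch, not the procedure of any cited study, and the sample values and function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data for five samples.
    formula_intensity = [0.8, 1.9, 2.4, 3.1, 4.7]   # one FTICR-MS formula
    component_score = [0.10, 0.22, 0.20, 0.35, 0.50]  # one fluorescence component
    print(f"Spearman rho = {spearman(formula_intensity, component_score):.2f}")
```

In practice such a coefficient would be computed for every formula and component pair, and the resulting many comparisons would need a multiple-testing correction before individual correlations are interpreted.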

Heterogeneous distribution of surface water DOM
High-resolution FTICR-MS datasets show that DOM molecules are heterogeneously distributed with depth in natural water environments. The distribution of DOM molecules is affected by various biogeochemical factors at different depths (Figure 3a). Abrupt changes in DOM indicators, which reflect marked changes in molecular properties with lake depth, are attributed to key chemical drivers at the chemocline, including microbial degradation and photodegradation of DOM. NOSC, AI_mod, and O/C are relatively stable above the lake chemocline but then rapidly decrease to their minimum values, indicating clear decreases in the aromaticity and oxidation degree of DOM (Butturini et al., 2020). Owing to the adsorption/desorption of secondary minerals discharged by human activities, the chemocline can be dominated by alternating O-containing DOM molecules (She et al., 2021). Gonsior et al. (2013) found that seasonal water column turnover caused deep DOM to flow into surface water, where it was then photobleached. The spatial characteristics of river DOM are reflected in the diversity, transport, and transformation of DOM molecules. More refractory DOM molecules can be retained and diluted during river transport (Seidel et al., 2015). The diversity of DOM molecules in aquatic systems (e.g. rivers and lakes) is significantly influenced by anthropogenic land use (Zhang et al., 2021). Anthropogenic land use encompasses many different types of land-use change, such as agricultural activities, tree cover reduction, urbanization, canalization, and lake reclamation (Roebuck et al., 2020; Ye et al., 2022; Ye et al., 2019). Ye et al. (2022) reported that lake reclamation in the Yangtze Plain in China resulted in more lipid-like compounds and condensed aromatic hydrocarbons in DOM, possibly because of increased microbial output and hydroxyl radical-initiated oxidation.
Anthropogenic land use, such as the conversion of natural landscapes for urban and agricultural practices, can reduce the molecular diversity and increase the heteroatomic abundance of DOM in headwater systems (Wagner et al., 2015; Zhang et al., 2021).

Molecular properties, distribution and transformation of sediment DOM
Sediment DOM molecules determined by FTICR-MS are rich in highly aromatic and N-/S-containing compounds because of the biotic/abiotic degradation and sulfurization of organic matter (Figure 3a) (Tremblay et al., 2007). Valle et al. (2020) found that sediment DOM in Swedish boreal lakes exhibited higher DBE/C (0.5-0.6) than water column DOM (DBE/C = 0.4) owing to the anoxic limitations in the sediment. Sediment DOM in surface water and in pore water shows different contents of aromatic molecules, which is associated with changes in redox conditions. Riedel et al. (2013) reported that sediment DOM in pore water had a higher proportion (35%) of aromatic molecular formulas than DOM in surface water (28%).
Sediment DOM molecules also show a depth-dependent distribution, which is attributed to microbial activity, mineral sorption, and human activities (Figure 3a). The relative abundance of lipid-like compounds in sediment DOM is reported to increase with depth, while the abundances of carbohydrate/lignin-like compounds and condensed aromatics show the opposite trend (Xu et al., 2016). Microbial activity can transform aromatic/unsaturated CHO compounds in sediment DOM into saturated compounds as sediment depth increases (Oni et al., 2015). DOM in surface sediments was reported to have higher aromaticity and NOSC than DOM in deep sediments, which is related to the significant anthropogenic contribution of surfactants and black carbon. Moreover, FTICR-MS has provided molecular evidence for the transformation mechanisms of sediment DOM during abiotic sulfurization and under different hydraulic conditions (Abdulla et al., 2020; Valle et al., 2018). For example, the reported relationships between DOM conversion and water residence time showed that CRAM increased with incubation time during anoxic incubation of sediment DOM from small lakes, while lignin-/tannin-like molecules were enriched in sediment DOM from large lakes (Valle et al., 2018).

Photochemistry response of DOM molecular characteristics
Sunlight irradiation is a driving force for the photo-transformation of DOM, whose photo-reactivity is mainly reflected in the transformation between molecular compounds with different aromatic degrees (Figure 3b) (Gonsior et al., 2009). Shifts from unsaturated/aromatic DOM compounds toward more saturated compounds have been observed, which suggests that aromatic compounds are highly photo-reactive (Gonsior et al., 2009). Aromatic DOM compounds are also reported to correlate with the high photochemical activity of humic-like fluorescent components (Stubbins et al., 2014). Moreover, further photochemical properties of DOM have been successfully investigated based on molecular parameters (Gonsior et al., 2009; Stubbins et al., 2010). Stubbins et al. (2010) reported that the number of molecular formulas decreased from 2421 to 993 after irradiation, accompanied by the generation of new lipid-like molecules. Hydrophobic DOM was reported to transform into transphilic DOM after irradiation, with m/z decreasing from 420 to 390 and DBE from 11 to 9.7 (Niu et al., 2019).
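Comparisons of formula lists before and after irradiation, such as those summarized above, reduce to set operations on assigned molecular formulas. The sketch below is a minimal illustration with hypothetical formula lists and a hypothetical function name; it does not reproduce any cited dataset.

```python
def compare_formula_sets(before, after):
    """Split formulas into removed, produced, and persistent groups."""
    before, after = set(before), set(after)
    return {
        "removed": before - after,      # detected only before irradiation
        "produced": after - before,     # detected only after irradiation
        "persistent": before & after,   # detected in both samples
    }


if __name__ == "__main__":
    # Hypothetical assigned formulas for a dark control and an irradiated sample.
    dark = ["C20H18O10", "C18H16O9", "C16H14O8", "C15H24O6"]
    irradiated = ["C15H24O6", "C16H32O2", "C18H36O2"]
    groups = compare_formula_sets(dark, irradiated)
    for name in ("removed", "produced", "persistent"):
        print(f"{name:10s} {len(groups[name])}: {sorted(groups[name])}")
```

Presence/absence comparisons of this kind depend on the detection threshold, so formulas near the noise level can appear as removed or produced without any real change in the sample.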
The molecular complexity of aquatic DOM significantly affects the formation of photochemical intermediates (Figure 3b). Previous studies reported that the quantum yields of triplet-state DOM (³DOM*) and singlet oxygen (¹O₂) were negatively correlated with molecular weight and positively correlated with saturated molecular formulas (Berg et al., 2019; Maizel & Remucal, 2017b). Correspondingly, the generation of different reactive oxygen species (ROS) affects DOM molecular composition. For example, the presence of ¹O₂ and O₂•⁻ might cause the removal of molecules with O/C > 0.3 and O/C < 0.3, respectively, in lignin-derived DOM (Waggoner et al., 2017). Furthermore, the photochemical variations of DOM partly explain its heterogeneous distribution in natural water environments. Aliphatic compounds in lake/sea DOM are highly abundant in surface water, while aromatic compounds are highly abundant in deep water (Gonsior et al., 2013; Maizel et al., 2017). Prolonged sunlight irradiation of river DOM decreases unsaturated/aromatic compounds and increases aliphatic compounds (Berg et al., 2019; Medeiros et al., 2015). The photostability of lignin-like compounds can help in tracking the movement of terrigenous DOM through rivers to the ocean.

DOM molecular modification by microbial activities
Microbial activities play an important role in the degradation of DOM molecular compounds through uptake and transformation pathways. Microbial activities can preferentially degrade O-/N-containing DOM molecular compounds with low aromatic degrees, leading to the production of lipid-like compounds (Figure 3c) (Bai et al., 2017). CRAM/lignin-like compounds in DOM are usually considered to be the main biostable substances (Bai et al., 2017; Kim et al., 2006). More N-/S-/P-containing molecules are observed in microbially reworked DOM, which indicates that microbial activities are an important source of heteroatomic molecules (Figure 3c) (Antony et al., 2017). Meanwhile, DOM molecular composition can interact with the microbial community, resulting in changes in both DOM composition and microbial species (Figure 3c). Wu et al. (2018) found that the primary orders, such as Burkholderiales and Rhodobacterales, were affected by tannin-like/protein-like compounds, and that the increasing abundance of Cupriavidus and Luteimonas was associated with lipid-like/lignin-like compounds.
The changes in DOM molecular characteristics caused by microbial activities also have profound impacts on DOM distribution in natural water environments (Figure 3c). Butturini et al. (2020) found that lake DOM with saturated aliphatic groups increased abruptly in the chemocline and at the bottom, which was attributed to the release of microbial-derived DOM and the degradation of cyanobacteria/algae detritus at the sediment-water interface. The accumulation of microbial-resistant compounds in DOM in lakes with high salinity led to a high proportion of tannin-like compounds with high O/C ratios. Moreover, the different bioavailabilities of DOM molecules lead to different distribution and migration characteristics in rivers. Microbial degradation has limited effects on both the high-concentration, low-oxidation-state DOM in surface rivers and the low-concentration DOM in groundwater (Stegen et al., 2018).

Molecular fractionation by adsorption
The selective adsorption of DOM molecules by minerals or nanomaterials in natural water environments causes molecular fractionation, and the associated molecular changes and fractionation mechanisms of DOM have been successfully revealed by FTICR-MS (Figure 3d). Iron/aluminum minerals are believed to readily adsorb high molecular weight, unsaturated, or O-rich compounds in DOM (Galindo & Del Nero, 2014). Aluminum oxide exhibits high surface affinity for aromatic DOM compounds with multiple oxidizing groups and for aliphatic compounds (Galindo & Del Nero, 2014). The preferential adsorption of polyphenols and O-containing polycyclic aromatic hydrocarbons by ferrihydrite results in more DOM molecules with low DBE and O/C remaining in the supernatant (Lv, Zhang, Wang, et al., 2016).
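One common way to summarize the fractionation described above is to compare intensity-weighted average indices of the initial DOM with those of the supernatant after adsorption. The sketch below assumes that approach; the peak values are hypothetical and the function names are illustrative, so it should be read as an outline of the calculation rather than the method of any cited study.

```python
def weighted_mean(values, intensities):
    """Intensity-weighted mean of a molecular index."""
    total = sum(intensities)
    if len(values) != len(intensities) or total <= 0:
        raise ValueError("need matching lists with positive total intensity")
    return sum(v * w for v, w in zip(values, intensities)) / total


def summarize(peaks):
    """peaks: list of (DBE, O/C, intensity) tuples for one sample."""
    dbe_values = [p[0] for p in peaks]
    oc_values = [p[1] for p in peaks]
    weights = [p[2] for p in peaks]
    return {"DBE_w": weighted_mean(dbe_values, weights),
            "O/C_w": weighted_mean(oc_values, weights)}


if __name__ == "__main__":
    # Hypothetical peaks: (DBE, O/C, intensity)
    initial = [(12, 0.6, 50.0), (6, 0.3, 50.0)]
    supernatant = [(12, 0.6, 10.0), (6, 0.3, 40.0)]
    print("initial    ", summarize(initial))
    print("supernatant", summarize(supernatant))
```

In this toy case the weighted DBE and O/C both drop in the supernatant, which is the direction expected when high-DBE, O-rich molecules are preferentially adsorbed.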
The adsorption and fractionation of DOM by minerals or nanomaterials are affected by DOM properties and various environmental factors (Figure 3d). The molecular acidity influences the adsorption and fractionation of DOM molecules by aluminum oxide (Galindo & Del Nero, 2014). Liu et al. (2019) reported that both Ca and Cu background electrolytes reduced the fractionation of specific DOM molecules, such as highly unsaturated phenolic molecules with high molecular weights. Meanwhile, the surface complexing ability of adsorbent is also a key factor in the adsorption and fractionation of DOM molecules (Figure 3d). The chemical complexation of Ag nanomaterial (AgNM) can selectively adsorb CHOS compounds with low molecular weights and saturation degrees (Baalousha et al., 2018). FTICR-MS data combined with exploration of the surface OH groups reveal that singly coordinated -OH groups can form ligand-exchange complexes with DOM molecules on hematite surfaces (Lv, Miao, et al., 2018). This process preferentially adsorbs molecules with high oxidation states and aromaticity, inducing the selective fractionation of DOM molecules at the mineral-water interface (Lv, Miao, et al., 2018).

5. Molecular compositions, dynamics, and transformations of DOM in the engineered water systems

Molecular properties of wastewater and drinking water DOM
The molecular properties of wastewater-derived DOM differ markedly owing to their various sources. DOM in municipal wastewater contains abundant CHO, CHOS, and CHON molecules and a small amount of P-/N-/Cl-containing molecules (Ye et al., 2019). The CHOCl molecules found in DOM at all stages of municipal wastewater plants can be attributed to the presence of pharmaceuticals and pesticides (Maizel & Remucal, 2017a). Target and suspect screening analysis based on FTICR-MS datasets of DOM can identify antibiotics and prohibited drugs in municipal wastewaters (Figure 4a) (Perkons et al., 2021). Moreover, the molecular composition of industrial wastewater DOM varies by industrial facility type. DOM compounds in coal liquefaction and refinery wastewaters generally contain a high number of bio-refractory formulas with N and S (Ye et al., 2020). The refinery wastewater DOM generally exhibits the lowest unsaturation degree due to the naphthenic compounds flowing into the wastewater (Li et al., 2015). The low H/C values of pharmaceutical wastewater DOM are partially due to the presence of personal care products or intermediates (Hu et al., 2017). Furthermore, effluent DOM from wastewater plants differs in molecular composition from natural water DOM. The effluent DOM contains more CHOS molecular formulas than surface water DOM because anthropogenic surfactants lead to poor removal of CHOS compounds (Gonsior et al., 2011). The N-/P-containing compounds with high aromatic degrees are retained in the effluents of wastewater plants, leading to a high biological activity of effluent DOM (Gao et al., 2021; Mesfioui et al., 2012).
DOM compositions derived from drinking water are closely related to the water sources and are mainly dominated by CHO molecules (Ye et al., 2019). Wang et al. (2017) characterized the DOM molecules in 20 drinking water sources in China and found very low contents of CHON and CHOS molecules compared to the abundant CHO molecules. Hydrological conditions and human activities are the key factors affecting DOM compositions in drinking water sources (Zhou et al., 2020). Moreover, the DOM molecules in drinking water plants show abundant lignin-/tannin-like compounds and the production of chlorinated compounds with carboxylic and phenolic groups (Zhang et al., 2012). Different DOM molecules respond differently to drinking water treatments. Raeke et al. (2017) found that the DOM molecules in drinking water from forested sources exhibited high mobilization, especially the oxygenated and unsaturated molecules with high molecular weight (m/z > 500 Da). During drinking water treatments, DOM molecules with low H/C (0.83) and high O/C (0.62) were reported to be selectively removed by coagulation, while molecules with high H/C (1.32) and low O/C (0.43) were largely removed by slow sand filtration (Lavonen et al., 2015).
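The H/C, O/C, and DBE values quoted above are simple functions of the assigned molecular formula. The following minimal Python sketch shows one common way to compute them; the function names and the example formula are illustrative assumptions, and the DBE expression is the widely used convention for formulas containing C, H, N, O, S, P, and halogens rather than a procedure taken from the cited studies.

```python
import re

def parse_formula(formula):
    """Return element counts, e.g. 'C10H12O5' -> {'C': 10, 'H': 12, 'O': 5}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def molecular_indices(formula):
    """Compute H/C, O/C, and DBE for one assigned formula (hypothetical helper)."""
    c = parse_formula(formula)
    n_c, n_h, n_o = c.get("C", 0), c.get("H", 0), c.get("O", 0)
    n_n, n_p = c.get("N", 0), c.get("P", 0)
    n_x = sum(c.get(x, 0) for x in ("F", "Cl", "Br", "I"))  # halogens
    # Common convention: DBE = C - (H + X)/2 + (N + P)/2 + 1 (O and S do not contribute)
    dbe = n_c - (n_h + n_x) / 2 + (n_n + n_p) / 2 + 1
    return {"H/C": n_h / n_c, "O/C": n_o / n_c, "DBE": dbe}

# Illustrative formula, not taken from any dataset in this review
example = molecular_indices("C10H12O5")  # H/C = 1.2, O/C = 0.5, DBE = 5
```

Ratios of this kind are what place a formula on a van Krevelen diagram; the thresholds used to interpret them differ between studies.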

Dynamic changes of wastewater DOM during traditional treatments
The molecular properties of DOM are significantly influenced by traditional wastewater treatments, including coagulation, adsorption, and biological treatments (Gonsior et al., 2011). Coagulation and adsorption can cause molecular fractionation of DOM in wastewater, with the outcome depending on the coagulant or adsorbent used (Chen et al., 2022). Therefore, relevant DOM studies will provide references for the selection of sorbents/coagulants and the improvement of treatment parameters (Yuan et al., 2017). Different coagulants/adsorbents differ in how well they remove particular molecular compounds in DOM. Aluminum/ferric sulfate and ferrihydrite preferentially remove unsaturated (H/C < 1.0) and oxidized (O/C > 0.4) DOM compounds (Figure 4d) (Yuan et al., 2017). A novel covalently bound coagulant removes highly unsaturated phenolic compounds, along with more S-containing molecules, more efficiently than conventional coagulants (Geng et al., 2018).
Biological treatments, which rely on microbial metabolism under different oxygen conditions, can achieve a comprehensive removal of DOM in wastewater. The variation of DOM in each stage of the anaerobic-anoxic-aerobic (A2O) process has been clarified (Tang et al., 2021). The anaerobic process increases the CHO/CHOS molecules in DOM, while denitrification in the anoxic process decreases the CHON molecules (Tang et al., 2021). Increased intensities of recalcitrant DOM molecules are detected during the anoxic process, while the opposite trend occurs during the oxic process (Tang et al., 2021). The effluent DOM from the A2O process shows a decrease in molecular weight and DBE, which suggests the degradation of macromolecules and unsaturated compounds (Li et al., 2015). Moreover, the molecular properties of DOM differ significantly between biological processes (Yuan et al., 2017). The less-oxidized DOM compounds (O/C < 0.3) in landfill leachate are reported to be more responsive to biodegradation, accompanied by the production of highly oxidized molecules (O/C > 0.5) (Yuan et al., 2017). Similarly, the N-containing molecules in DOM with fewer O atoms are more easily removed during the A2O process, while the DOM molecules with more O atoms are resistant to the biological treatments (Li et al., 2015).
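Stage-by-stage statements such as an increase in CHO/CHOS or a decrease in CHON molecules rest on grouping assigned formulas by heteroatom class and comparing the class shares between samples. A minimal sketch of that bookkeeping is given below; the formula lists and function names are hypothetical, and number-based shares are used here although intensity-weighted shares are an equally common choice.

```python
import re
from collections import Counter

def heteroatom_class(formula):
    """Label a formula by the elements it contains, e.g. 'C12H15NO6' -> 'CHON'."""
    elements = {el for el, _ in re.findall(r"([A-Z][a-z]?)(\d*)", formula)}
    order = ("C", "H", "O", "N", "S", "P", "Cl")
    return "".join(el for el in order if el in elements)

def class_shares(formulas):
    """Fraction of formulas in each heteroatom class (number-based, not intensity-based)."""
    counts = Counter(heteroatom_class(f) for f in formulas)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Hypothetical formula lists for two treatment stages
anoxic_in = ["C10H12O5", "C12H15NO6", "C11H13NO5", "C10H12O5S"]
anoxic_out = ["C10H12O5", "C9H10O4", "C11H13NO5", "C10H12O5S"]

before, after = class_shares(anoxic_in), class_shares(anoxic_out)
change_in_chon = after.get("CHON", 0.0) - before.get("CHON", 0.0)
```

A negative `change_in_chon` in this toy case mirrors the kind of CHON decrease described for the anoxic stage, but real comparisons also need to account for detection limits and ionization effects.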

Molecular variations of drinking water DOM during traditional treatments
The molecular variations of DOM are significant during different drinking water treatments, such as adsorption, coagulation, filtration, and disinfection. The adsorption and coagulation during drinking water treatment remove DOM compounds through the accumulation of colloidal particles and micro-suspended substances (Andersson et al., 2020; Zhang et al., 2012). Hematite nanocrystals were reported to remove DOM molecules with high oxidation states and aromaticity more efficiently through the -OH groups on the mineral surface. The coagulation process with common coagulants (e.g., Al2(SO4)3) preferentially removes DOM molecules with high O/C and low H/C, which are important precursors of DBPs (Lin & Ika, 2020; Zhang et al., 2012). Moreover, membrane filtration, which includes ultrafiltration (UF), nanofiltration, and reverse osmosis, removes DOM molecules with differing efficiencies. The CHOS and CHONS molecules in DOM were reported to be readily removed by hydrophobic and hydrophilic UF membranes, respectively (Figure 4c) (Ly & Hur, 2018). The CHOS molecules in DOM show a high electrostatic repulsion caused by high negative charges on the hydrophobic UF membrane, while the CHONS molecules can form hydrogen bonds with -OH groups on the hydrophilic membrane (Ly & Hur, 2018). Nanofiltration removes DOM more effectively than UF, as reflected in the comprehensive removal of reduced/oxidized and hydrophilic DOM components (Cortés-Francisco et al., 2014). Furthermore, membrane filtration can achieve a more comprehensive removal of DOM molecules when combined with other processes. Chen et al. (2020) found that reverse osmosis efficiently removed heteroatomic DOM compounds in membrane bioreactor effluents, especially polycyclic aromatic hydrocarbons.
Disinfection is the most important process to ensure the safety of drinking water, but DOM molecules give rise to DBPs during disinfection, which results in health risks. Previous studies have used FTICR-MS to identify DBP species and to clarify the correlations between DBPs and their precursors in DOM (Postigo et al., 2021; Wang et al., 2016). The generation of DBPs, such as trihalomethane and haloacetic acid, is closely related to the composition of DOM in raw water. FTICR-MS analyses show that polycyclic aromatics, polyphenols, phenolic compounds, and unsaturated aliphatic compounds with high O/C and low H/C in raw water DOM contribute significantly to the formation of typical DBPs (trihalomethane and haloacetic acid) during chlorination. These established correlations between DBPs and their DOM precursors demonstrate the power of FTICR-MS to reveal the transformation and reaction pathways of DOM related to the evolution of emerging pollutants in drinking water treatments. Moreover, regarding the DOM-involved mechanism of DBP formation, chlorinated DBPs are generated when oxidation and chlorine substitution convert aromatic molecules into molecules with carboxylic/alcohol functional groups, and they show lower toxicity than Br-DBPs. In addition, the drinking water treatments applied before disinfection also influence the properties and generation of DOM-related DBPs. Pre-chlorination in the raw water distribution system was reported to control the accumulation of low-molecular-weight DOM compounds with high aromaticity/unsaturation and the formation potential of DBP precursors (Figure 4b). Adsorption and coagulation processes can remove potential DBP precursors among DOM compounds with high O/C and low H/C (Lin & Ika, 2020; Zhang et al., 2012).

Molecular transformation of DOM in advanced treatments
Advanced treatments such as catalytic oxidation, electrolysis, and photocatalysis have been widely used to remove refractory DOM from wastewater and drinking water (Lv et al., 2017; Wang et al., 2020). The oxidation processes (e.g. ozone and hydrogen peroxide oxidation) can reduce the degree of unsaturation and aromaticity of DOM (B. Zhang et al., 2021a; Zhuo et al., 2019). The chemical transformation and removal efficiency of DOM components are influenced by the selection of oxidation technology. Zhuo et al. (2019) demonstrated that high-temperature H2O2 oxidation removed most of the N-/S-containing compounds in DOM from refinery wastewater by breaking and oxidizing carbonyl compounds. Similarly, ozone oxidation has shown a high removal efficiency of unsaturated, highly aromatic, and heteroatomic DOM molecules in refinery wastewater. Ozone oxidation combined with catalysts can further improve the removal rate of wastewater DOM containing high-molecular-weight unsaturated/aromatic compounds.
Photolysis is another important advanced treatment that induces the chemical transformation of DOM molecules to achieve efficient removal (B. Zhang et al., 2021b). Photolysis combined with nano TiO2 provides an effective strategy for the removal of DOM compounds with high molecular weight and aromaticity, in which 52.7-82.1% of CHO and 47.3-84.8% of CHON can be removed (Lv et al., 2017). Moreover, DOM molecules exhibit various transformation reactions under photolysis treatments. The tri-hydroxylation (+3O) reaction of DOM molecules is identified during the UV/H2O2 process, while the dihydroxylation (+H2O2) reaction is dominant during the UV/persulfate and UV/chlorine processes (Figure 4e) (B. Zhang et al., 2021b). The various radicals produced in UV photolysis show high reactivity with DOM molecules (Varanasi et al., 2018). Further reactions involving Cl in UV/free chlorine resulted in lower concentrations of some aromatic compounds compared to those in UV/H2O2. Other advanced treatments (e.g. electrolysis and photocatalytic membranes) have been developed for DOM removal and reveal the chemical diversity and reaction complexity of DOM molecules. The combination of ozonation and electrolysis greatly improves the removal efficiency of lignin/CRAM-like and N-containing compounds in DOM from landfill leachate (Figure 4f).
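Transformation reactions such as +3O or +H2O2 are typically inferred by searching for formula pairs, before and after treatment, that differ by exactly the elemental composition of the proposed reaction. The sketch below illustrates that pairing logic under stated assumptions: the formula lists are invented, the function names are hypothetical, and a matching pair is only a candidate precursor-product link, not proof that the reaction occurred.

```python
import re

# Elemental differences for the two reactions named in the text
REACTIONS = {"+3O": {"O": 3}, "+H2O2": {"H": 2, "O": 2}}
MONOISOTOPIC = {"H": 1.00782503, "O": 15.99491462}  # u, most abundant isotopes

def parse_formula(formula):
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def as_key(counts):
    """Order-independent, hashable representation of an elemental composition."""
    return tuple(sorted((el, n) for el, n in counts.items() if n))

def reaction_mass_shift(name):
    """Exact mass difference associated with a reaction, in u."""
    return sum(MONOISOTOPIC[el] * n for el, n in REACTIONS[name].items())

def find_candidate_pairs(before, after):
    """Return (precursor, reaction, product) triples linked by a listed reaction."""
    products = {as_key(parse_formula(f)): f for f in after}
    pairs = []
    for precursor in before:
        base = parse_formula(precursor)
        for name, delta in REACTIONS.items():
            shifted = dict(base)
            for el, n in delta.items():
                shifted[el] = shifted.get(el, 0) + n
            product = products.get(as_key(shifted))
            if product is not None:
                pairs.append((precursor, name, product))
    return pairs

# Hypothetical formula lists before and after a UV-based treatment
pairs = find_candidate_pairs(["C10H12O4"], ["C10H12O7", "C10H14O6", "C9H10O3"])
```

Counting how many candidate pairs each reaction produces across a whole dataset is one simple way to judge which reaction type dominates in a given process.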

Limitations and technology perspectives of FTICR-MS
Several drawbacks of FTICR-MS for characterizing aquatic DOM need to be considered, including limitations in quantification, in the discrimination of isomers and molecular structures, and in the characterization of molecules >1000 Da. Although spiked standards at known concentrations make quantification of certain compounds of interest possible, quantifying a broader range of compounds in aquatic DOM remains difficult with FTICR-MS (Qi et al., 2022; Shi et al., 2021). As a non-target analysis with weak quantification capacity, FTICR-MS only provides the relative content of molecules within the same DOM sample. Moreover, FTICR-MS cannot discriminate isomers or structures of DOM molecules; it only provides a large number of molecular formulas. In-depth exploration of the structural information of DOM molecules requires combining FTICR-MS with tandem mass spectrometry (MSn) (Zark et al., 2017). Furthermore, FTICR-MS analyzes only the extractable and easily ionized fractions, which represent a small subset of the total DOM pool, and this inherent selectivity and unknown efficiency may introduce uncertainties (D'Andrilli et al., 2010; Song et al., 2021). None of the typical ionization modes provides universal ionization for all possible DOM analytes, and a single ionization mode may bias data toward certain molecular classes. Therefore, the molecular properties obtained by FTICR-MS give a somewhat one-sided view of the environmental behavior of aquatic DOM. Similarly, the molecular ions detected by FTICR-MS are restricted because the MS signal of high-molecular-weight molecules (>1000 Da) is comparatively weak (D'Andrilli et al., 2020). Therefore, FTICR-MS needs to be combined with other complementary characterization techniques to provide more comprehensive molecular information on DOM.
Considering the current challenges, the technology development and application optimization of FTICR-MS deserve attention, such as the optimization of performance design and mass spectral treatments for specific characterization purposes of aquatic DOM. The instrumental developments of FTICR-MS include improvement of the ICR trap and application of high magnetic fields, which will advance the accuracy and stability of results. The improved design of ICR traps brings the electric field closer to ideal and maintains the stability of trapped ions (Cho et al., 2015). The application of a high magnetic field can increase the resolving power and benefit critical performance parameters such as mass accuracy, analysis speed, and dynamic range (Cho et al., 2015). Moreover, the utility of mass spectral spacing patterns should be refocused because no manual assignment or algorithm can produce unambiguous molecular formulas without them (D'Andrilli et al., 2020). The improvement of mass spectral spacing patterns can set the foundation for confirming charged ions and formula assignments for DOM, providing essential resources for targeted inquiries and biogeochemical relationships (D'Andrilli et al., 2020; Stenson et al., 2003). Furthermore, targeted approaches with well-defined hypotheses for FTICR-MS should be improved to achieve identification and characterization purposes, such as the reactions involved in DOM production and transformation (D'Andrilli et al., 2020). Collecting and interpreting FTICR-MS data with clear expectations of chemical indicators will avoid perfunctory, misidentified, or retroactive correlations, producing sound molecular-level interpretations and understanding of aquatic DOM (D'Andrilli et al., 2020).
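One widely used way to exploit mass spectral spacing patterns is the Kendrick mass defect (KMD), in which masses are rescaled so that members of a CH2 homologous series share the same defect. The sketch below is an illustration under stated assumptions rather than the procedure of the cited studies: it uses neutral monoisotopic masses computed from invented formulas, and it takes the nominal Kendrick mass by rounding to the nearest integer, whereas other conventions exist.

```python
import re
from collections import defaultdict

MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
                "O": 15.99491462, "S": 31.97207117}
CH2_EXACT = MONOISOTOPIC["C"] + 2 * MONOISOTOPIC["H"]  # 14.01565006 u

def exact_mass(formula):
    """Neutral monoisotopic mass of a CHNOS formula, in u."""
    mass = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(number) if number else 1)
    return mass

def kendrick_mass_defect(mass):
    """KMD on the CH2 scale: nominal Kendrick mass minus Kendrick mass."""
    kendrick_mass = mass * 14.0 / CH2_EXACT
    return round(kendrick_mass) - kendrick_mass

def group_homologous_series(formulas, decimals=4):
    """Group formulas whose KMD agrees to the given number of decimals."""
    series = defaultdict(list)
    for f in formulas:
        series[round(kendrick_mass_defect(exact_mass(f)), decimals)].append(f)
    return dict(series)

# Hypothetical formulas: three differ only by CH2 units, one has an extra O
groups = group_homologous_series(["C10H12O5", "C11H14O5", "C12H16O5", "C10H12O6"])
```

With measured spectra the same grouping would be applied to calibrated ion masses, and the rounding tolerance would need to reflect the achieved mass accuracy.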

Optimization of data interpretation and combination methods
Current methods of data interpretation still have some limitations, such as inconsistent classification and improper indicator use. Specifically, different classification methods overlap and blur the boundaries between DOM molecular groups, significantly reducing the comparability of DOM molecular properties. Thus, more uniform classification standards should be established to reduce the controversial attribution of DOM molecular formulae and to enhance the comparative analysis of DOM compounds from different studies (Hockaday et al., 2009). Moreover, the diversity of DOM molecules makes it difficult to use indicators for analyzing every single molecule, while the application of average indicators may cause certain molecular properties to be overlooked. Statistically averaged indicators can be applied in situations involving similar types of DOM samples. It is suggested that indicators be used for individual molecular properties based on targeted purposes, and that combining them with statistical averages can make DOM analysis more comprehensive and accurate.
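The caution about averaged indicators can be made concrete with a small calculation. The sketch below, using invented H/C values and peak intensities, computes an intensity-weighted average and the corresponding weighted spread; the function names are hypothetical. Two samples with identical averages can still differ strongly in their underlying molecular distribution.

```python
import math

def weighted_average(values, intensities):
    """Intensity-weighted mean of a per-formula indicator such as H/C."""
    total = sum(intensities)
    return sum(v * i for v, i in zip(values, intensities)) / total

def weighted_spread(values, intensities):
    """Intensity-weighted standard deviation around the weighted mean."""
    mean = weighted_average(values, intensities)
    total = sum(intensities)
    variance = sum(i * (v - mean) ** 2 for v, i in zip(values, intensities)) / total
    return math.sqrt(variance)

# Hypothetical samples: same weighted H/C, very different distributions
sample_a = ([1.0, 1.0], [1.0, 1.0])   # (H/C values, relative intensities)
sample_b = ([0.5, 1.5], [1.0, 1.0])

mean_a, mean_b = weighted_average(*sample_a), weighted_average(*sample_b)
spread_a, spread_b = weighted_spread(*sample_a), weighted_spread(*sample_b)
```

Reporting a spread or the full distribution alongside the average is one simple way to keep the per-molecule information that an average alone hides.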
Several recommendations need to be considered for DOM studies with regard to the improvement of data compatibility and the development of flexible statistical techniques. The spectroscopic properties of DOM have measurement biases toward specific components and cause information losses of undetected molecules in FTICR-MS measurement (Wünsch et al., 2018). Therefore, additional types of supplementary methods should be combined to improve compatibility and provide more comprehensive structural information about DOM. Moreover, the traditional statistical methods for FTICR-MS datasets have limitations in dealing with anomalous data in massive datasets. Component orthogonality can influence the precise distinction of underlying factors of molecular formula matrices and the correlations between DOM molecular properties in PCA analysis (Wünsch et al., 2018). Therefore, advanced statistical models such as ACMTF should be further developed to promote statistical precision, leading to the flexible handling of different types of DOM datasets.

Unified molecular database and prescreening methods of FTICR-MS
Although extensive literature reports the molecular characteristics of DOM using FTICR-MS, there is not yet a publicly available database. Presently, aquatic DOM identification and comparison by FTICR-MS are severely compromised by the lack of such a database, as new data can hardly be compared with existing data (Reemtsma, 2009). Therefore, to meet the requirements of DOM evaluation in water environments, an open-access FTICR-MS database for assigned molecular formulas of DOM needs to be established. The implementation of a unified database repository using standardized protocols worldwide is essential to accelerate the recognition of complex DOM characteristics and the comparison across temporal and spatial scales of DOM samples from different studies (Bahureksa et al., 2021). Moreover, considering the high application cost of FTICR-MS, low-cost techniques can be used to provide prescreening for FTICR-MS, combining real-time online monitoring with key spot checks of DOM molecules. Such data repositories and prescreening methods can rapidly compare and identify peaks as potential molecular components by linking known and unknown compounds in correlation and network analysis and through fragmentation spectra similarity in targeted analysis (Bahureksa et al., 2021). Advances in a unified molecular database and prescreening methods of FTICR-MS would be critical for conducting statistics on larger datasets and for identifying reaction mechanisms and pathways of aquatic DOM in the natural and engineering systems.

Evaluation of diversity, transformation, and distribution of natural DOM in the micro-/ macro-level water environments
For the chemical and heterogeneous behaviors of DOM in natural water environments, recommendations to explore the microscopic reactions and mechanisms of DOM in micro-/macro-level water systems are detailed below. First, the micro-level reactions and mechanisms of DOM should be further investigated, including the primary aspects of photodegradation, adsorption, and microbial interaction of DOM. Specifically, more efforts are needed to establish the relationships between DOM molecules and photodegradation products, such as the contribution of individual molecules to the generation of free radicals under natural light (Maizel & Remucal, 2017b). Research priorities regarding the adsorption and fractionation of natural DOM should expand from common minerals to emerging nanomaterials because of the increasing use and discharge of nanomaterials (Lv, Zhang, Wang, et al., 2016). The corona of nanomaterials can affect the adsorption and distribution of DOM on nanomaterials, and the driving mechanisms should be investigated in depth to correctly assess their behaviors and ecological risks. In addition, it is necessary to deepen the understanding of dynamic molecular changes and response differences in DOM driven by microbial activities. Exploring the dynamic response of DOM molecules to changes in microbial activities or related enzymes is recommended to clarify the collaborative behaviors of DOM and microbial communities.
The current understanding of the macro-scale behaviors of DOM, such as the distribution and migration between different water ecosystems, remains insufficient; thus, several prospects need to be considered. The macro-scale migration of DOM is highly associated with the carbon cycle among water systems. Therefore, the contribution and variation of DOM flux to the carbon cycle need to be further explored, such as the dynamic changes of DOM at system junctions. DOM signatures and reactivity within a watershed are impacted both spatially and temporally at the macro-level; thus, future studies should also focus on how hydrology, climatic disturbance, wildfires, and other environmental drivers influence DOM composition within large basins using FTICR-MS. Moreover, the alteration in water conditions caused by natural disasters and climate change will lead to variations in the source, molecular composition, and distribution of DOM, and the driving effects and mechanisms need more attention from a large-scale perspective. The molecular properties and release mechanism of exogenous DOM should also be determined, thus increasing the knowledge about DOM sources and diversity.

Evaluation of target molecule screening, toxicity association, and removal process improvement of DOM in the engineering water systems
Considering the high complexity of DOM in engineered water systems, further evaluation of target molecule screening, toxicity association, and removal improvement of DOM should be explored by FTICR-MS. More FTICR-MS technical solutions for target screening and tracing of specific DOM compounds are recommended, such as the identification of low-concentration molecules in engineering waters (Perkons et al., 2021). Moreover, for the toxicity association of DOM in engineering water systems, additional studies are required to establish the interaction reactions and mechanisms between DOM and trace pollutants. Specifically, more integration of various DOM indices is needed to gain insights into key correlations of DOM molecules with wastewater toxicity. The assessment standards for the correspondence between DOM molecules and wastewater toxicity should be comprehensively determined to support assessment of the ecological and health risks of wastewaters. In these suggested studies on wastewater toxicity, the key differences and driving factors of the interaction mechanism between DOM and toxic pollutants at different treatment stages should be a major research focus. The consideration and quantification of human-controlled conditions (e.g. chemicals utilization and operation parameters) are necessary in subsequent studies to accurately evaluate and improve DOM removal processes. For example, the differences in chemical dosage in various treatment stages and the normalization of unit agent dosage should be considered to avoid their influence on DOM removal efficiency. Meanwhile, exploring the dynamic changes of DOM in different treatments is important to understand the molecular diversity and removal efficiency of DOM in all engineered water systems. Compared with a single removal process, the combination of novel processes (e.g. ceramic microfiltration) improves DOM removal efficiency. However, these combinations can also lead to the generation of DBPs, which pose high health risks. Future studies should focus not only on the improvement of DOM removal efficiency in multi-step removal processes, but also on the resultant risk of potential by-products.

Disclosure statement
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Funding
This work was financially supported by the National Key Research and Development Program (nos. 2021YFC3201000, 2021YFC3201001), Budget Surplus of Central Financial Science and Technology Plan (no. 2021-JY-09), and China Postdoctoral Science Foundation (nos. 2021TQ0315, 2022M713010).