Cadmium contamination in food crops: Risk assessment and control in smart age

Abstract With mankind entering the smart age, Cd contamination risk control in the food crop revolution has been put on the agenda. Covering the theoretical basis, technical methods and development trends, this review looks back on and forward to the age of Cd contamination risk control driven by the dual engines of ‘genotype (G) + envirotype (E)’. Focusing on G, an inter-specific meta-analysis of Cd contamination risk was carried out, in which a higher Cd contamination risk was observed in rice and wheat than in maize. Different strategies are therefore recommended according to these inter-specific differences. To control the risk in crops with high-accumulating characteristics, smart creation of low-Cd crops can be applied by two methods: 1) excavating and pyramiding natural variations in natural populations and 2) designing and implementing artificial variations that do not exist in natural populations. Focusing on E, the influence of environmental factors on food crop Cd accumulation is discussed and a strategy using envirotype-to-phenotype (E2P) models to predict and implement safety thresholds is offered. In the foreseeable future, with the support of environmental science, biology, big data, artificial intelligence and other interdisciplinary technologies, Cd contamination risk control will become intelligent, efficient and directional, ultimately realizing the revolutionary transformation from ‘experience’ to ‘smart’.


Introduction
Cadmium (Cd), a toxic element with a half-life of 10-30 years in humans, can accumulate in multiple organs, mainly the liver and kidney, damaging human health. Long-term Cd intake can cause various severe diseases, including cardiovascular diseases (Tellez Plaza limit (Figure 1). Under the stricter EFSA and European Union (European Commission, 2006) standards for Cd (rice: 0.2 mg/kg; wheat: 0.2 mg/kg; maize: 0.1 mg/kg), the trend was consistent: 59.84% of rice, 32.57% of wheat and 23.22% of maize exceeded the allowable limit. Under the strictest Chinese Cd standards (Ministry of Health of the People's Republic of China, 2012; rice: 0.2 mg/kg; wheat: 0.1 mg/kg; maize: 0.1 mg/kg), 59.84% of rice exceeded the standard, compared with 43.94% of wheat and 23.22% of maize.
Considering that soil properties may affect risk assessment, we divided the assays into three sets according to soil Cd concentration and pH: Level I, Level II and Level III. The parameters used for this classification were based on the soil environment quality risk control standard set by the Ministry of Ecology and Environment of China (Supplementary Table 1): assays with soil properties lower than the risk screening values were considered Level I (safe or slight pollution), those higher than the risk intervention values were considered Level III (heavy pollution) and those in between were classified as Level II (moderate pollution) (Supplementary Table 2). The grain over-standard rate was calculated from the grain concentrations of the assays within each level. The same trend was still observed for all three food crops: rice (Level I: 7.50%, Level II: 48.51%, Level III: 73.33%), wheat (Level I: 5.80%, Level II: 54.17%, Level III: 86.67%) and maize (Level I: 1.06%, Level II: 4.18%, Level III: 44.73%) (Figure 1 and Supplementary Table 3), reaffirming a higher Cd contamination risk in rice and wheat. Notably, in Level I, where the soil Cd concentration poses a low or negligible risk to agri-product safety, rice and wheat still showed a non-negligible over-standard rate (rice: 7.50%; wheat: 5.80%), while in Level II, where the soil Cd concentration poses a potential risk to agri-product safety, only 4.18% of maize exceeded the allowable limit. This may be caused by differences in Cd accumulation capacity among species, as the bioconcentration factor (BCF) has been reported to be higher in rice and wheat than in maize. In our previous study, a relatively low Cd concentration in maize was also found.
On acidic farmland with different Cd pollution levels, Cd accumulation in maize kernels was stable and mainly fell within a relatively low, safe interval, and most maize kernels posed no obvious threat to human health at soil Cd concentrations up to 6.4 mg/kg (Feng et al., 2020). To some extent, maize can be considered a 'natural' low-Cd food crop. Accordingly, inter-specific differences in Cd contamination could be considered when setting risk-regulation criteria and designing cropping structures in Cd-polluted fields. For example, the proportion of low-Cd maize can be appropriately enlarged in fields with high Cd accumulation, and stricter Cd criteria should be adopted when deciding whether rice or wheat is suitable for planting in a Cd-contaminated field.
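The level classification and the per-level over-standard rate described above can be sketched as follows. This is a minimal illustration rather than the analysis code behind Figure 1: the `Assay` record, the function names and, in particular, the soil screening and intervention values in `SOIL_BANDS` are hypothetical placeholders (the values actually used are those in Supplementary Table 1), while the grain limits are the Chinese standards quoted in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Assay:
    crop: str        # "rice", "wheat" or "maize"
    soil_cd: float   # soil total Cd, mg/kg
    soil_ph: float   # soil pH
    grain_cd: float  # grain Cd, mg/kg


# HYPOTHETICAL placeholders, one tuple per pH band:
# (upper pH bound, risk screening value, risk intervention value), mg/kg.
SOIL_BANDS = [
    (5.5, 0.3, 1.5),
    (6.5, 0.3, 2.0),
    (7.5, 0.3, 3.0),
    (float("inf"), 0.6, 4.0),
]

# Chinese grain Cd limits (mg/kg) as quoted in the text.
GRAIN_LIMIT = {"rice": 0.2, "wheat": 0.1, "maize": 0.1}


def soil_level(soil_cd, soil_ph):
    """Return "I", "II" or "III" for one soil sample."""
    for max_ph, screening, intervention in SOIL_BANDS:
        if soil_ph <= max_ph:
            if soil_cd < screening:
                return "I"    # below the screening value
            if soil_cd > intervention:
                return "III"  # above the intervention value
            return "II"       # in between
    raise ValueError("pH not covered by SOIL_BANDS")


def over_standard_rate(assays, crop):
    """Percentage of assays of `crop` exceeding the grain limit, per soil level."""
    counts = defaultdict(lambda: [0, 0])  # level -> [n_over, n_total]
    for a in assays:
        if a.crop != crop:
            continue
        level = soil_level(a.soil_cd, a.soil_ph)
        counts[level][1] += 1
        if a.grain_cd > GRAIN_LIMIT[crop]:
            counts[level][0] += 1
    return {lvl: 100.0 * over / total for lvl, (over, total) in counts.items()}
```

Keeping the thresholds in a single table makes it straightforward to swap in the regulatory values or to repeat the calculation under a different grain standard.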

Smart creation of low-Cd food crop: Improving genotypic factors
Germplasm creation is the process of improving genotypic factors to obtain low Cd accumulation capacity in food crops. In the past few decades, breeders have developed plenty of germplasm to provide enough food for a growing population. However, since grain Cd accumulation cannot be observed directly, breeders have focused more on yield and resistance phenotypes than on Cd accumulation. To date, a stable, low Cd accumulation trait is still not a compulsory index for new germplasm authorization in many countries, posing considerable threats to food safety. It is high time for breeders to take the Cd accumulation trait into consideration. Fortunately, the smart age offers a higher starting point for low-Cd food crop germplasm creation, at lower cost in time and money. Here, the smart creation of low-Cd food crops is discussed in detail.

Biological pathways for Cd accumulation capacity amelioration in food crops
Grain Cd accumulation is a complex physiological process: absorption of Cd from the soil into the root, transport in the xylem from root to shoot, storage of Cd in leaves and, finally, distribution of Cd into grains. As the process is controlled by multiple genes, more than 30 genes have been identified as being involved in Cd accumulation in rice, maize and wheat (Supplementary Table 4).
Following the route by which Cd enters grains, the following pathways can be regulated to create low-Cd food crops: (1) Reducing Cd absorption efficiency in the root. For example, knocking out OsNRAMP5, an important Cd and Mn transporter involved in root uptake, root-to-shoot translocation and distribution (Sasaki et al., 2012), significantly decreased grain Cd accumulation. Cd concentration in rice grains was also significantly decreased by mutation of a root Cd transporter, OsCd1 (Yan et al., 2019); (2) Increasing the efficiency of Cd compartmentalization in roots and leaves. For example, over-expression of OsHMA3, which sequesters Cd in rice roots, enhanced vacuolar sequestration of Cd in roots, resulting in about 94-98% less Cd in brown rice (Sasaki et al., 2014); (3) Reducing Cd loading and unloading efficiency in the xylem. For example, in grains of the OsHMA2 Tos17 mutant, the Cd concentration decreased to about half that of the wild type, caused by significantly decreased xylem loading (Satoh-Nagasawa et al., 2012); (4) Reducing transport efficiency through the phloem to grains. For example, OsLCT1 functions at the nodes in Cd transport into rice grains, and knockdown plants showed up to 50% less Cd in brown rice than the control (Uraguchi et al., 2011). It is noteworthy that, since Cd is always co-transported with cations such as manganese (Mn), iron (Fe) and zinc (Zn), simply knocking down or knocking out Cd transporters may cause deficiencies in other essential elements, reducing yield, resistance and other aspects of performance. How to minimize Cd in grains while maintaining other important agronomic traits therefore still needs to be considered.

Approaches for low-Cd food crop smart germplasm creation
Smart germplasm creation is an innovative concept raised in the smart age, which integrates big data, machine learning, gene editing, genome-wide selection and other technologies. One of the cores of smart germplasm creation is 'variation'. In natural cultivar populations, natural variations are generally observed across the genome, such as single nucleotide polymorphisms (SNPs), insertions and deletions (InDels), structural variations (SVs) and presence-absence variations (PAVs). With the development of technology, protein engineers have developed experimental and computational methods for designing sequences to improve proteins for human purposes, thus creating artificial variations not found in nature. Accordingly, smart germplasm creation is divided into smart hybrid creation and smart design creation (Figure 2).

Smart hybrid creation
To date, hundreds of millions of natural variations have been found in the three main food crops. For example, 29 million SNPs, 2.4 million InDels and over 90 thousand SVs have been discovered in 3,010 diverse accessions of Asian cultivated rice. In the maize haplotype map version 3 (HapMap3), over 83 million variations were found in 1,218 maize lines (Bukowski et al., 2018). In wheat, the WGVD database contains 7,346,814 SNPs and 1,044,400 InDels from 968 individuals (Wang, Fu, et al., 2020). Faced with so many natural variations, smart hybrid creation focuses on two questions: 1) how to excavate natural low-Cd superior variations and 2) how to pyramid them to create low-Cd cultivars.

Approaches for Natural Low-Cd Superior Variation Excavation
Joint linkage association mapping (JLAM) and genome-wide association studies (GWAS) are the most common methods used to excavate low-Cd-related natural variations and quantitative trait loci (QTLs). JLAM combines the advantages of linkage mapping and association mapping. In practice, the effects of QTLs on target traits, measured through their linked high-density molecular markers, are calculated and analyzed in a population. By using artificial mapping populations, JLAM can effectively and accurately excavate QTLs and natural variations related to low Cd accumulation (Würschum et al., 2012). To date, JLAM has yielded many successes: over 30 QTLs related to Cd accumulation have been excavated in maize, rice and wheat (Table 1), including many important Cd transporters. For example, OsHMA3, a rice gene isolated using a population derived from Anjana Dhan and Nipponbare, is located in the interval defined by RM21251 and RM21275 (Ueno et al., 2010). Using a doubled haploid (DH) rice population derived from TN1 and CJ0620, the defensin-like protein CAL1 was identified. In maize and barley, ZmHMA3 was identified by fine mapping with bulked segregant RNA-seq analysis using a biparental segregating maize population of Jing724 (low-Cd line) and Mo17 (high-Cd line), and HvHMA3 was fine-mapped from a cross between BCS318 and Haruna Nijo (Lei et al., 2020). At present, the populations used in natural low-Cd variation excavation are mainly biparental populations such as DH, recombinant inbred lines (RIL) (Oladzad-Abbasabadi et al., 2018) and chromosome segment substitution lines (CSSL) (Abe et al., 2013), in which limited genetic variants have been obtained. A more comprehensive excavation of natural low-Cd variation needs more complex artificial multiparent populations, such as the complete-diallel plus unbalanced breeding-derived inter-cross (CUBIC) population and the nested association mapping (NAM) population (Gage et al., 2020).
With the development of sequencing technologies and strategies, high-density molecular markers covering the whole genome can now be acquired easily by large-scale re-sequencing, and GWAS emerged under these conditions. GWAS uses genotypic and phenotypic data from a highly diverse research population to excavate the natural variations existing in that population. By analyzing these data, researchers can discover SNPs that are significantly linked to the target phenotype. With these SNPs discovered, the underlying genes and variations can also be mapped (Zhao, Yang, et al., 2018). Since its first application to the Cd accumulation trait in 2018, about 20 studies in rice, wheat and maize have been carried out (Table 1). By searching through the natural variations in high-density SNP data acquired by GWAS, QTLs have been mapped to Cd-related genes or found to be involved in Cd accumulation processes, including Cd absorption, translocation and accumulation in grains. After further confirmation, several genes and their superior alleles have been excavated. For example, OsCd1 and its low-Cd superior allele OsCd1 V449 were identified from a GWAS using 127 rice cultivars (Yan et al., 2019). One major locus for maize grain Cd accumulation (qCd1), ZmHMA3, was also identified using GWAS (Baseggio et al., 2021). Besides confirmed genes, many other SNPs highly related to Cd accumulation, located in genes of unknown function, have also been discovered by GWAS. Using GWAS in a population of 312 rice cultivars, Zhao et al. discovered 14 QTLs related to Cd accumulation in rice grains (Zhao, Yang, et al., 2018). Some of the candidate genes, including OsLCD, OsNRAMP1, OsNRAMP5 and OsHMA3, had already been identified as related to Cd accumulation, but the candidate genes of the other 11 QTLs have not been revealed yet.
Because of allelic heterogeneity, distal noncoding regulatory elements and other complex linkage disequilibrium architecture, associated SNPs might be located far from the functional genes, so identifying the causal genes for association signals is still a challenge in several cases. Solutions depend on more comprehensive approaches, such as incorporating information from transcriptomic or proteomic variation, or yeast complementation systems, to pinpoint the causal link between an association signal and a functional gene.
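As a rough illustration of the single-marker association scan that underlies GWAS, the sketch below scores each SNP by the squared correlation between allele dosage and phenotype and derives a genome-wide threshold by permuting the phenotype. It is deliberately simplified and is not the pipeline used in the cited studies: real analyses usually correct for population structure and kinship with mixed models, and the function names, the 0/1/2 dosage coding and the permutation settings are assumptions made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def scan(genotypes, phenotype):
    """genotypes: SNP id -> list of allele dosages (0/1/2), one per line.
    Returns SNP id -> r^2, the share of phenotypic variance explained by that SNP alone."""
    return {snp: pearson_r(dosage, phenotype) ** 2 for snp, dosage in genotypes.items()}


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Empirical genome-wide threshold: the (1 - alpha) quantile of the maximum r^2
    observed when the phenotype is randomly shuffled across lines."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(scan(genotypes, shuffled).values()))
    maxima.sort()
    index = min(len(maxima) - 1, int((1 - alpha) * len(maxima)))
    return maxima[index]


def significant_snps(genotypes, phenotype, **kwargs):
    """SNPs whose r^2 exceeds the permutation threshold, strongest first."""
    threshold = permutation_threshold(genotypes, phenotype, **kwargs)
    scores = scan(genotypes, phenotype)
    hits = [snp for snp, r2 in scores.items() if r2 > threshold]
    return sorted(hits, key=lambda snp: -scores[snp])
```

SNPs flagged in this way are only statistical signals; as noted above, linking them to causal genes still requires follow-up evidence.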

Approaches for Natural Low-Cd Superior Variation Pyramiding
For natural low-Cd superior variations, pyramiding is an approach of aggregating them into one cultivar to produce a new germplasm with an ultra-low-Cd trait. Hybridization is the major approach for pyramiding them. Marker-assisted selection (MAS) refers to indirect selection for a desired plant phenotype based on the banding pattern of linked molecular markers. Through multi-generation backcrossing and MAS, researchers have created low-Cd cultivars by introducing single or multiple low-Cd superior alleles into high-accumulation cultivars. For example, introducing the low-Cd superior allele OsCd1 V449 to substitute for OsCd1 D449 in the high-Cd backgrounds 9311 and GUICHAO-2 caused a significant decrease in rice Cd accumulation in near-isogenic lines (NILs) (Yan et al., 2019). Yu et al. discovered that a duplication of OsNRAMP5 between SNP8881 and SNP8886 originated from the low-Cd cultivar Pokkali (Yu et al., 2022). When this low-Cd duplication was introduced into Koshihikari by backcrossing, rice Cd accumulation was reduced by 64% in highly contaminated soil. When pyramiding multiple low-Cd superior variations into one cultivar, breeders can combine low Cd with other superior traits. For example, Liu et al. created two rice lines pyramiding the low-Cd QTL GCC7 with high-Zn or high-Se QTLs, which showed a significant decrease in grain Cd compared with the wild type (Liu, Ding, et al., 2020). By introducing into 93-11 a segment on chromosome 7 that includes the Cd-related genes OsHMA3-OsNRAMP5-OsNRAMP1 jap and originated from the low-Cd japonica variety IRAT129, Wang et al. created an improved 93-11 line with a 31.8% decrease in grain Cd accumulation and no negative effects on yield (Wang, Yan, et al., 2021). All these results show that hybrid pyramiding of low-Cd-related natural variation is a feasible way of creating low-Cd varieties.
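The marker-based selection step can be pictured as a simple filter over genotyped progeny. In the sketch below, the marker names, genotype calls, line names and the functions `count_favorable` and `select_pyramided_lines` are all hypothetical and stand in for whatever marker system a breeding program actually uses; it only shows the logic of keeping lines that are homozygous for the favorable allele at every target locus.

```python
def count_favorable(calls, favorable):
    """Number of target markers at which a line carries the favorable homozygous call."""
    return sum(1 for marker, allele in favorable.items() if calls.get(marker) == allele)


def select_pyramided_lines(marker_calls, favorable):
    """Names of lines that are favorable at every target marker, sorted alphabetically."""
    return sorted(
        line
        for line, calls in marker_calls.items()
        if count_favorable(calls, favorable) == len(favorable)
    )


# Hypothetical target loci and marker calls for three backcross progeny.
favorable = {"M_locusA": "TT", "M_locusB": "GG"}
progeny = {
    "BC3F2-01": {"M_locusA": "TT", "M_locusB": "GG"},  # both loci fixed
    "BC3F2-02": {"M_locusA": "TT", "M_locusB": "AG"},  # locus B still heterozygous
    "BC3F2-03": {"M_locusA": "CC", "M_locusB": "GG"},  # locus A carries the other allele
}
selected = select_pyramided_lines(progeny, favorable)
```

Lines that miss only one locus, which `count_favorable` makes easy to spot, are natural candidates for a further round of crossing.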
But pyramiding only those natural variations that have large effects on Cd accumulation is never enough. Variations can be divided into major-effect and minor-effect variations according to their effects on the target trait. The superior alleles described above are mainly major-effect variations, but a non-negligible number of minor-effect variations, each with a small effect on the trait, together also contribute greatly to Cd accumulation. How to pyramid both major- and minor-effect low-Cd variations is therefore an important question for low-Cd germplasm creation.
In the smart age, with high-throughput sequencing technologies, genome-wide selection (GS) together with the genotype-to-phenotype (G2P) method offers breeders a way to reveal the relationship between genotype and phenotype. By constructing models based on the measured phenotypes and whole-genome genotypes of a training population, this relationship can be expressed as a G2P equation. With this equation, breeders can predict the performance of cultivars that have been genotyped but not phenotyped. Recently, Yan et al. exploited the genotype-to-phenotype relationship of maize kernel and rice grain Cd accumulation at the whole-genome level and developed GWAS-assisted G2P models using machine learning and linear statistical methods. With optimal parameters, the prediction accuracy reached as high as 0.89 and 0.81. Accordingly, a lowest-Cd virtual genome containing the most natural variations for low Cd accumulation can be calculated and constructed, which helps breeders plot the optimal route for creating low-Cd cultivars via GS. GS and G2P enable breeders to select low-Cd cultivars before the breeding cycle, which significantly cuts the cost of field testing. With reliable data from developing sequencing technologies, improved algorithms and proper model selection, GS is expected to be used in low-Cd smart hybrid creation with improved accuracy and decreased cost for breeders in the coming age (Figure 2).
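A G2P equation of the kind described above can be illustrated with ridge regression on marker dosages, a common linear baseline for genome-wide selection. The sketch below is an assumption-laden toy, not the GWAS-assisted models of the cited work: the function names, the 0/2 dosage coding for inbred lines and the penalty `lam` are hypothetical choices. It fits marker effects on a training population, predicts the phenotype of a genotyped but unphenotyped line, and reads off a "lowest-Cd virtual genome" by taking the Cd-lowering allele at each marker.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(genotypes, phenotypes, lam=1.0):
    """Fit phenotype = intercept + sum(effect_j * dosage_j) with an L2 penalty lam > 0."""
    n, p = len(genotypes), len(genotypes[0])
    col_mean = [sum(row[j] for row in genotypes) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    xc = [[row[j] - col_mean[j] for j in range(p)] for row in genotypes]
    yc = [v - y_mean for v in phenotypes]
    # Normal equations (X'X + lam * I) beta = X'y on centered data.
    xtx = [[sum(xc[i][j] * xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    xty = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    effects = solve(xtx, xty)
    intercept = y_mean - sum(e * mu for e, mu in zip(effects, col_mean))
    return intercept, effects


def predict(model, genotype):
    """Predicted phenotype (e.g. grain Cd) for one genotyped line."""
    intercept, effects = model
    return intercept + sum(e * g for e, g in zip(effects, genotype))


def lowest_cd_virtual_genome(model):
    """Per marker, pick the homozygous dosage (0 or 2) that lowers predicted Cd."""
    _, effects = model
    return [0 if e > 0 else 2 for e in effects]
```

In practice the penalty would be tuned by cross-validation, and prediction accuracy would be reported on lines held out from training.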

Smart design creation
The essence of smart design creation is designing artificial low-Cd superior variations that do not exist in natural germplasm and introducing them into the genomes of cultivars. A cultivar with an artificially designed, edited genome can express what breeders want it to express in a spatio-temporally specific manner, making the phenotype more controllable.

Approaches for Artificial Low-Cd Variation Designing
In the last century, breeders acquired new artificial variations that did not exist in nature by methods such as ethyl methanesulphonate (EMS) mutagenesis and irradiation mutagenesis. For example, Tanaka et al. applied EMS mutagenesis to Hitomebore and created a low-Cd line with a nucleotide substitution from CAT to CAA, causing an H242Q substitution in OsNRAMP5. Cd accumulation in this line was reduced to less than 20% of that in the wild type (Tanaka et al., 2016). Cao et al. created a low-Cd line in the indica cultivar 9311 using EMS mutagenesis, which carries an artificial P236L amino acid substitution in a highly conserved region of the 7th exon of OsNRAMP5 (Cao et al., 2019). Using carbon ion-beam irradiation, Ishikawa et al. created three rice OsNRAMP5 mutants: lcd-kmt1, with a single-nucleotide deletion in the 9th exon; lcd-kmt2, with a 433-bp insertion in the 10th exon; and lcd-kmt3, with a 227-kb deletion including the whole OsNRAMP5 gene. All of these mutants showed Cd concentrations of <0.05 mg/kg in grains and <0.1 mg/kg in straw (Ishikawa et al., 2012). Although proven efficient for creating artificial low-Cd variation, these mutagenesis methods create mutations randomly and cannot achieve targeted design in low-Cd germplasm creation.
With the help of emerging technologies such as de novo protein design and homology modeling, artificial design sites can be predicted in a targeted and general way. With a deeper understanding of the mechanisms, researchers have designed artificial variations based on several key sites. For example, Plegaria et al. designed de novo an artificial protein, apo α3DIV, containing a three-cysteine heavy-metal binding site (Cys18-Cys28-Cys68), which showed activity in sequestering Cd, Pb and Hg and proved the feasibility of de novo design of Cd-chelating proteins (Plegaria et al., 2015). Using crystal structures and the literature on NRAMP structure-function relationships, libraries were semi-rationally built to introduce variants into the native yeast heavy metal transporter SMF1. By introducing the artificial sites S105C, M276C and S269T, an SMF1 transporter specific to Cd was created. Based on homology modeling, the amino acid sites forming the metal permeation pathway of HMA4 were highlighted, and their importance was subsequently investigated functionally through mutagenesis and complementation experiments in plants. The results revealed that artificial HMA4 mutants exhibited different Cd translocation abilities in Arabidopsis, providing tools for the future design of low-Cd food crops (Lekeux et al., 2019).
In the smart age, AI models such as AlphaFold2, RoseTTAFold and Colossus AI can also be used to predict or design new artificial sites involved in Cd binding or permeation (Humphreys et al., 2021; Tunyasuvunakool et al., 2021). These artificial sites can be further sequenced and assessed by AI models to judge whether they are potential low-Cd superior variations. A generative model is a machine learning model that can learn from existing biological sequences and functional sites and design a brand-new sequence. In practice, new artificial promoters and new artificial proteins (Strokach & Kim, 2022) have been reported to be designed by different kinds of algorithms, which provides breeders with a brand-new direction for regulating Cd accumulation-related genes.

Approaches for Artificial Low-Cd Variation Implementation
Known as the most powerful gene editing tool to date, the system of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes is widely used in genetic engineering. By using Cas9 and a guide RNA (gRNA) to cut the target DNA and then introducing InDels or donor homologous segments through DNA repair mechanisms, knock-out or knock-in of artificial variations can be easily realized (Doudna & Charpentier, 2014). Genome editing has already begun to target the Cd concentration phenotype. For example, using the CRISPR/Cas9 system, Tang et al. created an OsNRAMP5 knock-out line with a grain Cd concentration of less than 0.05 mg/kg in a contaminated field without negatively influencing plant growth (Tang et al., 2017). Liu et al. mutagenized OsNRAMP5 and OsLCT1 with the CRISPR/Cas9 system and acquired two mutants, nramp5x7 and lct1x1, both carrying an artificial insertion that results in premature termination of the protein (Liu, Jiang, et al., 2019). Both rice lines showed significantly reduced Cd accumulation (67.2-96.6% lower in nramp5x7 and 45.9-63.0% lower in lct1x1) in a Cd-contaminated field, with no significant influence on growth or yield.
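The first practical step in a Cas9 knock-out experiment, choosing where the gRNA should direct the cut, can be sketched as a search for candidate target sites. The example assumes the commonly used SpCas9 enzyme, which requires a 5'-NGG-3' PAM immediately 3' of a 20-nt protospacer; this enzyme choice, the function names and the toy sequence are assumptions for illustration and are not taken from the studies cited above. Off-target and efficiency scoring, which real gRNA design tools perform, are omitted.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_targets(seq, protospacer_len=20):
    """List candidate SpCas9 sites as (strand, start, protospacer, pam).

    `start` is the 0-based position of the protospacer on the strand that was
    scanned ("+" is `seq` itself, "-" is its reverse complement).
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        # i is the first base of the 3-nt PAM; the protospacer lies just 5' of it.
        for i in range(protospacer_len, len(s) - 2):
            pam = s[i:i + 3]
            if pam[1:] == "GG":
                hits.append((strand, i - protospacer_len, s[i - protospacer_len:i], pam))
    return hits


# Toy sequence, NOT a real gene fragment.
toy = "ACGTACGTACGTACGTACGTAGGCT"
candidates = find_targets(toy)
```

Each returned protospacer would then be checked against the rest of the genome before being used as a gRNA spacer.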
Recently, many innovative gene editing tools have been developed that allow not only random knock-down or knock-out of genes but also accurate deletion, insertion and substitution of a single base in the genome. Prime editing has come into sight because it can create artificial variations including insertions, deletions and substitutions between any base pairs. Using a modified prime editing guide RNA (pegRNA), which guides the nickase to the proper editing target and serves as the template for the reverse transcriptase, prime editing can be more accurate, with fewer side effects and no need for donor DNA (Lin et al., 2020). In plant cells, Zong et al. created an engineered plant prime editor (ePPE) with two modifications to the reverse transcriptase, which provided a 5.8-fold enhancement in editing efficiency with no significant increase in byproducts or off-target effects when tested in rice (Zong et al., 2022).
Though not yet applied to Cd-related traits, the potential of accurate editing for Cd-safe production cannot be ignored. In the coming age, with a thorough understanding of the mechanisms and with technologies that edit genes accurately, gene editing is hoped to become low-cost, high-throughput and highly effective, and no longer restricted by the original genotype of cultivars. Breeders are expected to design and achieve artificial variations at will, so that Cd accumulation in grains can be down-regulated without other negative side effects on the cultivar (Figure 2).

Safety threshold of low-Cd food crop: optimizing environmental factors
In agricultural practice, crops are grown in many kinds of environments. Uncontaminated food crops may come from contaminated soil, while contaminated food crops may be grown in uncontaminated soil. To a certain extent, there are no food crop varieties that absolutely exceed the limit, only inappropriate environmental parameters. Thoroughly understanding the effects of environmental factors and accurately assessing the safety threshold are therefore needed to precisely control the Cd contamination risk in agricultural practice.

Environmental factors influencing Cd accumulation capacity in food crops
Factors including soil moisture, pH, redox potential (Eh) and organic matter all influence Cd accumulation in food crops. Under flooding conditions, the soil pH increases, Eh decreases and soil sulfur (S) may be reduced to S²⁻. Under these conditions, Cd tends to exist in less bioavailable forms such as CdOH⁺, CdHCO₃⁺, CdCO₃ or CdSO₄. The decreased solubility of soil Cd²⁺ subsequently results in lower uptake by food crops. Consistent with this decrease in bioavailable Cd, one study reported that Cd accumulation in rice was significantly influenced by water management, and the order of Cd concentration in brown rice was flooding (0.02 ± 0.02 mg/kg) < conventional (0.03 ± 0.02 mg/kg) < intermittent (0.14 ± 0.02 mg/kg) < aerobic (0.21 ± 0.02 mg/kg) (Hu et al., 2013, 2015), indicating that flooding is a feasible approach when simply reducing Cd concentration is the goal. Influenced by water conditions, soil Eh also has a dramatic impact on soil Cd bioavailability. During drainage, Cd fixed in sulfides (CdS) with lower electrochemical potential can be oxidized, dissolved and released into the soil, increasing bioavailable Cd. Both ZnS and MnS have a lower electrochemical potential than CdS, so in the soil environment ZnS and MnS can serve as anodes, preventing CdS from being dissolved into an absorbable form. A 16-30% decrease in extractable Cd and a 60-72% decrease in Cd concentration in brown rice were reported in a field test. In soil, zinc (Zn) acts as an antagonist of Cd, so applying Zn can decrease the oxidative stress caused by Cd, resulting in less electrolyte leakage, increased photosynthesis and an activated antioxidant system. Applying Zn both to the soil and to the foliage decreased the Cd concentration in grains (Hussain et al., 2021).
Similar antagonism has been observed between Mn and Cd: applying Mn together with zeolite stabilization in late rice was reported to effectively reduce grain Cd accumulation to 0.12 mg/kg, compared with 0.63 mg/kg in the control group (Liang et al., 2022).
Organic compounds in soil, measured as organic carbon (OC), organic matter (OM) or soil organic matter (SOM), are another essential environmental factor influencing soil Cd bioavailability. OM in soil can provide chelating sites for Cd fixation and raise pH, which decreases the bioavailability of Cd. Applying organic fertilizers, including biochar, straw and manure, increases both pH and soil OM, both of which facilitate Cd fixation in soil, making less Cd available to crops. In rice, wheat and maize, decreases in Cd in grains and above-ground parts and in available Cd in soil have been reported when different organic fertilizers were applied. One study on rice showed a 10-30% decrease in brown rice Cd when straw was applied together with inorganic amendments. In maize, applying chicken manure increased Cd concentration in the root but decreased Cd concentration in grains (Mwilola et al., 2020). In wheat, Bashir et al. reported that composted biochar and farmyard manure combined with ZnO nanoparticles significantly reduced Cd concentration in grains and roots (Bashir et al., 2020).

Envirotype-to-phenotype (E2P) models for Cd accumulation prediction in food crop
Inappropriate environmental parameters will aggravate the Cd contamination risk in food crops. To control this risk, knowledge of the relationship between soil parameters and Cd accumulation capacity is essential. One key approach is to develop highly accurate predictive models. To date, several prediction models relating grain Cd in food crops to soil properties such as soil total Cd, pH, SOM and clay content have been established.
Linear regression models are the most common and simplest models for predicting grain Cd concentration. In different studies, the Cd concentration in grains of rice, wheat or maize, or its log value, has been predicted from many soil factors that influence it. Using log(CaCl₂-Cd), (HNO₃-Cd), pH, log(Clay) and log(SOM) together, the accuracy (R²) could reach 66.1% (Brus et al., 2009). Multiple regression methods can help. For example, using multiple regression analysis, Römkens et al. successfully predicted Cd concentration with an accuracy of 80.8-87.3% using the model $\log(\mathrm{Cd}) = f \cdot \log(\mathrm{HNO_3\text{-}Cd}) + g \cdot \mathrm{pH} + h \cdot \log(\mathrm{CEC}) + K$, where CEC is the cation exchange capacity (Römkens et al., 2009). Using stepwise multiple regression analysis, Wang et al. separately constructed three models based on different characteristic values of Cd (total Cd, HNO₃-Cd and EDTA-Cd), with relatively high accuracy (R² = 0.765-0.857) (Wang, Su, et al., 2020).
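To make the regression form above concrete, the following sketch fits log(Cd) = K + f·log(soil Cd) + g·pH + h·log(CEC) by ordinary least squares and reports R². It is a minimal illustration under stated assumptions, not the fitting procedure of the cited studies: the function names and the tuple layout of `samples` are hypothetical, base-10 logarithms are assumed, and any data passed in would be the user's own measurements.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(soil_cd, ph, cec):
    """Predictors for one sample: intercept, log10(soil Cd), pH, log10(CEC)."""
    return [1.0, math.log10(soil_cd), ph, math.log10(cec)]


def fit_log_linear(samples):
    """samples: list of (soil_cd, ph, cec, grain_cd). Returns (K, f, g, h)."""
    rows = [design_row(cd, ph, cec) for cd, ph, cec, _ in samples]
    y = [math.log10(grain) for _, _, _, grain in samples]
    p = len(rows[0])
    # Normal equations X'X beta = X'y.
    xtx = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(p)]
    return tuple(solve(xtx, xty))


def r_squared(samples, coef):
    """Coefficient of determination of the fitted model on log10(grain Cd)."""
    y = [math.log10(grain) for _, _, _, grain in samples]
    pred = [sum(c * x for c, x in zip(coef, design_row(cd, ph, cec)))
            for cd, ph, cec, _ in samples]
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot
```

Because R² computed on the training samples is optimistic, a held-out set or cross-validation gives a fairer picture of predictive accuracy.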
In the smart age, prediction accuracy is being improved by new algorithms such as machine learning and deep learning. Using several machine learning models, including gradient boosting regression tree (GBRT), random forest (RF), fully connected neural network (FCNN) and support vector regression (SVR), Gao et al. predicted the root concentration factor of organic contaminants with higher R² (GBRT: 0.76, RF: 0.71, FCNN: 0.79, SVR: 0.68) than a traditional linear model (0.62), indicating that these models can also be used to predict complex phenotypes controlled by both environment and genotype. In another study, the machine learning models convolutional neural network (CNN) and support vector machine (SVM) showed significantly higher R² (0.82 and 0.80, respectively) and back propagation neural network (BPNN) showed a lower R² (0.62), all performing better than the linear models Bayesian linear regression (BLR, 0.52) and Bayesian ridge regression (BRR, 0.51). For grain Cd accumulation in wheat, a model combining multiple Gaussian and logistic regression with RF was built, with a reliability of 69%-82%, proving the feasibility of applying machine learning methods to Cd concentration prediction (Nguyen et al., 2021). Though not yet widely applied to grain Cd concentration, newly developed machine learning models based on environmental factors have the potential to capture precisely the relationship between environmental factors and Cd concentration in food crops and to predict that concentration accurately.

Safety threshold prediction and implementation
Safety thresholds are usually determined by the ecological and environmental effect method. Based on the transfer and transformation of Cd in the soil-cultivar system, the relationship between Cd concentration in soil and in the cultivar can be revealed by constructing E2P models. From this relationship, the soil safety threshold can be estimated on the basis of the food limit standard, which marks the critical heavy metal concentration. In rice, Mu et al. estimated the soil safety thresholds of Cd using a high-Cd accumulator (HCd) and a low-Cd accumulator (LCd). The soil Cd safety threshold for HCd was 0.27-1.00 mg/kg, significantly lower than that for LCd (4.52-46.9 mg/kg) (Mu et al., 2020). Gao et al. selected a medium-sensitive cultivar (Denong 2000) and tested it in 19 different paddy soils to obtain the safety threshold. The soil safety threshold increased as pH and soil OC increased: it was lowest (0.34 mg/kg) when pH < 5 and OC = 10 g/kg, and highest (0.94 mg/kg) when pH > 7 and OC = 30 g/kg (Gao et al., 2021). In wheat, the soil Cd safety thresholds in both alkaline soil (0.32-0.72 mg/kg, 0.19-0.44 mg/kg and 0.07-0.16 mg/kg when CEC was 5, 10 and 20 cmol/kg, respectively) and acid soil (0.45-0.75 mg/kg, 0.48-0.80 mg/kg and 0.54-0.90 mg/kg when CEC was 5, 10 and 20 cmol/kg, respectively) were all significantly lower than in maize (0.89-1.85 mg/kg), indicating an inter-specific difference in Cd accumulation characteristics between wheat and maize (Zhuang et al., 2021). Generally, taking into account the environmental parameters that significantly affect grain Cd accumulation, including pH, soil OC, CEC and soil total or extractable Cd, when estimating the soil safety threshold can help ensure the safety of agri-products grown on that soil.
By managing such soil factors appropriately, cultivars with optimal Cd concentration alongside other important properties, such as low As concentration, can be obtained (Honma et al., 2016). Soil microbes, another important but hard-to-measure factor, are also vital in grain Cd accumulation by influencing the rhizosphere bacterial community and the expression of plant genes. It is also worth noting that different safety thresholds are obtained when cultivars with different Cd accumulation characteristics are used, indicating that species and genotypic factors should also be taken into consideration.
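The threshold estimation described above can be sketched as the inversion of an E2P model. The sketch assumes a log-linear form, log10(grain Cd) = a + b·log10(soil Cd) + c·pH + d·log10(OC), which is an illustrative assumption and not the model fitted in any cited study. The coefficients in `HYPOTHETICAL_COEF` are made up, chosen only so that the threshold rises with pH and OC as reported above. The food limit of 0.2 mg/kg is the rice standard quoted earlier in this review.

```python
import math

# (a, b, c, d): made-up coefficients for illustration only, not fitted values.
HYPOTHETICAL_COEF = (1.0, 0.8, -0.2, -0.3)


def predict_grain_cd(soil_cd, ph, oc, coef=HYPOTHETICAL_COEF):
    """Forward E2P model: grain Cd (mg/kg) from soil Cd (mg/kg), pH and OC (g/kg)."""
    a, b, c, d = coef
    log_grain = a + b * math.log10(soil_cd) + c * ph + d * math.log10(oc)
    return 10 ** log_grain


def soil_cd_threshold(food_limit, ph, oc, coef=HYPOTHETICAL_COEF):
    """Invert the model: soil Cd at which predicted grain Cd equals the food limit."""
    a, b, c, d = coef
    log_soil = (math.log10(food_limit) - a - c * ph - d * math.log10(oc)) / b
    return 10 ** log_soil


RICE_LIMIT = 0.2  # mg/kg, rice grain Cd limit quoted in this review

for ph, oc in [(5.0, 10.0), (6.0, 20.0), (7.0, 30.0)]:
    threshold = soil_cd_threshold(RICE_LIMIT, ph, oc)
    print(f"pH {ph}, OC {oc} g/kg: soil Cd threshold = {threshold:.2f} mg/kg")
```

Fitting the coefficients separately for high-Cd and low-Cd accumulating cultivars would give cultivar-specific thresholds, in line with the genotype dependence noted above.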
It is also vital to keep environmental factors stable in agricultural production. With the smart age comes the concept of smart farming, which means precisely monitoring, analyzing and adjusting environmental factors to improve the quality and quantity of agricultural products by using an information system that combines the Internet of Things (IoT), AI, cloud computing, edge computing, big data and so on. One case in Japan using the smart model 'NoshoNavi1000' has demonstrated the feasibility of a smart farming system: with automated and smart agricultural devices, the scale of agricultural production increased (in farm H in Shiga Prefecture, the optimal planting area increased by 3 ha and 8 ha), labor intensity decreased and economic returns were maximized (a 50% decrease in seedling cost in farm B and a 5% increase in yield in farm Y). Although such a system is just getting started and has not yet been applied in low-Cd agricultural production, this can be achieved gradually with the development of computer technologies and smart devices. For example, sensors can be applied to monitor soil properties, including soil moisture and nutrition, which have significant impacts on Cd accumulation in grains. The big data collected directly from the field can then be transferred to data processors, in which choices and plans can be made according to soil Cd safety threshold parameters. The automated adjusters then adjust and control accordingly (Figure 3). To make the environment suitable for low-Cd production, prompt and robust data acquisition, accurate calculation and prediction, and precise and timely regulation are urgently needed, bringing new opportunities and challenges. We believe that in the near future, with the development of IoT and smart devices, smart farming will eventually be applied in low-Cd agricultural production.
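The sensing-analysis-control chain described above might be sketched as a simple decision rule. Everything here is hypothetical: `SoilReading`, `placeholder_threshold`, `decide` and the action labels are illustrative names, and the threshold numbers are placeholders that are not taken from any cited study. A real system would plug in a fitted soil Cd safety threshold model and drive actual field devices.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SoilReading:
    """One sensor snapshot from a field plot (hypothetical schema)."""
    ph: float
    oc_g_per_kg: float
    soil_cd_mg_per_kg: float


def placeholder_threshold(ph: float, oc_g_per_kg: float) -> float:
    """Made-up soil Cd safety threshold (mg/kg) that rises with pH and OC."""
    return 0.3 + 0.1 * max(ph - 5.0, 0.0) + 0.01 * max(oc_g_per_kg - 10.0, 0.0)


def decide(reading: SoilReading,
           threshold_fn: Callable[[float, float], float],
           warn_fraction: float = 0.8) -> str:
    """Compare measured soil Cd with the threshold for the current soil state."""
    threshold = threshold_fn(reading.ph, reading.oc_g_per_kg)
    if reading.soil_cd_mg_per_kg >= threshold:
        return "intervene"   # e.g. hand over to an automated adjuster
    if reading.soil_cd_mg_per_kg >= warn_fraction * threshold:
        return "warn"        # close to the threshold: monitor more often
    return "hold"            # within the safe range: no action


readings = [
    SoilReading(ph=5.0, oc_g_per_kg=10.0, soil_cd_mg_per_kg=0.35),
    SoilReading(ph=5.0, oc_g_per_kg=10.0, soil_cd_mg_per_kg=0.25),
    SoilReading(ph=7.0, oc_g_per_kg=30.0, soil_cd_mg_per_kg=0.25),
]
for r in readings:
    print(r, "->", decide(r, placeholder_threshold))
```

Passing the threshold model in as a function keeps the decision rule independent of how the threshold is estimated, so a cultivar-specific or soil-specific model can be swapped in without changing the control logic.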

Conclusions
Cadmium contamination risk in food crops is receiving attention because of its non-negligible and irreversible hazard to human health. 'G + E' dual-engines will be the major driving force of Cd risk control in crops in the future. On the one hand, considering genotypic factors, the meta-analysis revealed a significant difference in Cd accumulation characteristics among the main food crops: rice and wheat accumulate more than maize. By choosing crops with low Cd accumulation capacity, maize for example, Cd accumulation in grains can be minimized as far as possible in arable land with high Cd contamination risk. Considering the difference revealed by the meta-analysis, smart creation can be applied in high-Cd accumulating crops using multidisciplinary smart hybridization and design. By excavating and pyramiding superior alleles from natural populations or designing and implementing artificial variations, more low-Cd superior germplasm could be created and applied in low-Cd agricultural production, producing Cd-free food crops for mankind. On the other hand, considering environmental factors, by decoding the relationship between phenotype and envirotype, ascertaining the safety production threshold, and combining big data analysis, environmental factor monitoring, smart decision support, automatic control and future planning, agricultural production can be kept within the safety production threshold with the envirotype taken into account.
Figure 3. The concept of food crop Cd contamination risk control utilizing smart analysis-sensing-control system combining soil safety threshold data.
Now that the smart age has come, Cd contamination risk control in the food crop revolution has been put on the agenda. Future Cd pollution risk control will be a big data-driven, multidisciplinary integrated solution. In some fields, such as GS or G2P models, preliminary progress has been made, but technical barriers still exist in most fields.
Forward-looking top-level designs, and their combination with downstream industrial chains for smart risk control, are urgently needed. To protect food security and improve germplasm competitiveness, research institutions and enterprises should be encouraged to find their positions and achieve complementary advantages in solving the scientific theory problems and engineering technique problems of smart control, thus facilitating the leap from experience-driven to 'G + E'-driven engines in the smart age.