Deep learning of dispersion engineering in two-dimensional phononic crystals

To control wave propagation in phononic crystals (PnCs), it is crucial to perform the inverse design of dispersion engineering. In this article, a robust deep-learning method of dispersion engineering in two-dimensional (2D) PnCs is developed by combining deep neural networks (DNNs) with the genetic algorithm (GA), which can be easily extended to reach any target in the trained DNNs' calculation domain. A high-precision and robust DNN model to predict the bounds of the energy bands of 2D PnCs is proposed, forming the forward prediction process. This DNN model shows high efficiency on the testing structures while keeping the mean relative error near 0.1%. The inverse design of PnCs is implemented by DNNs combined with the GA, building the back-forward retrieval process, which can exactly produce the desired PnCs with the expected bandgap bounds in only a few seconds. The proposed framework is promising for constructing arbitrary PnCs on demand.


Introduction
Phononic crystals (PnCs) are a type of functional medium formed by elastic solids or fluids, periodically arranged in another solid or fluid medium (Kushwaha et al. 1993). They have acoustic/elastic wave characteristics that do not exist or are difficult to realize in natural materials, showing promising applications in the fields of information, communication, mechanics and medical engineering (Pennec et al. 2010). For arbitrary control of wave propagation, the inverse design of PnCs has become an indispensable way to realize wave properties on demand (Vasseur et al. 2011).
In general, several strategies can be adopted to design PnCs with diverse functionalities, such as acoustic filters, resonators, sources and lenses, acoustic signal processing and ultrasound imaging (Qiu et al. 2005; Jing et al. 2014; Kanno et al. 2014; Sukhovich, Jing, and Page 2008). To overcome the limitations of intuitive and empirical designs, topology optimization, as a classic inverse design method, shows great ability in the construction of many new high-performance PnCs, phononic topological insulators and other metamaterials (Zhang, Takezawa, and Kang 2019; Zhang et al. 2017; Li et al. 2019; Li et al. 2018; Dong, Wang, et al. 2017; Nanthakumar et al. 2019). The key to realizing the fast and even real-time inverse design of expected PnCs with requested functionalities is to establish a highly efficient mapping from various structures to their wave characteristics.
Deep learning (DL) is part of the family of machine learning (ML) methods based on artificial neural networks (ANNs) (Deng and Yu 2013; Goodfellow, Bengio, and Courville 2016; Schmidhuber 2015). The earliest study related to ANNs dates back to the 1940s (McCulloch and Pitts 1943). Since then, many DL architectures have been presented by researchers (Goodfellow, Bengio, and Courville 2016). The deep neural network (DNN) is one of the most used DL architectures in pattern recognition, computer vision, material design, bioengineering, etc. (Schmidhuber 2015). Compared with the multi-layer perceptron (MLP) (Haykin 1994) and the extreme learning machine (ELM) (Huang et al. 2012), DNNs usually have a larger scale and can predict more complex problems (Deng and Yu 2013; Goodfellow, Bengio, and Courville 2016).
There have been many applications of DNNs or ANNs in artificial material design, such as predicting the gain and noise of photonic crystal fibre amplifiers (Zibar, Wymeersch, and Lyubomirsky 2017), computing optical properties (da Silva Ferreira, Malheiros-Silveira, and Hernández-Figueroa 2018) and predicting the crystallite size and the energy bandgap of ZnO quantum dots (Pelicano et al. 2017). Tahersima et al. (2018) realized the inverse design of integrated nanophotonic devices using DNNs. Li et al. (2020) developed an inverse design of PnCs with anticipated bandgaps through a DL-based data-driven method. Luo et al. (2020) discussed the interactive inverse design of layered PnCs based on reinforcement learning. Most of those previously developed ANNs establish prediction models for parameter-defined structures, such as a single-parameter-defined photonic crystal, and output some mechanical property. These ML methods are similar to simple mathematical parametric optimization processes. They do not have enough flexibility to design structures for complex requirements.
In this article, an optimization framework with two processes for dispersion engineering of PnCs is proposed. The first process, i.e. the forward process or the prediction method, containing dual DNN methods (the simple deep neural network (S-DNN) model and the complex deep neural network (C-DNN) model), is developed in Section 2.3. A high-precision and highly robust DNN can improve the accuracy and performance of the later genetic algorithm (GA) inverse design process. The basic theory of DNNs and the details, accuracy and robustness of the predictive model are discussed. The second process, i.e. the back-forward process or the inverse design method, containing a genetic optimization algorithm, is proposed in Section 2.4. Three corresponding inverse design demonstrations and an evolution curve are given in Section 3.2 to prove the feasibility of this GA-based inverse design method. The computational performance is given in Section 3.3 to show the potential of the GA based on DNNs.
These two processes are combined to achieve the inverse design and to overcome the following difficulties: (1) finite element method (FEM)-based optimization frameworks are extremely time consuming; (2) only forward prediction can be made in an MLP or ELM framework; and (3) the optimization domain is too small in models using ML only.
In the forward process, the dual DNN methods are proposed to predict the upper and lower edges of the energy bands of two-dimensional (2D) PnCs with multiple structural parameters. These prediction tasks are beyond what curve fitting and traditional mathematical methods can achieve. In the back-forward process, the inverse design of a PnC is performed by optimizing its parameters through a combination of the GA and the DNNs. Three optimization cases are presented, including maximizing the third order bandgap, maximizing the fifth order bandgap, and maximizing the third and fifth order bandgaps simultaneously. After a satisfactory ANN model has been trained once, any other optimization target can be obtained without retraining. The proposed inverse design methodology can be easily extended to design PnCs with other required properties.

Method
The basic theory of 2D PnCs for in-plane wave motion, the basic theory of neural models, and the S-DNN and C-DNN models are introduced in this section, along with the PnC structures.

Wave motions of PnCs
This subsection will discuss the fundamental wave equation and FEM of a square-latticed 2D PnC formed by two different isotropic elastic solid materials.
The linear harmonic wave equation for the heterogeneous elastic medium is

ρω²u = −∇(λ∇·u) − ∇·[μ(∇u + (∇u)ᵀ)],  (1)

where λ and μ are the Lamé constants; ρ is the mass density; u is the displacement vector; ω is the angular frequency; and ∇ is the del operator. For the 2D problem of elastic waves propagating in the x-y plane, the elastic wave fields are independent of the z-axis. Then, the in-plane and out-of-plane modes of elastic waves can be described by

ρ(r)ω²u_i = −∂_i[λ(r)∂_l u_l] − ∂_j[μ(r)(∂_j u_i + ∂_i u_j)],  (i, j, l = x, y)  (2)

and

ρ(r)ω²u_z = −∂_j[μ(r)∂_j u_z],  (j = x, y)  (3)

respectively, where summation over repeated indices is implied and the vector r = (x, y) is the position. According to Bloch's theorem, the displacement vector u can be represented as

u(r) = exp(ik·r) u_k(r),  (4)

where i = √−1; u_k(r) is a periodic function of r with the same periodicity as the structure; and k = (k_x, k_y) is the Bloch wave vector.
For the numerical solution of Equations (2) and (3) by the FEM, the discrete form of the eigenvalue equation can be written as

(K − ω²M)U = 0,  (5)

where U is the column vector formed by the displacements at all the element nodes in the computational area (i.e. the unit cell); and K and M are the stiffness and mass matrices of the whole discrete system, respectively (Wang, Wang, and Su 2011). K and M can be represented as

K = ∫_{V_e} BᵀDB dV  (6)

and

M = ∫_{V_e} ρNᵀN dV,  (7)

where B is the strain matrix; D is the elasticity matrix; N is the shape function matrix; and V_e represents the entire lattice region. The displacement matrix of the lattice can be represented as

U = [U_1, U_2, …, U_n]ᵀ,  (8)

where the nodal displacement U_i of the ith node is

U_i = (u_i, v_i),  (9)

with u_i and v_i the x- and y-displacements. According to Bloch's theorem (Equation (4)), the following relationship should be satisfied at the outer boundary of the unit cell:

U(r + a) = exp(ik·a) U(r),  (10)

where a is the lattice constant vector. The eigenfrequencies can be solved from Equations (5) and (10) for a given wave vector k. By substituting an eigenfrequency into Equation (5), the eigenmode U(r) corresponding to that eigenfrequency can be obtained. The energy band structure is obtained by letting the wave vector k sweep the irreducible Brillouin zone. This type of method is also known as the ω(k) method. To determine the bandgap, the boundary of the irreducible Brillouin zone should be swept (Li et al. 2018).
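To make the ω(k) method concrete, the following Python sketch solves the generalized eigenvalue problem of Equation (5) along a path of Bloch wave vectors. It assumes a helper assemble_K(k) that returns the Hermitian stiffness matrix with the Bloch condition of Equation (10) already applied (e.g. assembled by an FEM layer); this helper and the array shapes are assumptions of this sketch, not the article's implementation.

```python
import numpy as np
from scipy.linalg import eigh

def band_structure(assemble_K, M, k_path, n_bands=6):
    """Sweep k along the irreducible Brillouin-zone boundary and solve
    (K(k) - w^2 M) U = 0 (Equation (5)) for the lowest eigenfrequencies."""
    bands = []
    for k in k_path:
        K = assemble_K(k)                  # Hermitian, Bloch BC of Eq. (10) applied
        # Generalized eigenproblem; returns the n_bands smallest w^2, ascending
        w2 = eigh(K, M, eigvals_only=True, subset_by_index=[0, n_bands - 1])
        bands.append(np.sqrt(np.abs(w2)))  # eigenfrequencies (abs() guards round-off)
    return np.array(bands)                 # shape: (len(k_path), n_bands)
```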
In this article, the FEM software COMSOL 5.4a is used to solve the band structures of the PnCs. After obtaining the band structure with the FEM, the relative bandgap width (RBW) is calculated to evaluate the quality of the bandgap. The RBW between the nth and (n + 1)th bands is simply calculated as

RBW_n = Δω_n / ω_c^n,  (11)

where Δω_n is the absolute bandgap width between the nth and (n + 1)th bands; and ω_c^n is the midfrequency of the corresponding bandgap. If Δω_n > 0, the structure has a bandgap between the nth and (n + 1)th bands.
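As a small worked example of Equation (11), the helper below computes the RBW from an array of eigenfrequencies sampled along the Brillouin-zone boundary (such as the output of the sketch above); the array layout is an assumption of this sketch.

```python
import numpy as np

def relative_bandgap_width(bands, n):
    """RBW between the nth and (n+1)th bands, Equation (11); n is 1-based.

    bands: array of shape (num_k_points, num_bands) of eigenfrequencies.
    """
    top_of_n = bands[:, n - 1].max()          # upper edge of the nth band
    bottom_of_next = bands[:, n].min()        # lower edge of the (n+1)th band
    gap = bottom_of_next - top_of_n           # absolute bandgap width
    mid = 0.5 * (bottom_of_next + top_of_n)   # midfrequency of the gap
    return gap / mid                          # positive value means a bandgap exists
```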

ANN and DNN
ANNs are computing systems inspired by the biological neural networks constituting animal brains (Hassoun 1995). An ANN system tries to perform tasks without task-specific programming; this process, driven by data and examples, is called 'learning'. Through the learning process, the model can extract latent relationships from the abstract training samples, which are difficult for a human to find (Ghahramani 2015). As types of universal learning machines (Hornik 1991), the MLP (Hornik 1991), ELM, support vector machines and DNNs can be trained with a sufficient amount of appropriately labelled data if the model is suitable for the task. In the present prediction model, the model input is a set of parameters describing a PnC and the target output is the upper and lower energy band edges of that PnC. The input data and output data are scaled to a similar magnitude, with the output (frequency of the band edges) scaled to one ten-thousandth.
Similarly to the MLP and ELM, DNNs realize part of their approximation capability through their multi-layer feed-forward architecture (Hornik 1991; Schmidt, Kraaijveld, and Duin 1992). Cybenko (1989) gave the first proof of the classic universal approximation theorem for sigmoid activation functions. Compared with the other two methods, which have fewer layers, the universal approximation theorem for DNNs allows the depth to grow much larger (Deng and Yu 2013). A DNN with a rectified linear unit activation function can approximate any Lebesgue integrable function if the width of the network is strictly larger than the input dimension (Lu et al. 2017).
The PnC structure can be designed by training a prediction process, as shown in Figure 1, and an inverse design process that combines a GA with the prediction method. The DNNs replace the time-consuming FEM in the fitness estimation process of a traditional GA. The FEM is only applied in the subsequent accuracy validation process in the present PnC design method. A structure designed by the DNN, its band structure and the transmission spectrum are given in Figure 1(c).
The input data can be described as

T = {(a_i, l_i)},  i = 1, 2, …, N,  (12)

where N represents the capacity of the data set; and a_i and l_i are the input and the output for the ith sample. N_train = 5800 for the training set and N_test = 533 for the test set. For every sample, a is a nine-dimensional vector and can be described as

a_i = (a_i^1, a_i^2, …, a_i^9),  (13)

where a_i^m represents the mth input parameter of the ith sample (with m = 1, 2, 3, …, 9 for the present study). For every sample, the DNN output l is a four-dimensional vector and can be described as

l_i = (l_i^1, l_i^2, l_i^3, l_i^4),  (14)

where l_i^1, l_i^2, l_i^3 and l_i^4 are the four outputs of the DNN corresponding to the upper edge of the third order band, the lower edge of the fourth order band, the upper edge of the fifth order band and the lower edge of the sixth order band.
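The data layout of Equations (12)-(14), including the one-ten-thousandth output scaling mentioned above, can be sketched as follows; the helper name and the dtype choices are illustrative assumptions.

```python
import numpy as np

OUTPUT_SCALE = 1e-4   # band-edge frequencies scaled to one ten-thousandth

def make_sample(a, band_edges):
    """Pack one (a_i, l_i) training pair of Equations (12)-(14).

    a          : the nine geometric parameters of Figure 2(a)
    band_edges : [upper edge of band 3, lower edge of band 4,
                  upper edge of band 5, lower edge of band 6] from the FEM
    """
    a = np.asarray(a, dtype=float)
    l = np.asarray(band_edges, dtype=float) * OUTPUT_SCALE
    assert a.shape == (9,) and l.shape == (4,)
    return a, l

# Data set T = {(a_i, l_i)}: N_train = 5800 pairs, N_test = 533 pairs.
```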
In the training program, the back-propagation gradient algorithm is used to update the weighting factors θ and deviations (Johansson, Dowla, and Goodman 1991; Hansen and Salamon 1990). The improved Adam optimizer is used to complete the learning. The updating process can be expressed as

θ(t + 1) = θ(t) − α_t ∇J(θ(t)),  (15)

where t is the time step; θ(t) is the set of weighting factors at time step t; α_t is the learning rate at time step t and is a variable controlled by the Adam iterator; and ∇J(θ(t)) is the gradient of the cost function

J(θ(t)) = MSE(t) + (β/4N)‖θ(t)‖²,  (16)

where ‖·‖ is the Euclidean distance; MSE(t) = (1/N) Σ_{i=1}^{N} ‖l_i − y_i^o‖² is the neural network's mean square error (MSE) at time step t; N is the sample size of the training set or test set; l_i and y_i^o correspond to the real value and the output value, respectively; i represents the ith output sample; and β is the regularization coefficient discussed below.
The gradient vectors can be determined by the back-propagation algorithm (Hornik 1991; Johansson, Dowla, and Goodman 1991; Schmidt, Kraaijveld, and Duin 1992), and the back-propagation gradient descent algorithm is used for the optimization in this article. The back-propagation algorithm measures the effect of each parameter θ_j on the current cost function J(θ(t)), especially on MSE(t). The regularization term does not participate in the back-propagated gradient estimation during training iterations (Svergun 1992), so the gradient of MSE(t) can be derived as

∂MSE(t)/∂θ_j = (2/N) Σ_{i=1}^{N} δ_i^o x_i,  (17)

where δ_i^o is the sensitivity term of the output-layer neurons and x_i is the input acting on θ_j for the ith sample. The following derivation describes how the regularization coefficient β is added to the gradient descent process (Svergun 1992).
The process of updating the parameters θ involves finding the optimal solution of the cost function J by the gradient descent algorithm, which can be described as

θ_j ← θ_j − α ∂J(θ(t))/∂θ_j,  (18)

which can be expanded to

θ_j ← (1 − αβ/2N) θ_j − (2α/N) Σ_{i=1}^{N} δ_i^o x_i,  (19)

where θ_j is the jth parameter of θ; α is the learning rate; and β is the regularization coefficient. Equation (19) shows that the effect of the regularization coefficient is to multiply every θ_j by a constant (1 − αβ/2N) at each update. The constant is slightly smaller than 1, depending on the learning rate, the total sample size and the regularization coefficient.
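A minimal PyTorch sketch of one update of Equations (15)-(19) is given below. The learning rate and weight-decay values are illustrative assumptions; note that Adam's weight_decay argument adds the βθ shrinkage term of Equation (19) to the gradient, but this is then rescaled by Adam's adaptive step, so the correspondence to the constant (1 − αβ/2N) is only approximate.

```python
import torch

model = torch.nn.Sequential(                  # stand-in network (9 -> 4)
    torch.nn.Linear(9, 1000), torch.nn.Sigmoid(), torch.nn.Linear(1000, 4))
loss_fn = torch.nn.MSELoss()                  # MSE(t) of Equation (16)
# weight_decay plays the role of the regularization coefficient beta
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-6)

def train_step(a_batch, l_batch):
    """One update of Equation (15): theta(t+1) = theta(t) - alpha_t * grad J."""
    optimizer.zero_grad()
    loss = loss_fn(model(a_batch), l_batch)   # forward pass and cost
    loss.backward()                           # back-propagated gradients, Eq. (17)
    optimizer.step()                          # Adam-controlled step size alpha_t
    return loss.item()
```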

Establishment of C-DNN and S-DNN
The DNN prediction model is established and the architectures of the S-DNN and C-DNN are introduced in this subsection. The training process for the two models can be described as follows:
Step 1: Use the finite element software to generate raw data.
Step 2: Use MATLAB ® to extract the raw data for neural network inputs.
Step 3: Train the DNN model.
Step 4: Evaluate the model. If the test passes, go to Step 5; if the test fails (the precision cannot reach the demand of the present method), return to Step 3 and then adjust and retrain the network.
Step 5: Solidify the model for use in the back-forward retrieval model (a minimal sketch of Steps 3-5 is given below).
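The Steps 3-5 loop can be sketched compactly as follows; the callables fit, evaluate and adjust are hypothetical hooks standing in for the training epochs, the test-set evaluation and the network adjustment, whose internals the text does not fix.

```python
import torch

def train_until_passing(model, fit, evaluate, adjust,
                        target_rel_err=0.003, max_rounds=10,
                        path="cdnn_frozen.pt"):
    """Steps 3-5: train, test, adjust and retrain until the precision demand
    is met, then solidify (freeze) the model for back-forward retrieval."""
    for _ in range(max_rounds):
        fit(model)                                 # Step 3: train the DNN
        if evaluate(model) <= target_rel_err:      # Step 4: test passes?
            torch.save(model.state_dict(), path)   # Step 5: solidify
            return model
        adjust(model)                              # Step 4 fails: adjust and retrain
    raise RuntimeError("precision demand not reached")
```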
The target of the model is to predict simultaneously the upper edge of the third order band, the lower edge of the fourth order band, the upper edge of the fifth order band and the lower edge of the sixth order band of a 2D square-latticed PnC. The four outputs can also be read as the widths and positions of the third order and fifth order complete bandgaps. These two bandgaps are chosen because they are more flexible and controllable in this topology. If there is a design requirement for a directional bandgap, the frequency at a specified wave vector point can also be predicted by training several networks.
The PnC topology with three freely movable circles (defined by nine parameters) and a fixed circle in the centre is introduced for the neural network (Figure 2a). A similar topology was also used by da Silva Ferreira, Malheiros-Silveira, and Hernández-Figueroa (2017) when studying deep-learning-based optimization of a photonic crystal with a wide bandgap. A topology with more geometric parameters would be more flexible for designing PnCs with wide bandgaps. However, a neural network with more input variables would require larger training data sets and unaffordable computational resources. On the other hand, a topology with fewer parameters (e.g. two movable circles with six parameters) cannot provide enough design space. The presented topology fits the purpose of this article (i.e. to demonstrate the applicability of DNNs in the bandgap design of PnCs).
This article develops two kinds of DNN: a simple one called the S-DNN and a more complex one called the C-DNN, which achieves greater accuracy. The S-DNN model (Figure 2b) contains seven hidden layers plus an input layer and an output layer. The input variables are described in Figure 2(a). Each hidden layer contains 500-1200 neurons and can use a variety of activation functions, including the softplus, rectified linear unit (ReLU), exponential linear unit and sigmoid activation functions, to accelerate convergence and achieve nonlinear fitting. In the reference configuration, the network takes nine neurons as inputs and four neurons as outputs, with seven hidden layers of 1000 neurons each (all linked with sigmoid activation functions). The weight and bias matrices are initialized from a normal distribution. To reduce the zero drift and solve the overfitting problem, a C-DNN model with four branches is developed. The branches of the C-DNN model are similar to the S-DNN model, and the weighting factors w of these four branches are not shared but disconnected and updated independently. The structure of the C-DNN model is shown in Figure 2(c).
A complete C-DNN model combines four S-DNN branches. Each branch has the same structure and scale as the independent S-DNN model. The four branches of the model are trained separately. When a branch achieves the target precision requirement, it is frozen. When all four branches have been frozen, the C-DNN model is considered well trained. In this way, a lower systematic error of the neural network, lower output-specific bias and lower mean square error can be achieved. The C-DNN model is thus the integration of four split S-DNN models. This four-part model needs to be controlled separately in the training process, and the weighting factors w need to be optimized continuously by oscillation to achieve better convergence. The primary source of bias in the S-DNN model is the deviation of the training sets, rather than any lack of prediction ability of the DNN. Therefore, the C-DNN model with S-DNN branches can reduce the deviation of the training sets better than 4 × S-DNN, while consuming fewer computing resources than 4 × S-DNN.
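A PyTorch sketch of the two architectures follows, under two assumptions that the text leaves open: every hidden layer is taken as 1000 sigmoid-activated neurons (the reference configuration above), and each C-DNN branch is taken to predict one of the four band edges.

```python
import torch
import torch.nn as nn

def make_branch(n_out, width=1000, depth=7):
    """Feed-forward stack: 9 inputs -> `depth` sigmoid hidden layers -> n_out."""
    layers, n_in = [], 9
    for _ in range(depth):
        layers += [nn.Linear(n_in, width), nn.Sigmoid()]
        n_in = width
    layers.append(nn.Linear(n_in, n_out))
    return nn.Sequential(*layers)

s_dnn = make_branch(n_out=4)   # S-DNN: one network emits all four band edges

class CDNN(nn.Module):
    """Four S-DNN-scale branches with unshared weights; each branch is
    trained and frozen independently, one band edge per branch (assumed)."""
    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList([make_branch(n_out=1) for _ in range(4)])

    def forward(self, a):                    # a: (batch, 9) structure parameters
        return torch.cat([b(a) for b in self.branches], dim=-1)  # (batch, 4)
```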

Back-forward retrieval model
This section develops the back-forward process for the inverse design of a 2D PnC using the GA combined with the more accurate C-DNN method. The design of PnCs has excellent implications for wave control and wave manipulation (Khelif et al. 2006). In practice, the goal is to design a PnC component with the optimal characteristics to satisfy a complex environment. In recent years, as a systematic, target-oriented design method, structural optimization design (Guo, Wang, and Han 2010) has become irreplaceable in mechanics, astronomy, agriculture, ocean science, etc. Researchers are gradually becoming more aware of the limitations of classical discretization methods such as the FEM, and want to achieve inverse design without them. Anitescu et al. (2019) solved forward as well as inverse partial differential equation problems using a DNN and an adaptive collocation strategy.
The GA is an evolutionary algorithm that realizes structural optimization design and has been widely used in PnC optimization (Zhang, Takezawa, and Kang 2019; Li et al. 2019). The GA does not require the continuous differentiability of objective functions, and the definition domain can be assumed arbitrarily. These characteristics have made the GA a widely used optimization algorithm. However, the direct use of neural networks for inverse design encounters difficulties such as non-unique solutions; that is, the same or similar physical properties may correspond to several different structures. In this case, the cost function of the DNN will struggle to converge, as the optimal solution swings between distinct designs, and the DNN will be hard to train. Some tricks or networks can avoid this difficulty, such as the adversarial auto-encoder (Fan et al. 2020). However, generative adversarial networks lead to tremendous resource consumption and are more challenging to train because of their recurrent process. Therefore, a direct inverse-design DNN is hard to train for PnC dispersion engineering, and so the present article combines the GA with the DNN. In this way, the network can be easily trained and put into use, and the inverse design can be realized.
The optimization problem of the present DNN-based method can be described as

find a* = arg max_{a ∈ D} f(a),  (20)

where D is the computational domain of the designed structure's properties (the properties can be characterized by the geometric parameters, material parameters and boundary constraints); and f is the objective function of the optimization problem. The basic theory and derivation are given in Supplementary Note 6. In Section 2.3, two high-precision DNN models were established, of which the C-DNN achieves higher precision on most estimation indices. In this subsection, the inverse design of a PnC with a given bandgap is organized as follows:
Step 1. Set the prediction aim, i.e. the upper and lower edges of the given bandgap, set the other iteration parameters and create an initial population.
Step 2. Estimate the fitness of the GA population with the C-DNN model.
Step 3. Choose individuals by a roulette method. The probability of each individual is related to its fitness.
Step 4. Crossover the population and execute the genetic operation.
Step 5. Stop the program if the precision is satisfied or the script reaches the maximum iteration number; otherwise, repeat the process from Step 2. A minimal sketch of this loop is given below.
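The loop can be sketched in Python with the GA settings reported later (Np = 500, En = 2000, Pc = 0.4, Pm = 0.3) as defaults; the single-point crossover, the gene-wise mutation and the requirement that fitness return non-negative scores (e.g. a shifted C-DNN bandgap width) are simplifying assumptions of this sketch.

```python
import numpy as np

def ga_inverse_design(fitness, lo, hi, pop=500, gens=2000, pc=0.4, pm=0.3,
                      seed=0):
    """Steps 1-5 of the GA; `fitness` maps a (pop, 9) array of candidate
    parameter vectors to non-negative scores via the frozen C-DNN model."""
    rng = np.random.default_rng(seed)
    P = rng.uniform(lo, hi, size=(pop, 9))        # Step 1: initial population
    for _ in range(gens):                         # Step 5: max iteration count
        f = fitness(P)                            # Step 2: C-DNN-based fitness
        parents = P[rng.choice(pop, size=pop, p=f / f.sum())]  # Step 3: roulette
        children = parents.copy()
        for i in range(0, pop - 1, 2):            # Step 4: single-point crossover
            if rng.random() < pc:
                cut = rng.integers(1, 9)
                children[i, cut:] = parents[i + 1, cut:]
                children[i + 1, cut:] = parents[i, cut:]
        mutants = rng.random(children.shape) < pm  # gene-wise mutation
        children[mutants] = rng.uniform(lo, hi, size=int(mutants.sum()))
        P = children
    return P[np.argmax(fitness(P))]               # best seed found
```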

DNN prediction results
This section presents several specific predictive structures, which are selected as typical examples reflecting the maximum error and minimum error. These specific examples demonstrate the bias tendency and distribution of the S-DNN.
In some theories, the selection of the training set can affect the training results. Unbalanced data sets can also affect the results: the model can extract and learn fewer features in a sparse area of the parameter space, but more features in a dense area. A chance index ϑ is proposed to reflect the density of the data-set distribution and to evaluate whether the data-set distribution is related to the training results. The index ϑ can be presented as

ϑ = Σ_i bias_{x_i},  (21)

where bias_{x_i} is the bias of each coordinate x_i from its centre x̄_i, with the sum running over the coordinates x_1, y_1, x_2, y_2, x_3 and y_3. The absolute value of the centre x̄_i for x_i ∈ {x_1, y_1, x_2, y_2, x_3} is 0.25a, and the centre ȳ_3 is 0. It is worth noting that ϑ is only a factor measuring the chance of seed occurrence, not a probability (in fact, the integral of ϑ over the domain of definition is not equal to 1). Seeds with the same ϑ occur with equal probability, and the value of ϑ increases linearly with the probability. Three examples with relatively large errors among all samples are selected at random; this larger error group is shown in Figure 3(a)-(c). Three examples with relatively small errors among all samples are also selected at random; this smaller error group is shown in Figure 3(d)-(f). The smaller error group contains samples that have test errors of less than 0.05% in C-DNN training.
It can be seen from Figure 3 that if a structure is related to a larger chance index ϑ, it usually has a larger error. This relationship is consistent with the distribution of the data and the features learned by the DNN. Thus, if the target structure is expected to lie near a particular area, the data-generation script should supply more structures near this region to avoid false-positive results. The performance of the DNN is estimated in Section S.2.4 in the Supplementary material; the DNN shows a significant advantage in specific prediction tasks.

Back-forward retrieval model results
Four specific demonstrations are employed to estimate the performance of the GA based on DNNs. Three demonstrations are targeted at maximizing a specific bandgap (described in Sections 3.2.1-3.2.3); and one demonstration is targeted at designing an arbitrary bandgap (described in Section S.4.1 in the Supplementary material). These four demonstrations are based on a pretrained C-DNN model (as estimated in Section S.2.2 in the Supplementary material) to obtain higher precision. The parameters of the GA are selected as: the population size Np = 500, the maximum number of generations En = 2000, the crossover probability Pc = 0.4 and the mutation probability Pm = 0.3.

Demonstration 1: third order bandgap design
For the third order bandgap, the script is set to find the maximum width of the bandgap (i.e. the GA target). The program achieves good convergence after 200 iterations, which costs about 8 s. For the output seed of the 2000th iteration, the FEM results are close to the C-DNN results. The resulting structure and its first five energy bands are given in Figure 4(a). The natural logarithmic function is used to plot the distribution density, which can be formulated as num_Figure = ln(num + 1).

Demonstration 2: fifth order bandgap design
When the fifth order bandgap is maximized by the GA script, the band edges predicted for the output seed of the 2000th iteration are close to the FEM-verified results

ŷ_5 = [3.6370, 5.6573, 6.1265, 7.3799] × 10⁴.

The resulting structure and its first five energy bands are given in Figure 4(b).

Demonstration 3: two-target design
If the GA target is set to maximize the third order and fifth order bandgaps simultaneously, the script output is

a* = [0.1991, 0.1578, 0.1531, −0.2717, −0.2598, −0.2963, 0.2265, 0.2537, −0.1345].

The maximum band edge positions calculated by the GA with the C-DNN are

y* = [3.5751, 5.8441, 6.4291, 7.5218] × 10⁴.

The band edge positions verified by the FEM are close to the C-DNN results. They can be presented as

ŷ_{3−5} = [3.5801, 5.8455, 6.4250, 7.5213] × 10⁴.

The resulting structure and its first five energy bands are given in Figure 4(c). A fitness function can be developed as

F = 1 − (1/n) Σ_{i=1}^{n} |ŷ_i − y*_i| / Δ_i,  (22)

where n = 4 is the output dimension; ŷ_i and y*_i are the FEM-verified and C-DNN outputs, respectively, corresponding to the ith output; and Δ_i is the bandgap width defined by Δ_1 = Δ_2 = ŷ_2 − ŷ_1 and Δ_3 = Δ_4 = ŷ_4 − ŷ_3. The fitness for the third order bandgap optimization is F_3 = 99.72%; the fitness for the fifth order bandgap optimization is F_5 = 99.68%; and the fitness for the two-target design is F_{3−5} = 99.82%. In the above three back-forward retrieval design demonstrations, the four results (the upper edge of the third order band, the lower edge of the fourth order band, the upper edge of the fifth order band and the lower edge of the sixth order band) have been output precisely. This shows that the GA can find the widest bandgap exactly and that the DNN model can serve precisely as an auxiliary of the GA.
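Equation (22) can be checked directly: the small function below reproduces the reported two-target fitness F_{3−5} = 99.82% from the band edge values above (the function name is illustrative).

```python
import numpy as np

def retrieval_fitness(y_fem, y_dnn):
    """Fitness of Equation (22): each band edge deviation is normalized by
    the width of the bandgap it belongs to, then averaged and inverted."""
    y_fem, y_dnn = np.asarray(y_fem), np.asarray(y_dnn)
    widths = np.array([y_fem[1] - y_fem[0]] * 2 + [y_fem[3] - y_fem[2]] * 2)
    return 1.0 - np.mean(np.abs(y_fem - y_dnn) / widths)

# Demonstration 3: returns ~0.9982, i.e. F_3-5 = 99.82%
print(retrieval_fitness([3.5801, 5.8455, 6.4250, 7.5213],
                        [3.5751, 5.8441, 6.4291, 7.5218]))
```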

Summary of the four demonstrations
From these four demonstrations, especially demonstration 3, the DNN-based GA is shown to be strongly robust. Once a satisfied DNN model has been obtained, the DNN combined with the GA can be used to realize the corresponding back-forward retrieval design and optimization. Multi-objective designs can be set in the DNN-based algorithm. This advanced algorithm is of great significance in realizing multiple designs or real-time designs in engineering applications.
It is worth noting that different topologies favour different band structures, and the topology considered here naturally yields a larger third order bandgap. Li et al. (2018), among others, proposed topologies dominated by the fifth order bandgap; these topologies usually have multiple scatterers in the matrix, or the matrix material is filled into the scatterers. Besides, the results will be more reliable if multiple attempts at the same target are made (the results may sometimes converge to a local optimum).
The results can be reproduced by running demonstration 3 six times (from random starting positions), and the corresponding evolution curves (simultaneous two-target design) are shown in Figure 4(d)-(f). The evolution curves all converge stably and achieve high fitness. The initial position does not affect the final result because the script has a large enough population and sufficient variation to ensure that the algorithm does not become stuck in a local-optimal solution.

Computing performance of GA and back-forward retrieval design
The DNN-based GA is a fast algorithm that can handle hundreds of iterations in 15 s. In a GA optimization run with a population of 500, the Python script takes 40 ms per iteration. The complete optimization process is finished in 8 s.

Discussion and conclusion
This article first develops two high-precision DNN models, named the S-DNN model and the C-DNN model, trained with a significantly smaller data set than used in previous models; the second part of the method achieves the back-forward retrieval design of a 2D PnC. The first part is also called the forward process. In this process, both DNN models take nine parameters as input, run through seven hidden layers and output four energy band edges simultaneously. The C-DNN model achieves higher precision than the S-DNN model by adding three additional branches (where each branch has the same scale as the S-DNN, containing 500-1000 neurons per layer). The C-DNN model achieves high precision on 96.1% of samples, with a mean relative error of less than 0.3%. In a typical training run, the mean absolute error of the C-DNN model is 56.6 Hz and the mean relative error is 0.112%. The performance is tested on a test set containing 533 samples. The training process achieves convergence within 24 h, and predicting the complete test set costs about 500 ms on a single-card graphics processing unit (GPU) device.
The model achieves precision higher by two orders of magnitude than similar existing work [0.1% relative error in the present work compared with 5-8% relative error in the existing work (da Silva Ferreira, Malheiros-Silveira, and Hernández-Figueroa 2017)], while a smaller data set is used for training [the training set contains 5800 samples in the present work compared with 50,000-100,000 samples in the previous work (da Silva Ferreira, Malheiros-Silveira, and Hernández-Figueroa 2017)]. The trade-off for this improvement is a larger network and a longer model training time. There are fewer than three samples per dimension, on average, for a nine-dimensional problem: the DNNs must find the inner relationships from the nine parameters to the four bandgap edges, in contrast to many training tasks that amount to simple curve fitting. This task demonstrates the tremendous potential of DNNs for mechanical problems, provided the input and output are set reasonably.
Secondly, a DNN-based GA script is developed. Unlike an FEM-based GA, this GA is based on the DNN and achieves extreme speed in the back-forward retrieval design. For the bandgap-maximization problem, the GA takes 8 s to achieve convergence. The feasibility of the DNN-based GA in single-objective and multi-objective optimization of PnCs is verified in three optimization demonstrations. The back-forward retrieval design results have passed FEM verification, with all three optimization tasks passing successfully. Near-optimal topological prediction, combining DNNs or ANNs with a topology optimization method, can be realized in the future.
Overall, the DNN-based inverse design method is an efficient way to realize the real-time customized design of PnCs and metamaterials. Researchers can use this method to realize the inverse design of PnC bandgaps, and to design metasurfaces with specific phase variation and transmission. In addition, a convolutional neural network can be used to predict the grid structure map (or binary map) of PnCs and metamaterials with topology changes and so realize the inverse design. However, the more complex the PnCs that the DNN can predict and design, the more difficult it is to achieve good training. A DNN predicting few features may need only about 10⁴ samples to be well trained, while a more broadly applicable DNN may need 10⁵-10⁷ samples, and the generation of such data sets becomes a problem that is hard to ignore. Researchers need to weigh up carefully whether they need a DNN with extensive predictive ability, or whether to decompose their problems and design them step by step to reduce the difficulties in training and design. In summary, design methods based on DNNs should be studied further and used to solve more practical engineering problems.