Deep-Learning-Based Metasurface Design Method Considering Near-Field Couplings

Planar metasurfaces have been applied in several fields. Near-field coupling is typically neglected in traditional metasurface designs. A numerical modeling method for macrocells that considers near-field couplings between meta-atoms is proposed. A deep neural network (DNN) is constructed to accurately predict the electromagnetic response from different macrocells. Transfer learning is employed to reduce the number of the training datasets. The designed neural network is embedded in the optimization algorithm as an effective surrogate model. Both the deflector and high numerical aperture (NA) metalens are simulated and optimized with our design framework, approximately 30% improvements of efficiencies are achieved.


I. INTRODUCTION
M ETASURFACES have attracted considerable attention [1] because of their ease of design and fabrication. Recently, the research and application of metasurfaces have extended to various fields, such as planar antennas [5], holographic imaging [6], beam reshaping [7], and cloaks [12].
In the traditional design of metasurfaces, meta-atoms are considered to be under periodic boundary conditions, where they are distributed infinitely in space [14]. However, the actual boundary condition of the metasurface is not periodic; each meta-atom is usually arranged around different meta-atoms to achieve spatial modulation, which is accurate only if the coupling differences between them are negligible [15]. The presence of such near-field couplings significantly affects the metasurface performance. Several studies have proposed solutions to this problem. For example, the local phase method [15] utilizes the equivalence principle to obtain the equivalent sources of each meta-atom, and the phase and amplitude of each equivalent source are calculated as the phase and amplitude of the meta-atom. For a high numerical aperture (NA) metalens, the power at the focal point is 15% higher than that obtained using the traditional design scheme [15]. In [16], the entire metasurface is decomposed into many subdomains containing several meta-atoms, and the electric field of the whole array is approximated by solving the scattered fields of these subdomains under periodic boundary conditions considering the near-field couplings between meta-atoms to a certain extent [16]. The overlapping-domain decomposition approximation is extended in [17], where an overlapping region between adjacent subdomains with the boundary condition of perfectly matched layers (PML) is introduced to improve the accuracy of the near-field coupling approximation. However, in the design and optimization process, it is difficult to considering the near couplings, because the increase of parameters in neighboring meta-atoms will lead to exponential computation complexity.
In metasurface design, obtaining the relationship between the parameters of the dielectric meta-atoms and electromagnetic (EM) response is critical. Generally, EM algorithm-based simulation software is used to obtain the response; however, this requires expensive time and memory [18]. Deep learning method is a new alternative choice for extracting the high order nonlinear relationship between the input simulation parameters and output EM response. Because of the strong nonlinear fitting ability of neural networks, they have widely applied for optical device design [19] and inverse design of the meta-atoms [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33]. These inverse design models are usually based on the generative adversarial network (GAN) or concatenation of forward and inverse networks. In [21], an inverse design model in a probabilistically generative manner is proposed to obtain the deterministic relationship between the physical structures and optical responses. The nonunique mapping problem to obtain the complex structure-performance relationship is elegantly solved to a certain extent. In [24], a deep learning based inverse design framework is proposed for multilayer meta-atoms design, to avoid huge computational resources consumptions in conventional parameter-scanning design method. In [25], machinelearning based model is proposed for the automatic design of multifunctional metasurfaces, to pursue the ultimate functionalities limits for given meta-atoms. Low structural complexity can be achieved with respect to traditional design methods.
However, the element's near-field coupling effect on the metasurface design are not fully considered in the deep-learning based methods mentioned above. In this study, the slide overlapping domain method (SODM) is proposed, and the near-field couplings between meta-atoms are considered during the design, which is briefly reported at the conference [34]. In order to mitigate the huge computational resources requirements during optimizations, the deep neural networks (DNNs) are used to realize electric field prediction considering near-coupling. A similar approach that considering near-field couplings is recently proposed in [35], where convolutional neural networks (CNNs) are trained by images of meta-atoms. Whereas we use full connected neural networks and numerical structural parameters to reduce computational cost. Furthermore, we employ transfer learning method to reduce the number of the training datasets. During the optimization process, the real and imaginary part of ideal electric fields instead of the phase and amplitude are defined as the optimal goal. The proposed scheme is used to design and optimize several typical metasurfaces with different functionalities, and it can achieve approximately 30% improvement in efficiencies for EM field manipulations.

A. Sliding Overlapping Domain Method
First, a simple metasurface meta-atom is designed in which a rectangular element of GaN is placed on the Al 2 O 3 dielectric substrate. The design is carried out under visible light illumination at a wavelength of λ = 550 nm. The refractive indices of GaN and Al 2 O 3 at the given wavelength are 2.56 and 1.77, respectively. The dimensions of the GaN material (denoted in brown) are length l, width w, and height h (where w = l). For the meta-atom, the GaN structure is placed on the substrate, as shown in Fig. 1(a), or embedded in the substrate, as shown in Fig. 1(b). For an embedded structure, a small refractive index contrast reduces the field confinement inside the GaN particles and increases the near-field couplings between them [15].
In this study, transmissive metasurfaces are designed where the incidence is from the bottom of the structure. To reduce the loss of energy while passing through the entire metasurface, the transmission coefficient of the meta-atom must be larger than 0.9. Further, the design should satisfy the phase coverage of 0-2π by changing the meta-atom side length l to achieve complete phase control. The meta-atom is modeled and simulated using  Fig. 1(c). The red dotted line indicates the transmission coefficient, which is greater than 0.9, and the solid blue line indicates the phase curve, which covers more than 2π.
The embedded structure is shown in Fig. 1(b). For the phase profile can completely cover 2π with high transmission efficiency, the structure parameters (h, l, p) of the embedded meta-atoms are set differently from those of the non-embedded meta-atoms. the height (h) = 1000 nm, the period (p) = 165 nm (0.3λ), and the length (l) varies from 50-150 nm. As shown in Fig. 1(d), it can be observed that the transmission coefficient of the embedded meta-atom is slightly inferior to that of the non-embedded case, however it remains almost above 0.9.
In the traditional design approach, as shown in Fig. 2(a), each meta-atom in the array is designed with a periodic boundary condition and the surrounding meta-atoms are identical. However, the near-field couplings between the meta-atoms results in a certain phase error between the periodic and actual distributions in the designed metasurface, which affects the accuracy of the phase modulation [15]. For the proposed SODM that considers the near-field couplings between meta-atoms, as shown in Fig. 2(b), the target meta-atom in the actual array and its adjacent meta-atoms are simulated as a whole macrocell. The transmission EM response of the intermediate target meta-atom is achieved. The macrocell is obtained by slide decomposition for each meta-atom in the array, as shown in (b). This is equivalent to simulating the macrocell in the metasurface with different meta-atoms considering near-field couplings.
The near-field couplings to the target meta-atom gradually decrease or even become negligible with an increase in the number of neighboring meta-atoms for the design of the central target meta-atom. To achieve a balance between the approximation accuracy with adequate neighboring meta-atoms and cost of dataset generation, the number of test meta-atoms is  determined to be N = 3; each macrocell has seven meta-atoms, with three neighboring meta-atoms at each side of the target meta-atom, as shown in Fig. 3. For the boundary of the x-axis of the macrocell, both periodic [16] and PML boundary [17] conditions are effective for different metasurface design. Slight differences between them are discussed in detail in [17]. In this study, the periodic boundary conditions can express the near couplings between neighboring macrocell structures more accurately, numerical experiments as will be shown in Section II-B validate the accuracy of the proposed method.

B. Deep Learning Training and Results
The optimization process involves hundreds and thousands of simulations of the entire metasurface. To avoid time-consuming full-wave simulations, the Huygens-Fresnel principle [36] is used to calculate the electric field U (x, y, z) by determining the electric fields at the source points (ξ, η, 0): where the source point is denoted as P (ξ, η, 0), the target point in space is P (x, y, z), the electric field at the source point is U (ξ, η, 0), and r 01 is the distance between P and P . Each meta-atom in the metasurface can be treated as an ideal source point, and the selected macrocells are simulated to obtain the electric fields of the central target meta-atoms to obtain the ideal source point in the designed metasurface considering the near-field couplings. For the amplitude of electric fields at central axis evaluated by the proposed SODM, very good agreement can be found compared with the full-wave simulations at the wavelength of 400 nm, 550 nm, and 700 nm as shown in Fig. 4. When considering near-field couplings for macrocells containing multiple meta-atoms, it is crucial to find the desired macrocell in the design effectively. A DNN is used to extract the high nonlinear relationship between the input parameters and output EM response. The dataset contains 40000 macrocells generated randomly. The input parameters are the side lengths of seven meta-atoms in each macrocell. The central meta-atom of this macrocell is considered as an ideal source point, and the real and imaginary parts of the electric fields are obtained by FDTD full-wave simulations. The transmission of the designed meta-atom is defined as the average electric fields in a period when normalized by the incident field, and the distance between the integration plane and the upper surface of the meta-atom is 1/5 wavelength in our experience. The transmission phase (ϕ) and amplitude (T) of the source point can be obtained as follows: where E re and E im represent the real and imaginary parts of the electric field, respectively. The loss function in deep learning is defined as follows: where MAE(Y 1 ) and MAE(Y 2 ) denote the average mean absolute error (MAE) of the real and imaginary parts of the electric field output from the DNN, respectively. The MAE(Y phase ) term in loss function makes the deep neural network focused on the decline of phase error, which makes the error of the predicted phase lower. This will be beneficial to the optimization of the metasurface. In addition, the MAE of the phase corresponding to the complex electric field is also defined. For the phase error greater than π, it will be subtracted by 2π for the periodic nature of the phase.
To evaluate the accuracy of the DNN model, the average 2norm error is defined for the real and imaginary parts of the electric field as follows: where k denotes re or im, which are the real or imagine part of the electric fields. Y pred denotes the prediction results of the neural network model on the test datasets, and Y real denotes the actual data of the test datasets. Data normalization and disordering are performed on the generated 40000 datasets; 80% of the total datasets are used as the training set, of which 25% are used as the validation set, and the remaining 20% of the total datasets are used as the test set. A neural network model is constructed, as shown in Fig. 5(a). The loss curves for the training iteration process are shown in Fig. 5(b), where the solid red line represents the training set, and dashed blue line represents the validation set.  To verify the generalization performance of the model, 50 random data points that not in the datasets are used for testing, and the test results of the predicted and simulated electric fields are shown in Fig. 6(a) and (b), respectively, for a comparison. Fig. 6(c) and (d) show comparisons of the predicted and simulated results of the phase and amplitude, which showing significant agreement between them. The prediction performance is presented in Table I. For the test dataset, the errors in the real and imaginary parts of the output electric field are N re = 0.035 and N im = 0.031, respectively; and the phase error is MAE(Y phase ) = 3.5°. For the additional test data not in the dataset, we obtain similar results; the value of MAE(Y phase ) is only 1.7°. It can be concluded that the trained DNN model has a high accuracy and good generalization performance for EM response prediction.
At the same time, we consider the influence of different neural network structures on the prediction effect. For the CNNs such as GoogLeNet and MobileNet, they are trained and tested by the same dataset, respectively, and the comparison results of the test set are shown in Table II. Compared with the fully connected layer, the CNN layers have worse prediction precision in real and imaginary parts of the output electric field, and the absolute error of the phase is in the same order of magnitude. Moreover, the training time, memory, and computing power requirements of CNNs are far greater than those of fully connected layers. So fully connected layers instead of CNNs are used for the DNN model.

C. Transfer Learning
Deep-learning-based neural network design methods typically require complex networks, large scale datasets to achieve high prediction accuracy, and large time consumption for the dataset's simulations and networks training. When we design a metasurface with embedded meta-atoms, the simulation time required is high owing to the strong couplings between neighboring meta-atoms. Fortunately, it follows EM response principles similar to those of the non-embedded case. Based on this fact, we introduce transfer learning for EM response predictions of embedded meta-atoms.
Transfer learning refers to the existence of a trained network model whose main parameters are the number of layers, number of neurons per layer, and weight of the trained network. We refer to this as a pre-trained model. The structure and weights of the pre-trained model can be reused for the target network [37], and many studies use open source networks with very high accuracy directly, such as GoogLeNet, to reduce the time required for network design and training [38]. Owing to the layered structure of the network, it is possible to partially transfer the network and modify it with great flexibility. Generally, with transfer learning method, NNs still require large amount of data to be sufficiently trained (on the order of hundreds to thousands). This is because the transferred model is usually a well-trained model with extensive data features.In this study, the DNN for non-embedded macrocells is designed for a specific optical structure and sufficiently trained. It can be transferred for electric field prediction of another optical structure.
We use the DNN for non-embedded macrocells as a pretrained model to transfer the embedded macrocell predictions to reduce the number of training dataset with guaranteed training accuracy. According to the non-embedded macrocell approach, we must create an embedded macrocells dataset with 40000 samples, which is time-consuming. However, to achieve a balance between the transfer learning accuracy and computation time of the dataset, the number of the dataset is halved and the computation time of 20000 embedded macrocells is accepted. The transfer process is illustrated in Fig. 7. The gray neurons in the target model represent neurons whose weights are transferred and fixed, and the weights of the unfixed neurons are trainable in the training process. The structure of the network must be consistent with the training model and other network parameters such as activation and optimization and must also be consistent with the pre-trained model. The dataset is generated by the same method used in the nonembedded macrocells, 80% of the datasets are defined as training set, of which 25% is the validation set, and remaining 20% of the dataset are the test set. The network structure is designed to be consistent with the pre-trained model when constructing embedded macrocells; the network structure is depicted in Fig. 5(a). The network is first self-trained (without fixing the network weights transferred from the pre-trained model), and the phase error is MAE(Y phase ) = 3.5°, which is similar to that of non-embedded macrocells. The network weights of different layers are then transferred and fixed for comparison with the self-training results.
The results with different number of training sets are presented in Fig. 8(a), where each model is trained 10 times to obtain the average error. The number of transferred layers increased gradually. As expected, the error increases along with the number of datasets decreases, as shown in Fig. 8(a). For the same number of samples, the accuracy increases slowly along with the number of transferred layers increases. However, this accuracy reaches the highest level when the number of transferred weight layers is 2, after which the accuracy deteriorates. With only 2000 samples in the training set, the lowest MAE(Y phase ) of 6.9°can be achieved using the transfer learning method. However, the full self-training without transfer learning can only reach 12.49°.

D. Optimization Algorithm
The forward design process predicts the corresponding phase for inputting the parameters of the meta-atoms. Generally, it  is the inverse process in the design, and the parameters are required for a pre-determined EM response. The inverse process is difficult for neural networks because of the couplings between meta-atoms; the other reason is that the same input EM response has multiple meta-atom parameters. In this study, we insert the SODM and DNN as the surrogate model into the PSO algorithm [39] to obtain the parameters of the macrocells for multi-functional metasurface design. The entire flow chart of the design approach is shown in Fig. 9.
Two fitness functions are designed separately, the ideal phase and amplitude [18] and the maximum of the focusing efficiency: where n represents the number of meta-atoms in the complete array, and i is the index number of different meta-atoms. And the amplitude (A i ) and phase (ϕ i ) are predicted by DNNs. (9) is computed by (1) approximately in the optimization. The optimization objective of the focusing efficiency can be simplified to the electric field at the focal point, and the larger the amplitude of the electric field at the focal point, the larger the focusing efficiency. In this study, we optimize the maximum fields near the ideal focal points, that is, ±3 μm at the ideal focal length.

III. APPLICATIONS
In this study, the deflection efficiency is defined as the maximum peak energy in the angular spectrum (P max ) divided by the total energy of the incident light captured by the array (P i ): The focusing efficiency is defined as the energy in the first Airy spot in the focal plane (P first ) divided by the total energy of the incident light captured by the array (P i ):

A. Deflector
First, a deflector with a deflection angle of 27.2°is designed using the traditional method. The boundary condition in xand y-axis is periodic, and the deflector gives beam deflection within the xoz plane. A schematic diagram of the deflector with non-embedded meta-atoms is shown in Fig. 10(a) with phases of 60°, 120°, 180°, 240°, 300°, and 360°. The deflection efficiency of this initial design is 53.8%. The initial deflector is optimized using our proposed design framework. The iterations of the fitness function T during the optimization process are shown in Fig. 10(b), and the far-field angular spectrum [40] before and after the optimization is shown in Fig. 10(c). This shows that the first-order diffraction intensity is significantly enhanced after optimization. The electric field distributions before and after Fig. 13. Optimization of metalens (NA = 0.71) composed of embedded meta-atoms, (a) structure of the metalens, (b) iterative process when the fitness function is T, (c) electric field distribution along z-axis before and after optimization with the fitness function T, (d) iterative process when the fitness function is T E , and (e) electric field distribution along z-axis before and after optimization with the fitness function T E . optimization are shown in Fig. 10(d) and (e). The calculated deflection efficiency is 82.4%, an improvement of the initial design approximately 30%.

B. Metalens
A high NA metalens is designed using non-embedded metaatoms. The boundary condition in y-axis is periodic, in x-and zaxis is PML, and the metalens focus within the xoz plane. Firstly, it is designed using the traditional method, schematically shown in Fig. 11(a). The size of the metalens is 36.4 λ (NA = 0.71), and the initial focusing efficiency is 56.9%. Subsequently, it is optimized using the design framework with the two fitness functions proposed in Section II-D. The optimization iteration curves and electric field distributions in the z-axis direction when the fitness functions have the ideal phase and magnitude (fitness function is T in (8)) are shown in Fig. 11(b) and (c), respectively. The optimized iteration curves and z-axis electric field distribution curves are shown in Fig. 11(d) and (e), respectively, where the fitness function is the maximum value of the focal electric field (fitness function is T E in (9)). The focusing efficiency for the amplitudes 8.02 and 8.1 V/m of the electric fields at the focal point are 73.02 and 74.64%, respectively. The efficiencies of the two fitness functions are comparable. The electric field distributions of the metalens in the xoz plane before and after optimization are shown in Fig. 12, which shows that the focusing performance for the transmitted waves considerably improve after the optimization.
The metalens is designed using embedded meta-atoms with NA = 0.71, as shown in Fig. 13(a), with an initial focusing efficiency of 48.24%. The transfer learning DNN which is inserted into the optimization framework is used for optimization, and the results are shown in Fig. 13(b)-(e). When the optimization is performed with T as the fitness function, the iterative curve in Fig. 13(b) stops when the number of iterations is 15, making it difficult to achieve the desired efficiency; the design focusing efficiency is 53.64%.
In contrast, after optimization with T E as the fitness function, the amplitude of the electric field at focal point is improved from 6.07 to 6.96 V/m, with a considerably efficient focusing efficiency of 66.49%. The electric field distributions of the metalens in the xoz plane before and after optimization are shown in Fig. 14.
To verify the design optimization framework for a large scale metasurface in this study, a metalens of 80λ (NA = 0.71) with  an initial focusing efficiency of 47.84% is optimized using two different fitness functions. The iterative curves optimized by the fitness function T and T E are shown in Fig. 15(a) and (c), respectively. The electric field distributions of the optimized metalens along z-axis are shown in Fig. 15(b) and (d). The amplitudes of electric fields for optimization with two fitness functions are 10.88 and 12.06 V/m, respectively. The corresponding focusing efficiencies are 73.58% and 75.45%, respectively, and the optimization performance for the fitness function with T E is slightly  better than that with T. The electric field distributions in the xoz plane before and after the optimization are shown in Fig. 16, the electric fields focusing can be found to be improved after the optimization.
As shown in Table III, 100 populations with 500 iterations are set up based on the PSO algorithm, representing 50000 groups of metalen arrangements. For full-wave simulation, it takes 15 min on average to simulate the entire metalens array once; it will take approximately 12500 h if full-wave simulation is used for the complete optimization process. This is unacceptable for practical design and optimization. For the proposed design framework, it takes approximately 168 h to build the dataset of non-embedded macrocells with multiple computers in parallel, less than 25 min for DNN model training, and an average optimization time of 2 min. It should be emphasized that Table III shows the time spent to design a specific metalens; however, the optimization framework can be extended to metalens with different NA and deflectors with different deflection angles. They are based on the trained DNN for the same macrocells, as shown in Figs. 15 and 16. Therefore, the optimization framework designed in this study significantly improves the overall efficiency of metasurface design.

IV. CONCLUSION
We propose a framework for the design of optical metasurfaces; the framework integrates DNN with PSO algorithms. The SODM is introduced to construct the dataset for metasurface design considering near-field couplings, and transfer learning is introduced to reduce the number of the dataset. Efficiency is improved significantly for a typical deflector and metalens design. Without loss of generality, the meta-atoms with simple square cross sections are used for FDTD discretization with high accuracy, while the proposed method is universal for metasurface design considering near coupling. When extended to metasurface design with freeform geometries, a few more input parameters of DNN should be added to describe different samples. The meta-atoms are assumed to be a monopole in (1), and the high order modes should be used when considering more accurate calculation results. However, when we use the fitness function defined in (8), it is an error function from amplitude and phase of the designed meta-atoms obtained by full-wave simulations instead of (1). The design framework can be easily extended to other multi-functional metasurface designs that share the same macro cells.