Search for Efficient Wireless Network Structures

We consider the problem of finding an array of antennas that provides a desired distribution of signal strength in an urban environment. Our approach is based on a discretized Maxwell equation for the electromagnetic field. This equation can be solved numerically and provides training samples for a deep learning algorithm. Finally, the deep learning algorithm, based on a multi layer perceptron, is used to retrieve antenna arrays for a desired signal distribution. In our example we obtain for the retrieval a mean success rate of 90%.


INTRODUCTION
The propagation of electromagnetic waves is described by the fundamental equations of Maxwell theory, also known as the Maxwell equations. Typical applications are cellular networks for telecommunication and complex WiFi or Bluetooth networks for the "Internet of Things" (IoT). Even more demanding will be the introduction of new concepts such as 5G/6G cellular networks. The designers of such networks are always asked to find an efficient spatial antenna distribution (AD) that provides a desired signal intensity in an urban environment, in a mixed rural and urban environment, or in an industrial complex of buildings. The economic aspect of the AD is a reduction of the energy consumption of the antenna network, which is adapted to the requirements of the intra-network communication.
Many practical problems of designing microwave structures for a given microwave application are difficult to solve with conventional analytic or numerical methods, since they are not invertible. This can be easily understood when we consider the fundamental theory, namely the Maxwell equations of the electromagnetic field. These equations are linear, and as such they are invertible, provided we avoid singular points. However, the electromagnetic field itself is often much less important for microwave applications than quadratic forms of the field. Examples are the energy density or the field polarization.
In principle, the Maxwell equations can be solved for a given AD, and the corresponding signal distribution (SD) can be calculated from the electromagnetic field solution. Unfortunately, the calculation of the AD from a given SD is much more tedious. Therefore, the practical problem of finding the AD for a desired SD must be approached indirectly by repeatedly guessing an appropriate AD, a procedure that is time consuming and requires trained personnel. The problem becomes even more demanding as the technological evolution moves to smaller and smaller wavelengths of the electromagnetic field with more complex patterns of the SD. It would therefore be much more economical to employ a deep learning approach that finds the AD directly for a desired SD. This is what we address in this article.
The solution of partial differential equations, such as the Maxwell equations, by machine learning has been quite successful and has recently become common [1,2]. It has also been applied directly to the Maxwell equations [3]. Moreover, it has been applied to various problems in the field of wireless communication [4,5]. A more general review of different concepts for image retrieval by deep learning methods is given in Ref. [6].

ELECTROMAGNETIC FIELD IN A COMPLEX ENVIRONMENT
A rough estimate of the decaying electromagnetic energy density $\mathcal{E}(r)$ in free space is, for large distances $r$ from the antenna, asymptotically $\mathcal{E}(r) \sim \mathcal{E}_0 r^{-2}$ due to energy conservation. This can be quite different though in the presence of scattering objects due to interference effects. Phenomenological models, which take into account the effect of buildings, give an $r^{-\alpha}$ decay with an exponent $\alpha = 3, \ldots, 5$ [7]. Although based on averaged empirical results, this behavior is not reliable and fails in particular for an array of several antennas when complex scattering takes place. As an alternative approach we return to the Maxwell equations. In that case an electromagnetic field is induced by local currents $\mathbf{j}(\mathbf{r}, t)$ in the antennas of the network. Assuming that an antenna is a Hertz dipole which is oscillating with frequency $\omega$, we consider a local antenna current $\mathbf{j}(\mathbf{r}, t) = \mathbf{j}(\mathbf{r}) e^{i\omega t}$.
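To give a feeling for how strongly the exponent matters, the following minimal sketch compares the free-space decay with the phenomenological power laws. The function name `power_law_density`, the unit prefactor and the sample distances are illustrative assumptions and not taken from the article.

```python
def power_law_density(r, e0=1.0, alpha=2.0):
    """Energy density e0 * r**(-alpha) at distance r > 0 (arbitrary units)."""
    if r <= 0:
        raise ValueError("distance must be positive")
    return e0 * r ** (-alpha)


# Hypothetical sample distances in arbitrary length units.
distances = [1.0, 10.0, 100.0]

# alpha = 2 is the free-space estimate, alpha = 3 and 5 bracket the
# phenomenological range quoted for built-up areas.
for alpha in (2.0, 3.0, 5.0):
    values = [power_law_density(r, alpha=alpha) for r in distances]
    print(f"alpha = {alpha}: {values}")
```

Already at ten length units the densities for the smallest and the largest exponent differ by three orders of magnitude, which illustrates why a single averaged exponent is a coarse description.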
When all antennas of the network radiate with the same frequency $\omega$, the created electric field $\mathbf{E}$ is determined according to the Maxwell theory by the inhomogeneous linear differential equations (1) [8], where $\nabla$ is the spatial gradient, $\epsilon$ ($\mu$) is the dimensionless, space-dependent relative permittivity (relative permeability) and $c$ is the speed of light. For simplicity, we assume $\mu = 1$ for the relative permeability in the following. The second equation is the condition for a charge-free space. The corresponding magnetic field can be immediately calculated from the electric field $\mathbf{E}$ as $\mathbf{B} = \nabla \times \mathbf{E}/i\omega$, but this is not considered here. What matters for the communication in the network is the effective signal strength, which is given by the electric energy density $\mathcal{E} = \epsilon_0 |\mathbf{E}|^2$ with the dielectric constant $\epsilon_0 = 8.8 \cdot 10^{-12}\ \mathrm{Ws/(V^2 m)}$. The linearity of the Maxwell equations enables us to treat a network with different frequencies separately for each frequency. Therefore, the analysis for one frequency $\omega$ can readily be generalized to a multi-frequency network. By solving Equation (1) we determine the SD as the energy density $\mathcal{E}(\mathbf{r})$, created by a current density $\mathbf{j}(\mathbf{r})$ in the presence of the relative permittivity $\epsilon$. The two-dimensional environmental structure enters the calculation through a spatially varying $\epsilon$, where the third dimension is negligible. For a uniform relative permittivity and a single antenna the electromagnetic field is represented by a circular wave. This changes drastically though when $\epsilon$ varies in space or in the presence of several antennas due to complex wave scattering. In other words, the field intensity is strongly affected by an inhomogeneous relative permittivity, as it occurs in an urban environment. Concrete, the typical material of buildings, has a relative permittivity $\epsilon_1 = 4.5$, while the relative permittivity of air is $\epsilon_{\mathrm{air}} = 1$.
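The difference between the linear field and the quadratic energy density can be made concrete with a small sketch. It assumes two hypothetical complex field amplitudes at a single point. The helper names `energy_density` and `superpose` are illustrative, and the value of the dielectric constant is the one quoted in the text.

```python
EPS0 = 8.8e-12  # dielectric constant in Ws/(V^2 m), value quoted in the text


def energy_density(field):
    """Electric energy density eps0 * |E|^2 for complex components (Ex, Ey, Ez)."""
    return EPS0 * sum(abs(component) ** 2 for component in field)


def superpose(field_a, field_b):
    """Fields of two antennas add linearly, component by component."""
    return tuple(a + b for a, b in zip(field_a, field_b))


# Hypothetical field amplitudes of two antennas at the same point, in V/m.
field_1 = (1.0 + 0.0j, 0.0j, 0.0j)
field_2 = (-1.0 + 0.0j, 0.0j, 0.0j)  # same magnitude, opposite phase

total = energy_density(superpose(field_1, field_2))
separate = energy_density(field_1) + energy_density(field_2)
print(total, separate)
```

In this example the two fields cancel, so the energy density of the superposition vanishes although each antenna alone produces a finite density. The signal distribution of an array is therefore not the sum of single-antenna distributions.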

Physical Model: Discrete Maxwell Equation
Starting from the Maxwell Equation (1), we divide the space of a city or urban/rural region into $N$ cells whose size is several times the wavelength $\lambda = c/(\sqrt{\epsilon}\,\omega)$ and equal to the smallest building size. Assuming that the buildings match a regular arrangement of cells, we have a uniform relative permittivity $\epsilon$ in each cell and can solve the Maxwell Equation (1) for each cell separately. This assumption might be reasonable at least for modern cities like Manhattan, while for ancient cities the regular arrangement of cells can be considered as a reasonable approximation. Then the matching condition between neighboring cells leads to the linear vector equation
$$M \mathbf{E} = \mathbf{J}, \qquad (2)$$
where $\mathbf{E}$, $\mathbf{J}$ are $N$-dimensional vectors. Each vector component consists of three electromagnetic components $E_x, E_y, E_z$ and $J_x, J_y, J_z$. The $3N \times 3N$ matrix $M$ is given by the building distribution (BD) through the corresponding spatially varying permittivity $\epsilon$, $\mathbf{J}$ are the local currents of the AD, while $\mathbf{E}$ is the resulting electric field. In other words, a given BD is represented by $M$. Then for an AD, represented by $\mathbf{J}$, we obtain the electric field $\mathbf{E}$ from Equation (2) by inversion: when $M^{-1}$ exists, the electric field follows from $\mathbf{J}$ as
$$\mathbf{E} = M^{-1} \mathbf{J}. \qquad (3)$$
The SD of the electric field reads as the energy density $\mathcal{E} = \epsilon_0 |\mathbf{E}|^2$, such that we can calculate $\mathcal{E}$ for any given AD $\mathbf{J}$ via Equation (3). The practical question of calculating the AD $\mathbf{J}$ for a desired SD requires two steps: (i) we retrieve the electric field $\mathbf{E}$ from the SD $\mathcal{E}$, and (ii) we insert $\mathbf{E}$ into Equation (2) to obtain the AD $\mathbf{J}$. For a specific BD the corresponding SD is visualized in Fig. 1 for a single antenna at the center and in the lower left corner. Step (i) is the most difficult part since, in general, it may not have a unique solution. Moreover, there exist solutions which are not linked to a reasonable AD with identical antennas. Therefore, we must add as an additional requirement that only an AD of identical antennas is allowed.
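The forward step, from an AD to an SD by solving the linear system, can be sketched in a few lines. The matrix below is not the $3N \times 3N$ matrix of the article. It is a hypothetical one-dimensional scalar stand-in (a damped tridiagonal chain of six cells) chosen only to show the workflow, and the names `build_toy_matrix`, `solve` and `signal_distribution` as well as the wave number and damping values are assumptions.

```python
EPS0 = 8.8e-12  # dielectric constant quoted in the text


def build_toy_matrix(permittivity, k=1.0, damping=0.05):
    """Hypothetical stand-in for M: a 1D scalar chain with one cell per entry."""
    n = len(permittivity)
    m = [[0j] * n for _ in range(n)]
    for i, eps in enumerate(permittivity):
        m[i][i] = complex(2.0 - k * k * eps, damping)  # damping keeps M invertible
        if i > 0:
            m[i][i - 1] = -1.0
        if i < n - 1:
            m[i][i + 1] = -1.0
    return m


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(a[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (a[i][n] - tail) / a[i][i]
    return x


def signal_distribution(matrix, currents):
    """SD as eps0 * |E|^2 with E obtained from M E = J."""
    return [EPS0 * abs(e) ** 2 for e in solve(matrix, currents)]


permittivity = [1.0, 1.0, 4.5, 4.5, 1.0, 1.0]  # air, air, concrete, concrete, air, air
M = build_toy_matrix(permittivity)
J = [0, 1, 0, 0, 0, 0]  # a single antenna in the second cell
SD = signal_distribution(M, J)
print(SD)
```

The reverse direction is the hard one: the SD keeps only $|\mathbf{E}|^2$, so the phases needed to apply Equation (2) are lost.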
This complex problem can be efficiently treated within a deep learning approach. The central idea is that an artificial neural network learns the relation between the SD and the AD from Equation (3) as $\mathcal{E} = \epsilon_0 |M^{-1} \mathbf{J}|^2$ for many different $\mathbf{J}$ at a fixed BD. Then the mapping AD $\to$ SD is invertible, since for each SD we allow in the learning approach only one AD. This enables us to retrieve an AD for a desired SD with a given BD.
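A possible way to assemble such training pairs is sketched below. It assumes a hypothetical, hard-coded inverse matrix `M_INV` for four cells with one scalar field component per cell, binary currents for identical antennas, and a simple rule that keeps only the first AD found for each SD. None of these details are taken from the article.

```python
import random

EPS0 = 8.8e-12  # dielectric constant quoted in the text

# Hypothetical inverse matrix M^{-1} for one fixed building distribution.
M_INV = [
    [1.0 + 0.2j, 0.5 - 0.1j, 0.2 + 0.0j, 0.1 + 0.1j],
    [0.5 - 0.1j, 1.1 + 0.1j, 0.4 - 0.2j, 0.2 + 0.0j],
    [0.2 + 0.0j, 0.4 - 0.2j, 0.9 + 0.3j, 0.5 - 0.1j],
    [0.1 + 0.1j, 0.2 + 0.0j, 0.5 - 0.1j, 1.0 + 0.2j],
]


def signal_distribution(ad):
    """SD = eps0 * |M^{-1} J|^2 for a binary antenna distribution J."""
    fields = (sum(row[j] * ad[j] for j in range(len(ad))) for row in M_INV)
    return tuple(EPS0 * abs(field) ** 2 for field in fields)


def make_dataset(n_samples, n_cells=4, seed=0):
    """Draw random ADs and keep exactly one AD per distinct SD."""
    rng = random.Random(seed)
    seen = {}
    attempts = 0
    while len(seen) < n_samples and attempts < 100 * n_samples:
        attempts += 1
        ad = tuple(rng.randint(0, 1) for _ in range(n_cells))
        if not any(ad):
            continue  # skip the empty array without antennas
        sd = signal_distribution(ad)
        key = tuple(round(value / EPS0, 9) for value in sd)
        seen.setdefault(key, (sd, ad))  # the first AD for a given SD wins
    return list(seen.values())


dataset = make_dataset(10)
for sd, ad in dataset[:3]:
    print(ad, sd)
```

In a training run the SD would serve as network input and the AD as target, so the one-AD-per-SD rule makes the learned mapping single valued.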
The two examples in Fig. 1 are created with the discretized Maxwell equation (ME) in Equation (3) and demonstrate that even for a single antenna it might be difficult to associate the SD with the AD in a complex urban environment.

Artificial Neural Network Approach
In the next section we describe a method that retrieves the AD for a desired SD with a fixed BD. We employ supervised learning with backpropagation [9]. It is based on an artificial neural network in the form of a Multi Layer Perceptron (MLP) [10]. The MLP consists of several layers, where the layer $j$ is represented by the rectangular weight matrix $W_j$ and a bias vector $V_j$, which is added after the matrix multiplication. A geometric interpretation of the matrix multiplication and the added bias is that a single neuron inside a layer acts as a linear classifier, where the matrix column assigned to the neuron is the normal vector of a hyperplane and the bias is the hyperplane offset. Therefore, the sign of the result tells us whether the input is on the front side or on the back side of this hyperplane. The first layer matrix $W_1$ acts on an input vector $\mathbf{x} = (x_1, \ldots, x_k)$, which creates the new vector $\mathbf{x}' = (x'_1, \ldots, x'_{k'})$, where $k' \neq k$ in general. In the next step the activation function $f_{\mathrm{act}}(x'_i)$ maps the components of $\mathbf{x}'$ to the vector $F_1(\mathbf{x})$. This includes a batch normalization, which is applied as a normalization of the output of the layer [10,11] to stabilize the training of the network. $F_1(\mathbf{x})$ is the input vector of the next layer with layer matrix $W_2$. We repeat this mapping for $n$ layers to obtain eventually $F_n(F_{n-1}(\ldots F_1(\mathbf{x}) \ldots))$.
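The layer composition can be illustrated with a minimal forward pass in plain Python. This is a sketch under simplifying assumptions: batch normalization is omitted, the weights are arbitrary hypothetical numbers, each neuron's weight vector is stored as one row, swish is taken in its common form $x/(1 + e^{-x})$ for the hidden layers, and a sigmoid with threshold 0.5 is used for the output as in this work.

```python
import math


def affine(weights, bias, x):
    """x' = W x + V, with one weight vector (hyperplane normal) per neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def swish(v):
    """Swish activation in its common form v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(layers, x):
    """Compute F_n(...F_1(x)...): swish on hidden layers, sigmoid on the last."""
    for index, (weights, bias) in enumerate(layers):
        activation = sigmoid if index == len(layers) - 1 else swish
        x = [activation(v) for v in affine(weights, bias, x)]
    return x


# One neuron as a linear classifier for the hyperplane x1 + x2 = 1.
neuron_w, neuron_b = [[1.0, 1.0]], [-1.0]
front = affine(neuron_w, neuron_b, [1.0, 1.0])[0]  # positive: front side
back = affine(neuron_w, neuron_b, [0.0, 0.0])[0]   # negative: back side

# Hypothetical network: 3 inputs -> 2 hidden neurons -> 2 outputs.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-2.0, 0.5]], [0.2, -0.1]),
]
output = forward(layers, [1.0, 0.0, -1.0])
antennas = [1 if p > 0.5 else 0 for p in output]  # threshold at 0.5
print(front, back, output, antennas)
```

The last line turns the continuous sigmoid outputs into binary antenna positions, which is the form in which a retrieved AD is compared with the calculated one.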
For the activation function we have used the swish function [12] $f_{\mathrm{act}}(x_i) = x_i/[1 + \exp(-x_i)]$. In the last layer a special activation function is applied, which corresponds to the specific network task (e.g., softmax for classification, linear for regression etc.). In the present case we have chosen a sigmoid function $f_{\mathrm{out}}(x_i) = 1/[1 + \exp(-x_i)]$, since the output should be binary for the position of the antennas. For this purpose, an output threshold of 0.5 is used to determine the antenna positions. For technical reasons we use the sigmoid function, which is smooth in contrast, for instance, to a step function. For the network loss, which quantifies the error of the network (lower is better), the mean squared error (MSE) is used. The size of the weight matrices $\{W_j\}$ depends on the input vector size of the layer and on the number of neurons in the layer. To determine the number of neurons per layer, as well as the number of layers, an evolutionary algorithm is used. The evolutionary algorithm operates on a population of 20 solution candidates, where a solution candidate is a list of neuron numbers for the layers of the MLP. For selection a random individual of the top four is chosen. The mutation is performed by three different operations: remove a random layer, add a random layer or replace a random layer. The crossover is performed by splitting two solution candidates at a random position (layer) and concatenating the splits between the solution candidates.
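The evolutionary operators described above can be sketched as follows. The text does not specify how selection, crossover and mutation are combined into a generation loop, which layer widths are allowed, or how long the search runs, so the loop in `evolve`, the list `NEURON_CHOICES`, the number of generations and the placeholder fitness `toy_loss` are assumptions. In practice the fitness would be the loss of a trained MLP.

```python
import random

MIN_LAYERS, MAX_LAYERS = 2, 12  # layer limits quoted in the results section
NEURON_CHOICES = [256, 300, 400, 512, 600, 768, 1024, 2048]  # hypothetical widths


def mutate(candidate, rng):
    """Remove, add or replace one random layer (replace if a limit is hit)."""
    child = list(candidate)
    operation = rng.choice(["remove", "add", "replace"])
    if operation == "remove" and len(child) > MIN_LAYERS:
        del child[rng.randrange(len(child))]
    elif operation == "add" and len(child) < MAX_LAYERS:
        child.insert(rng.randrange(len(child) + 1), rng.choice(NEURON_CHOICES))
    else:
        child[rng.randrange(len(child))] = rng.choice(NEURON_CHOICES)
    return child


def crossover(a, b, rng):
    """Split both candidates at a random layer and swap the tails."""
    i = rng.randrange(1, len(a))
    j = rng.randrange(1, len(b))
    children = (a[:i] + b[j:], b[:j] + a[i:])
    return [child[:MAX_LAYERS] for child in children]


def select(population, fitness, rng):
    """Pick a random individual among the four with the lowest loss."""
    return rng.choice(sorted(population, key=fitness)[:4])


def evolve(fitness, generations=5, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.choice(NEURON_CHOICES) for _ in range(rng.randint(MIN_LAYERS, MAX_LAYERS))]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            parent_a = select(population, fitness, rng)
            parent_b = select(population, fitness, rng)
            for child in crossover(parent_a, parent_b, rng):
                children.append(mutate(child, rng))
        population = children[:pop_size]
    return min(population, key=fitness)


def toy_loss(candidate):
    """Placeholder fitness, NOT a trained-network loss: prefers ~3000 neurons."""
    return abs(sum(candidate) - 3000)


best = evolve(toy_loss)
print(best)
```

Each candidate is simply a list of layer widths, so all three operators reduce to list manipulations, and the expensive part of a real search is the training needed to evaluate the fitness.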
The schematic implementation of the MLP approach and its connection to the discretized Maxwell equation is illustrated in Fig. 2. This approach is employed in the creation of the plots in Figs. 3(b), 4 and 5 for the BD given in Fig. 3(a).

RESULTS OF THE RETRIEVAL
First, we create an AD-SD dataset for different ADs by solving Equation (3). The network structure is then searched for a minimal number of two layers and a maximal number of 12 layers. The MLP is trained with a batch size of 1024 and 400 epochs. An algorithm for first-order gradient-based optimization of stochastic objective functions (ADAM) is used with an initial learning rate of 0.001 [13]. The best performing network has nine layers with the structure $N_{\mathrm{opt}} = \{2048, 768, 400, 512, 600, 400, 600, 256, 300\}$.
To measure the quality of retrieval in general, we can define the "intersection over union" (IoU) ratio by comparing pairwise the directly calculated set of antennas $c$ and the retrieved set of antennas $r$. For this purpose we determine the number of antennas in the intersection of the two sets, $n_{c \cap r}$, and the number of antennas in the union of the sets, $n_{c \cup r}$, and define the IoU fraction as the ratio
$$R = \frac{n_{c \cap r}}{n_{c \cup r}}. \qquad (5)$$
As an example, consider a calculated and a retrieved set whose intersection $r \cap c = [1, 0, 0, 1, 0, 0]$ has two antennas ($n_{c \cap r} = 2$) and whose union $c \cup r = [1, 0, 1, 1, 0, 1]$ has four antennas ($n_{c \cup r} = 4$), such that the ratio reads $R = 1/2$. According to the histogram in Fig. 6, most retrievals are in the interval from $R = 0.7$ to $R = 1$ (i.e., the retrieval rate is typically between 70% and 100%). The mean value is $\bar{R} = 0.9$, while the ratio of the number of exact retrievals ($R = 1$) to the number of all other retrievals ($R < 1$) is 0.6. Fig. 3(b) presents a characteristic example for the creation of the SD from a given AD by the ME and the retrieval of an AD from the same SD by the MLP for the BD in Fig. 3(a). Other examples are depicted in Figs. 4 and 5. This result is reasonable in comparison with the general success rate of deep image retrievals [6]. However, one must keep in mind that the successful retrieval of an AD is different from a more qualitative retrieval, for instance, that an image contains a certain number of antennas but not their specific locations. The above definition of a successful retrieval therefore demands a high degree of accuracy.
Figure 4: Four examples for ADs vs. SD on a 9 × 9 cell structure that are successfully retrieved ($R \geq 0.9$). In each 2 × 3 block the middle column represents a given AD, from which the SD in the right square is calculated with the discretized Maxwell Equation (3). The square on the left is the retrieval result from the Multi Layer Perceptron, using the SD of the right square.
Figure 5: Four examples for ADs vs. SD on a 9 × 9 cell structure that are unsuccessfully retrieved ($R \leq 0.1$).
Figure 6: Distribution of the successful retrievals of the antenna arrangement, expressed by the ratio $R$ in Equation (5).
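The IoU ratio $R$ used to score the retrievals can be computed directly from two binary occupation vectors. In the sketch below the vectors `calculated` and `retrieved` are hypothetical, chosen so that their intersection and union agree with the example in the text. Returning 1 for two empty sets is an assumption made here to avoid a division by zero.

```python
def iou(calculated, retrieved):
    """Ratio R = n_intersection / n_union for two binary antenna vectors."""
    if len(calculated) != len(retrieved):
        raise ValueError("both vectors must cover the same cells")
    n_intersection = sum(1 for c, r in zip(calculated, retrieved) if c and r)
    n_union = sum(1 for c, r in zip(calculated, retrieved) if c or r)
    if n_union == 0:
        return 1.0  # assumption: two empty arrays count as a perfect match
    return n_intersection / n_union


# Hypothetical vectors with intersection [1,0,0,1,0,0] and union [1,0,1,1,0,1].
calculated = [1, 0, 1, 1, 0, 0]
retrieved = [1, 0, 0, 1, 0, 1]

ratio = iou(calculated, retrieved)
print(ratio)  # 0.5
```

One misplaced antenna therefore lowers $R$ twice, once through the smaller intersection and once through the larger union, which makes the measure strict about positions.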

DISCUSSION
A single antenna in an urban area is not efficient due to the effect of localization: its range is short due to strong interference of scattered waves. The decay of the signal strength $\mathcal{E}$ with distance $r$ from the antenna is exponential, $\mathcal{E}(r) \sim \mathcal{E}_0 e^{-r/\xi}$, in contrast to a free area, where the signal intensity decays according to a power law $\mathcal{E}(r) \sim \mathcal{E}_0 r^{-2}$. The decay length $\xi$ depends on the distribution of buildings. Two typical examples with a single antenna are presented in Fig. 1. To cover an urban area we need an array of antennas. This must be adjusted according to the environment (i.e., the BD) and the required SD. The situation is similar to the task of installing street lights in order to illuminate an urban area, with the difference that light is completely absorbed by buildings, such that geometric optics rather than wave scattering can be applied in that case. The positions of the antennas in the array (i.e., the array structure) for a desired SD are retrieved with the help of the MLP. The MLP itself was constructed from the discrete Maxwell Equation (3) through an intensive training. This method is quite flexible and can be applied to different situations. For instance, a finer cell structure for the ME with larger cell numbers $N$ can be introduced. In general, the size of the cells is determined by the size of the scattering objects (e.g., buildings).