Fast Specific Absorption Rate Aware Beamforming for Downlink SWIPT via Deep Learning

This article investigates fast deep learning based transmit beamforming design for simultaneous wireless information and power transfer in the multiuser multiple-input-single-output downlink, with specific absorption rate (SAR) constraints. The problem of interest is to maximize the received signal-to-interference-plus-noise ratio and the energy harvested for all receivers, while satisfying the transmit power and the SAR constraints. The optimal solution can be obtained via convex optimization but incurs a high complexity. To reduce the computational complexity, this article proposes a model-driven deep learning technique that only needs to predict key features of the problem with much reduced dimension but enhanced performance compared to widely used data-driven machine learning. Simulation results demonstrate that our proposed algorithms can significantly reduce the algorithm execution time, while maintaining satisfactory performance.


I. INTRODUCTION
Simultaneous wireless information and power transfer (SWIPT) is a new technology in which information and energy flows coexist and are co-engineered to simultaneously provide communication connectivity and energy sustainability [1], [2]. However, SWIPT may contribute significantly to electromagnetic pollution (electrosmog). For example, one of the main applications of SWIPT is medical devices in wireless body area networks, where an access point supports the communication connectivity and the power sustainability of a short-range sensor network in, on, or around the human body [3]. Therefore, it is necessary to incorporate exposure constraints in the design of SWIPT systems.
One widely adopted measure of RF exposure is the specific absorption rate (SAR), which measures the power absorbed per unit mass of human tissue in watts per kilogram [4]. In our previous work [5], we studied SAR-aware beamforming design in a multiuser multiple-input single-output (MISO) downlink channel, where the receivers are characterized by both quality-of-service (QoS) and energy harvesting (EH) constraints using the power splitting (PS) SWIPT approach [6]. We derived the optimal beamforming and power splitting solutions using semidefinite programming (SDP). However, the complexity of the optimal beamforming solution is high, so a new low-complexity, high-performance solution is needed.
In this article, we aim to tackle the complexity challenge of the optimal beamforming solution to the problem in [5] using deep learning. The rationale is that a deep learning technique trains neural networks offline and then deploys the trained networks for fast online optimization. Deep learning has been widely used in optimization tasks for wireless resource management: early works use deep neural networks (DNNs) mainly to predict transmit power [7], [8], and later works directly estimate the beamforming matrix [9]-[11]. However, most existing works are data-driven and do not exploit the problem structure, which can lead to high training complexity and poor prediction performance.
In stark contrast to existing data-driven machine learning efforts, which often ignore the useful problem structure, the novelty of our work is that we introduce a model-driven deep learning-based technique that reduces the computational complexity of the optimal SAR-aware beamforming solutions by exploiting the problem structure and properties, inspired by recent progress in model-driven learning for communications systems [12]-[14]. This method improves the learning accuracy compared to existing data-driven direct beamforming learning approaches by specifying the most appropriate features to be learned, which have a reduced dimension. Our main contribution is to propose two model-driven learning approaches that dramatically accelerate the optimization of SAR-aware beamforming. They predict different primal and dual variables to help recover the beamforming solutions, and as a result lead to different performance-complexity trade-offs. To the best of our knowledge, this is the first work that adopts model-driven learning for the optimization of SAR-aware SWIPT beamforming. Our simulation results show that the proposed solutions significantly reduce the computational time compared to the optimal solution and the heuristic solution while maintaining satisfactory performance, and outperform the data-driven approach that directly predicts the beamforming solution.
The remainder of this article is organized as follows. Section II introduces the system transmission model and problem formulation. In Section III, we propose two fast deep learning-based algorithms to solve the SAR-aware beamforming problem with different performance and complexity. Simulation results are provided to validate the proposed algorithms in Section IV and our work is concluded in Section V.
Notation: All boldface letters indicate vectors (lower case) or matrices (upper case). The superscripts $(\cdot)^\dagger$ and $(\cdot)^{-1}$ denote the conjugate transpose and the matrix inverse, respectively. The identity matrix is denoted by $\mathbf{I}$. $\|\mathbf{z}\|$ denotes the $L_2$ norm of a complex vector $\mathbf{z}$. $\mathrm{Tr}(\mathbf{A})$ denotes the trace of a matrix $\mathbf{A}$, while $\mathbf{A} \succeq \mathbf{0}$ indicates that the matrix $\mathbf{A}$ is positive semidefinite.

A. System Model
We consider a MISO downlink channel consisting of an $N_t$-antenna transmitter (e.g., a base station (BS)) and $K$ single-antenna receivers that employ single-user detection. The BS transmits with a total power $P_T$. Let $s_k$ be its transmitted data symbol to receiver $k$, which is Gaussian distributed with zero mean and unit variance. The transmitted data symbol $s_k$ with normalized power is mapped onto the antenna array elements by the beamforming vector $\mathbf{w}_k \in \mathbb{C}^{N_t \times 1}$. The received baseband signal at receiver $k$ can be expressed as
$$y_k = \mathbf{h}_k^\dagger \sum_{j=1}^{K} \mathbf{w}_j s_j + n_k,$$
where $\mathbf{h}_k \in \mathbb{C}^{N_t \times 1}$ is the channel between the BS and the $k$-th receiver and $n_k$ denotes the additive white Gaussian noise (AWGN) component with zero mean and variance $N_0$. Therefore, the received power at receiver $k$ is equal to
$$P_k = \sum_{j=1}^{K} |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0.$$
The receivers have RF-EH capabilities and can therefore harvest energy from the received RF signal based on the power splitting technique. With this approach, each receiver splits its received signal into two parts with a parameter $\rho_k \in (0, 1)$: a) $100\rho_k\%$ of the received power is converted to a baseband signal for further signal processing and data detection, and b) the remainder is driven to the circuits required for conversion to DC voltage and energy storage. During the baseband conversion, additional circuit noise $v_k$ is present due to phase offsets and non-linearities; it is modeled as AWGN with zero mean and variance $N_C$. The signal-to-interference-plus-noise ratio (SINR) characterizing the data detection process at the $k$-th receiver is given by
$$\Gamma_k = \frac{\rho_k |\mathbf{h}_k^\dagger \mathbf{w}_k|^2}{\rho_k \left( \sum_{j \neq k} |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 \right) + N_C}.$$
On the other hand, the total power that can be harvested is equal to
$$E_k = F\left( (1 - \rho_k) P_k \right),$$
where $F(\cdot)$ is a non-linear parametric EH function which will be presented later.
0018-9545 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
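As an illustration of the power-splitting model, the following minimal sketch evaluates the SINR and the RF power routed to the energy harvester for one receiver. It assumes the standard power-splitting expressions (SINR $= \rho_k S / (\rho_k (I + N_0) + N_C)$ and harvester input $(1 - \rho_k)(S + I + N_0)$); the function names and the toy numbers are hypothetical and not taken from the article.

```python
from typing import List, Sequence, Tuple


def inner(h: Sequence[complex], w: Sequence[complex]) -> complex:
    """Return h^dagger w, the conjugate-transpose inner product."""
    return sum(hi.conjugate() * wi for hi, wi in zip(h, w))


def sinr_and_rf_input(h_k: Sequence[complex], beams: List[Sequence[complex]],
                      k: int, rho_k: float, n0: float, nc: float) -> Tuple[float, float]:
    """SINR and harvester input power at receiver k (assumed power-splitting model)."""
    gains = [abs(inner(h_k, w)) ** 2 for w in beams]  # |h_k^dagger w_j|^2 for all j
    signal = gains[k]
    interference = sum(gains) - signal
    sinr = rho_k * signal / (rho_k * (interference + n0) + nc)
    rf_input = (1.0 - rho_k) * (sum(gains) + n0)  # power sent to the EH circuit
    return sinr, rf_input


# Toy example (hypothetical numbers): two antennas, two orthogonal beams.
h_0 = [1 + 0j, 0j]
toy_beams = [[1 + 0j, 0j], [0j, 1 + 0j]]
sinr_0, rf_0 = sinr_and_rf_input(h_0, toy_beams, k=0, rho_k=0.5, n0=0.1, nc=0.05)
```

The harvested DC power would then be obtained by passing `rf_0` through the nonlinear EH function $F(\cdot)$.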
As discussed, wireless communication devices are subject to SAR limitations. Previously reported results such as [15] and [16] have shown that the pointwise SAR value with multiple transmit antennas can be modeled as a quadratic form of the transmitted signal, and that the SAR matrices, which fully describe the dependence of the SAR measurement on the transmitted signals, are positive-definite conjugate-symmetric matrices.
Since SAR is a quantity of the transmit signals averaged over time for a specific human tissue [15], [16], we model the $l$-th SAR constraint with a time-averaged quadratic constraint given by
$$\sum_{k=1}^{K} \mathbf{w}_k^\dagger \mathbf{A}_l \mathbf{w}_k \le P_l,$$
where $\mathbf{A}_l \succeq \mathbf{0}$ is the $l$-th SAR matrix and $P_l$ is the $l$-th SAR limit.
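The quadratic SAR model can be checked numerically with a few lines of code. The sketch below assumes the time-averaged SAR is the sum of $\mathbf{w}_k^\dagger \mathbf{A}_l \mathbf{w}_k$ over the beams (which holds for independent unit-variance symbols); the matrix `A_toy` is a made-up Hermitian positive-definite example, not the SAR matrix used in the article.

```python
from typing import List, Sequence, Tuple


def quad_form(A: Sequence[Sequence[complex]], w: Sequence[complex]) -> float:
    """Return w^dagger A w, which is real when A is Hermitian."""
    n = len(w)
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    return sum(w[i].conjugate() * Aw[i] for i in range(n)).real


def sar_check(A: Sequence[Sequence[complex]], beams: List[Sequence[complex]],
              limit: float) -> Tuple[float, bool]:
    """Return the total quadratic SAR value and whether it meets the limit."""
    total = sum(quad_form(A, w) for w in beams)
    return total, total <= limit


# Hypothetical 2x2 Hermitian positive-definite matrix (eigenvalues 1 and 3).
A_toy = [[2 + 0j, 1j], [-1j, 2 + 0j]]
toy_beams = [[1 + 0j, 0j], [0j, 1 + 0j]]
sar_value, sar_ok = sar_check(A_toy, toy_beams, limit=5.0)
```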

B. Problem Formulation
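We study the balanced optimization problem of maximizing the minimum ratio of the received SINR and the harvested energy to their target requirements, subject to the SAR and total power constraints, where $\bar{\gamma}_k$ and $\bar{\lambda}_k$ are the SINR and the EH requirements, respectively. This choice of objective function balances the received SINR and the EH between users. To make the problem more tractable, we introduce an auxiliary variable $t$ and formulate the optimization problem as
$$\begin{aligned}
\text{P1}: \max_{\{\mathbf{w}_k, \rho_k\},\, t} \quad & t \\
\text{s.t.} \quad & \frac{\rho_k |\mathbf{h}_k^\dagger \mathbf{w}_k|^2}{\rho_k \left( \sum_{j \neq k} |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 \right) + N_C} \ge t \bar{\gamma}_k, \quad \forall k, \\
& F\left( (1 - \rho_k) \left( \sum_{j=1}^{K} |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 \right) \right) \ge t \bar{\lambda}_k, \quad \forall k, \\
& \sum_{k=1}^{K} \mathbf{w}_k^\dagger \mathbf{A}_l \mathbf{w}_k \le P_l, \quad l = 1, \dots, L, \\
& \sum_{k=1}^{K} \|\mathbf{w}_k\|^2 \le P_T, \quad 0 < \rho_k < 1, \quad \forall k,
\end{aligned}$$
where $P_T$ is the maximum total transmit power and $L$ is the number of SAR constraints. $F(x)$ is the output DC power at the $k$-th receiver, represented by a nonlinear function of the input RF power $x$.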
The nonlinear EH function can take many forms to capture the relationship between the input and output power at the energy receiver [17]-[19]. In general, the nonlinear EH function is monotonically increasing, so the inverse mapping $F^{-1}(\cdot)$ exists and the EH constraint can be rewritten as a lower bound on the input RF power,
$$(1 - \rho_k) \left( \sum_{j=1}^{K} |\mathbf{h}_k^\dagger \mathbf{w}_j|^2 + N_0 \right) \ge F^{-1}(t \bar{\lambda}_k), \quad \forall k.$$
It is difficult to solve the above problem P1, because both the SINR and the EH constraints are nonconvex, and there are additionally multiple SAR constraints. In our previous work [5], we have shown that the optimal solution to P1 can be found by solving the power minimization problem P2, where we have defined new matrix variables $\mathbf{W}_k = \mathbf{w}_k \mathbf{w}_k^\dagger$, $\forall k$. To be specific, if P1 is solved and the optimal $t^*$ is achieved, then the same beamforming and power splitting solutions are also optimal for P2 to achieve the same SINR and EH, and the optimal minimum transmit power will be $P_T$. Because P1 is a quasiconvex problem in $t$, once P2 can be solved, P1 can be solved via the bisection search method. Therefore, in the rest of the article we focus on solving the problem P2.
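The bisection step that links P1 and P2 can be sketched as follows. The callable `min_power` is a hypothetical stand-in for a solver of P2 at a given $t$ (returning the minimum transmit power, or infinity if the problem is infeasible); the toy function used below is only a monotone placeholder and not the article's solver.

```python
from typing import Callable


def bisect_balanced_ratio(min_power: Callable[[float], float], p_max: float,
                          t_lo: float = 0.0, t_hi: float = 10.0,
                          tol: float = 1e-6) -> float:
    """Return the largest t whose minimum required power does not exceed p_max.

    Assumes min_power is nondecreasing in t, t_lo is feasible and t_hi is not.
    """
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if min_power(t_mid) <= p_max:
            t_lo = t_mid  # target t_mid is achievable within the power budget
        else:
            t_hi = t_mid  # target t_mid needs more power than available
    return t_lo


def toy_min_power(t: float) -> float:
    """Placeholder for solving P2 at ratio t (hypothetical, monotone in t)."""
    return 0.5 * t ** 2


t_star = bisect_balanced_ratio(toy_min_power, p_max=2.0)
```

Each iteration calls the P2 solver once, so roughly $\log_2((t_{\text{hi}} - t_{\text{lo}})/\text{tol})$ solves of P2 are needed, which is why the cost of solving P2 dominates the overall running time.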

A. The General Structure
Our proposed model-driven deep learning-based structure for the beamforming optimization is shown in Fig. 1. Existing data-driven approaches directly predict the beamforming matrix with $N_t K$ complex elements, which may lead to inaccurate and even under-fitted results that cannot guarantee the end performance. To tackle this challenge, the main idea of our proposed deep learning structure is to predict only the main features of the problem, which have a reduced dimension, and then to find the beamforming solution quickly by using these features and exploiting the problem structure. It therefore includes two modules: a neural network module and a beamforming recovery module.
We choose convolutional neural network (CNN) layers followed by a feedforward layer as the base of the neural network module, because the CNN has a strong ability to extract features. In addition, the CNN can reduce the number of learned parameters by sharing weights and biases. Rather than predicting the beamforming matrix directly, the module predicts a few key features extracted from the problem structure, whose number is much smaller than the number of elements in the beamforming matrix. The beamforming recovery module then finds the beamforming solution from the learned features and the expert knowledge specific to the problem.
The complex channel coefficients are fed into the neural network module to predict the key features as output, but complex inputs are not yet supported by standard neural network software. To deal with this issue, we separate the complex channel vector into its in-phase component $\Re(\mathbf{h})$ and quadrature component $\Im(\mathbf{h})$.
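A minimal sketch of this input preparation is given below. The two-plane layout (one plane for the in-phase parts, one for the quadrature parts, each of size $K \times N_t$) is an assumption chosen to suit convolutional layers; the article does not specify the exact tensor layout, and the function name is hypothetical.

```python
from typing import List, Sequence


def channels_to_real_input(H: Sequence[Sequence[complex]]) -> List[List[List[float]]]:
    """Split K complex channel vectors into a real-valued 2 x K x N_t array.

    Plane 0 holds the in-phase (real) parts, plane 1 the quadrature (imaginary) parts.
    """
    real_plane = [[h.real for h in row] for row in H]
    imag_plane = [[h.imag for h in row] for row in H]
    return [real_plane, imag_plane]


# Toy channels for K = 2 receivers and N_t = 3 antennas (made-up values).
H_toy = [[1 + 2j, 3 - 1j, 0.5j], [-1 + 0j, 2 + 2j, 1 - 3j]]
x_in = channels_to_real_input(H_toy)
```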
In general, the more useful features are learned, the easier it is to recover the final beamforming solutions but this may also result in a performance loss due to the higher prediction error. Therefore in the following we propose two learning algorithms which will predict different features and achieve different performance and complexity trade-offs.

B. Primal Learning Algorithm
In the first learning algorithm, we choose the primal power splitting variables $\{\rho_k\}$ as the features to predict. The reason is that, although the problem P2 is convex, the presence of $\{\rho_k\}$ means that it does not belong to a standard category of convex programs, such as SDP or second-order cone programming (SOCP), for which efficient algorithms exist. If we can predict the most computationally demanding variables $\{\rho_k\}$, the overall complexity will be significantly reduced.
To recover the original beamforming solution, once $\{\rho_k\}$ are predicted, we solve P2 with the learned $\{\rho_k\}$, which becomes a standard SDP problem and can be solved more efficiently than the original P2. In this method, the number of real variables to predict is $K$.
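To make this reduction concrete, the following sketch writes one plausible form of P2 once $\{\rho_k\}$ are fixed. The matrices $\mathbf{H}_k = \mathbf{h}_k \mathbf{h}_k^\dagger$, the scaled targets $t\bar{\gamma}_k$ and $t\bar{\lambda}_k$, and the dropped rank-one constraint are assumptions based on the standard semidefinite relaxation of the SINR, EH and SAR constraints, not a formulation quoted from the article.

```latex
\begin{aligned}
\min_{\{\mathbf{W}_k \succeq \mathbf{0}\}} \quad & \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{W}_k) \\
\text{s.t.} \quad
& \frac{\mathrm{Tr}(\mathbf{H}_k \mathbf{W}_k)}{t\bar{\gamma}_k}
  - \sum_{j \neq k} \mathrm{Tr}(\mathbf{H}_k \mathbf{W}_j)
  \ge N_0 + \frac{N_C}{\rho_k}, \quad \forall k, \\
& \sum_{j=1}^{K} \mathrm{Tr}(\mathbf{H}_k \mathbf{W}_j) + N_0
  \ge \frac{F^{-1}(t\bar{\lambda}_k)}{1 - \rho_k}, \quad \forall k, \\
& \sum_{k=1}^{K} \mathrm{Tr}(\mathbf{A}_l \mathbf{W}_k) \le P_l, \quad l = 1, \dots, L.
\end{aligned}
```

With $\{\rho_k\}$ fixed, every right-hand side is a constant, so all constraints are linear in $\{\mathbf{W}_k\}$ and the problem is a standard SDP. When $\{\rho_k\}$ are optimization variables, the terms $1/\rho_k$ and $1/(1-\rho_k)$ keep the constraints convex but take the problem outside the standard SDP form.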

C. Primal-Dual Learning Algorithm
In the previous primal learning method, the complexity of the beamforming recovery module is still relatively high because only the primal variables $\{\rho_k\}$ are predicted and an SDP problem needs to be solved. In the second learning algorithm, coined Primal-Dual Learning, we propose to predict more features so as to reduce the complexity of the recovery module. To be specific, we choose $\{\rho_k\}$ and additionally some of the dual variables as the features. To this end, we first derive the dual problem of P2 (given $\{\rho_k\}$), where $\nu$, $\alpha$, $\beta$ are dual variables and $\theta_k \triangleq \alpha_k - \beta_k$. Instead of learning all dual variables, we propose to choose the main features $\{\theta_k\}$, $\{\nu_l\}$ and the primal variables $\{\rho_k\}$ to predict. These chosen features help us derive the direction of the beamformer. To recover the original beamforming solution, we first find the beamforming direction $\tilde{\mathbf{w}}_k$. With the learned $\{\rho_k\}$ and $\{\tilde{\mathbf{w}}_k\}$, the remaining problem reduces to the power allocation problem P3 to find $\{p_k\}$, which is a linear programming problem and much faster to solve than the SDP in P2, i.e.,

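P3: $\min_{\{p_k \ge 0\}} \sum_{k=1}^{K} p_k$, subject to the SINR, EH and SAR constraints of P2 evaluated at $\mathbf{w}_k = \sqrt{p_k}\,\tilde{\mathbf{w}}_k$, all of which are linear in $\{p_k\}$.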
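The linear structure of the power allocation problem can be spelled out under the assumption that the beamformers are parameterized as $\mathbf{w}_k = \sqrt{p_k}\,\tilde{\mathbf{w}}_k$ with unit-norm directions. The shorthand $g_{kj}$ and $a_{lk}$, and the scaled targets $t\bar{\gamma}_k$ and $t\bar{\lambda}_k$, are introduced here for illustration only and are not notation from the article.

```latex
\begin{aligned}
& g_{kj} = |\mathbf{h}_k^\dagger \tilde{\mathbf{w}}_j|^2, \qquad
  a_{lk} = \tilde{\mathbf{w}}_k^\dagger \mathbf{A}_l \tilde{\mathbf{w}}_k, \\
\min_{\{p_k \ge 0\}} \quad & \sum_{k=1}^{K} p_k \\
\text{s.t.} \quad
& \frac{g_{kk}\, p_k}{t\bar{\gamma}_k} - \sum_{j \neq k} g_{kj}\, p_j
  \ge N_0 + \frac{N_C}{\rho_k}, \quad \forall k, \\
& \sum_{j=1}^{K} g_{kj}\, p_j + N_0
  \ge \frac{F^{-1}(t\bar{\lambda}_k)}{1 - \rho_k}, \quad \forall k, \\
& \sum_{k=1}^{K} a_{lk}\, p_k \le P_l, \quad l = 1, \dots, L.
\end{aligned}
```

Under these assumptions the problem has $K$ variables and $3K + L$ linear constraints (SINR, EH, SAR and nonnegativity), which appears consistent with the $3K + L$ term in the complexity expression quoted for P3.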
In this method, the number of real variables to predict is $2K + L$. As expected, the primal-dual learning predicts more features, so it recovers the original beamforming solution faster than the primal learning method. This will be verified in the next section.
Complexity Analysis: For the optimal solution obtained by solving the problem P2, the complexity is dominated by the SDP constraints. According to [21, 6.6.3], the complexity for solving P2 is O( ). For comparison, as shown in [13, Sec. III], the complexity of predicting the primal and dual variables using both proposed learning algorithms is $\mathcal{O}(N_t K)$. To recover the beamforming solution, the primal learning method still needs to solve an SDP in P2, which has a complexity of the same order as the optimal solution, but because $\{\rho_k\}$ have been predicted, the overall complexity is much lower than that of the optimal solution. For the primal-dual learning method, the complexity of beamforming recovery is determined by solving the linear program P3, and can be expressed as $\mathcal{O}(\sqrt{3K + L}\,(3K^3 + LK^2))$ [21, 6.6.1].
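The output dimensions of the three learning approaches, and the scaling of the P3 recovery cost, can be compared with a short calculation. The sketch below only evaluates the counts and the complexity expression stated in the text (without the hidden constant of the $\mathcal{O}(\cdot)$ notation); the function names are hypothetical.

```python
import math
from typing import Dict


def predicted_real_outputs(n_t: int, k: int, l: int) -> Dict[str, int]:
    """Number of real values each approach asks the network to predict."""
    return {
        "direct": 2 * n_t * k,      # real and imaginary parts of an N_t x K matrix
        "primal": k,                # power splitting ratios rho_k
        "primal_dual": 2 * k + l,   # rho_k, theta_k and nu_l
    }


def lp_recovery_cost(k: int, l: int) -> float:
    """Evaluate sqrt(3K + L) * (3K^3 + L K^2), ignoring constant factors."""
    return math.sqrt(3 * k + l) * (3 * k ** 3 + l * k ** 2)


# Setting used in the numerical section: N_t = 3, K = 2, L = 1.
counts = predicted_real_outputs(n_t=3, k=2, l=1)
cost_p3 = lp_recovery_cost(k=2, l=1)
```

For $N_t = 3$, $K = 2$ and $L = 1$, the direct approach must predict 12 real values, whereas the primal and primal-dual approaches predict 2 and 5, respectively.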

IV. NUMERICAL RESULTS
In this section, we carry out a numerical evaluation of the performance of the proposed deep learning-based beamforming solutions. We consider a MISO downlink consisting of $N_t = 3$ transmit antennas and $K = 2$ receivers randomly located around the BS, with distance $l_k$ and direction $\zeta_k$ drawn from uniform distributions, $l_k \sim U(1, 5)$ m and $\zeta_k \sim U(-\pi, \pi)$. Each receiver can harvest energy at frequency $f = 915$ MHz, and the antenna gains at the BS and the receivers are 8 dBi and 3 dBi, respectively. The path loss coefficient is 2.5. Because of the short distance between the BS and the receivers and the dominance of the line-of-sight (LOS) signal, Rician fading is used to model the channel and the Rician factor is 5 dB. We consider one SAR constraint, i.e., $L = 1$, with $P_T = 2$ W, $N_0 = -70$ dBm and $N_C = -50$ dBm, while the SINR and EH thresholds are the same for all receivers, i.e., $\bar{\gamma}_k = \bar{\gamma} = 10$ dB and $\bar{\lambda}_k = \bar{\lambda} = -15$ dBm, $\forall k$. We adopt the nonlinear energy harvesting model proposed in [18],
$$F(x) = \frac{\bar{a} x + \bar{b}}{x + \bar{c}} - \frac{\bar{b}}{\bar{c}},$$
with fitted parameters $\bar{a} = 2.463$, $\bar{b} = 1.635$, and $\bar{c} = 0.826$. The SAR matrix $\mathbf{A}$ is taken from [22].
In our simulation, we generate 10,000 training samples and 1,000 testing samples with independent channels using the optimal algorithm in [5]. We use four CNN layers, each having 8 kernels of size $3 \times 3$ and the ReLU activation function. The Adam optimizer is used with a mean squared error loss function. We make comparisons with the optimal solution, the direct learning algorithm that learns to predict the real and imaginary parts of the $N_t \times K$ beamforming matrix $\mathbf{w} = [\mathbf{w}_1, \dots, \mathbf{w}_k, \dots, \mathbf{w}_K]$ directly, and the zero-forcing (ZF) solution [5]. Note that, unlike the traditional ZF solution without an SAR constraint, the SAR-constrained ZF problem does not admit a closed-form solution; we therefore use CVX [20] to solve it, so its complexity is still high.
Fig. 2. The minimum achievable SINR and EH ratio vs SAR.
Fig. 2 depicts the average minimum achievable SINR and EH ratio obtained by the different algorithms against the SAR limit. The performance of the proposed primal learning solution is very close to that of the optimal solution and significantly outperforms the ZF solution, while the performance of the proposed primal-dual learning solution is similar to, but still better than, that of the ZF solution. The direct learning method cannot guarantee satisfactory performance and is much worse than the ZF solution. Fig. 3 shows the feasibility of the various schemes, i.e., the probability that both the SINR and the EH constraints are satisfied (the value of the objective function of P1 is greater than or equal to 1), which follows a trend similar to that of the results in Fig. 2. The effect of the SAR can be observed from both Figs. 2 and 3: a stricter (lower) SAR constraint reduces the achievable SINR and the received energy as well as the feasibility probability.
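The simulation parameters quoted in dB and dBm can be converted to linear units with the helper functions below. The sketch also evaluates an EH function of the rational form $F(x) = (\bar{a}x + \bar{b})/(x + \bar{c}) - \bar{b}/\bar{c}$ together with its inverse; this form and its closed-form inverse are assumptions made for illustration, and the units of $x$ and $F(x)$ follow [18] and are not restated here.

```python
def dbm_to_watt(dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10.0 ** ((dbm - 30.0) / 10.0)


def db_to_linear(db: float) -> float:
    """Convert a ratio in dB to a linear ratio."""
    return 10.0 ** (db / 10.0)


# Fitted parameters quoted in the text.
A_BAR, B_BAR, C_BAR = 2.463, 1.635, 0.826


def eh_output(x: float) -> float:
    """Assumed rational EH model; increasing in x and zero at x = 0."""
    return (A_BAR * x + B_BAR) / (x + C_BAR) - B_BAR / C_BAR


def eh_inverse(y: float) -> float:
    """Inverse of eh_output, valid for 0 <= y < A_BAR - B_BAR / C_BAR."""
    return C_BAR * y / (A_BAR - B_BAR / C_BAR - y)


noise_power = dbm_to_watt(-70.0)    # N_0
circuit_noise = dbm_to_watt(-50.0)  # N_C
sinr_target = db_to_linear(10.0)    # SINR threshold as a linear ratio
eh_target = dbm_to_watt(-15.0)      # EH threshold in watts
```

The inverse mapping is what turns the EH requirement into a lower bound on the input RF power, as used when rewriting the EH constraint.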
In Fig. 4, we plot the running time of the various schemes as a percentage of that of the optimal solution. We can see that the ZF solution requires a similar time to the optimal solution. This is mainly because solving for the variables $\{\rho_k\}$ incurs a high complexity. It is observed that the proposed primal learning solution saves 35-40% of the running time compared to the optimal solution while achieving near-optimal performance, as shown in Figs. 2 and 3, and is much faster than the ZF solution. The proposed primal-dual learning solution requires slightly more time than the direct learning algorithm and achieves a gain of nearly two orders of magnitude in running time compared to the optimal solution, while achieving better performance than the ZF solution, as seen in Figs. 2 and 3. The average training time for the proposed primal learning solution, the primal-dual learning solution and direct learning is 35 s, 95 s and 70 s, respectively, on a computer with an Intel i7-7700U CPU, a Titan Xp GPU and 32 GB of RAM. As expected, the primal-dual learning solution requires a longer training time than the primal learning solution but reduces the running time significantly.

V. CONCLUSION
In this article, we have studied the optimization of SAR-constrained multiuser transmit beamforming for SWIPT systems. To reduce the complexity of finding the optimal beamforming and power splitting solutions, we designed two fast solutions using deep learning techniques that predict the main features of the optimization problem. Our simulation results have shown a significant improvement of the proposed learning-based solutions over the heuristic ZF solution and the direct learning approach. The takeaway message of our work is that it is necessary to explore problem-specific features to leverage the full potential of the deep learning approach for SWIPT, while a direct application of deep learning does not always lead to satisfactory results. Note that both the QoS and the EH constraints are time-varying in practice, so an important future direction is to study the generalization of the proposed neural networks so that they adapt efficiently to such changes.