Low-Complexity Antenna Selection and Discrete Phase-Shifts Design in IRS-Assisted Multiuser Massive MIMO Networks

We propose two novel antenna selection (AS) and discrete phase-shifts design (PSD) schemes for use in intelligent reflecting surface (IRS) assisted multiuser massive multiple-input multiple-output (mMIMO) networks. The first AS and PSD method aims at maximizing the gain of the channels; while the second method is an iterative sum-rate maximization (ISM) scheme that aims at maximizing the total achievable rate. For the AS part, we demonstrate that the ISM method achieves near optimal performance with much lower complexity compared to benchmark AS schemes, and can be utilized with any precoder at the mMIMO base station. For the PSD, our proposed successive-refinement optimization methods are not only efficient, but their complexities scale linearly with the number of elements at the IRS, making them highly attractive when dealing with large surfaces. A thorough complexity analysis for the proposed methods is carried out in terms of the number of floating point operations required for their implementations. Finally, extensive numerical results are provided and some key points are highlighted on the performance of the proposed schemes with both conjugate beamforming and zero-forcing precoders.

can offer significant advantages, it is well known that BSs with a large number of active antenna elements suffer from considerably high power consumption, since each antenna is connected to a separate power-demanding radio frequency (RF) chain. In fact, RF chains are responsible for 50%-80% of the total transceiving power consumption of communications systems [1]. In addition, adding more RF chains increases the hardware cost and complexity of the system. One way to overcome these challenges while maintaining the advantages of mMIMO is to apply antenna selection (AS) techniques. However, unlike conventional MIMO systems, where AS is carried out with the main focus on enhancing the performance, designing AS in mMIMO is much more challenging, as the computational complexity can become the main bottleneck and must be taken into account when designing such schemes. In fact, all works on AS in mMIMO share the same motivation, namely reducing the hardware complexity, cost, and power consumption that come with massive antenna arrays, while the quality of any AS algorithm in mMIMO can be assessed by answering the following question: 'how good is the trade-off between the required computational complexity and the achieved performance?'. There has been a considerable amount of work on AS in mMIMO in recent years. For example, the authors in [2] performed AS on measured channels, and they showed that AS can significantly reduce the hardware complexity and power consumption without large degradation in the performance. In [3], a low-complexity AS scheme was proposed to maximize the constructive interference using a conjugate-beamforming (CB) precoder.
In our previous work in [4], user-centric and semiblind interference rejection AS schemes were proposed to maximize the signal-to-interference plus noise ratios (SINRs) of multiuser mMIMO with CB, and a similar methodology was utilized in our work in [5] alongside optimal power control to maximize the rates of cell-edge users. In [6], the authors proposed a branch-and-bound scheme for AS to maximize the MIMO channel capacity, which can be achieved via dirty paper coding. Moreover, the authors in [7] proposed greedy AS schemes inspired by matching pursuit techniques, while in [8], the authors proposed a self-supervised learning based Monte-Carlo tree search AS algorithm to maximize the channel capacity. Furthermore, evolutionary methods have also been utilized for AS in mMIMO to maximize the channel capacity.
0018-9545 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Moving beyond mMIMO, however, is not an easy challenge. On the one hand, increasing the number of active antenna terminals comes at a huge price in terms of hardware complexity and power consumption; on the other hand, the demand for higher data rates continues to increase more rapidly than ever before [14]. To that end, the recently proposed intelligent reflecting surface (IRS) has emerged as a possible solution to satisfy the growth in demand for data rates in beyond-5G networks while maintaining low cost and power requirements [15]-[18]. In principle, an IRS is similar to full-duplex amplify-and-forward relaying, with the main difference being that an IRS cannot provide active amplification, as it does not contain power amplifiers. Instead, it reflects the incident electromagnetic waves on its planar surface with certain phase shifts, such that the direct-path signal and the one reflected by the IRS are constructively combined at the receiver end. Thus, with sufficiently large surfaces, it has the potential to provide reliable, cost-effective and energy-efficient communication links [19]. Therefore, in this work our aim is to design efficient low-complexity methods for AS and phase-shifts design (PSD) for IRS-mMIMO networks.
To ensure efficient utilization of IRSs in multi-antenna communications systems, the passive beamforming, i.e. the phase-shifts of the reflecting elements at the IRS, should be properly designed. Conventional methods include semidefinite relaxation (SDR) followed by randomization to approximate near-optimal solutions [20]-[23]. However, such methods can suffer from extremely high complexity when the number of reflecting elements is large. Moreover, performing AS for mMIMO in the presence of IRSs is more challenging, as each antenna is influenced by the phase-shifts of the reflecting elements at the IRS. Therefore, complexity considerations become even more vital when designing AS in such scenarios. Thus far, the only work on AS with PSD in multiuser IRS-mMIMO was reported in [24], where the authors designed their schemes to maximize the MIMO channel capacity, which can only be obtained via dirty paper coding. Moreover, the authors assumed continuous phase-shifts, meaning that each reflecting element at the IRS can have any phase-shift value between 0 and 2π. However, designing AS and passive beamforming with such assumptions might not necessarily lead to high performance in practical scenarios. In particular, dirty paper coding is known to suffer from extremely high complexity, and thus is unsuitable for mMIMO systems, where linear precoders such as zero-forcing (ZF) and CB are applied. Furthermore, in practical scenarios, the IRS is expected to be implemented with only a few quantization levels [25], and as a result, it is more appropriate to take discrete phase-shifts into account when designing such algorithms. In [26], the authors studied the problem of AS in an IRS-assisted network for the case of a single-antenna user and continuous phase-shifts. However, performing AS and PSD for multiple users is much more challenging and requires an entirely different design approach.
Few works have addressed the practical PSD in IRS. For example, the authors in [27] and [28] studied the power minimization problem of an IRS-assisted multiuser network under discrete phase-shifts, where they proposed a sub-optimal 'successive refinement' scheme, such that the phase of each reflecting element is optimized while keeping the phase-shifts of the other elements fixed. Furthermore, the authors in [29] and [30] solved the discrete phase-shifts design via a mixed-integer linear programming method, which can be tackled through the branch and bound (BaB) algorithm. However, such an algorithm can suffer from exponential complexity, and the authors further proposed a lower-complexity design scheme that is similar to the successive refinement method. Although the works in [27]-[30] have lower complexities than conventional SDR or BaB schemes, they still require a large number of vector/matrix multiplications to optimize the phase of each reflecting element. In contrast, our proposed methods aim at avoiding all vector/matrix multiplications during the optimization of all reflecting elements. It is worth pointing out that the objective function considered for the discrete PSD in our work is different from those in [27]-[30], as our aim is to maximize either the gain of the channels or the achievable sum-rate, rather than to minimize the total transmit power.
In this paper, we propose two novel AS and discrete PSD (AaP) schemes with a unified framework, while maintaining low computational complexities. In particular, the proposed schemes are designed such that the amount of vector/matrix multiplications is minimized, thereby resulting in a dramatic complexity reduction. Moreover, a key advantage of the proposed schemes is that they can be applied with any type of active precoding at the mMIMO BS. We test the performance of our proposed methods, under perfect and imperfect channel state information (CSI), with two types of linear precoders, ZF and CB. Our main contributions in this paper are summarized as follows:
• We propose two novel AaP schemes for use in multiuser mMIMO-IRS networks. In particular, the first method is the maximum channels' gain (MCG)-AaP scheme, where both AS and PSD are carried out to maximize the $\ell_1$ norms of the effective channel matrix.
• The second method, called iterative sum-rate maximization (ISM)-AaP, is designed to maximize the total sum rate of the network. More specifically, for the AS, we follow a decremental approach where at each iteration the antenna that contributes the least to the total sum rate is discarded. Then, the PSD stage takes place, where the phase of each reflecting element is optimized to maximize the total sum rate using the remaining set of available antennas. Moreover, the proposed design criterion is flexible in the sense that the PSD stage can be carried out after discarding any number of undesired antennas, and thus both AS and PSD can be applied either separately or in an alternating fashion.
• A thorough complexity analysis is provided to show that our schemes scale well when the numbers of antennas and reflecting elements become very large. In particular, for the PSD, the complexities of our schemes increase linearly with the number of reflecting elements at the IRS.
• Numerical results demonstrate that the proposed MCG-AS is efficient when applied with a ZF precoder, while the ISM-AS achieves near-optimal performance and can be applied with any type of active precoding. Moreover, we show that when the number of reflecting elements is small, CB outperforms the ZF precoder, especially when the number of selected antennas is not too large.
The rest of this paper is organized as follows. The adopted system model is given in Section II. The proposed AS and discrete PSD schemes are introduced and explained in detail in Section III. The complexity requirement of the proposed schemes is thoroughly investigated in Section IV. Different numerical results are presented and discussed in Section V. Finally, conclusions are drawn in Section VI.
The list of acronyms used throughout this work is shown in Table I.
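Notations: Matrices and vectors are represented by uppercase and lowercase boldface letters, respectively. $\mathbf{A}^T$, $\mathbf{A}^H$ and $\mathrm{Tr}(\mathbf{A})$ are the transpose, Hermitian transpose, and trace of matrix $\mathbf{A}$, respectively. $[\mathbf{A}]_{m,n}$ represents the $n$th element of the $m$th row of $\mathbf{A}$. $[\mathbf{A}]_{i,:}$ and $\mathbf{a}_i$ both refer to the $i$th row of $\mathbf{A}$ (unless $\mathbf{a}_i$ is clearly defined in the manuscript as a column vector) and are used interchangeably as appropriate, while $[\mathbf{A}]_{:,j}$ represents the $j$th column of $\mathbf{A}$. $[\mathbf{A}]_{:,\mathcal{S}}$ is a submatrix of $\mathbf{A}$ which contains the column vectors of $\mathbf{A}$ that belong to set $\mathcal{S}$. $\mathcal{S}_j$ represents the $j$th element of $\mathcal{S}$, while $\mathcal{S}_{\bar{j}}$ is a subset that has all but the $j$th element of $\mathcal{S}$, i.e. $\mathcal{S}_{\bar{j}} = \mathcal{S} \setminus \mathcal{S}_j$. Furthermore, $[\mathbf{x}]_i$ is the $i$th element of $\mathbf{x}$, and $[\mathbf{x}]_{\mathcal{S}}$ is a sub-vector that has all the elements of $\mathbf{x}$ that belong to $\mathcal{S}$. $|\mathcal{S}|$ represents the cardinality of $\mathcal{S}$. $\mathrm{diag}\{\mathbf{a}\}$ is a diagonal matrix whose diagonal contains the elements of $\mathbf{a}$, while $\mathrm{diag}\{\mathbf{A}\}$ is a vector whose elements are the diagonal of $\mathbf{A}$. In addition, $\mathbf{I}_N$ is the $N \times N$ identity matrix. $\mathbb{E}\{\cdot\}$ and $\mathbb{V}\{\cdot\}$ are the expectation and variance operators, respectively. Finally, $\|\mathbf{a}\|_1$ is the $\ell_1$ norm of $\mathbf{a} \in \mathbb{C}^{N \times 1}$, and can be defined as $\|\mathbf{a}\|_1 = \sum_{i=1}^{N} |[\mathbf{a}]_i|$.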

II. SYSTEM MODEL
We consider a single-cell downlink network operating in time division duplex. We assume that a single BS equipped with $N$ antennas is transmitting independent signals to $K$ single-antenna users, with the help of an IRS with $M$ reflecting elements, as shown in Fig. 1. The line-of-sight (LoS) paths between the BS and the $K$ users are assumed to be blocked, while the BS and all users have clear LoS links to the IRS. Let $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_K]^T \in \mathbb{C}^{K \times N}$ be the Rayleigh fading channel matrix between the BS and the $K$ users; $\mathbf{G} = [\mathbf{g}_1, \mathbf{g}_2, \ldots, \mathbf{g}_K]^T \in \mathbb{C}^{K \times M}$ denotes the channel matrix between the IRS and the $K$ users and follows a Rician distribution with both LoS and non-LoS (NLoS) components; while $\mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_M]^T \in \mathbb{C}^{M \times N}$ denotes the Rician distributed channels between the BS and IRS. Similar to [6] and [22], we assume frequency-flat fading channels and that perfect CSI is available at the BS. Moreover, the reflecting elements at the IRS are controlled via the matrix $\boldsymbol{\Theta} = \mathrm{diag}\{[e^{j\theta_1}, e^{j\theta_2}, \ldots, e^{j\theta_M}]\}$. Therefore, the effective channel matrix $\mathbf{H} \in \mathbb{C}^{K \times N}$ can be expressed as follows

$$\mathbf{H} = \mathbf{G}\boldsymbol{\Theta}\mathbf{U} + \mathbf{Q}. \tag{1}$$

We consider a practical scenario where each reflecting element can only have one of $L = 2^b$ quantization levels in $\alpha$, where $b$ is the number of quantization bits and $\alpha$ can be given as

$$\alpha = \left\{1, e^{j2\pi/L}, e^{j4\pi/L}, \ldots, e^{j2\pi(L-1)/L}\right\}. \tag{2}$$
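As an illustration of the model above, the following minimal sketch builds the set of $L = 2^b$ discrete phase-shift levels and the effective channel as the sum of the reflected path through the IRS and the direct path. It is not the authors' implementation: the function names `phase_levels` and `effective_channel`, the list-of-lists matrix layout, and the toy channel values are assumptions made purely for illustration.

```python
import cmath

def phase_levels(b):
    """Return the L = 2**b unit-modulus phase-shift levels (hypothetical helper)."""
    L = 2 ** b
    return [cmath.exp(2j * cmath.pi * l / L) for l in range(L)]

def effective_channel(G, phi, U, Q):
    """Effective channel: reflected path G * diag(phi) * U plus direct path Q.

    G: K x M (IRS-to-users), phi: length-M phase-shift vector,
    U: M x N (BS-to-IRS), Q: K x N (BS-to-users); all lists of complex numbers.
    """
    K, M, N = len(G), len(phi), len(U[0])
    return [[Q[k][n] + sum(G[k][m] * phi[m] * U[m][n] for m in range(M))
             for n in range(N)]
            for k in range(K)]

# Toy example (assumed values): K = 1 user, M = 2 elements, N = 2 antennas, b = 2 bits.
alpha = phase_levels(2)                      # approximately [1, j, -1, -j]
G = [[1 + 0j, 1j]]
U = [[1 + 0j, 2 + 0j], [1 + 0j, 0j]]
Q = [[0j, 1 + 0j]]
H = effective_channel(G, [alpha[0], alpha[1]], U, Q)
```

In this toy case the two reflected contributions cancel at the first antenna and add to the direct path at the second, which shows how strongly the choice of phase levels shapes each column of the effective channel.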
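Assuming that all $N$ antennas at the BS are activated, and letting $\mathbf{W} = [\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_K] \in \mathbb{C}^{N \times K}$ be any active precoding matrix (note that $\{\mathbf{w}_1, \ldots, \mathbf{w}_K\}$ are column vectors), the transmitted signal from the BS is

$$\mathbf{s} = \sum_{k \in \mathcal{K}} \mathbf{w}_k x_k, \tag{3}$$

where $\mathcal{K} = \{1, 2, \ldots, K\}$ is the set of all users, and $x_k$ is an information symbol with $\mathbb{E}\{|x_k|^2\} = 1$. Moreover, the precoding matrix must satisfy the constraint

$$\mathrm{Tr}(\mathbf{W}\mathbf{W}^H) = P_t, \tag{4}$$

where $P_t$ is the total transmit power. Then, assuming that signals reflected from the IRS more than once have extremely small powers and thus can be neglected, the received signal at the $k$th user is

$$y_k = \mathbf{h}_k^T \mathbf{w}_k x_k + \sum_{i \in \mathcal{K}, i \neq k} \mathbf{h}_k^T \mathbf{w}_i x_i + n_k, \tag{5}$$

where $\mathbf{h}_k^T$ is the $k$th row of $\mathbf{H}$ and $n_k \sim \mathcal{CN}(0, \sigma^2)$ represents the additive white Gaussian noise (AWGN). Therefore, the corresponding SINR for the $k$th user can be given as

$$\gamma_k = \frac{|\mathbf{h}_k^T \mathbf{w}_k|^2}{\sum_{i \in \mathcal{K}, i \neq k} |\mathbf{h}_k^T \mathbf{w}_i|^2 + \sigma^2}. \tag{6}$$

For a system with $K$ users, the spectral efficiency in bits/s/Hz can be defined as follows

$$\mathrm{SE} = \sum_{k \in \mathcal{K}} \log_2(1 + \gamma_k). \tag{7}$$

It is clear that for any MIMO system, the performance is highly affected by the precoding scheme applied, and throughout this work, two different types of linear precoding are utilized, namely the ZF and CB, such that [31]

$$\mathbf{W} = \begin{cases} \delta_{ZF}\, \mathbf{H}^H (\mathbf{H}\mathbf{H}^H)^{-1}, & \text{for ZF}, \\ \delta_{CB}\, \mathbf{H}^H, & \text{for CB}, \end{cases} \tag{8}$$

where $\delta_{ZF}$ and $\delta_{CB}$ are scaling factors to ensure that the power constraint in (4) is met, and they can be given as [31]

$$\delta_{ZF} = \sqrt{\frac{P_t}{\mathrm{Tr}\left((\mathbf{H}\mathbf{H}^H)^{-1}\right)}}, \qquad \delta_{CB} = \sqrt{\frac{P_t}{\mathrm{Tr}(\mathbf{H}\mathbf{H}^H)}}. \tag{9}$$

It is worth pointing out that for the ZF precoder, the inter-user interference, i.e. the summation in the denominator of (6), is nulled. However, here we present a general formula for $\gamma_k$ that holds for any precoding scheme.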
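To make the signal model concrete, the sketch below evaluates a CB precoder scaled to the transmit power, the per-user SINRs, and the resulting spectral efficiency for a small channel. It is a minimal illustration under stated assumptions, not the authors' code: the helper names, the list-of-lists layout, and the toy channel are hypothetical, and only the CB precoder is shown because it needs no matrix inversion.

```python
import math

def conj_beamformer(H, Pt):
    """CB precoder W = delta * H^H, scaled so that Tr(W W^H) = Pt.

    H is K x N; the returned W is N x K.
    """
    K, N = len(H), len(H[0])
    trace_R = sum(abs(H[k][n]) ** 2 for k in range(K) for n in range(N))
    delta = math.sqrt(Pt / trace_R)
    return [[delta * H[k][n].conjugate() for k in range(K)] for n in range(N)]

def sinrs(H, W, noise_power):
    """SINR of each user for an arbitrary precoder W."""
    K, N = len(H), len(H[0])
    out = []
    for k in range(K):
        # |h_k^T w_i|^2 for every stream i
        gains = [abs(sum(H[k][n] * W[n][i] for n in range(N))) ** 2
                 for i in range(K)]
        out.append(gains[k] / (sum(gains) - gains[k] + noise_power))
    return out

def spectral_efficiency(gammas):
    """Sum of log2(1 + SINR) over all users, in bits/s/Hz."""
    return sum(math.log2(1 + g) for g in gammas)

# Toy example (assumed values): two users with orthogonal channels.
H = [[1 + 0j, 0j], [0j, 1j]]
Pt = 2.0
W = conj_beamformer(H, Pt)
se = spectral_efficiency(sinrs(H, W, noise_power=1.0))
```

Because the two toy channels are orthogonal, CB causes no inter-user interference here; with correlated channels the interference sum in the SINR denominator would be non-zero.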
Our main goal in this paper is to propose efficient methods to optimize Θ, and to select N s out of the available N antennas at the BS. Note that for the considered multiuser mMIMO-IRS scenario, AS and discrete PSD are both in fact NP-hard problems. Therefore, our aim in this work is to provide sub-optimal, yet highly efficient AS and PSD schemes with low computational complexities.

III. PROPOSED ANTENNA SELECTION AND PHASE-SHIFT DESIGN METHODS
We consider two methods for the AS and PSD. The first one is the maximum channels' gain method, where the antennas and phase-shifts are selected/optimized to maximize the effective channels' gain. For the second method, we propose an iterative AS and PSD to maximize the total sum-rate for any given active precoding matrix $\mathbf{W}$. Both methods aim at reducing the computational complexity by minimizing the number of vector/matrix multiplications, as will be thoroughly explained in the following subsections.

A. Maximum Channels Gain AS and PSD (MCG-AaP) Scheme
Here we present the ultra-low-complexity MCG-AaP method, and introduce our novel PSD scheme. We next formulate the corresponding optimization problem.
1) Problem Formulation: for the MCG-AaP scheme, both the AS and PSD are carried out to maximize the effective gains of the channels in terms of the $\ell_1$ norm. This optimization problem is formulated in (10), whose constraints include

$$[\boldsymbol{\Delta}]_{n,n} \in \{0, 1\}, \quad \forall n \in \mathcal{N}, \tag{10b}$$

where $\mathcal{N} = \{1, \ldots, N\}$, $\mathcal{M} = \{1, \ldots, M\}$, and $\boldsymbol{\Delta}$ is a binary diagonal matrix that controls the selected antennas, such that $[\boldsymbol{\Delta}]_{n,n} = 1$ if the $n$th antenna is selected and $[\boldsymbol{\Delta}]_{n,n} = 0$ otherwise. It is clear that the optimization variables are coupled in this case, which makes the optimization difficult. Accordingly, we divide this problem into two sub-problems. At first, we select the antennas with the maximum channels' norms under random phase-shifts. Then, we propose a novel low-complexity PSD scheme to achieve further channel gain enhancement for the selected antennas.

2) Maximum Channels' Gain AS (MCG-AS):
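we start by recasting the formulation of the effective channel matrix as follows

$$\mathbf{h}_k^T = \boldsymbol{\phi}^T \boldsymbol{\Xi}_k + \mathbf{q}_k^T, \quad \forall k \in \mathcal{K}, \tag{12}$$

where $\boldsymbol{\phi} = \mathrm{diag}\{\boldsymbol{\Theta}\}$, and $\boldsymbol{\Xi}_k = \mathrm{diag}\{\mathbf{g}_k\}\mathbf{U} \in \mathbb{C}^{M \times N}$ is the cascaded channel, which can be represented in a matrix form as follows

$$\boldsymbol{\Xi}_k = \begin{bmatrix} [\mathbf{g}_k]_1 [\mathbf{U}]_{1,1} & \cdots & [\mathbf{g}_k]_1 [\mathbf{U}]_{1,N} \\ \vdots & \ddots & \vdots \\ [\mathbf{g}_k]_M [\mathbf{U}]_{M,1} & \cdots & [\mathbf{g}_k]_M [\mathbf{U}]_{M,N} \end{bmatrix}. \tag{13}$$

Note that the evaluation of $\boldsymbol{\Xi}_k$ is crucial to reduce the complexity during the PSD that follows the AS stage, as will be seen later. Therefore, evaluating the channel vectors $\mathbf{h}_k$ in terms of $\boldsymbol{\Xi}_k$ at this stage avoids any additional computations during the PSD stage.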
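After evaluating $\boldsymbol{\Xi}_k$ ($\forall k \in \mathcal{K}$), we generate a random phase-shift vector $\boldsymbol{\phi}$ and obtain $\mathbf{H}$ in (12). Then, the set of selected antennas, denoted as $\mathcal{S}$, with the maximum channels' gains can now be given as follows

$$\mathcal{S} = \underset{n \in \mathcal{N}}{\arg\max}^{(N_s)} \ \left\|[\mathbf{H}]_{:,n}\right\|_1, \tag{14}$$

where $\arg\max^{(N_s)}$ indicates that the $N_s$ highest values of the argument will be identified. Next, we shift our attention to the optimization of $\boldsymbol{\phi}$.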
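The selection step described above amounts to ranking the columns of the effective channel by their $\ell_1$ norm and keeping the $N_s$ largest. A minimal sketch follows; the function name `mcg_antenna_selection`, the zero-based indexing, and the toy channel are illustrative assumptions rather than part of the original scheme description.

```python
def mcg_antenna_selection(H, Ns):
    """Return the (sorted, zero-based) indices of the Ns columns of the
    K x N channel H with the largest l1 norm, i.e. sum of entry magnitudes."""
    K, N = len(H), len(H[0])
    col_gain = [sum(abs(H[k][n]) for k in range(K)) for n in range(N)]
    ranked = sorted(range(N), key=lambda n: col_gain[n], reverse=True)
    return sorted(ranked[:Ns])

# Toy example (assumed values): K = 2 users, N = 4 antennas.
H = [[1 + 0j, 3 + 4j, 0.5 + 0j, 2 + 0j],
     [1j,     0j,     0.5 + 0j, -2 + 0j]]
selected = mcg_antenna_selection(H, Ns=2)   # column gains are 2, 5, 1, 4
```

Only magnitudes and a sort are involved, so no vector or matrix multiplication is needed once the effective channel is available.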
3) Maximum Channels' Gain PSD (MCG-PSD): for the phase-shift design, we adopt a successive refinement method, such that we optimize the phase of one reflecting element while keeping the phases of the remaining $(M - 1)$ elements fixed. This optimization problem can be expressed as follows

$$\underset{[\boldsymbol{\phi}]_m}{\text{maximize}} \ \sum_{k \in \mathcal{K}} \left\| \boldsymbol{\phi}^T [\boldsymbol{\Xi}_k]_{:,\mathcal{S}} + [\mathbf{q}_k]_{\mathcal{S}}^T \right\|_1 \tag{15a}$$

subject to

$$[\boldsymbol{\phi}]_m \in \alpha, \tag{15b}$$

where $\text{maximize}_{[\mathbf{a}]_j}$ means that the optimization will be carried out only on the $j$th entry of $\mathbf{a}$, while the remaining elements are fixed. Note that only the columns/entries which correspond to $\mathcal{S}$ in $\boldsymbol{\Xi}_k$ and $\mathbf{q}_k$ are involved in the above optimization problem, since the phase optimization is carried out only for the selected antennas.
Optimizing each of the $M$ phase-shifts according to (15) is straightforward; we simply need to evaluate the objective function $L$ times and choose the phase-shift value that leads to the maximum channel gain. However, this method is not efficient in terms of the required computational complexity, and our aim is to reduce the complexity of (15) without any degradation in the performance.
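To achieve that, let us have a closer look at the $k$th row of $\mathbf{H}$, where we can observe that

$$\mathbf{h}_k^T = \sum_{m \in \mathcal{M}} [\boldsymbol{\phi}]_m [\boldsymbol{\Xi}_k]_{m,\mathcal{S}} + [\mathbf{q}_k]_{\mathcal{S}}^T. \tag{16}$$

It is clear that to optimize the phase of each reflecting element, we need to take into account all available antennas and users. Accordingly, we search over the $L$ phase quantization levels in $\alpha$ to identify the optimal phase value. Considering the $m$th reflecting element, this can now be performed as follows

$$l_m = \underset{l \in \{1, \ldots, L\}}{\arg\max} \ \sum_{k \in \mathcal{K}} \left\| \mathbf{h}_k^T + \left([\alpha]_l - [\boldsymbol{\phi}]_m\right) [\boldsymbol{\Xi}_k]_{m,\mathcal{S}} \right\|_1. \tag{17}$$

Using (17), we can perform the search/optimization over the $L$ quantization levels for all $M$ reflecting elements at the IRS using only element-wise operations, and without any compromise in the performance compared to (15). Moreover, after optimizing the phase of each reflecting element, we update $\mathbf{H}$ as follows

$$\mathbf{h}_k^T \leftarrow \mathbf{h}_k^T + \left([\alpha]_{l_m} - [\boldsymbol{\phi}]_m\right) [\boldsymbol{\Xi}_k]_{m,\mathcal{S}}, \quad \forall k \in \mathcal{K}, \tag{18}$$

and the optimal phase value for the $m$th reflecting element can now be obtained as $[\boldsymbol{\phi}^{\star}]_m = [\alpha]_{l_m}$. The steps of the MCG-AaP scheme are given in Algorithm 1.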
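The successive refinement described above can be sketched as follows: for each reflecting element, every candidate level is scored by adjusting the current effective channel element-wise, and the channel is then updated in place for the winning level. This is a minimal illustration under assumptions, not the authors' Algorithm 1: the function names, the nested-list layout `Xi[k][m][n]` for the cascaded channel restricted to the selected antennas, and the random test data are all hypothetical.

```python
import cmath
import random

def l1_gain(H):
    """Sum of the magnitudes of all entries of H (total l1 channel gain)."""
    return sum(abs(x) for row in H for x in row)

def mcg_psd(Xi, Q, phi, alpha):
    """One successive-refinement pass over all M reflecting elements.

    Xi[k][m][n]: cascaded channel, Q[k][n]: direct channel (selected antennas),
    phi: current phase-shift vector with entries taken from alpha.
    Returns the refined phase vector and the matching effective channel.
    """
    K, M, N = len(Xi), len(phi), len(Q[0])
    phi = list(phi)
    # The effective channel is built once; afterwards only element-wise updates.
    H = [[Q[k][n] + sum(phi[m] * Xi[k][m][n] for m in range(M))
          for n in range(N)] for k in range(K)]
    for m in range(M):
        def gain_if(level):
            d = level - phi[m]              # change in the m-th coefficient
            return sum(abs(H[k][n] + d * Xi[k][m][n])
                       for k in range(K) for n in range(N))
        best = max(alpha, key=gain_if)
        d = best - phi[m]
        for k in range(K):
            for n in range(N):
                H[k][n] += d * Xi[k][m][n]  # in-place channel update
        phi[m] = best
    return phi, H

def rnd():
    """Complex Gaussian sample (illustrative test data only)."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))

random.seed(0)
K, M, N, b = 2, 4, 3, 2
alpha = [cmath.exp(2j * cmath.pi * l / 2 ** b) for l in range(2 ** b)]
Xi = [[[rnd() for _ in range(N)] for _ in range(M)] for _ in range(K)]
Q = [[rnd() for _ in range(N)] for _ in range(K)]
phi0 = [random.choice(alpha) for _ in range(M)]
phi_opt, H_opt = mcg_psd(Xi, Q, phi0, alpha)
```

Since the current level of each element is itself one of the candidates, the total gain can never decrease from one element to the next, and the incrementally updated channel matches a full recomputation from the final phase vector.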

B. Iterative Sum-Rate Maximization AS and PSD (ISM-AaP) Scheme
In general, selecting all antennas at once can often lead to a relatively poor performance. In contrast, iterative schemes can be much more efficient in terms of performance when designed properly, but they require higher complexities compared to noniterative schemes. In this section, we propose an iterative AS and PSD scheme to maximize the total sum-rate of the network. In the following, we start by formulating the optimization problem and then thoroughly explain the proposed methods.
1) Problem Formulation: analytically, and for a given precoding matrix $\mathbf{W}$, the optimization problem for maximizing the achievable sum rate via AS and PSD can be expressed as in (19). It is clear that this optimization problem is more complex than that for the MCG case. In particular, both AS and discrete PSD are in general NP-hard problems. Moreover, any selection of the antennas is influenced by the phase-shift vector $\boldsymbol{\phi}$, and therefore, one should take the phase-shifts into account when designing such an algorithm. Next we introduce our proposed iterative schemes for the joint AS and PSD.
2) Iterative Sum-Rate Maximization AS (ISM-AS): the main idea behind iterative schemes is to select/discard one antenna at each iteration to maximize/minimize a given cost function. We follow a decremental approach, such that at each iteration we eliminate the antenna which contributes the least to the total sum-rate. Mathematically, and for a given W and φ, this can be represented as shown in (20), where S is the set of available antennas at any given iteration. To be more specific, we have |S| = N before eliminating any antenna (i.e. before the algorithm starts), while |S| = N s after discarding the least desirable (N − N s ) antennas. Moreover, ζ is the index of the discarded antenna at the current iteration. Note that the selection in (20) requires performing a very large number of vector and matrix multiplications, resulting in a considerably large computational complexity for a system with large number of antennas.
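Accordingly, we aim at reducing the complexity without compromising the performance by minimizing the required amount of vector multiplications while selecting the antennas. Specifically, we generate a random phase-shift vector $\boldsymbol{\phi}$ and obtain an initial channel matrix $\mathbf{H}$ according to (12) [Footnote 3]. Then, we evaluate an initial precoding matrix $\mathbf{W}$ according to (8). Moreover, we define $\boldsymbol{\Pi} = [\boldsymbol{\pi}_1, \boldsymbol{\pi}_2, \ldots, \boldsymbol{\pi}_K]^T \in \mathbb{C}^{K \times K}$, such that

$$\boldsymbol{\pi}_k^T = \mathbf{h}_k^T \mathbf{W}, \quad \text{i.e.} \quad [\boldsymbol{\Pi}]_{k,i} = \sum_{n=1}^{|\mathcal{S}|} [\mathbf{H}]_{k,n} [\mathbf{W}]_{n,i}, \tag{21}$$

where $\mathcal{S}$ is used at this stage only for convenience to express a general case regardless of which iteration the algorithm is at, since discarding each antenna results in removing one column of $\mathbf{H}$ and one row of $\mathbf{W}$, as will be explained later. It follows that the SINR for the $k$th user in (6) can now equivalently be expressed as

$$\gamma_k = \frac{|[\boldsymbol{\Pi}]_{k,k}|^2}{\sum_{i \in \mathcal{K}, i \neq k} |[\boldsymbol{\Pi}]_{k,i}|^2 + \sigma^2}. \tag{22}$$

Then, the least desirable antenna, denoted as $\beta$, can now be identified/discarded as follows [Footnote 4]

$$\beta = \underset{n \in \{1, \ldots, |\mathcal{S}|\}}{\arg\max} \ \sum_{k \in \mathcal{K}} \log_2\left(1 + \frac{\left|[\boldsymbol{\Pi}]_{k,k} - [\mathbf{H}]_{k,n}[\mathbf{W}]_{n,k}\right|^2}{\sum_{i \in \mathcal{K}, i \neq k} \left|[\boldsymbol{\Pi}]_{k,i} - [\mathbf{H}]_{k,n}[\mathbf{W}]_{n,i}\right|^2 + \sigma^2}\right). \tag{23}$$

Note that the search over all available antennas to discard the least favourable one is now carried out only through element-wise operations, which results in a tremendous complexity reduction compared to (20) without any compromise in the performance. In other words, both (20) and (23) lead to the exact same solution, and the latter approach requires much lower complexity than the former one. After eliminating each antenna, we update $\boldsymbol{\Pi}$, $\boldsymbol{\Xi}_k$, $\mathbf{W}$, $\mathbf{H}$, and $\mathcal{S}$ as follows

$$\boldsymbol{\Pi} \leftarrow \boldsymbol{\Pi} - [\mathbf{H}]_{:,\beta} [\mathbf{W}]_{\beta,:}, \tag{24a}$$
$$\boldsymbol{\Xi}_k \leftarrow [\boldsymbol{\Xi}_k]_{:,\mathcal{S} \setminus \mathcal{S}_\beta}, \quad \forall k \in \mathcal{K}, \tag{24b}$$
$$\mathbf{W} \leftarrow [\mathbf{W}]_{\mathcal{S} \setminus \mathcal{S}_\beta,:}, \tag{24c}$$
$$\mathbf{H} \leftarrow [\mathbf{H}]_{:,\mathcal{S} \setminus \mathcal{S}_\beta}, \tag{24d}$$
$$\mathcal{S} \leftarrow \mathcal{S} \setminus \mathcal{S}_\beta. \tag{24e}$$

Footnote 3: We refine the initial $\mathbf{H}$ by optimizing the initial phase-shifts using (17), and $\mathbf{H}$ is then updated after optimizing each reflecting element according to (18). It should be noted that at this stage we have $\mathcal{S} = \{1, \ldots, N\}$, and this initial optimization is carried out only once before the iterative algorithm starts.
Footnote 4: Note that even when ZF precoding is applied, the inter-user interference will not be zero when discarding any antenna, and our aim is to discard the antenna that leads to the minimum loss in the total sum rate.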
$$\underset{\boldsymbol{\Delta},\, \boldsymbol{\phi}}{\text{maximize}} \ \sum_{k \in \mathcal{K}} \log_2(1 + \gamma_k) \tag{19a}$$
$$\text{subject to constraints (10a) and (10b).} \tag{19b}$$

It is noteworthy that in case the AS, PSD, and active precoding are carried out separately, $\boldsymbol{\Pi}$ has to be evaluated only once. In contrast, for alternating optimization, where the phase-shifts and active precoding are updated after discarding every $v$ antennas ($v \geq 1$), $\boldsymbol{\Pi}$ has to be re-evaluated every time $\mathbf{W}$ is updated. The effective channel matrix $\mathbf{H}$, however, undergoes two different types of modification. In particular, after eliminating each antenna, one column is removed from $\mathbf{H}$ according to (24d), while the same matrix is updated according to (18) after optimizing the phase of each reflecting element, as will be highlighted again in the following subsection.
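The decremental selection described above can be sketched as follows for the case where the precoder is kept fixed during the whole selection (i.e. AS and precoding carried out separately). Each candidate antenna is scored by subtracting its element-wise contribution from $\boldsymbol{\Pi}$ rather than by recomputing any matrix product. The function names are hypothetical, and instead of physically deleting columns and rows, this sketch tracks the indices of the retained antennas, which is an implementation choice made here for brevity.

```python
import math

def sum_rate_from_pi(Pi, noise_power):
    """Sum rate from the K x K matrix Pi, where Pi[k][i] = h_k^T w_i."""
    K = len(Pi)
    total = 0.0
    for k in range(K):
        signal = abs(Pi[k][k]) ** 2
        interference = sum(abs(Pi[k][i]) ** 2 for i in range(K) if i != k)
        total += math.log2(1 + signal / (interference + noise_power))
    return total

def ism_antenna_selection(H, W, Ns, noise_power):
    """Discard antennas one at a time for fixed H (K x N) and W (N x K).

    Returns the zero-based indices of the Ns retained antennas.
    """
    K, N = len(H), len(H[0])
    keep = list(range(N))
    Pi = [[sum(H[k][n] * W[n][i] for n in keep) for i in range(K)]
          for k in range(K)]
    while len(keep) > Ns:
        def rate_without(n):
            # Remove antenna n's contribution using element-wise products only.
            reduced = [[Pi[k][i] - H[k][n] * W[n][i] for i in range(K)]
                       for k in range(K)]
            return sum_rate_from_pi(reduced, noise_power)
        beta = max(keep, key=rate_without)   # smallest loss in sum rate
        for k in range(K):
            for i in range(K):
                Pi[k][i] -= H[k][beta] * W[beta][i]
        keep.remove(beta)
    return keep

# Toy example (assumed values): one user, three antennas, unit weights.
H = [[1 + 0j, 2 + 0j, 3 + 0j]]
W = [[1 + 0j], [1 + 0j], [1 + 0j]]
kept_two = ism_antenna_selection(H, W, Ns=2, noise_power=1.0)
kept_one = ism_antenna_selection(H, W, Ns=1, noise_power=1.0)
```

In the toy case the weakest antenna is dropped first and the strongest survives, as expected when all contributions add coherently.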
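3) Iterative Sum-Rate Maximization PSD (ISM-PSD): similar to the MCG case, we adopt a successive refinement method such that we optimize the phase of only one reflecting element at a time. However, the objective function considered in this case is the total sum-rate. This optimization problem can be expressed as follows

$$\underset{[\boldsymbol{\phi}]_m}{\text{maximize}} \ \sum_{k \in \mathcal{K}} \log_2\left(1 + \frac{\left|\left(\boldsymbol{\phi}^T \boldsymbol{\Xi}_k + [\mathbf{q}_k]_{\mathcal{S}}^T\right)\mathbf{w}_k\right|^2}{\sum_{i \in \mathcal{K}, i \neq k} \left|\left(\boldsymbol{\phi}^T \boldsymbol{\Xi}_k + [\mathbf{q}_k]_{\mathcal{S}}^T\right)\mathbf{w}_i\right|^2 + \sigma^2}\right) \tag{25a}$$

subject to

$$[\boldsymbol{\phi}]_m \in \alpha. \tag{25b}$$

The above optimization problem can be solved by evaluating the objective function $L$ times, and then setting $[\boldsymbol{\phi}^{\star}]_m$ as the phase-shift value in $\alpha$ that achieves the highest sum-rate [Footnote 5]. However, this method can suffer from high computational complexity, as will be seen later. Accordingly, we aim at reducing the number of vector and matrix multiplications involved in the optimization to its minimum. However, the current formulations of the SINRs in (22) and/or the objective function of (25) do not help in reducing the complexity. In particular, we need to look at the SINRs from a different angle, where we can represent the signal and interference terms for each user as a function of the $M$ reflecting elements. Therefore, recalling that $\mathbf{h}_k^T = \boldsymbol{\phi}^T \boldsymbol{\Xi}_k + [\mathbf{q}_k]_{\mathcal{S}}^T$, we recast $\boldsymbol{\pi}_k$ in (21) as follows

$$\boldsymbol{\pi}_k^T = \boldsymbol{\phi}^T \mathbf{Z}_k + \mathbf{c}_k, \tag{26}$$

where $\mathbf{c}_k = [\mathbf{q}_k]_{\mathcal{S}}^T \mathbf{W} \in \mathbb{C}^{1 \times K}$ contains the signal and interference values for the $k$th user through the direct link with the BS; while $\mathbf{Z}_k = [\boldsymbol{\Xi}_k \mathbf{w}_1, \boldsymbol{\Xi}_k \mathbf{w}_2, \ldots, \boldsymbol{\Xi}_k \mathbf{w}_K] \in \mathbb{C}^{M \times K}$, which can be expressed in a matrix form as

$$\mathbf{Z}_k = \begin{bmatrix} [\boldsymbol{\Xi}_k]_{1,:}\mathbf{w}_1 & \cdots & [\boldsymbol{\Xi}_k]_{1,:}\mathbf{w}_K \\ \vdots & \ddots & \vdots \\ [\boldsymbol{\Xi}_k]_{M,:}\mathbf{w}_1 & \cdots & [\boldsymbol{\Xi}_k]_{M,:}\mathbf{w}_K \end{bmatrix}, \tag{28}$$

contains the signal and interference values for the $k$th user via the IRS path before being multiplied or affected by $\boldsymbol{\phi}$. Note that we only need to evaluate $\{\mathbf{Z}_1, \mathbf{Z}_2, \ldots, \mathbf{Z}_K\}$ once for the optimization of all $M$ reflecting elements. In particular, considering the $m$th reflecting element, it is not difficult to see that the optimal phase index can now be obtained as

$$l_m = \underset{l \in \{1, \ldots, L\}}{\arg\max} \ \sum_{k \in \mathcal{K}} \log_2\left(1 + \frac{\left|[\boldsymbol{\Pi}]_{k,k} + \left([\alpha]_l - [\boldsymbol{\phi}]_m\right)[\mathbf{Z}_k]_{m,k}\right|^2}{\sum_{i \in \mathcal{K}, i \neq k} \left|[\boldsymbol{\Pi}]_{k,i} + \left([\alpha]_l - [\boldsymbol{\phi}]_m\right)[\mathbf{Z}_k]_{m,i}\right|^2 + \sigma^2}\right), \tag{29}$$

which does not involve any vector or matrix multiplications, resulting in a significant complexity reduction compared to the conventional successive refinement scheme given in (25), and without any compromise in the performance. After optimizing the phase of each reflecting element, $\mathbf{H}$ should be updated according to (18), while $\boldsymbol{\Pi}$ can be updated as follows

$$[\boldsymbol{\Pi}]_{k,:} \leftarrow [\boldsymbol{\Pi}]_{k,:} + \left([\alpha]_{l_m} - [\boldsymbol{\phi}]_m\right)[\mathbf{Z}_k]_{m,:}, \quad \forall k \in \mathcal{K}, \tag{27}$$

and the optimal phase for the $m$th reflecting element can be obtained as $[\boldsymbol{\phi}^{\star}]_m = [\alpha]_{l_m}$. It is worth highlighting that $\{\mathbf{c}_1, \ldots, \mathbf{c}_K\}$ in (26) are implicitly included in $\boldsymbol{\Pi}$, and we do not need to evaluate them, since we only need $\{\mathbf{Z}_1, \ldots, \mathbf{Z}_K\}$ to update $\boldsymbol{\Pi}$ after optimizing the phase of each reflecting element.

Footnote 5: Note that since $\mathbf{q}_k$ was not directly involved in the AS design stage, $\mathcal{S}$ is utilized when dealing with these channels, as can be seen in (25); while this is not the case for $\boldsymbol{\Xi}_k$ and $\mathbf{W}$, which are updated according to (24b) and (24c), respectively, after discarding each antenna.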
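A minimal sketch of this sum-rate successive refinement is given below. It takes the current $\boldsymbol{\Pi}$ and the precomputed $\{\mathbf{Z}_k\}$ as inputs, scores each candidate level by adjusting $\boldsymbol{\Pi}$ element-wise, and applies the winning adjustment before moving to the next element. The function names, the nested-list layout `Z[k][m][i]`, and the random test data (including the direct-link terms `C`, used here only to build a consistent starting $\boldsymbol{\Pi}$) are assumptions for illustration, not the authors' Algorithm 2.

```python
import cmath
import math
import random

def sum_rate(Pi, noise_power):
    """Sum rate from the K x K matrix Pi, where Pi[k][i] = h_k^T w_i."""
    K = len(Pi)
    total = 0.0
    for k in range(K):
        signal = abs(Pi[k][k]) ** 2
        interference = sum(abs(Pi[k][i]) ** 2 for i in range(K) if i != k)
        total += math.log2(1 + signal / (interference + noise_power))
    return total

def ism_psd(Pi, Z, phi, alpha, noise_power):
    """One refinement pass over all M elements; Z[k][m][i] = [Xi_k w_i]_m.

    Returns the refined phase vector and the matching Pi (inputs untouched).
    """
    K, M = len(Z), len(phi)
    phi = list(phi)
    Pi = [row[:] for row in Pi]
    for m in range(M):
        def rate_if(level):
            d = level - phi[m]
            trial = [[Pi[k][i] + d * Z[k][m][i] for i in range(K)]
                     for k in range(K)]
            return sum_rate(trial, noise_power)
        best = max(alpha, key=rate_if)
        d = best - phi[m]
        for k in range(K):
            for i in range(K):
                Pi[k][i] += d * Z[k][m][i]   # element-wise update of Pi
        phi[m] = best
    return phi, Pi

def rnd():
    """Complex Gaussian sample (illustrative test data only)."""
    return complex(random.gauss(0, 1), random.gauss(0, 1))

random.seed(1)
K, M, L = 2, 5, 4
alpha = [cmath.exp(2j * cmath.pi * l / L) for l in range(L)]
Z = [[[rnd() for _ in range(K)] for _ in range(M)] for _ in range(K)]
C = [[rnd() for _ in range(K)] for _ in range(K)]        # direct-link terms
phi0 = [random.choice(alpha) for _ in range(M)]
Pi0 = [[C[k][i] + sum(phi0[m] * Z[k][m][i] for m in range(M))
        for i in range(K)] for k in range(K)]
phi_opt, Pi_opt = ism_psd(Pi0, Z, phi0, alpha, noise_power=1.0)
```

As in the source description, the direct-link terms never need to be recomputed inside the loop, since they are already contained in the $\boldsymbol{\Pi}$ that is passed in.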

4) The Overall Algorithm: we initialize a random phase-shift vector and obtain an initial channel matrix. Then, we carry out an initial phase-shift optimization on $\boldsymbol{\phi}$ to maximize the channel norms of $\mathbf{H}$ before evaluating $\mathbf{W}$. After that, $T = (N - N_s)/v$ iterations of AS and PSD are carried out for sum-rate maximization. In particular, for each of the $T$ iterations, we eliminate the least desirable $v$ antennas according to (23), and then the $M$ phase-shifts are optimized according to (29) for further performance enhancement. Note that when $v = 1$, $\boldsymbol{\phi}$ will be optimized after each antenna is eliminated; while if $v = N - N_s$, AS and PSD are carried out separately. Moreover, the precoding matrix $\mathbf{W}$ is re-evaluated each time $\boldsymbol{\phi}$ is optimized. The steps of the proposed ISM-AaP scheme are given in Algorithm 2.
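The role of the parameter $v$ can be illustrated with a small helper that lists how many antennas remain after each AS and PSD iteration. The helper name `ism_schedule` is hypothetical, and the check that $v$ divides $N - N_s$ is an assumption made here so that $T$ is an integer; the source does not state how a non-integer ratio would be handled.

```python
def ism_schedule(N, Ns, v):
    """Antenna counts after each of the T = (N - Ns) / v iterations.

    Assumes (for this sketch only) that v divides N - Ns exactly.
    """
    if v < 1 or (N - Ns) % v != 0:
        raise ValueError("v must be a positive divisor of N - Ns")
    T = (N - Ns) // v
    return [N - (t + 1) * v for t in range(T)]

# v = N - Ns gives a single iteration (AS and PSD carried out separately),
# while smaller v interleaves phase optimization with antenna removal.
separate = ism_schedule(64, 16, 48)
alternating = ism_schedule(64, 16, 16)
```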

IV. COMPLEXITY ANALYSIS FOR THE PROPOSED SCHEMES
In this section we present the complexity analysis of the proposed AS and PSD schemes in terms of the number of floating point operations (FLOPs). In particular, we follow the analysis in [34], such that each operation (addition, multiplication, subtraction, or division) between two real numbers is equivalent to one FLOP. We also assume that for any $x \in \mathbb{R}$, finding $\sqrt{x}$ and $\log_2(x)$ are each equivalent to 1 FLOP.

A. Complexity Analysis of the MCG-AaP Scheme
1) Antenna Selection: evaluating $\boldsymbol{\Xi}_k$ ($\forall k \in \mathcal{K}$) according to (13) requires $MNK$ complex multiplications, which corresponds to $6MNK$ FLOPs [Footnote 6]. Then, obtaining each row in $\mathbf{H}$ according to (12) requires $NM$ complex multiplications and $NM$ complex additions. As a result, obtaining $\mathbf{H}$ takes $8MNK$ FLOPs. Finding the $\ell_1$ norm for all columns in $\mathbf{H}$ requires $N(5K - 1)$ FLOPs. Finally, sorting the $N$ channel gain values before the selection requires $N \log_{10} N$ FLOPs. Therefore, the total complexity of the MCG-AS scheme is

$$C^{\mathrm{MCG}}_{\mathrm{AS}} = NK(14M + 5) + N(\log_{10} N - 1), \tag{30}$$

and the complexity order of the MCG-AS can thus be given as $O(NMK + N \log_{10} N)$.

Footnote 6: Note that when a passive IRS is employed, the BS can estimate the cascaded channel $\boldsymbol{\Xi}_k$ at once without performing the matrix multiplication in (13). Thus, our analysis in this section represents the worst-case scenario in terms of complexity requirements.
2) Phase-Shifts Design: finding $l_m$ for each reflecting element according to (17) requires $15N_sKL$ FLOPs. Then, updating $\mathbf{H}$ according to (18) after optimizing each reflecting element requires $10KN_s$ FLOPs. Therefore, the total number of FLOPs required for optimizing all $M$ reflecting elements with the necessary update of the channel matrix is

$$C^{\mathrm{MCG}}_{\mathrm{PSD}} = MKN_s(15L + 10). \tag{31}$$

3) Total Complexity: since this approach is non-iterative and does not require active precoding, it follows that the total complexity of the MCG-AaP can be given as follows

$$C^{\mathrm{MCG}}_{\mathrm{AaP}} = NK(14M + 5) + N(\log_{10} N - 1) + MKN_s(15L + 10). \tag{32}$$
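The FLOP counts quoted above for the MCG-AaP scheme can be tallied with a short script, which is convenient for checking how the cost scales with $N$, $M$, $K$, $N_s$ and $L$. The function names are hypothetical; the individual terms are taken directly from the counts stated in this subsection, including the base-10 logarithm used for the sorting step.

```python
import math

def mcg_as_flops(N, M, K):
    """FLOPs of the MCG antenna selection, term by term."""
    cascaded = 6 * M * N * K          # evaluating the cascaded channels
    channel = 8 * M * N * K           # forming the effective channel
    norms = N * (5 * K - 1)           # l1 norm of every column
    sorting = N * math.log10(N)       # sorting the N gain values
    return cascaded + channel + norms + sorting

def mcg_psd_flops(M, K, Ns, L):
    """FLOPs of the MCG phase-shift design over all M elements."""
    search = 15 * Ns * K * L          # level search for one element
    update = 10 * K * Ns              # channel update for one element
    return M * (search + update)

# Example with the default simulation values N = M = 64, K = 6, b = 3 (L = 8),
# and an assumed Ns = 16 selected antennas.
as_cost = mcg_as_flops(64, 64, 6)
psd_cost = mcg_psd_flops(64, 6, 16, 8)
total_cost = as_cost + psd_cost
```

The PSD cost is linear in $M$, in line with the scaling claimed for the proposed phase-shift design.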

B. Complexity Analysis of the ISM-AaP Scheme
First, we need to include the complexity of the initial phase optimization that maximizes the channel norms to ensure good convergence behaviour. Obtaining $\boldsymbol{\Xi}_k$ ($\forall k \in \mathcal{K}$) and $\mathbf{H}$ (under random phase-shifts) requires $6MNK$ and $8MNK$ FLOPs, respectively. Then, following the analysis of the MCG-PSD, it is not difficult to see that the complexity of the initial phase optimization and the corresponding channels' update is $MNK(15L + 10)$. Therefore, the initial complexity for obtaining $\boldsymbol{\Xi}_k$ and $\mathbf{H}$, and carrying out the initial phase-shifts optimization and channels' update, is

$$C_{\mathrm{init}} = 6MNK + 8MNK + MNK(15L + 10) = MNK(15L + 24). \tag{33}$$

Now we shift our attention to the complexity analysis of the iterative ISM scheme. In particular, since this method requires evaluating the precoding weights, we start by evaluating the complexities of the CB and ZF precoders. Then we evaluate the complexities of the AS and PSD. Moreover, we take into account that the precoding matrix must be updated after discarding every $v$ antennas, meaning that $\mathbf{W}$ has to be evaluated $(T + 1)$ times. However, only the first $T$ evaluations of $\mathbf{W}$ are required for the AS and PSD; the last one is carried out specifically for the data transmission, is not utilized for either AS or PSD purposes, and thus is neglected in our analysis.
1) Complexity of CB: to evaluate $\delta_{CB}$, we first need to obtain the channel cross-correlation matrix $\mathbf{R} = \mathbf{H}\mathbf{H}^H$, which requires […] FLOPs. The complexity of evaluating $\sqrt{P_t/\mathrm{tr}(\mathbf{R})}$ is $2KT$ FLOPs, given that the trace of a square matrix is the summation of its diagonal. Finally, obtaining the CB weights $\mathbf{W} = \delta_{CB}\mathbf{H}^H$ requires $\sum_{t=0}^{T-1} 2K(N - tv)$ FLOPs. Therefore, the total complexity of the CB for any arbitrary number of AS and PSD iterations $T$ is obtained by summing the above terms.

2) Complexity of ZF: obtaining the channel cross-correlation matrix $\mathbf{R}$ requires […] FLOPs. Finally, finding the ZF precoding weights $\mathbf{W} = \mathbf{A}\mathbf{R}^{-1}$ requires […] FLOPs. Therefore, the total complexity of the ZF precoder for any number of AS and PSD iterations $T$ is obtained by summing the above terms.

3) Antenna Selection: once an initial precoding matrix $\mathbf{W}$ is evaluated, $T = (N - N_s)/v$ iterations are carried out for AS and PSD. As a result, the matrix $\boldsymbol{\Pi} = [\boldsymbol{\pi}_1, \ldots, \boldsymbol{\pi}_K]^T$ has to be evaluated $T$ times according to (21), which requires $\sum_{t=0}^{T-1} K^2(8N - 8tv - 2)$ FLOPs. Eliminating the least desirable $N - N_s$ antennas according to (23) requires $\sum_{l=1}^{N-N_s} (14K^2 + 3K)(N - l + 1)$ FLOPs. Finally, updating $\boldsymbol{\Pi}$ according to (24a) after discarding each antenna requires $8K^2$ FLOPs. Therefore, the total complexity for the AS can be given as

$$C^{\mathrm{ISM}}_{\mathrm{AS}} = \sum_{t=0}^{T-1} K^2(8N - 8tv - 2) + \sum_{l=1}^{N-N_s} (14K^2 + 3K)(N - l + 1) + 8K^2(N - N_s), \tag{36}$$

and the corresponding complexity order of the ISM-AS can be upper bounded by $O(N(N - N_s)K^2)$.

4) Phase-Shifts Design: at first, $\mathbf{Z}_k$ ($\forall k \in \mathcal{K}$) has to be evaluated for each iteration $t$, which requires $\sum_{t=1}^{T} K^2M(8N - 8tv - 2)$ FLOPs. Moreover, finding $l_m$ according to (29) requires $(16K^2 + 3K)TML$ FLOPs. Then, updating $\boldsymbol{\Pi}$ according to (27) requires $10K^2MT$ FLOPs, while updating $\mathbf{H}$ according to (18) requires $\sum_{t=1}^{T} 10MK(N - tv)$ FLOPs. Therefore, the total complexity for the PSD can be given as

$$C^{\mathrm{ISM}}_{\mathrm{PSD}} = \sum_{t=1}^{T} \left[K^2M(8N - 8tv - 2) + 10MK(N - tv)\right] + (16K^2 + 3K)TML + 10K^2MT, \tag{37}$$

and the corresponding complexity order can be expressed as $O(K^2MN_s)$, assuming that $N_s > L$ and $T = 1$.
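The AS and PSD terms stated above for the ISM scheme can be summed numerically as in the sketch below. The function names are hypothetical, the sketch assumes that $v$ divides $N - N_s$, and it interprets the $8K^2$ FLOPs for updating $\boldsymbol{\Pi}$ as being incurred once per discarded antenna. The precoder costs are omitted because their expressions are not fully recoverable from the text.

```python
def ism_as_flops(N, Ns, K, v):
    """FLOPs of the ISM antenna selection, summing the stated terms."""
    T = (N - Ns) // v                                   # assumes v | (N - Ns)
    pi_eval = sum(K ** 2 * (8 * N - 8 * t * v - 2) for t in range(T))
    search = sum((14 * K ** 2 + 3 * K) * (N - l + 1)
                 for l in range(1, N - Ns + 1))
    pi_update = 8 * K ** 2 * (N - Ns)                   # one update per discard
    return pi_eval + search + pi_update

def ism_psd_flops(N, Ns, M, K, L, v):
    """FLOPs of the ISM phase-shift design, summing the stated terms."""
    T = (N - Ns) // v
    z_eval = sum(K ** 2 * M * (8 * N - 8 * t * v - 2) for t in range(1, T + 1))
    search = (16 * K ** 2 + 3 * K) * T * M * L
    pi_update = 10 * K ** 2 * M * T
    h_update = sum(10 * M * K * (N - t * v) for t in range(1, T + 1))
    return z_eval + search + pi_update + h_update

# Small hand-checkable case: N = 4, Ns = 2, K = 1, v = 2 (so T = 1), M = L = 2.
as_small = ism_as_flops(4, 2, 1, 2)
psd_small = ism_psd_flops(4, 2, 2, 1, 2, 2)
```

For fixed $T$, every PSD term is proportional to $M$, so doubling the number of reflecting elements doubles the PSD cost.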

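5) Total Complexity: the overall complexity required for the ISM-AaP can be given as follows

$$C^{\mathrm{ISM}}_{\mathrm{AaP}} = C^{\mathrm{ISM}}_{\mathrm{init}} + C^{\rho}_{\mathbf{W}} + C^{\mathrm{ISM}}_{\mathrm{AS}} + C^{\mathrm{ISM}}_{\mathrm{PSD}},$$

where $\rho \in \{\mathrm{ZF}, \mathrm{CB}\}$, depending on the type of precoding utilized.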
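The FLOP counts stated in this section for the initial stage, the ISM-AS, and the ISM-PSD can be tallied numerically. The following is a minimal sketch that simply sums the per-step counts quoted in the text; the function names are hypothetical, it assumes that $v$ divides $N - N_s$, and it reads the $8K^2$ update cost as being paid once per discarded antenna. The precoder cost is left out because its expressions are not fully available here.

```python
def ism_init_flops(M, N, K, L):
    """Initial stage: cascaded channels (6MNK), H (8MNK), initial PSD and update MNK(15L+10)."""
    return 6 * M * N * K + 8 * M * N * K + M * N * K * (15 * L + 10)


def ism_as_flops(N, Ns, K, v):
    """ISM-AS: T evaluations of Pi, antenna elimination, and Pi updates."""
    T = (N - Ns) // v  # number of AS/PSD iterations (assumes v divides N - Ns)
    pi_eval = sum(K**2 * (8 * N - 8 * t * v - 2) for t in range(T))
    elimination = sum((14 * K**2 + 3 * K) * (N - l + 1) for l in range(1, N - Ns + 1))
    pi_update = 8 * K**2 * (N - Ns)  # assumption: 8K^2 per discarded antenna
    return pi_eval + elimination + pi_update


def ism_psd_flops(M, N, Ns, K, L, v):
    """ISM-PSD: Z_k evaluation, level search, Pi update, and H update."""
    T = (N - Ns) // v
    z_eval = sum(K**2 * M * (8 * N - 8 * t * v - 2) for t in range(1, T + 1))
    search = (16 * K**2 + 3 * K) * T * M * L
    pi_update = 10 * K**2 * M * T
    h_update = sum(10 * M * K * (N - t * v) for t in range(1, T + 1))
    return z_eval + search + pi_update + h_update
```

Every term in `ism_psd_flops` is proportional to `M`, so doubling the number of reflecting elements doubles the count, which is the linear scaling claimed for the proposed PSD.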

V. NUMERICAL RESULTS AND DISCUSSIONS
We start by introducing the properties of the different channels used in this work. In particular, $\mathbf{Q}$ was modelled as Rayleigh fading with $\mathbf{q}_k \sim \mathcal{CN}(\mathbf{0}, d_{\mathrm{BS},k}^{-\bar{\alpha}} \mathbf{I}_N)$, where $d_{\mathrm{BS},k}$ is the distance between the BS and the $k$th user, and $\bar{\alpha}$ is the path-loss exponent for the NLoS links. In contrast, $\mathbf{G}$ follows a Rician distribution such that [...] where $\mathbf{g}^{\mathrm{LoS}}_k$ contains the deterministic LoS components for the $k$th user with a constant variance of $d_{\mathrm{IRS},k}^{-\alpha}$, with $d_{\mathrm{IRS},k}$ being the distance between the IRS and the $k$th user, and $\alpha$ is the path-loss exponent for the LoS channels; in contrast, g is the Rayleigh distributed channel vector between the BS and the $m$th reflecting element. In our work, we assume $\alpha = 2.5$ and $\bar{\alpha} = 3.5$, while $\sigma^2$ was set to $-80$ dBm. Unless stated otherwise, we also assume $N = M = 64$, $K = 6$, $b = 3$, and $K_{\mathrm{Rician}} = 5$ dB.
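As an illustration of the Rayleigh model for the direct links, the sketch below draws one realization of a direct-link vector with per-entry variance equal to the distance raised to the negative NLoS path-loss exponent. The helper names and the 50 m distance are hypothetical, and only the standard library is used; the Rician links are not sketched because their exact expression is not reproduced here.

```python
import math
import random


def complex_gaussian(n, variance, rng):
    """Draw n i.i.d. CN(0, variance) samples (variance split evenly over real and imaginary parts)."""
    s = math.sqrt(variance / 2.0)
    return [complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]


def direct_channel(n_antennas, distance, pathloss_exp, rng):
    """Rayleigh direct link: each entry is CN(0, distance**(-pathloss_exp))."""
    return complex_gaussian(n_antennas, distance ** (-pathloss_exp), rng)


rng = random.Random(0)
# N = 64 antennas and NLoS exponent 3.5 as in the text; the 50 m distance is an arbitrary example.
q_k = direct_channel(64, 50.0, 3.5, rng)
```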
The simulation setup is shown in Fig. 2, where the BS is located at the origin of a 2D plane such that $(x_{\mathrm{BS}}, y_{\mathrm{BS}}) = (0, 0)$, while $(x_{\mathrm{IRS}}, y_{\mathrm{IRS}}) = (80, 0)$ meters. Moreover, the $K$ users all have equal distances to the IRS, and lie evenly on half a circle of radius $d_0 = 30$ meters centred at $(x_{\mathrm{IRS}}, y_{\mathrm{IRS}})$. In particular, the location of the 1st user, denoted as $u_1$ in Fig. 2, is fixed at $(80, 30)$, and similarly, the $K$th user is located at $(80, -30)$.
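The user layout described above can be sketched as follows. The text fixes the first and last users and the radius, but not which half of the circle is used, so the `toward_bs` flag (placing users on the side facing the BS by default) is an assumption, as is the function name.

```python
import math


def user_positions(K, center=(80.0, 0.0), radius=30.0, toward_bs=True):
    """Place K >= 2 users evenly on a half-circle around the IRS.

    User 1 sits at (cx, cy + radius) and user K at (cx, cy - radius).
    """
    cx, cy = center
    positions = []
    for k in range(K):
        # Angle sweeps from +90 degrees (user 1) to -90 degrees (user K) in K - 1 equal steps.
        theta = math.pi / 2 - k * math.pi / (K - 1)
        dx = radius * math.cos(theta)
        if toward_bs:
            dx = -dx  # mirror onto the half-circle facing the BS at the origin
        positions.append((cx + dx, cy + radius * math.sin(theta)))
    return positions


positions = user_positions(6)
```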

A. Efficiency of Our Proposed AS Schemes
We compare the performance of our AS algorithms with two schemes. The first one is the random-based AS (R-AS), while the second is the convex optimization-based AS (C-AS) utilized in [2], where the authors stated that it can provide near optimal performance. Moreover, to ensure a fair comparison, we generate random phase-shifts (R-PS) to obtain $\mathbf{H}$, which is then utilized in all different AS schemes (ISM-AS, MCG-AS, R-AS, and C-AS), and we skip the PSD part to focus only on the performance of the proposed AS schemes and avoid any kind of bias in the comparison. Moreover, we set $v = N - N_s$, meaning that for our iterative ISM-AS scheme, the active precoding and AS will be carried out separately.
As demonstrated in Fig. 3, the proposed ISM-AS significantly outperforms all other AS methods for different numbers of selected antennas. In addition, when $N_s = 32$ and the transmit power is greater than $-16$ dBm, the ISM-AS outperforms even the full-system case where all antennas are activated at the BS. This is because, with CB, some antennas can be harmful, as they cause more interference than the useful signal power they contribute [4], [35]. For example, when the transmit power is $-10$ dBm and the number of selected antennas is 32, the ISM-AS outperforms the full system, C-AS, MCG-AS, and R-AS by 4.26, 6.57, 8.61, and 9.28 bits/s/Hz, respectively.
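To make the interference argument concrete, the sketch below computes the sum-rate of a conjugate-beamforming precoder for a given channel matrix. It assumes the normalization $\delta = \sqrt{P_t/\mathrm{tr}(\mathbf{H}\mathbf{H}^H)}$, which is a common choice and may differ from the exact definition used in this work; the function name is hypothetical.

```python
import math


def cb_sum_rate(H, p_t, noise_var):
    """Sum-rate (bits/s/Hz) of conjugate beamforming W = delta * H^H.

    H is a K x N list of lists of complex gains (users x antennas).
    Assumption: delta = sqrt(p_t / tr(H H^H)), so the total transmit power is p_t.
    """
    K = len(H)
    trace = sum(abs(h) ** 2 for row in H for h in row)  # tr(H H^H)
    delta = math.sqrt(p_t / trace)
    rate = 0.0
    for k in range(K):
        # Power received by user k from the stream intended for user j: |delta * h_k h_j^H|^2.
        gains = [abs(delta * sum(a * b.conjugate() for a, b in zip(H[k], H[j]))) ** 2
                 for j in range(K)]
        signal = gains[k]
        interference = sum(gains) - signal
        rate += math.log2(1.0 + signal / (interference + noise_var))
    return rate


orthogonal = [[1 + 0j, 0j], [0j, 1 + 0j]]
correlated = [[1 + 0j, 1 + 0j], [1 + 0j, 1 + 0j]]
```

Evaluating `cb_sum_rate` before and after deleting a column of `H` gives a direct way to check whether a particular antenna adds more useful power than interference under this normalization.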
When the ZF precoder is adopted at the BS, the ISM-AS provides exactly the same performance as the C-AS for all considered values of $P_t$ and $N_s$, as demonstrated in Fig. 4. However, the ISM scheme requires much lower complexity than the C-AS, especially when the number of antennas at the BS is large, as will be seen later. Moreover, the ultra-low-complexity MCG-AS becomes near optimal when $K = 6$ and the number of selected antennas is relatively large (in this case $N_s = 32$), which was not the case with CB. Note that when the ZF precoder is applied, the optimal performance is obtained when all antennas are activated, which was also not the case with CB. This is because the inter-user interference is completely nulled when the ZF precoder is applied, and therefore activating more antennas will always result in higher achievable rates.

It should be noted that the main motive behind AS design is to enhance the energy efficiency (EE) performance of wireless networks. In general, both AS and transmit power play key roles in the EE performance of mMIMO systems. However, our focus in this work is on the AS part. Nonetheless, joint AS and power control design for energy-efficient mMIMO-IRS networks will be the topic of future investigations.

In particular, for CB, the ISM-PSD provides significant gains of 6.8 and 3.8 bits/s/Hz compared to the random phase-shift case for $v = 1$ and $v = N - N_s$, respectively, given that $P_t = -10$ dBm; while for the ZF precoder, gains of 4.13 and 2.89 bits/s/Hz are obtained compared to random phase-shifts when $v = 1$ and $v = N - N_s$, respectively, under the same power budget of $-10$ dBm. We can conclude that when CB is applied, setting $v$ to a small value (i.e., updating the phase-shifts after discarding one or a few antennas) is beneficial in terms of performance, although that comes at the price of increased complexity; while for ZF, carrying out the PSD after discarding all $N - N_s$ undesired antennas can still provide high performance. Note that the results shown in Fig. 5 highlight the gain obtained via the proposed PSD only, as all the different curves adopt the same AS scheme.

Fig. 6 demonstrates the efficiency of the MCG-PSD scheme. Similar to the AS case, it is clear that when the total transmit power is relatively high, the MCG-PSD works well with the ZF precoder but not with CB. This is because maximizing the channel gain does not contribute much to the total sum-rate when CB is applied, as the interference among users becomes the main bottleneck and needs to be taken into consideration. However, with ZF, the ultra-low-complexity MCG-PSD can provide a decent gain of 3 bits/s/Hz compared to random phase-shifts when $P_t = -10$ dBm.

Fig. 7 shows the performance of the proposed ISM-AaP for a wide range of numbers of reflecting elements and different numbers of quantization levels. The results indicate that when the BS has a small number of RF chains, CB outperforms ZF when the number of reflecting elements at the IRS is not very large, regardless of the number of quantization levels available at the IRS. In contrast, when $M$ is large, ZF becomes the better option regardless of the number of available RF chains at the BS. Moreover, when the number of IRS elements is small, having a large number of quantization levels at each reflecting element does not provide much gain.
For example, when $N_s = M = 16$ and the ZF precoder is applied, having 2 quantization levels per reflecting element can provide a sum-rate that is only 0.4 bits/s/Hz less than that provided by reflecting elements with 16 quantization levels, while the same performance gap becomes 2.3 bits/s/Hz when the IRS is equipped with 128 reflecting elements.

Fig. 8 highlights the performance of different AS and PSD schemes with ZF and CB for a wide range of numbers of users. Our results indicate that, for the different AS and PSD schemes, when the transmit power is limited (in this case $-20$ dBm), CB outperforms ZF unless the number of users is very small (less than 6). In contrast, when the transmit power is relatively high (in this case $-10$ dBm), ZF can provide a significant performance gain compared to CB, as long as the number of active users is less than the number of RF chains at the BS. In addition, the ZF MCG-AaP is no longer near optimal in this case, which demonstrates the importance of the ISM-AaP scheme even when the ZF precoder is applied. For example, when the number of users is 12 and $P_t = -10$ dBm, the ZF ISM-AaP outperforms the ZF MCG-AaP and the CB ISM-AaP by 9 and 9.8 bits/s/Hz, respectively. However, when the number of users is equal to or close to the number of selected antennas, CB becomes the better choice, as ZF suffers from a dramatic performance loss regardless of the AS and PSD schemes utilized. This is because most of the available power will be used to null the interference among users.
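For readers less familiar with discrete phase-shifts, the sketch below lists the phase set available to one reflecting element and maps a continuous phase to its nearest level. It assumes uniformly spaced levels with $L = 2^b$ for a $b$-bit element, which is the usual convention, and the function names are hypothetical. Nearest-level rounding is shown only to illustrate how resolution grows with the number of levels; it is not the successive-refinement design proposed in this work.

```python
import cmath
import math


def phase_levels(b):
    """Return the L = 2**b uniformly spaced phase shifts in [0, 2*pi)."""
    L = 2 ** b
    return [2.0 * math.pi * l / L for l in range(L)]


def quantize_phase(theta, b):
    """Map a continuous phase to the nearest discrete level, measured on the unit circle."""
    levels = phase_levels(b)
    return min(levels, key=lambda p: abs(cmath.exp(1j * theta) - cmath.exp(1j * p)))


coarse = phase_levels(1)  # 2 levels
fine = phase_levels(4)    # 16 levels
```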

C. Performance of the Proposed Schemes With Different Number of Reflecting Elements, Quantization Levels, and Users
It is worth noting that the proposed discrete PSD schemes in this work are sub-optimal, and thus are not guaranteed to converge to a global solution. However, our main focus for the design of discrete phase-shifts is to reduce the cost of implementing such methods to a considerably low value.

D. Robustness Against Imperfect CSI
In this subsection, we investigate the robustness of our proposed schemes against channel estimation (CE) errors. Note that despite the passive nature of IRSs, estimation of wireless channels with satisfactory accuracy can still be achieved, as demonstrated in [32], [36]-[38] and the references therein.
However, estimating each of the two links $\mathbf{u}_m$ and $\mathbf{g}_k$ ($m \in \mathcal{M}$, $k \in \mathcal{K}$) of the cascaded channel $\boldsymbol{\Xi}_k$ separately can be challenging when dealing with passive IRSs. Instead, one can obtain an estimate of $\boldsymbol{\Xi}_k$ as a whole using orthogonal pilot sequences sent by the users [32].⁷ Moreover, the direct links between each user and the BS, $\mathbf{q}_k$ ($k \in \mathcal{K}$), can be estimated while the IRS is "switched off" (i.e., working in absorbing mode rather than reflecting mode). As a result, we introduce two independent CE errors, one for the direct links and one for the cascaded channels, as follows [...] where the CE error terms for $\mathbf{q}_k$ and $\boldsymbol{\Xi}_k$ are complex Gaussian [39], are assumed to be uncorrelated with $\mathbf{q}_k$ and $\boldsymbol{\Xi}_k$, but have the same statistical properties as $\mathbf{q}_k$ and $\boldsymbol{\Xi}_k$ in terms of mean and variance (see Appendix C), and $\sigma_e \in [0, 1]$ accounts for the estimation accuracy. Then, we can represent the channel coefficients for the $k$th user as [...]

As Fig. 9 demonstrates, both ZF and CB suffer from performance degradation when dealing with imperfect CSI. However, CB is more robust against CE errors than ZF, as the latter is known to be more sensitive to imperfect CSI and unmodeled interference. Moreover, the ISM-AaP scheme maintains its superiority over the MCG-AaP even for imperfect CSI, and for both CB and ZF precoders.
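One common way to realize a CE error with these properties is to mix the true channel with an independent Gaussian term of equal variance, weighted by the accuracy parameter. The sketch below uses the form $\hat{h} = \sqrt{1 - \sigma_e^2}\, h + \sigma_e e$; this specific expression is an assumption chosen for illustration and is not claimed to be the exact model used in this work. The function name and the scalar `variance` argument are hypothetical.

```python
import math
import random


def noisy_estimate(h, variance, sigma_e, rng):
    """Hypothetical CE model: h_hat = sqrt(1 - sigma_e^2) * h + sigma_e * e.

    h is a list of complex coefficients whose entries have the given variance;
    e is an independent CN(0, variance) error, so h_hat keeps the same variance.
    """
    s = math.sqrt(variance / 2.0)
    scale = math.sqrt(1.0 - sigma_e ** 2)
    return [scale * x + sigma_e * complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for x in h]


rng = random.Random(0)
h_true = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(8)]
h_perfect = noisy_estimate(h_true, 2.0, 0.0, rng)  # sigma_e = 0: perfect CSI
h_rough = noisy_estimate(h_true, 2.0, 0.5, rng)    # sigma_e = 0.5: noticeably noisy CSI
```

Under this form, $\sigma_e = 0$ returns the true channel and $\sigma_e = 1$ returns an estimate that carries no information about it.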

E. Computational Complexity
Here we highlight the complexity requirements of the proposed AS and PSD schemes. We first demonstrate the complexities of the AS and PSD separately, for a wide range of $N$ and $M$ values, respectively. Then, we show the complexity of the overall design.

For the AS, the type of precoding utilized is irrelevant when the MCG-AS is applied, as this method requires only the knowledge of $\mathbf{H}$. In contrast, the ISM-AS requires the knowledge of both $\mathbf{H}$ and $\mathbf{W}$. Therefore, the complexity of the ISM-AS is the sum of $C_{\mathbf{W}}$ and $C^{\mathrm{ISM}}_{\mathrm{AS}}$. We compare the complexities of our methods with the C-AS in [2], which relies on interior-point methods [7] and has a complexity order of $O(N^{3.5})$ [24]. Finally, for the MCG-AS, we focus only on the computations required to perform the AS after obtaining $\mathbf{H}$, to ensure a fair comparison among the different schemes. Note that for our own AaP methods, we show the exact complexities required for their implementations based on our analysis in Section IV.

⁷ Note that our proposed AaP schemes require the knowledge of only the cascaded channel of the IRS ($\boldsymbol{\Xi}_k$), and thus can be implemented even when the IRS is entirely passive.
As demonstrated in Fig. 10, the MCG requires the least complexity; however, that comes at the price of a degraded performance when CB is adopted, or when ZF is applied and the number of users is relatively large. In contrast, the C-AS has the highest complexity, especially when the number of antennas at the BS is large; while the ISM-AS scheme has lower complexity than the C-AS method and achieves near optimal performance regardless of the precoding scheme utilized, which is not the case for both C-AS and the MCG-AS schemes.
For the PSD, we show the complexity as a function of $M$. In particular, the complexity of the proposed MCG-PSD is given in (31), while the complexity of the proposed ISM-PSD is the sum of $C^{\mathrm{ISM}}_{\mathrm{PSD}}$ and $C^{\mathrm{ISM}}_{\mathrm{init}}$ after subtracting the complexity required for finding $\mathbf{H}$, which accounts for $14MNK$ FLOPs. We compare the required complexities of our proposed implementation methods with those required by conventional MCG and ISM (cMCG and cISM) successive refinement schemes, based on our complexity analysis in Appendix A and Appendix B.
As Fig. 11 demonstrates, the proposed design methods achieve a significant complexity reduction compared to conventional implementation methods, without any compromise in performance. More importantly, the complexities of the proposed ISM and MCG schemes scale linearly with $M$, which makes the proposed design methods highly attractive when dealing with large IRS arrays. Finally, the total complexity is shown in Fig. 12, which demonstrates that the proposed AS with PSD schemes scale well when both the number of antennas at the BS and the number of reflecting elements at the IRS are large. Moreover, we can observe that the complexity of the MCG-PSD becomes dominant for the MCG-AaP scheme, since optimizing the phase of each reflecting element affects all available antennas, resulting in higher complexity compared to the MCG-AS, which works only on the effective channel matrix $\mathbf{H}$ and thus does not involve the $M$ reflecting elements in the selection process.

VI. CONCLUSION
We proposed two novel AS and PSD schemes in multiuser mMIMO-IRS networks, and thoroughly evaluated their performances under both CB and ZF precoders. The proposed algorithms were designed in a way such that matrix/vector multiplications were minimized, and the AS and PSD were carried out using only element-wise operations, and without compromising the performance. In particular, for the PSD, our proposed schemes provided a significant performance gain compared to random phase-shifts, and their complexities scaled linearly with the number of reflecting elements at the IRS; while for the AS, the ISM approach provided near optimal performance with reduced complexity compared to other efficient AS methods found in the literature, and can be utilized with any type of active precoding at the mMIMO BS.

APPENDIX A
To optimize each of the $M$ reflecting elements according to (15), we start by evaluating $\|\boldsymbol{\phi}^T[\boldsymbol{\Xi}_k]_{:,\mathcal{S}} + [\mathbf{q}_k]_{\mathcal{S}}\|_1$, which requires $(8MN_s + 5N_s - 1)$ FLOPs. Then, summing over all $k \in \mathcal{K}$ and for all $L$ possible values of $[\boldsymbol{\phi}]_m$ results in $(8MN_sKL + 5N_sKL)$ FLOPs. Therefore, the total number of FLOPs required for the conventional MCG (cMCG)-PSD to optimize all $M$ reflecting elements is

$$M(8MN_sKL + 5N_sKL) = MN_sKL(8M + 5),$$

and the corresponding complexity order is $O(M^2 N_s K L)$.
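The quadratic growth of the conventional approach in $M$ can be checked with a few lines. This sketch multiplies the per-element count quoted above by the $M$ elements; the function name is hypothetical.

```python
def cmcg_psd_flops(M, Ns, K, L):
    """Conventional MCG-PSD: per-element cost (8*M*Ns*K*L + 5*Ns*K*L), repeated for all M elements."""
    per_element = 8 * M * Ns * K * L + 5 * Ns * K * L
    return M * per_element


# Doubling M roughly quadruples the cost once 8M dominates the constant 5.
ratio = cmcg_psd_flops(256, 32, 6, 8) / cmcg_psd_flops(128, 32, 6, 8)
```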

APPENDIX B
Evaluating the objective function of (25) for each of the $L$ values of $[\boldsymbol{\phi}]_m$ in α at any given iteration $t$ requires $MK^2(8N - 8tv) + K^2(8N - 8tv + 4) + (3K - 1)$ FLOPs. As a result, the total number of FLOPs required for optimizing all $M$ phase-shifts with an arbitrary number of iterations $T$, utilizing the conventional ISM (cISM)-PSD approach in (25), is

$$ML\sum_{t=1}^{T}\left[MK^2(8N - 8tv) + K^2(8N - 8tv + 4) + (3K - 1)\right],$$

and the corresponding complexity order when $T = 1$ can be reduced to $O(M^2 K^2 N_s L)$.
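The per-evaluation count above can be accumulated over all elements, levels, and iterations as sketched below. The function names are hypothetical, and the iteration index is assumed to run from $t = 1$ to $T$ with $T = (N - N_s)/v$, matching the indexing used for the PSD analysis in Section IV.

```python
def cism_objective_flops(M, N, K, t, v):
    """Cost of one objective evaluation at iteration t, with N - t*v antennas still active."""
    n_active = N - t * v
    return M * K**2 * 8 * n_active + K**2 * (8 * n_active + 4) + (3 * K - 1)


def cism_psd_flops(M, N, Ns, K, L, v):
    """Conventional ISM-PSD: M elements times L candidate levels per iteration (assumes v divides N - Ns)."""
    T = (N - Ns) // v
    return sum(M * L * cism_objective_flops(M, N, K, t, v) for t in range(1, T + 1))
```

With $T = 1$ and $v = N - N_s$, the leading term is $8M^2K^2N_sL$, consistent with the stated order.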

APPENDIX C
Here we present the derivation of statistical mean and variance values for the cascaded channels.
For the $m$th link between the $k$th user and the IRS, we have [...] where $E\{[\mathbf{g}^{\mathrm{LoS}}_k]_m\} = 0$, since we assume random phases for the deterministic LoS links, similar to [40]. Moreover, the variance for the same channel is [...] where $\mathrm{cov}(x, y)$ is the covariance between $x$ and $y$, and is equal to zero when the two random variables are uncorrelated. Similarly, for the links between the IRS and the BS, we can write [...] and we drop the indices $(m, n)$ in $\beta_U$, as the distance between any antenna at the BS and any element at the IRS is assumed to be equal for all antennas and reflecting elements. Finally, for the cascaded channel we have [...]