Maximum-Likelihood Detection With QAOA for Massive MIMO and Sherrington-Kirkpatrick Model With Local Field at Infinite Size

Quantum-approximate optimization algorithm (QAOA) is promising in Noisy Intermediate-Scale Quantum (NISQ) computers with applications for NP-hard combinatorial optimization problems. It is recently utilized for NP-hard maximum-likelihood (ML) detection problem with challenges of optimization, simulation and performance analysis for <inline-formula> <tex-math notation="LaTeX">$n \times n$ </tex-math></inline-formula> multiple-input multiple output (MIMO) systems with large <inline-formula> <tex-math notation="LaTeX">$n$ </tex-math></inline-formula>. QAOA is recently applied by Farhi et al. on infinite size limit of Sherrington-Kirkpatrick (SK) model with a cost model including only quadratic terms. In this article, we extend the model by including also linear terms and then realize SK modeling of massive MIMO ML detection. The proposed design targets near ML performance while with complexity including <inline-formula> <tex-math notation="LaTeX">$O({16}^{p})$ </tex-math></inline-formula> initial operations independent from problem instance and size <inline-formula> <tex-math notation="LaTeX">$n$ </tex-math></inline-formula> for optimizing QAOA angles and <inline-formula> <tex-math notation="LaTeX">$O(n^{2} \,p)$ </tex-math></inline-formula> quantum operations for each instance. We provide both optimized and extrapolated angles for <inline-formula> <tex-math notation="LaTeX">$p \in [{1, 14}]$ </tex-math></inline-formula> and signal-to-noise (SNR) < 12 dB achieving near-optimum ML performance with <inline-formula> <tex-math notation="LaTeX">$p \geq 4$ </tex-math></inline-formula> for <inline-formula> <tex-math notation="LaTeX">$25 \times 25$ </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">$12 \times 12$ </tex-math></inline-formula> MIMO systems modulated with BPSK and QPSK, respectively. We present two conjectures about concentration properties of QAOA and near-optimum performance for next generation massive MIMO systems covering <inline-formula> <tex-math notation="LaTeX">$n \lt 300$ </tex-math></inline-formula>.


I. INTRODUCTION
Quantum-approximate optimization algorithm (QAOA) is a future promising quantum algorithm to be utilized on Noisy Intermediate-Scale Quantum (NISQ) devices [1] to provide high-performance solutions for NP-hard combinatorial optimization problems [2].It is applied on a rich set of engineering problems with extensive theoretical and performance simulation studies especially for Max-Cut problems in graphs while promising quantum computational advantage (quantum supremacy) on NISQ computers [3].QAOA is a hybrid method where the angle parameters of the layers of its quantum circuit with a depth count denoted by p ≥ 1 are optimized in a classical computer while the solution is obtained among the measurement results of QAOA circuit running on a quantum computer.There are extensive theoretical and simulation studies for QAOA circuits with p = 1 calculating the optimum angles and expected cost explicitly while it becomes challenging to calculate for p ≥ 2 [4].Besides that, Sherrington-Kirkpatrick (SK) model denotes a spin system with all-to-all couplings among the spins while defining a combinatorial search problem to find the value of the lowest energy with a recent classical solution [5], [6].Application of QAOA for general SK models including only quadratic terms is introduced by Farhi et al. in [5] where an algorithm for expectation calculation is presented with the computational complexity of O(16 p ) for the infinite size limit of SK model.In [7], a generalized multinomial theorem is provided to calculate expectations at the infinite size limit for various random ensembles including mixed spin models consisting of a cost function not only including local and quadratic terms but also higher order terms until q ≥ 2 including clauses with more than two terms.The computational complexity is shown to be either O(4 q p ) or O(p 2 4 p ).In this article, we extend the algorithm in [5] for SK models which include linear terms in addition to quadratic terms and where the variances of quadratic and linear cost coefficients increase as O(n) and O(n 2 ), respectively, with respect to problem size n.Then, we apply QAOA with the extended SK model on NP-hard optimal maximum likelihood (ML) detection problem [8].We observe near-optimum performance in extensive simulation studies with promising applications of QAOA in NISQ devices.
QAOA is recently applied for optimal ML detection problem especially for multiple-input-multiple-output (MIMO) communication systems as a low complexity alternative for the optimum ML detector for which the complexity increases exponentially with respect to symbol size n for a general n×n MIMO system with large number of users and transmitters [8].Massive MIMO systems with optimum ML detection capability are highly promising to increase the performance of next generation communication systems compared with conventional solutions including minimum mean square error (MMSE) detection with low error performance and sphere decoding resulting in high complexity for large systems [9].The number of studies analyzing the application of QAOA for ML detection is highly limited.In [8], ML detection problem is modeled as a quadratic unconstrained binary optimization (QUBO) problem and QAOA implementation is proposed as a solution for NP-hard ML detection problem with application for binary-phase-shift-keying (BPSK) MIMO systems.On the other hand, computational complexity of finding optimal angle values for QAOA is presented as O(m p 5 2 3 n ) where n is the number of transmit antennas, p is the QAOA depth parameter and m is the number of iterations.Theoretical analysis is performed for p ≤ 3, while simulations are performed for significantly small scale systems with n ≤ 3 and p ≤ 20.It is not clear how to calculate expectation of QAOA circuit, to optimize angle parameters and to simulate performance for large n and high depth p as a significant challenge for the utilization of QAOA in practical ML detection problems.Similarly, in [10], proposed design is generalized for higher order modulation schemes as an extension of BPSK model.In [11], a theoretical analysis of QAOA expectation for p = 1 for channel decoding with arbitrary binary linear codes is presented and simulations are performed.In [12], QAOA is utilized for turbo detection in coded MIMO systems by defining a learning algorithm with recurrent neural networks to optimize parameters with simulations showing high performance.
On the other hand, there are high number of studies analyzing the application of quantum annealing (QA) and Ising machines for ML detection problems.In [9], Coherent Ising Machines (CIMs) are presented for performing near-optimal MIMO detection for MIMO systems as large as 64 × 256.The importance of realizing massive MIMO systems with nearoptimal performance is emphasized for allowing both large number of users and antennas at the same time.In [13], a multi-stage version is presented for high order modulations and large systems.In [14], QA is utilized for vector perturbation precoding (VPP) in MIMO systems reaching sizes of 12 × 12.There are also additional studies including D-Wave based experimental tests for MIMO ML decoding in [15]- [17].On the other hand, in [18], Grover search based quadratic speed-up is utilized in MIMO ML detection.

A. Contributions
In this article, we firstly include linear terms for the local fields in addition to quadratic terms of SK model by extending the algorithm in [5].Then, we apply QAOA on SK modeling of ML detection problems for n × n massive MIMO systems.We present an optimization algorithm of QAOA angle parameters with significant advantages of independence from specific problem instance and n.The proposed algorithm is an extended version of the expectation calculation algorithm in [5] designed by Farhi et al. while preserving the computational complexity of O(16 p ).We present optimized angles for p ∈ [1,7] achieving near-optimum performance with p ≥ 4 for 25 × 25 MIMO systems and SNR < 12 dB.Besides that, we propose a heuristic extrapolation algorithm for p ∈ [8,14] achieving high performance in simulation studies with the ratio of expected cost of QAOA to the optimum cost, i.e., denoted as approximation ratio conventionally, reaching ≈ 0.9364 [19].In addition, we define additional approximation ratio performance metrics with respect to minimum measured cost among QAOA circuit outputs and observing almost unity ratios achieving ML performance with p ≥ 4. We present two conjectures modeling the concentration properties and performance of QAOA for ML problems.Conjecture 1 is supported by extensive simulation studies for n ≤ 28 while requiring theoretical and simulation based verifications for larger n values as an open issue.Conjecture 2 promises that presented angles for p ∈ [1, 14] achieve near-optimum performance for massive MIMO symbol length n < 300.
We provide extensive simulation studies for QAOA in IBM Quantum Lab by analyzing statistical properties of QAOA measurement outputs for 236500 different QAOA circuits corresponding to random problem instances for varying n, p and SNR [20].We share statistical properties for each problem instance in a data repository in [21] by creating an extensive data set for further analysis and verification by researchers.We theoretically analyze and simulate problem instance based mean and variance of measured minimum cost of QAOA circuits as an extension to the statistical properties of expected cost for SK models in [5].Proposed solution promises widespread industrial utilization of QAOA in NISQ devices with error mitigation approaches [22]- [24] for near future massive MIMO wireless communications.
The remainder of the paper is organized as follows.Next, in Sections II and III, background about QAOA and application of QAOA for ML problems are presented.In Section IV, SK model of ML problem is given.In Section V, application of QAOA on SK model is extended by including linear terms for the local field.In Section VI, concentration properties of QAOA measurements and problem based expectations are discussed.Then, extensive simulation studies for QAOA circuits solving ML problem are presented in Section VII.Finally, in Section VIII, conclusions are presented.II.BACKGROUND ABOUT QAOA QAOA is a promising to be utilized on NISQ devices for addressing combinatorial optimization problems [1], [2].QAOA prepares a quantum state approximating the ground state of a problem Hamiltonian which corresponds to the optimal solution of a classical cost function C(z) defined on n-bit strings z = (z 1 , z 2 , . . ., z n ) ∈ {+1, −1} n [5].A typical cost function in a combinatorial optimization problem is represented in Ising form as follows [4]: where J j,k values are the quadratic terms (interactions) and h j terms are the linear terms (local fields).Besides that, SK model generalizes the Ising model of a classical spin system with couplings between all n spins while J jk and h j are chosen as independent Gaussian random variables [5].Cost function is mapped onto Ising model Hamiltonian H C defined as follows: where Z j are Pauli-Z operators acting on the j-th qubit.QAOA prepares a quantum state |Ψ(γ, β) such that expectation value of the problem Hamiltonian with respect to the quantum state, i.e., C or C QAOA , is minimized where the expectation of the cost is defined as follows: State |Ψ(γ, β) ≡ |Ψ minimizing C is prepared as follows: where B = n j=1 X j is a mixing Hamiltonian, X j is the Pauli-X operator acting on jth qubit, initial state is the equal superposition state, and U (B, β r ) = e −ı βr B and U (C, γ r ) = e −ı γr H C are unitary operators parametrized by the optimized angle sets γ = (γ 1 , γ 2 , . . ., γ p ) and β = (β 1 , β 2 , . . ., β p ) while p is the layer depth of QAOA circuit.Angle parameters γ and β are optimized classically to maximize the expectation of the problem Hamiltonian [25].The measurement of |Ψ(γ, β) results in a string z such that C(z) is close to C .

III. APPLICATION OF QAOA FOR ML PROBLEMS
On the other hand, QAOA is recently applied for optimum ML detection in MIMO systems in [8], [10] by also proposing QUBO model for the problem.Assume that H represents real valued n × n channel gain matrix known by the receiver for a specific instance of the estimated channel, H(k, l) is the element of H at the kth row and lth column, T is the noise vector.Then, the received vector y is given as follows: Then, ML detection problem is defined as follows: In [8], the cost function is transformed into the following form: where J jk , h j and constant A are defined as follows: and the corresponding quadratic term represented with the matrix J including the elements J i,j at ith row and jth column, and linear term represented with the column vector h with elements h j at the jth row are defined as follows: The performance of QAOA is analyzed in [8] for small length symbols, i.e., n ≤ 3, and for specific instances of H by showing promising results for the application of QAOA for large problem instances which is one of the the main targets in this article.We not only provide optimization for large n but also provide instance independent solution working for all instances once the optimization is achieved classically for the general model of the problem based on statistical properties of channel matrix H and noise n, SNR and layer depth p.
Classical calculation of C for large values of n is required to optimize γ and β without requiring a quantum computer to calculate the expectation.The classical calculation is provided explicitly for p = 1 in [4] while the complexity of the calculation is not clearly defined for p ≥ 2 and for large values of n in the fully coupled model of the cost function in (8).Next, SK model of ML problem is defined which allows to calculate C for large values of n and large depth p for the typical instances of H with the elements of the matrix H represented as independent and identically distributed Gaussian random variables with zero mean.

IV. SK MODEL OF ML DETECTION PROBLEM
Conventional SK model as discussed in [5] includes the classical cost function with a scaled J j,k as follows: where J jk is a normal random variable.In [5], the expected value of C / n and (C / n) 2 as n → ∞ with respect to the instances of J j,k are calculated as follows: where for typical instances of ML problem.Besides that, in [7], a generalization of multinomial theorem is presented to calculate expectations at the infinite size limit for various ensembles of random combinatorial optimization problems including mixed spin models defined with the following cost function: where c q for q ∈ [1, q max ] are real variables and J i1,...,iq is the cost tensor multiplying the expression z i1 z i2 . . .z iq .The model generalizes [5] by including not only linear terms but also higher order terms compared with quadratic terms.Since the elements of H are independent and identically distributed Gaussian random variables with zero mean, J j,k and h j are approximated also as Gaussian random variables due to law of large numbers for large n.As a result, ML problem is modeled with SK model with the difference that it also includes the linear terms h j z j compared with (14) and without direct normalization of J j,k and h j as described next.
Assume that variances of real H and σ 2 n , respectively, and their mean values are zero denoted with µ H = µ n = 0. SNR for the th element of y, i.e., y , is calculated as follows: As proved in Section V, SK model with independent J j,k and h j allows to calculate C with O(16 p ) complexity based on the extended version of the algorithm presented in [5].The empirical mean and variance of J j,k and h j are approximated as follows for n 1 and typical transmitted BPSK symbols s with equally probable individual bits s i ∈ {−1, +1} for i ∈ [1, n] as proved in Appendix A: where It is an open issue to extend the algorithmic framework in [5] to extend for correlated J and h.The correlations are neglected in this article to exploit the described heuristic framework showing high performance.

A. QAOA Solution with SK Model
QAOA solution is simply realized by creating J and h for a given problem instance with known values of H and received vector y by using ( 12) and ( 13), respectively, and then realize p-layer QAOA circuit with angle parameters γ and β optimized with respect to SK model formed with zero mean J j,k and h j having variances n σ 2 J and n 2 σ 2 h as defined in (20) and (21), respectively.Then, we perform measurements of the implemented QAOA circuit outputs.We calculate the minimum cost corresponding to measured strings, i.e., denoted as min z {C QAOA (z)} as defined in the next section, to obtain the QAOA solution string z giving the minimum cost.In the next sections, we interchangeably use ] and E J,h [.] denotes expectation with respect to varying problem instances with different J and h.

V. QAOA FOR SK MODEL WITH LINEAR TERM
We extend the moment calculations of SK model at infinite size limit in [5] for the cost function in (1) including the local field term h j in addition to the quadratic term J j,k .The algorithm in [5] allows extension with a minor modification for the following differences of the problem model.We follow a similar method in [5], i.e., extended version of the equation (76) in [5], while with the following fundamental differences: 1) λ C / n 2 and λ C 2 / n 4 are calculated instead of λ C / n and λ C 2 / n 2 .2) Linear h j terms are included in the cost function with the assumption that J j,k and h j are zero mean linearly independent and identically distributed Gaussian random Algorithm 1 Calculation of W u for u ∈ A (Local field included extension of the algorithm in [5] for SK model) ) / 2 such that if the group index of u j is denoted with j , then j ≤ k if j ≤ k. 4: for j ← |D| to 1 (starting with j = |D| and decreasing to j = 1) do 5: Calculate X uj and ∆ uk.uj for k > j with (27,28). 6: W uj = − W uj 8: end for 9: for a ∈ A p+1 do 10: W a = Q a 11: end for variables with variances σ 2 J and σ 2 h increasing as O(n) and O(n 2 ) with respect to n, respectively.3) Optimized angles γ decrease with 1 / n as γ = γ / n for optimized γ with elements We include additional term p in the subscripts of γ r , γ r and β r to better analyze in simulation studies.
On the other hand, the generalized algorithm for mixed spin model of order q in [7] also assumes that the variance of J i1,i2,...,iq equals to 1 / n q−1 .This assumption is also different compared with our case that σ 2 J = n σ 2 J and σ 2 h = n 2 σ 2 h increasing as O(n 3−q ) with n.We provide the next theorem as the extended version of [5]: Theorem 1.Given the angles γ p,r = γ p,r / n and β p,r for r ∈ [1, p] for QAOA circuit and the problem Hamiltonian where J j,k and h j are independent zero mean Gaussian random variables with variances σ 2 J = n σ 2 J and σ 2 h = n 2 σ 2 h for constant σ 2 J and σ 2 h , respectively, the following equalities hold: where V p ( γ, β) is defined as follows: where is the set of 2 2 p strings and W u is calculated in Algorithm 1. Furthermore, E J,h C / n 2 and E J,h C 2 / n 4 for large n include additional terms decreasing with O(1 / n).
As described in [5], the configuration basis A is defined as follows such that kth index bit (z On the other hand, the a operation is defined in [5] as follows for a ∈ A for 1 ≤ ≤ p: where where Q b and g b are defined in ( 56) and (60), respectively.Some other variables utilized in Algorithm 1 are described as follows based on the extension of the iterative formulation in [5]: where Φ a = p r=1 γ p,r (a r:p − a −r:−p ) .

VI. CONCENTRATION PROPERTIES OF QAOA
Theorem 1 in this article and 'Concentration Corollary' in Section 4 in [5] utilizing Chebyshev inequality result in the following concentration properties as n → ∞: where P Q denotes probability with respect to QAOA measurement, The meaning of ( 29) is that for typical instances of the SK model with local field and for large values of n, QAOA measurements result in a symbol string z such that C(z)/n 2 ≈ C / n 2 ≈ V p ( γ, β).Therefore, Theorem 1 allows the following approximations for large n: Expected cost with a measured string at the QAOA circuit for typical instances of SK problem, e.g., in our case ML problem, is very close to n 2 V p ( γ, β) independent from the specific instance of the problem.Besides that, σ C ≡ where O 2 ≤ O 2 is utilized for operators [5].This means that as the mean E J,h [ C ] increases as O(n 2 ), the standard deviation for various instances of J and h increases with a smaller rate of O(n √ n) resulting in concentration property.On the other hand, we observe in our simulations in Section VII that instance based mean of the distance between measurement mean, i.e., C QAOA , and the minimum measured value, i.e., min z {C QAOA (z)}, shows O(g p (n) n 2 ) increase with increasing n as presented in the following conjecture: Conjecture 1. Problem instance based expectation of the difference between the measured mean cost and minimum cost denoted by C QAOA and min z {C QAOA (z)}, respectively, for QAOA circuit measurements of ML problem increases as O g p (n) n 2 with increasing n 1: where g p (n) > 1 / √ n is a slowly decreasing function of n for each p.Besides that, instance based variance of min z {C QAOA (z)} ≡ m Q increases as O(n 3 ) leading to the following equality for some constant c p depending on p: The supporting evidence is provided in simulations in Section VII.Although standard deviation of min z {C QAOA (z)} with respect to problem instances increases as O(n √ n), it is conjectured that distance between C QAOA and min z {C QAOA (z)} increases with a larger rate of n at least until some large value of n depending on p.We observe in simulations that variances in (33) and (35) increase as O(n 3 ) supporting Theorem 1 and conjectures.It is an open issue to determine the probability distribution type for ML problems while QAOA measurements are observed to have approximate Boltzmann distribution in [26], [27].
We assume that ML cost and solution cost are close to each other for high SNR as shown in Section VII.Based on our observations in simulation studies, we assume distribution of C QAOA with respect to varying problem instances has a Gaussian form as a simplified assumption.For simplicity, if min z {C QAOA (z)} is assumed to have mean C QAOA with a variance denoted by σ 2 QAOA (p) combining the variance of distributions of C QAOA with respect to J , h and distribution of the measurements for an individual problem instance, then probability of min z {C QAOA (z)} < C(s) with respect to instances J and h becomes proportional to the following: where σ 2 QAOA (p) increases as O(n 3 ) with increasing n based on observations in simulations and Φ(.) is the cumulative distribution function of a normal random variable.We heuristically define a variance combining the variances with respect to both all the problem instances and individual measurement outcomes of a single QAOA circuit as follows: where both C(s) and C QAOA increase as O(n 2 ) and ) is the performance constant for r * and A(p) is calculated as follows: Then, n max ∝ α 2 (r * ) / A 2 (p) is obtained.Substituting (31) and (38) into (39) and including additional constant for fitting result in the following: Conjecture 2. The maximum value of n denoted by n max for p-layer QAOA circuits achieving near-optimum ML performance for high SNR ( ≥ 12 dB) is approximated as follows: where α 1 (r * ) and α 2 (r * ) are empirical approximation constants depending on the target approximation ratio r * and corresponding to first and second moments of the measured costs for p-layer QAOA circuits and χ(p) is defined as follows: (42) The proofs and verification of both conjectures require to obtain QAOA performance for large n which is beyond our capability with access to public simulators with qubit numbers smaller than 32.Extensive simulation results show that n max ≈ 28 for p = 4 and SNR = 15 ≈ 12 dB with near-optimum performance.

VII. QAOA SIMULATIONS
We provide extensive simulations by using IBM Quantum Lab and "ibmq qasm simulator" back-end by IBM Quantum [20].Simulation parameters are presented in Table I.
The number of instances denoted by M is the number of realizations of the main ML problem in (6) with randomly generated real valued H, symbols s ∈ {−1, +1} ⊗n and noise n in each instance.Elements of H and n are generated with variances σ 2 H = 1 and σ 2 n = n σ 2 H / SN R = n / SN R, respectively, for each instance.Therefore, the number of instances is also equal to the total number of estimated symbols with length n allowing to calculate total probability of bit error with P e = N b / (M n) where N b is the total number of bit errors in all M symbols.J and h are calculated with ( 9) and ( 10) and then QAOA circuit output is simulated.The number of measurements (shots) obtained in each simulation instance is chosen as 4096.We simulated a total of 236500 QAOA circuits for varying n, p and SNR by providing an extensive result set for the performance of QAOA for ML problems.We share measurement results for C QAOA , C 2 QAOA , min z {C QAOA (z)}, C(s) and n b for each random problem instance in [21] as a data repository to be utilized in research studies of QAOA where n b denotes the total number of bit errors in a single problem instance among n bits.
We simulate QAOA performance for varying symbol lengths n ∈ [10,28], SNR ∈ {1, 5, 10, 15} and circuit depth p ∈ [1,14] as shown in Table I.The maximum value of SNR is set to 15 ≈ 12 dB due to the requirement of very high M to reliably calculate P e .We achieve 4000 ≤ M ≤ 5000 for the cases requiring accurate estimation of P e such as for the combinations of p ∈ [4,6], SN R = 15 and n = 25 or for varying n to accurately observe the highest performance value n max achieving ML performance in Section VII-E.
Various performance metrics of QAOA are defined.C ≡ C QAOA , C 2 ≡ C 2 QAOA and min z {C QAOA (z)} ≡ min z {C QAOA (z)} correspond to expectation of the cost C(z), C 2 (z) and the minimum cost, respectively, based on the observed measurement outputs z of QAOA circuit in each simulation instance.We calculate mean of all these expectations with respect to all M different problem instances with different H, s, n, J and h by using the expectation operator defined as E J,h [.].

The values E
the instance based mean of the variance of C QAOA with respect to measured costs in each problem instance.C(s) corresponds to the solution cost for the problem with the transmitted symbol s while C M L and C M M SE denote the costs for ML and MMSE solutions in each problem instance.We compare QAOA performance with both ML and MSSE solutions.MMSE solution is defined as follows [28]: where f sign (.) is the sign function since we utilize BPSK symbols with equally probable {±1} values.We obtain cost values of ML solution for n = 25 with classical simulation for the number of simulation instances being equal to 1500, 2500, 5000 and 5000 for SNR ∈ {1, 5, 10, 15}, respectively.P e is We define various approximation ratios in addition to the conventional approximation ratio r M L as the ratio of the expected cost obtained for a specific problem instance to the minimum possible cost.We revise the definition by using expectations with respect to varying problem instances.We also utilize normalization for a smooth performance comparison as a precaution due to finite number of simulation instances and difference in the number of simulations among various p and SNR values.The following are defined: where we normalize costs for the definitions of r M L and r * M L by dividing with C(s).r and r * measure performance of QAOA with respect to the solution cost while r M L and r * M L measure with respect to the performance of optimum ML solution which is the conventional performance metric.We achieve approximation ratio r ≈ r M L ≈ 0.9364 with p = 14 for SNR ≈ 12 dB while r * ≈ 1 with p ≥ 5 for SN R ≥ 10.In addition, QAOA solution is highly close to ML solution with r * M L ≈ 1 with p ≥ 5 for SN R ≥ 5.

A. Optimization of QAOA angle parameters
We optimize angles γ = n γ and β for p ∈ [1,7] based on (31) by minimizing V p ( γ, β) while obtaining V p ( γ, β) for each γ and β with substitution of W u into (24) while W u is calculated by using Algorithm 1.The optimization is performed in classical computer by providing initial points as shown in Table II and then applying unconstrained nonlinear optimization with quasi-Newton method as a type of gradient based methods available in various numerical packages.( γ 1,1 , β 1,1 ) region covering an area of (2π × 2π) is discretized into a 100 × 100 grid for p = 1 by determining 100 × 100 initial points where γ p,j and β p,j denote jth angle parameters for p-layer QAOA circuit for j ≤ p.Then, optimization is performed by starting with each initial point and the angles giving the minimum cost are chosen as the optimum angles for p = 1.For p = 2 and p = 3, the initial points are chosen such that γ p,r = γ p−1,r and β p,r = β p−1,r for r ∈ [1, p − 1] while ( γ p,p , β p,p ) region covering an area of (2π × 2π) is discretized.It is observed that optimized γ p,p and β p,p are in the neighborhood of γ p−1,p−1 and β p−1,p−1 , respectively, for p ∈ [1,3].This observation is utilized while determining the grid sizes and boundaries for the initial point regions for p ∈ [4,6] while with decreased number of points due to complexity as shown in Table II.For p = 7, initial point is chosen with a heuristic extrapolation method described next and the same nonlinear optimization method is applied only for a single point.Extrapolation methods are also applied for Max-Cut problems in QAOA for high depths [5], [29].Optimized angles for p ∈ [1,5] and SN R = 15 are presented in Table III.The values of γ for SN R = 15 are shown in Fig. 1(a) by including the extrapolated curves for p ∈ [8,14] as discussed in the next section.All optimized parameters are available in data repository [21].

B. Extrapolation of γ and β for higher depths
We design a heuristic method to calculate angles for p ∈ [8,14] by observing the pattern of the optimized angles for p ∈ [1,7].The angles of γ p,r for p ∈ [1,7] and r ∈ [1, p] are presented in Fig. 1(b) for SNR = 15.It is observed that γ p,r values increase as r increases while the curves shift to the right as p increases.Similar pattern is observed also for β values.Then, an initial estimation is performed manually for p = 7 and an optimization is performed with this initial point to calculate the angles for p = 7.For the values p ∈ [8,14], a numerical algorithm is designed which is composed of three basic steps as shown in Algorithm 2. Firstly, we realize extrapolation for γ p,p by using the following interpolation function based on the observation of a saturation behavior for γ r,r for r ∈ [1, p − 1]:  P e values obtained with QAOA, ML and MMSE solutions for varying SNR and depth p ∈ [1,7] are shown in Fig. 2(a).It is observed that p ≥ 4 reaches near-optimum ML performance for all SNR values while MMSE performance is significantly worse compared with QAOA and ML.In Fig. 2(b), the performance is shown for SN R = 15 and p ∈ [1,14] showing the saturation at the optimum performance for p ≥ 5 with small oscillations due to finite number of experiments M and potentially finite size effects for various p.
The cost performances are shown in Fig. 2(c) for varying SNR for QAOA, ML and MMSE where expected costs are normalized by multiplying with (−2 n 2 σ 2 H ) / E J,h [C(s)] to obtain smoother and accurate comparison between the results due to different number of simulation instances for varying SNR and p.In Fig. 2(c), it is observed that as p increases from 1 to 7, expectation E J,h [ C QAOA ] curves for varying SNR get closer to the ML cost curve.Expected solution cost is also shown for reference.In Fig. 2(d), performance is shown for high depths with p ∈ [1,14] and SN R = 15.It is observed that at high SNR, ML cost is almost equal to solution cost while E J,h [ C QAOA ] is getting closer to ML cost compared with the performances for p ∈ [1,7].Therefore, our heuristic extrapolation method surprisingly provides significant performance even though they do not include any optimization.It is observed in Fig. 2(e) that observed minimum values of the cost among QAOA circuit output measurement results are significantly close to ML cost.For example, at p = 3, Approximation ratios r, r * , r M L and r * M L are shown for varying p ∈ [1,7] and SNR values of 1, 5 and 10 in Figs.3(a), (b) and (c), respectively, while for varying p ∈ [1,14] and SN R = 15 in Fig. 3(d for p ≥ 5 and SNR = 15.Both ML and QAOA solutions are more separated from the solution for low SNR value of 1 as shown in Fig. 3(a).r is not close to 1 for SN R = 1 for all p values while still r * M L ≈ 1 with larger difference compared with SN R ≥ 5. Similar observation is obtained for r M L for SN R = 1 with an increase towards unity as p increases.

D. Statistical properties of QAOA cost metrics for varying n
We simulate statistical properties formulated in Theorem 1 and the corresponding results in (31-35) allowing us to present Conjecture 2. In Figs.4(a) and (b), ( 31) and (32) are verified with simulations for p ∈ [1,4] and n ∈ [10,28] showing that E J,h [ C QAOA ] and E J,h C 2 QAOA increase as O(n 2 ) and O(n 4 ), respectively, with respect to n.On the other hand, in Figs.4(c) and (d), it is observed that variances increase as O(n 3 ) with respect to n supporting concentration property defined in (33).It also shown that measurement result of each QAOA instance is also showing an increase of variance as O(n 3 ) on average over many individual instances with respect to the symbol length n.In Fig. 4(e), variances are shown for n = 25 and varying p ∈ [1,14] decreases with respect to p resembling a saturation while instance variance of the mean C QAOA has a much smaller oscillation with respect to p while saturating for large p.On the other hand, in Fig. 4(f), it is observed that distance between the measured minimum cost and mean normalized by n 2 , i.e., E J,h [< C QAOA > − min z {C QAOA (z)}] / n 2 , decreases slowly for p = 1 and p = 2 as n increases supporting (34) in Conjecture 1.The same decrease is not observed for p = 3 and p = 4 probably due to their much slower decay function g p (n) compared with g 2 (n) and g 1 (n).In other words, it is conjectured that E J,h [< C QAOA > − min z {C QAOA (z)}] / n 2 for p ≥ 3 starts to slowly decrease with increasing n if n > 28.
Similarly, variance of min z {C QAOA (z)} increases with n as O(n 3 ) as shown in Fig. 4(g) supporting (35) in Conjecture 1.
E. Simulations for Conjecture-2 P e for QAOA circuits optimized for p ∈ [1,4] is calculated for varying n ∈ [10, 28] and compared with P e of ML solution to estimate the maximum n value denoted by n max for QAOA circuits achieving ML performance.In Fig. 5(a), P e of QAOA circuits for p ∈ [1,4] are compared with P e of ML detection.It is observed that P e with respect to n shows a local minimum approximately at the point n max where its performance starts to degrade compared with ML for increasing n.Approximation ratio r * performance for varying n and p ∈ [1,4] is shown in Fig. 5(b).We obtain n max ∈ {16, 18, 23, 28} with the corresponding depth values of p ∈ {1, 2, 3, 4}, respectively, for r * = 0.9995 and n max ∈ {18, 22, 28} with p ∈ {1, 2, 3}, respectively, for r * = 0.9967.In Fig. 5(c), extrapolation based on (41) in Conjecture 2 is shown where n max for varying E J,h [ C QAOA − C(s)] / n 2 is shown.The first four points for p ∈ [1,4] are extracted from simulations shown in Fig. 5 / n 3 in the numerator of (41) is calculated based on the observations in Fig. 4(e) and E J,h [ C QAOA − C(s)] / n 2 is calculated based on experiments for calculating the values in Fig. 2(d).Then, n max is extrapolated for target r * values.We estimate that approximately n max < 200 and n max < 300 for r * = 0.9995 and r * = 0.9967 with p ∈ [5,14] and p ∈ [4,14], respectively.
It is an open issue to verify the extrapolated performance with high number of qubits in NISQ devices and near future quantum computer simulators.

VIII. CONCLUSIONS
We apply QAOA on ML detection of massive MIMO systems with an optimization algorithm of angle parameters having computational complexity of O(16 p ) for p-layer QAOA circuits based on SK modeling.Proposed algorithm optimizes angles independent from n and specific problem instance with significant practicality.We provide extensive simulation studies for QAOA by analyzing statistical properties of QAOA measurements in IBM Quantum Lab.We present a conjecture about concentration property of QAOA and share corresponding measurement statistics of C QAOA , C 2 QAOA , min z {C QAOA (z)}, C(s) and total number of bit errors for a total of 236500 individual QAOA circuits for varying n, p and SNR as a data repository in [21].We observe that QAOA for p ≥ 4 achieves near-optimum ML performance in terms of P e for large scale 25 × 25 MIMO systems and SNR < 12 dB.In addition, we present extrapolated angles for p ∈ [8,14] for SN R ≈ 12 dB with a conjecture claiming that presented angles achieve near-optimum performance for n ≤ 300 with applicability for next generation massive MIMO systems.

ACKNOWLEDGMENTS
This work was supported by TUBITAK (The Scientific and Technical Research Council of Turkey) under Grant #119E584.

DATA AVAILABILITY STATEMENT
Problem instance specific values of C QAOA , C 2 QAOA , min z {C QAOA (z)}, C(s) and total number of bit errors obtained with 236500 randomly generated individual QAOA circuits in IBM Quantum Lab are available in the publicly accessible data repository in IEEE DataPort in [21].Researchers can access the dataset and its associated documentation for further analysis and verification.Digital object identifier (DOI) of the repository is 10.21227/x0g7-n411.

APPENDIX A STATISTICAL PROPERTIES OF SHERRINGTON-KIRKPATRICK MODEL OF ML PROBLEM
The law of large numbers allows to approximate the distributions of J j,k and h j as Gaussian for n 1.We also assume typical input symbols s of length n with s i being equal to ±1 with equal probability of 1 / 2 leading to µ s = 0.Then, the approximate values of mean and variance are calculated for typical instances of H, n, J , h and s by using Then, by substituting ( 9) and (10), we obtain µ J ≈ 2 n µ 2 H = 0 and µ h ≈ 0. σ 2 J is approximated where the terms proportional to µ H and O(1 / n 2 ) are excluded in the steps of the derivation by setting them equal to zero resulting in σ where we use the fact that the expectation of H 4 ( , j) is equal to 3 σ 4 H and we excluded the terms proportional to O(1/n 2 ).
On the other hand, the mean and variance of the cost for the target solution are calculated by using (1), ( 9) and ( 10) as H .The variance denoted by σ 2 Cs is trivially calculated after a set of conversions as σ ).This results in concentration property as follows:

APPENDIX B PROOF OF THEOREM 1
Following the approach in [5] as defined between equations (73)-( 76), e ı λ C n 2 is calculated as follows by also including the effect of the local field of SK model: where z , z [±r] 2 , . . ., z ≡ z e ı βr B z = z.z|e ı βr B |1 = (cos β r ) g + (z) (ı sin β r ) g − (z) , z.z is the bit-wise product of z and z , g + (z) and g − (z) count the number of ones and minus ones in z, respectively, and φ j,k and φ j defined as follows: where γ p,r ≡ γ p,r / n for r ∈ [1, p] based on our assumption.
The characteristic function of a Gaussian random variable X with mean µ X and variance σ 2 X or the expectation E X [exp(ı t X)] equals to exp(ı µ X t) exp(− σ 2 X t 2 / 2).Furthermore, taking expectation with respect to J j,k z m j z m k and h j z m j are the same with taking the expectation with respect to J j,k and h j , respectively, due to symmetry.After replacing J j,k with J j,k z m j z m k and h j with h j z m j , the expectation of e ı λ C / n 2 with respect to J and h is calculated as follows: E J,h e ı λ C / n 2 = z [±1] ,...,z [±p]    where σ J 2 = σ 2 J / n, σ h 2 = σ 2 h / n 2 .In Section 6 of [5], a simplifying approach is proposed to calculate a similar form by summing in a configuration basis instead of all (2 n ) 2p possible strings z [1] , . . ., z [p] , z [−p] , . . ., z [−1] in (54).Summation in (54) is converted into the following by expanding the square terms and then taking the first derivative of both sides with respect to λ at λ = 0 (similar to the formulation between the equations ( 86)-( 95) and ( 117)-(118) in [5]): Assume that the set R denotes one of sets {u, v, x, y}, {u, v, x}, {u, v} or {u} for given strings u, v, x and y.Following the similar methodology in [5], the following functions corresponding to S u,v,x,y , S u,v,x , S u,v and S u are defined: where g a.b and g a are defined as follows: and the function f (R, n) is defined as follows: Therefore, the problem is converted to the calculations of S u,v,x,y , S u,v,x , S u,v and S u for u, v, x, y ∈ A. In a similar manner to the proof of Lemma-2 in Section 6.2 in [5], these functions can be expressed by separating the summation with respect to {n a } into two subsets of the set A. The partition of A as A = B ∪ A p+1 = D ∪ D c ∪ A p+1 is described in Section V.Then, S R is expressed as follows: = The following observations as described in Sections 6. where . The only difference between (65) and the equation (132) in Proof of Lemma-2 in Section 6.2 in [5] is

h C / n 2 2 (
where a.b denotes bitwise product of a and b, multinomial co-efficient n {na} is equal to the value of n! / (n 1 !n 2 ! . . .n 2 2p !) such that 2 2p i=1 n i = n, Φ a = p r=1 γ p,r (a r:p − a −r:−p ), a r:p = p l=r a l , a −r:−p = −r l=−p a l for r ∈ [1, p]and Q a is defined as follows: following is obtained for the second moment:E J,h C 2 / n 4 = σ J

1 and 6 .
2 in[5] are used to simplify the equation.It is observed that g a.b = g b.a , g a.a = 1 since Φ a.a = 0 for a, a ∈ A p+1 and g a = 1 since Φ a = Φ a.1 = 0 where both all ones string 1 and a ∈ A p+1 .Then, (64) is converted into S R = n t = 0 s R (t, n) where s R (t, n) is simplified as follows: s R (t, n) 38)This definition allows us to include the effects of the variance of the mean C QAOA with respect to different problem instances, i.e.,E J,h C QAOA 2 − E 2 J,h [ C QAOA ],and variance of the QAOA circuit output measurement for each individual instance, i.e., E J,h C 2 QAOA − C QAOA 2 .Then, n max is approximated by setting input to Φ (.) function in (36) to a constant value α 2 (r * ) depending on the target approximation ratio r * resulting in the following: