Machine Learning calibration of Angle of Arrival methods based on different experimental Unified Linear and Rectified Array measurements

Generic Angle of Arrival methods for indoor positioning are strongly affected by antenna-specific and environment-specific effects, namely design impurities and multipath propagation components. Here we acquired a large dataset of four different antenna designs in three different measurement environments, with more than 140000 snapshots obtained from a Bluetooth 5.1 receiver. Using the spatial power spectral densities of the PDDA angle of arrival algorithm as the feature set for a small Random Forest model, we show that angle estimation performance improved significantly for all antennas in all measured environments (PDDA MAE > 16 degrees vs. RF MAE < 3 degrees). Owing to its small size, the proposed model can be implemented on microcontrollers for super-resolution angle of arrival applications.


I. INTRODUCTION
Traditionally, Angle of Arrival (AoA) estimation is based on spatial covariance matrix calculations (Schmidt, 1986). Subsequently, different computationally efficient and precise algorithms were proposed (Al-Sadoon et al., 2017;Roy and Kailath, 2009). Antenna-specific characteristics, such as the number of antenna elements and the element spacing, are explicitly declared in every algorithm. The incoming angle is estimated from the resulting received spatial power spectrum. Due to antenna imperfections and multipath propagation (reflection, diffraction and scattering (Yu et al., 2004)), the position estimate is highly error-prone in typical indoor deployments. Since the early investigations into AoA estimation, Machine Learning (ML) implementations using Radial Basis Function Neural Networks (RBFNN) have been proposed as a computationally feasible method with outstanding performance (El Zooghby, 1997;El Zooghby et al., 2000). Ongoing work has shown the high potential of ML with outstanding precision (Adavanne et al., 2018;Bialer et al., 2019;Huang et al., 2018;Khan et al., 2019;Ravindran and Jose, 2019). Those investigations are mostly based on a specific antenna design and were evaluated in simulation, where noise distributions were parametrized as Gaussian or white noise with fixed Signal to Noise Ratio parameters (Huang et al., 2018). In contrast, real field measurements are expected to yield distinctive measurement attributes, which vary first with the antenna used and its design impurities and second with the deployed measurement environment. Indoor positioning environments in particular are highly affected by multipath propagation components. Few ML approaches have been evaluated on hardware measurements (Agatonović et al., 2013). Agatonović et al. showed that frequency-specific channel measurements varied in an anechoic chamber and had an impact on the generic AoA algorithm used (Agatonović et al., 2013).
Since wireless technology standards commonly allow for multiple-channel transmission [12], we investigated here ML input features formed by concatenating the multiple-channel spatial power spectral densities (PSD) obtained from the propagator direct data acquisition (PDDA) algorithm (Al-Sadoon et al., 2017). However, any AoA algorithm that maps antenna-specific characteristics into a spatial power spectrum could have been used instead. Here we propose a robust method to employ ML in a specific environment irrespective of the antenna design. We show that generic PSD-generating algorithms perform poorly in certain measurement environments because of multipath propagation. With a lightweight Random Forest ML method, the PDDA PSDs can be used as input features and significantly enhance the performance by learning (1) design-specific impurities and (2) environment-specific multipath components.

A. Data Acquisition
We acquired raw IQ samples from different antennas in different environments for azimuth and elevation PDDA angle estimation, using Bluetooth transmission according to version 5.1. The antenna layout in the receiver was varied in the number of elements and the respective element spacing. The layout was used in a square (4x4 and 3x3 elements) and a rectangular shape (2x8). One measured antenna used dual-polarized elements. Indoor short-range and long-range measurement scenarios were defined, as well as a reference outdoor measurement. All measurement scenarios were acquired for all antenna types, except for the outdoor scenario, in which only two antennas were measured. Using a custom-built rotational table, all angles from -80 to 80 degrees were acquired in steps of 5 degrees, with an equal number of samples per angle. In total, more than 140000 measurement snapshots were acquired from four antennas (Table 1) in three different environments (Table 2). As shown in Figure 1, the PDDA algorithm was then used to obtain the spatial power spectrum of each measurement for the three Bluetooth Low Energy advertising channels 37 (2402 MHz), 38 (2426 MHz) and 39 (2480 MHz).

The PDDA algorithm was shown to yield low error predictions with a limited number of sensors and snapshots. Different numbers of sources were accurately detected in a simulation setting. Thus, given the raw IQ measurements, the PDDA spectrum provides a fast feature preprocessing step for further ML usage. For super-resolution AoA estimation, data-driven approaches should ideally incorporate environment-specific and design-specific PDDA spectrum information. Here we specified the feature set to be the PDDA-obtained PSD of a single acquisition frequency or the concatenation of the PSDs of three different frequencies. For each PSD a "Received Signal Strength Indicator" (RSSI) was measured. With each PSD of size 181, a total of 546 (181·3+3) features was defined for the channel-concatenated feature set.
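The feature construction can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an idealised 4-element uniform linear array with half-wavelength spacing, a 181-point grid from -90 to 90 degrees, simulated IQ data, and an RSSI stand-in computed from the mean IQ power. The function `propagator_spectrum` is a simplified propagator-style estimator in the spirit of PDDA; its conjugation and normalisation conventions may differ from Al-Sadoon et al. (2017). All function names are hypothetical.

```python
import numpy as np

ANGLES_DEG = np.arange(-90, 91)  # assumed 181-point angle grid, 1-degree steps


def steering_matrix(n_elements, spacing_wavelengths=0.5, angles_deg=ANGLES_DEG):
    """Steering vectors of an idealised uniform linear array, one column per angle."""
    n = np.arange(n_elements)[:, None]
    sin_theta = np.sin(np.deg2rad(angles_deg))[None, :]
    return np.exp(2j * np.pi * spacing_wavelengths * n * sin_theta)


def propagator_spectrum(iq, steering):
    """Propagator-style spatial spectrum from IQ data of shape (n_elements, n_snapshots)."""
    h = iq[0]                               # snapshots of the reference element
    c = iq @ h.conj() / np.vdot(h, h)       # correlate every element with the reference
    spectrum = np.abs(steering.conj().T @ c) ** 2
    return spectrum / spectrum.max()        # normalise the peak to 1


def build_feature_vector(spectra, rssi):
    """Concatenate the per-channel spectra and the per-channel RSSI values."""
    parts = [np.ravel(s) for s in spectra] + [np.asarray(rssi, dtype=float)]
    return np.concatenate(parts)


# Simulated example: one source at 20 degrees, observed on three channels.
rng = np.random.default_rng(0)
true_angle = 20
steer = steering_matrix(4)
a_true = steer[:, ANGLES_DEG == true_angle]          # shape (4, 1)

spectra, rssi = [], []
for _ in range(3):
    s = rng.standard_normal(64) + 1j * rng.standard_normal(64)
    noise = 0.05 * (rng.standard_normal((4, 64)) + 1j * rng.standard_normal((4, 64)))
    iq = a_true * s + noise                          # shape (4, 64)
    spectra.append(propagator_spectrum(iq, steer))
    rssi.append(10 * np.log10(np.mean(np.abs(iq) ** 2)))  # RSSI stand-in (assumption)

features = build_feature_vector(spectra, rssi)
print(features.shape)
```

In this sketch the three spectra occupy the first 3·181 entries of the feature vector and the three RSSI values are appended at the end; the ordering is an arbitrary choice for illustration.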

C. Machine Learning model
In this analysis the primary focus was model size and embedded implementation capability. Neural networks commonly require a high number of parameters and computationally expensive matrix operations. Decision trees and random forests, first described by Breiman (Breiman, 2001), have proven to be a robust, accurate and successful tool for solving countless machine learning tasks, including classification, regression, density estimation, manifold learning and semi-supervised learning (Gall and Lempitsky, 2013). Random forests (RF) are an ensemble method consisting of many decision trees. A decision tree is a statistically optimized data segregation method that is controlled only by sequences of conditions. For classification, a commonly used criterion for node splitting is the Gini impurity. For a given dataset X with $N_m$ observations at node m, the probability of a certain class k is defined as follows:

$$p_{mk} = \frac{1}{N_m} \sum_{i \in N_m} I(y_i = k)$$

The impurity is then defined for that particular node m:

$$H(X_m) = \sum_{k} p_{mk} (1 - p_{mk})$$

For regression trees, however, the Mean Squared Error, which minimizes the $L_2$ norm, is commonly used. Here the mean of all labels $y_i$ in node m is computed:

$$\bar{y}_m = \frac{1}{N_m} \sum_{i \in N_m} y_i$$

The impurity $H(X_m)$ is then defined:

$$H(X_m) = \frac{1}{N_m} \sum_{i \in N_m} (y_i - \bar{y}_m)^2$$

Irrespective of whether the tree is a regression or a classification tree, new nodes are generated according to the splitting policy. The splitting criterion $\theta^* = (j, t_m)^*$ of a feature j and threshold $t_m$ is chosen such that the impurity of the newly generated datasets $Q_{left}$ and $Q_{right}$, weighted by their respective numbers of data points $n_{left}$ and $n_{right}$, is minimal:

$$\theta^* = \arg\min_{\theta} \left[ \frac{n_{left}}{N_m} H(Q_{left}(\theta)) + \frac{n_{right}}{N_m} H(Q_{right}(\theta)) \right]$$

Random Forests combine multiple decision trees. Individual decision trees tend to overfit and show high variance. By averaging over sufficiently many estimators, the variance is reduced at the cost of a slight increase in bias. Random splits of the data ("bagging") and feature randomness in each node are used to construct a forest of largely uncorrelated trees. The overall estimation output of B trees with the regression or classification estimators $f_b$ is the mean over all trained trees, evaluated on the unseen test set $x'$:

$$\hat{f}(x') = \frac{1}{B} \sum_{b=1}^{B} f_b(x')$$

Important hyperparameters are the number of trees, the maximum tree depth and the node splitting criterion. Classification and regression trees were previously shown to outperform different kinds of Machine Learning models. In an extensive study of 179 classifiers from 17 families on 121 datasets, including Support Vector Machines (SVMs), neural networks, CART methods, discriminant analysis and nearest neighbors, Random Forests showed the best performance (Fernández-Delgado et al., 2014). In a similar study of 165 publicly available bioinformatics classification problems, ensemble classifiers performed best (Olson et al., 2018).
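The node-splitting rule and the forest average described above can be written out in a few lines. The following is a didactic sketch with hypothetical function names, using an exhaustive search over midpoints between sorted feature values; production libraries use faster, equivalent search strategies.

```python
import numpy as np


def gini_impurity(labels):
    """Gini impurity: sum over classes of p_k * (1 - p_k)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * (1.0 - p)))


def mse_impurity(targets):
    """Mean squared deviation of the targets from their node mean."""
    targets = np.asarray(targets, dtype=float)
    return float(np.mean((targets - targets.mean()) ** 2))


def best_split(X, y, impurity=mse_impurity):
    """Return (feature j, threshold t, weighted child impurity) of the best split."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape
    best = (None, None, np.inf)
    for j in range(n_features):
        values = np.unique(X[:, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = X[:, j] <= t
            n_left = int(left.sum())
            n_right = n_samples - n_left
            score = (n_left * impurity(y[left]) + n_right * impurity(y[~left])) / n_samples
            if score < best[2]:
                best = (j, float(t), float(score))
    return best


def forest_predict(estimators, x):
    """Average the predictions of B individual tree estimators f_b."""
    return float(np.mean([f(x) for f in estimators]))


# Toy example: feature 0 separates the two target angles, feature 1 is constant.
X_toy = np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
y_toy = np.array([-10.0, -10.0, 10.0, 10.0])
print(best_split(X_toy, y_toy))
```

Passing `impurity=gini_impurity` turns the same search into the classification variant.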

III. RESULTS
An important prerequisite for ML usage in the AoA context is an understanding of spatial PSD-based features. In Figure 2 the mean PSDs for every antenna are displayed channel-wise in each environment for the true angle of 65 degrees. The example shows that outdoor measurements have a significantly higher signal-to-noise ratio. Multipath components are only visible as small peaks throughout the spectrum. For indoor measurements, multipath components in fact dominate the PSD distributions, leading to low performance. The example also shows that the PSD distributions differ between channels. Different transmission channels thus undergo different path propagation and should in turn be used as complementary features for a robust AoA estimation. Based on these findings, a data-driven approach should be capable of mapping antenna-specific, environment-specific and channel-specific contributions to a true angle of arrival. To investigate the frequency-specific contribution to the PDDA error rate, channel-specific error distributions were first estimated for all antennas in all environments. The three spectra were then combined using their sum and their product. To compare these results against a Machine Learning model, different model architectures with different hyperparameter settings were investigated. Here we present results for a simple Random Forest Regressor with the following specifications:
- Number of trees: 7
- Max depth: 7
- Splitting criterion: Mean Squared Error
- Minimum sample split: 2
- Minimum samples per leaf node: 1
Using a stratified 5-fold cross validation for a specific antenna in a specific environment, the Mean Absolute Error (MAE) was estimated. The model was then trained on the whole antenna-environment specific measurement set and saved, and the model size was obtained.
Here a simple Random Forest architecture could significantly outperform PDDA baseline estimations for individual and combined spatial power spectra (Figure 3, Figure 4). PDDA and RF errors are lower for outdoor than for long-range indoor measurements. Antenna-specific performance differences become visible as well. Investigating the error distributions in 5-degree angle bins reveals a persistent offset of the PDDA predictions throughout the -80 to 80 degrees range (Figure 5). Errors become significantly higher toward the positive angle range of 40 to 80 degrees.
The Random Forest mean errors, on the other hand, show consistent target predictions, even though slight offsets are also visible at the edge angles beyond ±75 degrees. The individual RF model sizes were obtained by saving the trained model (datatype float64). The model size of 39.8±7.94 kB is sufficiently small for the Flash and RAM of modern microcontrollers. No significant differences in model size with respect to the environment were observed (Figure 6).
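A rough way to reason about the model size is to count tree nodes. The sketch below assumes scikit-learn and random stand-in data. It compares the node count with the theoretical upper bound for 7 trees of depth 7, reports the size of the pickled model, and estimates a compact per-node layout. That layout (float64 threshold and value, int16 feature and child indices) is a hypothetical embedded format, not the one used in the paper.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestRegressor

N_TREES, MAX_DEPTH = 7, 7

# Random stand-in data with the channel-concatenated feature dimension.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 546))
y = rng.choice(np.arange(-80, 81, 5), size=500)

model = RandomForestRegressor(
    n_estimators=N_TREES, max_depth=MAX_DEPTH, random_state=0
).fit(X, y)

# A binary tree of depth d has at most 2**(d + 1) - 1 nodes.
max_nodes = N_TREES * (2 ** (MAX_DEPTH + 1) - 1)
n_nodes = sum(est.tree_.node_count for est in model.estimators_)

# Size of the serialized scikit-learn object, including Python overhead.
pickled_kb = len(pickle.dumps(model)) / 1000

# Hypothetical compact layout per node: float64 threshold, float64 value,
# int16 feature index and two int16 child indices.
bytes_per_node = 8 + 8 + 2 + 2 + 2
compact_kb = n_nodes * bytes_per_node / 1000

print(f"nodes: {n_nodes} (upper bound {max_nodes})")
print(f"pickled: {pickled_kb:.1f} kB, compact estimate: {compact_kb:.1f} kB")
```

The depth limit bounds the node count regardless of the training data, which is why the size stays in a range that fits typical microcontroller memory.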

IV. DISCUSSION
Our analysis shows that generic AoA methods are highly affected by antenna-specific and environment-specific propagation scenarios. Outdoor measurements showed the lowest PDDA errors. Indoor performance, on the other hand, is highly affected by multipath propagation components. Combining different acquisition channels lowered the PDDA MAE, but not significantly (PDDA product combination vs. channel-individual estimations: ch. 37 p=0.12, ch. 38 p=0.051, ch. 39 p=0.11). Machine Learning, on the other hand, was able to learn specific antenna-environment fingerprints. This was validated for all antennas and measurements, with MAE < 3 degrees for the Random Forest on concatenated channel PSDs against MAE > 16 degrees for the PDDA spectrum approach. Investigating the specific error distribution revealed that the highest errors occur at the spectrum edges, at angles with absolute values above 50 degrees. While individual spectra show high similarity in this range, a simple Random Forest architecture was able to distinguish angles in this range precisely. Our proposed model architecture was shown to have a mean model size smaller than 40 kB and can thus be implemented on modern microcontroller hardware. A limiting factor was the line-of-sight setting in every measurement. We hypothesize, however, that specific non-line-of-sight multipath propagation can be captured equally well by the proposed method. Non-line-of-sight signals would thus be uniquely identifiable by specific PSDs and presumably by channel-specific PSD features.