Gaussian Process As a Benchmark for Optimal Sensor Placement Strategy

Optimal sensor placement is an important problem to look at. This problem becomes all the more relevant nowadays due to advancements in infrastructure monitoring robotic technologies including underground sensing. While there are multiple ways to solve optimal sensor placement problems, one of the most generic methods available is Bayesian Optimization and its variants. In this paper, we present a simple benchmark-like formulation for exploiting Gaussian Process uncertainty for sensor placement to measure a scalar field.


I. INTRODUCTION
Within the task of measuring scalar fields, optimal sensor placement, or computing subsequent best, or subsequent most informative location(s) for sensor placement, is an important question to look at. This question has become especially relevant due to the modern advancements of robotic capabilities available to support sensing tasks [1], including underground measurements [2]- [6]. An approach that is classically used to solve such sensor placement problems is Bayesian Optimization and its extensions [7].
However, Bayesian Optimization approaches come with the burden of computational complexity [7], [8], especially due to some Gaussian Process (GP) [9] computations that have to be made. This burden is especially worrisome in real time computations for robotic deployments. Therefore, it becomes of interest to assess alternative approaches that can optimize sensor placement in computationally simpler ways.
When it comes to assessing such alternative approaches though, one would at some stage have to compare the alternatives against Bayesian Optimization methods. This comparison becomes necessary because Bayesian Optimization approaches, despite their computational burdens, can be argued to be a powerful benchmark means [7] to solve many a design of experiments problem, including sensor placement problems.
Having that premise set, with an intention of analyzing different sensor placement strategies in future, we present in this paper a simplified yet generic formulation for exploiting Gaussian Process coupled with the Squared Exponential kernel [9], describing a benchmark method for sensor placement in a 2D environment. The formulation can be extended to 3D environments and higher dimensions as well. We also demonstrate the performance of the benchmark through a Pulsed Eddy Current sensing experiment targeted at underground reinforced concrete sewer inspection [10].

II. PROBLEM FORMULATION
We are given a bounded 2D plane, on which any point can be given by the coordinate pair (x, y) where x ∈ R and y ∈ R. The plane can have a 'k' number of boundaries. The boundaries are known. With respect to those boundaries, the plane can be written as a region that the coordinates (x, y) satisfy Here b 1 , b 2 , ..., b k are functions that perform R 2 → R, and these functions can be nonlinear for generality.
Our bounded 2D plane carries a scalar field. Said more specifically, every point (x, y) in our plane will have a measurable and static (i.e., non time-varying) scalar quantity given by g = f (x, y). Here, g ∈ R, and f is a smooth continuous function such that f : R 2 → R. But, the function f is unknown. Now, we are given a fixed set X n containing an 'n' number of coordinates taken from our bounded plane. X n can be written as in (1). It must be assumed that the coordinates in X n are sufficiently dense so that between any two adjacent coordinates the function f remains approximately smooth. We are then asked to measure G n -the set given in (2) that contains the scalar values corresponding to the coordinates in X n .
We are given one sensor to take measurements. Measurement-taking will have to be iterative. At every iteration, the sensor can take a single measurement by reaching any point (x i , y i ) ∈ X n . We are thus required to iterate to cover the space taking one measurement at a time.
Given a task like this, typically, we would go and measure 'g' values at every given coordinate. However, suppose measuring at all coordinates is considered tedious, and we would like to do fewer measurements, but still get a good enough job done. Achieving that objective is the problem we address in this paper. Our problem can be expressed mathematically as follows.
Given a fixed set of coordinates X n , we would like to identify an optimal subset X m , where X m ⊂ X n and obtain a corresponding set of measurements G m , where G m ⊂ G n , such that the underlying unknown continuous scalar function f that does the mapping g = f (x, y), can be estimated. The estimate of f must converge under some convergence criteria we define.
Solving the aforesaid problem essentially addresses a version of the optimal sensor placement task in 2D. Moreover, obtaining a converged estimate of f enables finding the scalar values at coordinates in the set X n ∩ X m through prediction and without measurement (here, X m is the complement set of X m ). In the subsequent sections, we formulate a solution for this problem using Gaussian Process (GP) [9]. Our formulation can also be used as a benchmark to be compared against to evaluate other methods.

III. METHODOLOGY
This section presents the GP-based formulation to accomplish the sensing task described in Section II. An iterative method for GP maximum uncertainty-driven, information maximization-based, active measurement-taking is proposed. Examples are available to how GP maximum uncertainty criteria has been exploited in different contexts [11], [12].
Given any set of prior measurements, this method calculates the next best, or next most informative point to take a measurement. The next best point is considered where the maximum GP uncertainty (explained later) results from the remaining unmeasured points. Sequential steps to be followed are summarized herein.
Step 1, Conduct prior measurements: Select a set X m (where X m ⊂ X n ) that contains an 'm' number of 'uniformly' or 'evenly' distributed points across the given region to sense (note: select the number m such that m/n × 100% ≈ 50%). Then conduct the measurements corresponding to X m , and construct the corresponding set G m . Also construct the corresponding unmeasured coordinate set X * where X * = X n ∩ X m . Now arrange X m , G m and X * to be corresponding matrices as shown in (3), (4) and (5). The notation . . . T henceforth denotes matrix transpose.
Moreover, the notation X (i) m and X (j) * will henceforth denote any generic i th and j th row of two matrices X m and X * respectively. X (i) m and X (j) * will be considered vectors in R 2 . This notation is useful for (6).
Step 2, Estimate Hyper-Parameters: The objective of the second step is to estimate and save the hyper-parameter set θ (l) = {α (l) , η (l) , σ (l) }, where θ (l) ∈ R 3 . The superscript (l) denotes the number of iterations the measurement-taking has gone for. The value of (l) will be one when entering Step 2 soon after completing Step 1, i.e., taking the prior measurements. On subsequent returns to Step 2 (as described later), the value of (l) will increment.
This hyper-parameter set in R 3 results by considering the Squared Exponential kernel and Gaussian likelihood in GP formulation [9] (presenting full GP formulation is not done in this paper, only vital steps are shown, readers are advised to refer to works like [9] for complete derivations). Taking the squared exponential kernel becomes appropriate for our task due to our underlying condition that the function 'f ' is smooth and continuous (recall from Section II).
To calculate hyper-parameters following GP formulation [9], we construct matrix Σ m such that Σ m = K(X m , X m ) + σ 2 I. Here, σ ∈ R is a hyper-parameter to be estimated, K(X m , X m ) ∈ R m×m , and I is an identity matrix of corresponding size. A generic formulation for X a ⊂ X n ; is the squared exponential kernel given by (6) where || . . . || denotes the Euclidean Norm and α, η ∈ R are hyper-parameters to be estimated.
We then proceed to derive maximum likelihood estimation formulation [9]. This results in the negative log marginal likelihood given by (7) where | * | denotes the matrix determinant.
After expressing − log[p(G m |X m , θ)], we can then estimate hyper-parameters by minimizing the negative log marginal likelihood as in (8). A way to initialize the hyper-parameters to perform this optimization is presented in [13].
Step 3, Check Convergence: This step prescribes a convergence, or a stopping criteria for the sensing task. To define a stopping criteria, in every iteration (l), we make a prediction on the full coordinate set X n based on the measurement sets X m and G m available at that iteration. This means by following GP prediction [9] we perform: G * n |G m , X m , X n ∼ N (µ * n , Σ * n ), such that µ * n and Σ * n are given by (9) and (10). Then, for the (l) th iteration we calculate γ (l) as shown in (11) where | * | denotes absolute value. γ (l) here is the measure of mean percentage value of 2×prediction uncertainty with respect to prediction mean. As we collect more and more measurements, i.e., as our X m set gets larger in every iteration (l), we would expect γ (l) to decrease. Therefore, we define the stopping criteria to be γ (l) ≤ γ c condition be satisfied for N it consecutive iterations. Here, γ c will be some small positive number we define. Similarly, N it will be a number of consecutive iterations we define. This means, as long as γ (l) > γ c prevails, the iterative measurement-taking will continue. It should also be noted that for the way we calculate γ (l) in (11), we do not wish any µ * (i) n to lie close to zero. Therefore, we recommend adding a bias constant to the measured G m set prior to doing GP in a way that all values in G m become positive and the minimum value in G m becomes larger than one.
Step 4, Compute and Perform Next Best Measurement: This step is performed for as long as the convergence criteria in Step 3 is not satisfied. To compute the coordinate for the next best measurement, we make a prediction on the unmeasured coordinate set X * based on the measurement sets X m and G m available at iteration (l). This means by following GP prediction [9] we perform: G * |G m , X m , X * ∼ N (µ * , Σ * ), such that µ * and Σ * are given by (12) and (13). When µ * and Σ * are computed, we can then determine the coordinate with highest uncertainty in iteration (l) by solving (14). The resulting coordinate (x (l) , y (l) ) is deemed as the next best point to take a measurement according to our maximum uncertainty criteria. Once the point (x (l) , y (l) ) is determined, the corresponding measurement g (l) is taken. We then add point (x (l) , y (l) ) to the matrix X m and g (l) to the matrix G m , and return to Step 2 thereby closing the loop of the algorithm to continue iterating.
IV. RESULTS AND CONCLUSION Results from an example where the proposed method was applied is shown. The example comes from [10]. A Pulsed Eddy Current (PEC) sensor is used to locate reinforcement rods (i.e., rebars). This application is related to underground reinforced concrete sewer infrastructure inspection. The PEC sensor is allowed to move across a planar surface above the rebars. Signal intensity is indicative of the sensor moving above the rebar (see Fig. 1(a)). This exercise has 117 (i.e., 13 × 9) uniformly placed square grid locations. The sensor can take a maximum of 117 measurements-one measurement per grid location. Fig. 1(a) shows the case where all 117 measurements have been taken. Fig. 1(b) shows the measurements completed at the point of convergence by following the method proposed in this paper. White squares are indicative of unmeasured points. Fig. 1(c) shows the completed map by filling the blank spaces of Fig. 1(b) via GP regression. The effectiveness of the method is evidenced by there being hardly any visible difference in Fig. 1(c) from Fig. 1(a). Fig. 1(d) shows the percentage error between the GP predictions in Fig. 1(c) and the corresponding actual measurements in Fig. 1(a). The maximum percentage error does not exceed 5%. Fig. 2 shows how the convergence measure γ (l) has varied along the iterations (l). Since there are 117 maximum allowable measurements in this example, the prior measurements (i.e., 50%, recall from Step 1) accounted for 59 points. After those measurements, only 40 more measurements were required (notice from Fig. 2-convergence reached in the 41 st iteration) to reach convergence at the convergence parameters set to γ c = 10% and N it = 10. Thus can be demonstrated the capability of the proposed method. Future work can explore different convergence criteria, different next best point calculation criteria, and use the proposed method as a benchmark to be compared against when evaluating other methods for similar purposes, and also larger data sets [14].