Experimental comparison of techniques for localization and mapping using a bearing-only sensor

:We present a comparison of an extended Kalman (cid:12)lter and an adaptation of bundle adjustment from computer vision for mobile robot localization and mapping using a bearing-only sensor. We show results on synthetic and real examples and discuss some advantages and disadvantages of the techniques. The comparison leads to a novel combination of the two techniques which results in computational complexity near Kalman (cid:12)lters and performance near bundle adjustment on the examples shown.


Introduction
In this paper we will present a comparison and experimental evaluation of techniques for localization and mapping for a mobile robot equipped with a bearing-only sensor.The robot has no a priori knowledge of the environment (no map) and can only observe its egomotion through odometry and the bearings to landmarks in the environment with an omnidirectional camera.The robot has no external position or heading reference, and cannot measure the range to landmarks.

Related work
Much of the work published in Simultaneous Localization and Mapping (SLAM), also referred to as Concurrent Localization and Mapping (CLM), makes use of active range sensors.Approaches generally rely on recursive ltering using a Kalman lter or some variant 1].There have also been attempts to nd optimal estimates using batch techniques 2] or to deal with large environments using a hierarchical approach 3,4].
The Computer vision literature contains a large body of work on multiframe Structure From Motion (SFM).Bundle adjustment is a batch technique that is widely used for SFM under perspective projection 5,6] and is applicable to SLAM.Some attempts have also been made to recursively compute structure and motion using Kalman lters 7,8].
The SLAM and SFM problems contain features relevant to localization and mapping with a bearing-only sensor.The goal of this paper is to present some comparisons on how techniques from SLAM and SFM perform on the bearing-only SLAM problem including results from experiments on simulated

Localization and Mapping
For simplicity we will only consider a 2D planar environment.The robot position at time i is m i = (x i ; y i ; i ) T and the feature locations are x j = (x j ; y j ) T .
The odometry is modelled as a stochastic measurement of motion between two consecutive robot poses such that where is N(0; R d ) noise.The bearings are modelled as stochastic measurements of the bearing towards each visible landmark in the rover coordinates, z ij = h(m i ; x j ) + !ij (2) where ! is N(0; ! ) noise.

SLAM as 2D Structure from Motion
The Structure from Motion (SFM) work in computer vision provides insight into what we can expect from bearing-only SLAM.In SFM the 3D structure of an environment is reconstructed using 2D projections (images) from unknown camera locations.The perspective camera projection model is very similar to a bearing-only sensor; depth information is lost in the projection.The only knowledge of the scene from a single image is that scene points lie somewhere along a ray originating at the camera center of projection and passing through the image plane at the point where the feature appears.In bearing-only SLAM, landmarks are projected onto a unit circle around the robot (Figure 1), and a similar constraint applies for the ray intersecting this projection surface.
If only the bearing measurements are available, and not the odometry, then a simple counting argument gives us a necessary condition for when the 2D structure from motion problem can be solved.For each robot position there are 3 unknowns, and for each landmark 2 unknowns.If there are M robot positions and N landmarks, then there are 3M + 2N unknowns.Assuming that all landmark features are observable from all robot positions, there will be MN measurements, and in general these will be linearly independent.There are 4 gauge freedoms 9] in the estimation, i.e. we can translate, rotate, and scale the solution and the measurements will be unchanged.We must set an absolute coordinate frame and absolute scale with 4 additional constraints.We can nd a solution to the problem when MN + 4 > 3M + 2N, or (M ?2)(N ? 3) > 2 (3) This constraint on the solution to the problem shows that there is no solution to the two frame SFM problem in 2D, regardless of methodology.Only when there are three robot positions and at least ve landmarks or more than three robot positions and at least four landmarks can we get a solution, and this solution is only de ned up to the gauge freedoms.In contrast, aligning just two scans from an accurate range-bearing sensor can give an estimate of vehicle motion and make up for errors in odometry.
In most SFM algorithms, little or no a priori information about the camera motion is used for reconstruction, camera motion is recovered in the estimation along with scene structure using only the image measurements.Odometry can be seen as providing a prior for the camera motion which is absent in general SFM problems.If we add odometry then we can immediately disambiguate the scale.There is even a unique solution to the two frame problem if odometry is available since it will provide a direct estimate of the relative pose of the second view.However until the third view is incorporated, bearing measurements cannot begin to correct odometry errors within the estimation algorithm.This can cause initialization problems with Kalman lters which are observed in practice, particularly when odometry is poor.
A nal note is that bias in odometry cannot always be corrected in bearings-only SLAM.If the measurements of distance travelled are biased by a scale factor then the estimate of robot motion and landmark positions will be scaled by the same factor and no estimator can recover the bias using the bearing measurements.This is an inherent ambiguity.In contrast, SLAM with range-bearing sensors does not su er from this problem since the range measurements provide direct information about the scale of the solution.

Estimating parameters
Finding an estimate for the model parameters m = (m 0 T ; m 1 T m M T ) T and x = (x 0 T ; x 1 T x N T ) T can be done in many ways.In this paper we wish to make some comparisons between two of these methods in particular.The rst method is the Extended Kalman lter and the second is bundle adjustment.Each of these methods has properties which make it attractive for use in SLAM, and the nature of the SLAM problem is important to the way in which these techniques are used and the results that can be expected.

Extended Kalman Filter
The Extended Kalman Filter (EKF) has been used in SLAM and SFM to recursively estimate a state vector which consists of the current rover pose or camera parameters and the positions of all landmark or scene features 1, 8].
As a new measurement is made, the lter goes through prediction and update steps which incorporate the new measurement and generates a best estimate (mean) and an uncertainty estimate (covariance) for the current rover pose and the map.The mean and covariance are kept as a su cient statistic for the posterior probability over state space, and old observations are discarded after being incorporated.Nonlinearities in bearing-only SLAM cause di culties to the EKF; in particular, getting started is tricky.The posterior over robot pose and landmark position is signi cantly non-Gaussian such that the mean and covariance in the lter are not a reasonable su cient statistic for the data.For this reason, we use bundle adjustment (to be explained in the next section) to get an initial state estimate and initial covariance matrix using some portion of the data, then recursively process the rest of the data with a Kalman lter.The quality of the result depends on how much of the data is used to get the initial estimates.
Space considerations prohibit detailed explanation of the use of Kalman lters in SLAM but there exists a large body of literature on the topic, particularly for problems where range and bearing measurements are available 1, 10].Although they should be included in a complete treatment on this topic, we will also omit discussion of modi cations to the Kalman ltering method such as implicit Jacobian computations 11] and covariance intersection 12].

Bundle Adjustment
Bundle adjustment is a full nonlinear optimization which does not rely on a mean and covariance as a su cient statistic for previous observations and state estimates.Instead it linearizes the estimation problem at every step using all available observations and the current best estimate.Bundle adjustment may optimize all motion and structure parameters at every step, so the state vector is the entire history of robot pose (trajectory) and the entire map.
We wish to minimize the cost function E( ) = 1 2 (d ?f( )) T R ?1 d (d ?f( )) + 1 2 (z ?h( )) T R ?1 z (z ?h( )) (4) where = (m T ; x T ) T is the vector of parameters to be estimated, d and z are all odometry and all bearing measurements stacked into vectors, and f() and h() are all predicted odometry and bearing measurements stacked into vectors.The rst term in (4) penalizes robot motion that does not agree well with odometry measurements and the second term penalizes robot motion and landmark map combinations that do not agree well with bearing measurements.
Taking the rst and second derivatives of (4) we nd r 2 E = ?r 2 fR ?1 d d + J T f R ?1 d J f ?r 2 hR ?1 z z + J T h R ?1 z J h (6) where J f = r f and J h = r h are the Jacobian matrices of the measurement equations for odometry and bearings, d = (d 1 ?f(x 0 ; x 1 ); d 2 ?f(x 1 ; x 2 ):::) T is the di erence between measured odometry and odometry predicted by the model parameters, and z is the di erence between measured and predicted bearings.These last two quantities are analogous to innovations in a Kalman ltering context.When the innovations are small the Jacobian inner products dominate expression (6).The Jacobian term will also dominate when the measurement equations are approximately linear (or linear), since r 2 f or r 2 h will be small (or zero).In our case, f is in fact linear, but h contains an arctangent.Bundle adjustment drops the second derivative terms and approximates the Hessian of the cost function using only the Jacobian inner products, i.e.H = r 2 J T f R ?1 d J f + J T h R ?1 z J h : (7) Note that the true H is the Fisher Information Matrix when is the true parameter vector.We will use it as the inverse covariance matrix for estimating uncertainties.Newton iterations are used to minimize (4) by solving H k = ?(JT f R ?1 d d + J T h R ?1 z z ) (8) and then computing the update k+1 = k + k (9) In general solving equation ( 8) is cubic in the number of model parameters (3M + 2N here).However, photogrammetrists have known for decades how to speed up bundle adjustment using the sparse nature of the structure from motion problem.We can take a similar approach here by exploiting the nature of the bearing-only SLAM problem.
Each bearing measurement depends only on the landmark being measured and the robot pose at the time that the measurement is taken.Therefore, each row of the Jacobian of the bearing measurement equation, J h , has nonzero entries only for the columns corresponding to the parameters which represent that landmark position and robot pose.The sparse structure is shown in Figure 2(a).The Jacobian of the odometry measurement equation, J f , depends only on two consecutive robot poses.Each row of the Jacobian of the odometry measurement equation, then, has nonzero entries only for the robot poses before and after the motion.Odometry contains no information about landmark positions and corresponding columns of J f are zero.The Jacobian has the structure shown in Figure 2(b).Because odometry and bearing measurement errors are assumed to be uncorrelated, R d and R z (and their inverses) are block diagonal.The Hessian H = J T f R ?1 d J f + J T h R ?1 z J h has the sparse structure shown in Figure 2(c).The upper left is a block tridiagonal matrix U, the lower right is a block diagonal, and the upper right and lower left are rectangular matrices W and W T .We can write the equation ( 8) as where the Hessian H, the parameter update , and the right hand side have been partitioned.Now we premultiply both sides by I 0 and solve equations (11) in two steps.We rst solve the bottom equation to nd x and then substitute it into the top equation, rearranging to get U m = (m) ?W x (13) and solve for m.Computation of (V ?W T U ?1 W) is O(MN 2 ), and solving (12) is O(N 3 ).Substituting x into (13) and solving only requires O(MN) due to the block tridiagonal structure of U, so the overall complexity of bundle adjustment for bearing-only SLAM is O(MN 2 +N 3 ), which scales linearly with the size of the trajectory.The complexity of bundle adjustment is larger than that of a Kalman lter, which has a complexity of O(N 3 ) for N landmarks, but it is not complex as a general inverse Hessian approach which might require O((N + M) 3 ).

Results
Results were generated for synthetic and real data.In the synthetic example, we simulated a robot driving in a large circle of radius 100 meters, with 50 landmarks scattered uniformly in a square area circumscribing the robot path.
Odometry was simulated by computing the true robot motion and adding Gaussian noise to the estimates of along-track and cross-track motion to simulate slip as well as the rotation.Figure 3 shows estimates of robot motion and landmark map from a Kalman lter and from a bundle adjustment.In each gure, the robot trajectory is the circular path of dots and the landmarks are scattered inside and outside the path plotted with their corresponding uncertainty ellipses.
Both methods were also used to analyze real data from an RWI ATRV.Unfortunately there is no ground truth for this experiment.The robot drove in an approximately elliptical path between six di erent arti cial landmarks.
There was a problem with the drive mechanism which caused bias in the odometry measurements.Dead reckoning estimated that the robot drove in a spiral shaped path.Bundle adjustment results are shown in Figure 4(a).Without modelling the bias the bundle adjustment recovers the elliptical path of the robot.
The Kalman lter diverges in the presence of the odometry bias.The odometry was calibrated by tting a linear function of the data to the bundle adjustment trajectory estimate.With the calibrated odometry and an initial model from bundle adjustment using the rst 20 robot poses, the Kalman lter was able to produce the result in Figure 4(b).The lower left portion of the trajecory shows some reconstruction behavior which is most likely due to the introduction of the lower left feature.Since it processes one measurement at a time the Kalman lter does not initialize features well with bearing-only measurements.

Synthesis
The properties of the methods discussed above and the performance seen in experiments suggests a new approach to bearing-only localization and mapping.
We wish to have a recursive formulation so that the computational burden does not grow with the number of sensor scans made, but we also wish to avoid the bias introduced in the linearization phase in the Kalman lter by considering batches of data rather than individual measurements.
A way to accomplish this is to process data in batches, using observations made during a time interval and computing an estimate for the map and for the pose of the robot during the time interval.At the end of the time interval, the state estimate and all of the data from the time interval is collapsed onto a su cient statistic, a mean and covariance for the map of landmark features and the last robot pose.
The very rst batch of data can be processed in exactly the same way as was described for bundle adjustment.At the end of that estimation, the information matrix will have the same sparse structure as before and can be partitioned as it was in equation (10).We will further partition the U matrix into the block dealing with robot pose from time 1 through k ? 1, the block dealing with robot pose at time k, and the corresponding o -diagonal blocks as follows In order to marginalize this matrix and remove the states m 1 ::m k , we compute 1::k?1;k W T 1::k?1 U ?1 1::k?1;1::k?1 U 1::k?1;k W 1::k?1 (15) The state (m k ; x) and the marginalized information matrix P ?1  k are used as a su cient statistic in the next step of the lter.In general, P ?1  k will be a full matrix, it will no longer have the sparse structure of the Hessian from bundle adjustment due to the marginalization step.
As observations are made from new robot positions, the state vector grows to include fm k ; m k +1; m k +2; :::; xg and the information from the bearing and odometry measurements are added into the information matrix, growing the inverse covariance matrix up and to the left.The upper left block will once again have a block tridiagonal form which can be used to provide fast optimization of the parameters.Since the blockwise elimination and backsubstitution done in equations ( 10) through (13) ll in the lower right block anyway, the computational complexity is not a ected by lling in the block in the previous step.The update step of the new recursive/batch lter is O(N 3 + kN 2 ) where k is the size of the time window used for the batch update.For small xed k this is approximately the same computational complexity as the Kalman lter.The algorithm is similar to one reported recently for SFM 13].
Results of applying this lter with a batch size of 10 to the experiment discussed earlier are shown in Figure 5(a).The uncertainty ellipses on each robot pose are plotted as computed at the end of a batch update, so smoothing does not take place over the whole trajectory as it does with full bundle adjustment.Due to this the groups of 10 poses per batch are apparent from the

Discussion
The examples here illustrate the advantages of bundle adjustment over Kalman ltering for the bearing-only SLAM problem.When Kalman lters are used, they need to be initialized with a good estimate of the robot state and landmark locations and bundle adjustment is an e cient way to compute optimal least squares estimates in order to initialize the lter.Bundle adjustment is empirically more robust to bias in the system model, which is to be expected since the linearizations are recomputed each time and are computed near the optimum.We did not investigate robustness to outliers since there exist both Kalman lters and bundle adjustment techniques speci cally designed to deal with outliers 14, 6] but these were not implemented.
For computational and memory requirements, bundle adjustment is not practical for large problems.Because the entire observation history must be recorded, the memory requirement scales as O(MN).By contrast the Kalman lter requires O(N 2 ) memory.The Kalman lter requires O(N 3 ) computation for each update.Computation required in bundle adjustment is O(N 3 +MN 2 ) which grows linearly in the number of robot positions in the estimate.This is signi cantly less than a general second order least squares optimization but still not practical for a robot operating for long periods of time.
By combining Kalman ltering and bundle adjustment concepts, a new recursive/batch lter was introduced.The lter allows a exibility in the batch size, trading o computational requirements and performance.If the batch size is set to 1, the extended Kalman lter results, and if the batch size is unlimited, full bundle adjustment results.

Figure 1 .
Figure 1.Bearing-only sensor projects landmark position onto unit circle and real data.To date, little has been published in terms of direct experimental comparison of existing techniques for solving this problem.

Figure 2 .
Figure 2. Sparse structure of derivatives.(a) Jacobian of bearing measurement equation.(b) Jacobian of odometry measurement equation.(c) Approximation to the Hessian of the cost function.When the innovations are small the Jacobian inner products dominate expression(6).The Jacobian term will also dominate when the measurement

Figure 5 .
Figure 5. (a) map and trajectory estimated by recursive/batch lter using a batch size of 10.(b) result superimposed with result from bundle adjustment to show agreement.uncertainty ellipses.Figure 5(b) shows the results from bundle adjustment and recursive/batch methods plotted against each other for comparison.Results for the example considered are very similar for the two methods.