Automatic Calibration of Spinning Actuated Lidar Internal Parameters

Actuated lidar, where a scanning lidar is combined with an actuation mechanism to scan a three‐dimensional volume rather than a single line, has been used heavily in a wide variety of field robotics applications. Common examples of actuated lidar include spinning/rolling and nodding/pitching configurations. Due to the construction of actuated lidar, the center of rotation of the lidar mirror may not coincide with the center of rotation of the actuation mechanism. To triangulate a precise point cloud representation of the environment, the centers of rotation must be brought into alignment using a suitable calibration procedure. We refer to this problem as estimating the internal parameters of actuated lidar. In this work, we focus on spinning/rolling lidar and present a fully automated algorithm for calibration using generic scenes without the need for specialized calibration targets. The algorithm is evaluated on a range of real and synthetic data and is shown to be robust, accurate, and have a large basin of convergence.


INTRODUCTION
Lidar (also known in the literature as LADAR or single-line laser range finder, LRF) has proven to be one of the most useful sensors for field robotics. It has found broad application to tasks including simultaneous localization and mapping (SLAM), obstacle avoidance, object detection/recognition, and ground surface estimation, among others. This is evident in its early adoption in robotics research as a primary three-dimensional (3D) sensor (Hebert & Krotkov, 1992; Singh & West, 1991) and its sustained use to date (Mertz et al., 2013). This is not surprising given the accuracy of lidar measurements at long range as well as the ease and flexibility of its construction.
In comparison to other range sensors, lidar's accuracy is on the order of a centimeter even at long range (Wong et al., 2011). Construction is relatively simple, relying on a motor, gear train, and an encoder (Hebert & Krotkov, 1992; Wulf & Wagner, 2003). Moreover, the approach is flexible and allows for a wide range of customization for the application at hand. The two most common actuation schemes are spinning/rolling and nodding/pitching, although others such as a two-axis wobbling lidar are possible. Spinners rotate the lidar about an axis parallel to the viewing direction and are most useful for tunnels or corridor-like environments (Fairfield, 2009; Thrun et al., 2003). Nodders, on the other hand, rotate the sensor around an axis in the scanning plane orthogonal to the viewing direction and are commonly used for general field robotics applications, where a higher density of points on the ground plane is desired (Stentz, Bares, Pilarski, & Stager, 2007).
The operating principles of actuated lidar are simple. A single line scanning lidar, such as a SICK or a Hokuyo, is mounted on an actuation platform. By actuating the lidar's scanning plane, we can sweep a 3D volume. Due to the mechanical construction of the sensor, the center of actuation of the motor may not coincide with the center of rotation of the lidar's spinning mirror. Hence, in order to obtain precise 3D measurements, the transformation between the two centers of rotation must be estimated. Once this transformation has been estimated, a point cloud can be triangulated from a single effective scanning point. We refer to this problem as estimating the geometric internal parameters of the sensing assembly, as illustrated in Figure 1. Once these parameters have been identified and estimated, the sensing assembly is geometrically calibrated.
This estimation of internal parameters, referred to simply as calibration in this work, is performed offline as a prerequisite for many robotic applications (Fairfield, 2009; Kelly et al., 2011; Stentz et al., 2007). While the calibration of the sensor is an important step in these applications, the geometric calibration problem has been largely neglected. Prior work has focused on two main problems involving lidar. One is sensor performance characterization in different operational settings (Alwan, Wagner, Wasson, & Sheth, 2005; Kneip, Tâche, Caprari, & Siegwart, 2009; Okubo, Ye, & Borenstein, 2009). The other is estimating the calibration with respect to rigidly mounted sensors on the robot, such as cameras (Alismail, Baker, & Browning, 2012; Unnikrishnan & Hebert, 2005).

Figure 1. Overview of the calibration problem. We seek to find a rigid body transformation A_H_L between the lidar's frame L and the actuator's frame A. Note that the coordinate system is arbitrary and illustrated here for concreteness. Our calibration algorithm and implementation are not tied to a specific coordinate system convention.
In recent years, multibeam sensors (e.g., Velodyne) have proven popular in robotics research, especially in the context of autonomous driving (Urmson et al., 2008). This is evident in the surge of research on calibration methods for multibeam lidar. Methods for Velodyne calibration can be categorized based on whether they require a calibration target (Levinson & Thrun, 2014; Mirzaei, Kottas, & Roumeliotis, 2012; Muhammad & Lacroix, 2010). Targetless calibration methods are highly desirable as they greatly simplify the calibration process. Aside from a targetless calibration for the Velodyne (Levinson & Thrun, 2014), Sheehan et al. (Sheehan, Harrison, & Newman, 2012) propose a method to calibrate three rigidly coupled lidar units. The three units are mounted on a rotating plate to achieve a 360° view of the environment. The algorithm is based on maximizing the quality (or crispness) of the resulting 3D point cloud via entropy optimization. Entropy-based methods have been shown to be robust and accurate for point-cloud registration (Jian & Vemuri, 2011; Tsin, 2003), but they are sensitive to the sampling uniformity of the point clouds. Hence, entropy-based methods may not be suitable for sensors producing highly nonuniform sampling, such as the actuated lidar considered in this work.
Also related to our work is the recently proposed calibration method for a two-axis scanner by Dong et al. (Dong, Anderson, & Barfoot, 2013). The authors rely on forming an image of the scene via lidar intensity returns. Having obtained this image, they model the lidar calibration problem as lens distortion estimation (under spherical projection) and solve it with Bundle Adjustment (Triggs, McLauchlan, Hartley, & Fitzgibbon, 2000). Other examples of using intensity images can be found in the work by Pradeep, Konolige, & Berger (2010), who jointly calibrate a nodding/tilting Hokuyo to other sensors on the robot.
While the aforementioned sensors require an actuation step to obtain 3D point clouds, their calibration methods do not apply to actuated lidar sensors discussed in this work due to the different geometry of the sensing mechanism. Notwithstanding, actuated lidar remains an important 3D sensor for robotic applications. For example, in contrast to a Velodyne, the field of view of actuated lidar is highly customizable to fit the application at hand. It is also cheaper and more compact for deployment on various robotic platforms.
Here, we tackle the problem of calibrating a spinning/rolling actuated lidar. A spinning lidar is constructed by rigidly mounting a laser scanner on a continuously spinning motor. Since it is not physically possible to mount the laser at the center of rotation of the motor, some mechanical offsets exist and must be estimated via a suitable calibration procedure. This calibration is performed offline prior to using the data for 3D perception purposes. It is normally performed once and may be repeated if evidence indicates loss of calibration. The problem amounts to estimating a (possibly constrained) rigid-body transformation accounting for the offsets between the center of rotation of the lidar's spinning mirror and the center of rotation of the actuating motor.
In practice, calibration is often performed manually, either by measuring the offsets on the sensor using a tape measure, relying on computer-aided design (CAD) models, or a combination of both (Scherer et al., 2012; Weingarten, 2006). Assessment of the calibration quality and accuracy is usually performed by visual verification. Visual or manual verification is an important part of quality control, but it does not provide a confidence measure associated with the estimated parameters. While a careful manual calibration can be done with reasonable accuracy, it is time-consuming and error-prone. It also does not allow verifying the calibration by relying on the repeatability properties of an automated algorithm. Hence, an automated algorithm is highly desirable and offers clear benefits over its manual counterparts.
In this work, we develop an algorithm to estimate the internal parameters for a spinning actuated lidar without requiring special calibration targets. Our approach is accurate and convenient. We make the weak assumption that the environment consists of (mostly) locally smooth surfaces at the scale of a local neighborhood of the input lidar point cloud. That is, over some small subset of neighboring lidar points, the surface can be well approximated by a plane, as is often implicitly assumed for point-cloud registration [e.g., with point-plane iterative closest point (ICP)]. The automation of the algorithm makes calibrating the internal parameters of the actuated lidar an easy and fast affair. The method presented here is based on the work of Fairfield (2009), in which the author designs an automated algorithm tailored for tunnel-like environments.
This work is an extension of our previous work, in which we contribute the following enhancements: (i) increased robustness by filtering correspondences and automatically estimating the neighborhood size for normal extraction, (ii) a method to estimate a covariance associated with the estimated parameters, and (iii) experiments on synthetic and real data assessing the quality and repeatability of the algorithm under various circumstances. We also address the applicability of the method to different lidar actuation mechanisms. Finally, we make the calibration code available to the research community. To the best of our knowledge, this is the only work in the literature that provides a generic, convenient, and accurate method to recover the internal calibration parameters of a spinning actuated lidar.
In the next section, we describe the basic principles of actuated lidar for 3D perception to set the stage for our automated calibration approach.

ACTUATED LIDAR FOR 3D PERCEPTION
Laser rangefinders, discussed in this work, operate by the time-of-flight principle. A narrow laser beam of an appropriate wavelength is shot through a spinning mirror. The spinning mirror transforms the 1D beam of light into a 2D scanning plane. Laser beams forming the scanning plane hit objects in the scene and return to the sensor. Given the time it takes for each beam to return, combined with the known mirror angle and the speed of light in the medium, we can obtain range measurements relative to the center of rotation of the spinning mirror. Not all beams return to the lidar. Some never hit a reflective object in the scene, while others might return multiple times.
Depending on the lidar and the application at hand, multiple returns can be a useful phenomenon (Renslow, Greenfield, & Guay, 2000). In other applications, however, multiple returns can obscure important information and must be filtered out (Meng, Currit, & Zhao, 2010). In this work, multiple returns are not a cause for concern. We only work with valid returns (the ones that do return), and we deal with multiple returns by simply using the first. Thus, we assume calibration in an environment without significant dust or precipitation, which we find to be reasonable in practice. Extending the method with robust cost functions to reject multi-echo responses (up to some limit) would be relatively straightforward, but is unlikely to be needed in practice.
A simple solution to obtaining 3D scans is to actuate the lidar about an axis nonparallel to its scanning mirror such that we sweep the scanning plane across a volume in space. For this approach to work, we need to know two pieces of information. One is the pose of the actuator. This is typically a rotation about some known axis with a known measured angle, and is readily available by reading off the encoder value attached to the motor. The other is the offset between the center of rotation of the lidar's mirror and that of the actuation motor. This is what we refer to as the calibration of internal parameters, the estimation of which is the subject of this work.
Estimating the internal parameters is often performed manually, either by manual measurements or using a CAD model for guidance. When carefully performed, manual calibration can provide acceptable results. Nonetheless, it is laborious, error-prone, and not repeatable. This is notably true when the CAD models for the sensor may not be available or when manufacturing tolerances do not adhere to the required accuracy. Hence, it is necessary to be able to estimate these parameters using an accurate, robust, and repeatable algorithm with no user intervention.

Coordinate Conventions
For concreteness and clarity of presentation, we will adopt the common camera coordinate system conventions used in Computer Vision. The right-handed coordinate system is shown in Figure 1, where the Z-axis points forward along the viewing ray, the X-axis points to the right, and the Y-axis points downward. Our algorithm and implementation do not require a specific coordinate system.

Calibration Parameters for a Spinning Lidar
Actuated lidar calibration amounts to estimating a rigid-body transform between the instantaneous center of rotation (ICR) of the actuator and that of the lidar's spinning mirror, where A and L denote the actuator and lidar frames, respectively.
Consider a spinning lidar assembly with the coordinate system shown in Figure 1. A 2D point in the lidar frame is obtained by converting a range measurement ρ from polar to Cartesian coordinates using the current position of the lidar rotating mirror θ. This in-plane point in homogeneous coordinates is given by

$$ {}^{L}\mathbf{x}(\rho, \theta) = \left( \rho \sin\theta,\; 0,\; \rho \cos\theta,\; 1 \right)^{\top}. $$

To obtain a triangulated point in space, we apply the actuator's rotation associated with the in-plane point. In our example, this is a rotation about the Z-axis. However, for the triangulation to be performed correctly, the in-plane point must first be transformed from the lidar's frame to the actuator's frame:

$$ {}^{A}\mathbf{x} = {}^{A}\mathbf{R}_{z}(\phi)\; {}^{A}\mathbf{H}_{L}\; {}^{L}\mathbf{x}, $$

where ${}^{F}\mathbf{R}_{a}(\phi)$ denotes a rotation matrix with angle φ about the unit axis a in coordinate frame F. Finally, the calibration transform is the 4 × 4 matrix ${}^{A}\mathbf{H}_{L}$ that maps points from the lidar's frame L to the actuator's frame A. This is the unknown quantity we seek to estimate.
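As a concrete illustration, the following sketch implements this triangulation in Python under our coordinate conventions; the function names are ours and not part of any library.

```python
import numpy as np

def rot_z(phi):
    """Homogeneous rotation about the Z-axis by angle phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0, 0],
                     [s,  c, 0, 0],
                     [0,  0, 1, 0],
                     [0,  0, 0, 1]])

def triangulate(rho, theta, phi, H):
    """Map a raw measurement to a 3D point in the actuator frame A.

    rho   -- range measurement (meters)
    theta -- lidar mirror angle (radians)
    phi   -- actuator (motor) angle (radians)
    H     -- 4x4 calibration transform mapping lidar frame L to actuator frame A
    """
    # In-plane point in the lidar frame (Z forward, X right, Y down).
    x_L = np.array([rho * np.sin(theta), 0.0, rho * np.cos(theta), 1.0])
    # Move into the actuator frame, then apply the actuator rotation.
    return rot_z(phi) @ H @ x_L
```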

Degrees of Freedom (DOF)
In the case of an actuated lidar, the calibration is not a general 6 DOF rigid-body transform due to the restricted motion pattern of the mechanism. In our example of a spinning lidar, the model is at most 5 DOF, corresponding to three rotational degrees of freedom and two translations. The translations are along the X-axis (left-right sliding) and the Y-axis (up-down sliding). Translation along the rotation axis, the Z-axis, is unobservable. We may simplify the calibration model further if the scanner's pointing direction is aligned with the motor's spinning axis; in this case, a rotation about the Z-axis is underconstrained. Table I summarizes the degrees of freedom for commonly used actuated lidar sensors.
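A minimal sketch of how this constrained 5-DOF model might be assembled from a parameter vector; the Euler sequence and composition order are illustrative assumptions, not a prescription.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def params_to_H(rx, ry, rz, tx, ty):
    """Build the 4x4 calibration transform from the 5-DOF parameters.

    Rotations are Euler angles in radians; t_z is fixed at zero because
    translation along the spin axis is unobservable.
    """
    H = np.eye(4)
    H[:3, :3] = Rotation.from_euler('xyz', [rx, ry, rz]).as_matrix()
    H[:3, 3] = [tx, ty, 0.0]
    return H
```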

ALGORITHM
The algorithm exploits the periodicity of the actuation mechanism to partition a stream of data from a stationary sensor into spatially overlapping sets. We assume that a measurement of the same surface patch in each set is performed with different joint angles (the actuation angle and the lidar mirror angle). This assumption holds for spinning lidars and some two-axis systems. It will not, in general, hold for nodding systems, which are unable to measure the same spot from two different joint angles; this limitation of the approach is discussed later.

Consider a spinning lidar, where the lidar rotates about its viewing axis (cf. Figure 2). Without loss of generality, let the first scanned point be at motor position 0.0 rad and let a full-scan be one with motor position at 2π rad. The periodicity of such an actuation mechanism causes a symmetry for every half-scan. That is, we can split the full scan into two scans such that both sample the same surfaces. Namely, we have two point sets as a function of the motor angle φ,

$$ \mathcal{X} = \left\{ \mathbf{x}(\rho, \theta, \phi) : \phi \in [0, \pi) \right\}, \qquad \mathcal{X}' = \left\{ \mathbf{x}(\rho, \theta, \phi) : \phi \in [\pi, 2\pi) \right\}. $$

When the sensor is stationary, these two point sets sample the same surfaces in space, but the joint angles associated with the measurements will be unique for the same physical spot. Hence, the calibrating transform is the one under which both point sets are aligned,

$$ {}^{A}\mathbf{H}_{L}^{*} = \arg\min_{\mathbf{H}} \; f\left( \mathcal{X}, \mathcal{X}'; \mathbf{H} \right), $$

where f is a function of the dissimilarity between the two point sets. This transform does not depend on the specific details of the actuation mechanism, such as the acceleration rate of the motor or the type of motion, provided that we can split the full scan into two halves. In practice, while the two half-scans sample the same surfaces, there are no strict point-point correspondences. This is due to the continuous nature of the actuation motion and sampling artifacts. Here, we use a point-plane distance as the dissimilarity function (Chen & Medioni, 1992). Our optimization objective, therefore, takes the following form:

$$ {}^{A}\mathbf{H}_{L}^{*} = \arg\min_{\mathbf{H}} \sum_{i} w_i \left[ \mathbf{n}_i^{\top} \left( \mathbf{x}_i - \mathbf{x}_{\varphi(i)} \right) \right]^2, \tag{7} $$

where $\mathbf{n}_i = (n_{x_i}, n_{y_i}, n_{z_i}, 0)^{\top}$ is a unit normal at point $\mathbf{x}_i$ from the first half-scan, $w_i \in [0, 1]$ is a weight indicating our confidence in the estimated normal/correspondence, and $\mathbf{x}_{\varphi(i)}$ is the corresponding point to $\mathbf{x}_i$ from the other half-scan, where $\varphi : \{1, \ldots, |\mathcal{X}|\} \to \{1, \ldots, |\mathcal{X}'|\}$ is a function that returns the index of the point corresponding to $\mathbf{x}_i$ in the set $\mathcal{X}'$. This correspondence function is based on the closest point, as commonly done in registration problems (Fitzgibbon, 2003). To increase efficiency and robustness, we use bijective correspondences with ties broken based on distance. In the expression above, we dropped the explicit specification of the coordinate frame for clarity; both point sets are triangulated with the current calibration $\mathbf{H}$.
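The following sketch illustrates the weighted point-plane residual of Eq. (7) with a simplified version of the bijective closest-point scheme; it is an illustration of the idea, not our exact implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_plane_residuals(X, Xp, normals, weights):
    """Weighted point-plane residuals between two half-scans.

    X, Xp   -- (N,3) and (M,3) half-scan points triangulated with the
               current calibration (3D, non-homogeneous, for simplicity)
    normals -- (N,3) unit normals estimated on X
    weights -- (N,) confidence weights in [0,1]
    """
    dists, idx = cKDTree(Xp).query(X)      # closest point in Xp for each x_i
    # Enforce bijective correspondences: when several points claim the same
    # target, keep only the nearest (ties broken by distance), drop the rest.
    order = np.argsort(dists)
    used = np.zeros(len(Xp), dtype=bool)
    keep = np.zeros(len(X), dtype=bool)
    for i in order:
        if not used[idx[i]]:
            used[idx[i]] = True
            keep[i] = True
    diff = X[keep] - Xp[idx[keep]]
    # Residuals sqrt(w_i) * n_i . (x_i - x'_phi(i)); their summed squares
    # reproduce the weighted objective of Eq. (7).
    return np.sqrt(weights[keep]) * np.einsum('ij,ij->i', normals[keep], diff)
```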
This nonlinear weighted least-squares optimization problem can be solved efficiently with standard methods given a reasonable starting point. In the following, we discuss the details of our implementation.

Rotation Parametrization
Rotation estimation and parametrization is a well-studied problem in Robotics and Estimation theory. The main complication is the nonlinearity of the configuration space, the special orthogonal Lie matrix group SO(3). An excellent discussion of the various representations and associated distance metrics on the manifold can be found in the works of Kuffner (2004) and Hartley, Trumpf, Dai, & Li (2013).
In this work, since rotational offsets are expected to be small and we would like to constrain the estimated parameters, we opt for the representation with the most intuitive physical interpretation: Euler angles. Euler angles allow us to constrain the optimization in a straightforward manner. They also keep derivative computations relatively simple (Kelly, 1994). Extending our approach to other rotation representations is straightforward.
Euler angles, however, are susceptible to the well-known singularity of gimbal lock (when two of the gimbal axes become aligned). Nonetheless, we are almost guaranteed to avoid this singularity because the orientation offsets are small for this type of calibration problem. If the orientation misalignment is large, a rough initialization may be required. Alternatively, one could use more robust rotation representations.

Optimization
The cost function [Eq. (7)] is a nonlinear function of the parameters. While it is possible to linearize the rotations explicitly, we choose to solve directly for the parameters using standard nonlinear least-squares methods such as Levenberg-Marquardt (LM) (Levenberg, 1944; Marquardt, 1963). In most cases, the offsets are small and an initialization with the identity (${}^{A}\mathbf{H}_{L} = \mathbf{I}_4$) is close to the local minimum. Hence, the user is spared any manual measurements to initialize the optimization.
In addition to obtaining a solution for the estimated parameters, we would like to estimate the covariance associated with these parameters. Let the vector of parameters be θ. The Jacobian is the matrix of partial derivatives of the error function with respect to θ,

$$ \mathbf{J} = \frac{\partial \mathbf{e}(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}, $$

where e(θ) is the error function defined as the difference between the measurement vector z and the prediction f(θ),

$$ \mathbf{e}(\boldsymbol{\theta}) = \mathbf{z} \boxminus \mathbf{f}(\boldsymbol{\theta}), $$

where $\boxminus$ indicates the difference in vector space. Let the covariance of each measurement be $\boldsymbol{\Sigma}_i$, and let $\boldsymbol{\Sigma}$ be a block-diagonal concatenation of the measurement covariance matrices, i.e.,

$$ \boldsymbol{\Sigma} = \operatorname{diag}\left( \boldsymbol{\Sigma}_1, \ldots, \boldsymbol{\Sigma}_n \right). $$

Then, upon convergence of LM we arrive at the Fisher information matrix:

$$ \mathbf{I}(\boldsymbol{\theta}) = \mathbf{J}^{\top} \boldsymbol{\Sigma}^{-1} \mathbf{J}. \tag{12} $$

To obtain a covariance associated with the estimated parameters, we simply invert the information matrix. This is the Cramér-Rao lower bound (Haralick, 2000; van de Geer, 2005). The estimate of the covariance is rather crude and may overestimate the true covariance (Censi, 2007). Nonetheless, it is useful for assessing the accuracy and reliability of the solution by analyzing the shape of the local extremum, as will be demonstrated in Section 4.
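In code, the parameter covariance can be recovered from the Jacobian returned by an off-the-shelf LM solver. A sketch using SciPy, assuming the measurement covariances have already been folded into the residual weights (residual_fn is the hypothetical residual function from the sketch above):

```python
import numpy as np
from scipy.optimize import least_squares

def solve_with_covariance(residual_fn, theta0):
    """Solve the calibration with Levenberg-Marquardt and recover the
    parameter covariance. residual_fn(theta) must return the weighted
    point-plane residual vector; theta0 is typically the identity
    calibration."""
    result = least_squares(residual_fn, theta0, method='lm')
    J = result.jac             # Jacobian at convergence
    info = J.T @ J             # Fisher information, Eq. (12), with weights folded in
    cov = np.linalg.inv(info)  # Cramer-Rao lower bound on the parameters
    return result.x, cov
```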

Normal Extraction
Our optimization function [Eq. (7)] depends on the quality of the estimated normals. A common approach to determining a local tangent-plane normal on unorganized point sets is based on the eigenvalue decomposition of the covariance matrix in a specified neighborhood. Let $\boldsymbol{\Sigma}(\mathbf{x}; r)$ be the (weighted) covariance matrix estimated in a neighborhood of radius r,

$$ \boldsymbol{\Sigma}(\mathbf{x}; r) = \sum_{\mathbf{x}_j \in N(\mathbf{x}; r)} w\left( \| \mathbf{x}_j - \mathbf{x} \| \right) \left( \mathbf{x}_j - \boldsymbol{\mu} \right) \left( \mathbf{x}_j - \boldsymbol{\mu} \right)^{\top}, \tag{13} $$

where N(x; r) denotes the neighborhood of radius r for the point x, μ is a weighted centroid estimated within the same neighborhood, and $w : \mathbb{R} \to [0, 1]$ is a monotonically decreasing function of distance, such as the Gaussian or the squared exponential:

$$ w(d) = \exp\left( -d^2 / \sigma^2 \right). $$

This weighting function is normalized such that $\sum_j w_j = 1$. The tangent-plane normal can then be estimated as the eigenvector corresponding to the minimum eigenvalue of the covariance matrix $\boldsymbol{\Sigma}(\mathbf{x}; r)$. Such a method for computing surface normals is common in computer graphics and surface reconstruction (Amenta & Kil, 2004). Further, an estimate of the goodness of planarity can be obtained by analyzing the eigenvalues (Bosse & Zlot, 2009; Bosse, Zlot, & Flick, 2012). Let the three eigenvalues of $\boldsymbol{\Sigma}(\mathbf{x}; r)$ be $\lambda_3 \geq \lambda_2 \geq \lambda_1$. Then

$$ p_c = \frac{2\left( \lambda_2 - \lambda_1 \right)}{\lambda_1 + \lambda_2 + \lambda_3} $$

provides a confidence estimate of the degree of planarity, where $p_c \to 1$ indicates higher confidence in the estimated normal.
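A compact sketch of this normal-extraction step follows; the squared-exponential length scale sigma is left as a free parameter here, and in practice it is tied to the adaptive neighborhood radius described next.

```python
import numpy as np

def normal_and_confidence(neighbors, query, sigma):
    """Estimate the tangent-plane normal and planarity confidence from the
    (K,3) neighbors of a query point. sigma is the length scale of the
    squared-exponential weight (an assumption; any monotonically
    decreasing weight would do)."""
    d = np.linalg.norm(neighbors - query, axis=1)
    w = np.exp(-d**2 / sigma**2)
    w /= w.sum()                                # normalize the weights
    mu = w @ neighbors                          # weighted centroid
    diff = neighbors - mu
    C = (w[:, None] * diff).T @ diff            # weighted covariance, Eq. (13)
    lam, vec = np.linalg.eigh(C)                # eigenvalues in ascending order
    normal = vec[:, 0]                          # eigenvector of smallest eigenvalue
    p_c = 2.0 * (lam[1] - lam[0]) / lam.sum()   # planarity confidence
    return normal, p_c
```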

Adaptive Neighborhood Size
The quality and accuracy of normal estimation depend on the extent of the neighborhood used to compute the covariance [Eq. (13)]. Too large a radius increases the chance of including points that do not belong to the local plane, consequently smoothing out the estimated normals. In contrast, too small a radius increases the likelihood of capturing the noise in the data rather than surface details, thereby roughening the estimated normals. There exist specialized methods for estimating the scale for normal extraction, or other surface features. However, such methods can have high computational requirements. For example, the algorithm described in Unnikrishnan, Lalonde, Vandapel, & Hebert (2010), while producing excellent results, requires a large amount of memory for the computation of geodesic distances.
In this work, we choose a simple method for scale selection. We select the neighborhood size based on the largest distance to the kth neighbor. The intuition is that the distance to the kth neighbor is an indication of the sampling density. Normal vectors estimated at points with higher sampling densities will be given a smaller neighborhood size. These points are usually close to the lidar where more details can be detected. At longer range, or oblique scanning angles, the sampling density is lower. Hence, a larger radius is assigned, thereby increasing the robustness of normal estimation.
The implementation of this strategy is straightforward. For every 3D point, we need two nearest-neighbor computations: the first to select the scale/search radius, and the second to select points within that radius. Nearest-neighbor search is performed via a KD-tree (Muja & Lowe, 2014) to keep the computation efficient.
This strategy was found to produce normals with better quality than using a fixed radius or a fixed number of neighbors in qualitative comparisons, outweighing the cost of an additional KD-tree lookup per point. In this work, we found k = 50 to be a good value.
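The two nearest-neighbor passes might look as follows; the radius rule (distance to the kth neighbor) matches the text, while the helper name is ours.

```python
import numpy as np
from scipy.spatial import cKDTree

def adaptive_radii(points, k=50):
    """Per-point search radius set to the distance to the k-th neighbor.

    Dense regions (close range) get small radii; sparse regions (long range,
    oblique angles) get larger ones. k=50 worked well in our experiments.
    """
    tree = cKDTree(points)
    # Query k+1 neighbors: the first is the point itself at distance 0.
    dists, _ = tree.query(points, k=k + 1)
    return tree, dists[:, -1]

# Second pass: gather each neighborhood for normal estimation, e.g.,
#   neighbors_i = tree.query_ball_point(points[i], r=radii[i])
```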

Summary
A summary of the algorithm is shown in Algorithm 1. Starting from an initialization H(0), we compute weighted normals and point correspondences based on the closest distance. These plane normals and correspondences are held fixed in the inner loop to produce a calibration H(k+1). This process continues until convergence. The process has to be repeated in this fashion because the shape of the point cloud depends on the value of the calibration: each estimate of the calibration produces a different point cloud, and consequently different normals. Convergence is declared when a maximum number of iterations is reached or when the change in the estimated parameters is too small.

Algorithm 1. Overview of the automated calibration algorithm
1: function Calibration(X, X'; H(0))
2:   repeat
3:     {X, X'} = ApplyCalibration(X, X', H(k))    ▷ shape of point cloud depends on current calibration
4:     compute weighted normals for X
5:     find bijective closest-point correspondences between X and X'
6:     optimize for H(k+1) with fixed normals and correspondences
7:   until convergence
8: end function
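Putting the pieces together, a sketch of the outer loop of Algorithm 1; the helpers are hypothetical stand-ins for the steps described above.

```python
import numpy as np

def calibrate(scan1, scan2, theta0, max_iters=50, tol=1e-6):
    """Outer loop of Algorithm 1. triangulate_half_scan, estimate_normals,
    and optimize_point_plane are hypothetical stand-ins for the steps
    described in the text."""
    theta = theta0                              # 5-DOF parameters, identity start
    for _ in range(max_iters):
        # The shape of the point cloud depends on the current calibration.
        X = triangulate_half_scan(scan1, theta)
        Xp = triangulate_half_scan(scan2, theta)
        normals, weights = estimate_normals(X)  # adaptive neighborhoods
        # Normals and correspondences are held fixed in the inner loop.
        theta_new = optimize_point_plane(X, Xp, normals, weights, theta)
        if np.linalg.norm(theta_new - theta) < tol:
            break                               # change in parameters too small
        theta = theta_new
    return theta
```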

Synthetic Data
We use synthetic data to obtain a quantitative assessment of the algorithm's accuracy, as there is no readily available and reliable method for obtaining ground truth on a real sensor. Synthetic data also allow us to vary the actuation mechanism and noise magnitude. For ease of visualization and analysis, we use a cube as the simulation environment (shown in Figure 3). The cube's side length is 10 m, with the sensor assembly located at its center.

Synthetic Spinner
A spinner is an actuated lidar with a rolling motion about the pointing direction of the lidar. Figure 3 shows the result of our calibration in a simple noise-free scenario, in which case the algorithm estimates error-free calibration parameters (up to floating-point precision). It is instructive to look at the resulting scanning pattern of the calibrated sensor (right column of Figure 3). Note the gap centered in the middle of the plane. This is the plane orthogonal to the spinning axis and facing directly in front of the lidar. As expected, this is the blind spot of the spinning assembly. The blind spot is inevitable and is caused by offsets between the center of rotation of the lidar's mirror and the center of rotation of the actuator (this blind spot can also be observed in real data, as shown in Figure 9d). The extent of this blind spot is proportional to the mechanical offsets between the two centers of rotation. In contrast, if we assume that the centers of rotation align (left column of Figure 3), there is no blind spot and the geometry is clearly incorrect.
We evaluate the algorithm's accuracy using the synthetic data (shown in Figure 3) by performing the calibration 50 times with different noise levels σ_r. The data are generated from a full revolution of the lidar with a motor resolution of 1.618°, yielding approximately 120,000 points per half-scan. Zero-mean Gaussian noise is added to the returns, where the noise uncertainty is varied as a function of the angle of incidence on the surface; the dependence is calibrated from Velodyne data as in Browning, Deschaud, Prasser, & Rander (2012). We also vary the ground-truth calibration parameters for every run according to a normal distribution, with the t_x and t_y translational offsets distributed with a mean of 5 cm and a standard deviation of 1.618 cm. The synthetic calibration values are large in comparison to real systems: relative to our real sensor, they are more than five times larger along the X-axis (left-right sliding) and more than two times larger along the Y-axis (up-down sliding).

Results are summarized in Figure 4 and show that the algorithm is accurate and repeatable under various magnitudes of noise. In fact, the largest translation error across all experiments is 0.78 mm, while the largest rotation error is 0.03°. Figure 6 shows an example of the input point cloud to the calibration algorithm to illustrate the magnitude of noise, for two values of σ_r: 4 and 64 mm. Typical noise values for lidar are on the order of 10 to 15 mm at most, well under our maximum noise test level of 64 mm (note that this is the standard deviation, not the variance; see Figure 7). These large values were included to stress-test the algorithm's performance. The algorithm is accurate and can handle a large amount of Gaussian noise, even with a standard deviation close to 7 cm.

The consistency of the results can also be viewed in terms of the number of (outer) iterations needed for convergence. Figure 5 shows this number as a function of noise. The maximum number of iterations was capped at 50. For small to moderate levels of noise, the algorithm reaches a solution quickly. For small noise magnitudes, we expect the algorithm to converge in fewer than 10 iterations; otherwise, something may have gone wrong and further investigation might be in order. Yet an unusually large number of iterations is not necessarily an indication of a bad-quality calibration, as shown in Figure 4. In real scenarios, however, we expect to observe less variability in the number of iterations, as noise in real data is lower than in our simulation.
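For reference, a sketch of how incidence-angle-dependent range noise can be synthesized; the specific growth model used here is an illustrative stand-in for the dependence calibrated from Velodyne data in the text.

```python
import numpy as np

def noisy_range(rho, incidence, sigma_r, rng=None):
    """Add zero-mean Gaussian noise to a range return, with the standard
    deviation growing with the angle of incidence on the surface. The
    growth model below is purely illustrative; the experiments calibrate
    the dependence from Velodyne data (Browning et al., 2012)."""
    if rng is None:
        rng = np.random.default_rng(0)
    angle = np.clip(incidence, 0.0, np.radians(80.0))  # avoid tan blow-up
    sigma = sigma_r * (1.0 + np.tan(angle))
    return rho + rng.normal(0.0, sigma)
```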
The number of inner iterations needed for the optimizer to reach a minimum is consistently fewer than 10 regardless of the magnitude of noise. A possible conclusion from this behavior is that, for a given point configuration and an appropriate calibration scene, the optimization problem is well constrained with a sharp extremum.

Basin of Convergence
We use the simulated cube data to study the basin of convergence of the algorithm. This is achieved by generating a synthetic calibration with the most common calibration parameters. We simulate translations along the X- and Y-axes over a grid of values (t_x, t_y) ∈ [0.5, 25.0] × [0.5, 25.0] cm with a resolution of 0.5 cm. All experiments were initialized with the identity. Results are shown in Figure 8. The algorithm is able to recover the correct parameters up to a translation of ≈ 20 cm on each of the X- and Y-axes. Beyond 20 cm, the shape of the point cloud becomes severely distorted; see, e.g., Figure 22. Nonetheless, 20 cm on both axes is a very large calibration offset in comparison to values found on real actuated lidar sensors, and it far exceeds the manufacturing errors commonly observed in practice.
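The basin-of-convergence study reduces to a simple grid sweep; a sketch of the driver loop follows (run_calibration is a hypothetical helper that simulates the cube scene with the given ground-truth offset and runs the algorithm from identity).

```python
import numpy as np

def basin_sweep(run_calibration, lo=0.005, hi=0.25, step=0.005):
    """Sweep ground-truth (tx, ty) offsets over a grid (0.5 cm resolution,
    0.5 to 25 cm range, in meters) and record the recovered error."""
    grid = np.arange(lo, hi + 1e-9, step)
    errors = np.zeros((len(grid), len(grid)))
    for i, tx in enumerate(grid):
        for j, ty in enumerate(grid):
            errors[i, j] = run_calibration(tx, ty)  # e.g., translation error
    return grid, errors
```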

Real Data: Hokuyo Spinner
For real data from a spinner, we use a Hokuyo UTM-30LX-EW laser scanner mounted on a spinning motor (cf. Figure 2). The Hokuyo is configured to have a 270° field of view with an angular resolution of 1/4 of a degree, producing 1,081 measurements per scan line.
Calibrating a spinning lidar is challenging, as the motor's home position is arbitrary; every spinning sensor potentially has a different home position. Hence, it is not easy to integrate scene constraints into the optimization. For example, it would be difficult to include a ground-plane constraint such as the one used by Underwood, Hill, Peynot, & Scheding (2010) into the calibration framework without incorporating additional information or assumptions about the scene. In the following, we perform various experiments for real-life situations that could impact calibration results, using data from the Hokuyo spinner.

Effect of Missing and Sparse Data
Here, we address the question of missing correspondences and show that the distance-to-tangent-plane metric is not severely affected by the lack of precise point correspondences. In Figure 9, we show our calibration results using an indoor office environment. We colorize the points with reflectance values from the Hokuyo for better visualization. The automated calibration results show a marked improvement, as can be seen in the geometry (orthogonality of planes constituting corners) and the sharpness of the visualized reflectance values.
Using the same office data (shown in Figure 9), we evaluate the algorithm's repeatability and robustness in the face of missing point correspondences. We randomly remove a percentage of points from the original dataset, run the calibration algorithm, and report the variance of the estimated parameters. For every percentage of removed points, we repeat this process 50 times, each with a different random subset of points removed. Results are summarized in Table II and indicate very small variations in the estimated parameters due to missing correspondences.

Table II. Summary statistics for a 4-DOF calibration on the same data with varying percentages of randomly deleted points. The initial point set size is ≈ 400,000 points. The mean (μ) and standard deviation (σ) of each parameter are reported; r_x and r_y are rotation angles about the X- and Y-axes in degrees, and t_x and t_y are the translation parameters in millimeters. Variations in the estimates due to missing point-point correspondences are practically negligible.

Effect of Actuation Speed
The actuation speed of the motor has a major influence on the sampling density. A lower actuation speed results in a higher sampling density, and consequently more accurate normals. Nonetheless, since the calibration parameters do not depend on the speed of actuation, we desire an algorithm that is independent of the actuation speed as well. This is convenient, as data collection for calibration purposes need not have stringent requirements. At low actuation speeds there is sufficient overlap between the two half-scans, and normals are extracted reliably. At 30 rpm, however, the sampling is sparse, and we do not have enough overlapping segments. Normal estimation is also inaccurate, as many surfaces are sampled as lines due to the high speed of actuation. Notably, point-plane optimization combined with the scene structure is able to recover the rotation estimates consistently across the different sampling rates. This is because a small number of normals is enough to constrain the orientation. Nonetheless, for an accurate estimate of translation, we require better correspondences, which were not available at high rpm.

Figure 10. Calibration results from two representative rpm settings. The top row shows data collected at 5 rpm, the bottom row at 30 rpm. Data were collected in our test facility. The height of the building measured from the ground to the top of the ceiling (left column) is approximately 18 m, while the length (right column) is approximately 48 m. While at first glance both calibration results produce similar point clouds, closer inspection shows some errors using the data collected at 30 rpm. We used the same number of neighbors to compute the adaptive neighborhood for each of the datasets.

Table IV. Calibration results for the same sensor and the same environment at various actuation speeds, measured in revolutions per minute (rpm). Rotations are reported in degrees, and translations in millimeters. Estimation of the rotation parameters is consistent across the various speeds (different data densities). However, estimation of the translation parameters becomes unstable at the highest speed. This can be explained by the low-density cloud leading to poor point-plane correspondences, as shown in Figure 10. The last column shows the determinant of the estimated covariance matrix associated with the parameters, which indicates a degradation of calibration certainty with reduced sampling rate.

Effect of Field of View
The field of view (FOV) of the lidar determines how much of the scene the lidar can observe. In some situations, it is not possible to use the lidar's full FOV due to its placement. For example, if the lidar were mounted on the grill of an autonomous car, then the maximum FOV would be 180°.

Figure 8. Basin of convergence. Translation error (left) and rotation error (right) for a grid of translations along the X- and Y-axes from 0.5 to 25 cm. The algorithm is able to estimate the correct calibration parameters up to the boundary of 20 cm. The last row shows a zoomed-in view of the error surface for a translation magnitude up to 10 cm, where the maximum translation error is 0.34 cm and the maximum rotation error is 0.045°.
Here, we experiment with different FOVs from the Hokuyo sensor. We use the same dataset for calibration, but with different lidar FOVs. The different FOVs are obtained by post-processing the data. The data used in this experiment are shown in Figure 11.
To assess the effect of the FOV, we let the calibration estimated with the full FOV of the lidar serve as the baseline. For all other FOVs, we report the signed difference of the calibration against the baseline, i.e.,

$$ \delta_f = \boldsymbol{\theta}_f - \boldsymbol{\theta}_{270} \quad \text{for } f \in \{200°, 180°, 90°, 45°\}. $$

Differences are computed individually for each of the estimated parameters, namely X-axis rotation δr_x and translation δt_x, and Y-axis rotation δr_y and translation δt_y. Results are reported in Table V. We observe an acceptable estimate of rotation for all reduced FOVs. This is due to the small rotational offsets of the uncalibrated sensing assembly. Furthermore, we observe acceptable performance down to a 180° FOV. For smaller FOVs, translation estimates become unreliable due to the lack of structure in the calibration scene, as can be observed in Figure 11. Degradation of the calibration quality due to limited FOV can also be seen by examining the uncertainty estimates obtained from the optimization procedure [Eq. (12)]. Taking the determinant of the covariance matrix, averaged over multiple trials, as a single-number indication of uncertainty, we see a significant increase in uncertainty with reduction of the FOV, as shown in Table VI.

Effect of the Environment
The structure of the calibration environment plays a role in the accuracy of the results. A good calibration algorithm, if possible, should not have stringent requirements on the calibration environment. The ability to use the algorithm in various environments is especially important for robotic applications, where it is not uncommon to require an in-field recalibration. Rough terrain, environmental conditions, and transporting the vehicle to the field might compromise the integrity of a previous calibration.
Here, we show that the algorithm is repeatable when used in different environments, provided that the geometry constrains the calibration parameters. In particular, we do not require planar man-made environments of certain dimensions; local planarity is sufficient. Further, we show that our algorithm is not strongly affected by the maximum range of the return values used to calibrate. This is important because, for the same sensor, we would like to obtain the same calibration parameters whether the calibration was performed in a small room or out in the open. Figure 12 and Table VII show the calibration results from the same sensor with data collected from different indoor scenes with different range variations. Estimates of the calibration parameters are consistent across the different datasets (see Figure 12). The exception is the case of the long narrow corridor: as expected, the structure of the narrow corridor does not provide enough constraints on the calibration parameters. This situation can be detected by examining the covariance of the estimated parameters.
Finally, we apply the algorithm on data collected from an outdoor scene. Results on outdoor data are shown in Figure 14 and verify that the algorithm is suitable for use with outdoor data.

Application to 3D SLAM
We demonstrate the quality of the calibration in a 3D SLAM application. We use the spinning sensor shown in Figure 2. The assembly consists of a Hokuyo range scanner mounted on a spinning platform and rigidly coupled with a 1 megapixel stereo camera. The sensor assembly performs a full revolution every 4 s (i.e., point cloud output is 1/4 Hz). The Hokuyo has a field of view of 270° with an angular resolution of 0.25° (i.e., 1,081 range measurements per scan line). Stereo visual odometry (VO) (Alismail, Browning, & Dias, 2010) is used to account for the motion of the vehicle while scanning. This is possible via another calibration procedure between the camera and the lidar assembly. We receive pose updates from VO at a rate of 10 Hz. The poses are interpolated per range measurement to allow us to reconstruct the point cloud from the moving sensor. In case of VO dropouts, e.g., due to sudden changes in lighting or motion blur, we drop the VO frame and the associated lidar points. VO accumulates drift over time. To reduce this drift, we apply point-point ICP (Besl & McKay, 1992) for every 2 s worth of lidar data. This is shown to reduce local VO drift without the need for a Bundle Adjustment procedure, which our embedded computing platform cannot perform in real time. While it is possible to use SLAM methods designed to capture the continuous nature of the scanning process (Alismail, Baker, & Browning, 2014; Bosse & Zlot, 2009; Tong, Furgale, & Barfoot, 2013), we opt for this registration-based SLAM approach due to its popularity and the impact of internal calibration on its performance. The sensor is mounted on a small mobile robot platform (Packbot from iRobot) with the primary application of 3D mapping of underground mines. The system, however, is not restricted to underground environments.
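A sketch of the per-measurement pose interpolation, using spherical linear interpolation (slerp) for rotation and linear interpolation for translation; this illustrates the idea only and is not the onboard implementation.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_poses(t_poses, quats, translations, t_query):
    """Interpolate 10 Hz VO poses to per-range-measurement timestamps.

    t_poses      -- (P,) increasing VO timestamps
    quats        -- (P,4) VO orientations as quaternions (x, y, z, w)
    translations -- (P,3) VO positions
    t_query      -- (Q,) measurement timestamps within [t_poses[0], t_poses[-1]]
    """
    R_q = Slerp(t_poses, Rotation.from_quat(quats))(t_query)
    T_q = np.column_stack([np.interp(t_query, t_poses, translations[:, d])
                           for d in range(3)])
    return R_q, T_q
```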
We show examples of our mapping results to illustrate the quality of the calibration. As with all complex systems, calibration plays a small but important role in overall system performance. If our calibration accuracy were not satisfactory, we would expect to see visible errors in the resulting scene structure. In this case, such errors manifest as "double walls," or ghosting artifacts, in the resulting scene reconstruction. We illustrate this effect in Section 4.8. Figure 15 shows some examples of our system's performance in an underground mine. The closeup views show crisp point clouds and the lack of any visible artifacts. In all of the results shown here, no loop closure procedure was performed, so as not to hide any calibration errors. A longer run is shown in Figure 16, where the longest corridor in the mine is approximately 150 m without opportunities for loop closure. Perhaps a better illustration of the calibration quality is the reconstruction of indoor environments, where it is easy to verify the correctness of scanned man-made structures. Figure 17 shows the results of our system in an indoor industrial complex, collected on the second floor of our facility. The original point cloud has more than 40 million points. Figure 18 shows some detailed views of the mapped building.

Effect of Calibration on 3D SLAM
In this section, we show the effect of calibration on SLAM results. Figure 19 shows SLAM results using the same sensor and SLAM algorithm described in the previous section. The left panel shows the result without applying the calibration. While we do not have ground truth, we can see that the SLAM results with our calibration method are more accurate: the resulting map is sharper and the indoor environment is better defined. The structure on the ground plane is also straight after calibration, in comparison to a bent structure prior to calibration.

Figure 19. The robot trajectory without calibration exhibits periodic staircase effects in which ICP tries to align lidar half-scans (shown in the closeup views and highlighted with arrows). Jumps in the trajectory correspond to the start/end of a half-scan. The magnitude of the jumps is very close to the estimated calibration parameters (≈ 3 cm).

DISCUSSION
In this work, we did not make use of lidar intensity returns, which could contribute useful constraints on the calibration. Forming an intensity image, however, is not always possible. For some sensors, intensity returns may not be available, or may not have enough sampling density to construct an image suitable for calibration. In such cases, interpolation artifacts might severely impact the accuracy of corner localization. Consider, for example, a spinning lidar observing a calibration target at long range. Regardless of the spinning resolution of the actuator, a clean intensity image is difficult to obtain (cf. Figure 9).
In addition, intensity-based methods rely on calibration targets. This has some drawbacks, namely the need to correctly identify and extract the calibration pattern from a large amount of data. It also restricts the calibration to close range, where the calibration target can be accurately localized in the intensity image, and may induce range-dependent bias in the results.
We note that it is possible to use an actuated lidar in SLAM applications without a specialized calibration step, as described in the work by Bosse & Zlot (2009). The algorithm proposed by Bosse & Zlot (2009) solves the SLAM problem by optimizing a continuous trajectory that aligns half-scans from a spinning lidar assembly. However, calibration remains important and useful. In recent work, Zlot & Bosse (2014) describe using their SLAM algorithm to calibrate a spinning lidar from stationary data prior to using the sensor for SLAM. Similar to our results, Zlot & Bosse (2014) observe an improved SLAM system when the internal parameters of the sensor have been calibrated.
Our algorithm is accurate, automated, and generic in the sense that it does not enforce restrictive assumptions on the sensor configuration. The only requirement on the actuation mechanism is that points in each set (half-scans for a spinner) are produced from distinct joint angles (lidar mirror and actuation angles). Otherwise, the two sets will be identical and will not provide enough information to allow for an automated calibration.
The canonical example of this case is a nodding/pitching sensor, as illustrated in Figure 21. In the left column of Figures 21(a) and 21(b), we show data from a calibrated nodder (using CAD measurements). To the right is an illustration of how the data would appear if the calibration offsets were not accounted for. We have used a large offset of 1 m to stress the effect visually. The datasets contain several nods of the scene. In Figure 21(c), the calibrated and uncalibrated data are overlaid to show the severe distortion. Not only are the distance measurements incorrect, but we also observe nonrigid shearing and stretching. Solving the nodder calibration problem would require making more restrictive assumptions about the environment (e.g., that it consists of large, readily identifiable planes such as the ground surface), or integrating sensor motion, which would couple the calibration with the pose estimation problem.
As can be seen, data collected while the sensor is nodding upward are identical to the data collected while the sensor is nodding downward. This is because the lidar beams sampling the scene are identical in both nodding directions. In the spinning lidar case, the second half-scan switches the beams sampling the scene, and we obtain enough information for automated calibration.

Table VII. Calibration results for the data shown in Figure 12 for a 4 DOF model (r_y, r_x, t_x, t_y), with rotations in degrees and translations in millimeters. Results are consistent between the Corner and the Mailroom data. Nonetheless, we observe an underestimated translation in the Corridor data due to the underconstrained geometry of the scene. This can be detected by observing the covariance matrix, where the uncertainty of the estimated translation is orders of magnitude higher for the Corridor data compared to the other datasets.

Failure Cases
Although we have observed excellent empirical performance on spinning and wobbling (two-axis) lidar systems, the algorithm is not fail-proof. It is expected to fail in the following scenarios:

• Insufficiently constrained geometry. For example, a single plane in the scene cannot constrain all degrees of freedom of the calibration model. To avoid this situation, we recommend collecting data with at least three visible plane orientations. Room-like structures, with several orthogonal plane orientations, are ideal.

• Lack of surface correspondences. We have shown that the algorithm is resilient to the lack of precise point-point correspondences. However, for extremely sparse data the probability of correct surface correspondences drops, which might affect the accuracy of the algorithm. Such situations can arise when the actuator's speed is very fast and the majority of the scene structure lies at a far distance (i.e., the point sampling density is low with respect to the surface geometry roughness). This is easily avoided by adjusting the actuation speed and selecting a proper environment for calibration.

• Very large calibration offsets combined with a lack of good initialization. Our problem differs significantly from standard registration problems in that the calibration values directly affect the resulting point clouds. This is best visualized in Figure 22, which shows a large offset causing severe distortion of the data that cannot be recovered from without the aid of an initialization. The initialization need not be very accurate; it is merely needed to reduce the distortion in the data.

Figure 22. Illustration of large calibration offsets. The ground truth is shown in the left column. The right column shows a large calibration offset of 1/2 m translation and 10° rotation about the Y-axis, with the points triangulated assuming an identity calibration. The top row shows a front view of the simulated scene, while the bottom row shows a 3D view. While an offset of this magnitude could be managed in a standard registration problem, the problem is exacerbated in the case of calibration because the calibration values produce the point cloud. Notice the distortions in the point cloud shown at the bottom right, where scene planes are severely warped. In this case, our calibration algorithm fails to find the correct solution: it correctly estimates the 10° rotation about the Y-axis, but adds an extra 15° rotation about the X-axis at the expense of the translation estimates.

Accuracy Assessment
Calibration results need to be inspected carefully. As was shown in Figure 12, it is possible for the algorithm to generate a visually convincing result on underconstrained geometry. In addition to inspecting the covariance of the estimates, we recommend a visual inspection of the calibration results using test data from different scenes. An even better verification of the calibration accuracy is an evaluation as a function of the system's performance, such as SLAM results compared to ground truth, if available.

CONCLUSIONS AND FUTURE WORK
In this work, we presented an algorithm for fully automated, targetless calibration of the internal parameters of actuated spinning lidar. These parameters are the mechanical offsets between the center of rotation of the lidar's mirror and the center of rotation of the actuation mechanism. The algorithm was evaluated on real and simulated data and was shown to be repeatable, robust, and accurate. Our algorithm does not rely on specific calibration targets and is readily applicable in any locally planar environment. Robustness to errors in normal estimation was achieved by using an adaptive, data-driven selection of the neighborhood radius. Furthermore, by automatically detecting unreliably estimated planes, we reduce their effect on calibration accuracy by down-weighting their contributions to the error function.
In this work, we did not address calibration problems pertaining to timing errors, range bias due to different object materials, or bias due to thermal fluctuations. These might be avenues of future work.
We also did not make use of lidar reflectance/intensity information in the calibration process. This is an advantage that makes the algorithm applicable to a variety of sensors. Reflectance data, however, contain important information that could be used to aid the search for correspondences and even improve the cost function, although we have not observed a need to include lidar reflectance in any of the steps. A more interesting direction for future work is designing an automated algorithm for nodder calibration.
Another avenue of future work is studying the impact of calibration errors on the shape of the point clouds (Habib & Kersting, 2010) and on overall system performance. A better understanding of calibration error effects on system performance is important for achieving high accuracy, identifying miscalibration events, and devising online calibration methods (Hansen, Alismail, Rander, & Browning, 2012; Levinson & Thrun, 2014; Roy & Thrun, 1999).