Path planning with hallucinated worlds

We describe an approach that integrates mid-range sensing into a dynamic path planning algorithm. The algorithm is based on measuring the reduction in path cost that would be caused by taking a sensor reading from candidate locations. The planner uses this measure in order to decide where to take the next sensor reading. Ideally, one would like to evaluate a path based on a map that is as close as possible to the true underlying world. In practice, however, the map is only sparsely populated by data derived from sensor readings. A key component of the approach described in this paper is a mechanism to infer (or "hallucinate") more complete maps from sparse sensor readings. We show how this hallucination mechanism is integrated with the planner to produce better estimates of the reduction in path cost incurred when taking sensor readings. We show results on a real robot as well as a statistical analysis on a large set of randomly generated path planning problems on elevation maps from real terrain.

As mobile autonomous systems have become more and more capable, the tasks that they can perform have also scaled up. In the early stages of mobile robotics, robots were able to navigate around in artificially constructed worlds and mazes. Currently, most mobile robotic applications are still in the domain of constrained environments, such as, for example, indoor applications. However, mobile autonomous systems have now entered the realm of unstructured environments, with the Demo III [23], Perceptor [29] and DARPA Grand Challenge programs, as well as the Mars rover expeditions, being the best known.
The core technology that allows these autonomous systems to succeed is a good representation of the environment in which they operate. However, if these environments are not known beforehand and there is no structure to exploit, traditional navigation and sensing methods will fail. Specifically, they will suffer from the so-called myopic planning effect [18]. This effect is due to the fact that the planner is limited by the maximum range of the mobility sensors. Since typical mobility sensors, such as LADAR (mobility laser range) or passive stereo vision, will only acquire data up to a few tens of meters, the planner has no knowledge of what to encounter beyond the sensed perimeter. As a result, the planner is unable to anticipate obstacles sufficiently early and has no choice but to plan paths close to obstacle boundaries. For autonomous navigation over long distances, this limitation degrades the performance of the system by greatly increasing the length of the path traveled by the vehicle. Consequently, the power consumed is increased and, more importantly, the relatively short range of the mobility sensors forces the vehicle to drive closer to terrain obstructions than is safe or necessary.
Sensing solutions that could provide the planner with information about the environment at much longer range are becoming available. These so-called "mid-range sensors" are capable of partially reconstructing the three-dimensional environment of the vehicle at longer range, e.g., up to 500m. These sensors are based either on reconstructing structure by integrating long sequences of monocular images acquired as the vehicle moves [20], or on reconstructing the structure by stereo matching across widely separated views, either from multiple vehicles or from a single vehicle at two different positions [1]. In addition, we refer to sensing systems for obstacle avoidance as "mobility sensing"; these include sonars, short-baseline stereo and laser range finders, which provide data up to a maximum of a few tens of meters [5].
With mid-range data available to the path planner, it can anticipate obstacles ahead of time and can therefore plan a more optimal path to the goal. However, several major issues need to be addressed first for this to work.
First, it is not practically feasible to construct such a large map of the terrain in all directions around the vehicle, since it would require a prohibitive amount of computation and special hardware to achieve sensor coverage. Therefore, it is important that we focus sensing and computation only on those areas that are of importance to the path planner. Second, all the mid-range sensing techniques require more costly data collection and processing procedures than the real-time sensing strategies used for mobility. Therefore it is not possible to continuously scan the environment and process the data. An active strategy is needed to decide when to incur extra sensing cost to collect terrain data critical for navigation.
We have proposed an approach to address these issues in [19]. However, even with optimal sensor planning, the planning decisions are still based on partial data from the sensors. The effectiveness of the planner is greatly enhanced if we can infer (or hallucinate) a more complete representation of the underlying world from the data.
Figure 2 illustrates this idea. In Figure 2(a), the partial map built by the robot's sensors suggests that a gap exists between the two obstacle regions at the center of the map, leading to the path from the current robot position to the goal shown in Figure 2(a). However, if the gap were actually blocked by as-of-yet undiscovered obstacles, the robot would incur a much higher-cost path by skirting the obstacle regions using the shorter-range mobility sensors (this path is shown in red in Figure 2(b)). If we had discovered the occlusion earlier by using mid-range sensing, the robot would have used a more efficient path (shown in blue in Figure 2(b)). Therefore, the right sensing strategy would have been to take one mid-range sensor reading in the direction of the potential occlusion.
Fig. 2. The partial map in (a) suggests a gap between the two obstacle regions. A large deviation from the path will be incurred if the gap is actually blocked (b).
To implement such a strategy, the question that needs to be answered is: how do we automatically detect such "interesting" parts of the environment at which to take sensor readings? In [19], we used simple heuristics to predict the location of such areas. In this paper, we propose a principled approach to inferring likely obstacle configurations from sparse data and from a probabilistic model of typical configurations of natural environments. We summarize the framework from [19] (with a different formulation) in Section III. We show how the problem of predicting map configurations can be implemented by using a probabilistic formulation and how it can be integrated into the sensor planning approach in Section IV. We conduct experiments on real and synthetic data in Section VI. These experiments closely parallel the experiments of [19], with the crucial difference that the predictions used in the sensor planning are generated by using the inference techniques of Section IV.

II. PRIOR WORK
The work that we previously presented in [19] identified two main bodies of related work, sensor planning and coverage planning. In addition, there is related work in the computer vision community on creating denser volumes of data from sparse sets. We will briefly discuss the relevance of this work.
In the literature that describes sensor planning as an active vision problem, the work concentrates mostly on hand-eye coordination [27] or view planning for dense reconstruction [15], [11]. All these techniques exploit the fact that the environment is constrained and that there is a specific object of interest.
The second category looks at the planning-for-sensing problem by formulating it as a coverage or an exploration problem. Examples of work in this area are [7], which covers the well-known art gallery problem, and Moorehead [16], who focuses on coverage. Some of the concepts about information gain presented in his work have a similar flavor to the work presented here. González-Baños [8] also focuses on coverage. It is an integrated sensing and planning approach; however, his method does not allow for incremental updates on already covered areas if more accurate data becomes available.
In [12] and [13], Laubach presents a navigation strategy based on the "Bug" family of path planners. It incorporates sensing and path planning as an integrated solution. It does, however, have some drawbacks that make it less usable for navigating in unstructured environments. It has no notion of uncertainty in sensor data, it requires continuous detectable obstacle boundaries, and it only senses at discrete decision points. In addition, the traversability cost is limited to a binary representation, traversable or non-traversable, which cannot express the cost of traversing, for example, vegetated terrain. Its requirement for continuous obstacles also prevents the use of sparse mid-range data.
None of these approaches uses a prior learned model to infer denser world models from sparse representations. The work presented in this article was inspired by the data hallucination framework presented in [2], which is well known in the vision community. Applying data hallucination to path planning is a novel application of this concept. The work presented in [6] and [17] uses a Bayesian framework to estimate map structure to aid in localization, which is very much in the same flavor as the work we describe here; however, their method relies on a structured environment and is not usable for robot navigation.

III. SENSOR PLANNING
We propose a solution that incorporates mid-range data through forward simulation and data hallucination.
At the core of this solution is a grid-based D* planner [25], [26]. The D* planner takes as input a collection of cost maps that describes the current knowledge about the world. In our case, we use D* to compute the best path from the current vehicle location to the goal point as usual, but, in addition to the usual obstacle costs, we also use costs that reflect the utility of mid-range sensing from different locations in the map and that denote traversable areas according to our hallucinated world model. With these extra terms added to the planner's cost function, the planner will attempt to find a path through traversable terrain that still avoids obstacles but that also maximizes the utility of sensing. This approach of incorporating utility into a cost function is similar to the approach Rosenblatt [21], [22] takes in his arbiter. However, his utility map only looks at a few possible actions over a limited horizon and does not deal with additional utility for sensing purposes, nor does it have any notion of what to anticipate in the future.
With this approach we address two key issues in planning:
- Active planning for mid-range sensing: which locations the robot could visit in order to collect mid-range data that would be beneficial to the path planner.
- Anticipating traversability: this is addressed by hallucinating worlds based on sparse evidence from sensor data.
In our approach, the planner uses a cost computed at every location of a grid covering the extent of the map. We decompose the cost into three parts: the current state of the world ρ, the added utility for sensing purposes ψ, and a heuristic cost φ that incorporates mid-range and hallucinated data. ψ reflects the cost of going to the goal directly versus going to the goal via an intermediate observation location. φ expresses our knowledge about the traversability of a grid cell as discovered by the mid-range sensing system and expanded by the hallucination algorithm; this cost is similar to ρ in that it encodes the local obstacle-ness of the terrain, but it is useful to separate the two costs because we have less confidence in the hallucinated world. The combination of these costs, C, captures the notion of utility for sensing purposes and also makes cells that are anticipated to be traversable, based on both actual and hallucinated data, more favorable. In the planner, the costs are combined as C = ρ + ε(φ + ψ), where the parameter ε controls the amount of deviation from the behavior of a traditional mobility-based path planner. If ε = 0, the vehicle executes the default path computed from the traversability maps alone; it will not do any sensor planning, nor will it exploit the mid-range and hallucinated data. ε thus captures the level of exploration the robot might do while en route: with increasing ε, the robot will stray from its path to visit observation locations with a good vantage point on its way.
The computation of the combined cost need not be restricted to this simple linear combination, as long as the function is admissible. That is, the cost has to increase monotonically with ρ, φ and ψ.
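The linear combination described above can be sketched as follows. This is a minimal sketch, assuming the combination C = ρ + ε(φ + ψ) and treating each quantity as a per-cell scalar; the function name and arguments are our notation, not the paper's code.

```python
def combined_cost(rho, phi, psi, eps):
    """Planner cost for one grid cell: C = rho + eps * (phi + psi).

    rho: traversability cost from mobility sensing,
    phi: hallucinated/mid-range traversability cost,
    psi: (inverse) sensing benefit (lower = better vantage point),
    eps: exploration parameter; eps = 0 reduces to mobility-only planning.
    The combination is monotonically increasing in each term, as
    admissibility requires.
    """
    return rho + eps * (phi + psi)
```

With eps = 0 the sensing and hallucination terms vanish, reproducing the default mobility-only behavior described in the text.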
ρ is computed from data from the short-range obstacle detection system. This is the cost that is used in the simplest mobility system, in which mobility sensors, such as LADAR or stereo, insert obstacles into the world map. In practice we use a normalized and smoothed version of the observed terrain gradient.

IV. HALLUCINATING DATA FOR PLANNING
Following our approach from [19], the sensing cost ψ is based on evaluating the gain obtained by taking a sensor reading for a given robot location x, observation location u and goal location v_G (Figure 3). Let C_o(L, x; v_G) be the path cost for navigating from x to v_G without mid-range sensing, and let C_s(L, x; u, v_G) be the path cost for navigating from x to v_G via an intermediate location u for collecting mid-range sensor data at u, in which L is the obstacle labeling of the world map cells. The utility for visiting u while navigating to the goal is then defined as ψ(L, x, u, v_G) = C_o(L, x; v_G) − C_s(L, x; u, v_G). Since the goal location v_G is assumed to be fixed, we will avoid overloading the notation by omitting v_G. The total benefit ψ̄ at a position u, averaged over all possible labelings of the world, for visiting a particular observation location while en route to the goal can then be expressed as:

ψ̄(x, u) = Σ_{L ∈ 𝓛} P(L | D) ψ(L, x, u)   (6)

D, the data, is the gradient of the terrain at every cell of the map. 𝓛 is the set of all possible labelings L. Note that each value of L represents a world model (configuration of obstacles) that is consistent with the data but is not actually observed.
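The averaging in Equation 6 can be sketched with a toy shortest-path helper. This is a sketch under several assumptions: a 4-connected grid of per-cell traversal costs with Dijkstra as the path-cost oracle standing in for the D* costs, and a short list of sampled labelings with weights P(L | D) standing in for the intractable sum over all labelings.

```python
import heapq

def grid_path_cost(grid, start, goal):
    """Dijkstra shortest-path cost on a 4-connected grid of per-cell
    traversal costs; None marks a non-traversable (obstacle) cell."""
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, cell = heapq.heappop(pq)
        if cell == goal:
            return d
        if d > dist.get(cell, float("inf")):
            continue
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                nd = d + grid[nr][nc]  # pay the cost of the entered cell
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(pq, (nd, (nr, nc)))
    return float("inf")

def expected_benefit(labelings, psi):
    """Equation 6 sketch: psi_bar = sum over labelings L of P(L|D) * psi(L).

    labelings: list of (probability, L) pairs; psi: per-labeling benefit,
    e.g. computed from grid_path_cost under that labeling."""
    return sum(p * psi(L) for p, L in labelings)
```

In the full method the per-labeling benefit ψ(L) compares C_o and C_s under labeling L; here it is left as a callback so the weighting structure of Equation 6 stands on its own.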

A. MRF Model
Each L is a world configuration hallucinated from the data. If N is the number of cells in the map, |𝓛| = 2^N, so Equation 6 becomes computationally intractable in practice even for small values of N. In order to make the computation possible, we will use a common approximation that assumes that P(L | D) can be approximated by a delta function [10].
This amounts to replacing the average configuration of the world with the most probable configuration.

L̂ = argmax_L P(L | D)

Intuitively, L̂ is the most likely world configuration inferred from the observed data.
The next problem we need to address is how to compute P(L | D). We start with the observation that the labels on cells expressing traversability are not independent. If a cell is classified as non-traversable, then there is a high probability that the neighboring cells are also non-traversable, except at discontinuities. This is also true for the traversable cells. This kind of contextual dependency in the labels on 2D lattices has been well studied in computer vision tasks and is often referred to as spatial smoothness.
A popular approach for representing spatial smoothness is to use Markov Random Fields (MRFs), which can incorporate local contextual constraints in labeling problems in a principled manner [14]. MRFs are generally used in a probabilistic generative framework that models the joint probability of the observed data and the corresponding labels. In other words, let D be the observed data from the terrain, where D = {d_i}_{i∈S}, d_i ∈ ℝ is the data from the i-th site, and S is the set of sites. Let the corresponding labels at the terrain sites be given by L = {l_i}_{i∈S}, with l_i ∈ {−1, 1}. In our case, l_i indicates the presence or absence of an obstacle at location i. In the MRF framework, the posterior over the labels given the data is expressed using Bayes rule as P(L | D) ∝ P(L, D) = P(D | L) P(L).   (9)

B. Data Term
The first term in the product, P(D | L), in Equation 9 is known as the likelihood of the data. In general, it is assumed that the data at each site is conditionally independent given the label at that site, i.e. P(D | L) = ∏_{i∈S} p(d_i | l_i) [3], [14]. We model the likelihood for each class, traversable/non-traversable, as a Gaussian.
The parameters of the two Gaussians representing the likelihoods of these classes can easily be learned from labeled training data. This learning takes place for a typical type of environment in which the robot is to navigate; for example, for Mars, an exemplar would be used for learning, after which the robot would use these parameters for online classification/hallucination of terrain. For the current model, this learning boils down to fitting a single Gaussian to each of the two classes.
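Fitting the two class-conditional Gaussians and evaluating them as the data term can be sketched as follows. The function names and the scalar gradient feature per cell are our assumptions; the paper only states that one Gaussian is fit per class from labeled data.

```python
import math

def fit_gaussian(samples):
    """Maximum-likelihood mean and variance of one class's gradient samples."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / n
    return mu, var

def gaussian_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-(x - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def classify(d, params):
    """Label a cell (+1 obstacle, -1 traversable) by the higher data
    likelihood p(d_i | l_i) -- the data term alone, without the prior."""
    p_obstacle = gaussian_pdf(d, *params[1])
    p_free = gaussian_pdf(d, *params[-1])
    return 1 if p_obstacle > p_free else -1
```

The `params` dictionary maps each label to its fitted (mean, variance) pair; the MRF prior of the next subsections refines this per-cell decision.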

C. Field Model
After modeling the likelihood of the data, we need to model the prior over label configurations, P(L), which encodes the notion of spatial smoothness of the terrain labels except at discontinuities. Since we have only two classes (traversable/non-traversable), this can be expressed in an MRF formulation of a binary classification problem.
The data likelihood P(D | L) is assumed to be conditionally independent given the labels, and the label interaction field P(L) is assumed to follow a homogeneous and isotropic Ising model with only a pairwise interaction term: log P(L) = β Σ_{⟨i,j⟩} l_i l_j (up to an additive constant), where β is the interaction parameter of the MRF and the sum ranges over neighboring pairs of sites. The Ising model favors neighboring sites with the same labels and penalizes dissimilar labels with cost β [10].
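The Ising prior can be sketched directly from its definition. This assumes a 4-connected neighborhood on the grid and labels in {−1, +1}; the unnormalized log-prior suffices because the partition function cancels in MAP inference.

```python
def ising_log_prior(labels, beta):
    """Unnormalised log P(L) = beta * sum over 4-neighbour pairs of l_i*l_j.

    labels: 2-D grid of values in {-1, +1}. Equal neighbours raise the
    log-prior by beta, unequal neighbours lower it by beta, which is how
    the Ising model encodes spatial smoothness."""
    rows, cols = len(labels), len(labels[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:                      # vertical pair
                total += labels[r][c] * labels[r + 1][c]
            if c + 1 < cols:                      # horizontal pair
                total += labels[r][c] * labels[r][c + 1]
    return beta * total
```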
Combining the likelihood model P(D | L) with the prior over labels P(L), we can write the overall posterior over labels as P(L | D) = (1/Z_m) ∏_{i∈S} p(d_i | l_i) exp(β Σ_{⟨i,j⟩} l_i l_j),   (11) in which Z_m is a normalizing constant, often referred to as the partition function. For this MRF (Equation 11), the maximum a posteriori labeling can be computed exactly.

D. Inference
The min-cuts/max-flow algorithm that solves the MRF of Equation 11 produces a labeling L̂ of traversable/non-traversable cells that is most consistent with the world model and the current data. This binary labeling of traversable and non-traversable cells is updated at every cycle with the current data and represents the most anticipated configuration of the world (Figure 4).
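A brute-force stand-in for the min-cut/max-flow solver makes the MAP objective of Equation 11 concrete. This sketch enumerates all 2^N labelings, so it is feasible only for a handful of cells; real implementations use a graph-cut solver, but the objective (Gaussian data term plus Ising smoothness term) is the same. The `params` layout follows the data-term sketch above and is our assumption.

```python
import itertools
import math

def map_labeling(data, beta, params):
    """Brute-force MAP labeling argmax_L p(D|L) P(L) on a tiny grid.

    data: 2-D grid of scalar observations (e.g. terrain gradient),
    beta: Ising interaction parameter,
    params: {label: (mean, variance)} for the Gaussian data term."""
    rows, cols = len(data), len(data[0])

    def log_likelihood(d, label):
        mu, var = params[label]
        return -(d - mu) ** 2 / (2.0 * var) - 0.5 * math.log(2.0 * math.pi * var)

    best_L, best_score = None, -math.inf
    for flat in itertools.product((-1, 1), repeat=rows * cols):
        L = [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]
        # data term: sum of per-cell log-likelihoods
        score = sum(log_likelihood(data[r][c], L[r][c])
                    for r in range(rows) for c in range(cols))
        # prior term: pairwise Ising smoothness over 4-neighbour pairs
        for r in range(rows):
            for c in range(cols):
                if r + 1 < rows:
                    score += beta * L[r][c] * L[r + 1][c]
                if c + 1 < cols:
                    score += beta * L[r][c] * L[r][c + 1]
        if score > best_score:
            best_L, best_score = L, score
    return best_L
```

With a strong smoothness weight, an ambiguous cell flanked by obstacle-like evidence is hallucinated as an obstacle, which is exactly the gap-filling behavior of Figure 4; with β = 0 the data term alone decides.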

Fig. 4. Typical example that shows the influence of the obstacle hallucination. The robot (in green) travels to the goal (blue). The path is shown in yellow. The hallucinated obstacles are in white, hallucinated empty space in gray, observed traversable terrain in green and observed obstacles in red. In the top panel, the planner uses only the information from the sensors and plans a path through the obstacle. In the bottom panel, the hallucination algorithm merged obstacles together correctly and produced a better path.
To illustrate the operation of the obstacle inference algorithm, we have included an example experiment (Figure 5 and Figure 6).In Figure 5, the example terrain is shown with its ground truth obstacle labeling.The planner simulated a robot traversal from the red to the yellow markers.
The number of map cells that are detected as obstacles at every position along the path is shown in Figure 6, along with the number of hallucinated cells. The ground truth, that is, the total number of actual obstacle cells computed directly from the underlying elevation map of Figure 5 (left), is also included. This number provides a baseline for reference: it is the total number of obstacle cells that would be included in the map if the entire world had been explored and sensed.
Initially, there is no data available, so the hallucination algorithm has no evidence for any possible obstacles and will therefore not infer any. Once some sparse terrain data has been sensed, there is enough evidence for some cells to be inferred and classified as obstacles. However, as even more of the environment becomes known and inferred obstacles get confirmed, the number of inferred obstacles decreases. In the limit, when all cells have been explored, there will be no inferred obstacles.
Fig. 6. The number of obstacle cells plotted over the course of an example trajectory. The graphs confirm the intuition that as more of the environment gets explored, less needs to be inferred; in the limit, the number of inferred cells will go to zero and the number of total known obstacle cells will equal the number from the ground truth labeling.

E. Smoothing
Since we want to combine the hallucinated traversability map into a smooth cost landscape for our planner, we use a smoothed version of this hallucinated map to guide the robot over anticipated traversable terrain: φ = G_σ ⊗ L̂, in which L̂ is the hallucinated map and G_σ a Gaussian kernel with standard deviation σ.
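The smoothing step φ = G_σ ⊗ L̂ can be sketched with a truncated 2-D kernel. The kernel radius and the replicated-border handling are our assumptions; the paper only specifies a Gaussian kernel.

```python
import math

def gaussian_smooth(binary_map, sigma, radius=2):
    """Blur a {0,1} hallucinated obstacle map L_hat into a smooth cost
    landscape phi using a truncated, normalised Gaussian kernel.
    Borders are handled by replicating edge cells."""
    ks = range(-radius, radius + 1)
    kernel = [[math.exp(-(i * i + j * j) / (2.0 * sigma * sigma)) for j in ks]
              for i in ks]
    norm = sum(sum(row) for row in kernel)
    rows, cols = len(binary_map), len(binary_map[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in ks:
                for j in ks:
                    rr = min(max(r + i, 0), rows - 1)   # replicate border
                    cc = min(max(c + j, 0), cols - 1)
                    acc += kernel[i + radius][j + radius] * binary_map[rr][cc]
            out[r][c] = acc / norm
    return out
```

The blurred map penalizes cells near hallucinated obstacles while leaving open terrain cheap, which is what the planner needs from a smooth cost landscape.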

F. Restricting Sensor Orientation
The formulation of the benefit cost as defined in Equation 5 assumes an omni-directional sensor. Since, in practice, a mid-range sensor will have a limited field of view, our model needs to be adapted to compensate for this limitation. We can easily do so by taking the maximum of the sensing benefit ψ over all possible sensor orientations θ for a sensor with a limited field of view. As we have discussed in previous work [19], this is computationally expensive. However, we also showed that we can restrict ourselves to looking only at interesting areas. We therefore fix θ such that the sensor is looking at a "Point Of Interest" (POI) (Figure 7).
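Taking the maximum of the benefit over sensor orientations can be sketched as a sweep over discretized headings. The 5° heading step and the per-cell benefit callback are our assumptions; the paper only specifies maximizing over orientations for a limited field of view.

```python
import math

def best_orientation(u, cells, fov_deg, benefit):
    """Adapt an omni-directional sensing benefit to a limited field of view:
    sweep discretised headings theta and keep the one whose FOV cone from
    observation location u captures the largest summed benefit of the
    candidate map cells."""
    half_fov = math.radians(fov_deg) / 2.0
    best_theta, best_total = 0, -math.inf
    for theta_deg in range(0, 360, 5):            # 5-degree discretisation
        theta = math.radians(theta_deg)
        total = 0.0
        for cx, cy in cells:
            bearing = math.atan2(cy - u[1], cx - u[0])
            # smallest signed angle between cell bearing and heading
            diff = (bearing - theta + math.pi) % (2.0 * math.pi) - math.pi
            if abs(diff) <= half_fov:
                total += benefit((cx, cy))
        if total > best_total:
            best_theta, best_total = theta_deg, total
    return best_theta, best_total
```

Fixing θ towards a POI, as the text proposes, replaces this sweep with a single heading and removes the expensive maximization.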

VI. DISCUSSION
The experiments we have conducted on real and synthetic data parallel closely the experiments of [19], with the crucial difference that the predictions used in the sensor planning are generated by using the inference techniques presented in the previous sections.
We conducted an example experiment in an outdoor environment in which we compare our approach to the standard mobility sensing and planning approach.
For this purpose, we had our Pioneer DX robot navigate across the parking lot to find a goal behind a building.
The path that the robot followed is indicated by a dark trace in Figure 8. Data was previously collected along this path with a wide-baseline stereo algorithm [19]. The points were then used as the single source of information in the batch algorithm to provide the mid-range sensing/planning algorithm with data. The batch algorithm used a camera model that was identical to the camera on the robot.
The point of interest is chosen to be at a location that is likely to be traversed, has not been seen before, or is in an area cluttered with obstacles. This can be expressed with the following definition: r is the (inverse) distance from the current path, o is equal to 1 if the cell has been observed, and π expresses traversability. This traversability is the union of the traversability from both the mobility data and the hallucination framework: π = max(ρ, φ).
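The point-of-interest definition can be read, under our assumptions about how r, o and π combine, as a simple product score; the exact combination is not given in the text, so this is one plausible hypothetical reading.

```python
def poi_score(path_dist, observed, rho, phi):
    """Hypothetical POI score: favour cells close to the planned path
    (inverse distance r), not yet observed (o = 0), and obstacle-like
    under pi = max(rho, phi). All names are our assumptions."""
    r = 1.0 / (1.0 + path_dist)   # inverse distance from the current path
    o = 1.0 if observed else 0.0  # 1 if the cell has already been observed
    pi = max(rho, phi)            # union of mobility and hallucinated traversability
    return r * (1.0 - o) * pi
```

Under this reading, an unobserved, cluttered cell on the planned path scores highest, matching the qualitative criteria listed in the text.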

When to Look
The framework as presented so far has a way of evaluating the benefit of sensing. This utility is used in two ways. First, it is used to decide where to sense: since the utility is folded into the navigation cost, the benefit ψ̄ will guide the robot towards locations that are advantageous for taking mid-range sensor measurements. Second, it is used to decide when to sense, to avoid taking measurements continuously: if the benefit ψ̄ at the current location is above a threshold, the robot will take a sensor measurement.
The mid-range sensing/planning approach did detect that the straight path to the goal was blocked and immediately found a passage around the obstacle, as indicated by the lighter trace in Figure 9. The path found by the mid-range sensing approach had a total length of 63 meters and is 56% shorter than the path generated by mobility sensing only.
We also ran experiments in simulation. For the terrain we used real terrain data from the USGS database. Different start and goal positions were chosen randomly, within the constraint that there exists a path between start and goal. The parameters for this experiment can be found in Table I.
Results from 100 trial runs were collected.

Fig. 10. The mid-range sensing/planning algorithm compared to the standard navigation approach. We used the same presentation of the data as in our previous work, since the same quantities were measured. Note however that the results are based on the new data hallucination approach. A positive percentage shows how much shorter the path from the mid-range sensing/planning is than the mobility-only approach. The average gain in path length is 2%. More interesting is that 47% of the runs exhibit positive gain (up to 34%) and that, of those that have negative gain, the maximum loss is -7%. This is due to the overhead introduced by the explore behavior that tries to find better paths. For 27% of the runs there was no gain or loss.
Figure 10 compares the lengths of the paths generated by using the mid-range sensor planning method with the paths executed by using mobility sensing only. The runs are sorted in order of increasing gain for the sensor-based planning method. This analysis allows us to assess the amount gained from using sensor planning. It is important to note that the gain can vary dramatically depending on the start and goal points. Intuitively, little gain can be expected if the area between the start and goal points is completely unobstructed, in which case any planning strategy would perform well. The graph shows clearly that, for an unobstructed path, the algorithm leads to a slightly longer path. This is due to the fact that the vehicle might veer off the path in order to get better coverage. On average, the reduction in path length is 2%. More interesting is that our results show that 47% of the runs exhibit positive gain (up to 34%), and that of those that have negative gain, the maximum loss is only -7%. Furthermore, as the left illustration in the figure shows, these negative-gain cases are those for which the paths in the environment are unobstructed.
It is also interesting to compare our method against an approach in which the system senses all the time/everywhere, since this is the upper bound of how well our algorithm could perform. If our planner were to generate paths of substantially higher cost than those generated by the sense-all-the-time/everywhere strategy, it would indicate that our heuristics can be improved. The results of this experiment can be seen in Figure 11, in which the lengths of the paths generated from our planning approach and from the sense-all-the-time/everywhere approach are plotted as a scatter plot. The plot verifies empirically that our hypothesis is correct, since the values are scattered near the diagonal (the correlation coefficient is 0.99), which indicates that the path lengths are similar.

VII. CONCLUSION

We have presented our initial results on a combined sensing/planning approach that can alleviate the well-known sensing horizon problem. Our method exploits mid-range sensing data by hallucinating most likely traversability maps. It also takes into account sensor planning to reduce the cost incurred in collecting mid-range data. We have shown that our method incurs only a small extra cost in open environments, but in more cluttered environments a great benefit can be achieved.
Current work involves primarily the quantitative characterization of the increase in performance in plan execution in typical missions. The objective is to use the sensing strategies developed under the Robotics CTA in order to demonstrate the combined planning and sensing system. Longer-term research involves more extensive tests both in simulation and on real outdoor autonomous systems. Negotiation between multiple vehicles for optimal sensing strategies is of particular interest. In that respect, the cost-benefit model described above fits well in negotiation strategies based on economic models as described in the literature.

Fig. 1. An illustration of a long range navigation task in unknown Martian terrain. If major obstacles can be identified at an early stage, the autonomous system can plan a path accordingly. (From http://photojournal.jpl.nasa.gov)

In practice, ρ is derived from a smoothed gradient of the observed terrain, normalized by an obstacle threshold. While this is a simple cost model for traversability, the function can be arbitrarily complex and might encode many other costs; our framework does not restrict the choice of this model [22], [24]. ψ is the (inverse) benefit of visiting a particular cell based on its added utility for sensing purposes. Lower values indicate a vantage point for the mid-range sensor that is most beneficial given the current world knowledge and the currently planned path.

Fig. 5. The terrain that was used in the example to show the influence of the obstacle hallucination over the course of a typical traverse. On the left is a perspective view of the terrain, with start and goal locations marked. The ground truth obstacle labeling is displayed on the right. Obstacle cells are marked in blue.

Fig. 7. The relation between θ and the POI location.

Fig. 8. The experiment was conducted in the parking lot, with the start in front of the building and the goal (out of view) behind it. The dark trace sketches the path computed by a mobility-only approach; the lighter trace shows approximately the path from the mid-range sensing approach.

Fig. 9. A bird's-eye view of the real test scenario, in which the mobility sensing navigation skirts around the building (dark) and the mid-range sensing/planning algorithm (light) finds a passage and heads straight towards it.

Fig. 11. The mid-range planning/sensing algorithm compared in a scatter plot to a hypothetical continuous sensing method with unlimited range and a 360° view (no viewpoint planning). The horizontal axis is the path length from continuous sensing and planning; the vertical axis is the mid-range sensing/planning path length.

TABLE I. THE PARAMETERS FOR THE 100 TRIAL RUN EXPERIMENT.