Alternative Representation of Uncertainty for Probabilistic Planning

The purpose of this document is to list a set of possible schema for the representation of uncertainty to be used by a Probability-Aware Planner. The high level design places this planner into an interleaved plan execution system, which allows the planner to elaborate a plan and execute it. If the plan executes flawlessly, the systems goals are achieved, however, in the event that part of the plan fails, or that upon termination the system is not in its desired state, planning begins again, and the new plan is executed.

As a note I will refer to these two stages as plan elaboration and plan execution, or more simply elaboration and execution stages. The primary reason for splitting these two apart is that some representations of uncertainty may provide strong benefits during one of these stages, but result in high penalties in the other. By separating these two issues, a better side-by-side comparison can be made of the candidate representations.

I have listed six candidate representations for the uncertainty of applying actions in a domain with variability. Below, each is expanded to identify its strengths and weaknesses, and these are assigned values for several features.

The feature set that is being used for evaluation is evolving, as various aspects of the representations are explored, it may be that additional features will be added to the evaluation criteria. If that is the case, the summary tables will be updated to show the evaluation of each candidate on that feature axis.

The feature set is:
Feature Definition/Purpose
Quality What is the impact on the quality of the solution. For an uncertain domain, the quality would be the the one that provides the best overall plan satisfaction. best
Size The required storage of the information.
Availability How will the data be acquired, and what is the level of effort needed to supply it in the general case.
Plan Space The effect of the representation on the plan space that must be explored to find plans
Elaboration Time What is the estimated impact on elaboration time

A note on Execution time

Any use of execution time is currently missing from the feature space. This has a couple of reasons. First, by splitting elaboration time out, we would be left with the time needed to execute the plan, which will really vary depending on the hardware. Second, and more importantly, all of these planning systems produce output which can be mapped onto a lookup table. In the case of a classical plan, an ordered sequence of steps, this maps onto a table with one action per entry, and a simple counter, the algorithm is PC=PC+1, Execute Act[PC]. In the case of the more complex models, we have ('Current State' , 'Action') pairs; lookup the current state, and do the associated action. So there is no real difference between the execution times of any of these.

The argument might be made, "But wait, in the classical planning mode, you will have to re-plan many times during the execution, shouldn't this be counted as part of the execution time?" Or at least this should be represented in the elaboration time, to capture the fact that several plan elaboration phases may be needed to achieve a single goal, while the complete models will produce a single policy that covers all the possible failures. This is a good question, which will be addressed below.

No Representation (Classical Planning)

This option has no representation of uncertainty at all. From the point of view of the planner, all actions are equally likely. If this likelihood is assumed to be 1, then all goal achieving plans are equally likely, and there is no preference shown to any one plan over another. If the likelihood of all actions is less than one, then the goal achieving plan with the fewest number of actions is more probable that longer plans, and will be preferred.

Quality

This representation provides no improvement in plan quality over a classical planner. If it happens that the Shortest Feasible Plan is the most probable, or if there is only one feasible plan, then it does as well as any other representation.

Size

This is the base case, with no representation, so we'll group it with the constant sized representations. O(17)

Availability

All that is needed for this representation is the list of actions available to the planner. So this is the most easily available, so we'll label it 'trivial'.

Plan Space

The plan space tries to capture the effect that utilizing this representation has on the space that must be explored. No representation corresponds to the the traditional planning model, where there are an exponential number of possible plans. If there are A actions that can be applied, then there are A^N plans of length N. These must be compared to find the 'best' feasible plan, so we're O(A^N)

Elaboration Time

The traditional approaches to planning with no representation of uncertainty (and no functions) suggests that this is exponential in the size of the input (primarily actions). Ref: Bylander 1994

Representation Quality Size Avail. Plan Space Elab. Time
None LuckO(17) Trivial O(A^N) exp

Return to Candidate List


Probability of Success (POS)

The basic concept here is to simply attach a single value to each action that captures the naive probability that the action will succeed. This allows the planning system to choose between actions on the basis of maximizing the overall probability of success of the plan.

Since no representation of the results of the failure of the action are included, the system cannot reason about what might happen if the action fails, avoiding the multiple worlds problem associated with more complex representations. However, that has the problem of not being able to reason in a more global manner, possibly resulting in sub-optimal planning.

Quality

Here we get a mapping between the likelihood of an operator succeeding, and the selection of that operator by the planning system. However, this model tacitly assumes that if things go wrong, nothing bad will happen (the fail negative hypothesis), so it has no way of avoiding potentially bad consequences.

Size

The size impact is pretty close to minimal, since this single naive probability can be represented by a single real value, again, a constant.

Availability

The availability of this number is almost as trivial as the base case. Either by modeling the actions, or by empirical studies, the developer can establish a good approximation for the success rate. In addition, the planning system itself can update these values with very little overhead.

Plan Space

This has no impact on the plan space, since the planning system is simply executing a search for an optimal plan, but it is the maximum probability plan instead of the shortest feasible plan. Research done by Ferrer and Gunderson show a linear time transform to reduce this to classical planning.

Elaboration Time

The planning process is the same as for classical planning, so there is no effective change in the elaboration time

Representation Quality Size Avail. Plan Space Elab. Time
Prob of Success Good O(17) Almost Trivial O(A^N) exp

Return to Candidate List


The Bad, and the Ugly, and the Good (BUG)

The notion behind this model is to overcome (to a degree) the fact that POS has no idea what can go wrong. For example, suppose that there were two actions, either of which could be used to cause a desired outcome. Action A has a probability of success of 0.9, and when it fails, nothing happens, the world simply remains in its original state. Action B has a probability of success of 0.9 as well, however, when it fails, the arm falls off the robot. To a planner that has only a POS representation, these two actions are equivalent, and either one may be used to achieve the goal.

In terms of 'intelligent' behavior (whatever that means) these two actions are far from interchangeable, however. It would be nice if a planner could take into account the possible negative ramification of pursuing a particular course of action. This idea, when executed fully, results in the development of complete models. As described below, these models can have significant problems, so BUG is an attempt to capture some of the benefits of complete modeling, while avoiding the problems.

Quality

Now, in addition to picking the 'most likely to succeed' plans, we can also attempt to avoid the really bad plans.

Size

Clearly this model will increase the size of the input set. It will be necessary to not only describe the intended outcome and the probability associated with that, but to produce a second value that captures the probability of critical un-intended outcomes, or the magnitude of the risk associated with the action. However, this increases the input size by a single value.

Availability

Availability is a problem on two fronts: first, it may be difficult (or impossible) to define a 'bad outcome', what is 'bad' with respect to one goal may be a desired behavior with respect to another. Second, given that the designer can define this 'bad' outcome, there is no guarantee that this value will be context independent. For example, it the action is to cross a gulf on a narrow board, the 'badness' is low if the gulf is a few inches deep, but is very high if it is between to tall buildings, or a creek full of rushing water. So assigning values to these is hard.

Plan Space

Depending on how the information is used, this may have little effect on the plan space. Rather, it is simply a modification to the evaluation function associated with each proposed plan.

Elaboration Time

The effect on Elaboration Time is also probably small, perhaps requiring no additional time, if the modified evaluation function is used, or perhaps each candidate plan is simulated to determine 'badness' adding increasing by a linear amount the simpler POS case. Finally, it might be that the planner builds a set of high ranking positive plans, and a set of low ranking plans based on their badness, and when these two sets intersect, that that plan is chosen.

Representation Quality Size Avail. Plan Space Elab. Time
BUGBetter O(17) Hard O(A^N) exp

Return to Candidate List


Subset of outcomes (SUBSET)

In the subset of outcomes representation we introduce the notion that an operator can have multiple outcomes. So, for each operator a small subset of the possible outcomes are modeled. This is done as a single operator with multiple results. As a plan is constructed, an entire family of outcomes is maintained, showing both the intended results, and modeling the un-intentional bad outcomes.

Quality

Since the planning system has data about the possible 'bad' outcomes of the steps in the plan, it can reason about, and attempt to avoid plans which have high likelihoods of failing to achieve the plan goals. Unfortunately, to take full advantage of this, the planner would need to reason about the interactions between various failure modes and eventual goal satisfaction. In this model, the planner does not do that step, so the plan quality is not perfect.

Size

This has the effect of forcing the planning system to explore a much larger space to determine the quality of a candidate plan, but can substantially improve the ability to select a good plan. We need to look at not just A actions at each step, but S possible outcomes, so we have O((A*S)^N) plan evaluations to do.

Availability

However, it requires that the developer of the domain have access to the set of possible outcomes of each operator, the probability distribution of those outcomes, and they must select a good subset to model. This process of modeling the risks associated with implementing a plan is complex, and the process of assessing the risks is dependent on the skills of the analyst, and not well suited to any form of automated analysis (See Risk Analysis for a recent summary of quantitative risk analysis)

Plan Space

The plan space contains the same number of plans, however, each plan is much more complex than a simple linear plan. This effects the Elaboration time dramatically, since it is possible to get an exponential increase in the complexity of each plan, if the subsets are too large, or if they are poorly chosen.

Elaboration Time

Due to the increased complexity of the candidate plans, both the time needed to construct the candidates, and the time needed to evaluate them increases dramatically. This can result in an exponential increase in the already exponential planning time.

RepresentationQuality Size Avail. Plan Space Elab. Time
SUBSET Very GoodO(S) Poor O((A*S)^N) exp+

Return to Candidate List


Fully Observable "Complete" Model (FOM)

The complete models are based on the concept of maximizing the expected value of a plan with respect to a goal. They typically used either Markov Decision Problems (MDPs) or Decision Theoretic representations of the problem, and solve these for an 'optimal' policy.

The fully observable model makes the assumption that the planner has complete and accurate world knowledge at the time of planning, and that the world won't change significantly before the plan is executed. Given these assumptions, and a complete probability distribution of the possible outcomes of every operator, these models use Linear programming techniques to produce a policy. A policy is a mapping from world states to actions, such that in every world state, the execution system simply selects the predetermined best action, and it slides into an new world state, from which it selects the best action, etc. until it reaches the goal.

The nice thing about this representation is that once it is solved, a state transition table can be produced and the execution system becomes a simple lookup function. The not so nice thing about this representation are:

  1. Getting an exhaustive, and mutually exclusive outcome set for every operator,
  2. Getting the probability distribution for these sets, and
  3. Solving the LP problem.
Even if these criteria can be met, solving the fully observable problem is difficult. These problems fall into the class EXP-Space, indicating that they require exponential space (and therefore at a minimum exponential time to execute. See Littman, Goldsmith and Mundhenk for an analysis) and without severe restrictions are undecidable. That's if you can get the probability distributions for the complete outcome set. It is generally believed that this is unrealistic in all but toy problems.

The result of this is to throw away any attempt to completely model the domains, and instead attempt to produce an "approximate" representation. As it is put in Abstraction and Approximate Decision-theoretic Planning: "Much emphasis in DTP research has been placed on the issue of speeding up computation by means of approximation. ... This approach reduces the state space to locally accessible regions and allows OR methods to be used on reduced problems. While optimality is sacrificed, judicious choice of relevant state can lead to good approximations"

Quality

If the complete model is done, this is optimal for the assumptions.

Size

Massively ugly, since it is EXP-Space, it gets real big real fast.

Availability

This is the second big problem, for most actions, this is not possible, or is very difficult.

Plan Space

Given the input size, the plan space isn't too bad, but in comparison with a classical planner, it is monstrous

Elaboration Time

Here is where it gets even worse. While it is true that many LP problems can be solved in polynomial time (on the exponential input of course), there is no guarantee of this. Some problems require exponential time (on the exponential input).

Execution Time

Trivial, look up the world state, and select an action. A little slower than the sequential plan, since the lookup is required, but with a good hash function and a small state space, we're o.k.

Representation Quality Size Avail. Plan Space Elab. Time
FOM Excellent O(K^N) May be impossible O(A^(S^N)) hyper-exp

Return to Candidate List


Partially Observable "Complete" Model (POM)

Up to this point, the models that have been discussed have all assumed that they had an accurate picture of the world, and that they had complete knowledge. However, when planning in the real world, that is rarely the case. Far more frequently, the planning is based on a 'best guess' of the current state of the world, and the accuracy of this guess is affected by many different things. Among these are: This uncertainty has an impact on the ability of the planning system to develop effective plans. If a service robot develops a delivery plan based on the assumption that a specific door along the path is open, when the door is closed, the service robot may need to make significant changes to its planned route. [For a pretty, high-level overview of service robots, see Service Robots. But don't expect technology as much as concepts.]

So the question becomes "Can a planner improve its ability to achieve goals by reasoning about the uncertainty in its knowledge of the world?" The traditional POMDP approach restates this as "Can the planning system improve its ability to satisfy goals by reducing its uncertainty about the state of the world?" Can the system plan for additional sensing operations and there by improve performance?

It seems that the short answer is yes. A common example is to travel to the a point where the robot can sense the state of the door, then, if it is open continue on, otherwise backtrack and go the long way around. However, this ability to plan for sensing operations comes at a price. In a recent paper by Madani, Hanks and Condon they show that POMDPs are undecidable for a number of criteria, including total reward, average reward per stage, and plan existence.

Needless to say, POMDPs have the same large input set as fully observable models above, and in the case of approximating the required data, this paper demonstrates that if the length of a candidate plan is bounded in size - even by an exponential function of the input description length - the solution found can be arbitrarily suboptimal.

Quality

If the complete model can be done, this is optimal for the assumptions.

Size

This is at least as bad as the Fully observable model

Availability

This is the second big problem, for most actions, this is not possible, or is very difficult.

Plan Space

Monstrous

Elaboration Time

Add to the FOM, the need to do multiple assessments of possible world state.

Representation Quality Size Avail. Plan Space Elab. Time
POM Excellent O(K^N) may be impossible O(A^(S^N)) may be unbounded

Return to Candidate List


Evaluation Summary

So, now the question becomes what does this mean? Can we look at these characteristics of the various representations, and say that one or another is clearly a better choice for representing uncertainty in our domains.

here is a summary of the rating assigned above.

Representation Quality Size Avail. Plan Space Elab. Time
None Luck O(17) trivial O(A^N) exp
Prob of Success Good O(17) almost trivial O(A^N) exp
BUG Better O(17) Hard O(A^N) exp
SUBSET Very GoodO(AS) Really Hard O((A*S)^N) exp+
FOM Excellent O(K^N) May be impossible O(A^(S^N)) hyper-exp
POM Excellent O(K^N) May be impossible O(A^(S^N)) possibly unbounded

So the first thing we need to do is figure out how to numerically value these ratings. Since we're in the land of exponential increases, I've somewhat arbitrarily decided on an exponential scale. So, for example the Plan quality scale which runs from Luck to Excellent, is assigned as follows, along with the rest of the features.

Feature124816
QualityLuckGoodBetterVery GoodExcellent
SizeO(17)O(AS) O(K^N)N/A N/A
AvailabilityTrivialAlmost TrivialPoorHardMay Be Impossible
SpaceO(A^N)O((A*S)^N)O(A^(S^N))N/AN/A
Timeexpexp+hyper-expN/Amay be unbounded

One note, in the case of unbounded, or potentially impossible, I assigned the maximum value of 16, and for some features, some values are not used (such as hyper-hyper-exp in Time).

Based on this, and using a simple benefit/cost model the results are as follows:

This clearly suggests that the the range from BUG to FOM is the area to look into. For some of the reasons expressed in the section on Execution time, below, I lean towards the less complex end of this spectrum, in the BUG and Subset areas.


More on Execution Time

The question about execution time gets tricky. The simple answer that the classical planner may be invoked many times, where the FOM planner is only invoked once may be incorrect, and even if it is correct, may be unimportant. The issue, I think, comes down to the fact that we can't really do a complete model. This means that there are transitions (outcomes from applying an operator) which are not accounted for in the FOM or POM representation. When one of these 'unexpected' transitions occurs (as it must), we will land in a state for which we have no corresponding action ready to invoke. This will result in the need to re-run the FOM or POM planner, at great expense.

Clearly there is a trade-off here. The more complete the model, the less frequently we will 'fall off the map'. However, the more complete the model, the higher the cost on running the planner. Since the run time grows hyper-exponentially, and the coverage (the probability that an outcome is within the model space) grows ??? (logarithmically at a guess), we should show a win by covering fewer possibilities, pushing us towards the Subset model.
back up


References

Bylander, T. "The Computational Complexity of Propositional STRIPS Planning", Artificial Intelligence vol 69, pgs 165-204, 1994

Dearden, R. and Boutilier, C. "Abstraction and Approximate Decision-Theoretic Planning", Artificial Intelligence Vol 89, pgs 219-283, 1997.

Gunderson, J. P. and Ferrer, G. J. "A Linear Time Transform for Probability-Aware Planning", Proceedings of the IEEE Conference on Systems, Man, and Cybernetics 2000.

Littman, M. L., Goldsmith, J., and Mundhenk, M. "The Computational Complexity of Probabilistic Planning", Journal of Artificial Intelligence Research, Vol 9, pgs 1-36, 1998

Littman, M. L., Dean, T. L., and Kaelbling, L. P. "On the complexity of Solving Markov Decision Problems", Proceedings of Conference on Uncertainty in Artificial Intelligence (UAI-1995), 1995

Madani, O. and Hanks, S. and Condon, A. "On the Undecidability of Probabilistic Planning and Infinite Horizon Partially Observable Markov Decision Problems", Proceedings of American Association for Artificial Intelligence, 1999.

Schraft, R. D. and Schmierer, G. "Service Robots", A. K. Peters, Natick MA. 2000

Vose, D. "Risk Analysis, a quantitative guide 2nd. Ed.", John Wiley and Sons, LTD. 2000


jpg3u@virginia.edu
Last modified: Thu Jan 25 18:53:42 2001