Alternative Representation of Uncertainty for Probabilistic Planning
The purpose of this document is to list a set of possible schema for
the representation of uncertainty to be used by a
Probability-Aware Planner. The high level design places this
planner into an interleaved plan execution system, which allows the
planner to elaborate a plan and execute it. If the plan executes
flawlessly, the systems goals are achieved, however, in the event that
part of the plan fails, or that upon termination the system is not in
its desired state, planning begins again, and the new plan is
executed.
As a note I will refer to these two stages as plan elaboration and
plan execution, or more simply elaboration and
execution stages. The primary reason for splitting these two
apart is that some representations of uncertainty may provide strong
benefits during one of these stages, but result in high penalties in
the other. By separating these two issues, a better side-by-side
comparison can be made of the candidate representations.
I have listed six candidate representations for the uncertainty of
applying actions in a domain with variability. Below, each is
expanded to identify its strengths and weaknesses, and these are
assigned values for several features.
The feature set that is being used for evaluation is evolving, as
various aspects of the representations are explored, it may be that
additional features will be added to the evaluation criteria. If that
is the case, the summary tables will be updated to show the evaluation
of each candidate on that feature axis.
The feature set is:
| Feature |
Definition/Purpose |
| |
|
| Quality |
What is the impact on the quality of the solution. For an
uncertain domain, the quality would be the the one that provides the
best overall plan satisfaction.
best |
| Size |
The required storage of the information. |
| Availability |
How will the data be acquired, and what is the level of effort
needed to supply it in the general case. |
| Plan Space |
The effect of the representation on the plan space that must be
explored to find plans |
| Elaboration Time |
What is the estimated impact on elaboration time |
A note on Execution time
Any use of execution time is currently missing from the feature
space. This has a couple of reasons. First, by splitting elaboration
time out, we would be left with the time needed to execute the plan,
which will really vary depending on the hardware. Second, and more
importantly, all of these planning systems produce output which can be
mapped onto a lookup table. In the case of a classical plan, an
ordered sequence of steps, this maps onto a table with one action per
entry, and a simple counter, the algorithm is PC=PC+1, Execute
Act[PC]. In the case of the more complex models, we have ('Current
State' , 'Action') pairs; lookup the current state, and do the
associated action. So there is no real difference between the
execution times of any of these.
The argument might be made, "But wait, in
the classical planning mode, you will have to re-plan many times
during the execution, shouldn't this be counted as part of the
execution time?" Or at least this should be represented in the
elaboration time, to capture the fact that several plan elaboration
phases may be needed to achieve a single goal, while the complete
models will produce a single policy that covers all the possible
failures. This is a good question, which will be addressed below.
No Representation (Classical Planning)
This option has no representation of uncertainty at all. From the
point of view of the planner, all actions are equally likely. If this
likelihood is assumed to be 1, then all goal achieving plans are
equally likely, and there is no preference shown to any one plan over
another. If the likelihood of all actions is less than one, then the
goal achieving plan with the fewest number of actions is more probable
that longer plans, and will be preferred.
Quality
This representation provides no improvement in plan quality over a
classical planner. If it happens that the Shortest Feasible Plan is
the most probable, or if there is only one feasible plan, then it does
as well as any other representation.
Size
This is the base case, with no representation, so we'll
group it with the constant sized representations. O(17)
Availability
All that is needed for this representation is the
list of actions available to the planner. So this is the most easily
available, so we'll label it 'trivial'.
Plan Space
The plan space tries to capture the effect that
utilizing this representation has on the space that must be
explored. No representation corresponds to the the traditional
planning model, where there are an exponential number of possible
plans. If there are A actions that can be applied, then there are
A^N plans of length N. These must be compared to find the 'best'
feasible plan, so we're O(A^N)
Elaboration Time
The traditional approaches to planning with
no representation of uncertainty (and no functions) suggests that this
is exponential in the size of the input (primarily actions). Ref: Bylander 1994
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| None | Luck | O(17) | Trivial | O(A^N) |
exp |
Return to Candidate List
Probability of Success (POS)
The basic concept here is to simply attach a single value to each
action that captures the naive probability that the action will
succeed. This allows the planning system to choose between actions
on the basis of maximizing the overall probability of success of the plan.
Since no representation of the results of the failure of the action
are included, the system cannot reason about what might happen if the
action fails, avoiding the multiple worlds problem associated with
more complex representations. However, that has the problem of not
being able to reason in a more global manner, possibly resulting in
sub-optimal planning.
Quality
Here we get a mapping between the likelihood of an operator succeeding,
and the selection of that operator by the planning system. However,
this model tacitly assumes that if things go wrong, nothing bad will
happen (the fail negative hypothesis), so it has no way of avoiding
potentially bad consequences.
Size
The size impact is pretty close to minimal, since this single naive
probability can be represented by a single real value, again, a constant.
Availability
The availability of this number is almost as
trivial as the base case. Either by modeling the actions, or by
empirical studies, the developer can establish a good approximation
for the success rate. In addition, the planning system itself can
update these values with very little overhead.
Plan Space
This has no impact on the plan space, since the planning system is
simply executing a search for an optimal plan, but it is the maximum
probability plan instead of the shortest feasible plan. Research done
by Ferrer and Gunderson show a
linear time transform to reduce this to classical planning.
Elaboration Time
The planning process is the same as for classical planning, so there
is no effective change in the elaboration time
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| Prob of Success | Good | O(17) |
Almost Trivial |
O(A^N) | exp |
Return to Candidate List
The Bad, and the Ugly, and the Good (BUG)
The notion behind this model is to overcome (to a degree) the fact
that POS has no idea what can go wrong. For example, suppose that
there were two actions, either of which could be used to cause a
desired outcome. Action A has a probability of success of 0.9, and
when it fails, nothing happens, the world simply remains in its
original state. Action B has a probability of success of 0.9 as
well, however, when it fails, the arm falls off the robot. To a
planner that has only a POS representation, these two actions are
equivalent, and either one may be used to achieve the goal.
In terms of 'intelligent' behavior (whatever that means) these two
actions are far from interchangeable, however. It would be nice if a
planner could take into account the possible negative ramification of
pursuing a particular course of action. This idea, when executed
fully, results in the development of complete models. As described
below, these models can have significant problems, so BUG is an
attempt to capture some of the benefits of complete modeling, while
avoiding the problems.
Quality
Now, in addition to picking the 'most likely to succeed' plans, we can
also attempt to avoid the really bad plans.
Size
Clearly this model will increase the size of the input set. It will be
necessary to not only describe the intended outcome and the
probability associated with that, but to produce a second value that
captures the probability of critical un-intended outcomes, or the
magnitude of the risk associated with the action. However, this
increases the input size by a single value.
Availability
Availability is a problem on two fronts: first, it may be difficult
(or impossible) to define a 'bad outcome', what is 'bad' with respect
to one goal may be a desired behavior with respect to another. Second, given
that the designer can define this 'bad' outcome, there is no guarantee
that this value will be context independent. For example, it the
action is to cross a gulf on a narrow board, the 'badness' is low if
the gulf is a few inches deep, but is very high if it is between to
tall buildings, or a creek full of rushing water. So assigning values
to these is hard.
Plan Space
Depending on how the information is used, this may have little effect
on the plan space. Rather, it is simply a modification to the
evaluation function associated with each proposed plan.
Elaboration Time
The effect on Elaboration Time is also probably small, perhaps
requiring no additional time, if the modified evaluation function is
used, or perhaps each candidate plan is simulated to determine 'badness'
adding increasing by a linear amount the simpler POS case. Finally,
it might be that the planner builds a set of high ranking positive
plans, and a set of low ranking plans based on their badness, and when
these two sets intersect, that that plan is chosen.
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| BUG | Better | O(17) | Hard | O(A^N) |
exp |
Return to Candidate List
Subset of outcomes (SUBSET)
In the subset of outcomes representation we introduce the notion that
an operator can have multiple outcomes. So, for each operator a small
subset of the possible outcomes are modeled. This is done as a single
operator with multiple results. As a plan is constructed, an entire
family of outcomes is maintained, showing both the intended results,
and modeling the un-intentional bad outcomes.
Quality
Since the planning system has data about the possible 'bad' outcomes
of the steps in the plan, it can reason about, and attempt to avoid
plans which have high likelihoods of failing to achieve the plan
goals. Unfortunately, to take full advantage of this, the planner
would need to reason about the interactions between various failure
modes and eventual goal satisfaction. In this model, the planner does
not do that step, so the plan quality is not perfect.
Size
This has the effect of forcing the planning system to explore a
much larger space to determine the quality of a candidate plan, but can
substantially improve the ability to select a good plan. We need to
look at not just A actions at each step, but S possible outcomes, so
we have O((A*S)^N) plan evaluations to do.
Availability
However, it requires that the developer of the
domain have access to the set of possible outcomes of each operator,
the probability distribution of those outcomes, and they must select a
good subset to model. This process of modeling the risks associated
with implementing a plan is complex, and the process of assessing the
risks is dependent on the skills of the analyst, and not well suited
to any form of automated analysis (See Risk
Analysis for a recent summary of quantitative risk analysis)
Plan Space
The plan space contains the same number of plans, however, each
plan is much more complex than a simple linear plan. This effects the
Elaboration time dramatically, since it is possible to get an
exponential increase in the complexity of each plan, if the subsets
are too large, or if they are poorly chosen.
Elaboration Time
Due to the increased complexity of the candidate plans, both the time
needed to construct the candidates, and the time needed to evaluate
them increases dramatically. This can result in an exponential
increase in the already exponential planning time.
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| SUBSET | Very Good | O(S) |
Poor | O((A*S)^N) | exp+ |
Return to Candidate List
Fully Observable "Complete" Model (FOM)
The complete models are based on the concept of maximizing the
expected value of a plan with respect to a goal. They typically used
either Markov Decision Problems (MDPs) or Decision Theoretic
representations of the problem, and solve these for an 'optimal'
policy.
The fully observable model makes the assumption that the planner
has complete and accurate world knowledge at the time of planning, and
that the world won't change significantly before the plan is
executed. Given these assumptions, and a complete probability
distribution of the possible outcomes of every operator, these models
use Linear programming techniques to produce a policy. A policy is a
mapping from world states to actions, such that in every world state,
the execution system simply selects the predetermined best action, and
it slides into an new world state, from which it selects the best
action, etc. until it reaches the goal.
The nice thing about this representation is that once it is solved,
a state transition table can be produced and the execution system
becomes a simple lookup function. The not so nice thing about this
representation are:
- Getting an exhaustive, and mutually exclusive outcome set for
every operator,
- Getting the probability distribution for these sets, and
- Solving the LP problem.
Even if these criteria can be met, solving the fully observable
problem is difficult. These problems fall into the class EXP-Space,
indicating that they require exponential space (and therefore at a
minimum exponential time to execute. See
Littman, Goldsmith and Mundhenk for an analysis) and without
severe restrictions are undecidable. That's if you can get the
probability distributions for the complete outcome set. It is
generally believed that this is unrealistic in all but toy problems.
The result of this is to throw away any attempt to completely model
the domains, and instead attempt to produce an "approximate"
representation. As it is put in Abstraction
and Approximate Decision-theoretic Planning:
"Much emphasis in DTP research has been placed on the issue of
speeding up computation by means of approximation. ... This approach
reduces the state space to locally accessible regions and allows OR
methods to be used on reduced problems. While optimality is
sacrificed, judicious choice of relevant state can lead to good
approximations"
Quality
If the complete model is done, this is optimal for the assumptions.
Size
Massively ugly, since it is EXP-Space, it gets real big real fast.
Availability
This is the second big problem, for most actions, this is not
possible, or is very difficult.
Plan Space
Given the input size, the plan space isn't too bad, but in comparison
with a classical planner, it is monstrous
Elaboration Time
Here is where it gets even worse. While it is true that many LP
problems can be solved in polynomial time (on the exponential input of
course), there is no guarantee of this. Some problems require
exponential time (on the exponential input).
Execution Time
Trivial, look up the world state, and select an action. A little
slower than the sequential plan, since the lookup is required, but
with a good hash function and a small state space, we're o.k.
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| FOM | Excellent | O(K^N) | May be
impossible | O(A^(S^N)) | hyper-exp |
Return to Candidate List
Partially Observable "Complete" Model (POM)
Up to this point, the models that have been discussed have all assumed
that they had an accurate picture of the world, and that they had
complete knowledge. However, when planning in the real world, that is
rarely the case. Far more frequently, the planning is based on a 'best
guess' of the current state of the world, and the accuracy of this
guess is affected by many different things. Among these are:
- errors in the sensors,
- limited sensor ranges, and
- events that change the state of the world.
This uncertainty has an impact on the ability of the planning system
to develop effective plans. If a service robot develops a delivery
plan based on the assumption that a specific door along the path is
open, when the door is closed, the service robot may need to make
significant changes to its planned route. [For a pretty, high-level
overview of service robots, see Service
Robots. But don't expect technology as much as concepts.]
So the question becomes "Can a planner improve its ability to
achieve goals by reasoning about the uncertainty in its knowledge of
the world?" The traditional POMDP approach restates this as "Can the
planning system improve its ability to satisfy goals by reducing its
uncertainty about the state of the world?" Can the system plan for
additional sensing operations and there by improve performance?
It seems that the short answer is yes. A common example is to
travel to the a point where the robot can sense the state of the door,
then, if it is open continue on, otherwise backtrack and go the long
way around. However, this ability to plan for sensing operations comes
at a price. In a recent paper by Madani, Hanks
and Condon they show that POMDPs are undecidable for a number of
criteria, including total reward, average reward per stage, and plan
existence.
Needless to say, POMDPs have the same large input set as fully
observable models above, and in the case of
approximating the required data, this paper demonstrates that if the
length of a candidate plan is bounded in size - even by an exponential
function of the input description length - the solution found can be
arbitrarily suboptimal.
Quality
If the complete model can be done, this is optimal for the assumptions.
Size
This is at least as bad as the Fully observable model
Availability
This is the second big problem, for most actions, this is not
possible, or is very difficult.
Plan Space
Monstrous
Elaboration Time
Add to the FOM, the need to do multiple assessments of possible world
state.
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| POM | Excellent | O(K^N) | may be impossible | O(A^(S^N)) |
may be unbounded |
Return to Candidate List
Evaluation Summary
So, now the question becomes what does this mean? Can we look at these
characteristics of the various representations, and say that one or
another is clearly a better choice for representing uncertainty in our
domains.
here is a summary of the rating assigned above.
| Representation | Quality | Size | Avail. | Plan Space |
Elab. Time |
| None | Luck | O(17) | trivial | O(A^N) |
exp |
| Prob of Success | Good | O(17) |
almost trivial |
O(A^N) | exp |
| BUG | Better | O(17) | Hard | O(A^N) |
exp |
| SUBSET | Very Good | O(AS) |
Really Hard | O((A*S)^N) | exp+ |
| FOM | Excellent | O(K^N) | May be
impossible | O(A^(S^N)) | hyper-exp |
| POM | Excellent | O(K^N) | May be
impossible | O(A^(S^N)) | possibly unbounded |
So the first thing we need to do is figure out how to numerically
value these ratings. Since we're in the land of exponential increases,
I've somewhat arbitrarily decided on an exponential scale. So, for
example the Plan quality scale which runs from Luck to Excellent, is
assigned as follows, along with the rest of the features.
| Feature | 1 | 2 | 4 | 8 | 16 |
| Quality | Luck | Good | Better | Very Good | Excellent |
| Size | O(17) | O(AS)
| O(K^N) | N/A | N/A |
| Availability | Trivial | Almost
Trivial | Poor | Hard | May Be Impossible |
| Space | O(A^N) | O((A*S)^N) | O(A^(S^N)) | N/A | N/A |
| Time | exp | exp+ | hyper-exp | N/A | may
be unbounded |
One note, in the case of unbounded, or potentially impossible, I
assigned the maximum value of 16, and for some features, some values
are not used (such as hyper-hyper-exp in Time).
Based on this, and using a simple benefit/cost model the results
are as follows:
This clearly suggests that the the range from BUG to FOM is the
area to look into. For some of the reasons expressed in the section on
Execution time, below, I lean towards the less complex end of this
spectrum, in the BUG and Subset areas.
More on Execution Time
The question about execution time gets tricky. The simple answer that
the classical planner may be invoked many times, where the FOM planner
is only invoked once may be incorrect, and even if it is correct, may
be unimportant. The issue, I think, comes down to the fact that we
can't really do a complete model. This means that there are
transitions (outcomes from applying an operator) which are not
accounted for in the FOM or POM representation. When one of these
'unexpected' transitions occurs (as it must), we will land in a state
for which we have no corresponding action ready to invoke. This will
result in the need to re-run the FOM or POM planner, at great expense.
Clearly there is a trade-off here. The more complete the model, the
less frequently we will 'fall off the map'. However, the more complete
the model, the higher the cost on running the planner. Since the run
time grows hyper-exponentially, and the coverage (the probability that
an outcome is within the model space) grows ??? (logarithmically at a
guess), we should show a win by covering fewer possibilities, pushing
us towards the Subset model.
back up
References
Bylander, T.
"The Computational Complexity of Propositional STRIPS Planning",
Artificial Intelligence vol 69, pgs 165-204, 1994
Dearden, R. and Boutilier, C.
"Abstraction and Approximate Decision-Theoretic Planning",
Artificial Intelligence Vol 89, pgs 219-283, 1997.
Gunderson, J. P. and Ferrer,
G. J.
"A Linear Time Transform for Probability-Aware Planning",
Proceedings of the IEEE Conference on Systems, Man, and Cybernetics 2000.
Littman, M. L., Goldsmith, J., and
Mundhenk, M.
"The Computational Complexity of Probabilistic Planning",
Journal of Artificial Intelligence Research, Vol 9, pgs 1-36, 1998
Littman, M. L., Dean, T. L., and Kaelbling,
L. P.
"On the complexity of Solving Markov Decision Problems",
Proceedings of Conference on Uncertainty in Artificial Intelligence
(UAI-1995), 1995
Madani, O. and Hanks, S. and Condon, A.
"On the Undecidability of Probabilistic Planning and Infinite Horizon
Partially Observable Markov Decision Problems",
Proceedings of American Association for Artificial Intelligence, 1999.
Schraft, R. D. and Schmierer, G.
"Service Robots",
A. K. Peters, Natick MA. 2000
Vose, D.
"Risk Analysis, a quantitative guide 2nd. Ed.",
John Wiley and Sons, LTD. 2000
jpg3u@virginia.edu
Last modified: Thu Jan 25 18:53:42 2001