# Plan Recognition as Planning Revisited

Shirin Sohrabi, Anton V. Riabov, Octavian Udrea
IBM T.J. Watson Research Center
1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA
{ssohrab, riabov, udrea}@us.ibm.com

## Abstract

Recent work on plan recognition as planning has shown great promise in the use of a domain theory and general planning algorithms for the plan recognition problem. In this paper, we propose to extend previous work to (1) address observations over fluents, (2) better address unreliable observations (i.e., noisy or missing observations), and (3) recognize plans in addition to goals. To this end, we introduce a relaxation of the plan-recognition-as-planning formulation that allows unreliable observations. That is, in addition to the original costs of the plan, we define two objectives that account for missing and noisy observations, and optimize for a linear combination of all objectives. We approximate the posterior probabilities of generated plans by taking into account the combined costs, which include penalties for missing or noisy observations, and normalizing over a sample set of plans generated by finding either diverse or high-quality plans. Our experiments show that this approach improves goal recognition in most domains when observations are unreliable. In addition, we evaluate plan recognition performance and show that the high-quality plan generation approach should be preferred in most domains.

## 1 Introduction

Plan recognition is the problem of recognizing the plans and the goals of an agent given a set of observations. There exists a number of different approaches to the plan recognition problem, including the use of SAT solvers and planning (e.g., [Ramírez and Geffner, 2009; Zhuo et al., 2012]), where the domain theory is given as input, as well as techniques that assume a plan library is given as input [Sukthankar et al., 2014]. Plan recognition continues to be an important problem to study, as it has many practical applications such as assisted cognition [Pentney et al., 2006], computer games [Synnaeve and Bessière, 2011], and network monitoring [Sohrabi et al., 2013; Riabov et al., 2015].

In this paper, we revisit the work on plan recognition as planning by Ramírez and Geffner [2009; 2010], where the domain theory is given as input and planning algorithms are used to generate a solution to the plan recognition problem. The advantage of using a domain theory and planning is that a plan library is no longer required as input; in addition, AI planning techniques can be exploited to compute the solution. Hence, the use of a domain theory and planning algorithms for plan recognition is more flexible, general, and scalable.

While the use of planning for plan recognition by Ramírez and Geffner [2009; 2010] showed great promise and allowed for some noisy behavior, it did not effectively deal with unreliable observations. Unreliable observations can be the result of a sensor malfunction, intentional obfuscation by malware, or simply a mistake that an agent makes [Thorsley et al., 2008]. We consider two classes of unreliable observations: (1) noisy observations, which cannot be explained by the actions of a plan for a particular goal and hence need to be discarded, and (2) missing observations, which should have been picked up but were not, possibly because the sensors malfunctioned.
Unreliable observations present a challenge for the previous work due to the way the posterior probability of a goal G given observations O, P(G|O), is calculated. Even if we have a reasonable prior estimate of P(G), P(G|O) as calculated by previous work will be underestimated when observations are unreliable, because the cost difference between achieving G and O, and achieving G and not O, will be large. This is because either the plan that satisfies G and complies with O has a much higher cost than the plan that satisfies G and does not comply with O (or complies with only part of the observations in O), or possibly no plan satisfying both O and G, either optimal or suboptimal, exists.

The main contribution of this paper is a relaxation of the plan-recognition-as-planning formulation that allows for unreliable observations. That is, we provide a transformation of the original plan recognition problem into a new planning problem with action costs, where the costs now include penalties for missing or noisy observations. Hence, in addition to the original costs of the plan, we define two objectives that account for missing and noisy observations, and optimize for a linear combination of all objectives. We approximate the posterior probabilities of plans by taking into account the cost of each plan in the transformed planning problem, and normalizing over a sample set of plans generated by finding either diverse or high-quality plans. The posterior probability of a goal is then computed by summing the posterior probabilities of the sampled plans that achieve this goal. In addition, unlike most plan recognition approaches, we consider observations over fluents and argue that in many applications the actual actions of an agent may not be observable; instead, their effects are observable through changes in the state. We use a diverse planner, LPG-d [Nguyen et al., 2012], and a top-k planner, TK* [Riabov et al., 2014], to generate a set of plans. Our experiments show that our approach improves goal recognition in most domains when observations are unreliable. In addition, we evaluate plan recognition performance and show that the top-k plan generation approach performed better in most domains.

## 2 Motivating Example

*Figure 1: Goal Recognition Example*

Consider the grid example shown in Figure 1, where an agent initially at the cell marked S heads to one of the goals A or B. The agent can move diagonally, horizontally, or vertically, all at a cost of 1. The arrows indicate the observations. First, assume that the agent can move to any neighboring cell, including the grey cells. Using the algorithm of the previous approach [Ramírez and Geffner, 2010], where P(O|G) is defined by e^(−βΔ)/(1 + e^(−βΔ)) and Δ is the cost difference, P(G=B|O) = 0.65, because the cost difference Δ for B is 0, while P(G=A|O) = 0.35 (the numbers are normalized). However, the algorithm that we describe in this paper concludes that P(G=A|O) is higher than P(G=B|O), because pursuing goal A results in fewer missing observations. Hence, when the agent is pursuing goal A, our approach assigns a higher posterior probability to the true goal. The previous approach fails to assign a high probability to the true goal because it only penalizes the agent for not pursuing the optimal plan and does not take into account the higher number of missing observations if the agent were to choose the optimal plan.
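For concreteness, the previous approach's computation on this example can be sketched as follows. This is our own illustration, not the authors' code: we assume β = 1 and uniform priors, and take Δ = 1 for goal A and Δ = 0 for goal B, values consistent with the reported 0.65/0.35 split.

```python
import math

def rg_posteriors(deltas, priors=None, beta=1.0):
    """Goal posteriors in the style of Ramirez & Geffner (2010).

    deltas: dict goal -> cost difference Delta between the best plan for G
            that complies with O and the best plan for G that does not.
    """
    goals = list(deltas)
    if priors is None:                      # assume uniform priors over goals
        priors = {g: 1.0 / len(goals) for g in goals}
    # likelihood P(O|G) as a sigmoid of the cost difference
    lik = {g: math.exp(-beta * deltas[g]) / (1 + math.exp(-beta * deltas[g]))
           for g in goals}
    z = sum(lik[g] * priors[g] for g in goals)   # normalizing constant
    return {g: lik[g] * priors[g] / z for g in goals}

# Grid example: Delta is 0 for goal B; we assume Delta = 1 for goal A.
print(rg_posteriors({"A": 1.0, "B": 0.0}))   # ~{'A': 0.35, 'B': 0.65}
```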
Now consider a modified version of the grid example where we know that the agent cannot revisit a cell, cannot move to the grey cells, and cannot move diagonally up and to the left (i.e., northwest). Then, using the previous approach, P(G=B|O) = 0 because there is no plan, optimal or suboptimal, that achieves both G and O, and P(G=A|O) = 1. However, it is possible that the third observation is noisy and that the agent was really pursuing goal B. Hence, our algorithm does not prune goal B and instead assigns probabilities such that 0 < P(G=B|O) < P(G=A|O). Therefore, in the case that the agent was pursuing goal B, our approach assigns a higher posterior probability to the true goal than the previous approach does.

## 3 Plan Recognition with Domain Theory

Before describing the plan recognition problem, we highlight a few key differences between our approach and prior work: (1) unlike many plan recognition approaches, we do not require plan libraries as input; instead, we require a domain theory; (2) we consider observations over fluents, since in many applications the actual actions of an agent may not be observable, while their effects are observable through changes in the state; (3) we consider unreliable (i.e., noisy or missing) observations; (4) our approach can infer or recognize both goals and plans. In the rest of this section, we define the plan recognition problem and its solution. We also define how to compute the posterior probabilities of both goals and plans given unreliable observations.

**Definition 1 (Planning Problem)** A planning problem is a tuple P = (F, A, I, G), where F is a finite set of fluent symbols, A is a set of actions with preconditions PRE(a), add effects ADD(a), delete effects DEL(a), and non-negative action costs COST(a), I ⊆ F defines the initial state, and G ⊆ F defines the goal state.

A state s is a set of fluents that are true. An action a is executable in a state s if PRE(a) ⊆ s. We define the successor state as δ(a, s) = (s \ DEL(a)) ∪ ADD(a) for executable actions. The sequence of actions π = [a_0, ..., a_n] is executable in s if the state s′ = δ(a_n, δ(a_{n−1}, ..., δ(a_0, s))) is defined. Moreover, π is a solution to the planning problem P if it is executable from the initial state and G ⊆ δ(a_n, δ(a_{n−1}, ..., δ(a_0, I))). Furthermore, π is said to be optimal if it has minimal cost, where COST(π) = Σ_{a∈π} COST(a).

**Definition 2 (Plan Recognition Problem)** A plan recognition problem is a tuple R = (F, A, I, O, G, PROB), where (F, A, I) is the planning domain as defined above, O = [o_1, ..., o_m], with o_i ∈ F for i ∈ [1, m], is the sequence of observations, G is the set of possible goals G, with G ⊆ F, and PROB is the goal prior, a probability distribution over G.

In general, O can be expressed as a Linear Temporal Logic (LTL) formula or a Past LTL formula [Emerson, 1990; Sohrabi et al., 2011]; in other words, each observation can be a logical expression over the set of fluents. While it is possible to address this general type of observation, our focus in this paper is on observations that are sequenced, or totally ordered, and where each observation is an observable fluent. Observations over fluents are more general and flexible than observations over actions because, in practice, actions are often not directly observable, and instead some of their effects can be observed, for example, through sensors.
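To fix notation for the sketches that follow, Definitions 1 and 2 can be written as simple data structures. This is a minimal sketch; the class and field names are ours, not the paper's.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset        # preconditions PRE(a)
    add: frozenset        # add effects ADD(a)
    dele: frozenset       # delete effects DEL(a) ("del" is a Python keyword)
    cost: float = 1.0     # non-negative action cost COST(a)

@dataclass
class PlanningProblem:            # Definition 1: P = (F, A, I, G)
    fluents: frozenset
    actions: list
    init: frozenset
    goal: frozenset

@dataclass
class PlanRecognitionProblem:     # Definition 2: R = (F, A, I, O, G, PROB)
    fluents: frozenset
    actions: list
    init: frozenset
    observations: list            # O = [o1, ..., om], each oi in F
    goals: list                   # G: candidate goals, each a frozenset of F
    prob: dict                    # goal priors PROB

def successor(state, a):
    """delta(a, s) = (s \\ DEL(a)) | ADD(a), defined only if PRE(a) <= s."""
    assert a.pre <= state, "action not executable in this state"
    return (state - a.dele) | a.add
```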
We can also deal with observations over actions by assigning a unique fluent per action that is added only by that action; this is how we are able to compare directly with the prior work, which focused on observations over actions. Next, we define the satisfaction of an observation sequence O by an action sequence. This definition is a relaxation of the original satisfaction definition because it allows observations to be left unexplained, which is necessary in order to address noisy observations.

**Definition 3 (Satisfaction)** Let σ = s_0 s_1 s_2 ... s_{n+1} be an execution trace of an action sequence π = [a_0, ..., a_n] from the initial state, where δ(a_i, s_i) = s_{i+1} is defined for any i ∈ [0, n]. Given a planning domain (F, A, I), an observation sequence O = [o_1, ..., o_m] is said to be satisfied by the action sequence π and its execution trace σ if there is a non-decreasing function f that maps the observation indices j = 1, ..., m into the state indices i = 0, ..., n+1, such that for all 1 ≤ j ≤ m, either o_j ∈ s_{f(j)} (o_j is explained), or o_j ∉ s_{f(j)} (o_j is left unexplained).

The above definition takes the observation order into account through the non-decreasing mapping f. It also allows some observations to be left unexplained. In one extreme, all observations are explained by the sequence of states; in the other extreme, all observations are discarded, since it is possible, though very unlikely, that all observations are noisy. Hence, there may be many such non-decreasing functions, and the definition holds as long as at least one such mapping exists. Next, we formulate the definitions of noisy and missing observations.

**Definition 4 (Noisy/Missing)** Let T ⊆ F be the set of observable fluents, P = (F, A, I, G) be a planning problem, O be an observation sequence, and π be a plan for G that satisfies O. Then O is said to be noisy with respect to the given G and π if at least one of its observations, o ∈ O, is noisy; o is said to be noisy if o is never added to the state by any of the actions a_i ∈ π (i.e., o ∉ ADD(a_i) for all a_i ∈ π). Further, O is said to have missing observations with respect to the given G and π if at least one observation is missing; o is said to be missing if (1) o is observable, i.e., o ∈ T; (2) o ∉ O; and (3) o is added by at least one of the actions a_i ∈ π (i.e., o ∈ ADD(a_i)).

Hence, noisy observations are those that were observed but never added to the state for that particular goal and plan, while missing observations are those that were added but not observed. Going back to our example in Figure 1, o_3 can be a noisy observation for goal B, but not for goal A. Also, many observations are missing for goal B, but only two are missing for goal A. Next, we define the posterior plan probabilities P(π|O) and the posterior goal probabilities P(G|O):

P(π|O) = β P(O|π) P(π) = β P(O|π) P(π|G) P(G)    (1)

where β is a normalizing constant that depends on P(O) only, and P(G) is PROB(G). In this expression, we assume that the agent is pursuing only one goal and that P(π|G) is 0 for action sequences that are not plans for G. Using Bayes' rule, we also have:

P(G|O) = β P(O|G) P(G)    (2)

Assuming that the observations are independent of the goal G given π, and letting π range over all plans that achieve G and satisfy O, P(O|G) can be written as:

P(O|G) = Σ_π P(O|π) P(π|G)    (3)

Note that, unlike previous work, we do not assume that P(O|π) is 0 for plans that are inconsistent with the observations, because observations can be noisy. However, P(O|π) is small if O is noisy or has missing observations.
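Definition 4 translates into two direct counts. A minimal sketch, reusing the Action structure from the earlier sketch (the helper names are ours):

```python
def plan_added_fluents(plan):
    """Union of ADD(a) over all actions of the plan."""
    added = set()
    for a in plan:
        added |= a.add
    return added

def noisy_count(observations, plan):
    """N_{O,G}: observations in O never added by any action of the plan."""
    added = plan_added_fluents(plan)
    return sum(1 for o in observations if o not in added)

def missing_count(observations, plan, observable):
    """M_{O,G}: observable fluents (in T) added by the plan but absent from O."""
    added = plan_added_fluents(plan)
    return len((added & observable) - set(observations))
```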
Hence, we define P(G|O) as in Equations 2 and 3, but with π now ranging over the set of sampled plans, Π, that achieve G and satisfy O:

P(G|O) ≈ β P(G) Σ_{π∈Π} P(O|π) P(π|G)    (4)

There are two important questions yet to be discussed: (1) how to define P(O|π) and P(π|G), and (2) how to use planning to find the sampled set of plans. To address the first question, we provide an approximation that takes into account not only the original cost of the actions but also the number of missing and noisy observations according to Definition 4. Hence, we define a weighted factor, V_{O,G}(π), that combines all three of our objectives as follows:

V_{O,G}(π) = COST(π) + b1 · M_{O,G}(π) + b2 · N_{O,G}(π)    (5)

where π is a plan that meets the goal G and satisfies O, M_{O,G}(π) is the number of missing observations with respect to G and π, N_{O,G}(π) is the number of noisy observations in O, and b1 and b2 are the corresponding coefficients assigning weights to the different objectives. We then approximate the value of P(O|π) P(π|G) by taking into account the value of V_{O,G}(π′) for all other sampled plans π′, in addition to the value of V_{O,G}(π) for the plan π itself. We use a constant β0 to offset large sums if necessary, to get a better approximation:

P(O|π) P(π|G) ≈ 1 − V_{O,G}(π) / (β0 · Σ_{π′∈Π} V_{O,G}(π′))    (6)

With β0 = 1, if there are 4 sampled plans, each with the same V value, then Equation 6 yields 0.75 for each, and once normalized, P(π|O) = 0.25. On the other hand, if a plan π1 has a higher V value than another plan π2, then P(π1|O) < P(π2|O). Equation 6 allows us to compute P(π|O) and P(G|O) given a set of sampled plans. In the next section, we address the second question: how to use planning to find the sampled set of plans, and how to calculate V_{O,G}(π) from the cost of the plans in the transformed planning problem.
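Before moving on, Equations 5 and 6 can be made concrete. A minimal sketch, reusing the counting helpers above; the defaults for b1, b2, and β0 mirror the values used later in our experiments. The final line reproduces the sanity check from the text: four plans with equal V values each receive P(π|O) = 0.25.

```python
def V(plan, observations, observable, b1=2.0, b2=4.0):
    """Weighted objective of Equation 5."""
    cost = sum(a.cost for a in plan)
    return (cost + b1 * missing_count(observations, plan, observable)
                 + b2 * noisy_count(observations, plan))

def plan_posteriors(vs, beta0=1.0):
    """Equation 6, then normalization over the sampled plan set.

    vs: list of V_{O,G}(pi) values, one per sampled plan.
    """
    total = sum(vs)
    scores = [1.0 - v / (beta0 * total) for v in vs]  # approx. P(O|pi)P(pi|G)
    z = sum(scores)
    return [s / z for s in scores]                    # P(pi|O)

def goal_posterior(plan_probs, plan_goals, goal):
    """P(G|O): sum of posterior plan probabilities of plans achieving G."""
    return sum(p for p, g in zip(plan_probs, plan_goals) if g == goal)

print(plan_posteriors([7.0, 7.0, 7.0, 7.0]))  # -> [0.25, 0.25, 0.25, 0.25]
```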
## 4 Transformation to Planning

In this section, we provide a general transformation of the plan recognition problem into a planning problem with action costs. This transformation allows the use of AI planning, in particular of planners capable of finding a set of plans. To create the new planning problem, we augment the existing actions with a set of discard and explain actions for each observation o_i in the observation sequence O. These actions ensure that the observations are considered, while in some cases they may need to be discarded; in particular, noisy observations can be skipped using the discard actions. We also ensure that the order of the observations is preserved using special predicates, l_{o_i}, that are set to true when observation o_i is either explained or discarded. We add o_0 as a dummy initial observation and set l_{o_0} to true initially. Further, we ensure that at least one of the goals G ∈ G is satisfied by the computed plans. We do so by creating an action for each G ∈ G with a special add predicate we call "done"; the goal state is updated to include this done predicate. This ensures that we restrict the search space and consider only plans that meet at least one of the given goals when generating the set of sampled plans.

**Definition 5** For a plan recognition problem R = (F, A, I, O, G, PROB), we create a new planning problem with action costs P′ = (F′, A′, I′, G′) such that:

- F′ = F ∪ {done, l_{o_0}} ∪ {l_{o_i} | o_i ∈ O}, I′ = I ∪ {l_{o_0}}, and G′ = {done, l_{o_m}}, where o_m is the last observation;
- A′ = A_orig ∪ A_goal ∪ A_discard ∪ A_explain, where:
  - A_orig = {h_a | a ∈ A}, with COST(h_a) = COST(a) + b1 · |{o_i | o_i ∈ ADD(a), o_i ∈ T, o_i ∉ O}|, where T ⊆ F is the set of observable fluents;
  - A_goal = {h_G | G ∈ G}, with COST(h_G) = 0, PRE(h_G) = G, ADD(h_G) = {done};
  - A_discard = {h_{o_i} | o_i ∈ O}, with COST(h_{o_i}) = b2, PRE(h_{o_i}) = {¬o_i, l_{o_{i−1}}}, ADD(h_{o_i}) = {l_{o_i}}, DEL(h_{o_i}) = {l_{o_{i−1}}};
  - A_explain = {h_{o_i} | o_i ∈ O}, with COST(h_{o_i}) = 0, PRE(h_{o_i}) = {o_i, l_{o_{i−1}}}, ADD(h_{o_i}) = {l_{o_i}}, DEL(h_{o_i}) = {l_{o_{i−1}}}.

Note that the cost of a plan for P′ now encodes penalties for the missing and the unexplained observations. In fact, it is not hard to see that the cost of a plan for P′ is the same as V_{O,G}(π), according to Equation 5.

**Theorem 1 (Soundness/Correctness)** Given a plan recognition problem R = (F, A, I, O, G, PROB) and the corresponding planning problem P′ = (F′, A′, I′, G′) as defined above, for all G ∈ G: if π is a plan for R, then there exists a plan π′ for P′ such that π can be constructed straightforwardly from π′ by removing the extra actions (i.e., the discard, explain, and goal actions) and, more importantly, V_{O,G}(π) = COST(π′). Conversely, if π′ is a plan for P′, then there exists a plan π for R that can be constructed from π′ by removing the extra actions, such that V_{O,G}(π) = COST(π′).

The proof is based on the fact that the extra actions do not change any of the observable fluents while preserving the ordering among the observations. Moreover, the costs of the plans now map to V_{O,G}(π); hence, the posterior probabilities P(G|O) and P(π|O) can be computed from the costs of the plans in the transformed planning problem. Note that these probabilities differ depending on which method is used to generate the sample set of plans.
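The compilation of Definition 5 can be sketched in code, reusing the structures introduced earlier. This is our own illustration, not the authors' implementation; since the simple STRIPS-style Action above has no negative preconditions, the ¬o_i precondition of a discard action is represented here by a tagged tuple, purely for illustration.

```python
def transform(R, observable, b1=2.0, b2=4.0):
    """Compile R into a planning problem P' with action costs (Definition 5).

    Fluent l_i marks that observation o_i has been explained or discarded;
    ("not", o) stands in for the negative precondition on o.
    """
    O = R.observations
    L = [f"l_{i}" for i in range(len(O) + 1)]          # l_{o_0} ... l_{o_m}
    actions = []
    for a in R.actions:                                # A_orig: penalize misses
        miss = {o for o in a.add if o in observable and o not in O}
        actions.append(Action(a.name, a.pre, a.add, a.dele,
                              a.cost + b1 * len(miss)))
    for j, G in enumerate(R.goals):                    # A_goal: mark a goal met
        actions.append(Action(f"goal_{j}", frozenset(G),
                              frozenset({"done"}), frozenset(), 0.0))
    for i, o in enumerate(O, start=1):                 # A_explain / A_discard
        actions.append(Action(f"explain_{i}", frozenset({o, L[i - 1]}),
                              frozenset({L[i]}), frozenset({L[i - 1]}), 0.0))
        actions.append(Action(f"discard_{i}", frozenset({("not", o), L[i - 1]}),
                              frozenset({L[i]}), frozenset({L[i - 1]}), b2))
    return PlanningProblem(R.fluents | set(L) | {"done"}, actions,
                           R.init | {L[0]}, frozenset({"done", L[-1]}))
```

A plan for the compiled problem interleaves the original actions with one explain or discard action per observation, and its cost equals V_{O,G}(π), as Theorem 1 states.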
## 5 Computation

We present two methods for finding a sampled set of plans. The first method is based on top-k planning, and the second is based on finding a set of diverse plans.

### 5.1 Computation via Top-k Planning

We define top-k planning following [Riabov et al., 2014]. The top-k planning problem is a tuple (P, k), where P is a planning problem with action costs as defined in Definition 1, and k is the number of plans to find. Let n be the number of valid plans for the planning problem P. The set of plans Π = {π_1, ..., π_m}, where m = k if k ≤ n and m = n otherwise, is the solution to the top-k planning problem (P, k) if and only if each π_i ∈ Π is a plan for P, and there do not exist a plan π′ for P with π′ ∉ Π and a plan π_i ∈ Π such that COST(π′) < COST(π_i). Hence, Π may contain just one optimal plan (i.e., k = 1), all optimal plans, or all optimal plans and some suboptimal plans if k is large enough. There are several techniques that can be used to compute the top-k plans, as described in [Riabov et al., 2014]. In this paper, we use the top-k planner TK*, which is based on a k shortest paths algorithm called K* [Aljazzar and Leue, 2011], because TK* has been shown to outperform other techniques for top-k planning. The k shortest paths problem is an extension of the shortest path problem where, in addition to one shortest path, a set of paths representing the k shortest paths is found [Hoffman and Pavley, 1959].

The K* algorithm is an improved variant of Eppstein's k shortest paths algorithm [Eppstein, 1998] in that it does not require the complete graph of states and actions to be available in memory. TK* applies K* to search in the state space, with dynamic grounding of actions. For more details on the TK* planner, see [Riabov et al., 2014]. Note that while the sample set of plans we are generating is not required to be of high quality or low cost, the top-k planning approach is guaranteed to find such a set; in turn, the solution set will optimize for the value of V_{O,G}(π).

### 5.2 Computation via Diverse Planning

In diverse planning, the objective is to find a set of m plans that are at least distance d away from each other. The distance between plans can be computed by considering the plans as sets of actions, sequences of states, or causal links. The solution to the diverse planning problem (m, d) is a set of plans Π such that |Π| = m and min_{π,π′∈Π} δ(π, π′) ≥ d, where δ(π, π′) measures the distance between plans. There are several techniques for computing diverse plans, and several diverse planners exist [Srivastava et al., 2007; Coman and Muñoz-Avila, 2011; Nguyen et al., 2012; Bryce, 2014]. In this paper, we use LPG-d [Nguyen et al., 2012], an extension of the local-search-based planner LPG [Gerevini et al., 2003], for two reasons: (1) we were able to access and run it successfully, and (2) it showed relatively better performance than the other diverse planners [Roberts et al., 2014]. We use three settings of LPG-d, (10, 0.75), (50, 0.5), and (100, 0.75), and report on our experiments in the next section.
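The (m, d) condition can be made concrete with a plan distance. The action-set distance below is one common choice in the diverse-planning literature; this is our illustration, and LPG-d's exact metric may differ.

```python
def action_set_distance(p1, p2):
    """Fraction of actions not shared by two plans (0 = identical, 1 = disjoint)."""
    s1, s2 = {a.name for a in p1}, {a.name for a in p2}
    union = s1 | s2
    return 1.0 - len(s1 & s2) / len(union) if union else 0.0

def is_diverse_solution(plans, m, d):
    """Check the (m, d) condition: m plans, pairwise distance at least d."""
    return (len(plans) == m and
            all(action_set_distance(p, q) >= d
                for i, p in enumerate(plans) for q in plans[i + 1:]))
```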
## 6 Experimental Evaluation

In this section, we evaluate our two proposed plan recognition approaches, LPG-d [Nguyen et al., 2012] for diverse planning and TK* [Riabov et al., 2014] for top-k planning, against previous work [Ramírez and Geffner, 2010]. We configured the previous work to use the LM-Cut planner [Pommerening and Helmert, 2012].¹ We selected the LM-Cut planner because it was shown to perform well in the planning competition and it performs better than the original planners used in the configuration of the previous approach. Note that using a more efficient optimal planner, or even a suboptimal planner, would not improve the presented results for the previous approach, because LM-Cut was able to solve all problems optimally within the time limit. We used a timeout of 30 minutes and ran all our experiments on a dual 16-core 2.70 GHz Intel(R) Xeon(R) E5-2680 processor with 256 GB RAM. Note that while we obtained the results by running the top-k planner and the diverse planner once for each problem, LM-Cut was run 2|G| times.

¹We used the most recent download of the Fast Downward planning system (http://www.fast-downward.org/) with the alias seq-opt-lmcut to obtain LM-Cut.

| Domain | %O | \|E\| | S (LM-Cut) | P (LM-Cut) | Q (LM-Cut) | S (TK*) | P (TK*) | Q (TK*) | S (LPG-d) | P (LPG-d) | Q (LPG-d) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Intrusion (\|G\|=20) | 25 | 0 | 1.2/1.1 | 0.27 | 40/2 | 0.7/0 | 0.21 | 21/0 | 1/3.2 | 0.17 | 26/26 |
| | 50 | 0 | 1/0.5 | 0.37 | 40/0 | 0.5/0 | 0.25 | 25/0 | 1/4.3 | 0.25 | 45/30 |
| | 75 | 0 | 1/0.1 | 0.39 | 40/0 | 0.2/0 | 0.15 | 15/0 | 1/4.2 | 0.30 | 62/23 |
| | 100 | 0 | 1/0 | 0.39 | 40/0 | 0.0/0 | 0.02 | 2/0 | 1/3.8 | 0.31 | 58/23 |
| | 25 | 2 | 1.9/1.2 | 0.18 | 23/15 | 0.6/0 | 0.17 | 17/0 | 1/4.0 | 0.15 | 26/32 |
| | 50 | 2 | 2.8/0.4 | 0.27 | 30/0 | 0.3/0 | 0.13 | 13/0 | 1/4.5 | 0.20 | 32/45 |
| | 75 | 2 | 5.3/0.2 | 0.29 | 30/0 | 0.1/0 | 0.02 | 2/0 | 1/3.9 | 0.26 | 45/28 |
| | 100 | 2 | 8.2/0 | 0.24 | 25/0 | 0.1/0 | 0.02 | 2/0 | 1/3.5 | 0.29 | 51/23 |
| Campus (\|G\|=2) | 25 | 0 | 1.2/0.8 | 0.72 | 95/5 | 1/0.1 | 0.65 | 65/7 | 1/0.8 | 0.46 | 42/44 |
| | 50 | 0 | 1.1/0.5 | 0.81 | 95/5 | 1/0.3 | 0.87 | 91/7 | 1/0.8 | 0.48 | 53/35 |
| | 75 | 0 | 1/0.5 | 0.88 | 100/0 | 1/0.1 | 0.98 | 100/0 | 1/0.8 | 0.53 | 53/37 |
| | 100 | 0 | 1/0.5 | 0.91 | 100/0 | 1/0.1 | 0.99 | 100/0 | 1/0.8 | 0.56 | 56/40 |
| | 25 | 2 | 1.3/0.6 | 0.62 | 81/19 | 1/0.1 | 0.59 | 58/7 | 1/0.8 | 0.50 | 56/30 |
| | 50 | 2 | 1.2/0.4 | 0.75 | 91/9 | 1/0.3 | 0.75 | 74/16 | 1/0.8 | 0.50 | 56/33 |
| | 75 | 2 | 1.2/0.4 | 0.81 | 100/0 | 1/0.2 | 0.92 | 95/2 | 1/0.8 | 0.54 | 58/40 |
| | 100 | 2 | 1.2/0.3 | 0.79 | 98/2 | 1/0.3 | 0.89 | 93/7 | 1/0.9 | 0.52 | 53/40 |
| Kitchen (\|G\|=3) | 25 | 0 | 1.1/0.4 | 0.67 | 57/43 | 1/0 | 0.99 | 100/0 | 1/0.7 | 0.79 | 86/14 |
| | 50 | 0 | 1/0.4 | 0.78 | 71/29 | 1/0.1 | 0.84 | 86/0 | 1/0 | 0.71 | 71/0 |
| | 75 | 0 | 1/0.3 | 0.86 | 86/14 | 1/0.1 | 0.91 | 86/14 | 1/0.4 | 0.69 | 71/14 |
| | 100 | 0 | 1/0.1 | 0.88 | 86/14 | 0.9/0.1 | 0.77 | 71/14 | 1/0 | 0.85 | 86/0 |
| | 25 | 2 | 2.4/0 | 0.14 | 14/0 | 0.4/0 | 0.14 | 14/0 | 1/0.3 | 0.59 | 57/14 |
| | 50 | 2 | 2.4/0 | 0.00 | 0/0 | 0.6/0 | 0.57 | 57/0 | 1/0.1 | 0.48 | 43/14 |
| | 75 | 2 | 3.0/0 | 0.00 | 0/0 | 0.3/0 | 0.29 | 29/0 | 1/0 | 0.71 | 71/0 |
| | 100 | 2 | 2.7/0 | 0.14 | 14/0 | 0.6/0 | 0.43 | 43/0 | 1/0.1 | 0.83 | 86/0 |
| IPC-Grid (\|G\|=5) | 25 | 0 | 1.2/0.4 | 0.82 | 100/0 | 1/0.4 | 0.85 | 90/10 | 1/1.6 | 0.51 | 67/29 |
| | 50 | 0 | 1.4/0.1 | 0.82 | 86/0 | 1/0.1 | 0.93 | 95/0 | 1/2.1 | 0.38 | 48/48 |
| | 75 | 0 | 1.6/0.1 | 0.85 | 86/0 | 1/0 | 1.00 | 100/0 | 1/2.2 | 0.33 | 38/52 |
| | 100 | 0 | 1.8/0 | 0.80 | 81/0 | 1/0 | 1.00 | 100/0 | 1/2.2 | 0.36 | 38/57 |
| | 25 | 2 | 4.5/0.1 | 0.05 | 5/5 | 1/1.2 | 0.32 | 33/24 | 1/2.2 | 0.25 | 19/67 |
| | 50 | 2 | 4.8/0 | 0.00 | 0/0 | 1/0.8 | 0.38 | 43/14 | 1/2.6 | 0.31 | 43/43 |
| | 75 | 2 | 5.0/0 | 0.00 | 0/0 | 1/0.6 | 0.18 | 14/19 | 1/2.2 | 0.38 | 48/33 |
| | 100 | 2 | 5.0/0 | 0.00 | 0/0 | 1/1.0 | 0.34 | 38/19 | 1/2.2 | 0.30 | 33/52 |

Table 1: Comparison of our two proposed plan recognition approaches for recognizing a goal, top-k planning using TK* (top-1000) and diverse planning using LPG-d (50, 0.5), against previous work by Ramírez and Geffner [2010] using LM-Cut. The columns show the percentage of observations selected (%O), the number of extra observations (|E|), the average number of most/less likely goals (S), the posterior probability of the goal given the observations (P), and the average percentage of problems where the ground truth goal is among the most/less likely goals (Q). The overall value of Q (the sum of the most and less likely percentages) indicates the coverage for that method.

### 6.1 Planning Problems

In our experimental evaluation, we used four domains and over 1000 problems.
We used the Kitchen, Intrusion, Campus, and IPC-Grid domains from the previous approach [Ramírez and Geffner, 2010], but modified the Kitchen domain to disallow the extra take actions that do not contribute to the goal, in order to evaluate the approaches in the case of noisy observations that need to be discarded. We did not run the other domains reported by the previous approach, such as Blocks Words, because of their use of equality, which is not supported by the top-k planner. To construct the planning problems, for each domain we use a fixed possible goal set G, and for each goal G ∈ G we generate a set of optimal and suboptimal plans. From these plans, we select the observable actions in order to construct O, keeping track of the goal used in the planning problem (i.e., the ground truth goal). To evaluate how well the approaches address missing observations, we create several problems that do not have the full observation sequence (i.e., some observations are missing). We do so by randomly selecting 25%, 50%, and 75% of the observations in O; the 100% case indicates that the full observation sequence O is given. Furthermore, to evaluate how these approaches address noise, we randomly add extra observations to the original observation sequence. While some domains, such as IPC-Grid, can still explain these extra observations using extra move actions, a plan for G that fully explains O may not exist in other domains. In the special case that no plan is found for G and O, we assign a high cost to that plan, which can lead to P(G|O) ≈ 0 for that goal under the previous approach.

For the LPG-d planner, we used three settings of (m, d), (10, 0.75), (50, 0.5), and (100, 0.75), but report only on the (50, 0.5) case: 50 plans that are at least 0.5 distance away from each other. This is because this setting performed better than the other two, both with respect to recognizing a goal and recognizing a plan. For the top-k planner, we set k = 1000 in order to obtain a large set of high-quality plans that are also diverse. Lower values of k might help improve the posterior probabilities but would likely result in lower coverage. For the coefficients, we set b1 = 2 and b2 = 4 in the experiments, meaning that we assign a higher penalty to the unexplained (noisy) observations and a lower penalty to the missing observations. We also experimented with different values for these coefficients, but the results were not significantly different.
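To make the benchmark construction above concrete, here is a sketch of how an observation sequence can be degraded. The helper name and the choice of insertion points for the noise are ours; the paper does not specify the construction beyond the percentages and the number of extra observations.

```python
import random

def degrade(observations, keep_ratio, extra, observable, seed=None):
    """Keep a random keep_ratio of O (order preserved) and splice in
    `extra` random observable fluents as noisy observations."""
    rng = random.Random(seed)
    if not observations:
        return []
    k = max(1, round(keep_ratio * len(observations)))
    kept_idx = sorted(rng.sample(range(len(observations)), k))
    degraded = [observations[i] for i in kept_idx]
    pool = sorted(observable)              # deterministic order for sampling
    for _ in range(extra):                 # insert noise at random positions
        degraded.insert(rng.randrange(len(degraded) + 1), rng.choice(pool))
    return degraded
```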
### 6.2 Goal Recognition

Table 1 summarizes the results of evaluating our two proposed plan recognition approaches for recognizing a goal against the previous work [Ramírez and Geffner, 2010]. The number of extra observations, |E|, indicates the amount of noise introduced into the problem; we added two extra observations, resulting in about 12% noise on average relative to the total number of observations, which was close to 16 on average. To evaluate the coverage and accuracy of the different approaches, in addition to the actual posterior probability of the goal given the observations, P(G|O), shown in the P column, we measure the average number of most likely and less likely goals found by each method, shown in the S column separated by /, and the average percentage of problems where the ground truth goal is among the most and less likely goals, also separated by /, shown in the Q column. The overall value of Q, the sum of the most and less likely percentages, indicates the goal recognition coverage for that method.

The most likely goals are chosen relative to that particular approach (i.e., the goals with the highest posterior probability), and the less likely goals are those goals with posterior probability greater than 0.05. Hence, a value of 1/3.2 in the S column indicates that, on average, one of the goals in G was found to be most likely, while 3.2 of the goals were found to be less likely. Likewise, a value of 56/40 in the Q column indicates that, on average, the ground truth goal was among the goals found to be most likely 56% of the time and among the goals found to be less likely 40% of the time, and that this approach had, on average, 96% goal recognition coverage for this problem.

The results in Table 1 show that our approaches, diverse planning and top-k planning, helped improve the percentage of finding the ground truth goal, i.e., the coverage, in many domains when observations are unreliable. In particular, in the Kitchen and IPC-Grid domains, they helped improve the percentage of finding the ground truth goal, even with high probability, when the observations are unreliable. Note that there are a few cases, such as in the Intrusion domain, where TK* performed worse. This is mainly because TK* ran out of time and was not able to find the top-k plans; we expect better performance in domains that have smaller search spaces and/or with a more efficient top-k planner. Comparing the results of the problems where %O = 25 versus %O = 100 (i.e., no missing observations), LPG-d and TK* performed better in recognizing the goal: LPG-d's coverage increased from an average of 79% to 87%, and TK*'s coverage increased from an average of 56% to 61%. However, goal recognition coverage went down from an average of 63% to 57% for the previous approach on our problem sets. This can be caused by the generally poor performance of the previous approach on long observation sequences, since the observations may not all come from an optimal plan. On average, over all our problems, LPG-d had 83% coverage, TK* had 60% coverage, and LM-Cut had 59% coverage. Moreover, when observations are unreliable (problems where |E| = 2 or %O is not 100), the coverage stayed the same for LPG-d and TK*, but went down to 55% for LM-Cut. When observations are noisy (problems where |E| = 2), the coverage went down to 79% for LPG-d, to 46% for TK*, and to 35% for LM-Cut. This indicates that the previous approach performed worse when observations are unreliable, and even more so when observations are noisy. Also note that the posterior probability of the goals, when observations are noisy, is on average 0.26 for LM-Cut versus 0.42 for LPG-d. The set of most likely goals is largest for LM-Cut and smallest for the top-k approach, indicating that the top-k approach has higher precision in detecting the ground truth goal than the previous approach.

### 6.3 Plan Recognition

We evaluate our two proposed plan recognition methods for recognizing a plan, top-k planning and diverse planning; we did not compare with the previous work [Ramírez and Geffner, 2010], since its focus was on goal recognition. Also, in our benchmark problems, recognizing the full plan is often very challenging because of the combinatorial nature of these domains (i.e., there is often no particular order among the actions). Hence, instead of recognizing the full plan, we take a random subset of the actions in the full plan and evaluate our techniques on recognizing this partial plan.
The results we obtained show that while LPG-d performed better in some domains, TK* performed better on average over all domains with respect to finding the ground truth partial plan with high probability: TK* found the partial plan with high probability 43% of the time, while LPG-d did so 25% of the time. This may be because the top-k planning approach focuses on finding a set of high-quality plans, whereas LPG-d finds diverse plans instead, which is helpful in some domains but not all. However, both approaches were able to find the ground truth partial plan only about 50% of the time on average, as this problem was much more difficult for the two approaches. Also, the posterior probability of the partial plans for the top-k planning approach is on average 0.48, compared to 0.41 for the LPG-d approach. In summary, while the previous work as it stands is not able to recognize plans, our experiments show that our approach is able to recognize ground truth (partial) plans; moreover, the top-k planning approach performed better in plan recognition coverage and found a higher posterior probability for the partial plan.

## 7 Discussion and Summary

There exists a body of work on the plan recognition problem, with several different approaches including the use of AI planning and other techniques such as SAT [Zhuo et al., 2012]. While most existing work assumes that the observations are reliable, recent work by Vattam and Aha [2015] addresses missing and noisy observations. However, they require plan libraries and use case-based plan recognition algorithms, whereas we use AI planning instead. Generating a plan set rather than just one plan has gained popularity in recent years in the context of diverse planning. There exist several diverse planners, including LPG-d, which we use in this paper. Using a more efficient diverse planner, especially one that accounts for cost, such as the recent approach by Vadlamudi and Kambhampati [2016], could improve the performance of our approach.

In this paper, we proposed to extend previous work to address observations over fluents, better address unreliable observations, and recognize plans in addition to goals. To this end, we provided a relaxation of the plan-recognition-as-planning formulation that allows unreliable observations. We also defined the posterior probabilities of goals and plans. We evaluated our solution using two approaches, diverse planning and top-k planning. Our experiments showed that our approach improves goal recognition in most domains when observations are unreliable. We also reported on the extension of our approach for recognizing plans. In the future, we would like to tackle the plan recognition problem where actions can have durations; this is a difficult problem, especially in the presence of concurrency.

## References

[Aljazzar and Leue, 2011] Husain Aljazzar and Stefan Leue. K*: A heuristic search algorithm for finding the k shortest paths. Artificial Intelligence, 175(18):2129–2154, 2011.

[Bryce, 2014] Daniel Bryce. Landmark-based plan distance measures for diverse planning. In Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS), pages 56–64, 2014.

[Coman and Muñoz-Avila, 2011] Alexandra Coman and Héctor Muñoz-Avila. Generating diverse plans using quantitative and qualitative plan distance metrics. In Proceedings of the 25th National Conference on Artificial Intelligence (AAAI), pages 946–951, 2011.

[Emerson, 1990] E. Allen Emerson. Temporal and modal logic. Handbook of Theoretical Computer Science: Formal Models and Semantics, B:995–1072, 1990.
[Eppstein, 1998] David Eppstein. Finding the k shortest paths. SIAM Journal on Computing, 28(2):652–673, 1998.

[Gerevini et al., 2003] Alfonso Gerevini, Alessandro Saetti, and Ivan Serina. Planning through stochastic local search and temporal action graphs in LPG. Journal of Artificial Intelligence Research, 20:239–290, 2003.

[Hoffman and Pavley, 1959] Walter Hoffman and Richard Pavley. A method for the solution of the nth best path problem. Journal of the ACM, 6(4):506–514, 1959.

[Nguyen et al., 2012] Tuan Anh Nguyen, Minh Binh Do, Alfonso Gerevini, Ivan Serina, Biplav Srivastava, and Subbarao Kambhampati. Generating diverse plans to handle unknown and partially known user preferences. Artificial Intelligence, 190:1–31, 2012.

[Pentney et al., 2006] William Pentney, Ana-Maria Popescu, Shiaokai Wang, Henry A. Kautz, and Matthai Philipose. Sensor-based understanding of daily life via large-scale use of common sense. In Proceedings of the 21st National Conference on Artificial Intelligence (AAAI), pages 906–912, 2006.

[Pommerening and Helmert, 2012] Florian Pommerening and Malte Helmert. Optimal planning for delete-free tasks with incremental LM-cut. In Proceedings of the 22nd International Conference on Automated Planning and Scheduling (ICAPS), 2012.

[Ramírez and Geffner, 2009] Miquel Ramírez and Héctor Geffner. Plan recognition as planning. In Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI), pages 1778–1783, 2009.

[Ramírez and Geffner, 2010] Miquel Ramírez and Héctor Geffner. Probabilistic plan recognition using off-the-shelf classical planners. In Proceedings of the 24th National Conference on Artificial Intelligence (AAAI), 2010.

[Riabov et al., 2014] Anton Riabov, Shirin Sohrabi, and Octavian Udrea. New algorithms for the top-k planning problem. In Proceedings of the Scheduling and Planning Applications Workshop (SPARK) at the 24th International Conference on Automated Planning and Scheduling (ICAPS), pages 10–16, 2014.

[Riabov et al., 2015] Anton V. Riabov, Shirin Sohrabi, Daby M. Sow, Deepak S. Turaga, Octavian Udrea, and Long H. Vu. Planning-based reasoning for automated large-scale data analysis. In Proceedings of the 25th International Conference on Automated Planning and Scheduling (ICAPS), pages 282–290, 2015.

[Roberts et al., 2014] Mark Roberts, Adele E. Howe, and Indrajit Ray. Evaluating diversity in classical planning. In Proceedings of the 24th International Conference on Automated Planning and Scheduling (ICAPS), pages 253–261, 2014.

[Sohrabi et al., 2011] Shirin Sohrabi, Jorge A. Baier, and Sheila A. McIlraith. Preferred explanations: Theory and generation via planning. In Proceedings of the 25th National Conference on Artificial Intelligence (AAAI), pages 261–267, 2011.

[Sohrabi et al., 2013] Shirin Sohrabi, Octavian Udrea, and Anton Riabov. Hypothesis exploration for malware detection using planning. In Proceedings of the 27th National Conference on Artificial Intelligence (AAAI), pages 883–889, 2013.

[Srivastava et al., 2007] Biplav Srivastava, Tuan Anh Nguyen, Alfonso Gerevini, Subbarao Kambhampati, Minh Binh Do, and Ivan Serina. Domain independent approaches for finding diverse plans. In Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), pages 2016–2022, 2007.

[Sukthankar et al., 2014] Gita Sukthankar, Christopher Geib, Hung Hai Bui, David V. Pynadath, and Robert P. Goldman. Plan, Activity, and Intent Recognition. Morgan Kaufmann, 2014.
[Synnaeve and Bessière, 2011] Gabriel Synnaeve and Pierre Bessière. A Bayesian model for plan recognition in RTS games applied to StarCraft. In Proceedings of the 7th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), 2011.

[Thorsley et al., 2008] David Thorsley, Tae-Sic Yoo, and Humberto E. Garcia. Diagnosability of stochastic discrete-event systems under unreliable observations. In Proceedings of the American Control Conference, pages 1158–1165, 2008.

[Vadlamudi and Kambhampati, 2016] Satya Gautam Vadlamudi and Subbarao Kambhampati. A combinatorial search perspective on diverse solution generation. In Proceedings of the 30th National Conference on Artificial Intelligence (AAAI), 2016. To appear.

[Vattam and Aha, 2015] Swaroop S. Vattam and David W. Aha. Case-based plan recognition under imperfect observability. In Proceedings of the 23rd International Conference on Case-Based Reasoning (ICCBR), pages 381–395, 2015.

[Zhuo et al., 2012] Hankz Hankui Zhuo, Qiang Yang, and Subbarao Kambhampati. Action-model based multi-agent plan recognition. In Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS), pages 377–385, 2012.