# A Demonstration of Interactive Task Learning

James Kirk, Aaron Mininger, and John Laird
Division of Computer Science and Engineering
University of Michigan, Ann Arbor, Michigan, USA
{jrkirk, mininger, laird}@umich.edu

We will demonstrate a tabletop robotic agent that learns new tasks through interactive natural language instruction. The tasks to be demonstrated are simple puzzles and games, such as Tower of Hanoi, Eight Puzzle, Tic-Tac-Toe, Three Men's Morris, and the Frogs and Toads puzzle. We will also include a live, interactive simulation of a mobile robot that learns new tasks using the same system.

Humans are not limited to a fixed set of innate or preprogrammed tasks. We quickly learn novel tasks through instruction. We learn to play new games and puzzles in just a few minutes, and we can learn navigation and manipulation tasks such as delivering or fetching a package. We are headed toward a future populated with autonomous systems that have the cognitive and physical capabilities to perform a wide variety of tasks; today, however, we rely on either hand programming or extensive training to teach these systems new tasks. Consider a future in which it is possible to teach agents new tasks directly, in real time, using language. This would greatly increase the ability of non-experts to extend and customize the computational systems they interact with every day.

Our demonstration is centered on the Rosie system [Mohan et al., 2012; Kirk and Laird, 2014], developed in Soar [Laird, 2012], which is embodied in both a tabletop robot and a mobile robot. On the tabletop robot, Rosie learns simple puzzles and games, as well as object manipulation tasks that mirror simple kitchen-like activities. On the mobile robot, it learns tasks that involve simple navigation, manipulation, and communication. The agent can learn tasks from scratch, including termination/goal conditions, legal actions, and task-specific concepts that are grounded in its perceptual and functional primitives. It also transfers learned knowledge to other tasks; for example, the concept of three-in-a-row can be learned for Tic-Tac-Toe and then reused in Three Men's Morris. If a new concept is used, such as when the agent is taught an action for Othello, "If the locations between a clear location and a captured location are occupied, then you can place a piece on the clear location," it will request definitions of all undefined words, such as "clear," "captured," and "occupied." The instructor can then provide a definition, such as "If a location is below an enemy piece then the location is occupied."
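To make this concrete, here is a minimal sketch of how taught word definitions might be stored as composable predicates over a symbolic state. The fact-tuple encoding, the function names, and the companion definition of "clear" are our assumptions for illustration, not Rosie's actual Soar representations.

```python
# A state is a set of ground facts such as ("below", "loc3", "piece7")
# or ("enemy", "piece7"); this encoding is assumed for illustration.
definitions = {}

def define(word, predicate):
    """Store the meaning of a newly taught word."""
    definitions[word] = predicate

def holds(word, state, *args):
    """Evaluate a taught word, recursing through learned definitions."""
    return definitions[word](state, *args)

# Instructor: "If a location is below an enemy piece then the
# location is occupied."
define("occupied", lambda state, loc: any(
    fact[0] == "below" and fact[1] == loc and ("enemy", fact[2]) in state
    for fact in state))

# A hypothetical companion definition: a clear location is not occupied.
define("clear", lambda state, loc: not holds("occupied", state, loc))

state = {("below", "loc3", "piece7"), ("enemy", "piece7")}
print(holds("occupied", state, "loc3"))  # True
print(holds("clear", state, "loc3"))     # False
```

Because each new word is defined in terms of previously grounded ones, definitions compose naturally into the hierarchical concepts described below.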
We have made the following extensions to Rosie:

1. Rosie has a new parser, implemented in Soar, that allows the instructor to use a restricted form of natural language. The parser covers simple grammatical constructions that are sufficient to express natural instructions for all the described tasks (see the example instructions below). The parser is completely integrated in Rosie and uses syntactic, semantic, and pragmatic knowledge to generate a semantic description of linguistic input that is used by the task-learning system.

2. Rosie is now also embodied in a real-world mobile robot. The original tabletop system had complete perception of all objects. In the mobile domain, Rosie must understand and reason about commands that refer to objects outside of its immediate perception, including unknown objects, such as when it is asked to fetch an object from another room.

3. Rosie can learn many more kinds of concepts and can compose them to form complex hierarchical concepts. These include concepts that require internal computation, such as counting.

4. Rosie can learn goal states through visual demonstrations in addition to natural language descriptions. Rather than explicitly describing the conditions of the goal, which can be tedious, the instructor can demonstrate it: "This is the goal state." After Rosie constructs a hypothesis, "I think the goal state is that...", the instructor can provide refinements through subsequent interactions.

5. Finally, revisions and extensions have made Rosie much faster. This includes many optimizations, but most importantly the use of Soar's chunking mechanism to dynamically compile learned task knowledge into procedural rules, eliminating costly interpretation of declarative structures. As a result, the efficiency of Rosie's task reasoning using learned knowledge (such as when it is searching for a solution to a problem) is comparable to that of hand-coded knowledge.

Many of the games and puzzles our agent can learn are obvious isomorphisms of the traditional versions. These different versions are often necessary because of limits in perceptual and motor capabilities. For example, the arm is not dexterous enough to manipulate disks and place them on pegs. Therefore, for the Tower of Hanoi puzzle, we use three blocks of different sizes and three locations instead of pegs and disks. Below are the instructions for an isomorphism of the Tower of Hanoi puzzle.

    The name of the puzzle is tower-of-hanoi.
    You can move a clear block onto a clear object that is larger than the block.
    The goal is that a large block is on a blue location and a medium block is on the large block and a small block is on the medium block.

Following these instructions, Rosie builds the internal structures necessary to interpret the described conditions, learns procedural knowledge to efficiently compute the conditions and test for the goal, and then internally searches for a solution (a schematic sketch of such a goal test and search appears below). Once a solution is found (in 1 second), Rosie executes the plan using the robot arm.

Rosie also learns goal-oriented tasks for a mobile robot in a large, multi-room environment. It drives around the environment, interacts with simple objects, and communicates with people. It learns new tasks such as "Deliver the package to the main office," "Tell Alice a message," and "Fetch a stapler." To teach a task, the instructor describes the goal ("The package is in the main office") and, if necessary, the actions that the agent should execute to achieve it. Once the goal is achieved, the agent performs a retrospective causal analysis of its actions and learns a policy for achieving the goal.
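As a rough illustration of the flavor of such a retrospective analysis, the sketch below walks a recorded trace backward from the goal and keeps only the actions that causally contributed to it. The trace format, action names, and fact encoding are hypothetical; this is not Rosie's actual algorithm.

```python
def extract_policy(trace, goal_facts):
    # trace: (action, preconditions, effects) triples in execution order.
    needed = set(goal_facts)        # facts that still require a cause
    rules = []
    for action, pre, eff in reversed(trace):
        produced = needed & set(eff)
        if produced:                # this action helped achieve the goal
            rules.append((frozenset(pre), action))
            needed -= produced      # these facts are now explained...
            needed |= set(pre)      # ...but the preconditions must hold first
    rules.reverse()
    return rules                    # condition -> action rules, in order

# Hypothetical trace from teaching "Fetch a stapler":
trace = [
    ("drive-to(office)", {"at(robot,hall)"}, {"at(robot,office)"}),
    ("wave-at(alice)", {"at(robot,office)"}, {"greeted(alice)"}),
    ("pick-up(stapler)", {"at(robot,office)", "at(stapler,office)"},
     {"holding(stapler)"}),
]
print(extract_policy(trace, {"holding(stapler)"}))
```

In this toy trace, the greeting action is pruned because nothing downstream depends on its effect, leaving a two-rule policy for fetching.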
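Returning to the Tower of Hanoi instructions above, the following sketch shows what the learned goal test and an internal search for a plan could look like once grounded. The representation of the board as a block-to-support mapping, and the location names other than "blue," are our assumptions; breadth-first search stands in for whatever search the agent actually performs.

```python
from collections import deque

# Locations are treated as larger than any block, so anything fits on them.
SIZE = {"small": 1, "medium": 2, "large": 3, "red": 4, "green": 4, "blue": 4}
BLOCKS = ("small", "medium", "large")
LOCATIONS = ("red", "green", "blue")

def clear(state, obj):
    """An object is clear if nothing rests on it."""
    return all(support != obj for support in state.values())

def legal_moves(state):
    # "You can move a clear block onto a clear object that is
    # larger than the block."
    for block in BLOCKS:
        if clear(state, block):
            for dest in BLOCKS + LOCATIONS:
                if dest != block and clear(state, dest) and SIZE[dest] > SIZE[block]:
                    yield block, dest

def is_goal(state):
    # "The goal is that a large block is on a blue location and a medium
    # block is on the large block and a small block is on the medium block."
    return (state["large"] == "blue" and state["medium"] == "large"
            and state["small"] == "medium")

def solve(start):
    """Breadth-first search over board states for a shortest plan."""
    frontier = deque([(start, [])])
    seen = {tuple(sorted(start.items()))}
    while frontier:
        state, plan = frontier.popleft()
        if is_goal(state):
            return plan
        for block, dest in legal_moves(state):
            nxt = {**state, block: dest}
            key = tuple(sorted(nxt.items()))
            if key not in seen:
                seen.add(key)
                frontier.append((nxt, plan + [(block, dest)]))

# All three blocks start stacked on the red location.
print(solve({"large": "red", "medium": "large", "small": "medium"}))
```

Running this finds the classic seven-move solution, mirroring the one-second internal search described above.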
## 1 Demonstration Details

We will demonstrate our tabletop robot, which consists of a small robot arm and a Kinect sensor (see Figure 1). We will give live demonstrations of Rosie learning and solving simple puzzles, such as Tower of Hanoi and the Frogs and Toads puzzle. We will also show Rosie learning simple games, such as Tic-Tac-Toe, that can then be played with spectators. It would not be feasible for us to demonstrate task learning on a real mobile robot, but we can teach the same kinds of tasks with a simulated robot and environment. We will show live demonstrations of teaching the agent tasks involving navigation, manipulation, and communication, such as delivering an object to a person, fetching an object, telling a person a message, and guiding a person to a desired location.

Figure 1: The Tabletop Version of Rosie.

## 2 Related Work

Although there has been a variety of research on task learning, existing approaches often depend on learning from many examples or on off-line processing. For example, Kaiser [2012] and Barbu et al. [2010] describe systems that extract the relevant goal-state predicates for games by viewing many iterations of game play. Most learning-from-demonstration systems for robotic tasks [Argall et al., 2009; Chao et al., 2011; Nicolescu and Mataric, 2003] teach action sequences rather than descriptions of goals or conditions. Closely related to our work is that of Hinrichs and Forbus [2014], in which a computer agent learns to play Tic-Tac-Toe through interactions and gestures with a human instructor.

## Acknowledgments

The work described here was supported by the National Science Foundation under Grant Number 1419590 and the Office of Naval Research under Grant Number N00014-08-1-0099. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the NSF, ONR, or the U.S. Government.

## References

[Argall et al., 2009] B. Argall, S. Chernova, M. Veloso, and B. Browning. A survey of robot learning from demonstration. Robotics and Autonomous Systems, 2009.

[Barbu et al., 2010] A. Barbu, S. Narayanaswamy, and J. Siskind. Learning physically-instantiated game play through visual observation. In IEEE International Conference on Robotics and Automation (ICRA), pages 1879-1886, 2010.

[Chao et al., 2011] C. Chao, M. Cakmak, and A. Thomaz. Towards grounding concepts for transfer in goal learning from demonstration. In Proceedings of the International Conference on Development and Learning, 2011.

[Hinrichs and Forbus, 2014] T. R. Hinrichs and K. D. Forbus. X goes first: Teaching simple games through multimodal interaction. Advances in Cognitive Systems, 3:31-46, 2014.

[Kaiser, 2012] L. Kaiser. Learning games from videos guided by descriptive complexity. In AAAI, 2012.

[Kirk and Laird, 2014] J. R. Kirk and J. Laird. Interactive task learning for simple games. Advances in Cognitive Systems, 3:11-28, 2014.

[Laird, 2012] J. E. Laird. The Soar Cognitive Architecture. MIT Press, 2012.

[Mohan et al., 2012] S. Mohan, J. R. Kirk, A. Mininger, and J. E. Laird. Acquiring grounded representations of words with situated interactive instruction. Advances in Cognitive Systems, 2:113-130, 2012.

[Nicolescu and Mataric, 2003] M. Nicolescu and M. Mataric. Natural methods for robot task learning: Instructive demonstrations, generalization and practice. In Proceedings of AAMAS, pages 241-248, 2003.