University of Paderborn

GET Research Seminar Abstracts

Hierarchical Reinforcement Learning in a Robot Rescue Domain

Kinan Mahdi, GET Lab

Talk: Wed., 2 February, 16:30, Room P 1.4.17


One of the hardest problems in AI research is enabling an autonomous agent to deal with unexpected situations. Adaptive behavior, which is common to sophisticated biological systems such as mammals, is pivotal to this task. To achieve such behavior, it is useful to look closely at how it originates in animals and humans and to transfer this knowledge to a robotic system. It is therefore relevant to address biologically inspired paradigms such as learning by trial and error and the reuse of existing, similar behavior. Another common way of learning is to split a complex problem into smaller sub-problems and learn each of them individually. Current research in behavioral science and neuroscience shows that biologically inspired learning can lead to hierarchically structured learning, which accelerates the learning process.

Carrying out complex tasks in real-world scenarios requires robotic agents to operate autonomously and to adapt their behavior to changing situations. Computational Reinforcement Learning (RL) [SB98] provides a framework for meeting these challenges. This biologically inspired machine learning approach has also had a great impact on research into numerous issues in psychology and neuroscience and on the understanding of them. Despite the great popularity of RL and its increasing use in robotics, some problems still limit its applicability in complex domains. Such domains involve a large search space, with many features describing the states or a wide variety of available actions.

One of the most influential approaches to this scaling problem is temporal abstraction. Algorithms developed within this approach, such as MAXQ Value Decomposition, the Options Framework, and Hierarchies of Machines [PR98], extend classic RL by abstracting over temporally related actions. These groups or sequences of actions attain the status of high-level skills. Composed high-level skills impose a new hierarchy on the given task, which in turn leads to an abstraction and hierarchical subdivision of the search space. RL with temporal abstraction is also called hierarchical reinforcement learning (HRL). The evaluation, implementation, and application of HRL are the main topics of this thesis.
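To make the idea of a temporally abstracted action more concrete, the following is a minimal sketch of an option in the sense of the Options Framework: a set of states where it may start, an internal policy, and a termination condition. It is not taken from the thesis. The one-dimensional corridor, the reward values, the discount factor, and all names (`Option`, `step`, `run_option`, `go_to_door`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Set, Tuple

State = int   # assumption: position in a hypothetical corridor with cells 0..9
Action = int  # assumption: -1 moves left, +1 moves right


@dataclass
class Option:
    """A temporally extended action: where it may start, how it acts, when it ends."""
    name: str
    initiation_set: Set[State]
    policy: Callable[[State], Action]
    terminates: Callable[[State], bool]


def step(state: State, action: Action, goal: State = 9) -> Tuple[State, float]:
    """Toy dynamics (assumed): reward -1 per move, 0 when the move reaches the goal."""
    next_state = min(max(state + action, 0), 9)
    reward = 0.0 if next_state == goal else -1.0
    return next_state, reward


def run_option(option: Option, state: State, gamma: float = 0.9,
               max_steps: int = 100) -> Tuple[State, float, int]:
    """Run an option until it terminates.

    Returns (final state, discounted reward collected, number of primitive steps).
    """
    if state not in option.initiation_set:
        raise ValueError(f"option {option.name!r} cannot start in state {state}")
    total, discount, duration = 0.0, 1.0, 0
    while duration < max_steps:
        action = option.policy(state)
        state, reward = step(state, action)
        total += discount * reward
        discount *= gamma
        duration += 1
        if option.terminates(state):
            break
    return state, total, duration


# Hypothetical high-level skill: walk right until the "door" at cell 5 is reached.
go_to_door = Option(
    name="go_to_door",
    initiation_set=set(range(0, 5)),
    policy=lambda s: 1,
    terminates=lambda s: s == 5,
)

final_state, discounted_return, duration = run_option(go_to_door, state=2)
print(final_state, duration)
```

From the point of view of a higher-level learner, the whole sequence of primitive steps appears as a single decision: it only observes the final state, the accumulated discounted reward, and the duration. A hierarchical learner can then choose among such skills instead of among primitive actions, which is the reduction of the search space described above.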