In artificial intelligence and computational cognitive science, the “action selection problem” is typically associated with intelligent agents and animats, that is, software systems that exhibit complex behaviour in an artificial environment. The problem concerns how an agent decides “what to do in the next instant”. What exactly counts as an action depends on the level of abstraction chosen by the designer. Depending on that level, an atomic action can be anything from “movement of a cell” to “provoking a war between two kingdoms”. The set of possible actions is predefined and typically fixed.

There are several reasons why this problem is not trivial and why it interests many researchers in both artificial intelligence and cognitive science:

  • the agents typically act in dynamic and unpredictable environments
  • the agents typically act in real time, so they must make decisions in a timely fashion
  • humans are typically allowed to interact with the agents (and may hinder or harm them)
  • the agents are typically created to perform several tasks, which may conflict over the allocation of resources
  • the agents are often models of animals/humans, and animal/human behaviour is quite complicated

Action selection mechanism

The action selection mechanism (ASM) is often thought of as the “part” of the agent that decides what to do next, using behavioural structures created by a designer in advance. These structures are often called plans or domain representations. When considering future possibilities, the ASM follows these structures while also taking into account the agent’s current state, i.e. its memory and needs.

In a broader sense, the ASM not only decides on the agent’s future actions but also directs its perceptual attention, updates its memory and modifies the behavioural structures, i.e. it possibly carries out a form of behavioural learning.

Note that the ASM is also sometimes referred to as the “agent architecture”, or thought of as a substantial part of it.

General approaches to action selection

Generally, the mechanisms can be divided into three groups: reactive planning, classical planning and hybrid approaches. Reactive planning methods compute just one next action at every instant, based on the current context and pre-scripted plans. Classical planning methods, by contrast, do not use pre-scripted plans. Instead they rely on a description of the domain (the virtual environment and the agent’s goals) and compute a sequence of actions from it, so they essentially compute a plan. Notice that a “plan” in classical planning really means a sequence of actions, while a “plan” in reactive planning is rather a structure that describes behaviour.
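The difference between the two kinds of “plan” can be illustrated with a minimal Python sketch. The toy domain (an agent on a line of cells 0 to 4 that wants to reach cell 4) and all names in it are assumptions made for illustration only; breadth-first search stands in for a real classical planner.

```python
from collections import deque

# Hypothetical toy domain: cells 0..4 on a line, goal is cell 4.
ACTIONS = {"left": -1, "right": +1}
FIRST_CELL, LAST_CELL, GOAL = 0, 4, 4


def reactive_select(position):
    """Reactive planning: return only the next action, using a pre-scripted rule."""
    if position < GOAL:
        return "right"
    if position > GOAL:
        return "left"
    return None  # already at the goal, nothing to do


def classical_plan(start):
    """Classical planning: search the domain description and return
    a whole sequence of actions (here found by breadth-first search)."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        position, plan = frontier.popleft()
        if position == GOAL:
            return plan
        for name, delta in ACTIONS.items():
            nxt = position + delta
            if FIRST_CELL <= nxt <= LAST_CELL and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None  # goal unreachable


print(reactive_select(1))   # a single action
print(classical_plan(1))    # a sequence of actions
```

The reactive function must be called again at every instant, whereas the classical planner is called once and its result is then executed step by step.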

While reactive planning techniques operate in a timely fashion, classical planning methods suffer from combinatorial complexity and do not cope well with dynamic environments. On the other hand, a reactive planning ASM is often limited to a fixed set of predetermined reactions, while a classical planner can cope with a new situation (to some extent). Therefore, hybrid techniques are often used. These are combinations of reactive planners with classical planners, or so-called anytime mechanisms.
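One possible reading of an anytime mechanism is sketched below in Python: a reactive answer is available immediately, and a search replaces it only if it finishes within a given budget. The graph, the fallback rule and the budget measured in node expansions are all hypothetical, chosen to keep the example deterministic; this is a sketch under those assumptions, not a description of any particular hybrid architecture.

```python
# Hypothetical world: from A, going north leads to a dead end (B),
# going east leads to C, from which the goal G lies to the north.
GRAPH = {
    "A": [("north", "B"), ("east", "C")],
    "B": [],
    "C": [("north", "G")],
    "G": [],
}


def successors(state):
    return GRAPH[state]


def fallback(state):
    """Pre-scripted reaction: always head north."""
    return "north"


def anytime_select(state, goal, budget):
    """Return the first action of a path to the goal if one is found within
    `budget` node expansions, otherwise the reactive fallback action."""
    best = fallback(state)          # usable answer from the very start
    expansions = 0
    frontier = [(state, None)]      # (state, first action on the path to it)
    seen = {state}
    while frontier and expansions < budget:
        next_frontier = []
        for current, first in frontier:
            if expansions >= budget:
                break               # out of time: stop searching
            expansions += 1
            for action, nxt in successors(current):
                if nxt in seen:
                    continue
                seen.add(nxt)
                first_action = action if first is None else first
                if nxt == goal:
                    return first_action
                next_frontier.append((nxt, first_action))
        frontier = next_frontier
    return best


print(anytime_select("A", "G", budget=2))   # search cut short, reactive answer
print(anytime_select("A", "G", budget=10))  # search completes
```

With a small budget the agent behaves like the reactive planner; with a larger one it behaves like the classical planner for the first step of the plan.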



Concrete architectures

Soar architecture

Soar is a cognitive architecture that has been developed and extended since about the mid-1980s. It is based on condition-action rules. The Soar programming toolkit can be used to build both reactive and planning agents, or any compromise between these two extremes.
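The general idea of condition-action rules can be sketched in a few lines of Python. This is not Soar syntax and does not reproduce Soar's decision cycle; the rule names, the working-memory keys and the "first matching rule fires" policy are assumptions made purely for illustration.

```python
# Each rule is (name, condition, action). Conditions test a dictionary
# that stands in for the agent's working memory (hypothetical keys).
RULES = [
    ("flee", lambda wm: wm["enemy_visible"] and wm["health"] < 30, "run_away"),
    ("fight", lambda wm: wm["enemy_visible"], "attack"),
    ("explore", lambda wm: True, "wander"),
]


def select_action(working_memory):
    """Return the action of the first rule whose condition holds."""
    for name, condition, action in RULES:
        if condition(working_memory):
            return action
    return None


print(select_action({"enemy_visible": True, "health": 20}))
print(select_action({"enemy_visible": False, "health": 20}))
```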

Tyrrell architecture

It is a reactive planning architecture developed by Toby Tyrrell in 1993. The agent's behaviour is stored in the form of a hierarchical connectionist network, which Tyrrell named a free-flow hierarchy. It has more recently been used, for example, by de Sevin & Thalmann (2005) and Kadleček (2001).
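A heavily simplified Python sketch of the free-flow idea follows: activation flows from drives down through the hierarchy without any decision being taken at intermediate nodes, and only the leaf actions compete at the end. The node names, the weights and the plain weighted-sum combination rule are assumptions for illustration and are not taken from Tyrrell's work.

```python
# Hypothetical hierarchy: node -> list of (child, weight) links.
# Nodes are listed parents before children, so one pass is enough.
HIERARCHY = {
    "hunger": [("find_food", 1.0)],
    "fear": [("avoid_predator", 1.0)],
    "find_food": [("move_to_food", 0.8), ("eat", 0.6)],
    "avoid_predator": [("move_away", 1.0), ("move_to_food", -0.5)],
}
ACTIONS = ["move_to_food", "eat", "move_away"]


def free_flow_select(drives):
    """Propagate drive activations down the hierarchy and return the
    leaf action with the highest accumulated activation."""
    activation = dict(drives)
    for node, links in HIERARCHY.items():
        for child, weight in links:
            contribution = weight * activation.get(node, 0.0)
            activation[child] = activation.get(child, 0.0) + contribution
    return max(ACTIONS, key=lambda action: activation.get(action, 0.0))


print(free_flow_select({"hunger": 0.9, "fear": 0.2}))
print(free_flow_select({"hunger": 0.3, "fear": 0.9}))
```

Because both drives contribute to "move_to_food" (one positively, one negatively), the final choice reflects all of them at once rather than only the strongest drive.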


Creatures are virtual pets from a computer game, driven by a three-layered neural network. Their mechanism belongs to the reactive planning branch, since in every time step the network determines the task that the pet has to perform. The network is described well in the paper by Grand et al. (1997) and in The Creatures Developer Resources. See also the Creatures wiki.

guidelines/action_selection_problem.txt · Last modified: 2011/12/22 15:11 by michal.bida