Abstract: Designing an adequate fitness function requires substantial knowledge of a problem and of features that indicate progress towards a solution. Coevolution takes the human out of the loop by dynamically constructing the evaluation function based on interactions between evolving individuals. A question is to what extent such automatic evaluation can be adequate. We define the notion of an ideal evaluation function. It is shown that coevolution can in principle achieve ideal evaluation. Moreover, progress towards ideal evaluation can be measured. This observation leads to an algorithm for coevolution. The algorithm makes stable progress on several challenging abstract test problems.
Abstract:
Genetic algorithms generally use a fixed problem representation that maps variables of the search space to variables of the problem, and operators of variation that are fixed over time. This limits their scalability on non-separable problems. To address this issue, methods have been proposed that coevolve explicitly represented modules. An open question is how modules in such coevolutionary setups should be evaluated.
Recently, Pareto-coevolution has provided a theoretical basis for evaluation in coevolution. We define a notion of functional modularity, and objectives for module evaluation based on Pareto-Coevolution. It is shown that optimization of these objectives maximizes functional modularity.
The resulting evaluation method is developed into an algorithm for variable length, open ended development of representations called {\em DevRep}.
DevRep successfully identifies large partial solutions and greatly outperforms fixed length and variable length genetic algorithms on several test problems, including the 1024-bit Hierarchical-XOR problem.
Abstract:
An important question in reinforcement learning is how generalization may be performed. This problem is especially important if the learning agent receives only partial information about the state of the environment. Typically, the bias required for generalization is chosen by the experimenter. Here, we investigate a way for the {\em learning method} to extract bias from learning one problem and apply it in subsequent problems. We use a gradient-based policy search method, and look for controllers that consist of a context component and an action component. Empirical results on a two-agent coordination problem are reported. It was found that learning a bias made it possible to address problems that were not solved otherwise.
Abstract: The representation of a learning or search problem can greatly impact its performance. An alternative to hand-constructing an appropriate representation is to let the learning method adapt its own representation. We investigate this in a setup where building blocks and assemblies thereof are coevolved. Building blocks may employ other building blocks, thus leading to hierarchical constructions. In experiments, this is observed to lead to highly compact representations of sequences that are useful as building blocks. The method is able to solve problems requiring long sequences of primitive operators. Control experiments using a conventional evolutionary method were much less efficient, and did not find solutions to the problems. Limitations of the current method are discussed.
Abstract: Since architectures and weights for recurrent neural networks are difficult to design, evolutionary methods may be applied to search the space of such networks. However, for all but trivial problems, this space is very large. Hence, biases are required that guide the search. Here, we investigate solving a smaller related problem to establish such a bias. Networks are specified by trees containing operators that act on nodes (neurons) and edges (connections). We demonstrate the approach on a signal reproduction task that requires internal state. Performance on a small problem size was improved by solving a smaller problem first. By repeatedly applying the principle, versions of the problem were solved that were not solved by a direct approach.
Abstract: Two important problems in genetic programming (GP) are its tendency to find unnecessarily large trees (bloat), and the general evolutionary algorithms problem that diversity in the population can be lost prematurely. The prevention of these problems is frequently an implicit goal of basic GP. We explore the potential of techniques from multi-objective optimization to aid GP by adding explicit objectives to avoid bloat and promote diversity. The even 3, 4, and 5-parity problems were solved efficiently compared to basic GP results from the literature. Even though only non-dominated individuals were selected and populations thus remained extremely small, appropriate diversity was maintained. The size of individuals visited during search consistently remained small, and solutions of what we believe to be the minimum size were found for the 3, 4, and 5-parity problems.
Abstract: The development of communication in a population of agents is viewed as the behavior of a dynamical system. A deterministic communication system is shown experimentally to have point attractors that correspond to perfect communication. However, the determinism required for the presence of point attractors impedes the development of communication. A corresponding attractor type for the stochastic system is defined, and the existence of these attractors in a stochastic version of the system is demonstrated.
Abstract: We study the evolution of communication where concepts are developed individually by agents and relations between concepts and forms (words, signals) are learned through interaction with other agents. By constructing concepts based on experience with the same environment, agents develop similar conceptual systems. Concepts represent situations in the environment. The system of associations between forms and meanings is viewed as a dynamical system. This paper presents first results with investigating the phase space of the system. The analysis contributed to understanding the interaction between association strengths of different agents and of different meanings.
Abstract: A model for the formation of situation concepts is described. A characteristic of this form of concept formation is that it does not require instructive feedback. This renders it suitable for concept formation by autonomous agents. It is experimentally demonstrated that situation concepts constructed independently by several agents can convey useful information between agents through a learned system of communication. A relation was found between the development of the learned system of communication and the duration of the situations.
Abstract:
Sensory channels determine the way an agent views the world. We investigate the question of how sensory channels may be autonomously constructed using generation and selection. The context is the discrimination of geometric shapes. In a first experiment, elements of a solution were attributed fitness based on the part of the problem they solved. In two subsequent experiments, cooperation between elements was respectively required and encouraged by means of a fitness function which only rewards complete solutions. Differences between the approaches are discussed, and generation and selection is concluded to provide a successful mechanism for the autonomous construction of sensory channels.
Abstract: This paper reports on research into the origins of communication and coordination. Several problems with defining communication and coordination are
noted. A research methodology is described that circumvents these problems.
The methodology is used in an experiment concerning the development of coordination.
The aim of the experiment is to see whether a learning agent can use coordination signals, which represent evaluations of its behavior, to learn to coordinate its actions in an unknown environment. The task is a pursuit problem where four agents are needed to capture a randomly moving prey. One of these agents adapts its behavior based on the coordination signals it receives from the three other agents. The development of coordination increased the capture rate in this pursuit problem from an initial 5% to 93%. Thus, in combination with a general learning mechanism, coordination signals may be sufficient for the development of coordination.
Abstract:
This paper investigates whether a group of agents may develop a common
lexicon relating words to situations by a process of self-organization.
Each agent independently decides which situations are useful to
distinguish, based on its experience with the environment.
It then starts to associate signals with each situation.
The agents adapt their own associations based on the signals they received from other agents.
The system is monitored using measures which reflect the development of the lexicons over time.
The result of the distributed activities of the agents is that a coherent shared lexicon
emerges linking signals to situations.
Abstract: In order to discriminate among the different objects
in its environment, an agent may develop a primitive notion of concepts
based on the sensor data it receives. In this paper, this phenomenon is
investigated by having software agents play discrimination games with
the sensor data of autonomous robots. We have compared the Simple Prototype
method and the Adaptive Subspace method. Both methods achieve high discrimination-success
rates. The Adaptive Subspace method accomplishes this with a converging
and relatively low number of categories. The purpose of these discrimination
games is to serve as a basis for lexicon formation experiments. From the
experiments in this paper, we conclude that the Adaptive Subspace method
is more attractive for discrimination games.
Abstract: A framework for coordination in multi-agent systems is introduced. The main idea of our framework is that an agent with knowledge about the desired behavior in a certain domain will direct other, domain-independent agents by means of signals which reflect its evaluation of the coordination between its own actions and their actions. Mechanisms for coordination are required to enable construction of open multi-agent systems. The goal of this investigation was to test the feasibility of guiding an agent with coordination evaluation signals, and furthermore to gather experience with instantiating the framework on a testbed domain, the Pursuit Problem. In the testbed system, agents have been created which choose their actions by maximizing the coordination evaluation signals they will receive. The performance of these agents turned out to rank among the best results encountered in literature, and behavior guided by coordination evaluation signals can thus be concluded to be useful in this domain.
Back to Edwin de Jong's homepage