This webpage provides a description, MatLab implementation, and demonstration of the DELPHI algorithm for coevolution, which is described in [DeJ04]. For background information on accurate evaluation in coevolution, see here.
The DELPHI algorithm operates as follows. There are two populations: a population of learners and a population of tests. At every cycle, a new generation of learners and tests is produced. To determine which of the newly generated learners and tests will replace existing individuals and which ones will be discarded, all individuals are evaluated. In this evaluation, the objectives of a learner are its outcomes against all tests in the test population. The objectives of a tests are the distinctions it makes between learners.
After evaluation, replacement works as follows. A new learner replaces an existing learner if there is some existing learner which it dominates. A new test can only replace its own parent, and also does so if it dominates it. A test can be described as a hill-climber that leaves its current position in genotype space if the next position (its offspring) is better in the sense of Pareto-dominance; thus, a test can be described as a Pareto-hillclimber. Thus, given a population of learners L and a population of tests E, new learners are evaluated based on the tests in E and can replace any learner they dominate, while tests are Pareto-hillclimbers that use the distinctions between the learners in L as their objectives. The method is therefore called DELPHI, which stands for Dominance-based Evaluation of Learners on Pareto-Hillclimbing Individuals. For a detailed description, please see the pseudo-code or download the Matlab implementation.
2. Demonstration of DELPHI
2.1 The Compare-on-one Problem
A central problem in coevolution which many instances of failure can be derived from is inaccurate evaluation. To test whether a coevolution method performs accurate evaluation, we need a test problem that makes accurate evaluation difficult. That is, the problem should make it difficult to maintain informative tests. Two features that make it difficult to maintain informative tests are:
There are multiple underlying objectives on which learners must be tested
Tests that test on different underlying objectives are in different regions of the genotype space.
If these two properties hold, then accurate evaluation requires maintaining evaluators in different regions of the space. We have designed a problem of variable dimensionality that possesses these features. The problem is called compare-on-one, as a test assess a learner in only one underlying dimension of the problem. Thus, at least one test for each dimension is required to evaluate a learner on all it's underlying objectives. What's more, the regions in which these different tests are located are disconnected, and diverge to increasing distances from each other as learner performance improves.
Illustration of the compare-on-one problem. The figure shows the tests that are passed by the learner.
The compare-on-one problem works as follows. Both learners and tests are vectors containing n real values, and the game is a variant of the Numbers games introduced by Richard Watson [Wat01]. A test assesses the performance of a learner only in the dimension in which the test itself has its largest value. The test returns an outcome of 1 if the learner is greater or equal to the test's value in this dimension, and -1 otherwise. As a result of this, a test whose highest value is in the horizontal (x) dimension (below the diagonal) tests learners only on their x-dimension, while a test whose highest value is in the vertical (y) dimension (above the diagonal) tests the learners on their y-dimension only.
The figure shows the tests that are able to distinguish between the two learners; the tests testing on different dimensions are in disconnected regions of the space. This makes it difficult to maintain tests for all underlying objectives. Thus, accurate evaluation is difficult to obtain for the compare-on-one problem, as desired.
2.2 Results on the Compare-on-one Problem
The following figure shows the end result of running the MatLab implementation of the DELPHI algorithm on the 2-dimensional compare-on-one game for 250 generations. In the experiment, mutation adds a constant chosen randomly from [-0.2 .. 0.1] to each dimension. Thus, without an appropriate selection mechanism, both learners and tests are more likely to go down than up. The strict selection mechanism of DELPHI allows the learners to make overall progress. The tests thereby function as a ratchet by detecting regress and posing increasingly high thresholds in all dimensions.
This experiment illustrates the operation of the DELPHI algorithm, but the importance of accurate evaluation becomes clearer for higher-dimensional versions, such as the 5-dimensional compare-on-one game. While DELPHI still makes stable progress on this problem, none of the other coevolutionary methods we have compared was able to make progress on all underlying objectives.
This experiment may be tried using the code that can be downloaded below. A movie of the complete experiment can be viewed here.
DELPHI on the compare-on-one problem. The tests traverse the axes, the learners (top right) use the evaluation provided by the tests to distinguish progress from regress, and thereby achieve progress in both underlying objectives.
The following two figures show DELPHI and eight comparison methods on the five-dimensional compare-on-one problem with mutation bias. Only Delphi and a method based on the same principles (but allowing a learner to replace its own parent only) are able to achieve consistent progress on this difficult test problem. For details, see [DeJ04].
2.3 Finding the Underlying Objectives
The test in the previous experiment traverse the axes. This suggests that the DELPHI algorithm identifies the underlying objectives of the compare-on-one problem. However, there is a direct correspondence between the underlying objectives and the genetic representation of the individuals. Therefore, we perform a control experiment where there is no such direct correspondence. This is done as follows. Before the outcome of a learner on the compare-on-one problem is computed, we rotate both the learner and the test 30 degrees clockwise. If the individuals are to find the optimal directions of progress in this case, they would have to move along directions rotated 30 anti-clockwise in order to counteract the rotation.
As the figure shows, the direction in which the learners and tests are moving has indeed rotated anti-clockwise by approximately 30 degrees. This further supports the earlier evidence that the tests are extracting the underlying objectives of the problem.
The experiment can be repeated using the code below.
A movie of the complete experiment can be viewed here.
DELPHI on the rotated compare-on-one problem.
3. Pseudo-code for DELPHI
The following pseudo-code describes the DELPHI algorithm:
DELPHI()
initialize(Lpop); %randomly initialize populations
initialize(Tpop);
for gen=1:generations
for i=1:n
Lgen[i] := mutate(Lpop[i]); %create offspring using mutation
Tgen[i] := mutate(Tpop[i]);
end
L := Lpop+Lgen;
T := Tpop+Tgen;
Tpop := evolve-tests(L, Tpop, Tgen);
Lpop := evolve-learners(Lpop, Lgen, T);
end
end
4. MatLab Implementation of DELPHI
A MatLab(TM) implementation of the DELPHI algorithm is provided here. The code has also been tested to work under Octave. Octave can be downloaded for free from www.octave.org.
The complete distribution is available as a tar-file here. Please consult the README file contained in this distribution for instructions.
This distribution contains the following files:
The DELPHI algorithm is in the first place
a proof-of-principle, designed with the aim of demonstrating that
coevolutionary evaluation can be sufficiently accurate to enable
stable progress on multiple underlying objectives. For a detailed discussion of DELPHI's limitations, please consult [DeJ04].
A particular limitation of the DELPHI algorithm is
its limited degree of exploration. The development of practical, reliable
algorithms for coevolution will require combining accurate evaluation with exploration in a single algorithm.
6. Contact Information
Comments on this work are very welcome. If you have any questions or comments, please send email to: d e j o n g @ c s . u u . n l (without spaces)
References
[DeJ04] De Jong, E.D. and J.B. Pollack (2004). Ideal Evaluation from Coevolution [PS] [PDF]. This is the final manuscript of the article. The final version of this article will be published in Evolutionary Computation, Vol. 12, Issue 2, published by The MIT Press.
[Wat01] Watson, R.A. and Pollack, J.B. (2001). Coevolutionary Dynamics in a Minimal Substrate. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-'01). L. Spector and E. Goodman and A. Wu and W.B. Langdon and H.-M. Voigt and M. Gen and S. Sen and M. Dorigo and S. Pezeshk and M. Garzon and E. Burke (eds.), pp.702-709. Morgan Kaufmann, San Francisco, CA.