| Authors | J. Lehman, J. Clune, D. Misevic, C. Adami, L. Altenberg, Julie Beaulieu, P. Bentley, Samuel Bernard, G. Beslon, David M. Bryson, P. Chrabaszcz, Nick Cheney, Antoine Cully, S. Doncieux, F. Dyer, K. Ellefsen, R. Feldt, Stephan Fischer, S. Forrest, Antoine Frénoy, Christian Gagné, L. L. Goff, L. Grabowski, B. Hodjat, F. Hutter, L. Keller, C. Knibbe, Peter Krcah, R. Lenski, H. Lipson, R. MacCurdy, Carlos Maestre, R. Miikkulainen, Sara Mitri, David E. Moriarty, Jean-Baptiste Mouret, Anh Totti Nguyen, Charles Ofria, M. Parizeau, David P. Parsons, Robert T. Pennock, W. Punch, T. S. Ray, Marc Schoenauer, E. Schulte, Karl Sims, Kenneth O. Stanley, F. Taddei, Danesh S. Tarapore, S. Thibault, Westley Weimer, R. Watson, Jason Yosinski |
| Journal | Artificial Life |
| Year | 2018 |
| DOI | 10.1162/artl_a_00319 |
What Problem It Solves
This paper addresses the systematic under-documentation and undervaluation of unexpected, creative, or "surprising" outcomes in digital evolution experiments. In both evolutionary computation and artificial life research, the standard scientific narrative demands clean hypotheses, controlled experiments, and reproducible results. When evolving digital organisms produce outcomes that subvert the experimenter's expectations—exploiting bugs, discovering unintended strategies, or converging on natural-like behaviors—these events are typically treated as nuisances: bugs are fixed, experiments are refocused, and anomalies are collapsed into aggregate statistics. This creates a critical information loss problem. The field lacks a formal mechanism for capturing, preserving, and learning from these "evolutionary surprises," which may contain genuine insights about the nature of evolutionary processes, the robustness of computational substrates, and the universal properties of complex adaptive systems. The paper solves this by providing the first peer-reviewed, crowd-sourced, fact-checked repository of such anecdotes, transforming oral tradition into citable evidence. It addresses the meta-scientific challenge of how to systematically collect and legitimize anomalous observations that do not fit standard hypothesis-testing frameworks but nonetheless carry scientific value.
The paper operates as a meta-scientific collection and synthesis, not a single experiment or algorithm. Its "method" is crowd-sourced anecdote collection with structured editorial review.
Intuition: In any complex evolving system—biological or digital—the space of possible adaptations is vast and non-intuitive. Researchers who build and study evolving digital systems routinely encounter outcomes that their design intentions did not anticipate. These "surprises" are scientifically valuable because they reveal properties of the evolutionary process that are not captured by the experimenter's mental model. However, standard scientific publishing has no venue for such observations: they are too rare for statistical analysis, too context-dependent for clean replication, and often emerge from "failed" experiments. The paper creates a dedicated venue by soliciting first-hand accounts from the research community, then organizing them thematically to extract cross-cutting insights.
Mechanics: The editorial team issued a public call for anecdotes to the artificial life and evolutionary computation communities. Editors fact-checked each submission for plausibility and internal consistency, often consulting the original researchers. The final collection includes 31 distinct anecdotes, each presented as a short narrative with the following structure: (1) the researcher's original intention or expectation, (2) what actually happened, (3) why it was surprising, and (4) what was learned. The anecdotes are grouped into thematic categories.
No formal equations: The paper is entirely qualitative. Its analytical contribution is the taxonomy of surprise types and the argument that these surprises are not anomalies but expected features of any sufficiently complex evolving system. The paper draws on concepts from evolutionary biology (convergent evolution, exaptation, evolutionary arms races) and computer science (emergent behavior, software bugs as evolutionary substrates) without formalizing them mathematically.
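The four-part narrative structure described under Mechanics can be sketched as a small record type. This is only an illustrative sketch: the `Anecdote` class, its field names, the `group_by_category` helper, the category labels, and the example entries are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Anecdote:
    """One surprise report (hypothetical schema, not the paper's own format)."""
    title: str
    intention: str       # (1) what the researcher intended or expected
    outcome: str         # (2) what actually happened
    why_surprising: str  # (3) why it was surprising
    lesson: str          # (4) what was learned
    category: str        # thematic label; the labels used below are assumptions


def group_by_category(anecdotes):
    """Map each category label to the titles filed under it, in input order."""
    groups = defaultdict(list)
    for anecdote in anecdotes:
        groups[anecdote.category].append(anecdote.title)
    return dict(groups)


# Placeholder entries invented for illustration; they are NOT anecdotes
# from the paper.
EXAMPLES = [
    Anecdote("placeholder A", "maximize distance traveled",
             "score grew without any locomotion", "the simulator had a flaw",
             "audit the simulator", "bug exploitation"),
    Anecdote("placeholder B", "evolve a sorting routine",
             "tests passed via a shortcut", "the tests were too weak",
             "strengthen the tests", "unintended strategy"),
    Anecdote("placeholder C", "minimize energy use",
             "agent stopped moving entirely", "idleness scored best",
             "reward the task itself", "bug exploitation"),
]

if __name__ == "__main__":
    for category, titles in group_by_category(EXAMPLES).items():
        print(f"{category}: {titles}")
```

Keeping the four narrative fields separate makes it easy to check that a submission is complete before it goes to editorial review.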
This paper is not a method in the traditional sense—it does not provide an estimator, algorithm, or experimental protocol. It is a meta-scientific resource. Use it as follows:
Prefer this paper over standard experimental design texts when: You are designing an evolutionary computation or artificial life experiment and want to anticipate the kinds of unexpected outcomes that can arise. The anecdotes serve as a "failure modes" checklist. For example, if you are evolving neural network controllers, the paper's examples of evolved "cheating" behaviors can alert you to potential vulnerabilities in your fitness function.
Prefer this paper over pure theory papers when: You need concrete examples of how evolutionary dynamics can produce counter-intuitive results in computational substrates. The anecdotes provide intuition that formal models (e.g., schema theory, fitness landscape analysis) may not capture.
Prefer standard statistical methods over this paper when: You need to make quantitative claims about evolutionary dynamics. This paper provides no effect sizes, confidence intervals, or hypothesis tests. It is a qualitative complement, not a replacement.
Prefer this paper over single-case studies when: You want a broad survey of the types of surprises that have been documented across different systems (genetic algorithms, genetic programming, artificial life simulations, evolutionary robotics). The collection provides cross-system patterns that a single experiment cannot.
Prefer this paper as a teaching resource when: Introducing students to evolutionary computation or artificial life. The anecdotes are engaging and illustrate that evolution is not a simple optimization process but a creative, often unpredictable one. They also serve as cautionary tales about the gap between design intention and emergent behavior.
Prefer this paper as a citation for "evolutionary surprises" when: Writing a paper that reports an unexpected outcome in your own evolutionary experiment. The paper provides a legitimizing framework—your anomaly is not a failure but a data point in a recognized phenomenon.
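To make the "failure modes" use case concrete, the following toy sketch shows how a simple evolutionary loop can exploit a missing range check in a fitness function. It is not drawn from any anecdote in the paper: the genome layout, `buggy_fitness`, `fixed_fitness`, `evolve`, and `audit` are all hypothetical names, and the (1+λ)-style loop is just one convenient choice of algorithm.

```python
import random

GENOME_LEN = 10
INTENDED_MAX = float(GENOME_LEN)  # best score if every gene stays in [0, 1]


def buggy_fitness(genome):
    # Intended: each gene is a step length in [0, 1]. The range check is
    # missing, which is the deliberate "bug" that selection can exploit.
    return sum(genome)


def fixed_fitness(genome):
    # Clamp each gene to the intended range before scoring.
    return sum(min(1.0, max(0.0, g)) for g in genome)


def evolve(fitness, generations=200, offspring=20, sigma=0.2, seed=0):
    """Elitist (1+lambda)-style loop with Gaussian mutation (toy sketch)."""
    rng = random.Random(seed)
    parent = [rng.random() for _ in range(GENOME_LEN)]
    for _ in range(generations):
        children = [[g + rng.gauss(0.0, sigma) for g in parent]
                    for _ in range(offspring)]
        # Keep the parent in the pool so fitness never decreases.
        parent = max(children + [parent], key=fitness)
    return parent


def audit(genome):
    """Return the indices of genes that fall outside the intended range."""
    return [i for i, g in enumerate(genome) if not 0.0 <= g <= 1.0]


if __name__ == "__main__":
    exploit = evolve(buggy_fitness)
    print("buggy score:", buggy_fitness(exploit), "intended max:", INTENDED_MAX)
    print("out-of-range genes:", audit(exploit))
    print("fixed score:", fixed_fitness(evolve(fixed_fitness)))
```

A score above `INTENDED_MAX` is a cheap signal that the evolved solution is doing something the designer did not intend. Comparing the best score against a known upper bound is one simple check that such a checklist might suggest.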
Selection bias: The anecdotes are self-reported. Researchers are more likely to report surprising outcomes that are interesting or flattering. Boring surprises, or surprises that reveal the researcher's own errors, are likely underreported. The paper acknowledges this but cannot correct it.
Verification difficulty: Many anecdotes describe events that occurred years ago, in code that may no longer exist. Fact-checking relies on the researcher's memory and any surviving documentation. The paper's editorial process mitigates but does not eliminate this problem.
No negative evidence: The paper does not report cases where evolution failed to be creative, or where surprises turned out to be artifacts. This creates an asymmetric picture. A reader might conclude that evolutionary surprises are ubiquitous, when in fact they may be rare relative to the total number of experiments run.
Definitional ambiguity: What counts as "surprising" is subjective. One researcher's surprise is another's expected outcome. The paper does not operationalize "surprise" in a way that would allow independent replication of the collection process.
No causal analysis: The anecdotes describe that surprises occurred but rarely provide rigorous causal analysis of why. For example, a digital organism that evolves to exploit a bug may do so because of the specific selection pressure, the mutation rate, the population size, or pure chance. The anecdotes do not disentangle these factors.
Temporal degradation: The paper was published in 2018. As evolutionary computation platforms evolve (e.g., from Tierra to modern GPU-accelerated systems), the specific types of surprises may change. The paper's taxonomy may need updating for contemporary systems.
Cultural specificity: The anecdotes come from a specific research community with shared norms and expectations. Surprises that would be obvious to an outsider (e.g., that evolution can produce complex behavior from simple rules) are treated as remarkable. The paper's framing may not generalize to other scientific communities.
No quantitative framework: The paper provides no way to predict when surprises will occur, how often, or how severe they will be. It is purely descriptive. For researchers who need actionable guidance (e.g., "how do I design a fitness function that minimizes the risk of bug exploitation?"), the paper offers intuition but no algorithm.
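The gaps noted above around causal analysis and prediction can at least be probed in a toy setting. One option is a small factorial sweep that varies population size and mutation rate across replicate seeds and counts how often an exploit is found. The sketch below assumes a made-up bit-string task with a deliberately planted bug. Nothing in it comes from the paper, and `run_once`, `exploit_rates`, and the genome layout are hypothetical.

```python
import random
from itertools import product

N_TASK, N_MAGIC = 16, 4  # hypothetical layout: 16 task bits, 4 "magic" bits


def uses_exploit(genome):
    # The planted bug fires only when every magic bit is set.
    return all(genome[N_TASK:])


def fitness(genome):
    score = sum(genome[:N_TASK])  # intended objective: count of task bits set
    if uses_exploit(genome):
        score += 100              # deliberate bug: unintended bonus
    return score


def run_once(pop_size, mutation_rate, seed, generations=100):
    """Run one toy GA and report whether the best genome uses the exploit."""
    rng = random.Random(seed)
    pop = [[0] * (N_TASK + N_MAGIC) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: max(1, pop_size // 2)]  # truncation selection
        children = [
            [bit ^ (rng.random() < mutation_rate) for bit in rng.choice(parents)]
            for _ in range(pop_size - 1)
        ]
        pop = [parents[0]] + children           # elitism: keep the best as is
    return uses_exploit(max(pop, key=fitness))


def exploit_rates(pop_sizes, mutation_rates, seeds):
    """Fraction of replicate runs that end on the exploit, per condition."""
    seeds = list(seeds)
    rates = {}
    for n, mu in product(pop_sizes, mutation_rates):
        hits = sum(run_once(n, mu, s) for s in seeds)
        rates[(n, mu)] = hits / len(seeds)
    return rates


if __name__ == "__main__":
    table = exploit_rates([10, 50], [0.0, 0.01, 0.05], range(10))
    for (n, mu), rate in table.items():
        print(f"pop={n:3d} mutation={mu:.2f} exploit rate={rate:.2f}")
```

Replicate seeds separate chance from the effect of each parameter. With only a handful of seeds per condition the rates are coarse estimates, so more replicates would be needed before drawing any quantitative conclusion.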