What Evolutionary Algorithms Can and Can’t Do

Roman V. Yampolskiy
Mar 20, 2019


roman.yampolskiy@louisville.edu

Computer Engineering and Computer Science

University of Louisville

Abstract

In this paper, we review the state-of-the-art results in evolutionary computation and observe that we do not evolve non-trivial source code from scratch and without human intervention. A number of possible explanations are considered, but we conclude that the computational complexity of the problem prevents it from being solved as currently attempted. A detailed analysis of the necessary and available computational resources is provided to support our findings, as well as an analysis of neuroevolution as a method for generating the software of the future.

Keywords: Darwinian Algorithm, Genetic Algorithm, Genetic Programming, Optimization, Neuroevolution.

1. Introduction

In 1859 Charles Darwin (Darwin 1859), and many scholars before him (Lamarck 1809, Chambers 1853), proposed theories to explain the origins of complex life forms via natural selection and modification. Scientific theories are algorithms which, given starting conditions as input, make statistically accurate predictions about the future state of the system. For example, computer simulations of continental drift give us the positions of continents at some time t. Yampolskiy emphasizes the importance of such simulations: “A scientific theory cannot be considered fully accepted until it can be expressed as an algorithm and simulated on a computer. It should produce observations consistent with measurements obtained in the real world, perhaps adjusting for the relativity of time scale between simulation and the real world. In other words, an unsimulatable hypothesis should be considered significantly weaker than a simulatable one. It is possible that the theory cannot be simulated due to limits in our current computational capacity, hardware design, or capability of programmers and that it will become simulatable in the future, but until such time, it should have a tentative status.” (Yampolskiy 2017).

Simulations of the Darwinian algorithm on a computer are known as Evolutionary Algorithms (EA) and have been around since the early days of computer science (Fogel, Owens, and Walsh 1966, Barricelli 1957), with popular sub-fields such as Genetic Algorithms (GA), Genetic Programming (GP), Evolutionary Strategies (ES) and Artificial Life (AL). Currently, the artifacts produced in all of the above-mentioned areas are orders of magnitude less complex than what we observe in the natural world, but why?

A number of seminal papers have been published attempting to formalize Darwin’s biological theory from the point of view of the computational sciences. Such works essentially see biological evolution as a computational process which happens to run on a carbon-based substrate, but which could run on other substrates as well. Valiant, in his work on evolvability (Valiant 2009), treats Darwinian evolution as a learning process over mathematical functions and attempts to explain quantitatively which artifacts can be evolved with given resources, and which cannot. Likewise, Chaitin, in his work on metabiology (Chaitin 2013, 2012), attempts to develop an abstract fundamental mathematical theory of evolution. Wolfram, in his “A New Kind of Science” (Wolfram May 14, 2002), attempts to show how rules from the computational universe of simple programs can explain some of the biological complexity we observe. Livnat and Papadimitriou analyze sex as an algorithm in their work on the theory of evolution viewed through the lens of computation (Livnat and Papadimitriou 2016).

It is interesting to do a thought experiment and try to imagine what testable predictions Charles Darwin would have made had he made his discovery today, with full knowledge of modern bioinformatics and computer science. His predictions may have included the following: 1) Simulations of evolution will produce statistically similar results, at least with respect to the complexity of the artifacts produced. 2) Since running evolutionary algorithms for longer would continue to produce non-trivial outputs, scientists would run them indefinitely. Likewise, he would be able to state conditions that would falsify his theory, such as: 1) Representative simulations of evolution will not produce results similar to those observed in nature. 2) Researchers will not be able to evolve software or other complex or novel artifacts. 3) There will not be any projects running evolutionary algorithms long-term, because their outputs would quickly stop improving and stabilize. With respect to public and general cultural knowledge, it would be reasonable to predict that educated people would know of the longest-running evolutionary algorithm and of the most complex evolved algorithm. Similarly, even schoolchildren would know of the most complex digital organism ever evolved.

In the rest of the paper, we evaluate the state-of-the-art in relevant published research to see how the above-mentioned predictions and falsifiability conditions hold up and what we can say about the foundational question of this paper. We analyze potential explanations for the observed progress in the domain of EAs and look at computational resources, required and available, as the main source of limitations and future opportunities for evolving software source code. We conclude that evolutionary algorithms are powerful tools capable of addressing many optimization problems, of improving existing designs, and of competing with the best machine learning algorithms in the optimization of neural networks, but that they are not competitive in novel engineering design or in the synthesis of human-readable source code.

2. Evolutionary Computation

Inspired by Darwin’s theory (Darwin 1859) of biological evolution, evolutionary computation attempts to automate the process of optimization and problem solving by simulating differential survival and reproduction of individual solutions. Since the early 1950s, multiple well-documented attempts to make Darwin’s algorithm work on a computer have been published under such names as Evolutionary Programming (Back 1996), Evolutionary Strategies (Mayr 1974), Genetic Algorithms (Davis 1991), Genetic Programming (Koza 1994), Genetic Improvement (Petke et al. 2017), Gene Expression Programming (Ferreira 2006), Differential Evolution (Storn and Price 1997), Neuroevolution (Such et al. 2017) and Artificial Embryogeny (Stanley and Miikkulainen 2003). While numerous variants differing in their problem representation and metaheuristics exist (Yampolskiy, Ashby, and Hassan 2012, Yampolskiy and El-Barkouky 2011, Khalifa and Yampolskiy 2011, Lowrance, Abdelwahab, and Yampolskiy 2015), all can be reduced to just two main approaches — Genetic Algorithms (GA) and Genetic Programming (GP).
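As a concrete illustration of the generic loop these methods share (selection, crossover, and mutation over a population of candidate solutions), the following minimal sketch evolves a bit string on the toy OneMax problem. The function names and parameter values are our own, chosen for readability rather than performance.

```python
import random

def one_max(bits):
    # Toy fitness function: number of 1s in the bit string (maximum = len(bits)).
    return sum(bits)

def tournament(pop):
    # Differential survival: the fitter of two random individuals gets to reproduce.
    a, b = random.sample(pop, 2)
    return a if one_max(a) >= one_max(b) else b

def evolve(n_bits=50, pop_size=100, generations=200, mutation_rate=0.01):
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            p1, p2 = tournament(pop), tournament(pop)
            cut = random.randrange(1, n_bits)            # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ 1 if random.random() < mutation_rate else b for b in child]
            offspring.append(child)
        pop = offspring
    return max(pop, key=one_max)

if __name__ == "__main__":
    best = evolve()
    print(f"best fitness: {one_max(best)} / 50")
```

The same loop underlies GP; the difference lies in the representation (program trees or instruction sequences instead of bit strings) and in the variation operators applied to it.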

GAs are used to evolve optimized solutions to a particular instance of a problem, such as Shortest Total Path (Hundley and Yampolskiy April 28–29, 2017), Maximum Clique (Ouch, Reese, and Yampolskiy 2013), Battleship (Port and Yampolskiy 2012), Sudoku (Hughes and Yampolskiy 2012), Mastermind (Khalifa and Yampolskiy 2011), Light Up (Ashby and Yampolskiy 2011), Graph Coloring (Hindi and Yampolskiy 2012), integer factorization (Yampolskiy 2010, Mishra, Pal, and Yampolskiy 2016) or efficient halftone patterns for printers (Yampolskiy et al. 2004), and so are not the primary focus of this paper. The purpose of GP, from its inception, was to automate programming by evolving an algorithm or a program for solving a particular class of problems, for example an efficient (Yampolskiy 2013a) search algorithm. Software design is the type of application most frequently associated with GP (Rylander, Soule, and Foster 2001), but work in automated programming is also sometimes referred to as “real programming”, “object-oriented GP”, “algorithmic programming”, “program synthesis”, “traditional programming”, “Turing Equivalent (TE) programming” or “Turing-complete GP” (White et al. 2013, Woodward and Bai 2009, Helmuth and Spector 2015).

The sub-field of computation inspired by evolution in general, and the Genetic Programming paradigm established by John Koza in the 1990s in particular, are thriving and growing exponentially, as evidenced both by the number of practitioners and by the number of scientific publications. Petke et al. observe: “…enormous expansion of number of publications with the Genetic Programming Bibliography passing 10,000 entries … By 2016 there were nineteen GP books including several intended for students …” (Petke et al. 2017). Such tremendous growth has been fueled, since the early days, by belief in the capabilities of evolutionary algorithms and in our ability to overcome the obstacles of limited compute or data, as illustrated by the following quotes:

· “We will (before long) be able to run genetic algorithms on computers that are sufficiently fast to recreate on a human timescale the same amount of cumulative optimization power that the relevant processes of natural selection instantiated throughout our evolutionary past … ” (Shulman and Bostrom 2012)

· “As computational devices improve in speed, larger problem spaces can be searched.” (Becker and Gottschlich 2017).

· “Evolution is a slow learner, but the steady increase in computing power, and the fact that the algorithm is inherently suited to parallelization, mean that more and more generations can be executed within practically acceptable timescales.” (Eiben and Smith 2015).

· “We believe that in about fifty years’ time it will be possible to program computers by means of evolution. Not merely possible but indeed prevalent.” (Orlov and Sipper 2011a). “The relentless iteration of Moore’s law promises increased availability of computational resources in future years. If available computer capacity continues to double approximately every 18 months over the next decade or so, a computation requiring 80 h will require only about 1% as much computer time (i.e., about 48 min) a decade from now. That same computation will require only about 0.01% as much computer time (i.e., about 48 seconds) in two decades. Thus, looking forward, we believe that genetic programming can be expected to be increasingly used to automatically generate ever-more complex human-competitive results.” (Koza 2010).

· “The production of human-competitive results as well as the increased intricacy of the results are broadly correlated to increased availability of computing power tracked by Moore’s law. The production of human-competitive results using genetic programming has been greatly facilitated by the fact that genetic algorithms and other methods of evolutionary computation can be readily and efficiently parallelized. … Additionally, the production of human-competitive results using genetic programming has [been] facilitated to an even greater degree by the increased availability of computing power, over a period of time, as tracked by Moore’s law. Indeed, over the past two decades, the number and level of intricacy of the human-competitive results has progressively grown. … there is, nonetheless, data indicating that the production of human-competitive results using genetic programming is broadly correlated with the increased availability of computer power, from year to year, as tracked by Moore’s Law.” (Koza 2010).

· “… powerful test data generation techniques, an abundance of source code publicly available, and importance of nonfunctional properties have combined to create a technical and scientific environment ripe for the exploitation of genetic improvement.” (Becker and Gottschlich 2017).

3. State-of-the-Art with Respect to Predictions

In order to establish the state-of-the-art in evolutionary computation, we examined a number of survey papers (Kannappan et al. 2015, Koza 2010) and seminal results (Clune et al. 2008, Ray 1991, David et al. 2014, Altshuler and Linden 1997, Bird and Layzell 2002, Hauptman and Sipper 2005), looking at reported human-competitive results, as these are meant to represent the greatest accomplishments of the field. While on the surface the results may seem impressive, deeper analysis shows a complete absence of success in evolving non-trivial software from scratch and without human assistance. It is, of course, necessary to be precise about what it is we are trying to measure or detect, so as to avoid disagreements resulting from ambiguity in the terms being used.

It may be difficult to formally specify what makes a piece of software non-trivial, but the intuitively attractive measure of program length, expressed as the number of lines of code, is not a sufficient indicator of complexity, as a long program could have extremely low Kolmogorov complexity (Kolmogorov 1965). Inspired by the Turing Test (Turing 1950, Yampolskiy 2013b), which is based on the inability to distinguish the output of a person from that of a computer, we propose defining non-trivial software as software that would take an average experienced human programmer at least a full hour of effort to produce if given the same problem to solve. If the solution source code could be produced with significantly less effort (e.g., 1 minute), it may not be sufficiently complex and the problem may be deemed trivial for our purposes. Our approach to specifying triviality excludes “Hello World” and most other toy programs/problems from consideration, which is exactly what we want to achieve, as the main benefit of being able to evolve software would come from the ability to replace full-time programmers.

With regards to the other two conditions, they are much easier to specify. “From scratch” means that we are not starting with an existing version of a program (but are happy to rely on existing APIs, subject to the non-triviality of all newly produced code). “Without human assistance” can be interpreted to mean that the programmer is working alone, or that a team of programmers is doing an equivalent amount of work; for example, two programmers would each need at least 40 minutes to solve the problem, which allows for a small communication overhead.

Reading early claims about the capabilities of EAs feels just like reading early predictions from the AI literature (Armstrong and Sotala 2015). Some early success is projected into the future by assuming that the same rate of progress will continue, and it is claimed that complete success is just years away. However, just as with early AI, the claims are inflated, unsupported, overly optimistic, phrased in deceptive and metaphoric language, and the solutions do not scale to real-world problems. Perhaps an EA “winter” is long overdue. Here is how Koza presents the state of the field in 1994: “… in this article, we will present a number of examples from various fields supporting the surprising and counter-intuitive notion that computers can indeed be programmed by means of natural selection. We will show, via examples, that the recently developed genetic programming paradigm provides a way to search the space of all possible programs to find a function which solves, or approximately solves, a problem.” (Koza 1994).

Sixteen years later he reports the results of what he calls an extraordinarily long experiment: “An additional order-of-magnitude increase was achieved by making extraordinarily long runs on the largest machine (the 1,000-node machine). … The length of the run that produced the two patentable inventions was 28.8 days — almost an order-of-magnitude increase (9.3 times) over the overall 3.4-day average for typical runs of genetic programming that our group had been making at the time.” (Koza 2010). One quickly realizes that most improvements in the field simply come from using more compute to search progressively larger parts of the solution space, a result similar to the one expected for a random search algorithm.

Here is an example of overhyped and ambiguous reporting of results from recent work on EAs. Researchers Becker and Gottschlich (Becker and Gottschlich 2017) go from naming their paper “AI Programmer: Autonomously Creating Software Programs Using Genetic Algorithms”, to stating in the abstract that it is “AI Programmer, that can automatically generate full software programs requiring only minimal human guidance”, to claiming that “Using AI Programmer, we were able to generate numerous complete software programs.” Finally, in the experimental results, they state what they actually managed to produce: “A generated program that outputs ‘hello’ ”, or one that performs an addition operation (Becker and Gottschlich 2017). But even that is a bit of hype: “Rather than starting with “Hello World”, we first had AI Programmer create a more simplistic program that simply output “hi.” It was successfully [generated] after 5,700 generations…” (Becker and Gottschlich 2017). Even this trivial one-liner was not a clean success: “The generated program fulfilled its requirement to output the target text, but interestingly included subsequent random characters, which contained parsing errors, including nonmatching brackets.” (Becker and Gottschlich 2017). An identical program, but one printing “I love all humans”, took 6,057,200 generations (Becker and Gottschlich 2017).
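To make the scale of such results concrete, the following minimal sketch performs the same kind of target-string search, but directly over strings rather than over programs, so its generation counts are not comparable to those reported by Becker and Gottschlich. It is included only to show how little is being asked of the search: the fitness function already contains the answer, character by character.

```python
import random
import string

TARGET = "I love all humans"
ALPHABET = string.ascii_letters + " "

def fitness(candidate):
    # Number of character positions that already match the target text.
    return sum(c == t for c, t in zip(candidate, TARGET))

def evolve_string(pop_size=100, mutation_rate=0.05):
    pop = ["".join(random.choice(ALPHABET) for _ in TARGET) for _ in range(pop_size)]
    generation = 0
    while True:
        generation += 1
        best = max(pop, key=fitness)
        if best == TARGET:
            return generation
        # Keep the best string and refill the population with its mutated copies.
        def mutate(s):
            return "".join(random.choice(ALPHABET) if random.random() < mutation_rate else c
                           for c in s)
        pop = [best] + [mutate(best) for _ in range(pop_size - 1)]

if __name__ == "__main__":
    print("target matched after", evolve_string(), "generations")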

Perhaps it is unfair to pick on this particular paper, which is only available as an unreviewed pre-print, but we selected it because it is highly representative of the type of work frequently published in GP, and its extremeness makes the problems easy to identify. If its title were “Brute Forcing Strings” it would be a reasonable work on that subject, but like so many others, the authors claim to be “Autonomously Creating Software Programs” using evolutionary computation, a claim which is never substantiated in any published literature on this subject. Here are similar results from the 2015 “General Program Synthesis Benchmark Suite” by Helmuth and Spector: “Of the seven problems on which PushGP found no generalizing solution, most are not surprising in that they involve extensive use of multiple programming constructs, the linking of many distinct steps, or a deceptive fitness space where fitness improvements do not lead toward perfect programs. We have written solutions to each of the unsolved problems by hand …” (Helmuth and Spector 2015). They go on to conclude: “While some problems can be solved with programs containing fewer than 10 instructions, none would likely be found using brute force search over our instruction sets within the number of program evaluations allowed here. Searching over size 5 programs using the Number IO instruction set would require evaluating over 7 billion programs, much more than the 5 million we used in our GP runs. Other problems have smallest known solutions of over 20 instructions using instruction sets with more than 100 instructions, to our knowledge beyond the reach of all other program synthesis systems.” (Helmuth and Spector 2015).

Others have expressed similar skepticism:

· “We examine what has been achieved in the literature, and find a worrying trend that largely small toy-problems [have] been attempted which require only a few line[s] of code to solve by hand.” (Woodward and Bai 2009).

· “A full understanding of open-ended evolutionary dynamics remains elusive” (Soros and Stanley 2014).

· “A literature review has revealed that only a small fraction of the papers on GP deal with evolving TE computer programs, with the ability to iterate and utilize memory, while the majority of papers deal with evolving less expressive logical or arithmetic functions.” (Woodward and Bai 2009). “We conclude that GP in its current form is fundamentally flawed, when applied to the space of TE programs.” (Woodward and Bai 2009). “Computer code is not as robust as genetic code, and therefore poorly suited to the process of evolution, resulting in an insurmountable landscape which cannot be navigated effectively with current syntax based genetic operators. Crossover, has problems being adopted in a computational setting, primarily due to a lack of context of exchanged code. A review of the literature reveals that evolved programs contain at most two nested loops, indicating that [there is] a glass ceiling to what can currently be accomplished.” (Woodward and Bai 2009).

· “There are many problems that traditional Genetic Programming (GP) cannot solve, due to the theoretical limitations of its paradigm. A Turing machine (TM) is a theoretical abstraction that express[es] the extent of the computational power of algorithms. Any system that is Turing complete is sufficiently powerful to recognize all possible algorithms. GP is not Turing complete.” (Teller 1994).

Even a survey of the GP community itself produced the following feedback regarding the problems currently being worked on:

• “Far too many papers include results only on simple toy problems which are often worse than meaningless: they can be misleading”;

• “(we should exclude) irrelevant problems that are at least 20 years old”;

• “get rid of some outdated, too easy benchmarks”;

• “the standard ‘easy’ Koza set should not be included”

• “[it is] time to move on”. (White et al. 2013).

In practice, GP is used in the same way as GAs: for optimization of solutions to particular problems, for function optimization (Woodward, Johnson, and Brownlee 2016, Poli et al. 2008, Woodward 2003, Teller 1994, Woodward and Bai 2009, White et al. 2013), or for software improvement (Orlov and Sipper 2011b).

With regard to Darwin’s hypothetical predictions and falsifiability conditions raised in the introduction, we can state the following:

Prediction 1. Simulations of evolution will produce statistically similar results, at least with respect to the complexity of artifacts produced. Status: False as of 2019.

Prediction 2. Since running evolutionary algorithms for longer would continue to produce non-trivial outputs, scientists would run them indefinitely. Status: False as of 2019.

Falsifiability condition 1. Representative simulations of evolution will not produce results similar to those observed in nature. Status: True as of 2019.

Falsifiability condition 2. Researchers will not be able to evolve software or other complex or novel artifacts. Status: True as of 2019.

Falsifiability condition 3. There will not be any projects running evolutionary algorithms long-term, because their outputs would quickly stop improving and stabilize. Status: True as of 2019.

Prediction 3. Educated people would know of the longest-running evolutionary algorithm and of the most complex evolved algorithm. Status: False and False as of 2019.

Prediction 4. Even schoolchildren would know of the most complex digital organism ever evolved. Status: False as of 2019.

Looking at these outcomes, we observe that all predictions are false as of 2019 and all falsifiability conditions are true as of the same year, as long as we look only at non-trivial products of evolutionary computation. We are not evolving complex artifacts, we are not running evolutionary algorithms for as long as possible, we are not evolving software, and the public is unaware of the most complex products of evolutionary computation. On close examination, all “human-competitive” results turn out to be just optimizations, never fully autonomous programming leading to novel software being engineered.

4. Possible Explanations

A number of possible explanations for why we do not evolve software could be considered. We have tried to be as comprehensive as possible in this section, but it is possible that we have missed some plausible explanations.

· Incompetent programmers

It is theoretically possible, but highly unlikely, that out of the thousands of scientists working on evolutionary computation, all have failed to correctly implement the Darwinian algorithm.

· Non-representative algorithms

Some (Teller 1994) have suggested that evolutionary algorithms do not accurately capture the theory of evolution, but of course that would imply that the theory itself is not specified in sufficient detail to make falsifiable predictions. If, on the other hand, such more detailed specifications are available to GP believers, it is up to them to implement them as computer simulations for testing purposes; no known attempt of this kind has succeeded in evolving software.

· Inadequate fitness functions

A fitness function for a complex software product is difficult to specify and may be as complex as (or even more complex than) the software we want to evolve, as it has to consider all possible use cases and pass all unit tests (see the first sketch after this list). This may be the Achilles heel of GP, but it is equally an objection to the feasibility of programming in general, not just of GP, as both have to convert a software specification into source code. If human programmers and biological evolution succeed under such constraints, so should Darwinian simulations.

· The Halting problem

Turing proved (Turing 1936) that it is impossible to determine whether an arbitrary program halts, but this is also a problem for human programmers, and it can be easily addressed by placing time limits on the evaluation of candidate solutions (see the second sketch after this list).

· Program correctness

If we required evolved software to be provably correct, this would present a problem, as GP does not verify produced designs but only tests them against specific unit tests. Likewise, we cannot rely on automated software verification, as it is still an unsolved problem (Yampolskiy 2017) in the general case. In practice this is not a serious obstacle, as the majority of human-written software is never proven to be correct, and only a small portion of the software engineering process relies on formal specification and Test-Driven Development.

· Inappropriate solutions

The literature on EAs is full of examples (Lehman et al. 2018) of the surprising creativity of the Darwinian algorithm, resulting in solutions which match the letter of the design specification but not its spirit. This is similar to human-produced software and the numerous ways in which such software fails the goals of its initial design (Yampolskiy and Spellchecker 2016).

· Insufficient complexity of the environment (not enough data, poor fitness functions)

It is possible that the simulated environment is not complex enough to generate high-complexity outputs in evolutionary simulations. This does not seem correct, as the Internet presents a highly complex landscape in which many self-modifying computer viruses roam (Nachenberg 1997). Likewise, virtual worlds such as Second Life and many others present close approximations to the real world and are certainly more complex than early Earth was. “A skeptic might insist that an abstract environment would be inadequate for the evolution …, believing instead that the virtual environment would need to closely resemble the actual biological environment in which our ancestors evolved. Creating a physically realistic virtual world would require a far greater investment of computational resources than the simulation of a simple toy world or abstract problem domain (whereas evolution had access to a physically realistic real world “for free”). In the limiting case, if complete microphysical accuracy were insisted upon, the computational requirements would balloon to utterly infeasible proportions.” (Shulman and Bostrom 2012). Requiring more realistic environmental conditions may result in an increase in necessary computational resources, a problem addressed in the next bullet. Eiben and Smith suggest that evolvable hardware may provide complexity which is currently missing from software-only emulations: “The use of real hardware overcomes the principal deficiency of software models, which lack the richness of matter that is a source of challenges and opportunities not yet matched in artificial algorithms.” (Eiben and Smith 2015).

· Insufficient resources (Compute, Memory)

From the history of computer science we know of many situations (e.g., speech recognition, neural network training) where we had a correct algorithm but insufficient computational resources to run it to success. It is possible that we simply do not have hardware powerful enough to emulate evolution. We address this possibility in Section 5.

· Software design is not amenable to evolutionary methods

The space of software designs may be discrete, with no continuous path of incremental fitness improvements leading to the desired solutions. This is possible, but it would imply that the original goals of GP are unattainable and misguided. In addition, since a clear mapping exists between programs as solutions to problems and animals as solutions to environmental challenges, it would also imply that the current explanation for the origin of species is incorrect (Yampolskiy 2016a).

· Darwinian algorithm is incomplete or wrong

Lastly, we have to consider the possibility that the inspiration behind evolutionary computation, the Darwinian algorithm itself, is wrong or at least partially incomplete. If that were true, computer simulations of such an algorithm would fail to produce results comparable to the observations we make in nature, and a search for an alternative algorithm would need to take place. This would be an extraordinary claim and would require that we first discard all the other possible explanations on this list.
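Regarding the fitness-function bullet above, a common GP practice is to score candidates by the fraction of unit tests they pass. The sketch below, with hypothetical helper names of our own, illustrates both the approach and its weakness: the fitness function is only as complete as the test suite behind it.

```python
def fitness_from_tests(candidate_fn, test_cases):
    # Fraction of unit tests the candidate passes; 1.0 means every test passes.
    # Such a fitness function is only as complete as the test suite itself.
    passed = 0
    for args, expected in test_cases:
        try:
            if candidate_fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # A crashing candidate simply fails this test.
    return passed / len(test_cases)

# A tiny "specification" for an addition routine, expressed as input/output pairs.
ADD_TESTS = [((1, 2), 3), ((0, 0), 0), ((-5, 5), 0), ((10, 7), 17)]

print(fitness_from_tests(lambda a, b: a + b, ADD_TESTS))   # 1.0
print(fitness_from_tests(lambda a, b: a * b, ADD_TESTS))   # 0.25: partial credit only
```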
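Regarding the halting-problem bullet above, the standard workaround is to evaluate every candidate under a hard time limit and assign the worst fitness to candidates that do not finish. The following is a rough sketch of that idea using a separate process; the helper names are our own, and production GP systems more commonly bound the number of interpreter steps instead.

```python
import multiprocessing as mp

def _worker(candidate, args, queue):
    queue.put(candidate(*args))

def evaluate_with_timeout(candidate, args, time_limit=1.0, penalty=float("-inf")):
    # Run the candidate in a separate process; if it is still running after the
    # time limit, terminate it and hand back the worst possible fitness,
    # sidestepping the need to decide whether it would ever halt.
    queue = mp.Queue()
    proc = mp.Process(target=_worker, args=(candidate, args, queue))
    proc.start()
    proc.join(time_limit)
    if proc.is_alive():
        proc.terminate()
        proc.join()
        return penalty
    return queue.get() if not queue.empty() else penalty

def looping_candidate(x):
    while True:          # a candidate that never halts
        pass

def honest_candidate(x):
    return x + 1

if __name__ == "__main__":
    print(evaluate_with_timeout(honest_candidate, (41,)))   # 42
    print(evaluate_with_timeout(looping_candidate, (41,)))  # -inf after ~1 second
```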

Perhaps we can learn some answers from similar historical conundrums. The earliest work on artificial neurons was done in 1943 by McCulloch and Pitts (McCulloch and Pitts 1943), and while research on Artificial Neural Networks (ANN) continued (LeCun, Bengio, and Hinton 2015), until 2010 it would have been very logical to ask: “Why don’t artificial neural networks perform as well as natural ones?” Today, deep neural networks frequently outperform their human counterparts (Yampolskiy 2015, Yampolskiy 2018), but it is still helpful to consider how that question about NNs was resolved. Stuhlmüller succinctly summarizes the answer given by Ghahramani: “Why does deep learning work now, but not 20 years ago, even though many of the core ideas were there? In one sentence: We have more data, more compute, better software engineering, and a few algorithmic innovations …” (Stuhlmüller 2016). Consequently, the next section looks at this very likely explanation in detail.

5. Computational Complexity of Biological Evolution and Available Compute

In the biological world, evolution is a very time-consuming process, with estimates for the appearance of early life pointing to some 4 billion years ago and each new generation taking minutes for simple life forms like bacteria and about 20 years for more complex species like Homo sapiens. Given the timescales involved, it is impossible to replicate full-scale evolution in experimental settings, but it may be possible to do so in computer simulations, by generating new offspring in a matter of milliseconds and by greatly expediting the necessary fitness evaluation time, potentially reducing a multi-billion-year natural process to just a few years of simulation on a powerful supercomputer. “Using a non-biochemical substrate for such research is becoming technologically ever more feasible, and it increases the generalizability of the findings. In particular, using a different medium for evolutionary studies can separate generic principles and ground truth from effects that are specific for carbon-based life as we know it.” (Eiben and Smith 2015). Others have wondered about the same algorithm: “What algorithm could create all this in just 10¹² steps? The number 10¹² — one trillion — comes up because this is believed to be the number of generations since the dawn of life 3.5 ∙ 10⁹ years ago (notice that most of our ancestors could not have lived for much more than a day).” (Livnat and Papadimitriou 2016).
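A quick consistency check of the quoted 10¹² figure, under the quote's own assumption that a typical ancestral generation lasted on the order of a day:

```python
# Rough consistency check of the "one trillion generations" figure, assuming,
# as the quote does, roughly one generation per day for most of our ancestors.
years_since_dawn_of_life = 3.5e9
generations = years_since_dawn_of_life * 365.25   # one generation per day
print(f"{generations:.2e}")                       # ~1.28e12, i.e. on the order of 10**12
```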

Hamiltonian complexity (Osborne 2012) studies how hard it is to simulate a physical system, where “hard” means that the computational resources required to approximate the behavior of the system grow too quickly with the size of the system being simulated, so that no computer can carry out the task in reasonable time (Osborne 2012). Specifically, in the context of evolutionary algorithms, the research effort to establish bounds and improve efficiency is known as Evolutionary Algorithm Theory (EAT) (Jansen and Zarges 2011). In this section, we attempt to estimate the computational power of evolution in the biosphere, analyze the computational complexity of bio-inspired evolutionary algorithms, and finally compare our findings to the available and anticipated computational resources, all in the hope of understanding whether it is possible to replicate evolution on a computer in practice.

Similar attempts have been made by others; for example, Shulman and Bostrom wanted to figure out the computational requirements necessary to evolve Artificial Intelligence: “The argument from evolutionary algorithms then needs one additional premise to deliver the conclusion that engineers will soon be able to create machine intelligence, namely that we will soon have computing power sufficient to recapitulate the relevant evolutionary processes that produced human intelligence. Whether this is plausible depends both on what advances one might expect in computing technology over the next decades and on how much computing power would be required to run genetic algorithms with the same optimization power as the evolutionary process of natural selection that lies in our past. One might for example try to estimate how many doublings in computational performance, along the lines of Moore’s law, one would need in order to duplicate the relevant evolutionary processes on computers.” (Shulman and Bostrom 2012).

By looking at the total number of generations, population sizes, DNA storage (Beck and Yampolskiy 2015, Beck et al. 2014, Beck, Rouchka, and Yampolskiy 2012, Beck and Yampolskiy 2012) and computation, and the neural information processing involved, it is possible to arrive at broad estimates of the computational power behind biological evolution. “In this way, the biosphere can be visualised as a large, parallel supercomputer, with the information storage represented by the total amount of DNA and the processing power symbolised by transcription rates. In analogy with the Internet, all organisms on Earth are individual containers of information connected through interactions and biogeochemical cycles in a large, global, bottom-up network.” (Landenmark, Forgan, and Cockell 2015). “We have various methods available to begin to estimate the power of evolutionary search on Earth: estimating the number of generations and population sizes available to human evolution, creating mathematical models of evolutionary “speed limits” under various conditions, and using genomics to measure past rates of evolutionary change.” (Shulman and Bostrom 2012).

· With respect to the estimates of the storage capabilities of the biosphere we have: “The total amount of DNA contained in all of the cells on Earth is estimated to be about 5.3 × 10³⁷ base pairs (Landenmark, Forgan, and Cockell 2015), equivalent to 1.325 × 10³⁷ bytes of information.” (Gillings, Hilbert, and Kemp 2016).

· “Modern whole-organism genome analysis, in combination with biomass estimates, allows us to estimate a lower bound on the total information content in the biosphere: 5.3 × 10³¹ (±3.6 × 10³¹) megabases (Mb) of DNA. Given conservative estimates regarding DNA transcription rates, this information content suggests biosphere processing speeds exceeding yottaNOPS values (10²⁴ Nucleotide Operations Per Second).” (Landenmark, Forgan, and Cockell 2015).

· “Finding the amount of DNA in the biosphere enables an estimate of the computational speed of the biosphere, in terms of the number of bases transcribed per second, or Nucleotide Operations Per Second (NOPS), analogous to the Floating-point Operations Per Second (FLOPS) metric used in electronic computing. A typical speed of DNA transcription is 18–42 bases per second for RNA polymerase II to travel along chromatin templates … and elsewhere suggested as 100 bases per second …. Precisely how much of the DNA on Earth is being transcribed at any one time is unknown. The percentage of any given genome being transcribed at any given time depends on the reproductive and physiological state of organisms, and at the current time we cannot reliably estimate this for all life on Earth. If all the DNA in the biosphere was being transcribed at these reported rates, taking an estimated transcription rate of 30 bases per second, then the potential computational power of the biosphere would be approximately 10¹⁵ yottaNOPS (yotta = 10²⁴), about 10²² times more processing power than the Tianhe-2 supercomputer …, which has a processing power on the order of 10⁵ teraFLOPS (tera = 10¹²).” (Landenmark, Forgan, and Cockell 2015).

· To estimate neural information processing of nature we need to look at the processing power of all neurons in the biosphere: “There are some 4–6 × 10³⁰ prokaryotes in the world today, but only 10¹⁹ insects, and fewer than 10¹⁰ human (pre-agricultural populations were orders of magnitude smaller). However, evolutionary algorithms require not only variations to select among but a fitness function to evaluate variants, typically the most computationally expensive component. A fitness function for the evolution of artificial intelligence plausibly requires simulation of “brain development,” learning, and cognition to evaluate fitness. We might thus do better not to look at the raw number of organisms with complex nervous systems, but instead to attend to the number of neurons in biological organisms that we might simulate to mimic evolution’s fitness function. We can make a crude estimate of that latter quantity by considering insects, which dominate terrestrial biomass, with ants alone estimated to contribute some 15–20% of terrestrial animal biomass. Insect brain size varies substantially, with large and social insects enjoying larger brains; e.g., a honeybee brain has just under 10⁶ neurons, while a fruit fly brain has 10⁵ neurons, and ants lie in between with 250,000 neurons. The majority of smaller insects may have brains of only a few thousand neurons. Erring on the side of conservatively high, if we assigned all 10¹⁹ insects fruit-fly numbers of neurons the total would be 10²⁴ insect neurons in the world. This could be augmented with an additional order of magnitude, to reflect aquatic copepods, birds, reptiles, mammals, etc., to reach 10²⁵. (By contrast, in pre-agricultural times there were fewer than 10⁷ humans, with under 10¹¹ neurons each, fewer than 10¹⁸ total, although humans have a high number of synapses per neuron.) The computational cost of simulating one neuron depends on the level of detail that one wants to include in the simulation. Extremely simple neuron models use about 1,000 floating-point operations per second (FLOPS) to simulate one neuron (for one second of simulated time); an electrophysiologically realistic Hodgkin-Huxley model uses 1,200,000 FLOPS; a more detailed multicompartmental model would add another 3–4 orders of magnitude, while higher-level models that abstract systems of neurons could subtract 2–3 orders of magnitude from the simple models. If we were to simulate 10²⁵ neurons over a billion years of evolution (longer than the existence of nervous systems as we know them) in a year’s run time these figures would give us a range of 10³¹–10⁴⁴ FLOPS.” (Shulman and Bostrom 2012).
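The arithmetic behind the last two quoted estimates can be reproduced directly from the stated assumptions. The neuron-simulation check below uses only the simple-neuron figure and therefore lands inside, rather than at the ends of, the quoted 10³¹–10⁴⁴ FLOPS range.

```python
# Biosphere "processing speed" from the quoted assumptions.
total_bases = 5.3e31 * 1e6        # 5.3e31 megabases of DNA, expressed in bases
transcription_rate = 30           # assumed bases transcribed per second
biosphere_nops = total_bases * transcription_rate
print(f"{biosphere_nops / 1e24:.1e} yottaNOPS")            # ~1.6e15, i.e. ~10**15 yottaNOPS

tianhe2_flops = 1e5 * 1e12        # ~10**5 teraFLOPS
print(f"{biosphere_nops / tianhe2_flops:.1e} x Tianhe-2")  # ~1.6e22, i.e. ~10**22 times

# Neuron-simulation estimate using the simple neuron model only.
neurons = 1e25                    # insect-dominated estimate from the quote
flops_per_neuron = 1e3            # "extremely simple neuron models"
speedup = 1e9                     # compress ~1e9 years of evolution into ~1 year of run time
print(f"{neurons * flops_per_neuron * speedup:.0e} FLOPS")  # 1e37, inside the 1e31-1e44 range
```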

As the Darwinian algorithm is inherently probabilistic, it is likely that many runs of the algorithm are required for just one of them to succeed, just as in the case of biological evolution (Lenski et al. 1991). The number of such simultaneous runs can be estimated as the total size of the search space divided by the average computational resources of each individual run. In the special case of biological evolution evolving intelligent beings, “The observation selection effect is that no matter how hard it is for human-level intelligence to evolve, 100% of evolved civilizations will find themselves originating from planets where it happened anyway. … every newly evolved civilization will find that evolution managed to produce its ancestors.” (Shulman and Bostrom 2012). So even a successful evolutionary run with fixed computational resources does not indicate that the same compute would be sufficient in a similar experiment, as subsequent runs may not produce similar results. As Shulman and Bostrom put it, “However, reliable creation of human-level intelligence through evolution might require trials on many planets in parallel, with Earth being one of the lucky few to succeed.” (Shulman and Bostrom 2012). Conceivably, “Evolution requires extraordinary luck to hit upon a design for human-level intelligence, so that only 1 in 10¹⁰⁰⁰ planets with life does so.” (Shulman and Bostrom 2012). Hanson elaborates, “Many have recognized that the recent appearance of intelligent life on Earth need not suggest a large chance that similarly intelligent life appears after a similar duration on any planet like Earth. Since Earth’s one data point has been subject to a selection effect, it is consistent with any expected time for high intelligence to arise beyond about a billion years. Few seem to have recognized, however, that this same selection effect also allows the origin of life to be much harder than life’s early appearance on Earth might suggest.” (Hanson 1998).

EAT attempts to estimate computational requirements theoretically necessary to run different variants of the Darwinian algorithm. Such estimates are usually made with respect to the size of the input problem, which is difficult to formalize with respect to software generation. “It is difficult to characterize the complexity of a problem specific to a method of programming. Holding all things constant, you measure what must change as the size of the input instance increases. It is even more difficult to describe the complexity of a problem that can be solved by a program that is itself the output of a program, as is the case with the typical GP. In general, this type of question cannot be answered. What can be done however, is to compare the information content of a program with the information content of its output and in this way provide a bound on the complexity of that output.” (Rylander, Soule, and Foster 2001).

Specifically, “Though it is impossible to classify the complexity of a problem that can be solved by the output program in advance, it is possible to relate the amount of information contained in the output program to the GP itself. By applying the theorems from Kolmogorov complexity, it can be shown that the complexity of the output program of a GP using a pseudo random number generator (PRNG) can be bound above by the GP itself.

Theorem 3: For all strings x,y, if x is the shortest program that outputs y, that is K(y)=|x|, then K(x) ≥ K(y) + c.

Proof: Let x be the shortest program (by definition, incompressible) that outputs y. That is, K(y) = |x|. Suppose K(y) > K(x). By substitution, |x| > K(x), which is impossible since x was defined as incompressible.” (Rylander, Soule, and Foster 2001).

Next, we attempt to include the best estimates for the computational complexity of the Darwinian algorithm found in the literature. “The performance of an EA is measured by means of the number of function evaluations T it makes until an optimal solution is found for the first time. The reason is that evolutionary algorithms tend to be algorithmically simple and each step can be carried out relatively quick. Thus, a function evaluation is assumed to be the most costly operation in terms of computation time. Most often, results about the expected optimization time E(T) as a function of n are derived where n is a measure for the size of the search space. If a fixed-length binary encoding is used n denotes the length of the bit strings (and the size of the search space equals 2ⁿ).” (Jansen and Zarges 2011). “[W]ith random mutations, random point mutations, we will get to fitness BB(N) in time exponential in N (evolution by exhaustive search)” (Chaitin 2013). Here the Busy Beaver function BB(N) is the largest integer that can be named by an N-bit program.
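As a concrete example of how such measurements are made, the sketch below runs a standard (1+1) EA on the OneMax toy problem and counts the fitness evaluations T until the optimum is first found; for this particular problem the expected optimization time is known to grow on the order of n log n.

```python
import random

def one_plus_one_ea(n=100, mutation_rate=None):
    # (1+1) EA on OneMax: flip each bit independently, keep the offspring if it is
    # no worse, and count fitness evaluations T until the all-ones optimum is found.
    p = mutation_rate or 1.0 / n
    x = [random.randint(0, 1) for _ in range(n)]
    fx = sum(x)
    evaluations = 1
    while fx < n:
        y = [b ^ 1 if random.random() < p else b for b in x]
        fy = sum(y)
        evaluations += 1
        if fy >= fx:
            x, fx = y, fy
    return evaluations

if __name__ == "__main__":
    runs = [one_plus_one_ea(100) for _ in range(10)]
    print("mean T over 10 runs:", sum(runs) / len(runs))  # grows roughly like n * log(n)
```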

Fitness function evaluation is the most costly procedure in the Darwinian algorithm and is particularly ill-defined in the case of software evaluation. How does one formalize a fitness function for something like an operating system without having to include human users as evaluators? One may be required to rely on Human-Based Genetic Algorithms (HBGA) (Kosorukoff 2001), which would greatly increase the time necessary to evaluate every generation and, by extension, the overall simulation time of the run, making it impossible to recapitulate evolution through EAs.

“Essentially, the complexity of an optimization problem for a GA is bound above by the growth rate of the smallest representation [Minimum Chromosome Length — (MCL)] that can be used to solve the problem … . This is because the probabilistic convergence time will remain fixed as a function of the search space. All things held constant, the convergence time will grow as the search space grows.” (Rylander, Soule, and Foster 2001). “This means that the size of the search space doubles for every increase in instance size because the number of possible solutions is equivalent to the number 2 raised to the length of the chromosome, 2ˡ.” (Rylander, Soule, and Foster 2001). “By creating a UGP [Universal Genetic Program], we have a single vehicle capable of evolving any program evolvable by a GP. To do this, we treat the first part of the data for the UGP as the specification (i.e. the “target” function) for a unique GP. In this way, we can implement any GP. This does not eliminate the Kolmogorov complexity bound, rather it determines the hidden constant in the Kolmogorov complexity bound.” (Rylander, Soule, and Foster 2001). “Because the output complexity includes all individuals from all populations, producing more individuals through larger populations or longer runs must eventually stop producing new solutions because these solutions would necessarily increase the output complexity beyond the finite limit imposed by the GP.” (Rylander, Soule, and Foster 2001). Others have attempted to calculate total “Computational requirements for recapitulating evolution through genetic algorithms” (Shulman and Bostrom 2012).

Given the estimates of the computational power of biological evolution in the wild and the theoretical analysis of the computational resources necessary to run a Darwinian algorithm, we will now try to see if matching compute is currently available and, if not, how soon it is predicted to become available. Currently, the world’s top 10 supercomputers[1] range from 10 to 125 peta (10¹⁵) floating point operations per second (FLOPS) of theoretical peak performance. For comparison, the Bitcoin network[2] currently performs around 35 exa (10¹⁸) hashes per second, many thousands of times the combined speed of the top 500 supercomputers. Similarly, “Storing the total amount of information encoded in DNA in the biosphere, 5.3 × 10³¹ megabases (Mb), would require approximately 10²¹ supercomputers with the average storage capacity of the world’s four most powerful supercomputers.” (Landenmark, Forgan, and Cockell 2015). “In recent years it has taken approximately 6.7 years for commodity computers to increase in power by one order of magnitude. Even a century of continued Moore’s law would not be enough to close this gap. Running more or specialized hardware, or longer runtimes, could contribute only a few more orders of magnitude.” (Shulman and Bostrom 2012).
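Combining the numbers quoted above (peak supercomputer performance of roughly 125 petaFLOPS, the 10³¹–10⁴⁴ FLOPS neuron-simulation range, and one order of magnitude of hardware improvement every 6.7 years) gives a rough sense of the gap:

```python
import math

available_flops = 125e15                  # ~125 petaFLOPS: top supercomputer peak
required_low, required_high = 1e31, 1e44  # quoted neuron-simulation FLOPS range
print(f"gap: {math.log10(required_low / available_flops):.0f}"
      f" to {math.log10(required_high / available_flops):.0f} orders of magnitude")  # ~14 to ~27

years_per_order_of_magnitude = 6.7        # quoted historical rate of hardware improvement
print(f"a century of such progress closes ~{100 / years_per_order_of_magnitude:.0f} orders")  # ~15
```

Under these assumptions, a century of continued improvement would barely reach the most optimistic end of the estimated range, consistent with the quotes above.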

In this section, we looked at the estimated computational power of biological evolution and the theoretical computational complexity of the Darwinian algorithm. In both cases, we found that the required computational resources greatly exceed what is currently available and what is projected to be available in the near future. In fact, depending on the assumptions we make with regard to the multiverse (De Simone et al. 2010), quantum aspects of biology (Lambert et al. 2013) and the probabilistic nature of the Darwinian algorithm, such compute may never be available. Mahoney arrives at a similar realization: “The biosphere has on the order of 10³¹ cells (mostly bacteria) … with 10⁶ DNA base pairs each, encoding 10³⁷ bits of memory. Cells replicate on the order of 10⁶ seconds, for a total of 10⁴⁸ copy operations over the last 3 billion years. If we include RNA transcription and protein synthesis as computing operations, then the evolution of humans required closer to 10⁵⁰ operations. By contrast, global computing power is closer to 10²⁰ operations per second and 10²² bits of storage. If we were to naively assume that Moore’s Law were to continue increasing computing power by a factor of 10 every 5 years, then we would have until about 2080 before we have something this powerful.” (Mahoney 2015). Others agree, “The computing resources to match historical numbers of neurons in straightforward simulation of biological evolution on Earth are severely out of reach, even if Moore’s law continues for a century. The argument from evolutionary algorithms depends crucially on the magnitude of efficiency gains from clever search, with perhaps as many as thirty orders of magnitude required.” (Shulman and Bostrom 2012). If “… one would have to simulate evolution on vast numbers of planets to reliably produce intelligence through evolutionary methods, then computational requirements could turn out to be many, many orders of magnitude higher still …” (Shulman and Bostrom 2012). It is hoped by some that future developments in Quantum Evolutionary Computation (Han and Kim 2002) will help to overcome some of the resource limitations (Sofge 2006) without introducing negative side-effects (Majot and Yampolskiy 2015).

6. Conclusions

Our analysis of the relevant literature shows that no one has succeeded at evolving non-trivial software from scratch; in other words, the Darwinian algorithm works in theory but does not work in practice when applied to the domain of software production. The reason we do not evolve software is that the space of working programs is very large and discrete. While hill-climbing-heuristic-based evolutionary computations are excellent at solving many optimization problems, they fail in domains with non-continuous fitness (Mishra et al. 2015). This is also the reason we do not evolve complex alife or novel engineering designs. With respect to our two predictions, we can conclude that 1) simulations of evolution do not produce comparably complex artifacts, and 2) running evolutionary algorithms longer leads to progressively diminishing results. With respect to the three falsifiability conditions, we observe that all three are true as of this writing. Likewise, neither the longest-running evolutionary algorithm, nor the most complex evolved algorithm, nor the most complex digital organism is a part of our common cultural knowledge. This is not an unrealistic expectation, as successful software programs like Deep Blue (Hsu 2004) or AlphaGo (Silver et al. 2016, Silver et al. 2017) are well known to the public.

Others have come to similar conclusions: “It seems reasonable to assume that the number of programs possible in a given language is so inconceivably large that genetic improvement could surely not hope to find solutions in the ‘genetic material’ of the existing program. The test input space is also, in the words of Dijkstra, “so fantastically high” that surely sampling inputs could never be sufficient to capture static truths about computation.” (Petke et al. 2017). “… computing science is — and will always be — concerned with the interplay between mechanized and human symbol manipulation, usually referred to as ‘computing’ and ‘programming’ respectively. An immediate benefit of this insight is that it reveals “automatic programming” as a contradiction in terms.” (Dijkstra 1989). More specifically, “Genetic algorithms do not scale well with complexity. That is, where the number of elements which are exposed to mutation is large there is often an exponential increase in search space size. This makes it extremely difficult to use the technique on problems such as designing an engine, a house or plane. In order to make such problems tractable to evolutionary search, they must be broken down into the simplest representation possible. Hence we typically see evolutionary algorithms encoding designs for fan blades instead of engines, building shapes instead of detailed construction plans, and airfoils instead of whole aircraft designs. The second problem of complexity is the issue of how to protect parts that have evolved to represent good solutions from further destructive mutation, particularly when their fitness assessment requires them to combine well with other parts.” (Mewada, Sharma, and Gautam 2018).

Even Koza himself acknowledges that it would be highly surprising if his approach could work: “Anyone who has ever written and debugged a computer program and has experienced their brittle, highly non-linear, and perversely unforgiving nature will probably be understandably skeptical about the proposition that the biologically motivated process sketched above could possibly produce a useful computer program.” (Koza 1994). We challenge the EA community to prove us wrong by producing an experiment which evolves non-trivial code from scratch and without human help. That would be the only way in which our findings could be shown to be incorrect.

On the positive side, the fact that it seems impossible to evolve complex code implies that we are unlikely to be able to directly evolve highly sophisticated artificially intelligent agents, which may present a significant risk to our safety and security (Sotala and Yampolskiy 2015, Babcock, Kramár, and Yampolskiy 2016, Pistono and Yampolskiy 2016, Majot and Yampolskiy May 23–24, 2014, Ramamoorthy and Yampolskiy 2017, Yampolskiy 2016b, Yampolskiy and Fox 2013). Just imagine what would have happened if the very first time we ran a simulation of evolution on a computer it had produced a superintelligent agent. Yampolskiy has shown that Programming as a problem is AI-Complete (Yampolskiy April 21–22, 2012); if GP could solve Programming, that would imply that GP = AGI (Artificial General Intelligence), but we see no experimental evidence for such a claim. In fact, it is more likely that once we have AGI, it could be used to create an intelligent fitness function for GP and so make it possible to evolve code. Genetic Programming will not be the cause of Artificial Intelligence, but a product of it.

Conversely, neuroevolution methods for optimizing neural networks (weights, parameters and deep learning architectures) show significant success in a number of domains and remain a strong contender for the creation of AGI (Stanley et al. 2019). Karpathy suggests that neural networks represent the future of software engineering, as they allow the substitution of difficult-to-write source code with automated optimization driven by an evaluation of program behavior (Karpathy November 11, 2017): “The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer identifies a specific point in program space with some desirable behavior.” … “In contrast, Software 2.0 can be written in much more abstract, human unfriendly language, such as the weights of a neural network. No human is involved in writing this code because there are a lot of weights (typical networks might have millions) … .” “Instead, our approach is to specify some goal on the behavior of a desirable program (e.g., “satisfy a dataset of input output pairs of examples”, or “win a game of Go”), write a rough skeleton of the code (e.g. a neural net architecture), that identifies a subset of program space to search, and use the computational resources at our disposal to search this space for a program that works. In the specific case of neural networks, we restrict the search to a continuous subset of the program space where the search process can be made (somewhat surprisingly) efficient with backpropagation and stochastic gradient descent.” … “It turns out that a large portion of real-world problems have the property that it is significantly easier to collect the data (or more generally, identify a desirable behavior) than to explicitly write the program.” … “Software 1.0 is code we write. Software 2.0 is code written by the optimization based on an evaluation criterion (such as “classify this training data correctly”). It is likely that any setting where the program is not obvious but one can repeatedly evaluate the performance of it (e.g. — did you classify some images correctly? do you win games of Go?) will be subject to this transition, because the optimization can find much better code than what a human can write.” (Karpathy November 11, 2017).
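A minimal sketch of the contrast Karpathy describes, with explicit code on one side and a tiny neural network whose weights are found by gradient descent from input/output examples on the other; the network size, learning rate and iteration count are arbitrary choices for this toy.

```python
import numpy as np

# "Software 1.0": the behaviour is written down explicitly.
def xor_1_0(a, b):
    return a ^ b

# "Software 2.0": the behaviour is specified by a dataset of input/output pairs plus a
# rough skeleton (a tiny neural network); gradient descent fills in the weights.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

learning_rate = 1.0
for _ in range(10000):
    hidden = sigmoid(X @ W1 + b1)                 # forward pass
    out = sigmoid(hidden @ W2 + b2)
    d_out = (out - y) * out * (1 - out)           # backward pass, squared-error loss
    d_hidden = (d_out @ W2.T) * hidden * (1 - hidden)
    W2 -= learning_rate * hidden.T @ d_out
    b2 -= learning_rate * d_out.sum(axis=0)
    W1 -= learning_rate * X.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0)

# The learned "program" typically reproduces xor_1_0 on the same inputs: [0. 1. 1. 0.]
print(np.round(out).ravel())
```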

Previously, a number of attempts have been made to understand what makes certain problems hard in the context of evolutionary algorithms (Eiben, Raué, and Ruttkay 1995), typically taking inspiration from computational complexity and subdividing all problems into GA-Easy (the search space grows linearly with the size of the input) and GA-Hard (the search space grows exponentially with the size of the input) for the problem encoded in the minimum chromosomal length (Rylander and Foster 2000, Rylander and Foster 2001). Others, in their quest to explain hardness, have looked at the presence in the search space of deceptive attractors, which guide the search away from global maxima (Forrest and Mitchell 1993). We likewise suggest the existence of two broad classes of problems, but based on whether we are optimizing an existing design or trying to discover a novel design. If a particular class of problems can be represented as an optimization problem with a smooth fitness landscape, it is likely to be solvable by EAs. However, most software engineering/design problems seem to have discontinuous fitness landscapes and so are resistant to the evolutionary hill-climbing paradigm, with GP unable to outperform random search on such problems; the contrast is illustrated by the sketch below. Whether it is possible to convert such design problems into optimization problems remains an open question and is similar in difficulty and consequences to the famous P vs. NP problem (Cook 2006).
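The following toy experiment is our own illustration (not code from the cited works): a simple (1+1)-style hill climber is run on two bitstring landscapes, OneMax, where fitness counts 1-bits and rises smoothly toward the optimum, and a “needle in a haystack” landscape that rewards only the exact target string. The same mutate-and-select loop reliably solves the first and, lacking any gradient on the second, degenerates into a random walk.

```python
# Toy contrast between a smooth and a discontinuous fitness landscape
# (illustrative only; parameters are arbitrary choices).
import random

N = 20          # bitstring length
STEPS = 5000    # mutation attempts per run

def onemax(bits):   # smooth: fitness increases with every correct bit
    return sum(bits)

def needle(bits):   # discontinuous: all-or-nothing reward
    return 1 if all(bits) else 0

def hill_climb(fitness):
    x = [random.randint(0, 1) for _ in range(N)]
    for _ in range(STEPS):
        y = x[:]
        y[random.randrange(N)] ^= 1      # flip one random bit
        if fitness(y) >= fitness(x):     # accept if not worse
            x = y
        if sum(x) == N:
            return True                  # reached the all-ones optimum
    return False

random.seed(1)
print("OneMax solved:", sum(hill_climb(onemax) for _ in range(20)), "/ 20")
print("Needle solved:", sum(hill_climb(needle) for _ in range(20)), "/ 20")
```

On OneMax the climber typically succeeds in every run within a few hundred steps; on the needle landscape the fitness signal is flat almost everywhere, so acceptance is indiscriminate and success within the budget is vanishingly unlikely, which is the sense in which hill climbing offers no advantage over random search on such problems.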

While it seems unlikely that evolutionary algorithms would be able to solve design problems directly, it may be possible to use them to evolve advanced artificial intelligences (Darwin Machines (de Garis 1993) in evolvable hardware (Higuchi et al. 1993, De Garis et al. 2000)) via neuroevolution; such AIs, in turn, may be able to solve design problems. Stanley et al. also argue that neuroevolution may be a likely path to strong artificial intelligence, “… because of its complexity, AGI itself is only possible to discover through an open-ended process that generates more and more complex brain-like structures indefinitely. Furthermore, open-endedness may require more than only neural networks to evolve — brains and bodies evolve together in nature, and so can morphologies evolve along with neural networks in artificial systems, providing a form of embodiment. In the long run, open-endedness could be the fuel for generating the architectures and learning algorithms that ultimately reach human-level intelligence.” (Stanley et al. 2019). This means that any safety concerns we might have about advanced artificial intelligence remain, as an unfortunate side effect of research into neuroevolution, and are not eliminated by fundamental limits of the Darwinian algorithm.
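The weight-space search underlying neuroevolution is easy to sketch. Below is a minimal toy of our own devising, loosely in the spirit of the cited neuroevolution work but not code from any of those papers: a (μ+λ) evolution strategy with Gaussian mutation evolves the flattened weights of a tiny network until it reproduces XOR, with no gradients involved. The architecture and parameters are our assumptions.

```python
# Minimal neuroevolution sketch (illustrative only): evolve network weights
# with a (mu + lambda) evolution strategy instead of backpropagation.
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)          # XOR as the target behavior

N_HID = 4
N_GENES = 2 * N_HID + N_HID + N_HID + 1          # all weights and biases

def forward(genes, x):
    """Decode a flat genome into a 2-N_HID-1 tanh network and run it."""
    i = 0
    W1 = genes[i:i + 2 * N_HID].reshape(2, N_HID); i += 2 * N_HID
    b1 = genes[i:i + N_HID]; i += N_HID
    W2 = genes[i:i + N_HID]; i += N_HID
    b2 = genes[i]
    h = np.tanh(x @ W1 + b1)
    return np.tanh(h @ W2 + b2)

def fitness(genes):
    return -np.mean((forward(genes, X) - y) ** 2)   # negative squared error

MU, LAM, SIGMA = 10, 40, 0.3
pop = [rng.normal(size=N_GENES) for _ in range(MU)]
for gen in range(200):
    # Each child is a mutated copy of a randomly chosen parent.
    children = [pop[rng.integers(MU)] + SIGMA * rng.normal(size=N_GENES)
                for _ in range(LAM)]
    # Keep the MU fittest individuals from parents plus children.
    pop = sorted(pop + children, key=fitness, reverse=True)[:MU]

best = pop[0]
print("evolved outputs:", forward(best, X).round(2))  # should approach [0, 1, 1, 0]
```

The point of the sketch is that the “program” being evolved here is a weight vector over a fixed, smooth parameterization, exactly the regime where evolutionary search performs well, in contrast with evolving discrete source code from scratch.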

Acknowledgments

An earlier version of this paper has been published as “Why We Do Not Evolve Software? Analysis of Evolutionary Algorithms”. Evolutionary Bioinformatics. Volume 14: 1–11. 2018. © Roman V. Yampolskiy.

References

Altshuler, Edward E., and Derek S. Linden. 1997. “Wire-antenna designs using genetic algorithms.” IEEE Antennas and Propagation Magazine 39(2):33–43.

Armstrong, Stuart, and Kaj Sotala. 2015. “How we’re predicting AI–or failing to.” In Beyond Artificial Intelligence, 11–29. Springer.

Ashby, Leif H, and Roman V Yampolskiy. 2011. “Genetic algorithm and Wisdom of Artificial Crowds algorithm applied to Light up.” Computer Games (CGAMES), 2011 16th International Conference on.

Babcock, James, János Kramár, and Roman Yampolskiy. 2016. “The AGI containment problem.” International Conference on Artificial General Intelligence.

Back, Thomas. 1996. Evolutionary algorithms in theory and practice: evolution strategies, evolutionary programming, genetic algorithms: Oxford university press.

Barricelli, Nils Aall. 1957. “Symbiogenetic evolution processes realized by artificial methods.” Methodos 9 (35–36):143–182.

Beck, Marc B, Ahmed H Desoky, Eric C Rouchka, and Roman V Yampolskiy. 2014. “Decoding Methods for DNA Steganalysis.” Paper presented at the 6th International Conference on Bioinformatics and Computational Biology (BICoB 2014), Las Vegas, Nevada, USA.

Beck, Marc B, Eric C Rouchka, and Roman V Yampolskiy. 2012. “Finding data in DNA: computer forensic investigations of living organisms.” International Conference on Digital Forensics and Cyber Crime.

Beck, Marc B, and Roman V Yampolskiy. 2015. “Hiding Color Images in DNA Sequences.” MAICS.

Beck, Marc, and Roman Yampolskiy. 2012. “DNA as a medium for hiding data.” BMC bioinformatics.

Becker, Kory, and Justin Gottschlich. 2017. “AI Programmer: Autonomously Creating Software Programs Using Genetic Algorithms.” arXiv preprint arXiv:1709.05703.

Bird, Jon, and Paul Layzell. 2002. “The evolved radio and its implications for modelling the evolution of novel sensors.” Proceedings of the IEEE Congress on Evolutionary Computation (CEC’02).

Chaitin, Gregory. 2012. Proving Darwin: making biology mathematical: Vintage.

Chaitin, Gregory. 2013. “Life as evolving software.” In A Computable Universe: Understanding and Exploring Nature as Computation, 277–302. World Scientific.

Chambers, Robert. 1853. Vestiges of the natural history of creation: J. Churchill.

Clune, J, D Misevic, C Ofria, RE Lenski, SF Elena, and R Sanjuán. 2008. “Natural Selection Fails to Optimize Mutation Rates for Long-Term Adaptation on Rugged Fitness Landscapes.” PLoS Comput Biol 4(9).

Cook, Stephen. 2006. “The P versus NP problem.” The millennium prize problems:87–104.

Darwin, Charles. 1859. On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life. London: John Murray.

David, Omid E, H. Jaap van den Herik, Moshe Koppel, and Nathan S. Netanyahu. 2014. “Genetic algorithms for evolving computer chess programs.” IEEE transactions on evolutionary computation 18(5):779–789.

Davis, Lawrence. 1991. Handbook of genetic algorithms: Van Nostrand Reinhold.

de Garis, Hugo. 1993. “Evolvable hardware genetic programming of a Darwin machine.” Artificial neural nets and genetic algorithms.

De Garis, Hugo, Michael Korkin, Felix Gers, Eiji Nawa, and Michael Hough. 2000. “Building an artificial brain using an FPGA based CAM-Brain Machine.” Applied Mathematics and Computation 111 (2–3):163–192.

De Simone, Andrea, Alan H Guth, Andrei Linde, Mahdiyar Noorbala, Michael P Salem, and Alexander Vilenkin. 2010. “Boltzmann brains and the scale-factor cutoff measure of the multiverse.” Physical Review D 82 (6):063520.

Dijkstra, E. W. 1989. “On the cruelty of really teaching computing science.” Communications of the ACM 32(12):1398–1404.

Eiben, Agoston E, and Jim Smith. 2015. “From evolutionary computation to the evolution of things.” Nature 521 (7553):476.

Eiben, Ágoston, Paul-Erik Raué, and Zsófia Ruttkay. 1995. “GA-easy and GA-hard constraint satisfaction problems.” In Constraint Processing, 267–283. Springer.

Ferreira, Cândida. 2006. Gene expression programming: mathematical modeling by an artificial intelligence. Vol. 21: Springer.

Fogel, Lawrence J, Alvin J Owens, and Michael J Walsh. 1966. “Artificial intelligence through simulated evolution.”

Forrest, Stephanie, and Melanie Mitchell. 1993. “What makes a problem hard for a genetic algorithm? Some anomalous results and their explanation.” Machine Learning 13 (2–3):285–319.

Gillings, Michael R., Martin Hilbert, and Darrell J. Kemp. 2016. “Information in the biosphere: Biological and digital worlds.” Trends in ecology & evolution 31(3):180–189.

Han, Kuk-Hyun, and Jong-Hwan Kim. 2002. “Quantum-inspired evolutionary algorithm for a class of combinatorial optimization.” IEEE transactions on evolutionary computation 6(6):580–593.

Hanson, Robin. 1998. “Must early life be easy? The rhythm of major evolutionary transitions.” Unpublished manuscript, Available on-line at: http://hanson.gmu.edu/hardstep.pdf.

Hauptman, Ami, and Moshe Sipper. 2005. “GP-endchess: Using genetic programming to evolve chess endgame players.” European Conference on Genetic Programming, Berlin, Heidelberg.

Helmuth, Thomas, and Lee Spector. 2015. “General program synthesis benchmark suite.” Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation.

Higuchi, Tetsuya, Tatsuya Niwa, Toshio Tanaka, Hitoshi Iba, Hugo de Garis, and Tatsumi Furuya. 1993. “Evolving hardware with genetic learning: A first step towards building a Darwin machine.” Proc. of the 2nd International Conference on Simulated Adaptive Behaviour.

Hindi, Musa, and Roman V Yampolskiy. 2012. “Genetic Algorithm Applied to the Graph Coloring Problem.” MAICS.

Hsu, Feng-Hsiung. 2004. Behind Deep Blue: Building the computer that defeated the world chess champion: Princeton University Press.

Hughes, Ryan, and Roman V Yampolskiy. 2012. “Solving Sudoku Puzzles with Wisdom of Artificial Crowds.” Int. J. Intell. Games & Simulation 7 (1):24–29.

Hundley, Madeline V, and Roman V Yampolskiy. April 28–29, 2017. “Shortest Total Path Length Spanning Tree via Wisdom of Artificial Crowds Algorithm.” The 28th Modern Artificial Intelligence and Cognitive Science Conference (MAICS2017), Fort Wayne, IN, USA.

Jansen, Thomas, and Christine Zarges. 2011. “Analysis of evolutionary algorithms: from computational complexity analysis to algorithm engineering.” Proceedings of the 11th workshop proceedings on Foundations of genetic algorithms.

Kannappan, Karthik, Lee Spector, Moshe Sipper, Thomas Helmuth, William La Cava, Jake Wisdom, and Omri Bernstein. 2015. “Analyzing a decade of human-competitive (“humie”) winners: What can we learn?” In Genetic Programming Theory and Practice XII, 149–166. Springer.

Karpathy, Andrej. November 11, 2017. “Software 2.0.” Medium, Available at: https://medium.com/@karpathy/software-2-0-a64152b37c35.

Khalifa, Amine Ben, and Roman V Yampolskiy. 2011. “GA with Wisdom of Artificial Crowds for Solving Mastermind Satisfiability Problem.” Int. J. Intell. Games & Simulation 6 (2):12–17.

Kolmogorov, A N. 1965. “Three Approaches to the Quantitative Definition of Information.” Problems Inform. Transmission 1(1):1–7.

Kosorukoff, Alex. 2001. “Human based genetic algorithm.” Systems, Man, and Cybernetics, 2001 IEEE International Conference on.

Koza, John R. 1994. “Genetic programming as a means for programming computers by natural selection.” Statistics and computing 4 (2):87–112.

Koza, John R. 2010. “Human-competitive results produced by genetic programming.” Genetic Programming and Evolvable Machines 11 (3–4):251–284.

Lamarck, J-B-PA. 1809. Philosophie zoologique.

Lambert, Neill, Yueh-Nan Chen, Yuan-Chung Cheng, Che-Ming Li, Guang-Yin Chen, and Franco Nori. 2013. “Quantum biology.” Nature Physics 9(1).

Landenmark, Hanna KE, Duncan H. Forgan, and Charles S. Cockell. 2015. “An estimate of the total DNA in the biosphere.” PLoS Biology 13(6).

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. 2015. “Deep learning.” Nature 521 (7553):436–444.

Lehman, Joel, Jeff Clune, Dusan Misevic, Christoph Adami, Julie Beaulieu, Peter J Bentley, Samuel Bernard, Guillaume Belson, David M Bryson, and Nick Cheney. 2018. “The surprising creativity of digital evolution: A collection of anecdotes from the evolutionary computation and artificial life research communities.” arXiv preprint arXiv:1803.03453.

Lenski, Richard E, Michael R. Rose, Suzanne C. Simpson, and Scott C. Tadler. 1991. “Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2,000 generations.” The American Naturalist 138(6):1315–1341.

Livnat, Adi, and Christos Papadimitriou. 2016. “Sex as an algorithm: the theory of evolution under the lens of computation.” Communications of the ACM 59 (11):84–93.

Lowrance, Christopher J, Omar Abdelwahab, and Roman V Yampolskiy. 2015. “Evolution of a Metaheuristic for Aggregating Wisdom from Artificial Crowds.” Portuguese Conference on Artificial Intelligence.

Mahoney, Matt. 2015. “Couldn’t we eliminate any and all risk of the artificial general intelligence by making its goal to be the most accurate possible advisor?”, Available at: https://www.quora.com/Couldnt-we-eliminate-any-and-all-risk-of-the-artificial-general-intelligence-by-making-its-goal-to-be-the-most-accurate-possible-advisor.

Majot, Andrew M, and Roman V Yampolskiy. May 23–24, 2014. “AI safety engineering through introduction of self-reference into felicific calculus via artificial pain and pleasure.” IEEE International Symposium on Ethics in Science, Technology and Engineering, Chicago, IL.

Majot, Andy, and Roman Yampolskiy. 2015. “Global catastrophic risk and security implications of quantum computers.” Futures 72:17–26.

Mayr, Ernst. 1974. “Behavior Programs and Evolutionary Strategies: Natural selection sometimes favors a genetically “closed” behavior program, sometimes an “open” one.” American Scientist 62 (6):650–659.

McCulloch, Warren S, and Walter Pitts. 1943. “A logical calculus of the ideas immanent in nervous activity.” The bulletin of mathematical biophysics 5 (4):115–133.

Mewada, Shivlal, Pradeep Sharma, and SS Gautam. 2018. “Exploration of Fuzzy System With Applications.” In Handbook of Research on Promoting Business Process Improvement Through Inventory Control Techniques, 479–498. IGI Global.

Mishra, Mohit, Vaibhav Gupta, Utkarsh Chaturvedi, Kaushal K Shukla, and Roman V Yampolskiy. 2015. “A Study on the Limitations of Evolutionary Computation and other Bio-inspired Approaches for Integer Factorization.” Procedia Computer Science 62:603–610.

Mishra, Mohit, SK Pal, and RV Yampolskiy. 2016. “Nature-Inspired Computing Techniques for Integer Factorization.” Evolutionary Computation: Techniques and Applications:401.

Nachenberg, Carey. 1997. “Computer virus-antivirus coevolution.” Communications of the ACM 40 (1):46–51.

Orlov, Michael, and Moshe Sipper. 2011a. “FINCH: A system for evolving Java (bytecode).” In Genetic Programming Theory and Practice VIII, 1–16. Springer.

Orlov, Michael, and Moshe Sipper. 2011b. “Flight of the FINCH through the Java wilderness.” IEEE Transactions on Evolutionary Computation 15 (2):166–182.

Osborne, Tobias J. 2012. “Hamiltonian complexity.” Reports on Progress in Physics 75 (2):022001.

Ouch, Roger, Kristopher Reese, and Roman V Yampolskiy. 2013. “Hybrid Genetic Algorithm for the Maximum Clique Problem Combining Sharing and Migration.” MAICS.

Petke, Justyna, Saemundur Haraldsson, Mark Harman, David White, and John Woodward. 2017. “Genetic improvement of software: a comprehensive survey.” IEEE Transactions on Evolutionary Computation.

Pistono, Federico, and Roman V Yampolskiy. 2016. “Unethical Research: How to Create a Malevolent Artificial Intelligence.” 25th International Joint Conference on Artificial Intelligence (IJCAI-16). Ethics for Artificial Intelligence Workshop (AI-Ethics-2016).

Poli, Riccardo, William B Langdon, Nicholas McPhee, and John Koza. 2008. “Field guide to genetic programming.” http://www.gp-field-guide.org.uk/.

Port, Aaron C, and Roman V Yampolskiy. 2012. “Using a GA and Wisdom of Artificial Crowds to solve solitaire battleship puzzles.” Computer Games (CGAMES), 2012 17th International Conference on.

Ramamoorthy, Anand, and Roman Yampolskiy. 2017. “Beyond Mad?: The Race for Artificial General Intelligence.” ITU Journal: ICT Discoveries.

Ray, T. S. 1991. “Evolution and optimization of digital organisms.” Scientific Excellence in Supercomputing: The IBM 1990 Contest Prize Papers.

Rylander, Bart, and James Foster. 2001. “Genetic Algorithms, and Hardness.” WSEAS International Conference on Evolutionary Computation, Tenerife Playa, Canary Islands, Spain.

Rylander, Bart, and James A Foster. 2000. “GA Hard Problems.” GECCO.

Rylander, Bart, Terry Soule, and James Foster. 2001. “Computational complexity, genetic programming, and implications.” European Conference on Genetic Programming.

Shulman, Carl, and Nick Bostrom. 2012. “How hard is artificial intelligence? Evolutionary arguments and selection effects.” Journal of Consciousness Studies 19 (7–8):103–130.

Silver, David, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, and Marc Lanctot. 2016. “Mastering the game of Go with deep neural networks and tree search.” Nature 529 (7587):484–489.

Silver, David, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, and Adrian Bolton. 2017. “Mastering the game of go without human knowledge.” Nature 550 (7676):354.

Sofge, Donald A. 2006. “Toward a framework for quantum evolutionary computation.” Cybernetics and Intelligent Systems, 2006 IEEE Conference on.

Soros, Lisa B, and Kenneth O Stanley. 2014. “Identifying necessary conditions for open-ended evolution through the artificial life world of chromaria.” H. Sayama, J. Rieffel, S. Risi, R. Doursat, & H. Lipson (Eds.), Artificial life 14: Proceedings of the Fourteenth International Conference on the Synthesis and Simulation of Living Systems.

Sotala, Kaj, and Roman V Yampolskiy. 2015. “Responses to catastrophic AGI risk: a survey.” Physica Scripta 90 (1):018001.

Stanley, Kenneth O, Jeff Clune, Joel Lehman, and Risto Miikkulainen. 2019. “Designing neural networks through neuroevolution.” Nature Machine Intelligence 1:24–35.

Stanley, Kenneth O, and Risto Miikkulainen. 2003. “A taxonomy for artificial embryogeny.” Artificial Life 9 (2):93–130.

Storn, Rainer, and Kenneth Price. 1997. “Differential evolution–a simple and efficient heuristic for global optimization over continuous spaces.” Journal of global optimization 11 (4):341–359.

Stuhlmüller, Andreas. 2016. “50 things I learned at NIPS 2016.” NIPS, Available at: https://blog.ought.com/nips-2016-875bb8fadb8c.

Such, Felipe Petroski, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O Stanley, and Jeff Clune. 2017. “Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning.” arXiv preprint arXiv:1712.06567.

Teller, Astro. 1994. “Turing completeness in the language of genetic programming with indexed memory.” Evolutionary Computation, 1994. IEEE World Congress on Computational Intelligence., Proceedings of the First IEEE Conference on.

Turing, A. 1950. “Computing Machinery and Intelligence.” Mind 59(236):433–460.

Turing, Alan. 1936. “On computable numbers, with an application to the Entscheidungsproblem.” Proceedings of the London Mathematical Society 2(42):230–265.

Valiant, Leslie G. 2009. “Evolvability.” Journal of the ACM (JACM) 56 (1):3.

White, David R, James Mcdermott, Mauro Castelli, Luca Manzoni, Brian W Goldman, Gabriel Kronberger, Wojciech Jaśkowski, Una-May O’Reilly, and Sean Luke. 2013. “Better GP benchmarks: community survey results and proposals.” Genetic Programming and Evolvable Machines 14 (1):3–29.

Wolfram, Stephen. May 14, 2002. A New Kind of Science: Wolfram Media, Inc.

Woodward, John R. 2003. “GA or GP? that is not the question.” Evolutionary Computation, 2003. CEC’03. The 2003 Congress on.

Woodward, John R, and Ruibin Bai. 2009. “Why evolution is not a good paradigm for program induction: a critique of genetic programming.” Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation.

Woodward, John R, Colin G Johnson, and Alexander EI Brownlee. 2016. “GP vs GI: if you can’t beat them, join them.” Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion.

Yampolskiy, R, P Anderson, J Arney, Vladimir Misic, and T Clarke. 2004. “Printer model integrating genetic algorithm for improvement of halftone patterns.” Western New York Image Processing Workshop (WNYIPW).

Yampolskiy, R.V. 2010. “Application of bio-inspired algorithm to the problem of integer factorisation.” International Journal of Bio-Inspired Computation 2 (2):115–123.

Yampolskiy, Roman. 2018. Artificial Intelligence Safety and Security: CRC Press.

Yampolskiy, Roman, and Joshua Fox. 2013. “Safety engineering for artificial general intelligence.” Topoi 32 (2):217–226.

Yampolskiy, Roman V. 2013a. “Efficiency Theory: a Unifying Theory for Information, Computation and Intelligence.” Journal of Discrete Mathematical Sciences and Cryptography 16 (4–5):259–277.

Yampolskiy, Roman V. 2013b. “Turing test as a defining feature of AI-completeness.” In Artificial Intelligence, Evolutionary Computing and Metaheuristics, 3–17. Springer Berlin Heidelberg.

Yampolskiy, Roman V. 2016a. “On the origin of synthetic life: attribution of output to a particular algorithm.” Physica Scripta 92 (1):013002.

Yampolskiy, Roman V. 2016b. “Taxonomy of Pathways to Dangerous Artificial Intelligence.” AAAI Workshop: AI, Ethics, and Society.

Yampolskiy, Roman V. 2017. “What are the ultimate limits to computational techniques: verifier theory and unverifiability.” Physica Scripta 92 (9):093001.

Yampolskiy, Roman V, Leif Ashby, and Lucas Hassan. 2012. “Wisdom of Artificial Crowds — A Metaheuristic Algorithm for Optimization.” Journal of Intelligent Learning Systems and Applications 4 (2):98–107.

Yampolskiy, Roman V, and Ahmed El-Barkouky. 2011. “Wisdom of artificial crowds algorithm for solving NP-hard problems.” International Journal of Bio-inspired computation 3 (6):358–369.

Yampolskiy, Roman V, and MS Spellchecker. 2016. “Artificial Intelligence Safety and Cybersecurity: a Timeline of AI Failures.” arXiv preprint arXiv:1610.07997.

Yampolskiy, Roman V. 2015. Artificial Superintelligence: a Futuristic Approach: Chapman and Hall/CRC.

Yampolskiy, Roman V. April 21–22, 2012. “AI-Complete, AI-Hard, or AI-Easy — Classification of Problems in AI.” The 23rd Midwest Artificial Intelligence and Cognitive Science Conference, Cincinnati, OH, USA.

[1] https://en.wikipedia.org/wiki/TOP500#Top_10_ranking

[2] https://blockchain.info
