Hello World

Evolutionary Hello World

Jul 18, 2016


This is the first project to kick off my journey in Evolutionary Computing in summer 2016. Here it aims to evolve the term "hello world" (or a user-defined term) from a pool of random strings using a method called genetic algorithm. As you click the "Evolve!" button below, you will witness a grand evolutionary process taking place in a blink of an eye: hundreds of generations of strings are created and gradually move towards the evolutionary goal with selection, crossover and mutation performed at each individual candidate from each generation.

Here I set the maximum number of generations to be 500. There is a slight chance that things get unlucky and it may take more than that. If that's the case, just click the button to give it another try. Code is up here.


Experiment

Maximum Generation: 500

Population Size: 50

Selection Weight Factor: 1.5

Crossover Point: dynamic

Mutation Rate: 2%

Define an evolutionary goal:


Thoughts

This is quite interesting! It started all randomly and eventually was able to land on what we want. All we needed to do is to define an evolutionary goal and then get ready to be amazed. Of course, this is a super simplified model of what's happening behind evolution in the real-world. However, it is enough to illustrate a point: evolution is driven by fitness.

In the biological world, fitness can be easily understood as the ability to thrive. The better an individual adapts to an environment, the higher fitness it possesses, and the more offspring it would produce. In this Evolutionary Hello World example, the fittness of each string is measured by how many letters it matches with the target string. Better-matched strings have higher chance to be selected to give babies to the next generation.

What if this generate-and-test idea is generalized to other domains, for example, creating a new food recipe with egg and tomato? Before answering that, we want to know how recipes are created in traditional manner. Below are two classic dishes made by eggs and tomatoes, both taste good even though were made differently. Imagine how they were invented in the first place. Did the cook manually enumerate all possible combinations of the ingredients and select the tasty ones? Not possible. Did he intuitively create the recipes with his years of cooking experience, along with a good amount of trial-and-error? That's possibly the answer.

Western style tomato omelet

Chinese style stir-fried egg with tomato

(image source) (image source)

Our generation is lucky because we can enjoy a large variety of cuisines as a result of thousands of years of manual knowledge accumulation. In the digital world, things could get even better. To use genetic algorithm, we can start out with a bunch of food "genes":

  • Ingredients: egg, tomato
  • Seasoning: salt, pepper, MSG, soy source, cinnamon, mint,..
  • Cooking methods: keep it raw, steam, boil, roast, stir fry, deep fry,..

Then, we randomly select some combinations of the "genes" to form the "genotypes":

genotype1: egg + tomato + salt + keep it raw
genotype2: egg + tomato + pepper + cinnamon + boil + stir fry
genotype3: egg + tomato + mint + soy source + steam
genotype4: you get the idea

Finally, in a virtual environment, we cook them all out. The resulting dishes, along with their scent, flavor, texture and appearance, make up the "phenotypes".

As of the initial generation, some may taste bad, some may taste even worse, but ideally, we can rank their tastiness and let the top ones breed the next generation. By proceeding this process over and over, the program may invent some tasty recipes! Note that the beauty of it lies in the fact that computers are not constrained by human's preferences, biases and stereotypes, so that recipes invented could be really strange but taste surprisingly good (not guaranteed though, I'm going to test it someday).

Considering how we human were evolved from unicellular organism to such complex beings, I do believe evolutionary computing has the potential to automate innovation. However, it is more difficult than the example above.

  • The search space could be HUGE. Just think about how many ingredients, seasonings, cooking methods, cooking orders and so on exist in this world. Since genetic algorithm is a randomized heuristic search method, it takes more time to explore the search space as more variables are introduced.
  • Quantifying the genes and finding a suitable data structure for genotype is hard.
  • There might be multiple evolutionary goals, conflicting with one and another. We may appreciate not only tastiness, but also low cost of material as well as high energy efficiency.
  • The measurement of fitness is a problem itself. If dishes are evolved and tested in a digital context, how do we evaluate the tastiness?
  • Hyper-parameter tuning for a healthy evolutionary environment. Population size, mutation rate, crossover method, optimizing parameters like these takes time and computational resources.

These are active research questions that we want to address collectively. As an optimist, I think the future is bright. Okay, that's enough for a blog post, I'm going to cook my Chinese-style stir fried egg with tomato now.