Self-Landing Spaceship - Simple version
Sep 27, 2016
In September 2016, Elon Musk revealed SpaceX's Mars colonization plan. It made me curious about what machine learning could do in this great endeavor. Here I have borrowed some code from John Watson's blog and turned it into an experiment field, where a simple landing scenario is simulated.
The goal is simple: to let the spaceship learn a landing strategy through trial and error, so that it thrusts at optimal timings and eventually land on the ground safe and sound. To be more specific, a successful landing has to meet the following criteria:
- it doesn't hit the ground too hard, that is, the Y velocity must be smaller than 30px/s
- it doesn't go beyond the upper boundary (this is a landing task, after all)
- it doesn't run out of fuel, which only lasts for 10 seconds
Not too harsh, right?
I call this setting a simplified scenario is because the spaceship only have two actions to choose from: either thrust or not thrust. If thrusts, it will battle with the gravity to slow down or even go upward; if doesn't thrust, it will be dragged down to the ground. The simulator runs at 60fps, at each frame it makes a decision based on two inputs: its current Y velocity, and distance to the ground.
I firstly experimented with the genetic algorithm similar to the hello world example. Basically the simulation starts out with a generation of 20 spaceship candidates, each is controlled by a neural network that takes sensor readings as input, and outputs a binary yes/no for thrust. In the first generation, all neural networks are initialized with random weights. As the landing performance of each spaceship is recorded, good ones will have higher chance to pass their "genes" (the weight parameters) to the next generation. This is referred as Selection. While reproducing the next generation, a candidate of which receives its genes from a remix of its parents'. This is referred as Crossover. In this evolutionary process, once in a while some random genes of a candidate are altered to some random values, just to maintain the genetic diversity of a population, or put in machine learning's jargon, to avoid local optima. This is called Mutation. Turned out that traditional genetic algorithm can easily handle this simple problem, took about 9 generations with population size=40 to find a landing solution.
I then tried another method called beam search, which tries to find the optimal set of weight parameters by mutating around the best-performing candidates. It gave even better results: found a landing solution around about 5th generation.
Performance comparison before and after evolution
Both of these two methods are heuristic search algorithms that work well when the search space is not too complex. Although what's simulated here is no where near the real-life landing scenario, I regard this project a good practice to learn what genetic algorithm/ beam search can do in parameter optimization. For future work, I plan to increase the landing difficulty and compare the performances among genetic algorithm, beam search, and reinforcement learning.
Code of this project will be published soon.