Demo 4: Full Example

We’ll now look at a full evolutionary robotics example. Here are the steps:

  1. Set up a compute environment.
  2. Create a simulation that is decoupled from evolution and visualization.
  3. Design a fitness function.
  4. Implement an evolutionary algorithm.
  5. Launch the evolutionary process and leverage parallelism.
  6. Analyze and visualize the results.

Setup

I am using an HPC Linux server with Lmod (for environment management) and Slurm (for workload management), though these instructions should work fairly broadly.

I use Miniforge to manage my software installations. So, here is my process:

# Install or activate Miniforge
module load miniforge3

# Create and activate a new environment
mamba create --name simer
mamba activate simer

# Install necessary packages
mamba install pybox2d ipython pandas enlighten jupyter seaborn plotly

# Save the environment for reproducibility (create both)
conda env export --from-history > cross-platform.yml
conda list --explicit > spec-file.txt

# Create a directory for code (and a separate one for data if needed)
mkdir -p ~/simer-tutorial
cd ~/simer-tutorial

Once the directory is created, I typically edit code using VSCode with the Remote - SSH extension.

Simulation

My WMR simulation, written in Python, largely follows the version from demos 1-3. The key is that the simulation can be constructed with the parameters we want to evolve:

class WMR:
    def __init__(
        self,
        *,
        wheel_radius: float,
        chassis_length: float,
        suspension_frequency: float,
        suspension_damping: float,
        sensor_limit: float,
        duration: float,
        time_step: float,
        visualize: bool = False,
    ):
    ...

In the evolution file, we can then simulate with:

wmr = WMR(
    wheel_radius=wheel_radius,
    chassis_length=chassis_length,
    suspension_frequency=suspension_frequency,
    suspension_damping=suspension_damping,
    sensor_limit=sensor_limit,
    duration=DURATION,
    time_step=TIME_STEP,
    visualize=visualize,
)

Where each of these values is computed from a genome.
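One common approach (the gene order and parameter bounds below are hypothetical; the real ranges live in the repository code) is to keep genes in [0, 1] and linearly map each one onto its physical range:

```python
# Sketch of genome decoding. The bounds here are illustrative assumptions,
# not the values used in the actual demo.
def decode(genome: list[float]) -> dict:
    """Map genes in [0, 1] onto physical parameter ranges."""
    bounds = {
        "wheel_radius": (0.05, 0.5),
        "chassis_length": (0.5, 2.0),
        "suspension_frequency": (1.0, 10.0),
        "suspension_damping": (0.1, 1.0),
        "sensor_limit": (0.5, 5.0),
    }
    return {
        name: lo + gene * (hi - lo)
        for gene, (name, (lo, hi)) in zip(genome, bounds.items())
    }
```

The decoded dictionary can then be unpacked directly into the constructor, e.g. `WMR(**decode(genome), duration=DURATION, time_step=TIME_STEP)`.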

Fitness Evaluation

We need to design a fitness function to satisfy our problem statement:

We want to evolve an autonomous wheeled mobile robot (WMR) to navigate quickly over obstacles and then stop in front of a wall.

I recommend using a two-value fitness function. One component is used for a feasibility (constraint) metric, and the other is an objective metric. For example, for this demo I used the following equation for feasibility:

\[ f = \frac{l}{2} - r \tag{1}\]

where \(l\) is the chassis length and \(r\) is the wheel radius.

Essentially, Equation 1 is a feasibility value that takes into account whether the WMR can actually be constructed: if the wheels are too big, they will not fit on the chassis. It is better to use a feasibility metric as part of the fitness value than to throw out infeasible individuals completely, since the metric provides a better “gradient” for evolution to follow back toward feasible designs.
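As a sketch (clamping feasible values to zero is my assumption of one common convention, not necessarily what the repository does), every buildable design scores 0, and infeasible ones are penalized in proportion to how badly the wheels overlap:

```python
from collections import namedtuple

Fitness = namedtuple("Fitness", ["feasibility", "objective"])

def feasibility(chassis_length: float, wheel_radius: float) -> float:
    """Equation 1 (f = l/2 - r), clamped so every feasible design scores
    0.0 while infeasible designs get a gradient toward feasibility."""
    return min(chassis_length / 2 - wheel_radius, 0.0)
```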

For the demo, I used the following objective function:

\[ o = 2 \left(1 - \frac{d}{d_{\text{max}}}\right) + 1 \left(1 - \frac{\omega}{\omega_{\text{max}}}\right) + \frac{1}{2} \left( 1 - \text{hit} \right) + \frac{1}{4} \left(1 - \frac{r}{r_{\text{max}}}\right) + \frac{1}{4} \left(1 - \frac{t_r}{t_{\text{max}}}\right) \tag{2}\]

The different components are meant to:

  1. Minimize distance from the final target location.
  2. Minimize the final angular velocity of the wheels.
  3. Penalize hitting the wall.
  4. Minimize the wheel radius.
  5. Minimize the time taken to reach rest.

This equation is pretty brittle (and definitely over-engineered); with multiple objectives you would be better off using a many-objective optimization algorithm, such as lexicase selection.

Evolution

Here is the basic structure of the evolutionary algorithm:

flowchart LR
    Init[Initialize] --> Eval[Evaluate]
    Eval --> Stop{Stop?}
    Stop --Yes--> Retn[Return]
    Stop --No--> Sele[Select]
    Sele --> Modi[Modify]
    Modi --> Evl2[Evaluate]
    Evl2 --> Comb[Combine]
    Comb --> Stop

I like to get a bit of linting help by defining the types I’ll be using:

Genome = list[float]
Fitness = namedtuple("Fitness", ["feasibility", "objective"])
Individual = tuple[Genome, Fitness]
Population = list[Individual]
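A nice side effect of putting feasibility first in the namedtuple (my reading of the design, not stated explicitly above): namedtuples inherit the tuple's lexicographic comparison, so feasibility dominates and the objective only breaks ties.

```python
from collections import namedtuple

Fitness = namedtuple("Fitness", ["feasibility", "objective"])

# Tuples compare field by field, so an infeasible individual loses to a
# feasible one no matter how high its objective score is.
assert Fitness(0.0, 1.5) > Fitness(-0.1, 9.9)
assert Fitness(0.0, 2.0) > Fitness(0.0, 1.0)
```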

And also some utility functions:

def clamp(lo: float, hi: float, value: float) -> float: ...
def scale(from_lo: float, from_hi: float, to_lo: float, to_hi: float, value: float) -> float: ...
def statistics(pop: Population) -> tuple[Fitness, Fitness, Fitness]: ...
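The first two utilities are small enough to sketch in full (a plausible implementation; the repository version may differ):

```python
def clamp(lo: float, hi: float, value: float) -> float:
    """Restrict value to the closed interval [lo, hi]."""
    return max(lo, min(hi, value))

def scale(from_lo: float, from_hi: float,
          to_lo: float, to_hi: float, value: float) -> float:
    """Linearly map value from [from_lo, from_hi] onto [to_lo, to_hi]."""
    t = (value - from_lo) / (from_hi - from_lo)
    return to_lo + t * (to_hi - to_lo)
```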

Now the core of our algorithm’s implementation:

def generate_genome() -> Genome: ...
def simulate(...) -> Fitness: ...
def fitness(genome: Genome, testing=False) -> tuple[Fitness, dict]: ...

def initialize(size: int) -> Population: ...
def evaluate(pop: Population, manager) -> Population: ...
def stop(pop: Population, *, _best=[Fitness(0, 0)], _counter=[0]) -> bool: ...

def tournament(pop: Population) -> Individual: ...
def select(pop: Population) -> Population: ...

def mutate_gene(gene: float) -> float: ...
def mutate(genome: Genome) -> Genome: ...
def modify(pop: Population) -> Population: ...

def combine(original: Population, children: Population) -> Population: ...
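As one possible sketch of the selection step (the tournament size of 3 is a hypothetical choice, not taken from the repository):

```python
import random

def tournament(pop, k: int = 3):
    """Return the best of k randomly sampled individuals. Because Fitness
    is a (feasibility, objective) tuple, max() compares lexicographically:
    feasibility dominates, with the objective as the tie-breaker."""
    return max(random.sample(pop, k), key=lambda ind: ind[1])

def select(pop):
    """Build a parent pool the same size as the population."""
    return [tournament(pop) for _ in range(len(pop))]
```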

And here is the main loop:

population = initialize(args.population_size)
population = evaluate(population, manager)

for generation in range(args.num_generations):
    if stop(population):
        break

    selected = select(population)
    children = modify(selected)
    children = evaluate(children, manager)
    population = combine(population, children)

A few implementation-specific details are bleeding through here, such as the enlighten progress bar manager passed to evaluate. You can view the full code in the repository.
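One plausible implementation of combine (the repository may do something different) is a (mu + lambda)-style truncation that pools parents with children and keeps the best:

```python
def combine(original, children):
    """Pool parents and children, then keep the best len(original)
    individuals. Sorting on the (feasibility, objective) tuple keeps
    feasibility dominant over the objective."""
    pool = sorted(original + children, key=lambda ind: ind[1], reverse=True)
    return pool[:len(original)]
```

Note that this is strongly elitist; pairing it with aggressive mutation is one way to keep the exploration the text mentions.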

Here’s a demo of the evolution script:

Launch

Rarely do we want to run a single “replicate” of an evolutionary optimization process. Instead, we run many with different initial conditions. The script below will launch 10 trials of the evolutionary process:

# Load conda and activate the environment from the setup step
module load miniforge3
conda activate simer

POP_SIZE=100
NUM_GENERATIONS=100
NUM_TRIALS=10

slurm_args="--ntasks=1 --nodes=1 --exclusive"
cmd="python wmr_evolution.py"
cmd_args="--population_size $POP_SIZE --num_generations $NUM_GENERATIONS"

set -exuo pipefail

for trial in $(seq 1 $NUM_TRIALS); do
    srun $slurm_args $cmd "trial$trial" $cmd_args --seed $trial &
done
wait

You can find the full script with SLURM options here.

Analyze

Last, let’s take a look at the results.

Visualizations

I like to start with the behaviors. I have my script output the “best” individual from each replicate experiment. The behavior is output as a JSON log file that I can visualize with Review (a tool I created for this purpose).

Here’s a second example with a slightly different behavior:

Both of these are “better” behaviors than I was able to achieve by hand.

Evolutionary Progress

Figure 1 shows the fitness of the best and average individuals over generations. We can see that the best individual from each replicate improves over time. The shaded regions denote the 95% confidence interval around the average performance.

The average performance (in orange) will always be a bit noisy depending on the exact algorithm. I implemented an overly simple algorithm with a lot of “exploration” and less “exploitation.”
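Plotting libraries like seaborn can compute confidence bands for you; for reference, here is a minimal normal-approximation sketch of the 95% interval across replicates (a helper of my own, not from the repository):

```python
import statistics

def mean_ci95(values: list[float]) -> tuple[float, float]:
    """Mean and 95% CI half-width (normal approximation, z = 1.96)
    across replicate values at one generation."""
    mean = statistics.mean(values)
    if len(values) < 2:
        return mean, 0.0
    half_width = 1.96 * statistics.stdev(values) / len(values) ** 0.5
    return mean, half_width
```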

Figure 1: Generations Vs. Fitness

Final Fitness Values

From Figure 2, we can see that we mostly achieve our goals. It appears that there is a minimum wheel radius that is able to get over the step, and we cannot instantaneously reach the target position.

Figure 2: Evolved Objectives of Top 5% of Final Populations

Evolved Parameters

Now we turn to the evolved parameters, shown in Figure 3.

Figure 3: Evolved Parameters of Top 5% of Final Populations

This plot only shows the top 5% of individuals across all final replicate populations. I filtered out the lower performing individuals since I have a heavily “exploration”-centered algorithm.

From these values, we can get a good idea of our design constraints. We can also guess that it would be good to increase the maximum possible value for suspension frequency and the sensor limit. Increasing the top speed is probably not a good idea since it will be limited by hardware more than the other two parameters are.

Finally, in Figure 4, we take a look at the correlations among parameters.

Figure 4: Correlations among Evolved Parameters of Top 5% of Final Populations

It is usually a good idea to see if there are any strong tradeoffs among the parameters. For example, there is a strong negative correlation between the speed slope and intercept.
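In practice a library routine such as `pandas.DataFrame.corr` produces these values; for reference, a minimal sketch of the underlying Pearson coefficient:

```python
def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two parameter columns:
    covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5
```

Values near -1 (like the slope/intercept pair above) indicate that evolution treats the two parameters as a tradeoff rather than tuning them independently.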