Full Example

We’ll now look at a full example focused on evolutionary robotics. Here are the steps:

Setup a compute environment.
Create a simulation that is decoupled from evolution and visualization.
Design a fitness function.
Implement an evolutionary algorithm.
Launch the evolutionary process and leverage parallelism.
Analyze and visualize the results.

Setup

This example uses conda/mamba in place of Pixi and uv. The differences are fairly minor and you should stick with whatever is best practices for your group.

I am using an HPC linux server with Lmod (used for environment management) and Slurm (for workload management). Though these instructions will work pretty broadly.

I use Miniforge to manage my software installations. So, here is my process:

# Install or activate Miniforge
module load miniforge3

# Create a new environment
mamba create --name simr

# Install necessary packages
mamba install pybox2d ipython pandas enlighten jupyter seaborn plotly

# Save the environment for reproducibility (create both)
conda env export --from-history > cross-platform.yml
conda list --explicit > spec-file.txt

# Create a directory for code (and a separate for data if needed)
mkdir -p ~/simr-tutorial
cd ~/simr-tutorial

Once the directory is created, I typically edit code using VSCode with the Remote - SSH extension.

Simulation

My WMR simulation, written in Python, largely follows the version from demos 1-3. The key is that the simulation can be constructed with the parameters we want to evolve:

class WMR:
    def __init__(
        self,
        *,
        wheel_radius: float,
        chassis_length: float,
        suspension_frequency: float,
        suspension_damping: float,
        sensor_limit: float,
        duration: float,
        time_step: float,
        visualize: bool = False,
    ):
    ...

In the evolution file, we can then simulate with:

wmr = WMR(
    wheel_radius=wheel_radius,
    chassis_length=chassis_length,
    suspension_frequency=suspension_frequency,
    suspension_damping=suspension_damping,
    sensor_limit=sensor_limit,
    duration=DURATION,
    time_step=TIME_STEP,
    visualize=visualize,
)

Where each of these values is computed from a genome. For example,

GENOME_MAPPING = {
    "wheel_radius": (0.5, 1.5),
    "chassis_length": (1, 4),
    "suspension_frequency": (1, 8),
    "suspension_damping": (0.3, 0.9),
    "sensor_limit": (1, 15),
    "speed_max": (0, 10),
    "speed_slope": (0, 10),
    "speed_intercept": (-20, 20),
}

params = {
    k: scale(0, 1, lo, hi, g)
    for (k, (lo, hi)), g in zip(GENOME_MAPPING.items(), genome)
}

sim_info = simulate(**params)

Here is also a good place to track information about the phenotype using, for example, Phylotrackpy. ## Fitness Evaluation

We need to design a fitness function to satisfy our problem statement:

We want to evolve an autonomous wheeled mobile robot (WMR) to navigator quickly over obstacles and then stop in front of a wall.

I recommend using a two-value fitness function. One component is used for a feasibility (constraint) metric, and the other is an objective metric. For example, for this demo I used the following equation for feasibility:

\[ f = \frac{l}{2} - r \tag{1}\]

where \(l\) is the chassis length and \(r\) is the wheel radius.

Essentially, Equation 1 is a feasibility value that takes into account the ability to actually construct the WMR. If the wheels are too big, then they will not fit on the chassis. It is better to use a feasibility metric as part of the fitness value instead of just throwing out the individual completely. This provides a better “gradient” for evolution.

For the demo, I used the following objective function:

\[ o = 2 \left(1 - \frac{d}{d_{\text{max}}}\right) + 1 \left(1 - \frac{\omega}{\omega_{\text{max}}}\right) + \frac{1}{2} \left( 1 - \text{hit} \right) + \frac{1}{4} \left(1 - \frac{r}{r_{\text{max}}}\right) + \frac{1}{4} \left(1 - \frac{t_r}{t_{\text{max}}}\right) \tag{2}\]

The different components are meant to:

Minimize distance from the final target location.
Minimize the final angular velocity of the wheels.
Penalize hitting the wall.
Minimize the wheel radius.
Minimize the time taken to reach rest.

This equation is pretty brittle (and definitely over engineered), and with multiple objectives you would be better off using a many objective optimization algorithm, such as Lexicase.

I typically save each component separately so that I can better evaluate the final individuals and the algorithm’s performance.

sim_info["objective"] = {}

objective = 0

# Minimize final distance from target
distance_to_target = sim_info["location"][-1] - TARGET_LOCATION
objective += 2 * (1 - abs(distance_to_target) / INITIAL_TARGET_DISTANCE)
sim_info["objective"]["final_distance"] = distance_to_target

# Penalize final velocity
final_speed = sim_info["speed"][-1]
objective += 1 - abs(final_speed) / GENOME_MAPPING["speed_max"][1]
sim_info["objective"]["final_speed"] = final_speed

# Penalize hitting the wall
hit_wall = any(sim_info["contact"])
objective += 0.5 * (1 - hit_wall)
sim_info["objective"]["hit_wall"] = hit_wall

# Minimize wheel radius (genome is already scaled 0 to 1)
objective += 0.25 * (1 - genome[0])
sim_info["objective"]["wheel_radius"] = params["wheel_radius"]

# If at target, minimize time to rest
near_zero = [abs(s) < SPEED_TOLERANCE for s in sim_info["speed"]]
if False in near_zero:
    index = (n - indexOf(reversed(near_zero), False)) if near_zero[-1] else n
else:
    index = n
objective += 0.25 * (1 - (index / n))
sim_info["objective"]["index_at_rest"] = index

Evolution

Here is the basic structure of the evolutionary algorithm:

flowchart LR
    Init[Initialize] --> Eval[Evaluate]
    Eval --> Stop{Stop?}
    Stop --Yes--> Retn[Return]
    Stop --No--> Sele[Select]
    Sele --> Modi[Modify]
    Modi --> Evl2[Evaluate]
    Evl2 --> Comb[Combine]
    Comb --> Stop

I like to get a bit of linting help by defining the types I’ll be using:

Genome = list[float]
Fitness = namedtuple("Fitness", ["feasibility", "objective"])
Individual = tuple[Genome, Fitness]
Population = list[Individual]

For this demo, a genome is just a list of real-valued numbers. Finding a good representation is an important part that I am glossing over here.

And also some utility function:

def clamp(lo: float, hi: float, value: float) -> float: ...
def scale(from_lo: float, from_hi: float, to_lo: float, to_hi: float, value: float) -> float: ...
def statistics(pop: Population) -> tuple[Fitness, Fitness, Fitness]: ...

Now the core of our algorithm’s implementation:

def generate_genome() -> Genome: ...
def simulate(...) -> Fitness: ...
def fitness(genome: Genome, testing=False) -> tuple[Fitness, dict]: ...

def initialize(size: int) -> Population: ...
def evaluate(pop: Population, manager) -> Population: ...
def stop(pop: Population, *, _best=[Fitness(0, 0)], _counter=[0]) -> bool: ...

def tournament(pop: Population) -> Individual: ...
def select(pop: Population) -> Population: ...

def mutate_gene(gene: float) -> float: ...
def mutate(genome: Genome) -> Genome: ...
def modify(pop: Population) -> Population: ...

def combine(original: Population, children: Population) -> Population: ...

And here is the main loop:

population = initialize(args.population_size)
population = evaluate(population, manager)

for generation in range(args.num_generations):
    if stop(population):
        break

    selected = select(population)
    children = modify(selected)
    children = evaluate(children, manager)
    population = combine(population, children)

A few of my implementation specific details are bleeding through. For example, using the enlighten progress bar manager. You can view the full code in the repository.

Here’s a demo of the evolution script:

Launch

Rarely do we want to run a single “replicate” of an evolutionary optimization process. Instead, we run many with different initial conditions. The script below will launch 10 trials of the evolutionary process:

# Load conda and activate an environment
module load miniconda3
conda activate boxcarv2

POP_SIZE=100
NUM_GENERATIONS=100
NUM_TRIALS=10

slurm_args="--ntasks=1 --nodes=1 --exclusive"
cmd="python wmr_evolution.py"
cmd_args="--population_size $POP_SIZE --num_generations $NUM_GENERATIONS"

set -exuo pipefail

for trial in $(seq 1 $NUM_TRIALS); do
    srun $slurm_args $cmd "trial$trial" $cmd_args --seed $trial &
done
wait

You can find the full script with SLURM options here.

Analyze

Last, let’s take a look at the results.

Saving Data

Generation data can be saved in a dataframe with:

# Before the loop
df_generations = pd.DataFrame(
    columns=[
        "Worst Feasibility",
        "Average Feasibility",
        "Best Feasibility",
        "Worst Objective",
        "Average Objective",
        "Best Objective",
    ]
)

# In the loop
df_generations.loc[generation + 1] = [
    worst.feasibility,
    average.feasibility,
    best.feasibility,
    worst.objective,
    average.objective,
    best.objective,
]

# After the loop
df_generations.to_csv(f"{args.name}-generations.csv", index_label="Generation")

These files can then be loaded with:

files = data_dir.glob("*-generations.csv")

partial_dfs = []

for f in files:
    trial = f.stem.split(replicate_name)[1].split("-")[0]
    df_partial = pd.read_csv(f)
    df_partial["Trial"] = trial

    partial_dfs.append(df_partial)

df = pd.concat(partial_dfs, ignore_index=True)

Visualizations

I like to start with the behaviors. I have my script output the “best” individual from each replicate experiment. The behavior is output as a JSON log file that I can visualize with Review (a tool I created for this purpose).

Here’s a second example with a slightly different behavior:

Both of these are “better” behaviors than I was able to achieve by hand.

Evolutionary Progress

Figure Figure 1 shows the fitness of the best and average individuals over generations. We can see that the best individual from each replicate improves over time. The shaded regions denote the 95% confidence interval around the average performance.

The average performance (in orange) will always be a bit noisy depending on the exact algorithm. I implemented an overly simple algorithm with a lot of “exploration” and less “exploitation.”

Figure 1: Generations Vs. Fitness

This type of figure can be quickly created from a Pandas dataframe using Seaborn:

sns.lineplot(x="Generation", y="Best Objective", data=df)
sns.lineplot(x="Generation", y="Average Objective", data=df)
plt.legend(["Best", "Best-Conf", "Average", "Average-Conf"])
sns.despine()

sns_plot.figure.savefig("generations.svg")

I used Plotly for this website since it creates interactive figures.

Final Fitness Values

From Figure 2, we can see that we mostly achieve our goals. It appears that there is a minimum wheel radius that is able to get over the step, and we cannot instantaneously reach the target position.

Figure 2: Evolved Objectives of Top 5% of Final Populations

Evolved Parameters

Now we turn to the evolved parameters, shown in Figure 3.

Figure 3: Evolved Parameters of Top 5% of Final Populations

This plot only shows the top 5% of individuals across all final replicate populations. I filtered out the lower performing individuals since I have a heavily “exploration”-centered algorithm.

From these values, we can get a good idea of our design constraints. We can also guess that it would be good to increase the maximum possible value for suspension frequency and the sensor limit. Increasing the top speed is probably not a good idea since it will be limited by the hardware more so than the other two parameters.

Finally, in Figure 4, we take a look at the correlations among parameters.

Figure 4: Correlations among Evolved Parameters of Top 5% of Final Populations

It is usually a good idea to see if there are any strong tradeoffs among the parameters. For example, there is a strongly negative correlation between the speed slop and intercept.