# On Methods for Genetic Algorithms
{#gen-alg}
_This memo is based on a literature review I conducted as part of my
bachelor’s thesis, "Positioning Individuals in a Genogram". The greater part
of that research was based on the work by Sivanandam and
Deepa.[^sivanandam-deepa:2008] Unless indicated otherwise, the methods
listed here can be found in their work._
A Genetic Algorithm (GA) operates on the principle of evolutionary theory: a
problem is solved by selecting desirable traits for the next generation of
solutions and allowing a limited number of mutations.
There are no specific requirements for the type of problem that can be solved
with a GA, making it a powerful tool for optimising solutions. A GA evolves one
or more generations of potential solutions through selection, crossover, and
mutation to reach an optimal result, as illustrated in [Figure 1](#gen-alg).
{#gen-sol}
In this memo, I illustrate the workings of various GA methods using a visual
language I developed, as shown in [Figure 1](#gen-alg). In this visual language,
solutions are represented as coloured circles. The colour of a circle
determines the fitness value of the solution and is defined by three properties:
red $\text{R}(X)$, green $\text{G}(X)$, and blue $\text{B}(X)$. For convenience,
these three properties can be noted together
as $\text{RGB}(X) = (\text{R}(X), \text{G}(X), \text{B}(X))$. The values of
these properties all fall within the range $[0, 255]$, as illustrated
in [Figure 2](#gen-sol). The fitness value $f(X)$ reaches its maximum of $1$
when $\text{RGB}(X) = (0, 0, 0)$ and its minimum of $0$
when $\text{RGB}(X) = (255, 255, 255)$. This means that solutions with different
colour values can have the same fitness value, but this does not present an
issue for the examples included here. Formally, the fitness value can be
determined based on the colour value as follows:
$$
f(X) = 1 - \frac{\text{R}(X) + \text{G}(X) + \text{B}(X)}
{3 \times 255} \text{.}
$$
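Expressed as code, this fitness function is straightforward. The following is a
minimal Python sketch (the function name `fitness` is my own choice, not part of
the memo's notation), computing $f(X)$ from an RGB triple:

```python
def fitness(rgb):
    """Fitness of a colour: 1.0 for black (0, 0, 0), 0.0 for white (255, 255, 255)."""
    r, g, b = rgb
    return 1 - (r + g + b) / (3 * 255)
```

For example, `fitness((0, 0, 0))` evaluates to `1.0` and
`fitness((255, 255, 255))` to `0.0`.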
{#gen-col}
In some cases, it is desirable to display the different properties (red, green,
and blue) separately. In such cases, they are placed in this order from top to
bottom and outlined in black. An example of this can be seen
in [Figure 3](#gen-col).
It should be noted that the problem of finding the colour black is not complex
and does not itself warrant a sophisticated solution-finding algorithm such as
a GA. In this memo, it serves purely as a way to make the algorithm's methods
comprehensible.
## Selection
During the selection step of a GA, solutions are chosen based on their fitness
value, after which a new generation of solutions is generated. Several methods
are available for this purpose, each with its own advantages and disadvantages.
{#gen-sel-roulette}
{#gen-sel-rank}
{#gen-sel-stochastic}
### Roulette Wheel Selection
In this method, solutions are selected at random, with the probability of
selection being proportional to the solution’s fitness value. The generation of
solutions is represented as a roulette wheel, where segments vary in size
depending on the fitness value. This method can sometimes lead to premature
convergence, causing the algorithm to get stuck in a local optimum. A visual
example can be seen in [Figure 4](#gen-sel-roulette).
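To make the wheel concrete, here is a short Python sketch of this selection
step, assuming fitness values are non-negative (the function name and the `rng`
parameter are my own):

```python
import random

def roulette_wheel_select(population, fitness, rng=random):
    """Pick one solution with probability proportional to its fitness."""
    weights = [fitness(x) for x in population]
    total = sum(weights)
    pick = rng.uniform(0, total)          # random position on the wheel
    cumulative = 0.0
    for solution, weight in zip(population, weights):
        cumulative += weight
        if pick <= cumulative:
            return solution
    return population[-1]                 # guard against floating-point rounding
```

Because selection probability is proportional to fitness, a solution with
fitness $0$ is never chosen.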
### Rank Selection
Solutions are ranked according to fitness, from lowest to highest. Selection
then takes place via a Roulette Wheel Selection based on the position in the
sorted list of solutions. This method reduces the risk of premature convergence
and ensures more balanced selection pressure. An illustration is included
in [Figure 5](#gen-sel-rank).
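A sketch of this method in Python, where the wheel is built from rank positions
rather than raw fitness values (names are my own; the worst solution gets
weight $1$, the best weight $N$):

```python
import random

def rank_select(population, fitness, rng=random):
    """Roulette wheel over ranks instead of raw fitness values."""
    ranked = sorted(population, key=fitness)   # lowest fitness first
    ranks = range(1, len(ranked) + 1)          # rank 1 .. N
    total = sum(ranks)
    pick = rng.uniform(0, total)
    cumulative = 0.0
    for solution, rank in zip(ranked, ranks):
        cumulative += rank
        if pick <= cumulative:
            return solution
    return ranked[-1]
```

Even a very dominant solution can claim at most $N / \binom{N+1}{2}$ of the
wheel, which is what keeps the selection pressure balanced.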
### Stochastic Universal Sampling
A variant of Roulette Wheel Selection, this method selects solutions using
evenly distributed "pointers." The result is a uniform selection process with
fewer deviations and a higher reliability of convergence. For a selection size
of $N$, the pointers are placed at a distance of $\frac{1}{N}$ from one another.
The first pointer is randomly placed within the range $[0, \frac{1}{N}]$, as
illustrated in [Figure 6](#gen-sel-stochastic).
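The pointer placement can be sketched as follows in Python (function and
parameter names are my own; the step corresponds to $\frac{1}{N}$ of the total
wheel):

```python
import random

def stochastic_universal_sampling(population, fitness, n, rng=random):
    """Select n solutions with n evenly spaced pointers on the fitness wheel."""
    weights = [fitness(x) for x in population]
    total = sum(weights)
    step = total / n                      # pointer spacing: 1/N of the wheel
    start = rng.uniform(0, step)          # first pointer lands in [0, total/n]
    pointers = [start + i * step for i in range(n)]
    selected, cumulative, i = [], weights[0], 0
    for p in pointers:
        while p > cumulative:             # advance to the segment under the pointer
            i += 1
            cumulative += weights[i]
        selected.append(population[i])
    return selected
```

Because the pointers are evenly spaced, a single spin decides the whole
selection, which is where the reduced variance comes from.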
{#gen-sel-tournament}
### Tournament Selection
A subset of the population is randomly selected, and the solution with the best
fitness value is included in the selection. By adjusting the size of the subset,
the selection pressure can be influenced. An example of this selection method is
shown in [Figure 7](#gen-sel-tournament).
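In code, this method is the simplest of the selection schemes; a minimal Python
sketch (the tournament size `k` controls the selection pressure):

```python
import random

def tournament_select(population, fitness, k=2, rng=random):
    """Pick k random contestants and return the fittest among them."""
    contestants = rng.sample(population, k)
    return max(contestants, key=fitness)
```

With `k` equal to the population size, the global best always wins; with
`k=1`, selection becomes uniformly random.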
### Boltzmann Selection
This method is based on "simulated annealing", in which a temperature parameter
is gradually lowered. As the GA progresses, the selection pressure increases,
leaving more room in the early stages to explore solutions outside a local
optimum.
For each generation $g$, solution $X_g$ is selected if its fitness
value $f(X_g)$ is greater than or equal to the highest fitness value from the
previous generation: $f(X_g) \ge \max(f(X_{g - 1}))$. If this is not the case,
there is still a chance $P$ that $X_g$ will be selected:
$$
P = \exp\left[
\frac{f(X_g) - \max(f(X_{g - 1}))}
{T_0(1 - \alpha)^{1 + \frac{100g}
{G}}}
\right] \text{.}
$$
$G$ is the maximum value of $g$, i.e., the maximum number of generations. By
setting the parameter $\alpha$ within the range $[0, 1]$ and $T_0$ within the
range $[5, 100]$, this selection method can be configured. To prevent losing a
local optimum that may later turn out to be a global optimum, this method can be
combined with "[elitism](#elitism)," where the best results are preserved until
the end of the GA.
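The acceptance rule above can be sketched in Python. In this sketch (all names
are my own) the denominator of the formula is treated as the temperature
$T = T_0(1 - \alpha)^{1 + 100g/G}$, which shrinks as the generation counter $g$
approaches $G$:

```python
import math
import random

def boltzmann_accept(f_current, f_prev_best, g, G, T0=50.0, alpha=0.5, rng=random):
    """Decide whether a solution with fitness f_current survives generation g of G.

    A solution at least as fit as the previous generation's best is always kept;
    otherwise it is kept with probability P = exp[(f_current - f_prev_best) / T].
    """
    if f_current >= f_prev_best:
        return True
    T = T0 * (1 - alpha) ** (1 + 100 * g / G)   # temperature drops as g grows
    P = math.exp((f_current - f_prev_best) / T)
    return rng.random() < P
```

Early on ($g \approx 0$) the temperature is high and even clearly worse
solutions are frequently accepted; near $g = G$ the probability collapses
towards zero.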
## Crossover
Crossover combines the characteristics of two different solutions to generate
new solutions. Each crossover method affects the diversity and completeness of
the search space differently.
{#gen-cro-single}
{#gen-cro-multi}
### Single Point Crossover
In this method, both solutions are split at the same point into two parts.
Offspring are created by combining one part from one parent with the other part
from the other parent. Although this method is simple to implement, it
significantly limits the exploration of the search space.[^owais-snasel:2014]
An illustration of this method is shown in [Figure 8](#gen-cro-single).
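A minimal Python sketch of this operator, assuming solutions are sequences of
properties (the split point is chosen so that neither child is a pure copy of a
parent):

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Split both parents at one random point and swap the tails."""
    point = rng.randint(1, len(parent_a) - 1)   # never split at the ends
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2
```

Every property of each child comes from one of the two parents at the same
position, which is exactly why the explored search space stays limited.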
### Multi-Point Crossover
This method is similar to Single Point Crossover but differs in that the
solutions are split at multiple points, as seen in [Figure 9](#gen-cro-multi).
Offspring are created by alternating parts from both parents. With more
splitting points, a larger portion of the search space is explored.
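Generalising the previous sketch to multiple splitting points, the segments
between consecutive points are alternately kept and swapped (names and the
`n_points` default are my own):

```python
import random

def multi_point_crossover(parent_a, parent_b, n_points=2, rng=random):
    """Split both parents at n_points random points and alternate the segments."""
    points = sorted(rng.sample(range(1, len(parent_a)), n_points))
    child_1, child_2 = list(parent_a), list(parent_b)
    swap, prev = False, 0
    for point in points + [len(parent_a)]:
        if swap:  # exchange this segment between the two children
            child_1[prev:point], child_2[prev:point] = \
                child_2[prev:point], child_1[prev:point]
        swap = not swap
        prev = point
    return tuple(child_1), tuple(child_2)
```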
{#gen-cro-three}
### Uniform Crossover
In Uniform Crossover, the crossover is performed based on a binary mask that is
the same size as the number of properties in the solutions. Depending on the
binary values in the mask, the property is inherited from one parent ($0$) or
the other ($1$). This method introduces a high degree of randomness but also
increases the likelihood of suboptimal solutions.
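A Python sketch of this operator, drawing a fresh random mask per crossover
(producing one child here; the mirrored child follows from inverting the mask):

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """Inherit each property from parent_a (mask bit 0) or parent_b (mask bit 1)."""
    mask = [rng.randint(0, 1) for _ in parent_a]
    return tuple(b if bit else a
                 for a, b, bit in zip(parent_a, parent_b, mask))
```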
### Three Parent Crossover
In this method, the properties of two solutions are compared and inherited by
the offspring if they are identical. If the values differ, the property is
inherited from a third parent. This prevents the creation of suboptimal
combinations of properties. An illustration of this method is provided
in [Figure 10](#gen-cro-three).
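In code, this comparison is a simple element-wise rule (a sketch, with my own
function name):

```python
def three_parent_crossover(parent_a, parent_b, parent_c):
    """Inherit a property where the first two parents agree; otherwise take it from the third."""
    return tuple(a if a == b else c
                 for a, b, c in zip(parent_a, parent_b, parent_c))
```

For instance, `three_parent_crossover((0, 1, 0), (0, 0, 0), (1, 1, 1))`
returns `(0, 1, 0)`: positions where the first two parents agree are kept, the
disputed middle position comes from the third parent.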
### Reduced Surrogate Crossover
Splitting points are placed only where the two parent solutions differ from
each other. This guarantees that, whenever the parents differ, crossover
produces offspring that are new solutions rather than copies of the parents.
### Shuffle
Before applying Single Point Crossover, the properties of the solutions are
randomly shuffled. This adds an extra layer of randomness, ensuring that a
larger part of the search space is explored, but it also increases the
likelihood of suboptimal results. See [Figure 11](#gen-cro-shuffle) for an
illustration.
{#gen-cro-shuffle}
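One way to sketch this in Python is to shuffle both parents with the same
permutation, apply a single point split, and then undo the shuffle so each
property returns to its original position (names and this un-shuffling detail
are my own reading of the method):

```python
import random

def shuffle_crossover(parent_a, parent_b, rng=random):
    """Shuffle both parents identically, apply single point crossover, unshuffle."""
    order = list(range(len(parent_a)))
    rng.shuffle(order)
    a = [parent_a[i] for i in order]
    b = [parent_b[i] for i in order]
    point = rng.randint(1, len(a) - 1)
    child = a[:point] + b[point:]
    # undo the shuffle so the properties return to their original positions
    result = [None] * len(child)
    for pos, i in enumerate(order):
        result[i] = child[pos]
    return tuple(result)
```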
## Mutation
Mutation involves making random changes to the new generation of solutions. This
increases diversity and prevents the GA from getting stuck in a local optimum.
Depending on the problem being solved, various methods are available.
### Flipping
This method is only applicable to binary-type genes, where the value of random
genes in the chromosome is "flipped", i.e., a 0 becomes a 1 and vice versa.
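A sketch in Python, where each gene is flipped independently with a mutation
probability `rate` (the parameter name and default are my own):

```python
import random

def flip_mutation(chromosome, rate=0.1, rng=random):
    """Flip each binary gene (0 <-> 1) independently with the given probability."""
    return tuple(1 - gene if rng.random() < rate else gene
                 for gene in chromosome)
```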
### Interchanging
Two properties of a solution are randomly selected, and their values are
swapped. This method is particularly effective in situations where the order of
elements is important, such as in task prioritisation or route optimisation.
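Sketched in Python, the swap preserves the multiset of values, which is what
makes it suitable for ordering problems:

```python
import random

def interchange_mutation(chromosome, rng=random):
    """Swap the values of two randomly chosen positions."""
    i, j = rng.sample(range(len(chromosome)), 2)   # two distinct positions
    mutated = list(chromosome)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return tuple(mutated)
```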
### Reversing
The order of the properties in a solution is completely reversed, altering the
solution while preserving all existing values. This method is effective for
problems like the "Travelling Salesman".
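In Python this reduces to a one-liner, again preserving every value while
changing the solution:

```python
def reverse_mutation(chromosome):
    """Reverse the order of all properties while keeping every value."""
    return tuple(reversed(chromosome))
```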
### Specialised Mutation (Owais-Snasel Method)
This method, developed by Owais and Snasel, is used for optimising a
lattice.[^owais-snasel:2014] The $x$-coordinate of nodes in the lattice is
mutated, but this can also be extended to the $y$-coordinate. It is therefore
highly suitable for coordinate-based problems.
## Replacement
After the offspring of the selected solutions have been generated and mutated,
they must replace solutions from the previous generation. Several approaches are
available for this, depending on the desired balance between innovation and the
preservation of good traits.
### Generational Replacement
The entire parent generation is replaced by the offspring generated from them.
This ensures complete renewal with each iteration of the GA, resulting in a high
degree of genetic diversity. However, there is a risk of losing good solutions.
### Steady-State Replacement
Replacement is based on the fitness values of the parent generation. Only poorly
performing solutions are replaced by offspring. This allows good traits to be
preserved for longer, leading to a more gradual evolution. By adjusting the
number of solutions to be replaced and the threshold for replacement, this
method can be configured.
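A minimal Python sketch of this replacement step, where the `n_replace` worst
parents make way for the `n_replace` best offspring (a simplified variant of
the configurable scheme described above; names are my own):

```python
def steady_state_replace(population, offspring, fitness, n_replace):
    """Replace the n_replace worst solutions by the n_replace best offspring."""
    survivors = sorted(population, key=fitness, reverse=True)[:len(population) - n_replace]
    newcomers = sorted(offspring, key=fitness, reverse=True)[:n_replace]
    return survivors + newcomers
```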
### Elitism
The solutions with the highest fitness values are retained. At the end of the
GA, these can be presented as the final result if the last generation does not
contain better solutions. This ensures that good solutions are never lost, and
it encourages risk-taking during crossover, mutation, and replacement.
### Random Immigrants
During replacement, not only are offspring from the previous generation added,
but also new, randomly generated solutions. This introduces new genetic variants
that may not have arisen from the existing genetic material.
## Termination
To terminate the GA, one or more stopping criteria must be met. The criteria
used depend on the application for which the algorithm is being used. Below are
some common stopping criteria.
1. **Maximum Generations**
The algorithm stops after a predetermined number of generations.
2. **Sufficient Solutions**
The algorithm stops when a solution that meets a predetermined minimum
fitness value is found.
3. **Convergence**
The algorithm stops when the population converges, i.e., when diversity among
the solutions decreases and individuals start to resemble each other.
4. **No Improvement**
The algorithm stops when no significant improvements are observed in the best
solution for a certain number of generations. This can occur when the best
fitness value remains stable or oscillates within certain bounds.
*[GA]: Genetic Algorithm
[^sivanandam-deepa:2008]: S.N. Sivanandam & S.N. Deepa. "Introduction to Genetic Algorithms". 2008. ISBN: 978-3-642-09224-4. DOI: [10.1007/978-3-540-73190-0](https://doi.org/10.1007/978-3-540-73190-0).
[^owais-snasel:2014]: S. Owais & V. Snasel. "Usage of Genetic Algorithm for Lattice Drawing". 2005.