## Overview

A genetic algorithm is the most popular member of a larger family of AI algorithms called *evolutionary algorithms*. Evolutionary algorithms are heuristic AI algorithms that can be used for search and/or optimisation problems. Such algorithms are inspired by the Darwinian theory of evolution, natural selection and survival of the fittest. In a genetic algorithm, possible solutions are represented as *chromosomes*, which are ultimately arrays of *genes*. A *gene*, from what little I understand of biology and what holds true in computer programming, is the most basic unit of information. A gene can be represented either in binary or by its face value (letters, characters, numbers, etc.). A *population* is an array of chromosomes. These are the basic data structures needed to code a genetic algorithm.
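As a minimal sketch of these data structures (the names and the 8-bit encoding here are my own illustrative choices, not a fixed convention), a gene can be a single bit, a chromosome a list of genes, and a population a list of chromosomes:

```
import random

# a chromosome: a list of genes (bits) encoding the face value 12
chromosome = [int(bit) for bit in "{0:08b}".format(12)]

# a population: a list of random chromosomes
population = [[random.randint(0, 1) for _ in range(8)] for _ in range(10)]

# the face value can be recovered from the genes
value = int("".join(str(g) for g in chromosome), 2)
```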

A genetic algorithm starts by initialising a population with completely random chromosomes. Each chromosome has a *fitness value*, calculated by a heuristic function (which is why genetic algorithms are considered meta-heuristic algorithms). Chromosomes with better fitness values are deemed fitter than chromosomes with poorer fitness values and, in line with the survival-of-the-fittest ideology, are more likely to be selected. *It is important to note that “better” need not be the same as “greater”.* The practice of letting the fittest chromosomes survive into the next generation is known as *elitism*.

After the population is initialised, the fitness values of the chromosomes are calculated, and the fittest chromosomes are selected both to produce offspring and to move on to the next generation. The next section explains how this happens.
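The selection step described above can be sketched in a few lines. This is a toy example (the chromosomes, fitness function and elite fraction are all made up for illustration), but it shows the core idea of ranking by fitness and keeping the fittest:

```
# toy chromosomes: plain integers, for simplicity
population = [3, 9, 1, 7, 5, 8]

# toy fitness function (assumed for this sketch)
fitness = lambda c: c * (20 - c)

# rank by fitness, then keep the fittest half as elites
ranked = sorted(population, key=fitness, reverse=True)
elites = ranked[:len(ranked) // 2]
```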

## Converging to a Solution

A genetic algorithm solves a problem by generating populations of successively fitter chromosomes. It produces new generations using biological operators such as mating and mutation. After elitism, the selected chromosomes are made to mate amongst themselves. In this process of “mating”, also called “crossing”, two randomly selected elite parents create two children by interchanging their genes at a certain *crossover point*. The genes of the offspring are also liable to undergo *mutation*, which means changing the value of a gene to a randomly chosen value. After all parents have undergone a round of mating, the children and their parents populate a new generation, and the previously existing population is replaced by it.
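Single-point crossover and mutation on bit-list chromosomes can be sketched as follows (the function names are my own; `mutate` here flips each gene independently with the given probability, which is one common variant):

```
import random

def crossover(p1, p2, point):
    # single-point crossover: swap the gene tails at the crossover point
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(genes, rate):
    # each gene flips to its opposite bit with probability `rate`
    return [1 - g if random.random() < rate else g for g in genes]

c1, c2 = crossover([0, 0, 0, 0], [1, 1, 1, 1], 2)
# c1 == [0, 0, 1, 1], c2 == [1, 1, 0, 0]
```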

A genetic algorithm can be made to terminate in several ways: for instance, when the known global optimum has been reached, when a time limit has expired, when a fixed number of generations have been produced, or when some combination of custom termination criteria has been met. After termination, the fittest chromosome is returned as the optimal solution.
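These termination criteria can be combined in a single check, along these lines (a sketch; the function name, parameters and defaults are hypothetical):

```
import time

def should_stop(generation, best_fitness, start, *,
                max_generations=100, target=None, time_limit=60.0):
    # stop after a fixed number of generations...
    if generation >= max_generations:
        return True
    # ...or when a known optimum has been reached...
    if target is not None and best_fitness >= target:
        return True
    # ...or when the time limit has expired
    return time.time() - start >= time_limit
```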

## Coding a Primitive Genetic Optimisation Algorithm in Python

*This problem is from* http://www.math.brown.edu/~dbejleri/teaching/math0090sp18/optimization.pdf

The problem basically states: given a rectangle of perimeter *p*, find the dimensions that maximise its area. We shall code a genetic optimisation algorithm to find them.

The genetic algorithm to solve this problem is simple enough to code. However, I use the word *primitive* to describe this particular GA since it often does not return the global optimum and gets stuck in local optima. I shall elaborate on this in a while. However, since the goal of this post is simply to provide basic knowledge of genetic algorithms for optimisation, I shall not include implementations of techniques that would make a more accurate GA.

Coming back to our problem, the first thing we need to do is create a heuristic function to calculate the fitness of a chromosome. Since the perimeter of a rectangle is *2x + 2y*, the breadth *y* equals *(p/2) – x*. The fitness of a chromosome can simply be the area of the rectangle formed by the chromosome’s dimensions; in this case, fitness and area are *directly proportional*. The area of a rectangle is *x·y*, so the final heuristic function is *h(x) = x · [(p/2) – x]*. In this problem, we need to optimise the value of *x*.
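As a quick sanity check of the heuristic (not part of the GA itself), we can brute-force *h(x)* over all integer lengths. As calculus would predict, *h(x) = x(p/2 − x)* peaks at *x = p/4*, i.e. when the rectangle is a square:

```
# brute-force check: h(x) = x * (p/2 - x) is maximised at x = p/4
p = 100
h = lambda x: x * (p / 2 - x)
best_x = max(range(0, p // 2 + 1), key=h)
# best_x == 25, i.e. a 25 x 25 square with area 625
```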

After determining the heuristic function, we can define a class for a chromosome which, upon initialisation, takes a parameter *x*, the length of the rectangle. It has methods to return its binary representation and to calculate its fitness.

```
# program to demonstrate working of a simple genetic algorithm
import random

# class for chromosome
class Chromosome:
    def __init__(self, _x):
        self.x = _x
        self.fitness = 0
        self.binary = self.get_binary()

    def calc_fitness(self, per):
        # fitness = area = x * (p/2 - x)
        self.fitness = self.x * (per / 2 - self.x)

    def get_binary(self):
        # 8-bit binary representation of x (assumes 0 <= x <= 255)
        binary_str = "{0:08b}".format(self.x)
        binary = [int(i) for i in binary_str]
        return binary
```

Next, we create a class for the main optimiser, in which our genetic algorithm is coded. It takes four parameters: the perimeter, the size of the initial population, the mutation rate, and the number of generations to produce. It has three functions. The first populates the random *0th*, or original, generation. The second creates new generations: it selects the 16 fittest individuals in the population and places them in a list of *elites*, then repeatedly pops a random pair of parents from that list and creates a pair of offspring by crossing the parents’ genes at a random crossover point (the 3rd, 4th or 5th index), adding the offspring to a new generation until the list is empty. With probability equal to the mutation rate, one randomly chosen gene of each offspring is also flipped. Finally, the elites are added to the new generation, which replaces the existing population. The third function runs the second as many times as the specified number of generations. *Note that every time a generation is made, the fitnesses of its chromosomes are calculated and the generation is sorted in descending order of fitness.*

```
# the actual algorithm
class GeneticOptimizerAlgorithm:
    def __init__(self, _perimeter, _popu_size, _mutation, _generations):
        self.perimeter = _perimeter
        self.population_size = _popu_size
        self.mutation_rate = _mutation
        self.generations = _generations
        self.numbers = [n for n in range(0, 256)]
        self.population = self.initial_population()

    # generates the starting population
    def initial_population(self):
        init_population = []
        for _ in range(self.population_size):
            # pop from the pool so every starting chromosome is unique
            choice = random.randint(0, len(self.numbers) - 1)
            chromosome = Chromosome(self.numbers.pop(choice))
            init_population.append(chromosome)
        for chromo in init_population:
            chromo.calc_fitness(self.perimeter)
        fitness = lambda c: c.fitness
        init_population.sort(reverse=True, key=fitness)
        return init_population

    # generates a new population
    def select_cross_mutate(self):
        self.population_size = len(self.population)
        # SELECTION
        mating_pool = [self.population[i] for i in range(0, 16)]
        elite = mating_pool.copy()
        new_gen = []
        while len(mating_pool) > 0:
            p1 = mating_pool.pop(random.randint(0, len(mating_pool) - 1))
            p2 = mating_pool.pop(random.randint(0, len(mating_pool) - 1))
            cross_point = random.choice([3, 4, 5])
            # CROSSING
            c1_bin, c2_bin = "", ""
            for i in range(0, cross_point):
                c1_bin = c1_bin + str(p1.binary[i])
                c2_bin = c2_bin + str(p2.binary[i])
            for i in range(cross_point, 8):
                c1_bin = c1_bin + str(p2.binary[i])
                c2_bin = c2_bin + str(p1.binary[i])
            # MUTATION (strings are immutable, so rebuild with the flipped bit)
            if random.random() < self.mutation_rate:
                pos = random.randint(0, 7)
                flipped = "1" if c1_bin[pos] == "0" else "0"
                c1_bin = c1_bin[:pos] + flipped + c1_bin[pos + 1:]
            if random.random() < self.mutation_rate:
                pos = random.randint(0, 7)
                flipped = "1" if c2_bin[pos] == "0" else "0"
                c2_bin = c2_bin[:pos] + flipped + c2_bin[pos + 1:]
            c1 = Chromosome(int(c1_bin, 2))
            c2 = Chromosome(int(c2_bin, 2))
            new_gen.append(c1)
            new_gen.append(c2)
        # REPLACING UNFIT OLDER CHROMOSOMES WITH NEWLY CREATED CHROMOSOMES
        new_gen.extend(elite)
        for chromo in new_gen:
            chromo.calc_fitness(self.perimeter)
        fitness = lambda c: c.fitness
        new_gen.sort(reverse=True, key=fitness)
        return new_gen

    # "training" the algorithm
    def run(self):
        gen = self.generations
        print("Original population: ")
        for chromo in self.population:
            print(chromo.x, end=" ")
        print()
        print()
        while self.generations > 0:
            self.population = self.select_cross_mutate()
            print("Generation", gen - self.generations + 1)
            for chromo in self.population:
                print(chromo.x, end=" ")
            print()
            print()
            self.generations -= 1


if __name__ == "__main__":
    perimeter = int(input("What is the perimeter of the rectangle? "))
    while perimeter <= 0 or perimeter > 1021:
        perimeter = int(input("Perimeter must be b/w 0 and 1021: "))
    print()
    ga = GeneticOptimizerAlgorithm(perimeter, 32, 0.3, 8)
    ga.run()
    sol = ga.population[0].x
    print("Length:", sol, ", Breadth:", perimeter / 2 - sol)
    print("The area of a rectangle with the above dimensions is:",
          sol * (perimeter / 2 - sol))
```

In the end, we print the value of the first (fittest) chromosome in the final population.

Clearly, in some runs, though the algorithm finds *good* solutions, they may not be the *best* possible solutions; it can get stuck in a local optimum. So, what are some ways to improve the algorithm?

- Promotion of diversity: *higher mutation rates, a larger population size, and storing only unique chromosomes can lead to greater diversity across generations.*
- Uniform crossover: *each gene has an equal chance of coming from either parent, instead of using a fixed crossover point.*
- Using a zero-value heuristic: *rewriting the heuristic so that its optimum equals 0 makes it possible to stop the program as soon as a chromosome’s fitness reaches 0.*
- Setting a maximum number of generations
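Of these, uniform crossover is the easiest to illustrate. A possible sketch (the function name is my own; this is one common formulation, not the only one):

```
import random

def uniform_crossover(p1, p2):
    # each gene comes from either parent with equal probability;
    # the second child always receives the gene the first did not
    child1, child2 = [], []
    for g1, g2 in zip(p1, p2):
        if random.random() < 0.5:
            child1.append(g1)
            child2.append(g2)
        else:
            child1.append(g2)
            child2.append(g1)
    return child1, child2
```

Unlike single-point crossover, this lets any subset of genes recombine, which tends to mix genetic material more thoroughly.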

## Uses of Genetic Algorithms

#### Search & Optimisation Problems:

- Solving algebraic equations
- Scheduling
- Engineering
- Bioinformatics
- Natural Language Processing
- Economics
- Game theory
- Blockchain
- Code breaking
- Artificial creativity, and many more

#### Machine Learning:

- Natural Language Processing
- Clustering
- Feature selection for tree based learning models
- Recurrent Neural Networks
- Neural Network training as a replacement for gradient descent, and neural network construction

The GitHub link for the above program is:

https://github.com/adityapentyala/Python/blob/master/gen_alg.py
