Overview

A genetic algorithm is the most popular type of a larger family of AI algorithms called evolutionary algorithms. Evolutionary algorithms are heuristic AI algorithms that can be used for search and/or optimisation problems. Such algorithms are inspired by Darwinian theories of evolution, natural selection and survival of the fittest. In a genetic algorithm, possible solutions are represented as chromosomes, which ultimately stand for arrays of genes. A gene, from what little I understand of biology and what holds true in computer programming, is the most basic representation of information. A gene can be represented either as its binary string or as its face value (letters, characters, numbers etc). A population is an array of chromosomes. These are the basic data structures needed to code a genetic algorithm.

A genetic algorithm starts by initialising a population with completely random chromosomes. Each chromosome has a fitness value, calculated by a heuristic function (which is why genetic algorithms are considered meta-heuristic algorithms). Chromosomes with better fitness values are deemed to be fitter than chromosomes with poorer fitness values, and in line with the survival of the fittest ideology, are more likely to be selected. It is important to note that “better” need not be the same as “greater”. The selection of fitter chromosomes is known as elitism.

After the population is initialised, the fitness values of the chromosomes are calculated and the fittest chromosomes of the population are selected to populate, and to move on to the next generation. The next section explains how this happens.

Converging to a Solution

A genetic algorithm solves a problem by generating populations of successively fitter and fitter chromosomes. A genetic algorithm produces generations of chromosomes by using biological operators such as mating and mutation. After elitism, the chromosomes selected are made to mate amongst themselves. In this process of “mating”, also called “crossing”, two randomly selected elite parents create two children by interchanging their genes at a certain crossover point. Also, genes of the offspring are liable to undergoing mutation, which means changing the value of the gene to a randomly chosen value. After all parents have undergone a round of mating, the children and their parents are used to populate a generation, and the previously existing population is wiped out by the new generation.

Mating between parent chromosomes.
Mutation of a gene in a chromosome.

A genetic algorithm can be made to terminate in several ways: for instance, when the known global optimum solution has been reached, a time limit has been reached, a fixed number of generations have been generated, a combination of custom termination criteria has been reached etc. After termination, the fittest chromosome is returned as the optimal solution.

Coding a Primitive Genetic Optimisation Algorithm in Python

This problem is from http://www.math.brown.edu/~dbejleri/teaching/math0090sp18/optimization.pdf

The problem basically states: Given a rectangle of perimeter p, find its dimensions such that the area of the rectangle so formed is maximised. So, we shall code a genetic optimisation algorithm to find its dimensions.

The genetic algorithm to solve this problem is simple enough to code. However, I use the word primitive to describe this particular GA since it often does not return the global optimal value and gets stuck in local optima. I shall elaborate on this in a while. However, since the goal of this post is simply to provide basic knowledge of genetic algorithms for optimisation, I shall not include implementations of techniques to make a more accurate GA.

Coming back to our problem, the first thing we need to do is create a heuristic function to calculate the fitness of a chromosome. We know that the formula for perimeter of a rectangle is 2x + 2y, therefore, the value of y becomes (p/2) – x. The fitness of a chromosome can simply be the area of the rectangle formed by the chromosome’s dimensions. In this case, the fitness and area are directly proportional. The area of a rectangle is given by x*y. Therefore, the final heuristic function, h(x) is: h(x) = x * [(p/2) – x] . In this problem, we need to optimise the value of x.

After determining the heuristic function, we can define a class for a chromosome, which, upon initialisation, takes a parameter x, the length of a rectangle. It has functions to return its binary value and to calculate its fitness.

# program to demonstrate working of a simple genetic algorithm
import random


# class for chromosome
class Chromosome:
    def __init__(self, _x):
        self.x = _x
        self.fitness = 0
        self.binary = self.get_binary()

    def calc_fitness(self, per):
        self.fitness = self.x * (per / 2 - self.x)

    def get_binary(self):
        binary_str = "{0:08b}".format(self.x)
        binary = [int(i) for i in binary_str]
        return binary

Next, we create a class for the main optimiser, in which our genetic algorithm shall be coded. It takes in four parameters – the perimeter, size of the initial population, mutation rate and number of generations it has to populate. It has 3 functions. The first is used to populate the random 0th, or original, generation. The second is where new generations are created. This function first selects the 16 fittest individuals in the population and places them in an array of elites. It then randomly picks a pair of parents from the array of elites and pops them from the list. Then, it creates a pair of offspring by crossing the genes of the selected parents at a random crossover point (3rd, 4th or 5th index). It adds these offspring to a new generation array. This goes on until the list of elites is empty. Also, the genes of the offspring may be mutated according to the mutation rate. The elites are added to the new generation and the existing population is replaced by the new generation. The third function runs the previous function as many times as the specified number of generations. Note that, every time a generation is made, the fitness of its chromosomes are calculated and it is sorted in descending order according to their fitness values.

#the actual algorithm
class GeneticOptimizerAlgorithm:
    def __init__(self, _perimeter, _popu_size, _mutation, _generations):
        self.perimeter = _perimeter
        self.population_size = _popu_size
        self.mutation_rate = _mutation
        self.generations = _generations
        self.numbers = [n for n in range(0, 256)]
        self.population = self.initial_population()

    # generates the starting population
    def initial_population(self):
        init_population = []
        high = self.population_size + 1
        for _ in range(0, high):
            choice = random.randint(0, len(self.numbers) - 1)
            chromosome = Chromosome(self.numbers.pop(choice))
            init_population.append(chromosome)
        for chromo in init_population:
            chromo.calc_fitness(self.perimeter)
        fitness = lambda c: c.fitness
        init_population.sort(reverse=True, key=fitness)
        return init_population

    # generates a new population
    def select_cross_mutate(self):
        self.population_size = len(self.population)
        # SELECTION
        mating_pool = [self.population[i] for i in range(0, 16)]
        elite = mating_pool.copy()
        new_gen = []
        while len(mating_pool) > 0:
            p1 = mating_pool.pop(random.randint(0, len(mating_pool) - 1))
            p2 = mating_pool.pop(random.randint(0, len(mating_pool) - 1))
            crosspoints = [3, 4, 5]
            cross_point = random.choice(crosspoints)
            # CROSSING
            c1_bin, c2_bin = "", ""
            for i in range(0, cross_point):
                c1_bin = c1_bin + str(p1.binary[i])
                c2_bin = c2_bin + str(p2.binary[i])
            for i in range(cross_point, 8):
                c1_bin = c1_bin + str(p2.binary[i])
                c2_bin = c2_bin + str(p1.binary[i])
            # MUTATION
            if random.random() < self.mutation_rate:
                pos = random.randint(0, 7)
                if c1_bin[pos] == "0":
                    c1_bin[pos].replace("0", "1")
                else:
                    c1_bin[pos].replace("1", "0")
            if random.random() < self.mutation_rate:
                pos = random.randint(0, 7)
                if c2_bin[pos] == "0":
                    c2_bin[pos].replace("0", "1")
                else:
                    c2_bin[pos].replace("1", "0")
            c1 = Chromosome(int(c1_bin, 2))
            c2 = Chromosome(int(c2_bin, 2))
            new_gen.append(c1)
            new_gen.append(c2)
        # REPLACING UNFIT OLDER CHROMOSOMES WITH NEWLY CREATED CHROMOSOMES
        new_gen.extend(elite)
        for chromo in new_gen:
            chromo.calc_fitness(self.perimeter)
        fitness = lambda c: c.fitness
        new_gen.sort(reverse=True, key=fitness)
        return new_gen

    # "training" the algorithm
    def run(self):
        gen = self.generations
        print("Original population: ")
        for chromo in self.population:
            print(chromo.x, end=" ")
        print()
        print()
        while self.generations > 0:
            self.population = self.select_cross_mutate()
            fitness = lambda c: c.fitness
            self.population.sort(reverse=True, key=fitness)
            print("Generation ", gen - self.generations + 1)
            for chromo in self.population:
                print(chromo.x, end=" ")
            print()
            print()
            self.generations -= 1


if __name__ == "__main__":
    perimeter = int(input("What is the perimeter of the rectangle? "))
    while perimeter > 1021:
        perimeter = int(input("Perimeter must be b/w 0 and 1021: "))
    print()
    ga = GeneticOptimizerAlgorithm(perimeter, 32, 0.3, 8)
    ga.run()
    sol = ga.population[0].x
    print("Length:", sol, ", Breadth:", perimeter / 2 - sol)
    print("The area of a rectangle with the above dimensions is: ", sol * (perimeter / 2 - sol))

In the end, we print the value of the first (fittest) chromosome in the final population. The output obtained upon running the above code is as follows:

Clearly, in some cases, though it provides good solutions, they may not be the best possible solutions (Case 3). So, what are some ways to optimise the algorithm?

  • Promotion of diversity: Higher mutation rates, larger population size, only storing unique chromosomes can lead to a larger diversity in generations.
  • Uniform crossover: Each gene has equal chance of coming from either of the parents, instead of a fixed crossover point.
  • Using a zero-value heuristic: Making the heuristic formula one that equals 0, i.e., the equation should equal zero, can be beneficial since then, it is possible to tell the program to stop when the value of a chromosome returns a fitness of 0.
  • Setting maximum number of generations

Uses of Genetic Algorithms

Search & Optimisation Problems:

  • Solving algebraic equations
  • Scheduling
  • Engineering
  • Bioinformatics
  • Natural Language Processing
  • Economics
  • Game theory
  • Blockchain
  • Code breaking
  • Artificial creativity, and many more

Machine Learning:

  • Natural Language Processing
  • Clustering
  • Feature selection for tree based learning models
  • Recurrent Neural Networks
  • Neural Network training as a replacement for gradient descent, and neural network construction

The github link for the above program is:

https://github.com/adityapentyala/Python/blob/master/gen_alg.py

Visits: 331

One Reply to “Introduction to Genetic Algorithms”

Leave a Reply

Your email address will not be published. Required fields are marked *