*NOTE: This post assumes that the reader is familiar with basic data preparation techniques for machine learning. It is also recommended that readers familiarise themselves with the concepts covered in the posts below by Saarang and me.*

## Neuroevolution: An Overview

As explained by Saarang in his previous post, a neural network is a machine learning model inspired by the way interconnected neurons work in the human brain. Neural networks are among the most popular machine learning frameworks for image classification. In this post, we will use an *evolutionary* neural network, a.k.a. a neuroevolutionary network, to classify satellite images based on whether or not they contain a plane.

A neuroevolutionary network is a type of neural network whose hyperparameters are made to *evolve* during the training phase to get the best possible result. This evolution is simulated using some form of evolutionary algorithm. *Hyperparameters* are parameters according to which the neural network is constructed – for example, the number of layers, the number of nodes per layer, the optimisation algorithm used, and so on. (Strictly speaking, the weights between nodes are trainable parameters rather than hyperparameters, but neuroevolution can evolve them in just the same way.) In a neuroevolutionary network, one or more of these values are determined using an evolutionary algorithm. In this post, we will tune the weights of a neural network using a genetic algorithm.
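As a quick refresher on how a genetic algorithm works, here is a minimal toy sketch of the select/cross/mutate loop. It evolves a bit string whose fitness is simply its number of 1s; this is an illustrative example only, not the network code we build later.

```
import random

def fitness(chromo):
    # Toy fitness: count of 1-bits, maximised by the all-ones string
    return sum(chromo)

def evolve(pop_size=20, length=12, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]              # selection: keep the fittest half
        children = []
        for i in range(0, len(parents) - 1, 2):
            cut = rng.randrange(1, length)         # single-point crossover
            children.append(parents[i][:cut] + parents[i + 1][cut:])
            children.append(parents[i + 1][:cut] + parents[i][cut:])
        for child in children:                     # mutation: occasionally flip a bit
            if rng.random() < 0.3:
                child[rng.randrange(length)] ^= 1
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))
```

Because the parents (including the current best) always survive into the next population, the best fitness never decreases from one generation to the next.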

## Data Preparation

*The dataset is from: https://www.kaggle.com/rhammell/planesnet/data*

**ABOUT THE DATASET**: The planesnet dataset is a dataset of satellite images that, intuitively, either contain a plane or don't. The images are full-colour, 20×20 pixels in size. There are 2 target classes – "plane" (1) and "no-plane" (0) – in the ratio 1:3. The plane class consists of full images of planes, i.e., images in which a plane is completely discernible. The no-plane class consists of 3 types of images: completely random photographs, pictures of partial planes, and "confusers" – images that resemble planes but are actually not.

This post will not explain how to prepare the data in detail, but the code to do so is similar to the data preparation in Saarang's diabetes classification post.

First, download the dataset and copy the folder's path. Then use that path as *IMAGE_PATH* in the code below.

*NOTE: Reading all the images from disk takes over a minute each time. So, it is better to save the loaded data as a .npy file, which can be read back much faster later on. The .npy file is also available for download from the link at the end.*

```
import glob
import numpy as np
import os.path as path
import imageio
import time
import matplotlib.pyplot as plot
import random
import copy

starttime = time.time()
IMAGE_PATH = '/Users/sureshp/Downloads/planesnet/planesnet/planesnet'
file_paths = glob.glob(path.join(IMAGE_PATH, '*.png'))
# Read every image into a single (n_images, 20, 20, 3) array.
# The loop variable is named file_path so it doesn't shadow the os.path module.
images = [imageio.imread(file_path) for file_path in file_paths]
images = np.asarray(images)
currenttime = time.time()
print("Image aspects are: ", images.shape)
print("Time taken: ", currenttime - starttime)
# Scale pixel values to [0, 1] and save for faster loading later
images = np.around((images / 255), 3)
np.save("dataRGB", images)
```

After running the above code once, you can comment out everything from the *starttime* line to the *np.save* statement. The code that you'll need to run every time the program starts, which includes the main data preparation, is given below.

```
images = np.load("dataRGB.npy")
IMAGE_PATH = '/Users/sureshp/Downloads/planesnet/planesnet/planesnet'
file_paths = glob.glob(path.join(IMAGE_PATH, '*.png'))
n_images = images.shape[0]

# The first character of each filename is its label: "1" (plane) or "0" (no plane)
labels = np.zeros(n_images)
for im in range(n_images):
    filename = path.basename(file_paths[im])
    labels[im] = int(filename[0])

TRAIN_TEST_SPLIT = 0.9
split_index = int(TRAIN_TEST_SPLIT * n_images)
shuffled_indices = np.random.permutation(n_images)
train_indices = shuffled_indices[0:split_index]
test_indices = shuffled_indices[split_index:]
X_train_raw = images[train_indices, :, :]
y_train_raw = labels[train_indices]
# print(X_train_raw.shape, y_train_raw.shape)
X_test = images[test_indices, :, :]
y_test = labels[test_indices]

# Here, we "balance" the dataset by deleting a portion of the surplus negative
# images from the training set. Otherwise, given that the +ve to -ve ratio is
# about 1:3, the classifier might be biased towards -ve images.
negatives = np.where(y_train_raw == 0)[0]
positives = np.where(y_train_raw == 1)[0]
indices = negatives[:int((negatives.shape[0] - positives.shape[0]) * 0.6)]
# print(positives.shape, negatives.shape, indices.shape)
X_train = np.delete(X_train_raw, indices, axis=0)
y_train = np.delete(y_train_raw, indices)
```

To visualise a few images, run the following code.

```
def visualize_data(positive_images, negative_images):
    figure = plot.figure()
    count = 0
    for i in range(positive_images.shape[0]):
        count += 1
        # Top row: positive ("plane") examples
        figure.add_subplot(2, positive_images.shape[0], count)
        plot.imshow(positive_images[i, :, :])
        plot.axis('off')
        plot.title("1")
        # Bottom row: negative ("no plane") examples
        figure.add_subplot(2, negative_images.shape[0], negative_images.shape[0] + count)
        plot.imshow(negative_images[i, :, :])
        plot.axis('off')
        plot.title("0")
    plot.show()

N_TO_VISUALIZE = 10
# Select the first N positive examples
positive_example_indices = (y_train == 1)
positive_examples = X_train[positive_example_indices, :, :]
positive_examples = positive_examples[0:N_TO_VISUALIZE, :, :]
# Select the first N negative examples
negative_example_indices = (y_train == 0)
negative_examples = X_train[negative_example_indices, :, :]
negative_examples = negative_examples[0:N_TO_VISUALIZE, :, :]
visualize_data(positive_examples, negative_examples)
```

Finally, we must flatten the arrays.

```
n_pixels = X_train.shape[1] * X_train.shape[2] * X_train.shape[3]
X_train = X_train.reshape((X_train.shape[0], n_pixels)).astype('float32')
X_test = X_test.reshape((X_test.shape[0], n_pixels)).astype('float32')
n_targets = 2
```
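As a quick sanity check, here is a small sketch of what the flattening does, using a dummy all-zeros array in place of the real images:

```
import numpy as np

# Dummy stand-in for X_train: 5 images of 20x20 pixels with 3 colour channels
dummy = np.zeros((5, 20, 20, 3))

# Flatten each image into a single row of 20 * 20 * 3 = 1200 values
n_pixels = dummy.shape[1] * dummy.shape[2] * dummy.shape[3]
flat = dummy.reshape((dummy.shape[0], n_pixels)).astype('float32')

print(flat.shape)  # (5, 1200)
```

Each 20×20×3 image thus becomes a single 1200-value row, which is exactly the shape the input layer of our network expects.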

## Creating a Neural Network From Scratch

The math behind a neural network is fairly simple and mostly involves matrix multiplication. Each layer of the network is represented as a 1-dimensional array of *a* values, i.e., *a* nodes. The weights connecting a layer to the next are represented as a 2-dimensional matrix of shape *(a, b)*, where *a* is the number of nodes in the current layer and *b* is the number of nodes in the next layer.

Forward propagation in a network is done by multiplying a layer by its weights, and adding "biases" (which we will assume to be 0 in this post), to get the next layer. Additionally, an activation function can be applied. In this post, we shall use the *ReLU* and *softmax* activation functions. The formula for calculating a layer is as follows:

*layer[n+1] = f(layer[n] · W[n] + b[n])*, where *f* is the activation function, *W[n]* is the weight matrix between layers *n* and *n+1*, and *b[n]* is the bias (assumed to be 0 here).

In our neural network, the matrices will be represented as numpy arrays. Numpy also has a handy tool to find the dot product of matrices, *numpy.dot()*, which we will use in the forward propagation function. We will use a simple neural network with a single input layer, 3 hidden layers and a single output layer (with 2 nodes – class *plane* and class *no-plane*). The layer sizes are 1200 input nodes (20 × 20 pixels × 3 colour channels), hidden layers of 400, 100 and 10 nodes, and 2 output nodes.
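To make the matrix shapes concrete, here is a minimal sketch of a single forward-propagation step with *numpy.dot()*, using toy layer sizes rather than the full network:

```
import numpy as np

def relu(x):
    return np.maximum(x, 0)

rng = np.random.default_rng(0)
layer = rng.standard_normal(4)         # a layer with a = 4 nodes
weights = rng.standard_normal((4, 3))  # weight matrix of shape (a, b) to a 3-node layer

# next layer = activation(layer . weights); biases assumed to be 0
next_layer = relu(np.dot(layer, weights))
print(next_layer.shape)  # (3,)
```

Multiplying a length-*a* layer by an *(a, b)* weight matrix yields the next layer of length *b*, which is why the weight shapes must chain together through the network.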

So, let’s start by coding the activation functions:

```
def softmax(x):
    # Subtracting the max before exponentiating avoids overflow for large inputs
    # without changing the result
    e = np.exp(x - np.max(x))
    return e / np.sum(e, axis=0)

def relu(x):
    return np.maximum(x, 0)
```

Now, let's code the neural network class. It takes in 4 weight matrices as parameters upon initialisation and has 3 methods – one to forward propagate, one to measure training accuracy, and one to test the network's accuracy on the test set. In forward propagation, we will use the *relu* activation function for all layers except the output layer, for which we use *softmax*. The output is either a 0 (no plane) or a 1 (plane present) – the index of the node in the final layer with the higher value (a probability between 0 and 1).

```
class NeuralNetworkClassifier:
    def __init__(self, w0, w1, w2, w3):
        self.weights0 = w0
        self.weights1 = w1
        self.weights2 = w2
        self.weights3 = w3

    def forward_propagate(self, Xi):
        layer1 = relu(np.dot(Xi, self.weights0))
        layer2 = relu(np.dot(layer1, self.weights1))
        layer3 = relu(np.dot(layer2, self.weights2))
        raw_output = softmax(np.dot(layer3, self.weights3))
        # The prediction is the class whose output node has the higher value
        if raw_output[0] > raw_output[1]:
            output = 0
        else:
            output = 1
        return output

    def measure_accuracy(self, X_train, y):
        true_rate = 0
        total = 0
        preds = []
        for i in range(X_train.shape[0]):
            x = X_train[i]
            prediction = self.forward_propagate(x)
            preds.append(prediction)
            total += 1
            if prediction == y[i]:
                true_rate += 1
        return round((true_rate / total) * 100, 2)

    def test(self, X_test, y_test):
        zeroes = 0
        ones = 0
        correct = 0
        total = 0
        incorrect = []
        for i in range(X_test.shape[0]):
            x = X_test[i]
            prediction = self.forward_propagate(x)
            if prediction == 0:
                zeroes += 1
            else:
                ones += 1
            total += 1
            if prediction == y_test[i]:
                correct += 1
            else:
                incorrect.append((prediction, y_test[i]))
        print(f"The network was tested and got {round((correct / total) * 100, 2)}% of its predictions right.")
        print(f"zeroes:ones = {zeroes}:{ones}")
        print(f"INCORRECT VALUES: {incorrect}")
        print()
```

And that’s all it takes to create a neural network. Neat, huh? The forward propagation method serves as a conventional predict() function. You’ll notice that the network doesn’t have a training method – that’s because we’ll train the model through the genetic algorithm in the next section.

## The Genetic Algorithm – Evolving The Network

Again, this post will not explain the genetic algorithm’s processes in detail except where required.

Each chromosome has a 1-dimensional numpy float array of weights. The number of floats in this array is 521,020 – the total number of weights in the neural network, which follows from the four weight-matrix shapes. The weights array is reshaped to suit the architecture of the neural network, and the resulting sub-arrays (matrices) are used to create a neural network for the chromosome. The fitness of a chromosome is measured by its neural network's *measure_accuracy()* method – a simple correct-over-total percentage.
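If you'd like to check the arithmetic, the weight count follows directly from the four matrix shapes:

```
# Shapes of the four weight matrices in the network
shapes = [(1200, 400), (400, 100), (100, 10), (10, 2)]

# 480000 + 40000 + 1000 + 20
total = sum(a * b for a, b in shapes)
print(total)  # 521020
```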

```
class Chromosome:
    def __init__(self, weights):
        self.weights = weights
        self.fitness = 0
        self.calc_fitness()

    def calc_fitness(self):
        # Carve the flat weight array into the four weight matrices
        weights = self.weights
        self.w0 = weights[0:480000].reshape(1200, 400)
        self.w1 = weights[480000:520000].reshape(400, 100)
        self.w2 = weights[520000:521000].reshape(100, 10)
        self.w3 = weights[521000:521020].reshape(10, 2)
        self.nn = NeuralNetworkClassifier(self.w0, self.w1, self.w2, self.w3)
        true_rate = self.nn.measure_accuracy(X_train, y_train)
        self.fitness = round(true_rate, 2)
```

Now for the genetic algorithm. It's pretty much the same as in my previous post, with a few modifications. For one, I've added a variable called "species": after every full run of generations, most chromosomes are killed off and the population is restarted with fresh random chromosomes, keeping only the best few. Also, instead of a point crossover, we'll use a "uniform crossover", in which each child gene has a one-in-two chance of coming from either parent. This promotes diversity and reduces the chance of identical twins being created. Finally, the algorithm stores the best fitness and the average fitness of the elites in each generation.
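As an aside, uniform crossover over a long weight array can also be done without a per-gene loop, using a random boolean mask; this is just a vectorised sketch of the idea, not the exact code used in the class below:

```
import numpy as np

def uniform_crossover(p1, p2, rng=None):
    """Each position has a 50/50 chance of coming from either parent."""
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(p1.shape[0]) < 0.5
    # Children receive complementary choices at every position
    c1 = np.where(mask, p1, p2)
    c2 = np.where(mask, p2, p1)
    return c1, c2

p1 = np.zeros(10)
p2 = np.ones(10)
c1, c2 = uniform_crossover(p1, p2)
# At every position, one child got the 0 and the other got the 1
print((c1 + c2 == 1).all())  # True
```

The vectorised version is considerably faster for a 521,020-element array, though the explicit loop used later is easier to follow.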

```
class GA_Optimizer:
    def __init__(self):
        self.population_size = 20
        self.mutation_rate = 0.35
        self.generations = 75
        self.species = 1
        self.population = self.init_population()
        self.population.sort(reverse=True, key=lambda c: c.fitness)
        self.accuracies = []
        self.avgaccuracies = []
        print(f"ORIGINAL POPULATION | Best: {self.population[0].fitness}")
        for chromo in self.population:
            print(chromo.fitness, end=" ")
        print()
        print()

    def init_population(self):
        print(f"Initialization begun at {time.ctime(time.time())}")
        population = []
        for _ in range(self.population_size):
            chromosome = Chromosome(np.random.randn(521020))
            population.append(chromosome)
        return population

    def select_cross_mutate(self):
        # select: keep the top 10, but swap 2 of them for random
        # mid-ranked chromosomes to maintain diversity
        elites = [self.population[idx] for idx in range(0, 10)]
        selected = copy.deepcopy(elites)
        new_gen = []
        elites[8] = self.population[random.randint(10, 18)]
        elites[9] = self.population[random.randint(10, 18)]
        # cross: uniform crossover between randomly chosen pairs of parents
        while len(elites) != 0:
            p1 = elites.pop(random.randint(0, len(elites) - 1))
            p2 = elites.pop(random.randint(0, len(elites) - 1))
            c1, c2 = Chromosome(np.zeros(521020)), Chromosome(np.zeros(521020))
            for i in range(c1.weights.shape[0]):
                if random.random() < 0.5:
                    c1.weights[i] = p1.weights[i]
                    c2.weights[i] = p2.weights[i]
                else:
                    c1.weights[i] = p2.weights[i]
                    c2.weights[i] = p1.weights[i]
            new_gen.append(c1)
            new_gen.append(c2)
        # mutate: with some probability, overwrite ~150000 random weights
        for chromo in new_gen:
            if random.random() < self.mutation_rate:
                toReplace = [random.randint(0, 521019) for _ in range(150000)]
                while len(toReplace) > 0:
                    chromo.weights[toReplace.pop(0)] = np.random.random()
        # replace old population with the children plus the original elites
        new_gen.extend(selected)
        self.population = new_gen
        for chromo in self.population:
            chromo.calc_fitness()
        self.population.sort(reverse=True, key=lambda c: c.fitness)

    def train_network(self):
        starttime = time.time()
        print(f"Training started at: {time.ctime(starttime)}")
        species = self.species
        while species > 0:
            print("SPECIES", self.species - species + 1)
            gen = self.generations
            while gen > 0:
                self.select_cross_mutate()
                print("Generation", self.generations - gen + 1, "| BEST:", self.population[0].fitness)
                self.accuracies.append(self.population[0].fitness)
                for chromo in self.population:
                    print(chromo.fitness, end=" ")
                print()
                print()
                # track the average fitness of the top 10 chromosomes
                n = 0
                for c in self.population[:10]:
                    n += c.fitness
                self.avgaccuracies.append(n / 10)
                gen -= 1
            # start the next species: keep the best 4, refill with random chromosomes
            new_species = [self.population[i] for i in range(4)]
            while len(new_species) < self.population_size:
                chromosome = Chromosome(np.random.randn(521020))
                new_species.append(chromosome)
            self.population = new_species
            species -= 1
        self.population.sort(reverse=True, key=lambda c: c.fitness)
        print("Best Accuracy Possible:", self.population[0].fitness)
        print("Time Taken:", time.time() - starttime)
```

## It’s Showtime!

And with that, we've completed the bulk of the code. All that's left is to write the main block, which trains the network, tests the best few evolved networks, and plots a graph to visualise the accuracies over the generations.

```
if __name__ == '__main__':
    ga = GA_Optimizer()
    ga.train_network()
    # testing: evaluate the 7 fittest networks on the held-out test set
    for i in range(0, 7):
        weights = ga.population[i].weights
        w0 = weights[0:480000].reshape(1200, 400)
        w1 = weights[480000:520000].reshape(400, 100)
        w2 = weights[520000:521000].reshape(100, 10)
        w3 = weights[521000:521020].reshape(10, 2)
        nn = NeuralNetworkClassifier(w0, w1, w2, w3)
        print(f"Accuracy: {ga.population[i].fitness}")
        nn.test(X_test, y_test)
    # plot the best and average elite accuracy per generation
    plot.figure()
    generations = [n for n in range(0, ga.generations * ga.species)]
    plot.title("Accuracy over generations")
    plot.plot(generations, ga.accuracies, label="Best accuracy")
    plot.plot(generations, ga.avgaccuracies, label="Average accuracy of elites")
    plot.xlabel("Generation")
    plot.ylabel("Accuracies")
    plot.legend()
    plot.show()
```

*Accuracy-over-generations graphs for the first and second pairs of sample output.*

The GitHub link for this project is:

https://github.com/adityapentyala/Python/tree/master/Neuroevolution

The .npy file is available through the link in the README.md file.
