Genetic algorithms in Scala - solving optimization problems

In this article, I am going to describe the usage of genetic algorithms in Scala. Genetic algorithms are used to solve optimization problems. It is a class of problems that don’t have an obvious solution, and there is no easy way to improve an existing solution.

In general, there is a global maximum, which is the best possible solution and many local maximums, so it is precisely the same situation as training a machine learning model or a neural network.

Genetic algorithms mimic evolution. The potential solution of a problem is described as a genotype in which all genes describe a part of a solution.

The crucial part of the algorithm is its “fitness” function. The value produced by the function tells how well the genotype solves the problem. It is the equivalent of an evaluation metric used in machine learning.

# Knapsack problem

The most known problem solved by optimization algorithms is the Knapsack problem. The details depend on the person who is telling the story, but I usually heard this version.

A thief broke into a house. He has only a small backpack, so he cannot take everything. Therefore the thief must decide which items he should steal. The thief knows the value of items and their weight. The goal of the optimization algorithm is to tell which items should the thief take to maximize the value of the stolen loot, but the maximal capacity of the backpack cannot be exceeded.

# Example

I am going to do something very similar. Let’s imagine that we are playing a computer game (for example Fallout 2, because we like old, good games). We have just won a battle with multiple enemies, and we started looting their corpses. We found many armors, and we want to sell them. Unfortunately, there is no way to take all of them because they are too heavy, so we must decide which armors should we take to maximize the amount we get when we sell them.

# Problem

To solve the problem, I am going to use the Jenetics library with a Scala wrapper called Helisa. In the first step, I have to define the problem space. I have some items on the ground. I can distinguish between the types of items. I know how many of them are there. I can check how much their weight and I know the armor prices.

That information can be represented as a Scala case class and a couple of constants:

``````case class Item(weightLbs: Int, valueDollars: Int, availableItems: Int)

object Items {
//values taken from: //https://fallout.fandom.com/wiki/Fallout_2_armor_and_clothing

val robes = Item(10, 90, 5)
val leatherJacket = Item(5, 250, 5)
val leatherArmors = Item(8, 700, 5)
val combatLeatherJacket = Item(7, 1000, 5)
val leatherArmorMk2 = Item(10, 1000, 2)
val metalArmor = Item(35, 1100, 2)
val metalArmorMk2 = Item(35, 1900, 1)
val bridgekeeperRobes = Item(10, 5, 1)
val combatArmor = Item(20, 6500, 4)
val combatArmorMk2 = Item(25, 8000, 2)
val brootherhoodArmor = Item(25, 4800, 2)
val powerArmor = Item(42, 12500, 2)
val hardenedPowerArmor = Item(50, 15000, 1)
val advancedPowerArmor = Item(45, 20000, 1)
val advancedPowerArmorMk2 = Item(50, 25000, 1)
val teslaArmor = Item(35, 4500, 3)
}
``````

# Solution

I have already discovered one of the constraints. The number of available armors. I can’t take two Advanced Power Armors with me because there is only one. The other constraint is the total weight of the items. In Fallout 2, the maximal weight of items a player can carry is defined as 25 + (player’s strength * 25) lbs.

I know everything I need, so now it is the time to start defining the schema of the solution. I start with a backpack. It is a case class that tells me how many armors of a particular type I have decided to take. For convenience, I also include the maximal weight as a field in this case class.

``````case class Backpack(
robes: Int,
leatherJackets: Int,
leatherArmors: Int,
combatLeatherJacket: Int,
leatherArmorMk2: Int,
metalArmor: Int,
metalArmorMk2: Int,
bridgekeeperRobes: Int,
combatArmor: Int,
combatArmorMk2: Int,
brootherhoodArmor: Int,
powerArmor: Int,
hardenedPowerArmor: Int,
teslaArmor: Int
) {
val playerStrength = 8
val maxWeightLbs = 25 + (playerStrength * 25)
}
``````

## Fitness function

Now, I need a fitness function. It is supposed to tell the algorithm how good is the solution. In this case, it is quite easy. The function must multiply the number of items in my backpack by their value in dollars and return the sum.

``````// add this function inside the Backpack case class
def valueOfItems: Int =
this.robes * Items.robes.valueDollars +
this.leatherJackets * Items.leatherJacket.valueDollars +
this.leatherArmors * Items.leatherArmors.valueDollars +
this.combatLeatherJacket * Items.combatLeatherJacket.valueDollars +
this.leatherArmorMk2 * Items.leatherArmorMk2.valueDollars +
this.metalArmor * Items.metalArmor.valueDollars +
this.metalArmorMk2 * Items.metalArmorMk2.valueDollars +
this.bridgekeeperRobes * Items.bridgekeeperRobes.valueDollars +
this.combatArmor * Items.combatArmor.valueDollars +
this.combatArmorMk2 * Items.combatArmorMk2.valueDollars +
this.brootherhoodArmor * Items.brootherhoodArmor.valueDollars +
this.powerArmor * Items.powerArmor.valueDollars +
this.hardenedPowerArmor * Items.hardenedPowerArmor.valueDollars +
this.teslaArmor * Items.teslaArmor.valueDollars
``````

I need to keep in mind the total weight of items, so I need another function that returns the total weight of armors in the backpack.

## Constraints

``````// add this function inside the Backpack case class
def weightOfItems: Int =
this.robes * Items.robes.weightLbs +
this.leatherJackets * Items.leatherJacket.weightLbs +
this.leatherArmors * Items.leatherArmors.weightLbs +
this.combatLeatherJacket * Items.combatLeatherJacket.weightLbs +
this.leatherArmorMk2 * Items.leatherArmorMk2.weightLbs +
this.metalArmor * Items.metalArmor.weightLbs +
this.metalArmorMk2 * Items.metalArmorMk2.weightLbs +
this.bridgekeeperRobes * Items.bridgekeeperRobes.weightLbs +
this.combatArmor * Items.combatArmor.weightLbs +
this.combatArmorMk2 * Items.combatArmorMk2.weightLbs +
this.brootherhoodArmor * Items.brootherhoodArmor.weightLbs +
this.powerArmor * Items.powerArmor.weightLbs +
this.hardenedPowerArmor * Items.hardenedPowerArmor.weightLbs +
this.teslaArmor * Items.teslaArmor.weightLbs
``````

A solution which tells me to take 20 Tesla armors is useless if I found only 3 such armors. Therefore, I cannot exceed the number of available items.

I am going to define an “areItemsAvailable” function to tell me which solutions I should reject because there are not enough items on the ground.

``````// add this function inside the Backpack case class
def areItemsAvailable: Boolean =
this.robes <= Items.robes.availableItems &&
this.leatherJackets <= Items.leatherJacket.availableItems &&
this.leatherArmors <= Items.leatherArmors.availableItems &&
this.combatLeatherJacket <= Items.combatLeatherJacket.availableItems &&
this.leatherArmorMk2 <= Items.leatherArmorMk2.availableItems &&
this.metalArmor <= Items.metalArmor.availableItems &&
this.metalArmorMk2 <= Items.metalArmorMk2.availableItems &&
this.bridgekeeperRobes <= Items.bridgekeeperRobes.availableItems &&
this.combatArmor <= Items.combatArmor.availableItems &&
this.combatArmorMk2 <= Items.combatArmorMk2.availableItems &&
this.brootherhoodArmor <= Items.brootherhoodArmor.availableItems &&
this.powerArmor <= Items.powerArmor.availableItems &&
this.hardenedPowerArmor <= Items.hardenedPowerArmor.availableItems &&
this.teslaArmor <= Items.teslaArmor.availableItems
``````

Finally, I need a function to check if the backpack is too heavy for the player’s character or not.

``````// add this function inside the Backpack case class
def canCarryItems: Boolean = this.weightOfItems <= maxWeightLbs
``````

# Helisa and Jenetics

## Genotype

That was the problem and solution representation is Scala. Now, I can start using Helisa to configure Jenetics and its genetic algorithm.

First, I define the genotype. It is the representation of the solution and its structure must fit the Backpack case class I defined above. Also, I encode some constraints using the genotype. I know that there is 5 robes, 5 leather jackets, 5 leather armors, and 5 combat leather armors. Those are also the first 4 fields of my case class, so I encode them all as one chromosome which can have values between 0 and 5.

``````chromosomes.int(0, 5, length = 4)
``````

There are 2 leather armors Mk. 2 and 2 metal armors. So the next chromosome encodes two fields which can have values between 0 and 2.

``````chromosomes.int(0, 2, length = 2)
``````

The total length of the genotype (the sum of the chromosome lengths) must be equal to the number of fields in the case class.

``````import com.softwaremill.helisa._

//note that it is a *function*
val genotype = () => genotypes.uniform(
chromosomes.int(0, 5, length = 4),
chromosomes.int(0, 2, length = 2),
chromosomes.int(0, 1, length = 2),
chromosomes.int(0, 4, length = 1),
chromosomes.int(0, 2, length = 3),
chromosomes.int(0, 1, length = 3),
chromosomes.int(0, 3, length = 1),
)
``````

## Fitness function again

After that, I implement the fitness function. It takes a Backpack as an argument and returns an integer which denotes the value of the items in the backpack.

There is one caveat, I also need to take into account invalid solutions. There are genotype validators, but I did not manage to make them work as I expect, so instead of that, I included validation in my fitness score. If I find a way to use the validators to impose constraints, I will write an article about it ;)

A solution that exceeds the maximal weight of items or contains more items than it is possible to obtain must get a low fitness score. So the function returns the total value if the solution is valid and zero if the constraints have been violated.

``````def fitness(value: Backpack): Int =
if(value.canCarryItems && value.areItemsAvailable) value.valueOfItems else 0
``````

## Jenetics configuration

When all of that is ready, I can start configuring the Evolver, which is the class that implements the genetic algorithm in Jenetics.

First, I give it the fitness function and the structure of the genotype. I also tell it that I want to maximize the value of the fitness function and that the population size should be 500. The population is the number of solutions generated in every iteration of the algorithm.

### Survivors selection

After that, I define the way to decide which elements of the population are going to survive and be copied to the next iteration. I define the selector function. In this case, a “tournament” function which is a fency name of a function that takes n best solutions (best means, having the highest fitness score).

I want 20 survivors, so I set the survivor size and the number of solutions that win the tournament. If I used a smaller number as the tournament parameter the next iteration would contain duplicates of the best solutions from the previous one (it is not necessarily bad, but as the consequence, one genotype may dominate the gene pool).

### Offspring selection

It is a genetic algorithm, so my solutions need to “reproduce.” The offspring selector is used to decide which genotypes will become “parents.”

The roulette wheel is a function which gives a higher chance of reproducing to the solutions which have a higher fitness score. The worse solutions have a lower probability of reproducing but may get lucky and pass their genes to the next generation.

### Crossover and mutations

In the end, I define the way of combining the “parents” genes. I want to use the crossover method which splits the parents’ genes and takes one part from the first parent and the second part from the other parent. In this case, it is going to be a single point crossover, so the parent gene is split in a one randomly selected position.

As the last step, I add the “swap” mutation which randomly swaps parts of the “children’s” genotype. Such a mutation prevents the algorithm from getting stuck in a local maximum, because the mutation may randomly produce a solution which is way better than anything that may be generated by combining the “parent’s” genotypes.

``````val evolver = Evolver(fitness, genotype)
.maximizing()
.populationSize(500)
.offspringSelector(selectors.rouletteWheel())
.geneticOperators(operators.crossover.singlePoint(), operators.mutator.swap())
.build()
``````

# Running the algorithm

Now, I can run the genetic algorithm (in this example, generate 5000 populations) and see what solution I get.

``````val best = evolver.streamScalaStdLib().drop(5000).head

Console.println(best.bestPhenotype)
``````

One of my results:

Some(Backpack(0,0,0,1,0,0,0,0,1,2,0,0,1,1,1,0))

Max weights: 225

is not too heavy: true

is possible: true

Weight: 222

Value: 83500

Older post

## Re: DataOps Principles: How Startups Do Data The Right Way

Team vs. a bunch of individuals reporting work time in the same spreadsheet