Genetic Algorithm  
 Member: Aurelian Georgescu     Jose A. Cubillos     Vivian Mau     Jun Wu  
   
 Please click here to see the presentation. also to see the Conclusion, and the code.              

 Experiments on Genetic Algorithm

 

                                                                Jun Wu (4317)(group leader)

                                                                         Vivian Mau (9873)

                                                                        Jose Cubillos (7153)

                                                                Aurelian Georgescu (3591)

 

 

Abstract

 

This project  introduce some fundamentals of genetics algorithms and do some experiments on it. As the area of genetics algorithms is very wide, it is not possible to cover everything in these paper. So we first simply discuss what is genetics algorithm and then using it to search extreme in a given function which is difficult to do using other methods.

 

I. Introduction

 

First words

 

Genetic algorithms are a part of evolutionary computing, which is a rapidly growing area of artificial intelligence. As you can guess, genetic algorithms are inspired by Darwin's theory about evolution. Simply said, solution to a problem solved by genetic algorithms is evolved.

 

History

 

Idea of evolutionary computing was introduced in the 1960s by I. Rechenberg in his work "Evolution strategies" (Evolutions strategie in original). His idea was then developed by other researchers. Genetic Algorithms (GAs) were invented by John Holland [1]and developed by him and his students and colleagues. This lead to Holland's book "Adaption in Natural and Artificial Systems" published in 1975.

 

In 1992 John Koza has used genetic algorithm to evolve programs to perform certain tasks. He called his method "genetic programming" (GP). LISP programs were used, because programs in this language can expressed in the form of a "parse tree", which is the object the GA works on.

 

II. Biological Background

 

Chromosome

 

All living organisms consist of cells. In each cell there is the same set of chromosomes. Chromosomes are strings of DNA and serves as a model for the whole organism. A chromosome consist of genes, blocks of DNA. Each gene encodes a particular protein. Basically can be said, that each gene encodes a trait, for example color of eyes. Possible settings for a trait (e.g. blue, brown) are called alleles. Each gene has its own position in the chromosome. This position is called locus.

 

Complete set of genetic material (all chromosomes) is called genome. Particular set of genes in genome is called genotype. The genotype is with later development after birth base for the organism's phenotype, its physical and mental characteristics, such as eye color, intelligence etc.

 

Reproduction

 

During reproduction, first occurs recombination (or crossover). Genes from parents form in some way the whole new chromosome. The new created offspring can then be mutated. Mutation means, that the elements of DNA are a bit changed. This changes are mainly caused by errors in copying genes from parents.

 

The fitness of an organism is measured by success of the organism in its life.

 

 

 

 

III. Search Space

 

Search Space

 

If we are solving some problem, we are usually looking for some solution, which will be the best among others. The space of all feasible solutions (it means objects among those the desired solution is) is called search space (also state space). Each point in the search space represent one feasible solution. Each feasible solution can be "marked" by its value or fitness for the problem. We are looking for our solution, which is one point (or more) among feasible solutions - that is one point in the search space.

 

The looking for a solution is then equal to a looking for some extreme (minimum or maximum) in the search space. The search space can be whole known by the time of solving a problem, but usually we know only a few points from it and we are generating other points as the process of finding solution continues.

 

Example of a search space

 

The problem is that the search can be very complicated. One does not know where to look for the solution and where to start. There are many methods, how to find some suitable solution (ie. not necessarily the best solution), for example hill climbing, tabu search, simulated annealing and genetic algorithm. The solution found by this methods is often considered as a good solution, because it is not often possible to prove what is the real optimum.

 

IV. Genetic Algorithm

 

Basic Description

 

Genetic algorithms are inspired by Darwin's theory about evolution. Solution to a problem solved by genetic algorithms is evolved.

 

Algorithm is started with a set of solutions (represented by chromosomes) called population. Solutions from one population are taken and used to form a new population. This is motivated by a hope, that the new population will be better than the old one. Solutions which are selected to form new solutions (offspring) are selected according to their fitness - the more suitable they are the more chances they have to reproduce.

 

This is repeated until some condition (for example number of populations or improvement of the best solution) is satisfied.

 

In search space, problem solving can be often expressed as looking for extreme of a function. This is exactly what the problem shown here is. Some function is given and GA tries to find maximum of the function.

 

 

 

 

 

 

 

 

 

 

 

Outline of the Basic Genetic Algorithm [2]

 

1.[Start] Generate random population of n chromosomes (suitable solutions for the problem).

2.[Fitness] Evaluate the fitness f(x) of each chromosome x in the population.

3.[New population] Create a new population by repeating following steps until the new population is      

   complete

1.[Selection] Select two parent chromosomes from a population according to their fitness (the better fitness, the bigger chance to be selected)

2.[Crossover] With a crossover probability cross over the parents to form a new offspring (children). If no crossover was performed, offspring is an exact copy of parents.

3.[Mutation] With a mutation probability mutate new offspring at each locus (position in chromosome).

4.[Accepting] Place new offspring in a new population

4.[Replace] Use new generated population for a further run of algorithm

5.[Test] If the end condition is satisfied, stop, and return the best solution in current population

6.[Loop] Go to step

 

Some Comments

 

As you can see, the outline of Basic GA is very general. There are many things that can be implemented differently in various problems.

 

First question is how to create chromosomes, what type of encoding choose. With this is connected crossover and mutation, the two basic operators of GA. 

 

Next questions is how to select parents for crossover. This can be done in many ways, but the main idea is to select the better parents (in hope that the better parents will produce better offspring). Also you may think, that making new population only by new offspring can cause lost of the best chromosome from the last population. This is true, so so-called elitism is often used. This means, that at least one best solution is copied without changes to a new population, so the best solution found can survive to end of run.

 

V. Parameters of GA

 

Crossover and Mutation Probability

 

 

 

There are two basic parameters of GA - crossover probability and mutation probability.

 

Crossover probability says how often will be crossover performed. If there is no crossover, offspring is exact copy of parents. If there is a crossover, offspring is made from parts of parents' chromosome. If crossover probability is 100%, then all offspring is made by crossover. If it is 0%, whole new generation is made from exact copies of chromosomes from old population (but this does not mean that the new generation is the same!).

Crossover is made in hope that new chromosomes will have good parts of old chromosomes and maybe the new chromosomes will be better. However it is good to leave some part of population survive to next generation.

 

Mutation probability says how often will be parts of chromosome mutated. If there is no mutation, offspring is taken after crossover (or copy) without any change. If mutation is performed, part of chromosome is changed. If mutation probability is 100%, whole chromosome is changed, if it is 0%, nothing is changed.

Mutation is made to prevent falling GA into local extreme, but it should not occur very often, because then GA will in fact change to random search.

 

Other Parameters

 

There are also some other parameters of GA. One also important parameter is population size.

 

Population size says how many chromosomes are in population (in one generation). If there are too few chromosomes, GA have a few possibilities to perform crossover and only a small part of search space is explored. On the other hand, if there are too many chromosomes, GA slows down. Research shows that after some limit (which depends mainly on encoding and the problem) it is not useful to increase population size, because it does not make solving the problem faster.

 

 

 

VI. GA Example

 

Maximum of Function

 

About the Problem

 

As you already know  about search space, problem solving can be often expressed as looking for extreme of a function. This is exactly what the problem shown here is.

 

Some function is given and GA tries to find maximum of the function. For other problems we just have to define search space and the fitness function which means to define the function, which we want to find extreme for.

 

Here the given function is f(x)=x*sin(10*pi*x)+1.0, find maximum in the interval  [-1, 2].

 

VII. Operators of GA

 

Overview

 

As you can see, the crossover and mutation are the most important part of the genetic algorithm. The performance is influenced mainly by these two operators. Before we can explain more about crossover and mutation, some information about chromosomes will be given.

 

Encoding of a Chromosome [2]

 

The chromosome should in some way contain information about solution which it represents. The most used way of encoding is a binary string. The chromosome then could look like this:

Chromosome 11101100100110110Chromosome 21101111000011110

 

Each chromosome has one binary string. Each bit in this string can represent some characteristic of the solution. Or the whole string can represent a number .

 

Of course, there are many other ways of encoding.[2] This depends mainly on the solved problem. For example, one can encode directly integer or real numbers, sometimes it is useful to encode some permutations and so on.

 

In our example, we just choose specific Chromosome length and randomly encode the binary string to generate initial population. Correspondent module is like below:

 

 

void initial(int population[popsize][L])

{

                for(int i=0;i<popsize;i++)

                                for(int j=0;j<L;j++)

                                                population[i][j]=rand()%2;

}

 

However we need to decode the binary string when we need the value which the string represent.

 

void decode(int population[popsize][L],double x[popsize])

{

                for(int i=0;i<popsize;i++)

                {

                                double temp=binary_to_decimal(population,i);

                                x[i]=a+temp*(b-a)/(pow(2,L)-1);

                }

}

 

double binary_to_decimal(int population[popsize][L], int i)

{

                double x=0.0;

                for(int j=0;j<L;j++)

                                x=x+population[i][j]*pow(2,L-j-1);

                return x;

}

 

Fitness sorting [1]

 

We choose parents according to their fitness, so we need to sort the chromosome in terms of their fitness.

In our example, we sort the chromosome in descending order, so we can choose parents in the front:

 

void fitness_sorting(double fx[popsize],int population[popsize][L])        

{

                int order[popsize]={0};

                for(int i=0;i<popsize;i++)

                                order[i]=i;

    //inverse of bubble-sort fx[popsize] rearrange x[popsize]

                for(i=0;i<(popsize-1);i++)

                                for(int pos=popsize-1;pos>0;pos--)

                                                if(fx[pos]>fx[pos-1])

                                                {

                                                                double temp1=0.0;

                                                                int temp2=0;

                                                                temp1=fx[pos-1];

                                                                fx[pos-1]=fx[pos];

                                                                fx[pos]=temp1;

                                                               

                                                                temp2=order[pos-1];

                                                                order[pos-1]=order[pos];

                                                                order[pos]=temp2;

                                                }

                //get the new population array according to the fitness

                int temp[popsize][L]={0};

                for(i=0;i<popsize;i++)

                                //temp[i]=old population[order[i]]

                                for(int j=0;j<L;j++)

                                                temp[i][j]=population[order[i]][j];

                //new population=temp

                for(i=0;i<popsize;i++)

                                for(int j=0;j<L;j++)

                                                population[i][j]=temp[i][j];

}  

 

Crossover

 

After we have decided what encoding we will use, we can make a step to crossover. Crossover selects genes from parent chromosomes and creates a new offspring. The simplest way how to do this is to choose randomly some crossover point [2]  and everything before this point  copy from a first parent and then everything after a crossover point copy from the second parent.

 

Crossover can then look like this ( | is the crossover point):

Chromosome 111011 | 00100110110  Chromosome 211011 | 11000011110 

Offspring 111011 | 11000011110Offspring 211011 | 00100110110

 

There are other ways how to make crossover, for example we can choose more crossover points. Crossover can be rather complicated and very depends on encoding of the encoding of chromosome. Specific crossover made for a specific problem can improve performance of the genetic algorithm.

 

In our example we randomly choose the same point to split parents chromosome, fixed one part and exchange another part:

 

void reproduction(int population[popsize][L])

{             

//simply choose the first half population array

//which is fitter than second half;Two parents geneate

//two children by using single point crossover,

//store the children into the second half population array;

                for(int i=0;i<(popsize/2-1);i=i+2)

                {

                                int nc=(rand()%100)/100*(L-1);

                                for(int j=0;j<nc;j++)

                                {

                                                //child1(j)=parent1(j)

                                                //child2(j)=parent2(j)

                                                population[popsize/2+i][j]=population[i][j];

                                                population[popsize/2+i+1][j]=population[i+1][j];

                                }

                                for(j=nc;j<(L-1);j++)

                                {

                                                //child1(j)=parent2(j)

                                                //child2(j)=parent1(j)

                                                population[popsize/2+i][j]=population[i+1][j];

                                                population[popsize/2+i+1][j]=population[i][j];

                                }

                }             

}

 

Mutation

 

After a crossover is performed, mutation take place. This is to prevent falling all solutions in population into a local optimum of solved problem.[2] Mutation changes randomly the new offspring. For binary encoding we can switch a few randomly chosen bits from 1 to 0 or from 0 to 1. Mutation can then be following:

Original offspring 11101111000011110 Original offspring 21101100100110110 Mutated offspring 11100111000011110 Mutated offspring 21101101100110110

 

 void mutation(int population[popsize][L])//(1/4~1/2);

{

                int mutation_gene=popsize*L*pm;

                for(int i=0;i<mutation_gene;i++)

                {

                                if(population[(1+i*2)%(popsize/2)][rand()%L]==0)

                                                population[(1+i*2)%(popsize/2)][rand()%L]=1;

                                else population[(1+i*2)%(popsize/2)][rand()%L]=0;

                }

}

 

 

 

 

If  in a specific interval a function can not be differentiated then we have to use numerical method to find its maximum. By using numerical method,  the function has to have specific properties. By  using Genetic Algorithm we don’t need to analyze the complex details of the function, we just need to simply choose the better solutions as reproduction seeds in an evolutionary computing .That’s the one of the advantages of evolutionary computing. We don’t need to analyze the complex details of a problem, we just need to simply choose the better solutions as reproduction seeds in an evolutionary computing. In another words we only need to care about the results not the process. That made the algorithm simple.

 

 

References:

 

1.Marek Obitko, “Introduction to Genetic Algorithms”: 2.http://cs.felk.cvut.cz/~xobitko/ga/

3.http://alife.santafe.edu/~joke/encore/www/

4.Luger & Stubblefield, “Artificial Intelligence”  Addison Wesley, 1988

5.J. H. Holland, Adaptation in natural and artificial systems.

   University of Michigan Press, 1975

6.D. E. Goldberg: Genetic Algorithm in Search, Optimization, and Machine    

   Learning.  Addison Wesley Publishing Company, Inc. 1989

7.McFadden & Keeton, “Biology: An Exploration of Life” Norton, 1994

8.www.geatbx.com

9.http://library.advanced,org/tg-admin/mont.cgi