APGAIN, Compass
- [Algorithm 1] A novel approach in parameter adaptive and diversity maintenance for genetic algorithms (APGAIN)
- 1. The techniques of PRAM
- 2. The repelling algorithm
- 3. The lazy repelling algorithm
- [Algorithm 2] A Compass to guide genetic algorithms
- 1. Steps of the compass
- 2. Operator Application
- [Algorithm 3] An adaptive operator rate controlled evolutionary algorithm (AORCEA)
- 1. Motivation and contributions
- 2. The framework of AORCEA
- 3. Adaptation strategy
- 4. Steps of the AORCEA
- [Algorithm 4] Analysis of exploration and exploitation in evolutionary algorithms by ancestry trees
- Motivation
- Main contributions
- Ancestry tree
- Splitting
- Metrics
- Exploration metrics
- Exploitation metrics
[Algorithm 1] A novel approach in parameter adaptive and diversity maintenance for genetic algorithms (APGAIN)
1. The techniques of PRAM
A probabilistic rule-driven adaptive model (PRAM) adapts the crossover rate and the mutation rate automatically: three different parameter values are tried on the population, and these values are adjusted during the search by a set of rules according to the fitness improvement they gain.
PRAM can adapt both pc and pm by running two PRAMs concurrently, with each PRAM responsible for adapting one parameter.
Step 1: PRAM uses three parameter values (p1, p2, p3, with p1 < p2 < p3) to adapt a control parameter (pm or pc). The run of a GA is divided into epochs.
Step 2: Each epoch is further divided into two periods. During the first period, the corresponding genetic operations (crossover or mutation) are applied on the chromosomes using any one of the three values randomly. In the second period, the probabilities of applying any one of the three values are proportional to their fitness improvement gained in the first period.
Step 3: At the end of each epoch, a set of ‘‘greedy rules’’ is applied to alter the values of p1, p2, p3 according to the fitness improvement each value gained during the epoch, shifting the three candidates toward the best-performing one.
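The two-period epoch mechanism of Steps 1-3 can be sketched as below. The function name, the candidate rate values, and the simplified end-of-epoch handling are my own illustrations, not the paper's:

```python
import random

def choose_rate(rates, improvements, first_period):
    """Pick one of the three candidate parameter values (PRAM-style sketch).

    rates: the three candidates [p1, p2, p3] with p1 < p2 < p3.
    improvements: fitness improvement gained by each candidate so far
    in the current epoch (used only in the second period).
    """
    if first_period or sum(improvements) == 0:
        # first period: any of the three values, chosen uniformly at random
        return random.choice(rates)
    # second period: probability proportional to first-period improvement
    return random.choices(rates, weights=improvements, k=1)[0]

# hypothetical candidate mutation rates for one PRAM instance
rates = [0.01, 0.05, 0.20]
```

A second PRAM instance with its own candidate list would run concurrently for the crossover rate.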
2. The repelling algorithm
The repelling algorithm summarizes a population into a ‘‘representative’’. Population diversity is maintained by driving the population away from the representative. The fitness of a chromosome is modified to include a ‘‘diversity fitness’’, which is inversely proportional to the similarity between the chromosome and the representative. In other words, population diversity is maintained by giving a higher survival probability to chromosomes located in sparsely populated regions.
Step 1: The algorithm modifies the fitness evaluation function to increase the survival opportunity of chromosomes with almost-lost alleles. The modified fitness of a chromosome x is given as:
fit'(x) = (1 - w) * score(rank_f(x)) + w * score(rank_d(x))
where w is the weight of diversity, rank_f(x) (rank_d(x)) is the ranking of the objective fitness (diversity fitness) of the chromosome, and score(.) returns a score directly proportional to that ranking.
Step 2: The diversity fitness is evaluated by comparing the chromosome with the population’s representative. The representative is represented by an array of real numbers in the range of [0, 1]. Assuming binary chromosomes, the probability of finding a ‘0’ at each allele of the chromosome is assigned to the corresponding array element of the representative. The algorithm to create the representative of a population in the repelling model is as follows.
Step 3: Get the diversity fitness of a chromosome by rewarding chromosomes that are dissimilar to the representative.
Step 4: The replacement scheme after a genetic operation (crossover or mutation) is done by tournament between the offspring and the parents.
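Steps 1-3 can be sketched as below, assuming binary chromosomes encoded as lists of 0/1; the function names are mine, not the paper's:

```python
def representative(population):
    """Representative of the population: array element i is the
    probability of finding a '0' at allele i (binary chromosomes)."""
    n = len(population)
    return [sum(1 for c in population if c[i] == 0) / n
            for i in range(len(population[0]))]

def diversity_fitness(chrom, rep):
    """Reward dissimilarity to the representative: a '0' allele scores
    1 - rep[i] (low where zeros are common), a '1' allele scores rep[i]."""
    return sum((1 - rep[i]) if a == 0 else rep[i]
               for i, a in enumerate(chrom))
```

With an almost-all-zeros population the representative approaches [1.0, ...], so a chromosome of ones gets the highest diversity fitness, matching the idea of rewarding chromosomes in sparsely populated regions.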
3. The lazy repelling algorithm
A lazy repelling algorithm is proposed to further reduce the computational overhead. The diversity fitness is re-evaluated only if the representative has drifted enough since the last evaluation, i.e. only if the following condition holds:
(1/L) * sum_i |r_i - r'_i| > eps
where eps is the lazy threshold, r_i is the i-th allele of the latest representative, r'_i is the i-th allele of the representative computed the last time the diversity fitness was evaluated, and L is the chromosome length.
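A minimal sketch of the lazy check, assuming the condition compares the mean per-allele drift of the representative against the threshold (the paper's exact condition is not reproduced here):

```python
def needs_reevaluation(rep_latest, rep_last, eps):
    """Re-evaluate the diversity fitness only when the representative
    has drifted by more than the lazy threshold eps on average."""
    drift = sum(abs(a - b) for a, b in zip(rep_latest, rep_last))
    return drift / len(rep_latest) > eps
```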
[Algorithm 2] A Compass to guide genetic algorithms
1. Steps of the compass
Step 1: Let ΔD and ΔQ be the population's mean diversity variation and quality (fitness) variation, respectively. Define a vector v = (ΔD, ΔQ) to characterize the effects of an operator over the population in terms of variation of quality and diversity (axes ΔD and ΔQ).
Step 2: Then normalize ΔD and ΔQ.
Algorithms that consider only the fitness improvement to adjust the operator probabilities would use just the projection of v onto the y-axis (dotted lines in Fig. b). On the other hand, if diversity alone is taken into account, the measures would be the projection onto the x-axis (Fig. c).
Step 3: To control the fitness improvement and the diversity together, define a vector c (by its angle θ) that also characterizes its orthogonal plane P (see Fig. d).
Rewards are then based on the projection of the operator vectors v onto c, i.e. |v| cos(α), α being the angle between v and c. A value of θ close to 0 encourages exploration, while a value close to π/2 favors exploitation. In this way, the management of application rates is abstracted by the angle θ, which guides the direction of the search as the needle of a compass points north.
Step 4: Projections are turned into positive values by subtracting the smallest one, and are divided by execution time in order to reward faster operators (Fig. e).
Step 5: Application rates are obtained proportionally to the reward values plus a constant that ensures the smallest rate equals a minimal rate p_min, preventing the disappearance of the corresponding operator (Fig. f).
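Steps 1-5 can be sketched as follows; the operator vectors, execution times, and the p_min floor are hypothetical inputs, and the exact rate formula is my own simplification:

```python
import math

def compass_rewards(vectors, exec_times, theta):
    """Project each operator vector v = (dD, dQ) onto the compass
    direction c = (cos(theta), sin(theta)), shift the projections to be
    non-negative, and divide by execution time (Steps 3-4)."""
    cx, cy = math.cos(theta), math.sin(theta)
    proj = [dd * cx + dq * cy for dd, dq in vectors]
    lo = min(proj)
    return [(p - lo) / t for p, t in zip(proj, exec_times)]

def application_rates(rewards, p_min):
    """Step 5: rates proportional to rewards, floored at p_min,
    summing to 1."""
    n, total = len(rewards), sum(rewards)
    if total == 0:
        return [1.0 / n] * n
    return [p_min + (1.0 - n * p_min) * r / total for r in rewards]
```

Setting theta near 0 rewards diversity gains (exploration); setting it near π/2 rewards fitness gains (exploitation).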
2. Operator Application
Consider for instance that θ is set to π/4, so equal importance is given to ΔD and ΔQ. At the beginning of the search, just after the population is randomly created, population diversity is high and mean fitness is low, so it is easy for most operators to sit in the quadrant where quality increases. After some generations the population starts to converge to some optimum, improvement becomes difficult, and the operators' points drift toward the quadrants where ΔQ is low or negative.
[Algorithm 3] An adaptive operator rate controlled evolutionary algorithm (AORCEA)
1. Motivation and contributions
Evolutionary search is always an interplay between exploration of new regions and exploitation of previously detected good regions of the search space. When applying evolutionary algorithms to optimization problems, many strategy parameters have to be set to define the behavior of the algorithm itself, such as the population size, operator rates, or the tournament size of selection. The choice of good parameter values is most often based on experience and therefore requires considerable effort. Despite some recommendations, there is no evidence that a parameter setup exists which is optimal for an entire optimization run. Consequently, adapting the parameter values during the run is a promising way to improve EA performance. In this paper, the motivation for introducing an adaptive strategy arose from the recurrent problem of setting appropriate operator rates for structural optimization problems.
Contributions of the proposed AORCEA can be summarized as follows:
- The framework: The standard components of EA are extended with an additional component termed adaptation. In principle, arbitrary control parameters of the evolutionary algorithm can be adjusted in the adaptation phase, e.g. the population size, operator rates, the fitness definitions, etc. Many journal papers concerned with (self-)adaptive strategies investigate Genetic Algorithms (GA) based on bit representation that only employ simple operators like one-point crossover and bit mutation. In contrast, AORCEA is based on a more general evolutionary computing framework (Evolving Objects) which provides an environment to evolve any kind of data structure.
- The best-fitness-frequency measure: A best-fitness-frequency (bff) measure is evaluated, indicating the success of the current search. Depending on this value, either exploitative or explorative genetic operators are favored in order to accelerate the search process.
- The success of operators: To increase the rates of successful operators contributing above average to search performance, a success measure for the application of an operator of type X is defined.
- The diversity of the population: To analyze the distribution of the individuals in the search space, an operator diversity measure is proposed, based on the Euclidean distance of an individual i originating from the application of an operator of type X in generation G to the current best individual b.
- Operator rate adaptation: To provide operator rates guaranteeing a balance between exploration and exploitation, an operator rate adaptation is provided, based on which the operator rates are linearly increased or decreased for the next generation.
2. The framework of AORCEA
The framework of AORCEA is shown below.
As Fig. 1 shows, an additional adaptation component is added to the standard evolutionary computing framework; the paper mainly focuses on the design of this component.
There are some basic differences between the traditional GA and AORCEA:
- Representation: AORCEA is developed particularly for structural optimization tasks based on a heterogeneous genotype. For structural optimization it is very convenient to use a heterogeneous genotype, e.g. the eoUniGene concept presented by König, which consists of different gene types.
- Evaluation: The fitness formulations introduced by König are employed, defining the fitness value as a weighted sum of ratings for the optimization objective and potential constraints. For the fitness value F of a given individual i it holds
F_i = sum_j w_j * f_j(i)
where w_j denotes the respective weights and f_j stands for mapping functions assigning fitness ratings within an interval to the given individual. Three different types of mapping functions are distinguished, namely design objective, limit constraint, and target constraint functions, as shown in Fig. 2.
- Selection: AORCEA uses so-called stochastic tournament selection, which chooses two random individuals of the population and prefers the better individual with a given probability.
- Reproduction: AORCEA chooses a single operator from the set of available operators to produce the child individual(s). This style of algorithm makes it possible to estimate success and diversity measures for each operator application; these measures are then taken as the basis for adapting the respective operator rates.
- Replacement: AORCEA is based on the plus replacement known from Evolution Strategies (ES), where the best of offspring and parents become the members of the next generation.
3. Adaptation strategy
Evolutionary search is always an interaction between exploration and exploitation. The adaptation strategy proposed within this section, schematically shown in Fig. 3, focuses on determining whether exploitative or explorative operators should be favored in the current search state.
By introducing a threshold value bff_target, two cases are distinguished:
- bff < bff_target: Increase the operator rates of successful operators contributing above average to the search process performance.
- bff ≥ bff_target: Increase the operator rates of explorative operators in order to escape from local optima by introducing more diversity into the population.
4. Steps of the AORCEA
The following steps determine which operators are of explorative and which are of exploitative nature.
Step 1: The best fitness frequency measure
As shown in Fig. 4, a histogram of the fitness values can be established. The class limits defining the class intervals are determined by F_b, i.e. the fitness of the current best individual, and a factor w determining the width of the class interval. The bff value of a single individual i of the current population is 1 if its fitness falls into the best class interval and 0 otherwise,
where F_i is the fitness value of individual i. The total bff value of the current population is then calculated as the fraction of individuals whose bff value is 1.
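A plausible sketch of the measure, assuming maximization and that an individual counts toward bff exactly when its fitness lies within the interval width w of the current best (the name is mine):

```python
def population_bff(fitnesses, width):
    """Fraction of the population whose fitness lies within `width`
    of the current best fitness (the best class interval)."""
    best = max(fitnesses)
    hits = sum(1 for f in fitnesses if best - f <= width)
    return hits / len(fitnesses)
```

A high value signals that many individuals crowd the best class interval, i.e. the search is stagnating near the current best.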
Step 2: Operator success measure
The success measure for the application of an operator of type X in generation G producing an individual i is defined in terms of F_p, the fitness of the best parent individual, F_i, the fitness of the newly created child individual i, and F_b, the fitness of the overall best individual included in the current population. Thus, the maximum success measure occurs if a new overall best individual (F_i better than F_b) is found. Then, for each operator of type X (e.g. one-point crossover, uniform mutation, etc.) an average success value can be calculated as the mean of the success values over all its applications,
where n_X is the number of operator applications of type X in the respective generation G.
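One plausible form of the measure (the paper's exact formula is not reproduced here; this sketch assumes minimization and normalizes the child's improvement over its best parent by the parent's gap to the overall best):

```python
def success(f_parent, f_child, f_best):
    """Improvement of the child over its best parent, normalized by the
    parent's distance to the overall best; reaches its maximum (>= 1)
    exactly when the child beats the current overall best (minimization)."""
    if f_parent == f_best:                # parent already is the best
        return 1.0 if f_child < f_best else 0.0
    return max(0.0, (f_parent - f_child) / (f_parent - f_best))

def avg_success(values):
    """Average success over all applications of one operator type."""
    return sum(values) / len(values) if values else 0.0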
Step 3: Operator diversity measure
The Euclidean distance d_i of an individual i, originating from the application of an operator of type X in generation G, to the current best individual b is calculated as
d_i = sqrt( sum_{k=1}^{n_g} (x_{i,k} - b_k)^2 )
where n_g is the number of genes of the genotype, and x_{i,k} and b_k are the components of the respective genotypes.
The diversity measure for the application of an operator of type X in generation G producing an individual i is defined as
div_i = d_i / d_max^G
where d_max^G is the maximum Euclidean distance between the current best individual and all individuals of the current population.
For each operator type X, an average diversity value can be calculated for each generation G as the mean of div_i over all its applications,
where n_X is the number of operator applications of type X.
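The distance and the normalized diversity measure follow directly from the definitions above (genotypes as lists of numbers; function names are mine):

```python
import math

def euclid(a, b):
    """Euclidean distance between two genotypes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def diversity(individual, best, population):
    """Distance of `individual` to the current best individual,
    normalized by the largest best-to-individual distance found
    in the population (d_max)."""
    d_max = max(euclid(p, best) for p in population)
    return euclid(individual, best) / d_max if d_max > 0 else 0.0
```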
Step 4: Operator rate adaptation
If the current bff value is below the target value bff_target, the success ranking is employed in order to favor the successful operators, see Fig. 5. On the other hand, the diversity measure is applied if the search stagnates and produces a lot of very similar individuals close to the current best solution.
An adaptation rate a is introduced, depending on the difference between the current bff value and the threshold value bff_target.
Fig. 6 illustrates the linear adaptation of the operator rates, which always sum up to a total of 100%. On the left, the current rates of six operators are shown. Then, the operators are ranked according to the previously determined success or diversity measure. The operator fulfilling the respective measure best is assigned rank 1; the worst operator is assigned rank 6.
Subsequently, the operator rates are linearly increased or decreased for the next generation according to their ranks,
where G is the current generation, NO is the total number of applied operators, and RX is the rank of the operator of type X. Furthermore, a minimum operator rate pmin is guaranteed for each operator in order to prevent single operators from being completely excluded from the optimization process.
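A sketch of rank-based linear adaptation under stated assumptions (the paper's exact update formula is not reproduced): better-ranked operators gain rate, worse-ranked ones lose it, every rate stays above p_min, and the rates are renormalized to sum to 1:

```python
def adapt_rates(rates, ranks, a, p_min):
    """Shift each operator rate by an amount proportional to the
    adaptation rate `a` and to how far its rank lies above or below
    the mean rank; floor at p_min, then renormalize to sum to 1."""
    n = len(rates)
    mean_rank = (n + 1) / 2
    new = [max(p_min, p + a * (mean_rank - r) / n)
           for p, r in zip(rates, ranks)]
    s = sum(new)
    return [p / s for p in new]
```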
[Algorithm 4] Analysis of exploration and exploitation in evolutionary algorithms by ancestry trees
It took me two days to absorb and understand this paper, so let me first share my impressions. First, (for me personally) this is a rather abstract paper that reads obscurely at first, but the author's line of thought gradually becomes clear. Second, the paper is substantial and quite interesting, with a fairly clear train of thought. That said, the rigor of the writing deserves some criticism, which will come up in the algorithm description below.
Motivation
Exploring the search space effectively is one of the main motivations for using population diversity as a feedback mechanism, designed as an indicator of how thoroughly the search space has been explored. Maintaining a certain level of diversity is very important for the optimization mechanism, so that it does not get stuck in local optima. This is commonly known as the exploration-exploitation dilemma. To address it, this paper represents the search history as tree structures and proposes several related metrics to estimate the degree to which the search space is explored and exploited.
Main contributions
- The paper introduces an ancestry-tree-based method for recording the evaluation history of the population, and designs a set of metrics that suggest how and when to use exploration or exploitation.
- In addition, the (split) tree structures and the exploration/exploitation metrics can also guide how genetic structures are explored/exploited during evolution.
- Results on the multi-objective 0/1 knapsack problem show that the metrics clearly reveal how and when exploration and exploitation dominate, which helps analyze the behavior of the evolutionary process.
- The proposed approach can be used in a variety of scenarios: analysis of evolutionary parameter control, problem-dependent success-rate analysis, comparison of different evolutionary algorithms, etc.
Ancestry tree
As shown in Fig. 4, an ancestry tree is composed of a set of data-collection structures (also called nodes) like the one shown in Fig. 3. Each node carries five pieces of information: the current generation number and individual id; the ancestors and their ids; the individual's chromosome structure; the type of variation operator applied (crossover, mutation, clone, repair, and random); and the number of changes. [Gripe 1: the node structure in Fig. 4 is inconsistent with Fig. 3.]
By convention, the number of ancestry trees equals the initial population size, and each tree root represents one initial individual. The number of nodes in a tree is defined as the tree's size (size(T)); consequently, trees whose nodes have higher fitness values grow to a larger size.
Splitting
Consider: does a new individual owe its creation to exploration or to exploitation?
When a parent produces a new individual by changing only a small portion of its genes, we can say the new individual was obtained by exploiting the search space. If, however, most of the parent's genes were changed, we cannot say it came from an exploitation operation. For this reason, an ancestry tree must be split at a specific threshold (on the Euclidean or Hamming distance), and this threshold can also be used to distinguish exploration from exploitation.
As shown in Fig. 5, every link whose change count is greater than the threshold t causes the ancestry tree to split. [Gripe 2: Fig. 5 is a bit messy; it would help if it were tied back to Fig. 4.]
Each split-off ancestry tree can be indexed as T_{i,j}, where i ranges over the initial trees, j over the subtrees split off from tree i, and n_i denotes the number of trees obtained after splitting tree i. A larger index means the tree's root node and its children have been exploited more. With splitting threshold t, the total number of trees, n_t, is the sum of the n_i over all initial trees.
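The splitting rule can be sketched as a simple count: every parent-child link whose change count exceeds the threshold t detaches one new subtree, so the total number of trees equals the number of initial roots plus the number of split links (the edge data here is hypothetical):

```python
def total_trees(initial_pop_size, change_counts, t):
    """Number of trees after splitting every link whose change count
    exceeds the threshold t."""
    splits = sum(1 for c in change_counts if c > t)
    return initial_pop_size + splits
```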
Metrics
Based on exploration and exploitation of the search space, the paper introduces seven metrics, used respectively to analyze: the proportion of exploration used when searching the solution space, the influence of operators on exploration, and when exploration happens; as well as the proportion of exploitation used when searching the solution space, the influence of operators on exploitation, the influence of selection on exploitation, and the influence of revisited individuals.
Exploration metrics
- [Metric 1] The ratio between exploration and exploitation can be defined as the percentage of exploration, calculated as the ratio between the number of tree roots and the number of all individuals, where g is the number of generations.
Taking Fig. 4 as an example:
- [Metric 2] Each new root individual represents a new search region and is also the starting point of an exploration. Hence, the influence of different operators on exploration can be measured by analyzing the variation-operator types of the root individuals, giving the following metric:
where the denominator is the sum over all variation types.
Taking Fig. 4 as an example:
To help with the understanding of what follows, it is worth explaining the above computation:
T_{8,j} denotes all of the n_8 split trees obtained after individual 8's tree is split with threshold t = 2; in Fig. 4, n_8 = 5.
root(T_{8,j}) denotes the set of all subtree roots. In Fig. 4, after individual 8's tree is split, six subtrees are produced in total: five split trees, rooted at (3, 2), (3, 4), (3, 9), (5, 3) and (5, 4), and one subtree left over from the original ancestry tree, rooted at (0, 8).
op(root(T_{8,j})) denotes the number of variation operations of each type experienced by those roots when the subtrees were split off. For example: mutation was used 0 times, crossover 5 times, repair 1 time, and random 1 time, from which the metric value can be computed.
- [Metric 3] To answer the question of when exploration starts, two metrics are proposed:
[Gripe 3: the avr function should be applied to the difference of the two gen functions; otherwise the left-hand side of the minus sign is a real number while the right-hand side is a set.]
as well as one that also reflects the number of splits,
where avr(.) is the averaging function, proot(.) returns, for a split-tree root, the root of its parent tree, and gen(.) is the generation-counting function.
Taking Fig. 4 as an example: for each split-tree root, proot returns the root of its parent tree, as shown in Fig. 6.
The generation gaps between them are thus obtained, from which the metric values follow.
Further, the tree level of a split tree is the number of splits experienced, starting from the tree root, until that split tree is obtained. Taking Fig. 4 as an example, the tree level of individual (5, 3) is 2 (path (0,8)-(3,4)-(5,3)); the tree levels of all other individuals are 1.
Exploitation metrics
Split trees and their root individuals are essential for exploration, but for exploitation what must be considered are the split-tree sizes, their nodes, and their structure. To traverse all nodes of a split tree, the individuals inside it are re-indexed, with the root individual receiving the first index and the last individual the index size(T).
- [Metric 4] From Metric 1, the percentage of exploitation is obtained as its complement.
- [Metric 5] This metric is similar to Metric 2 but slightly different; its concrete form is:
where the terms are explained by the computation below.
[Gripe 4: the formula in the paper is missing an equals sign.]
Taking Fig. 4 as an example:
The computation above proceeds as follows:
With the threshold set to 2, the term denotes the k-th node of the j-th split tree after individual 8's tree is split. Only non-root nodes are considered here, because for each split tree only the non-root nodes are produced by applying variation operators.
Taking Fig. 4 as an example, there are 10 non-root nodes in total: (1,2), (2,7), (2,6), (2,1), (4,7), (4,5), (3,6), (3,7), (4,8) and (5,6). Over the course of all the splits, these non-root nodes underwent 11 variation operations in total, including 2 mutations, 7 crossovers, 0 repairs, and 2 clones, from which the metric value can be computed.
- [Metric 6] This metric quantifies the influence of selection on exploitation. Its general form is:
where the function counts the number of leaf nodes of a split tree.
Taking Fig. 4 as an example, after the tree is split, root (0,8) has 5 leaf nodes: (2,6), (2,7), (3,6), (3,7) and (5,6); roots (3,4) and (3,9) have one leaf each, (4,5) and (4,7) respectively; and roots (5,3), (3,2) and (5,4) have no leaves. From this the metric value can be computed.
[Gripe 5: the denominator of the formula in the paper contains a typo.]
- [Metric 7] This metric evaluates the influence of revisited individuals. It is the proportion of unique individuals (individuals visited only once) among all individuals:
Taking the ancestry tree as an example, (3, 6), (4, 7) and (5, 6) are copied from (2, 1), (3, 9) and (4, 8), and furthermore (2, 6) and (3, 7) have the same genotypes as (0, 8) and (1, 2). The ratio follows from this.
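Metric 7 can be sketched as below, counting genotypes that occur exactly once among all evaluated individuals (the list-of-genes encoding is a hypothetical choice):

```python
from collections import Counter

def unique_ratio(genotypes):
    """Proportion of genotypes visited exactly once among all
    evaluated individuals (Metric 7 sketch)."""
    counts = Counter(tuple(g) for g in genotypes)
    uniques = sum(1 for c in counts.values() if c == 1)
    return uniques / len(genotypes)
```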