TY  - THES
A1  - Krejca, Martin Stefan
T1  - Theoretical analyses of univariate estimation-of-distribution algorithms
N2  - Optimization is a core part of technological advancement and is usually heavily aided by computers. However, since many optimization problems are hard, it is unrealistic to expect an optimal solution within reasonable time. Hence, heuristics are employed, that is, computer programs that try to produce solutions of high quality quickly. One special class is that of estimation-of-distribution algorithms (EDAs), which maintain a probabilistic model over the problem domain and evolve it over time. In an iterative fashion, an EDA uses its model to generate a set of solutions, which it then uses to refine the model such that the probability of producing good solutions increases.

In this thesis, we theoretically analyze the class of univariate EDAs over the Boolean domain, that is, over the space of all length-n bit strings. In this setting, the probabilistic model of a univariate EDA consists of an n-dimensional probability vector in which each component denotes the probability of sampling a 1 at that position when generating a bit string.

My contribution follows two main directions: first, we analyze general inherent properties of univariate EDAs. Second, we determine the expected run times of specific EDAs on benchmark functions from theory. In the first part, we characterize when EDAs are unbiased with respect to the problem encoding. We then consider a setting where all solutions look equally good to an EDA, and we show that the probabilistic model of an EDA quickly evolves into an incorrect model if it is always updated such that it does not change in expectation.

In the second part, we first show that the algorithms cGA and MMAS-fp are able to efficiently optimize a noisy version of the classical benchmark function OneMax. We perturb the function by adding Gaussian noise with a variance of σ², and we prove that the algorithms are able to generate the true optimum in a time polynomial in σ² and the problem size n. For the MMAS-fp, we generalize this result to linear functions. Further, we prove a lower bound of Ω(n log(n)) on the run time of the algorithm UMDA on (noise-free) OneMax. Last, we introduce a new algorithm that is able to optimize the benchmark functions OneMax and LeadingOnes both in time O(n log(n)), which is a novelty for heuristics in the domain we consider.
N2  - Optimization is a core part of technological progress and is often computer-aided. However, since many optimization problems are hard, it is unrealistic to expect an optimal solution within reasonable time. Hence, heuristics are used, that is, programs that try to produce high-quality solutions quickly. One specific class is that of estimation-of-distribution algorithms (EDAs), which are characterized by evolving probabilistic models over the problem space. Such a model is used to generate new solutions and thereby refine the model, so that better solutions are generated with increased probability in the next step.

In this thesis, we study the class of univariate EDAs over the Boolean domain, that is, over the space of all bit strings of length n. The probabilistic model of a univariate EDA then consists of an n-dimensional probability vector in which each component gives the probability of generating a 1 at the corresponding position.

My contribution follows two main directions: first, we study general inherent properties of univariate EDAs. Then we determine the expected run time of certain EDAs on benchmarks from theory. In the first part, we characterize when EDAs are unbiased with respect to the problem encoding. We then study them in a scenario in which all solutions are equally good and show that their model quickly evolves into an incorrect one if it is always updated such that nothing changes in expectation.

In the second part, we show that the algorithms cGA and MMAS-fp efficiently optimize a noisy variant of the classical benchmark OneMax, in which Gaussian noise with variance σ² is added. We prove that the algorithms generate the true optimum in time polynomial in σ² and n. For MMAS-fp, we generalize this result to linear functions. Furthermore, we prove a run time of Ω(n log(n)) for the algorithm UMDA on OneMax (without noise). Finally, we introduce a new algorithm that optimizes the benchmarks OneMax and LeadingOnes in O(n log(n)), which had not previously been shown for any heuristic.
T2  - Theoretische Analysen univariater Estimation-of-Distribution-Algorithmen
KW  - theory
KW  - estimation-of-distribution algorithms
KW  - univariate
KW  - pseudo-Boolean optimization
KW  - run time analysis
KW  - Theorie
KW  - Estimation-of-Distribution-Algorithmen
KW  - univariat
KW  - pseudoboolesche Optimierung
KW  - Laufzeitanalyse
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-434870
ER  - 
TY  - JOUR
A1  - Doerr, Benjamin
A1  - Krejca, Martin S.
T1  - Significance-based estimation-of-distribution algorithms
JF  - IEEE transactions on evolutionary computation
N2  - Estimation-of-distribution algorithms (EDAs) are randomized search heuristics that create a probabilistic model of the solution space, which is updated iteratively, based on the quality of the solutions sampled according to the model. As previous works show, this iteration-based perspective can lead to erratic updates of the model, in particular, to bit-frequencies approaching a random boundary value. In order to overcome this problem, we propose a new EDA based on the classic compact genetic algorithm (cGA) that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significance-based cGA (sig-cGA) optimizes the commonly regarded benchmark functions OneMax (OM), LeadingOnes, and BinVal all in quasilinear time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed stable compact genetic algorithm (an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model), we prove that it optimizes OM only in a time exponential in its hypothetical population size. Similarly, we show that the convex search algorithm cannot optimize OM in polynomial time.
KW  - heuristic algorithms
KW  - sociology
KW  - statistics
KW  - history
KW  - probabilistic logic
KW  - benchmark testing
KW  - genetic algorithms
KW  - estimation-of-distribution algorithm (EDA)
KW  - run time analysis
KW  - theory
Y1  - 2020
U6  - https://doi.org/10.1109/TEVC.2019.2956633
SN  - 1089-778X
SN  - 1941-0026
VL  - 24
IS  - 6
SP  - 1025
EP  - 1034
PB  - Institute of Electrical and Electronics Engineers
CY  - New York, NY
ER  - 
TY  - JOUR
A1  - Krejca, Martin S.
A1  - Witt, Carsten
T1  - Lower bounds on the run time of the Univariate Marginal Distribution Algorithm on OneMax
JF  - Theoretical computer science : the journal of the EATCS
N2  - The Univariate Marginal Distribution Algorithm (UMDA), a popular estimation-of-distribution algorithm, is studied from a run time perspective. On the classical OneMax benchmark function on bit strings of length n, a lower bound of Ω(λ + μ√n + n log n), where μ and λ are algorithm-specific parameters, on its expected run time is proved. This is the first direct lower bound on the run time of UMDA. It is stronger than the bounds that follow from general black-box complexity theory and is matched by the run time of many evolutionary algorithms. The results are obtained through advanced analyses of the stochastic change of the frequencies of bit values maintained by the algorithm, including carefully designed potential functions. These techniques may prove useful in advancing the field of run time analysis for estimation-of-distribution algorithms in general.
KW  - estimation-of-distribution algorithm
KW  - run time analysis
KW  - lower bound
Y1  - 2020
U6  - https://doi.org/10.1016/j.tcs.2018.06.004
SN  - 0304-3975
SN  - 1879-2294
VL  - 832
SP  - 143
EP  - 165
PB  - Elsevier
CY  - Amsterdam [u.a.]
ER  - 
TY  - JOUR
A1  - Kötzing, Timo
A1  - Lagodzinski, Gregor J. A.
A1  - Lengler, Johannes
A1  - Melnichenko, Anna
T1  - Destructiveness of lexicographic parsimony pressure and alleviation by a concatenation crossover in genetic programming
JF  - Theoretical computer science
N2  - For theoretical analyses, two specifics distinguish genetic programming (GP) from many other areas of evolutionary computation: the variable-size representations, which in particular can yield bloat (i.e., the growth of individuals with redundant parts); and the role and realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover has had surprisingly little share in this work. We analyze a simple crossover operator in combination with randomized local search, where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); we denote the resulting algorithm Concatenation Crossover GP. We consider three variants of the well-studied MAJORITY test function, adding large plateaus in different ways to the fitness landscape and thus giving a test bed for analyzing the interplay of variation operators and bloat control mechanisms in a setting with local optima. We show that the Concatenation Crossover GP can efficiently optimize these test functions, while local search cannot be efficient for all three variants independent of employing bloat control.
KW  - genetic programming
KW  - mutation
KW  - theory
KW  - run time analysis
Y1  - 2020
U6  - https://doi.org/10.1016/j.tcs.2019.11.036
SN  - 0304-3975
VL  - 816
SP  - 96
EP  - 113
PB  - Elsevier
CY  - Amsterdam
ER  -