TY - THES A1 - Krejca, Martin Stefan T1 - Theoretical analyses of univariate estimation-of-distribution algorithms N2 - Optimization is a core part of technological advancement and is usually heavily aided by computers. However, since many optimization problems are hard, it is unrealistic to expect an optimal solution within reasonable time. Hence, heuristics are employed, that is, computer programs that try to produce solutions of high quality quickly. One special class is that of estimation-of-distribution algorithms (EDAs), which are characterized by maintaining a probabilistic model over the problem domain, which they evolve over time. In an iterative fashion, an EDA uses its model in order to generate a set of solutions, which it then uses to refine the model such that the probability of producing good solutions is increased. In this thesis, we theoretically analyze the class of univariate EDAs over the Boolean domain, that is, over the space of all length-n bit strings. In this setting, the probabilistic model of a univariate EDA consists of an n-dimensional probability vector where each component denotes the probability to sample a 1 for that position in order to generate a bit string. My contribution follows two main directions: first, we analyze general inherent properties of univariate EDAs. Second, we determine the expected run times of specific EDAs on benchmark functions from theory. In the first part, we characterize when EDAs are unbiased with respect to the problem encoding. We then consider a setting where all solutions look equally good to an EDA, and we show that the probabilistic model of an EDA quickly evolves into an incorrect model if it is always updated such that it does not change in expectation. In the second part, we first show that the algorithms cGA and MMAS-fp are able to efficiently optimize a noisy version of the classical benchmark function OneMax. 
We perturb the function by adding Gaussian noise with a variance of σ², and we prove that the algorithms are able to generate the true optimum in a time polynomial in σ² and the problem size n. For the MMAS-fp, we generalize this result to linear functions. Further, we prove a run time of Ω(n log(n)) for the algorithm UMDA on (unnoisy) OneMax. Last, we introduce a new algorithm that is able to optimize the benchmark functions OneMax and LeadingOnes both in O(n log(n)), which is a novelty for heuristics in the domain we consider. N2 - Optimization is a main component of technological progress and is often computer-aided. Since many optimization problems are hard, however, it is unrealistic to expect an optimal solution within reasonable time. Therefore, heuristics are used, that is, programs that try to produce high-quality solutions quickly. One concrete class is that of estimation-of-distribution algorithms (EDAs), which are characterized by evolving probabilistic models over the problem space. Such a model is used to generate new solutions and thereby to refine the model, so that better solutions are generated with higher probability in the next step. In this thesis, we study the class of univariate EDAs in the Boolean domain, that is, in the space of all bit strings of length n. The probabilistic model of a univariate EDA then consists of an n-dimensional probability vector in which each component gives the probability of generating a 1 at the corresponding position. My contribution follows two main directions: first, we study general inherent properties of univariate EDAs. Then, we determine the expected run time of certain EDAs on benchmarks from theory. In the first part, we characterize when EDAs are unbiased with respect to the problem encoding. 
We then study them in a scenario in which all solutions are equally good, and we show that their model quickly evolves into an incorrect one if it is always updated such that nothing changes in expectation. In the second part, we show that the algorithms cGA and MMAS-fp efficiently optimize a noisy variant of the classical benchmark OneMax, in which Gaussian noise with variance σ² is added. We prove that the algorithms generate the true optimum in time polynomial in σ² and n. For the MMAS-fp, we generalize this result to linear functions. Furthermore, we prove a run time of Ω(n log(n)) for the algorithm UMDA on OneMax (without noise). Finally, we introduce a new algorithm that optimizes the benchmarks OneMax and LeadingOnes in O(n log(n)), which had not previously been shown for any heuristic. T2 - Theoretische Analysen univariater Estimation-of-Distribution-Algorithmen KW - theory KW - estimation-of-distribution algorithms KW - univariate KW - pseudo-Boolean optimization KW - run time analysis KW - Theorie KW - Estimation-of-Distribution-Algorithmen KW - univariat KW - pseudoboolesche Optimierung KW - Laufzeitanalyse Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-434870 ER - TY - JOUR A1 - Doerr, Benjamin A1 - Krejca, Martin S. T1 - Significance-based estimation-of-distribution algorithms JF - IEEE transactions on evolutionary computation N2 - Estimation-of-distribution algorithms (EDAs) are randomized search heuristics that create a probabilistic model of the solution space, which is updated iteratively, based on the quality of the solutions sampled according to the model. As previous works show, this iteration-based perspective can lead to erratic updates of the model, in particular, to bit-frequencies approaching a random boundary value. 
In order to overcome this problem, we propose a new EDA based on the classic compact genetic algorithm (cGA) that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significance-based cGA (sig-cGA) optimizes the commonly regarded benchmark functions OneMax (OM), LeadingOnes, and BinVal all in quasilinear time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed stable compact genetic algorithm, an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model, we prove that it optimizes OM only in a time exponential in its hypothetical population size. Similarly, we show that the convex search algorithm cannot optimize OM in polynomial time. KW - heuristic algorithms KW - sociology KW - statistics KW - history KW - probabilistic logic KW - benchmark testing KW - genetic algorithms KW - estimation-of-distribution algorithm (EDA) KW - run time analysis KW - theory Y1 - 2020 U6 - https://doi.org/10.1109/TEVC.2019.2956633 SN - 1089-778X SN - 1941-0026 VL - 24 IS - 6 SP - 1025 EP - 1034 PB - Institute of Electrical and Electronics Engineers CY - New York, NY ER - TY - JOUR A1 - Krejca, Martin S. A1 - Witt, Carsten T1 - Lower bounds on the run time of the Univariate Marginal Distribution Algorithm on OneMax JF - Theoretical computer science : the journal of the EATCS N2 - The Univariate Marginal Distribution Algorithm (UMDA), a popular estimation-of-distribution algorithm, is studied from a run time perspective. On the classical OneMax benchmark function on bit strings of length n, a lower bound of Ω(λ + μ√n + n log n), where μ and λ are algorithm-specific parameters, on its expected run time is proved. This is the first direct lower bound on the run time of UMDA. 
It is stronger than the bounds that follow from general black-box complexity theory and is matched by the run time of many evolutionary algorithms. The results are obtained through advanced analyses of the stochastic change of the frequencies of bit values maintained by the algorithm, including carefully designed potential functions. These techniques may prove useful in advancing the field of run time analysis for estimation-of-distribution algorithms in general. KW - estimation-of-distribution algorithm KW - run time analysis KW - lower bound Y1 - 2020 U6 - https://doi.org/10.1016/j.tcs.2018.06.004 SN - 0304-3975 SN - 1879-2294 VL - 832 SP - 143 EP - 165 PB - Elsevier CY - Amsterdam [et al.] ER - TY - JOUR A1 - Kötzing, Timo A1 - Lagodzinski, Gregor J. A. A1 - Lengler, Johannes A1 - Melnichenko, Anna T1 - Destructiveness of lexicographic parsimony pressure and alleviation by a concatenation crossover in genetic programming JF - Theoretical computer science N2 - For theoretical analyses there are two specifics distinguishing genetic programming (GP) from many other areas of evolutionary computation: the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts); and also the role and the realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had surprisingly little share in this work.
We analyze a simple crossover operator in combination with randomized local search, where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); we denote the resulting algorithm Concatenation Crossover GP. We consider three variants of the well-studied MAJORITY test function, adding large plateaus in different ways to the fitness landscape and thus giving a test bed for analyzing the interplay of variation operators and bloat control mechanisms in a setting with local optima. We show that the Concatenation Crossover GP can efficiently optimize these test functions, while local search cannot be efficient for all three variants independent of employing bloat control. KW - genetic programming KW - mutation KW - theory KW - run time analysis Y1 - 2020 U6 - https://doi.org/10.1016/j.tcs.2019.11.036 SN - 0304-3975 VL - 816 SP - 96 EP - 113 PB - Elsevier CY - Amsterdam ER -