TY  - THES
A1  - Quinzan, Francesco
T1  - Combinatorial problems and scalability in artificial intelligence
N2  - Modern datasets often exhibit diverse, feature-rich, unstructured data, and they are massive in size. This is the case for social networks, the human genome, and e-commerce databases. As Artificial Intelligence (AI) systems are increasingly used to detect patterns in data and predict future outcomes, there are growing concerns about their ability to process large amounts of data. Motivated by these concerns, we study the problem of designing AI systems that scale to very large and heterogeneous datasets.

Many AI systems must solve combinatorial optimization problems in their course of action. These optimization problems are typically NP-hard, and they may exhibit additional side constraints. However, the underlying objective functions often exhibit additional properties that can be exploited to design suitable optimization algorithms. One of these properties is the well-studied notion of submodularity, which captures diminishing returns. Submodularity is often found in real-world applications. Furthermore, many relevant applications exhibit generalizations of this property.

In this thesis, we propose new scalable optimization algorithms for combinatorial problems with diminishing returns. Specifically, we focus on three problems: the Maximum Entropy Sampling problem, Video Summarization, and Feature Selection. For each problem, we propose new algorithms that work at scale. These algorithms are based on a variety of techniques, such as forward step-wise selection and adaptive sampling. Our proposed algorithms yield strong approximation guarantees, and they perform well experimentally.

We first study the Maximum Entropy Sampling problem. This problem consists of selecting, from a larger set, a subset of random variables that maximizes the entropy. By using diminishing return properties, we develop a simple forward step-wise selection optimization algorithm for this problem. Then, we study the problem of selecting a subset of frames that represent a given video. Again, this problem corresponds to a submodular maximization problem. We provide a new adaptive sampling algorithm for this problem, suitable for handling the complex side constraints imposed by the application. We conclude by studying Feature Selection. In this case, the underlying objective functions generalize the notion of submodularity. We provide a new adaptive sequencing algorithm for this problem, based on the Orthogonal Matching Pursuit paradigm.

Overall, we study practically relevant combinatorial problems, and we propose new algorithms to solve them. We demonstrate that these algorithms are suitable to handle massive datasets. However, our analysis is not problem-specific, and our results can be applied to other domains, if diminishing return properties hold. We hope that the flexibility of our framework inspires further research into scalability in AI.
N2  - Modern datasets often consist of diverse, feature-rich, and unstructured data, and they are also very large. This applies in particular to social networks, the human genome, and e-commerce databases. As artificial intelligence (AI) is increasingly used to detect patterns in data and predict future outcomes, concerns are also growing about its ability to process large amounts of data. For this reason, we study the problem of developing AI systems that scale to large and heterogeneous datasets.
Many AI systems must solve combinatorial optimization problems during operation. These optimization problems are usually NP-hard and may exhibit additional side constraints. However, the objective functions of these problems often have additional properties. These properties can be used to develop suitable optimization algorithms. One of these properties is the well-studied concept of submodularity, which describes diminishing returns. Submodularity is found in many real-world applications. In addition, many relevant applications exhibit generalizations of this property.
In this thesis, we propose new scalable algorithms for combinatorial problems with diminishing returns. We focus in particular on three problems: the Maximum Entropy Sampling problem, Video Summarization, and Feature Selection. For each of these problems, we propose new algorithms that scale well. These algorithms are based on various techniques, such as forward step-wise selection and adaptive sampling.
The algorithms we propose offer very good approximation guarantees and also perform well experimentally.
We first study the Maximum Entropy Sampling problem. This problem consists of selecting random variables from a larger set so as to maximize the entropy. By using diminishing return properties, we develop a simple forward step-wise selection optimization algorithm for this problem. We then study the problem of selecting a subset of frames that represent a given video. This problem corresponds to a submodular maximization problem. For it, we provide a new adaptive sampling algorithm that can also satisfy the complex side constraints arising from the application. Finally, we study Feature Selection. In this case, the underlying objective functions generalize the idea of submodularity. We present a new adaptive sequencing algorithm for this problem, based on the Orthogonal Matching Pursuit paradigm.
Overall, we study practically relevant combinatorial problems and propose new algorithms to solve them. We show that these algorithms are suitable for processing large datasets. However, our analysis is not problem-specific, and our results can also be applied to other domains, provided that diminishing return properties hold. We hope that the flexibility of our framework stimulates further research on scalability in AI.
KW  - artificial intelligence
KW  - scalability
KW  - optimization
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-611114
ER  - 
TY  - JOUR
A1  - Karamzadeh Toularoud, Nasim
A1  - Heimann, Sebastian
A1  - Dahm, Torsten
A1  - Krüger, Frank
T1  - Earthquake source arrays
BT  - optimal configuration and applications in crustal structure studies
JF  - Geophysical journal international
N2  - A collection of earthquake sources recorded at a single station is, under specific conditions, considered a source array (SA), which is interpreted as if the earthquake sources originate at the station location and are recorded at the source location. Array processing methods, that is, array beamforming, are then applicable to analyse the recorded signals. A possible application is to use source array multiple event techniques to locate and characterize near-source scatterers and structural interfaces. This work aims to facilitate the use of earthquake source arrays by presenting an automatic search algorithm to configure the source array elements. We developed a procedure to search for an optimal source array element distribution given an earthquake catalogue that includes accurate origin times and hypocentre locations. The objective function of the optimization process can be flexibly defined for each application to ensure the prerequisites (criteria) for forming a source array. We formulated four quantitative criteria as subfunctions and used the weighted sum technique to combine them into a single scalar function. The criteria are: (1) to control the accuracy of the slowness vector estimation using the time domain beamforming method, (2) to measure the waveform coherency of the array elements, (3) to select events with lower location error and (4) to select traces with high energy of specific phases, that is, sp- or ps-phases. The proposed procedure is verified using synthetic data as well as real examples for the Vogtland region in Northwest Bohemia. We discuss the possible application of the optimized source arrays to identify the location of scatterers in the velocity model by presenting a synthetic test and an example using real waveforms.
KW  - location of scatterers
KW  - optimization
KW  - source array design
Y1  - 2020
U6  - https://doi.org/10.1093/gji/ggaa002
SN  - 0956-540X
SN  - 1365-246X
VL  - 221
IS  - 1
SP  - 352
EP  - 370
PB  - Oxford Univ. Press
CY  - Oxford
ER  - 
TY  - JOUR
A1  - Babalola, Jonathan Oyebamiji
A1  - Omorogie, Martins Osaigbovo
A1  - Babarinde, Adesola Abiola
A1  - Unuabonah, Emmanuel Iyayi
A1  - Oninla, Vincent Olukayode
T1  - Optimization of the biosorption of Cr3+, Cd2+ and Pb2+ using a new biowaste: Zea mays seed chaff
JF  - Environmental engineering and management journal
N2  - This study highlights the potential use of yellow Zea mays seed chaff (YZMSC) biomass as a biosorbent for the removal of Cr3+, Cd2+ and Pb2+ ions from aqueous solutions. Fourier transform infrared analysis of the biomass suggests that YZMSC biomass is basically composed of cellulose and methyl cellulose. The biosorption capacities, q(max), of YZMSC biomass for Cr3+, Cd2+ and Pb2+ are 14.68, 121.95 and 384.62 mg/g respectively. Biosorption equilibrium was achieved at 20, 30 and 60 min for Cr3+, Cd2+ and Pb2+ respectively. YZMSC biomass was found to have a higher biosorption capacity and overall kinetic rate of uptake for Pb2+ than for Cd2+ and Cr3+. However, Cr3+ had a better initial kinetic rate of uptake by the biomass than Pb2+ and Cd2+. The Freundlich equilibrium isotherm model was found to describe the equilibrium data better than the Langmuir model, suggesting that biosorption of these metal ions could occur on more than one active site on the surface of YZMSC biomass. The kinetic study indicated that the pseudo-second-order kinetic model describes the kinetic data better than either the modified pseudo-first-order or Bangham kinetic models. Biosorption of Cr3+, Cd2+ and Pb2+ onto YZMSC biomass was endothermic in nature with large positive entropy values. Biosorption of these metal ions onto YZMSC biomass was observed to be feasible and spontaneous above 283 K. Optimization of biomass weight for the removal of these metal ions suggests that 384 kg, 129 kg and 144 kg of YZMSC biomass are required for the removal of 95% of Cr3+, Cd2+ and Pb2+ metal ions respectively from 100 mg/L of metal ions in 10 tonnes of aqueous solution.
KW  - biomass
KW  - biosorption
KW  - optimization
KW  - yellow Zea mays
Y1  - 2016
SN  - 1582-9596
SN  - 1843-3707
VL  - 15
SP  - 1571
EP  - 1580
PB  - Gh. Asachi Universitatea Tehnică Iaşi
CY  - Iasi
ER  - 
TY  - JOUR
A1  - Käser, Tanja
A1  - Baschera, Gian-Marco
A1  - Kohn, Juliane
A1  - Kucian, Karin
A1  - Richtmann, Verena
A1  - Grond, Ursina
A1  - Gross, Markus
A1  - von Aster, Michael G.
T1  - Design and evaluation of the computer-based training program Calcularis for enhancing numerical cognition
JF  - Frontiers in psychology
N2  - This article presents the design and a first pilot evaluation of the computer-based training program Calcularis for children with developmental dyscalculia (DD) or difficulties in learning mathematics. The program has been designed according to insights on the typical and atypical development of mathematical abilities. The learning process is supported through multimodal cues, which encode different properties of numbers. To offer optimal learning conditions, a user model completes the program and allows flexible adaptation to a child's individual learning and knowledge profile. Thirty-two children with difficulties in learning mathematics completed the 6- to 12-week computer training. The children played the game for 20 min per day, 5 days a week. The training effects were evaluated using neuropsychological tests. Generally, children benefited significantly from the training regarding number representation and arithmetic operations. Furthermore, children liked to play with the program and reported that the training improved their mathematical abilities.
KW  - learning
KW  - intervention
KW  - optimization
KW  - calculation
KW  - spatial representation
KW  - interactive learning environment
Y1  - 2013
U6  - https://doi.org/10.3389/fpsyg.2013.00489
SN  - 1664-1078
VL  - 4
IS  - 31
PB  - Frontiers Research Foundation
CY  - Lausanne
ER  -