TY  - THES
A1  - Harmeling, Stefan
T1  - Independent component analysis and beyond
N2  - 'Independent component analysis' (ICA) ist ein Werkzeug der statistischen Datenanalyse und Signalverarbeitung, welches multivariate Signale in ihre Quellkomponenten zerlegen kann. Obwohl das klassische ICA Modell sehr nützlich ist, gibt es viele Anwendungen, die Erweiterungen von ICA erfordern. In dieser Dissertation präsentieren wir neue Verfahren, die die Funktionalität von ICA erweitern: (1) Zuverlässigkeitsanalyse und Gruppierung von unabhängigen Komponenten durch Hinzufügen von Rauschen, (2) robuste und überbestimmte ('over-complete') ICA durch Ausreissererkennung, und (3) nichtlineare ICA mit Kernmethoden.
N2  - Independent component analysis (ICA) is a tool for statistical data analysis and signal processing that is able to decompose multivariate signals into their underlying source components. Although the classical ICA model is highly useful, there are many real-world applications that require powerful extensions of ICA. This thesis presents new methods that extend the functionality of ICA: (1) reliability and grouping of independent components with noise injection, (2) robust and overcomplete ICA with inlier detection, and (3) nonlinear ICA with kernel methods.
T2  - Independent component analysis and beyond
KW  - ICA
KW  - Zuverlässigkeitsanalyse
KW  - robuste ICA
KW  - überbestimmte ICA
KW  - Ausreissererkennung
KW  - nichtlineare ICA
KW  - Kern-PCA
KW  - Kernmethoden
KW  - ICA
KW  - reliability assessment
KW  - robust ICA
KW  - overcomplete ICA
KW  - outlier detection
KW  - nonlinear ICA
KW  - kernel PCA
KW  - kernel methods
Y1  - 2004
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-0001540
ER  - 
TY  - THES
A1  - Mücke, Nicole
T1  - Direct and inverse problems in machine learning
T1  - Direkte und inverse Probleme im maschinellen Lernen
BT  - kernel methods and spectral regularization
BT  - Kern Methoden und spektrale Regularisierung
N2  - We analyze an inverse noisy regression model under random design with the aim of estimating the unknown target function based on a given set of data, drawn according to some unknown probability distribution. Our estimators are all constructed by kernel methods, which depend on  a Reproducing Kernel Hilbert Space structure using spectral regularization methods. 

A first main result establishes upper and lower bounds for the rate of convergence under a given source condition assumption, restricting the class of admissible distributions. But since kernel methods scale poorly when massive datasets are involved, we study one example for saving computation time and memory requirements in more detail. We show that Parallelizing spectral algorithms also leads to minimax optimal rates of convergence provided the number of machines is chosen appropriately.

We emphasize that so far all estimators depend on the assumed  a-priori smoothness of the target function and on the eigenvalue decay of the kernel covariance operator, which are in general unknown. To obtain good purely data driven estimators constitutes the problem of adaptivity which we handle for the single machine problem via a version of the Lepskii principle.
N2  - In dieser Arbeit analysieren wir ein zufälliges und verrauschtes inverses Regressionsmodell im random design. Wir konstruiueren aus gegebenen Daten eine Schätzung der unbekannten Funktion, von der wir annehmen, dass sie in einem Hilbertraum mit reproduzierendem Kern liegt.

Ein erstes Hauptergebnis dieser Arbeit betrifft obere Schranken an die Konvergenzraten. Wir legen sog. source conditions fest, definiert über geeignete Kugeln im Wertebereich von (reellen) Potenzen des normierten Kern-Kovarianzoperators. Das führt zu  einer Einschränkung der Klasse der Verteilungen  in einem statistischen Modell, in dem die spektrale Asymptotik des von der Randverteilung abhängigen Kovarianzoperators eingeschränkt wird.

In diesem Kontext zeigen wir obere und entsprechende untere Schranken für die Konvergenzraten für eine sehr allgemeine Klasse spektraler Regularisierungsmethoden und etablieren damit die sog.  Minimax-Optimalität dieser Raten. Da selbst bei optimalen Konvergenzraten Kernmethoden, angewandt auf große Datenmengen, noch unbefriedigend viel Zeit verschlingen und hohen Speicherbedarf aufweisen, untersuchen wir einen Zugang zur Zeitersparnis und zur Reduktion des Speicherbedarfs detaillierter. Wir studieren das sog.  distributed learning  und beweisen für unsere Klasse allgemeiner spektraler Regularisierungen ein neues Resultat, allerdings immer noch unter der Annahme einer bekannten a priori Regularität der Zielfunktion, ausgedrückt durch die Fixierung einer source condition. Das große Problem bei der Behandlung realer Daten ist das der  Adaptivität, d.h. die Angabe eines Verfahrens, das ohne eine solche  a priori Voraussetzung  einen in einem gewissen Sinn optimalen Schätzer aus den  Daten konstruiert. Das behandeln wir vermöge einer Variante des Balancing principle.
KW  - inverse problems
KW  - kernel methods
KW  - minimax optimality
KW  - inverse Probleme
KW  - Kern Methoden
KW  - Minimax Optimalität
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-403479
ER  - 
TY  - JOUR
A1  - Wormell, Caroline L.
A1  - Reich, Sebastian
T1  - Spectral convergence of diffusion maps
BT  - Improved error bounds and an alternative normalization
JF  - SIAM journal on numerical analysis / Society for Industrial and Applied Mathematics
N2  - Diffusion maps is a manifold learning algorithm widely used for dimensionality reduction. Using a sample from a distribution, it approximates the eigenvalues and eigenfunctions of associated Laplace-Beltrami operators. Theoretical bounds on the approximation error are, however, generally much weaker than the rates that are seen in practice. This paper uses new approaches to improve the error bounds in the model case where the distribution is supported on a hypertorus. For the data sampling (variance) component of the error we make spatially localized compact embedding estimates on certain Hardy spaces; we study the deterministic (bias) component as a perturbation of the Laplace-Beltrami operator's associated PDE and apply relevant spectral stability results. Using these approaches, we match long-standing pointwise error bounds for both the spectral data and the norm convergence of the operator discretization. We also introduce an alternative normalization for diffusion maps based on Sinkhorn weights. This normalization approximates a Langevin diffusion on the sample and yields a symmetric operator approximation. We prove that it has better convergence compared with the standard normalization on flat domains, and we present a highly efficient rigorous algorithm to compute the Sinkhorn weights.
KW  - diffusion maps
KW  - graph Laplacian
KW  - Sinkhorn problem
KW  - kernel methods
Y1  - 2021
U6  - https://doi.org/10.1137/20M1344093
SN  - 0036-1429
SN  - 1095-7170
VL  - 59
IS  - 3
SP  - 1687
EP  - 1734
PB  - Society for Industrial and Applied Mathematics
CY  - Philadelphia
ER  - 
TY  - THES
A1  - Zadorozhnyi, Oleksandr
T1  - Contributions to the theoretical analysis of the algorithms with adversarial and dependent data
N2  - In this work I present the concentration inequalities of Bernstein's type for the norms of Banach-valued random sums under a general functional weak-dependency assumption (the so-called $\cC-$mixing). The latter is then used to prove, in the asymptotic framework, excess risk upper bounds of the regularised Hilbert valued statistical learning rules under the τ-mixing assumption on the underlying training sample. These results (of the batch statistical setting) are then supplemented with the regret analysis over the classes of Sobolev balls of the type of kernel ridge regression algorithm in the setting of online nonparametric regression with arbitrary data sequences. Here, in particular, a question of robustness of the kernel-based forecaster is investigated. Afterwards, in the framework of sequential learning, the multi-armed bandit problem under $\cC-$mixing assumption on the arm's outputs is considered and the complete regret analysis of a version of Improved UCB algorithm is given. Lastly, probabilistic inequalities of the first part are extended to the case of deviations (both of Azuma-Hoeffding's and of Burkholder's type) to the partial sums of real-valued weakly dependent random fields (under the type of projective dependence condition).
KW  - Machine learning
KW  - nonparametric regression
KW  - kernel methods
KW  - regularisation
KW  - concentration inequalities
KW  - learning rates
KW  - sequential learning
KW  - multi-armed bandits
KW  - Sobolev spaces
Y1  - 2021
ER  -