TY - JOUR
A1 - Mücke, Nicole
A1 - Blanchard, Gilles
T1 - Parallelizing spectrally regularized kernel algorithms
JF - Journal of Machine Learning Research
N2 - We consider a distributed learning approach in supervised learning for a large class of spectral regularization methods in a reproducing kernel Hilbert space (RKHS) framework. The data set of size n is partitioned into m = O(n^alpha), alpha < 1/2, disjoint subsamples. On each subsample, some spectral regularization method (belonging to a large class, including in particular kernel ridge regression, L2-boosting and spectral cut-off) is applied. The regression function f is then estimated via simple averaging, leading to a substantial reduction in computation time. We show that minimax optimal rates of convergence are preserved if m grows sufficiently slowly (corresponding to an upper bound for alpha) as n -> infinity, depending on the smoothness assumptions on f and the intrinsic dimensionality. In spirit, the analysis relies on a classical decomposition into bias and stochastic error.
KW - Distributed Learning
KW - Spectral Regularization
KW - Minimax Optimality
Y1 - 2018
SN - 1532-4435
VL - 19
PB - Microtome Publishing
CY - Cambridge, Mass.
ER -
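The partition-and-average scheme summarized in the abstract above can be made concrete in a few lines. The following is a minimal sketch, assuming kernel ridge regression as the spectral method and a Gaussian kernel; the function names, the bandwidth, lambda, and the choice m = n^0.4 are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the partition-and-average scheme from the abstract above,
# with kernel ridge regression standing in for the generic spectral method.
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

def krr_fit(X, y, lam=1e-2):
    """KRR on one subsample: solve (K + n*lam*I) alpha = y."""
    n = len(X)
    K = gaussian_kernel(X, X)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return X, alpha

def distributed_krr(X, y, m, lam=1e-2):
    """Split the data into m disjoint subsamples, fit KRR on each,
    and return a predictor that averages the m local estimates."""
    parts = np.array_split(np.random.permutation(len(X)), m)
    fits = [krr_fit(X[idx], y[idx], lam) for idx in parts]
    def predict(X_new):
        preds = [gaussian_kernel(X_new, Xi) @ ai for Xi, ai in fits]
        return np.mean(preds, axis=0)
    return predict

# Toy usage: n = 2000 points, m = O(n^alpha) subsamples with alpha = 0.4 < 1/2.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(2000, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(2000)
f_hat = distributed_krr(X, y, m=int(2000 ** 0.4))
print(f_hat(np.array([[0.5]])))
```

Each local fit costs O((n/m)^3) instead of O(n^3) for a single-machine solve, which is the source of the computational savings the abstract refers to.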
TY - THES
A1 - Mücke, Nicole
T1 - Direct and inverse problems in machine learning
T1 - Direkte und inverse Probleme im maschinellen Lernen
BT - kernel methods and spectral regularization
BT - Kern Methoden und spektrale Regularisierung
N2 - We analyze an inverse noisy regression model under random design with the aim of estimating the unknown target function based on a given set of data, drawn according to some unknown probability distribution. Our estimators are all constructed by kernel methods, building on a reproducing kernel Hilbert space structure and using spectral regularization methods. A first main result establishes upper and lower bounds for the rate of convergence under a given source condition assumption, restricting the class of admissible distributions. Since kernel methods scale poorly when massive datasets are involved, we study one approach to saving computation time and memory requirements in more detail. We show that parallelizing spectral algorithms also leads to minimax optimal rates of convergence, provided the number of machines is chosen appropriately. We emphasize that so far all estimators depend on the assumed a priori smoothness of the target function and on the eigenvalue decay of the kernel covariance operator, which are in general unknown. Obtaining good, purely data-driven estimators constitutes the problem of adaptivity, which we handle for the single-machine problem via a version of the Lepskii principle.
N2 - In this thesis, we analyze a noisy inverse regression model under random design. From given data we construct an estimator of the unknown function, which we assume to lie in a reproducing kernel Hilbert space. A first main result of this thesis concerns upper bounds on the rates of convergence. We impose so-called source conditions, defined via suitable balls in the range of (real) powers of the normalized kernel covariance operator. This restricts the class of distributions in a statistical model in which the spectral asymptotics of the covariance operator, which depends on the marginal distribution, are constrained. In this setting we prove upper and corresponding lower bounds on the convergence rates for a very general class of spectral regularization methods, thereby establishing the so-called minimax optimality of these rates. Since even at optimal convergence rates kernel methods applied to large datasets still consume an unsatisfactory amount of time and memory, we investigate in more detail one approach to saving computation time and reducing memory requirements. We study so-called distributed learning and prove a new result for our class of general spectral regularizations, though still under the assumption of known a priori regularity of the target function, expressed by fixing a source condition. The major problem in handling real data is that of adaptivity, i.e., specifying a procedure that, without such an a priori assumption, constructs an estimator from the data that is optimal in a certain sense. We address this via a variant of the balancing principle.
KW - inverse problems
KW - kernel methods
KW - minimax optimality
KW - inverse Probleme
KW - Kern Methoden
KW - Minimax Optimalität
Y1 - 2017
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-403479
ER -

TY - JOUR
A1 - Blanchard, Gilles
A1 - Mücke, Nicole
T1 - Optimal rates for regularization of statistical inverse learning problems
JF - Foundations of Computational Mathematics
N2 - We consider a statistical inverse learning (also called inverse regression) problem, where we observe the image of a function f through a linear operator A at i.i.d. random design points X_i, superposed with additive noise. The distribution of the design points is unknown and can be very general. We analyze simultaneously the direct (estimation of Af) and the inverse (estimation of f) learning problems. In this general framework, we obtain strong and weak minimax optimal rates of convergence (as the number of observations n grows large) for a large class of spectral regularization methods over regularity classes defined through appropriate source conditions. This improves on or completes previous results obtained in related settings. The optimality of the obtained rates is shown not only in the exponent in n but also in the explicit dependence of the constant factor on the variance of the noise and the radius of the source condition set.
KW - Reproducing kernel Hilbert space
KW - Spectral regularization
KW - Inverse problem
KW - Statistical learning
KW - Minimax convergence rates
Y1 - 2018
U6 - https://doi.org/10.1007/s10208-017-9359-7
SN - 1615-3375
SN - 1615-3383
VL - 18
IS - 4
SP - 971
EP - 1013
PB - Springer
CY - New York
ER -
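In symbols, the model and regularity classes that the two abstracts above refer to can be summarized as follows; the notation (B, r, R, theta) follows common conventions in this literature and is an assumption of this summary, not a quotation from the papers.

\[
  Y_i = (Af)(X_i) + \varepsilon_i, \qquad i = 1,\dots,n,
\]
with design points $X_i$ drawn i.i.d. from an unknown distribution and centered noise $\varepsilon_i$ of variance $\sigma^2$. A source condition of order $r > 0$ restricts the target function to a set of the form
\[
  \Omega(r,R) = \{\, f = B^{r} h : \|h\| \le R \,\},
\]
where $B$ is a suitable normalized covariance-type operator built from $A$ and the design distribution. Minimax optimality then means matching upper and lower bounds of the form
\[
  \inf_{\hat f}\ \sup_{f \in \Omega(r,R)} \mathbb{E}\,\bigl\|\hat f - f\bigr\|^2 \;\asymp\; R^2 \Bigl(\frac{\sigma^2}{R^2\, n}\Bigr)^{\theta},
\]
where the exponent $\theta \in (0,1)$ depends on the smoothness $r$ and on the eigenvalue decay of $B$; the explicit dependence on $\sigma$ and $R$ is exactly the refinement the abstract above emphasizes.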
TY - JOUR
A1 - Blanchard, Gilles
A1 - Mücke, Nicole
T1 - Kernel regression, minimax rates and effective dimensionality
BT - beyond the regular case
JF - Analysis and Applications
N2 - We investigate whether kernel regularization methods can achieve minimax convergence rates over a source condition regularity assumption for the target function. These questions have been considered in past literature, but only under specific assumptions about the decay, typically polynomial, of the spectrum of the kernel mapping covariance operator. From the perspective of distribution-free results, we investigate this issue under much weaker assumptions on the eigenvalue decay, allowing for more complex behavior that can reflect different structures in the data at different scales.
KW - Kernel regression
KW - minimax optimality
KW - eigenvalue decay
Y1 - 2020
U6 - https://doi.org/10.1142/S0219530519500258
SN - 0219-5305
SN - 1793-6861
VL - 18
IS - 4
SP - 683
EP - 696
PB - World Scientific
CY - New Jersey
ER -

TY - INPR
A1 - Blanchard, Gilles
A1 - Mücke, Nicole
T1 - Optimal rates for regularization of statistical inverse learning problems
N2 - We consider a statistical inverse learning problem, where we observe the image of a function f through a linear operator A at i.i.d. random design points X_i, superposed with additive noise. The distribution of the design points is unknown and can be very general. We analyze simultaneously the direct (estimation of Af) and the inverse (estimation of f) learning problems. In this general framework, we obtain strong and weak minimax optimal rates of convergence (as the number of observations n grows large) for a large class of spectral regularization methods over regularity classes defined through appropriate source conditions. This improves on or completes previous results obtained in related settings. The optimality of the obtained rates is shown not only in the exponent in n but also in the explicit dependence of the constant factor on the variance of the noise and the radius of the source condition set.
T3 - Preprints des Instituts für Mathematik der Universität Potsdam - 5 (2016) 5
KW - statistical inverse problem
KW - minimax rate
KW - kernel method
Y1 - 2016
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-89782
SN - 2193-6943
VL - 5
IS - 5
PB - Universitätsverlag Potsdam
CY - Potsdam
ER -
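The thesis record above mentions handling adaptivity via a variant of the balancing (Lepskii-type) principle. The sketch below shows the shape of such a rule under the assumption that estimators are computed along a grid of increasing regularization strength with known, decreasing bounds on the stochastic error; the comparison constant kappa and the stopping rule are illustrative, not the thesis' exact construction.

```python
# Hedged sketch of a Lepskii-type balancing principle for choosing the
# regularization parameter without a priori knowledge of the smoothness.
# Assumptions (not from the thesis): estimators[j] was fitted with
# regularization lambda_j, where lambda_0 < lambda_1 < ..., so the
# stochastic error bound sigma_bounds[j] decreases in j while the
# (unknown) bias increases in j.
import numpy as np

def balancing_choice(estimators, sigma_bounds, kappa=4.0):
    """Return the index of the most regularized estimator that stays
    within kappa * sigma_bounds[i] of every less-regularized estimator
    i < j. At that index, bias and stochastic error roughly balance."""
    best = 0
    for j in range(len(estimators)):
        if all(np.linalg.norm(estimators[j] - estimators[i])
               <= kappa * sigma_bounds[i] for i in range(j)):
            best = j
    return best
```

In practice the grid of regularization parameters would be geometric, and sigma_bounds would come from a concentration bound on the stochastic error term; the rule is purely data-driven once those bounds are fixed.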