TY  - JOUR
A1  - Sugiyama, Masashi
A1  - Kawanabe, Motoaki
A1  - Müller, Klaus-Robert
T1  - Trading variance reduction with unbiasedness: the regularized subspace information criterion for robust model selection in kernel regression
N2  - A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we will stabilize the unbiased generalization error estimates by regularization and finally obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilization effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that the estimator of the expected squared error is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method works effectively for improving the precision of SIC, especially in high-noise-level cases. We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.
Y1  - 2004
SN  - 0899-7667
ER  - 
TY  - BOOK
A1  - Tsuda, Koji
A1  - Sugiyama, Masashi
A1  - Müller, Klaus-Robert
T1  - Subspace information criterion for non-quadratic regularizers: model selection for sparse regressors
T3  - GMD-Report
Y1  - 2000
VL  - 120
PB  - GMD-Forschungszentrum Informationstechnik
CY  - Sankt Augustin
ER  - 
TY  - JOUR
A1  - Blanchard, Gilles
A1  - Kawanabe, Motoaki
A1  - Sugiyama, Masashi
A1  - Spokoiny, Vladimir G.
A1  - Müller, Klaus-Robert
T1  - In search of non-Gaussian components of a high-dimensional distribution
N2  - Finding non-Gaussian components of high-dimensional data is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework. Our proposed method, called NGCA (non-Gaussian component analysis), is based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low-dimensional non-Gaussian target subspace, up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimation error tends to zero at a parametric rate, uniformly over the family. Numerical examples demonstrate the usefulness of our method.
Y1  - 2006
UR  - http://portal.acm.org/affiliated/jmlr/
SN  - 1532-4435
ER  - 
TY  - JOUR
A1  - Kawanabe, Motoaki
A1  - Blanchard, Gilles
A1  - Sugiyama, Masashi
A1  - Spokoiny, Vladimir G.
A1  - Müller, Klaus-Robert
T1  - A novel dimension reduction procedure for searching non-Gaussian subspaces
N2  - In this article, we consider high-dimensional data that contain a low-dimensional non-Gaussian structure contaminated with Gaussian noise and propose a new linear method to identify the non-Gaussian subspace. Our method, NGCA (Non-Gaussian Component Analysis), is based on a very general semi-parametric framework and has a theoretical guarantee that the estimation error of finding the non-Gaussian components tends to zero at a parametric rate. NGCA can be used not only as preprocessing for ICA, but also for extracting and visualizing more general structures like clusters. A numerical study demonstrates the usefulness of our method.
Y1  - 2006
UR  - http://www.springerlink.com/content/105633/
U6  - https://doi.org/10.1007/11679363_19
SN  - 0302-9743
ER  -