TY - JOUR
A1 - Sugiyama, Masashi
A1 - Kawanabe, Motoaki
A1 - Müller, Klaus-Robert
T1 - Trading variance reduction with unbiasedness : the regularized subspace information criterion for robust model selection in kernel regression
N2 - A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we will stabilize the unbiased generalization error estimates by regularization and finally obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilization effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error between SIC and the expected generalization error, and propose determining the degree of regularization of SIC such that the estimator of the expected squared error is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method works effectively for improving the precision of SIC, especially in the high-noise-level cases.
We furthermore compare the proposed method to the original SIC, cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results.
Y1 - 2004
SN - 0899-7667
ER -

TY - BOOK
A1 - Tsuda, Koji
A1 - Sugiyama, Masashi
A1 - Müller, Klaus-Robert
T1 - Subspace information criterion for non-quadratic regularizers : model selection for sparse regressors
T3 - GMD-Report
Y1 - 2000
VL - 120
PB - GMD-Forschungszentrum Informationstechnik
CY - Sankt Augustin
ER -

TY - JOUR
A1 - Blanchard, Gilles
A1 - Kawanabe, Motoaki
A1 - Sugiyama, Masashi
A1 - Spokoiny, Vladimir G.
A1 - Müller, Klaus-Robert
T1 - In search of non-Gaussian components of a high-dimensional distribution
N2 - Finding non-Gaussian components of high-dimensional data is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework. Our proposed method, called NGCA (non-Gaussian component analysis), is based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low-dimensional non-Gaussian target subspace, up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimation error tends to zero at a parametric rate, uniformly over the family. Numerical examples demonstrate the usefulness of our method.
Y1 - 2006
UR - http://portal.acm.org/affiliated/jmlr/
SN - 1532-4435
ER -

TY - JOUR
A1 - Kawanabe, Motoaki
A1 - Blanchard, Gilles
A1 - Sugiyama, Masashi
A1 - Spokoiny, Vladimir G.
A1 - Müller, Klaus-Robert
T1 - A novel dimension reduction procedure for searching non-Gaussian subspaces
N2 - In this article, we consider high-dimensional data which contains a low-dimensional non-Gaussian structure contaminated with Gaussian noise and propose a new linear method to identify the non-Gaussian subspace. Our method NGCA (Non-Gaussian Component Analysis) is based on a very general semi-parametric framework and has a theoretical guarantee that the estimation error of finding the non-Gaussian components tends to zero at a parametric rate. NGCA can be used not only as preprocessing for ICA, but also for extracting and visualizing more general structures like clusters. A numerical study demonstrates the usefulness of our method.
Y1 - 2006
UR - http://www.springerlink.com/content/105633/
U6 - https://doi.org/10.1007/11679363_19
SN - 0302-9743
ER -