TY - JOUR
A1 - Ziehe, Andreas
A1 - Laskov, Pavel
A1 - Nolte, G
A1 - Müller, Klaus-Robert
T1 - A fast algorithm for joint diagonalization with non-orthogonal transformations and its application to blind source separation
N2 - A new efficient algorithm is presented for joint diagonalization of several matrices. The algorithm is based on the Frobenius-norm formulation of the joint diagonalization problem, and addresses diagonalization with a general, non-orthogonal transformation. The iterative scheme of the algorithm is based on a multiplicative update which ensures the invertibility of the diagonalizer. The algorithm's efficiency stems from the special approximation of the cost function resulting in a sparse, block-diagonal Hessian to be used in the computation of the quasi-Newton update step. Extensive numerical simulations illustrate the performance of the algorithm and provide a comparison to other leading diagonalization methods. The results of such comparison demonstrate that the proposed algorithm is a viable alternative to existing state-of-the-art joint diagonalization algorithms. The practical use of our algorithm is shown for blind source separation problems.
Y1 - 2004
ER -
TY - JOUR
A1 - Mieth, Bettina
A1 - Kloft, Marius
A1 - Rodriguez, Juan Antonio
A1 - Sonnenburg, Soren
A1 - Vobruba, Robin
A1 - Morcillo-Suarez, Carlos
A1 - Farre, Xavier
A1 - Marigorta, Urko M.
A1 - Fehr, Ernst
A1 - Dickhaus, Thorsten
A1 - Blanchard, Gilles
A1 - Schunk, Daniel
A1 - Navarro, Arcadi
A1 - Müller, Klaus-Robert
T1 - Combining Multiple Hypothesis Testing with Machine Learning Increases the Statistical Power of Genome-wide Association Studies
JF - Scientific reports
N2 - The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation.
To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008-2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0.
Y1 - 2016
U6 - https://doi.org/10.1038/srep36671
SN - 2045-2322
VL - 6
PB - Nature Publ. Group
CY - London
ER -
TY - JOUR
A1 - Kawanabe, Motoaki
A1 - Müller, Klaus-Robert
T1 - Estimating functions for blind separation when sources have variance dependencies
N2 - A blind separation problem where the sources are not independent, but have variance dependencies is discussed. For this scenario Hyvarinen and Hurri (2004) proposed an algorithm which requires no assumption on distributions of sources and no parametric model of dependencies between components. In this paper, we extend the semiparametric approach of Amari and Cardoso (1997) to variance dependencies and study estimating functions for blind separation of such dependent sources.
In particular, we show that many ICA algorithms are applicable to the variance-dependent model as well under mild conditions, although they should in principle not. Our results indicate that separation can be done based only on normalized sources which are adjusted to have stationary variances and is not affected by the dependent activity levels. We also study the asymptotic distribution of the quasi maximum likelihood method and the stability of the natural gradient learning in detail. Simulation results of artificial and realistic examples match well with our theoretical findings.
Y1 - 2005
ER -
TY - JOUR
A1 - Laub, Julian
A1 - Müller, Klaus-Robert
T1 - Feature discovery in non-metric pairwise data
N2 - Pairwise proximity data, given as similarity or dissimilarity matrix, can violate metricity. This occurs either due to noise, fallible estimates, or due to intrinsic non-metric features such as they arise from human judgments. So far the problem of non-metric pairwise data has been tackled by essentially omitting the negative eigenvalues or shifting the spectrum of the associated (pseudo) covariance matrix for a subsequent embedding. However, little attention has been paid to the negative part of the spectrum itself. In particular no answer was given to whether the directions associated to the negative eigenvalues would at all code variance other than noise related. We show by a simple, exploratory analysis that the negative eigenvalues can code for relevant structure in the data, thus leading to the discovery of new features, which were lost by conventional data analysis techniques. The information hidden in the negative eigenvalue part of the spectrum is illustrated and discussed for three data sets, namely USPS handwritten digits, text-mining and data from cognitive psychology.
Y1 - 2004
ER -
TY - JOUR
A1 - Blanchard, Gilles
A1 - Kawanabe, Motoaki
A1 - Sugiyama, Masashi
A1 - Spokoiny, Vladimir G.
A1 - Müller, Klaus-Robert
T1 - In search of non-Gaussian components of a high-dimensional distribution
N2 - Finding non-Gaussian components of high-dimensional data is an important preprocessing step for efficient information processing. This article proposes a new linear method to identify the "non-Gaussian subspace" within a very general semi-parametric framework. Our proposed method, called NGCA (non-Gaussian component analysis), is based on a linear operator which, to any arbitrary nonlinear (smooth) function, associates a vector belonging to the low dimensional non-Gaussian target subspace, up to an estimation error. By applying this operator to a family of different nonlinear functions, one obtains a family of different vectors lying in a vicinity of the target space. As a final step, the target space itself is estimated by applying PCA to this family of vectors. We show that this procedure is consistent in the sense that the estimation error tends to zero at a parametric rate, uniformly over the family. Numerical examples demonstrate the usefulness of our method.
Y1 - 2006
UR - http://portal.acm.org/affiliated/jmlr/
SN - 1532-4435
ER -
TY - JOUR
A1 - Rätsch, Gunnar
A1 - Schölkopf, B.
A1 - Smola, Alexander J.
A1 - Mika, Sebastian
A1 - Onoda, T.
A1 - Müller, Klaus-Robert
T1 - Robust ensemble learning for data analysis
Y1 - 2000
ER -