## Institut für Informatik und Computational Science

### Refine

#### Document Type

- Article (25)
- Monograph/Edited Volume (2)

In this article, we consider high-dimensional data which contains a low-dimensional non-Gaussian structure contaminated with Gaussian noise and propose a new linear method to identify the non-Gaussian subspace. Our method NGCA (Non-Gaussian Component Analysis) is based on a very general semi-parametric framework and has a theoretical guarantee that the estimation error of finding the non-Gaussian components tends to zero at a parametric rate. NGCA can be used not only as preprocessing for ICA, but also for extracting and visualizing more general structures like clusters. A numerical study demonstrates the usefulness of our method

An asymptotic analysis and improvement of AdaBoost in the binary classification case (in Japanese)
(2000)

Recently blind source separation (BSS) methods have been highly successful when applied to biomedical data. This paper reviews the concept of BSS and demonstrates its usefulness in the context of event-related MEG measurements. In a first experiment we apply BSS to artifact identification of raw MEG data and discuss how the quality of the resulting independent component projections can be evaluated. The second part of our study considers averaged data of event-related magnetic fields. Here, it is particularly important to monitor and thus avoid possible overfitting due to limited sample size. A stability assessment of the BSS decomposition allows to solve this task and an additional grouping of the BSS components reveals interesting structure, that could ultimately be used for gaining a better physiological modeling of the data

Noninvasive electroencephalogram (EEG) recordings provide for easy and safe access to human neocortical processes which can be exploited for a brain-computer interface (BCI). At present, however, the use of BCIs is severely limited by low bit-transfer rates. We systematically analyze and develop two recent concepts, both capable of enhancing the information gain from multichannel scalp EEG recordings: 1) the combination of classifiers, each specifically tailored for different physiological phenomena, e.g., slow cortical potential shifts, such as the premovement Bereitschaftspotential or differences in spatio-spectral distributions of brain activity (i.e., focal event-related desynchronizations) and 2) behavioral paradigms inducing the subjects to generate one out of several brain states (multiclass approach) which all bare a distinctive spatio-temporal signature well discriminable in the standard scalp EEG. We derive information-theoretic predictions and demonstrate their relevance in experimental data. We will show that a suitably arranged interaction between these concepts can significantly boost BCI performances

When decomposing single trial electroencephalography it is a challenge to incorporate prior physiological knowledge. Here, we develop a method that uses prior information about the phase-locking property of event-related potentials in a regularization framework to bias a blind source separation algorithm toward an improved separation of single-trial phase-locked responses in terms of an increased signal-to-noise ratio. In particular, we suggest a transformation of the data, using weighted average of the single trial and trial-averaged response, that redirects the focus of source separation methods onto the subspace of event-related potentials. The practical benefit with respect to an improved separation of such components from ongoing background activity and extraneous noise is first illustrated on artificial data and finally verified in a real-world application of extracting single-trial somatosensory evoked potentials from multichannel EEG-recordings

We propose simple and fast methods based on nearest neighbors that order objects from high-dimensional data sets from typical points to untypical points. On the one hand, we show that these easy-to-compute orderings allow us to detect outliers (i.e. very untypical points) with a performance comparable to or better than other often much more sophisticated methods. On the other hand, we show how to use these orderings to detect prototypes (very typical points) which facilitate exploratory data analysis algorithms such as noisy nonlinear dimensionality reduction and clustering. Comprehensive experiments demonstrate the validity of our approach.

Independent component analysis of noninvasively recorded cortical magnetic DC-fields in humans
(2000)

Usually, noise is considered to be destructive. We present a new method that constructively injects noise to assess the reliability and the grouping structure of empirical ICA component estimates. Our method can be viewed as a Monte-Carlo-style approximation of the curvature of some performance measure at the solution. Simulations show that the true root-mean-squared angle distances between the real sources and the source estimates can be approximated well by our method. In a toy experiment, we see that we are also able to reveal the underlying grouping structure of the extracted ICA components. Furthermore, an experiment with fetal ECG data demonstrates that our approach is useful for exploratory data analysis of real-world data. (C) 2003 Elsevier B.V. All rights reserved

This paper proposes a new independent component analysis (ICA) method which is able to unmix overcomplete mixtures of sparce or structured signals like speech, music or images. Furthermore, the method is designed to be robust against outliers, which is a favorable feature for ICA algorithms since most of them are extremely sensitive to outliers. Our approach is based on a simple outlier index. However, instead of robustifying an existing algorithm by some outlier rejection technique we show how this index can be used directly to solve the ICA problem for super-Gaussian sources. The resulting inlier-based ICA (IBICA) is outlier-robust by construction and can be used for standard ICA as well as for overcomplete ICA (i.e. more source signals than observed signals). (c) 2005 Wiley Periodicals, Inc

Phase synchronization is an important phenomenon that occurs in a wide variety of complex oscillatory processes. Measuring phase synchronization can therefore help to gain fundamental insight into nature. In this Letter we point out that synchronization analysis techniques can detect spurious synchronization, if they are fed with a superposition of signals such as in electroencephalography or magnetoencephalography data. We show how techniques from blind source separation can help to nevertheless measure the true synchronization and avoid such pitfalls

Two common data representations are mostly used in intelligent data analysis, namely the vectorial and the pairwise representation. Pairwise data which satisfy the restrictive conditions of Euclidean spaces can be faithfully translated into a Euclidean vectorial representation by embedding. Non-metric pairwise data with violations of symmetry, reflexivity or triangle inequality pose a substantial conceptual problem for pattern recognition since the amount of predictive structural information beyond what can be measured by embeddings is unclear. We show by systematic modeling of non-Euclidean pairwise data that there exists metric violations which can carry valuable problem specific information. Furthermore, Euclidean and non-metric data can be unified on the level of structural information contained in the data. Stable component analysis selects linear subspaces which are particularly insensitive to data fluctuations. Experimental results from different domains support our pattern recognition strategy.

Robust ensemble learning
(2000)

Data recorded in electroencephalogram (EEG)-based brain-computer interface experiments is generally very noisy, non-stationary, and contaminated with artifacts that can deteriorate discrimination/classification methods. In this paper, we extend the common spatial pattern (CSP) algorithm with the aim to alleviate these adverse effects. In particular, we suggest an extension of CSP to the state space, which utilizes the method of time delay embedding. As we will show, this allows for individually tuned frequency filters at each electrode position and, thus, yields an improved and more robust machine learning procedure. The advantages of the proposed method over the original CSP method are verified in terms of an improved information transfer rate (bits per trial) on a set of EEG-recordings from experiments of imagined limb movements

SVM and boosting : one class
(2000)

Interest in developing a new method of man-to-machine communication-a brain-computer interface (BCI)-has grown steadily over the past few decades. BCIs create a new communication channel between the brain and an output device by bypassing conventional motor output pathways of nerves and muscles. These systems use signals recorded from the scalp, the surface of the cortex, or from inside the brain to enable users to control a variety of applications including simple word-processing software and orthotics. BCI technology could therefore provide a new communication and control option for individuals who cannot otherwise express their wishes to the outside world. Signal processing and classification methods are essential tools in the development of improved BCI technology. We organized the BCI Competition 2003 to evaluate the current state of the art of these tools. Four laboratories well versed in EEG-based BCI research provided six data sets in a documented format. We made these data sets (i.e., labeled training sets and unlabeled test sets) and their descriptions available on the Internet. The goal in the competition was to maximize the performance measure for the test labels. Researchers worldwide tested their algorithms and competed for the best classification results. This paper describes the six data sets and the results and function of the most successful algorithms

A brain-computer interface (BCI) is a system that allows its users to control external devices with brain activity. Although the proof-of-concept was given decades ago, the reliable translation of user intent into device control commands is still a major challenge. Success requires the effective interaction of two adaptive controllers: the user's brain, which produces brain activity that encodes intent, and the BCI system, which translates that activity into device control commands. In order to facilitate this interaction, many laboratories are exploring a variety of signal analysis techniques to improve the adaptation of the BCI system to the user. In the literature, many machine learning and pattern classification algorithms have been reported to give impressive results when applied to BCI data in offline analyses. However, it is more difficult to evaluate their relative value for actual online use. BCI data competitions have been organized to provide objective formal evaluations of alternative methods. Prompted by the great interest in the first two BCI Competitions, we organized the third BCI Competition to address several of the most difficult and important analysis problems in BCI research. The paper describes the data sets that were provided to the competitors and gives an overview of the results.

The Berlin Brain-Computer Interface (BBCI) project develops a noninvasive BCI system whose key features are 1) the use of well-established motor competences as control paradigms, 2) high-dimensional features from 128-channel electroencephalogram (EEG), and 3) advanced machine learning techniques. As reported earlier, our experiments demonstrate that very high information transfer rates can be achieved using the readiness potential (RP) when predicting the laterality of upcoming left-versus right-hand movements in healthy subjects. A more recent study showed that the RP similarily accompanies phantom movements in arm amputees, but the signal strength decreases with longer loss of the limb. In a complementary approach, oscillatory features are used to discriminate imagined movements (left hand versus right hand versus foot). In a recent feedback study with six healthy subjects with no or very little experience with BCI control, three subjects achieved an information transfer rate above 35 bits per minute (bpm), and further two subjects above 24 and 15 bpm, while one subject could not achieve any BCI control. These results are encouraging for an EEG-based BCI system in untrained subjects that is independent of peripheral nervous system activity and does not rely on evoked potentials even when compared to results with very well-trained subjects operating other BCI systems

Non-stationarities are ubiquitous in EEG signals. They are especially apparent in the use of EEG-based brain- computer interfaces (BCIs): (a) in the differences between the initial calibration measurement and the online operation of a BCI, or (b) caused by changes in the subject's brain processes during an experiment (e.g. due to fatigue, change of task involvement, etc). In this paper, we quantify for the first time such systematic evidence of statistical differences in data recorded during offline and online sessions. Furthermore, we propose novel techniques of investigating and visualizing data distributions, which are particularly useful for the analysis of (non-) stationarities. Our study shows that the brain signals used for control can change substantially from the offline calibration sessions to online control, and also within a single session. In addition to this general characterization of the signals, we propose several adaptive classification schemes and study their performance on data recorded during online experiments. An encouraging result of our study is that surprisingly simple adaptive methods in combination with an offline feature selection scheme can significantly increase BCI performance

A well-known result by Stein (1956) shows that in particular situations, biased estimators can yield better parameter estimates than their generally preferred unbiased counterparts. This letter follows the same spirit, as we will stabilize the unbiased generalization error estimates by regularization and finally obtain more robust model selection criteria for learning. We trade a small bias against a larger variance reduction, which has the beneficial effect of being more precise on a single training set. We focus on the subspace information criterion (SIC), which is an unbiased estimator of the expected generalization error measured by the reproducing kernel Hilbert space norm. SIC can be applied to the kernel regression, and it was shown in earlier experiments that a small regularization of SIC has a stabilization effect. However, it remained open how to appropriately determine the degree of regularization in SIC. In this article, we derive an unbiased estimator of the expected squared error, between SIC and the expected generalization error and propose determining the degree of regularization of SIC such that the estimator of the expected squared error is minimized. Computer simulations with artificial and real data sets illustrate that the proposed method works effectively for improving the precision of SIC, especially in the high-noise-level cases. We furthermore compare the proposed method to the original SIC, the cross-validation, and an empirical Bayesian method in ridge parameter selection, with good results

Unmixing hyperspectral data
(2000)