Institut für Informatik und Computational Science
(Near-)inverses of sequences
(2006)
We introduce the notion of a near-inverse of a non-decreasing sequence of positive integers; near-inverses are intended to assume the role of inverses in cases when the latter cannot exist. We prove that the near-inverse of such a sequence is unique; moreover, the relation of being near-inverses of each other is symmetric, i.e., if sequence g is the near-inverse of sequence f, then f is the near-inverse of g. There is a connection, by approximations, between near-inverses of sequences and inverses of continuous strictly increasing real-valued functions, which can be exploited to derive simple expressions for near-inverses.
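The abstract does not spell out the formal definition, but a standard quasi-inverse construction illustrates the idea. The sketch below assumes the near-inverse of a non-decreasing, unbounded sequence f is g(n) = min{k : f(k) >= n}; this may differ in detail from the paper's definition.

```python
# Illustrative sketch only: the paper's precise definition is not given in
# the abstract. A standard quasi-inverse construction for a non-decreasing
# sequence f of positive integers is assumed here:
#     g(n) = min { k >= 1 : f(k) >= n }

def near_inverse(f, m):
    """Return the first m terms of the assumed near-inverse of f.

    f -- a function on positive integers, assumed non-decreasing and unbounded.
    """
    g = []
    k = 1
    for n in range(1, m + 1):
        while f(k) < n:      # walk forward until f first reaches n
            k += 1
        g.append(k)
    return g

# Example: f(k) = 2k gives g = 1, 1, 2, 2, 3, 3, ..., matching the
# continuous analogy with the inverse x = y/2 of y = 2x.
print(near_inverse(lambda k: 2 * k, 6))   # [1, 1, 2, 2, 3, 3]
```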
In this article, we consider high-dimensional data which contains a low-dimensional non-Gaussian structure contaminated with Gaussian noise and propose a new linear method to identify the non-Gaussian subspace. Our method NGCA (Non-Gaussian Component Analysis) is based on a very general semi-parametric framework and has a theoretical guarantee that the estimation error of finding the non-Gaussian components tends to zero at a parametric rate. NGCA can be used not only as preprocessing for ICA, but also for extracting and visualizing more general structures like clusters. A numerical study demonstrates the usefulness of our method.
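As a rough illustration of the NGCA principle (a sketch under our own choice of test-function family, not the paper's exact estimator): after whitening, vectors of the form E[x·h(x)] − E[∇h(x)] lie approximately in the non-Gaussian subspace, so PCA over many such vectors estimates it.

```python
# Minimal NGCA-style sketch; the tanh test functions and their number are
# our assumptions, and the returned directions live in whitened coordinates.
import numpy as np

def ngca_sketch(X, n_components, n_funcs=200, seed=0):
    rng = np.random.default_rng(seed)
    # Whiten: zero mean, identity covariance.
    X = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(X, rowvar=False))
    W = eigvec / np.sqrt(eigval)            # whitening matrix
    Z = X @ W
    d = Z.shape[1]
    betas = []
    for _ in range(n_funcs):
        a = rng.standard_normal(d)
        a /= np.linalg.norm(a)
        s = Z @ a                           # projections onto direction a
        h = np.tanh(s)                      # h(x) = tanh(a . x)
        grad = (1.0 - h ** 2)[:, None] * a  # grad h(x) = (1 - tanh^2) a
        betas.append((Z * h[:, None]).mean(axis=0) - grad.mean(axis=0))
    # Dominant directions of the collected beta vectors span the estimate.
    _, _, Vt = np.linalg.svd(np.array(betas), full_matrices=False)
    return Vt[:n_components]                # rows: basis of the subspace
```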
This thesis discusses challenges in IT security education, points out a gap between e-learning and practical education, and presents work to fill that gap. E-learning is a flexible and personalized alternative to traditional education. Nonetheless, existing e-learning systems for IT security education have difficulties in delivering hands-on experience because of the lack of proximity. Laboratory environments and practical exercises are indispensable instruction tools for IT security education, but security education in conventional computer laboratories poses particular problems such as immobility as well as high creation and maintenance costs. Hence, there is a need to effectively transform security laboratories and practical exercises into e-learning forms. In this thesis, we introduce the Tele-Lab IT-Security architecture that allows students not only to learn IT security principles, but also to gain hands-on security experience through exercises in an online laboratory environment. In this architecture, virtual machines are used instead of real computers to provide safe user work environments. Thus, traditional laboratory environments can be cloned onto the Internet by software, which increases accessibility to laboratory resources and greatly reduces investment and maintenance costs. Under the Tele-Lab IT-Security framework, a set of technical solutions is also proposed to provide effective functionality, reliability, security, and performance. Virtual machines with appropriate resource allocation, software installation, and system configuration are used to build lightweight security laboratories on a hosting computer. Reliability and availability of the laboratory platforms are covered by a virtual machine management framework, which provides the monitoring and administration services necessary to detect and recover from critical failures of virtual machines at run time. Considering the risk that virtual machines can be misused to compromise production networks, we present a security management solution that prevents the misuse of laboratory resources through security isolation at the system and network levels. This work is an attempt to bridge the gap between e-learning/tele-teaching and practical IT security education. It is not meant to substitute conventional teaching in laboratories but to add practical features to e-learning. This thesis demonstrates the possibility of implementing hands-on security laboratories on the Internet reliably, securely, and economically.
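As a purely hypothetical illustration of the run-time monitoring service described above (the VM names, the health check, and the recovery action are placeholders, not the Tele-Lab implementation):

```python
# Hypothetical monitor-and-recover loop; everything here is a placeholder
# assumption, since the actual management framework is not reproduced.
import subprocess
import time

VMS = ["lab-vm-01", "lab-vm-02"]           # placeholder VM identifiers

def is_healthy(vm):
    # Placeholder health check; a real framework would query the VM monitor.
    return subprocess.call(["ping", "-c", "1", vm],
                           stdout=subprocess.DEVNULL) == 0

def recover(vm):
    # Placeholder recovery action, e.g. restarting the virtual machine.
    print(f"restarting {vm}")

while True:
    for vm in VMS:
        if not is_healthy(vm):
            recover(vm)
    time.sleep(30)                          # poll periodically
```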
An extended query language for action languages (and its application to aggregates and preferences)
(2006)
Advances in biotechnologies rapidly increase the number of molecules of a cell which can be observed simultaneously. This includes expression levels of thousands or ten-thousands of genes as well as concentration levels of metabolites or proteins. Such profile data, observed at different times or at different experimental conditions (e.g., heat or dry stress), show how the biological experiment is reflected on the molecular level. This information is helpful to understand the molecular behaviour and to identify molecules or combinations of molecules that characterise a specific biological condition (e.g., disease). This work shows the potential of component extraction algorithms to identify the major factors which influenced the observed data. These can be the expected experimental factors such as the time or temperature as well as unexpected factors such as technical artefacts or even unknown biological behaviour. Extracting components means reducing the very high-dimensional data to a small set of new variables termed components. Each component is a combination of all original variables. The classical approach for that purpose is principal component analysis (PCA). It is shown that, in contrast to PCA, which maximises the variance only, modern approaches such as independent component analysis (ICA) are more suitable for analysing molecular data. The condition of independence between components of ICA fits more naturally our assumption of individual (independent) factors which influence the data. This higher potential of ICA is demonstrated by a crossing experiment of the model plant Arabidopsis thaliana (thale cress). The experimental factors could be well identified and, in addition, ICA could even detect a technical artefact. However, in continuous observations such as time experiments, the data show, in general, a nonlinear distribution. To analyse such nonlinear data, a nonlinear extension of PCA is used. This nonlinear PCA (NLPCA) is based on a neural network algorithm. The algorithm is adapted to be applicable to incomplete molecular data sets; thus, it also provides the ability to estimate the missing data. The potential of nonlinear PCA to identify nonlinear factors is demonstrated by a cold stress experiment of Arabidopsis thaliana. The results of component analysis can be used to build a molecular network model. Since it includes functional dependencies, it is termed a functional network. Applied to the cold stress data, it is shown that functional networks are appropriate for visualising biological processes and thereby revealing molecular dynamics.
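A minimal sketch of the PCA-versus-ICA contrast drawn above, using scikit-learn as an assumed tool (the thesis is not tied to it): two independent non-Gaussian factors are mixed, and excess kurtosis serves as a crude non-Gaussianity score for the recovered components.

```python
# PCA maximizes variance only; ICA additionally demands statistical
# independence, matching the assumption of independent experimental factors.
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
n = 2000
s1 = np.sign(rng.standard_normal(n)) * rng.uniform(0.5, 1.5, n)  # bimodal
s2 = rng.laplace(size=n)                                          # heavy-tailed
S = np.c_[s1, s2]
A = np.array([[1.0, 0.6], [0.4, 1.0]])     # unknown mixing of the factors
X = S @ A.T

pca = PCA(n_components=2).fit(X)
ica = FastICA(n_components=2, random_state=0).fit(X)

def kurt(y):
    """Excess kurtosis: near 0 for Gaussian, large |value| otherwise."""
    y = (y - y.mean()) / y.std()
    return (y ** 4).mean() - 3.0

# ICA components should score clearly non-Gaussian, PCA components less so.
print([round(kurt(c), 2) for c in pca.transform(X).T])
print([round(kurt(c), 2) for c in ica.transform(X).T])
```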
Aufzählen von DNA-Codes
(2006)
In this thesis, a model for enumerating DNA codes is developed. By introducing an order on the set of all DNA codewords and extending it to the set of all codes, the model makes it possible to find DNA codes with specific properties, such as overlap-freeness, conformity, comma-freeness, sticky-freeness, overhang-freeness, subword conformity, and others, with respect to a given involution on the set of codewords. A tool built on the basis of this model allows searching for codes with arbitrary combinations of code properties. A further substantial part of this thesis is the investigation of the optimality of DNA codes with respect to their information rate, as well as the identification of solid DNA codes.
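As a toy illustration of the enumeration idea (the thesis' exact property definitions are not given in the abstract, so the property checked here, freedom from Watson-Crick images, is only an example):

```python
# Enumerate DNA words in lexicographic order and greedily keep those
# satisfying one simple example property with respect to the Watson-Crick
# involution: no kept word may be the involution image of itself or of a
# previously kept word. Real code properties from the thesis are stricter.
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def watson_crick(word):
    """Watson-Crick involution: complement each base, then reverse."""
    return "".join(COMPLEMENT[b] for b in reversed(word))

def enumerate_code(length):
    code = []
    for letters in product("ACGT", repeat=length):  # lexicographic order
        w = "".join(letters)
        images = {watson_crick(v) for v in code} | {watson_crick(w)}
        if w not in images:
            code.append(w)
    return code

print(enumerate_code(3)[:8])   # first few codewords of the greedy code
```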
The emergence of drug resistance remains one of the most challenging issues in the treatment of HIV-1 infection. The extreme replication dynamics of HIV facilitates its escape from the selective pressure exerted by the human immune system and by the applied combination drug therapy. This article reviews computational methods whose combined use can support the design of optimal antiretroviral therapies based on viral genotypic and phenotypic data. Genotypic assays are based on the analysis of mutations associated with reduced drug susceptibility, but are difficult to interpret due to the numerous mutations and mutational patterns that confer drug resistance. Phenotypic resistance or susceptibility can be experimentally evaluated by measuring the inhibition of viral replication in cell culture assays. However, this procedure is expensive and time-consuming.
We give a new view on building content clusters from page pair models. We measure the heuristic importance between every two pages by computing the distance of their access positions in usage sessions. We also compare our page pair models with the classical pair models used in information theory and natural language processing, and give different evaluation methods for building reasonable content communities. Finally, we interpret the advantages and disadvantages of our models based on detailed experimental results.
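A small sketch of the pair-distance measure described above (the session format and the use of the average positional distance are assumptions):

```python
# For every pair of pages co-occurring in a usage session, record the
# distance between their access positions; a small average distance
# suggests the pages belong to one content cluster.
from collections import defaultdict
from itertools import combinations

def pair_distances(sessions):
    """sessions: list of page-ID sequences in access order."""
    dist_sum = defaultdict(float)
    count = defaultdict(int)
    for session in sessions:
        for (i, a), (j, b) in combinations(enumerate(session), 2):
            pair = tuple(sorted((a, b)))
            dist_sum[pair] += j - i          # positional distance
            count[pair] += 1
    return {p: dist_sum[p] / count[p] for p in dist_sum}

sessions = [["home", "news", "sports"], ["home", "sports", "news", "mail"]]
print(pair_distances(sessions))
```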
Business process management
(2006)
Combined optimization of spatial and temporal filters for improving brain-computer interfacing
(2006)
Brain-computer interface (BCI) systems create a novel communication channel from the brain to an output device by bypassing conventional motor output pathways of nerves and muscles. Therefore, they could provide a new communication and control option for paralyzed patients. Modern BCI technology is essentially based on techniques for the classification of single-trial brain signals. Here we present a novel technique that allows the simultaneous optimization of a spatial and a spectral filter, enhancing discriminability rates of multichannel EEG single trials. The evaluation of 60 experiments involving 22 different subjects demonstrates the significant superiority of the proposed algorithm over its classical counterpart: the median classification error rate was decreased by 11%. Apart from the enhanced classification, the spatial and/or the spectral filters that are determined by the algorithm can also be used for further analysis of the data, e.g., for source localization of the respective brain rhythms.
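For orientation, here is a sketch of the classical counterpart, Common Spatial Patterns (CSP), which optimizes the spatial filter only; the paper's contribution, the simultaneous spectral optimization, is not reproduced here. Data shapes and the number of filters are assumptions.

```python
# CSP via a generalized eigenproblem: extreme eigenvalues yield spatial
# filters maximizing variance for one class while minimizing it for the other.
import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_a, trials_b, n_filters=3):
    """trials_*: arrays of shape (n_trials, n_channels, n_samples)."""
    def mean_cov(trials):
        covs = []
        for x in trials:
            c = x @ x.T
            covs.append(c / np.trace(c))   # normalized spatial covariance
        return np.mean(covs, axis=0)
    Ca, Cb = mean_cov(trials_a), mean_cov(trials_b)
    # Solve Ca w = lambda (Ca + Cb) w and keep both eigenvalue extremes.
    eigvals, eigvecs = eigh(Ca, Ca + Cb)
    order = np.argsort(eigvals)
    picks = np.r_[order[:n_filters], order[-n_filters:]]
    return eigvecs[:, picks].T             # rows are spatial filters
```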
Most information systems log events (e.g., transaction logs, audit trails) to audit and monitor the processes they support. At the same time, many of these processes have been explicitly modeled. For example, SAP R/3 logs events in transaction logs, and there are EPCs (Event-driven Process Chains) describing the so-called reference models. These reference models describe how the system should be used. The coexistence of event logs and process models raises an interesting question: "Does the event log conform to the process model and vice versa?". This paper demonstrates that there is no simple answer to this question. To tackle the problem, we distinguish two dimensions of conformance: fitness (the event log may be the result of the process modeled) and appropriateness (the model is a likely candidate from a structural and behavioral point of view). Different metrics have been defined, and a Conformance Checker has been implemented within the ProM Framework.
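A toy version of the fitness dimension (the actual Conformance Checker replays logs on Petri nets within ProM; this simplified sketch uses a plain transition system of our own choosing):

```python
# Fitness as the fraction of logged events that can be replayed on the model.
def fitness(log, transitions, start):
    """log: list of traces (activity-name lists);
    transitions: dict mapping (state, activity) -> next state."""
    replayed = total = 0
    for trace in log:
        state = start
        for activity in trace:
            total += 1
            if (state, activity) in transitions:
                state = transitions[(state, activity)]
                replayed += 1
            # on a mismatch we skip the event and stay in place
    return replayed / total if total else 1.0

model = {("i", "register"): "p", ("p", "check"): "q", ("q", "pay"): "f"}
log = [["register", "check", "pay"], ["register", "pay"]]
print(fitness(log, model, "i"))   # 4 of 6 events replay: ~0.67
```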
Thinking in services is the key to a shared view of existing and emerging applications for IT experts, business experts, and managers, and thus to a closer collaboration that can revolutionize today's workflows: at the service level it becomes possible, for the first time, to involve domain experts continuously throughout the entire life cycle of an application. Service-oriented thinking has its origins in telecommunications, where it formed the basis of the most advanced applications for mobile and fixed platforms, in particular of the so-called value-added services such as televoting, freephone (the 0800 numbers), and virtual private networks. Only the consistent virtualization of infrastructures and the loose coupling of functionalities, both characteristic of service orientation, made it possible to master the highly heterogeneous landscape of telephony. The same principles, however, are applicable far more generally, for example to services in less technical business domains such as e-commerce, logistics, health care, or public administration. Their consistent adoption as a new paradigm for the conception, design, and management of complex applications has the potential to give society a new generation of personalized, secure, highly available, and efficient (Internet) services. Many business domains will thus be revolutionized, much as e-mail has already revolutionized classical postal communication in many areas.
When decomposing single-trial electroencephalography it is a challenge to incorporate prior physiological knowledge. Here, we develop a method that uses prior information about the phase-locking property of event-related potentials in a regularization framework to bias a blind source separation algorithm toward an improved separation of single-trial phase-locked responses in terms of an increased signal-to-noise ratio. In particular, we suggest a transformation of the data, using a weighted average of the single trial and the trial-averaged response, that redirects the focus of source separation methods onto the subspace of event-related potentials. The practical benefit with respect to an improved separation of such components from ongoing background activity and extraneous noise is first illustrated on artificial data and finally verified in a real-world application of extracting single-trial somatosensory evoked potentials from multichannel EEG recordings.
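In its simplest reading, the suggested transformation is a convex combination per trial; a minimal sketch, with the weight treated as a free regularization parameter (its name and range here are assumptions):

```python
# Replace each single trial by a weighted average of itself and the
# trial-averaged (phase-locked) response, biasing a subsequent source
# separation toward the event-related subspace.
import numpy as np

def bias_toward_erp(trials, lam=0.5):
    """trials: array (n_trials, n_channels, n_samples); 0 <= lam <= 1."""
    erp = trials.mean(axis=0, keepdims=True)    # trial-averaged response
    return (1.0 - lam) * trials + lam * erp     # per-trial weighted average
```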
Experimentelles Software Engineering durch Modellierung wissensintensiver Entwicklungsprozesse
(2006)
We propose simple and fast methods based on nearest neighbors that order objects from high-dimensional data sets from typical points to untypical points. On the one hand, we show that these easy-to-compute orderings allow us to detect outliers (i.e. very untypical points) with a performance comparable to or better than other often much more sophisticated methods. On the other hand, we show how to use these orderings to detect prototypes (very typical points) which facilitate exploratory data analysis algorithms such as noisy nonlinear dimensionality reduction and clustering. Comprehensive experiments demonstrate the validity of our approach.
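One easy-to-compute ordering of this kind (the choice of k and of the mean k-NN distance as the typicality score are our assumptions) can be sketched as follows:

```python
# Order points from typical to untypical by mean distance to their k
# nearest neighbors: the head of the ordering yields prototypes, the
# tail candidate outliers.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def typicality_order(X, k=10):
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, _ = nn.kneighbors(X)             # first column is the point itself
    score = dist[:, 1:].mean(axis=1)       # mean distance to k neighbors
    return np.argsort(score)               # typical first, untypical last

X = np.r_[np.random.default_rng(0).normal(size=(200, 5)), [[8.0] * 5]]
order = typicality_order(X)
print(order[-1])                           # 200: the planted outlier
```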
We investigate the usage of rule dependency graphs and their colorings for characterizing and computing answer sets of logic programs. This approach provides us with insights into the interplay between rules when inducing answer sets. We start with different characterizations of answer sets in terms of totally colored dependency graphs that differ in graph-theoretical aspects. We then develop a series of operational characterizations of answer sets in terms of operators on partial colorings. In analogy to the notion of a derivation in proof theory, our operational characterizations are expressed as (non-deterministically formed) sequences of colorings, turning an uncolored graph into a totally colored one. In this way, we obtain an operational framework in which different combinations of operators result in different formal properties. Among others, we identify the basic strategy employed by the noMoRe system and justify its algorithmic approach. Furthermore, we distinguish operations corresponding to Fitting's operator as well as to well-founded semantics.
The goal of a Brain-Computer Interface (BCI) consists of the development of a unidirectional interface between a human and a computer to allow control of a device only via brain signals. While the BCI systems of almost all other groups require the user to be trained over several weeks or even months, the group of Prof. Dr. Klaus-Robert Müller in Berlin and Potsdam, to which I belong, was one of the first research groups in this field to use machine learning techniques on a large scale. The adaptivity of the processing system to the individual brain patterns of the subject confers huge advantages for the user. Thus BCI research is considered a hot topic in machine learning and computer science. It requires interdisciplinary cooperation between disparate fields such as neuroscience, since only by combining machine learning and signal processing techniques based on neurophysiological knowledge will the largest progress be made. In this work I particularly deal with my part of this project, which lies mainly in the area of computer science. I have considered the following three main points:
- Establishing a performance measure based on information theory: I have critically illuminated the assumptions of Shannon's information transfer rate for application in a BCI context. By establishing suitable coding strategies I was able to show that this theoretical measure approximates quite well what is practically achievable.
- Transfer and development of suitable signal processing and machine learning techniques: One substantial component of my work was to develop several machine learning and signal processing algorithms to improve the efficiency of a BCI. Based on the neurophysiological knowledge that several independent EEG features can be observed for some mental states, I have developed a method for combining different and possibly independent features which improved performance. In some cases the combination algorithm outperforms the best single performance by more than 50%. Furthermore, I have theoretically and practically addressed, via the development of suitable algorithms, the question of the optimal number of classes which should be used for a BCI. It transpired that with the BCI performances reported so far, three or four different mental states are optimal. For another extension I have combined ideas from signal processing with those of machine learning, since a high gain can be achieved if the temporal filtering, i.e., the choice of frequency bands, is automatically adapted to each subject individually.
- Implementation of the Berlin Brain-Computer Interface and realization of suitable experiments: Finally, a further substantial component of my work was to realize an online BCI system which includes the developed methods, but is also flexible enough to allow the simple realization of new algorithms and ideas. So far, bitrates of up to 40 bits per minute have been achieved with this system by absolutely untrained users, which, compared to the results of other groups, is highly successful.
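The information-theoretic performance measure referred to above is commonly computed with the formula of Wolpaw et al., which rests on exactly the kind of symmetry and uniformity assumptions the thesis examines critically; a small helper (the example numbers are illustrative, not results from the thesis):

```python
# Bits per trial for a BCI with N mental states and per-trial accuracy p:
#     log2 N + p log2 p + (1 - p) log2((1 - p) / (N - 1))
from math import log2

def bits_per_trial(n_classes, p):
    """Wolpaw-style ITR in bits per trial; limits taken at p = 0 and p = 1."""
    n = n_classes
    term_p = p * log2(p) if p > 0 else 0.0
    term_q = (1 - p) * log2((1 - p) / (n - 1)) if p < 1 else 0.0
    return log2(n) + term_p + term_q

# E.g. 3 classes at 80% accuracy and 4 s per trial -> bits per minute:
print(bits_per_trial(3, 0.8) * 60 / 4)   # roughly 10 bits/min
```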
Incremental Support Vector Machines (SVM) are instrumental in practical applications of online learning. This work focuses on the design and analysis of efficient incremental SVM learning, with the aim of providing a fast, numerically stable and robust implementation. A detailed analysis of convergence and of algorithmic complexity of incremental SVM learning is carried out. Based on this analysis, a new design of storage and numerical operations is proposed, which speeds up the training of an incremental SVM by a factor of 5 to 20. The performance of the new algorithm is demonstrated in two scenarios: learning with limited resources and active learning. Various applications of the algorithm, such as in drug discovery, online monitoring of industrial devices and surveillance of network traffic, can be foreseen.
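The exact incremental bookkeeping analyzed in this work is not reproduced here; as a loose illustration of the online-learning setting only, the following trains a linear classifier with SVM hinge loss chunk by chunk via scikit-learn's partial_fit (stochastic gradient descent, not an exact incremental SVM):

```python
# Online learning sketch: the model is updated on each incoming batch
# without retraining on all previously seen data.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(loss="hinge")          # linear SVM-style objective
classes = np.array([0, 1])
for _ in range(100):                       # stream of small batches
    X = rng.normal(size=(20, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    clf.partial_fit(X, y, classes=classes) # incremental update

X_test = rng.normal(size=(500, 5))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
print(clf.score(X_test, y_test))
```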
Iterated finite state sequential transducers are considered as language generating devices. The hierarchy induced by the size of the state alphabet is proved to collapse to the fourth level. The corresponding language families are related to the families of languages generated by Lindenmayer systems and Chomsky grammars. Finally, some results on deterministic and extended iterated finite state transducers are established.
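A sketch of iterating a sequential transducer (the states and example rules are our choices); with a single state the device degenerates to a D0L system, illustrating the Lindenmayer connection mentioned above:

```python
# Apply a sequential transducer repeatedly to its own output.
def run_transducer(transitions, start, word):
    """transitions: (state, symbol) -> (next_state, output string)."""
    state, out = start, []
    for sym in word:
        state, piece = transitions[(state, sym)]
        out.append(piece)
    return "".join(out)

def iterate(transitions, start, axiom, steps):
    word = axiom
    for _ in range(steps):
        word = run_transducer(transitions, start, word)
    return word

# Single state, morphism a -> ab, b -> a: the Fibonacci D0L system.
fib = {("q", "a"): ("q", "ab"), ("q", "b"): ("q", "a")}
print(iterate(fib, "q", "a", 4))   # abaababa, a Fibonacci word
```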
With the next-generation Internet protocol IPv6 on the horizon, it is time to think about how applications can migrate to IPv6. Web traffic is currently one of the most important applications on the Internet. The increasing popularity of dynamically generated content on the World Wide Web has created the need for fast web servers. Server clustering together with server load balancing has emerged as a promising technique for building scalable web servers. The paper gives a short overview of the new features of IPv6 and different server load balancing technologies. Further, we present and evaluate Loaded, a user-space server load balancer for IPv4 and IPv6 based on Linux.
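Loaded itself is not reproduced here; the following sketch shows only two of the ingredients discussed, an IPv6 listening socket that also accepts IPv4 clients and round-robin backend selection, with placeholder addresses and without the actual traffic splicing:

```python
# Minimal dual-stack dispatch loop; backends and ports are placeholders.
import itertools
import socket

BACKENDS = itertools.cycle([("::1", 8081), ("::1", 8082)])  # round robin

srv = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)  # accept IPv4 too
srv.bind(("::", 8080))
srv.listen(128)

while True:
    client, addr = srv.accept()
    backend = next(BACKENDS)            # pick the next backend in turn
    print(f"dispatching {addr[0]} -> {backend}")
    # A real balancer would now splice client <-> backend traffic;
    # omitted here for brevity.
    client.close()
```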
In this paper a self-checking carry select adder is proposed. The duplicated adder blocks which are inherent to a carry select adder without error detection are checked modulo 3. Compared to a carry select adder without error detection, the delay of the MSB of the sum of the proposed adder does not increase. Compared to a self-checking duplicated carry select adder, the area is reduced by 20%. No restrictions are imposed on the design of the adder blocks.
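The modulo-3 check can be illustrated arithmetically (the bit widths and the single-bit fault below are illustrative; the paper's checker is of course a hardware circuit):

```python
# Residue check: the mod-3 residue of the computed sum must match the
# residue predicted from the operands. A single bit flip changes the value
# by +/- 2^i, and 2^i mod 3 is never 0, so every single-bit error in the
# sum changes the residue and is detected.
def mod3_check(a, b, computed_sum):
    predicted = (a % 3 + b % 3) % 3     # residue predicted from inputs
    return computed_sum % 3 == predicted

a, b = 0b1011, 0b0110
good = a + b
bad = good ^ 0b0100                     # single bit flip in the sum
print(mod3_check(a, b, good))           # True
print(mod3_check(a, b, bad))            # False: the flip changes the residue
```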
Two common data representations are mostly used in intelligent data analysis, namely the vectorial and the pairwise representation. Pairwise data which satisfy the restrictive conditions of Euclidean spaces can be faithfully translated into a Euclidean vectorial representation by embedding. Non-metric pairwise data with violations of symmetry, reflexivity or triangle inequality pose a substantial conceptual problem for pattern recognition, since the amount of predictive structural information beyond what can be measured by embeddings is unclear. We show by systematic modeling of non-Euclidean pairwise data that there exist metric violations which can carry valuable problem-specific information. Furthermore, Euclidean and non-metric data can be unified on the level of structural information contained in the data. Stable component analysis selects linear subspaces which are particularly insensitive to data fluctuations. Experimental results from different domains support our pattern recognition strategy.
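For concreteness, two of the violation types named above can be quantified directly on a dissimilarity matrix; a naive O(n^3) sketch:

```python
# Measure how strongly a dissimilarity matrix D violates symmetry and the
# triangle inequality; both values are 0 for a proper metric.
import numpy as np

def metric_violations(D):
    asym = np.abs(D - D.T).max()                 # symmetry violation
    n = D.shape[0]
    tri = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                tri = max(tri, D[i, j] - (D[i, k] + D[k, j]))
    return asym, tri

D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])
print(metric_violations(D))   # (0.0, 3.0): d(0,2) exceeds d(0,1)+d(1,2)
```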
We consider generating and accepting programmed grammars with bounded degree of non-regulation, that is, the maximum number of elements in success or in failure fields of the underlying grammar. In particular, it is shown that this measure can be restricted to two without loss of descriptional capacity, regardless of whether arbitrary derivations or leftmost derivations are considered. Moreover, in some cases, precise characterizations of the linear bounded automaton problem in terms of programmed grammars are obtained. Thus, the results presented in this paper shed new light on a longstanding open problem in the theory of computational complexity.
Three quantum cryptographic protocols for multiuser quantum networks with embedded authentication, allowing quantum key distribution or quantum direct communication, are discussed in this work. The security of the protocols against different types of attacks is analysed, with a focus on various impersonation attacks and the man-in-the-middle attack. On the basis of the security analyses, several improvements are suggested and implemented in order to remedy the identified vulnerabilities. Furthermore, the impact of the eavesdropping test procedure on impersonation attacks is outlined. The framework of a general eavesdropping test is proposed to provide additional protection against security risks in impersonation attacks.
An increasing number of applications requires user interfaces that facilitate the handling of large geodata sets. Using virtual 3D city models, complex geospatial information can be communicated visually in an intuitive way. Therefore, real-time visualization of virtual 3D city models represents a key functionality for interactive exploration, presentation, analysis, and manipulation of geospatial data. This thesis concentrates on the development and implementation of concepts and techniques for real-time city model visualization. It discusses rendering algorithms as well as complementary modeling concepts and interaction techniques. Particularly, the work introduces a new real-time rendering technique to handle city models of high complexity concerning texture size and number of textures. Such models are difficult to handle by current technology, primarily due to two problems:
- Limited texture memory: The amount of simultaneously usable texture data is limited by the memory of the graphics hardware.
- Limited number of textures: Using several thousand different textures simultaneously causes significant performance problems due to texture switch operations during rendering.
The multiresolution texture atlases approach, introduced in this thesis, overcomes both problems. During rendering, it permanently maintains a small set of textures that are sufficient for the current view and the screen resolution available. The efficiency of multiresolution texture atlases is evaluated in performance tests. To summarize, the results demonstrate that the following goals have been achieved:
- Real-time rendering becomes possible for 3D scenes whose amount of texture data exceeds the main memory capacity.
- Overhead due to texture switches is kept permanently low, so that the number of different textures has no significant effect on the rendering frame rate.
Furthermore, this thesis introduces two new approaches for real-time city model visualization that use textures as core visualization elements:
- An approach for visualization of thematic information.
- An approach for illustrative visualization of 3D city models.
Both techniques demonstrate that multiresolution texture atlases provide a basic functionality for the development of new applications and systems in the domain of city model visualization.
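The core budget argument, that the resolution at which a texture must be kept depends only on its current projected screen size, can be sketched with the standard texel-to-pixel ratio (variable names are ours; the thesis' actual atlas management is more involved):

```python
# Choose, per texture, the coarsest mip level whose resolution still covers
# the texture's projected size on screen; keeping only these levels bounds
# the texture working set independently of the total texture data.
from math import log2

def required_mip_level(texture_size, projected_pixels):
    """texture_size: edge length in texels (power of two);
    projected_pixels: projected edge length on screen in pixels (> 0)."""
    if projected_pixels >= texture_size:
        return 0                                       # full resolution needed
    level = int(log2(texture_size / projected_pixels)) # floor of the ratio
    return min(level, int(log2(texture_size)))

# A 2048-texel facade texture seen at 150 px on screen:
print(required_mip_level(2048, 150))   # 3 -> a 256-texel version suffices
```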