TY - GEN
A1 - Hempel, Sabrina
A1 - Koseska, Aneta
A1 - Nikoloski, Zoran
A1 - Kurths, Jürgen
T1 - Unraveling gene regulatory networks from time-resolved gene expression data
BT - a measures comparison study
N2 - Background: Inferring regulatory interactions between genes from time-resolved transcriptomics data, yielding reverse-engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes, coupled with short and often noisy time-resolved read-outs of the system, makes reverse engineering a challenging task. Therefore, the development and assessment of methods that are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. Results: Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach that are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes that are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank- and symbol-based measures have the highest performance in inferring regulatory interactions. In addition, the proposed asymmetric weighting scoring scheme has proven valuable in reducing the number of false positive interactions.
On the other hand, Granger causality as well as information-theoretic measures, frequently used in the inference of regulatory networks, show low performance on the short time series analyzed in this study. Conclusions: Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that algorithms for reverse engineering can be further improved by considering measures rooted in the study of symbolic dynamics or ranks, in contrast to common similarity measures that do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol-based measures (for low noise levels) and rank-based measures (for high noise levels) are the most suitable choices.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 371
KW - inferring cellular networks
KW - mutual information
KW - Escherichia-coli
KW - cluster-analysis
KW - series
KW - algorithms
KW - inference
KW - models
KW - recognition
KW - variables
Y1 - 2017
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-400924
ER -

TY - THES
A1 - Daub, Carsten Oliver
T1 - Analysis of integrated transcriptomics and metabolomics data : a systems biology approach
N2 - Recent high-throughput technologies enable the acquisition of a variety of complementary data and imply the existence of regulatory networks at the systems biology level. A common approach to reconstructing such networks is cluster analysis, which is based on a similarity measure. We use the information-theoretic concept of mutual information, which was originally defined for discrete data, as a measure of similarity and propose an extension to a commonly applied algorithm for its calculation from continuous biological data. We compare our approach to previously existing algorithms. We develop a performance-optimised software package for the application of mutual information to large-scale datasets. Furthermore, we design and implement a web-based service for the analysis of integrated data measured with different technologies. Application to biological data reveals biologically relevant groupings, and the reconstructed signalling networks agree with physiological findings.
KW - transinformation
KW - mutual information
KW - similarity measure
KW - distance measure
KW - gene expression
Y1 - 2004
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-0001251
ER -