In recent years, the ever-growing number of documents on the Web, as well as in closed systems for private or business contexts, has led to a considerable increase in valuable textual information about topics, events, and entities. It is a truism that the majority of information (i.e., business-relevant data) is only available in unstructured textual form. The text mining research field comprises various practice areas that share the common goal of harvesting high-quality information from textual data. This information helps address users' information needs.
In this thesis, we utilize the knowledge represented in user-generated content (UGC) originating from various social media services to improve text mining results. These social media platforms provide a plethora of information with varying focuses. In many cases, an essential feature of such platforms is to share relevant content with a peer group. Thus, the data exchanged in these communities tend to be focused on the interests of the user base. The popularity of social media services is growing continuously and the inherent knowledge is available to be utilized. We show that this knowledge can be used for three different tasks.
First, we demonstrate that when searching for persons with ambiguous names, the information from Wikipedia can be bootstrapped to group web search results according to the individuals occurring in the documents. We introduce two models and different means to handle persons missing in the UGC source. We show that the proposed approaches outperform traditional algorithms for search result clustering. Second, we discuss how the categorization of texts according to continuously changing community-generated folksonomies helps users to identify new information related to their interests. We specifically target temporal changes in the UGC and show how they influence the quality of different tag recommendation approaches. Finally, we introduce an algorithm to address the entity linking problem, a necessity for harvesting entity knowledge from large text collections. The goal is to link mentions within the documents to their real-world entities. A major focus lies on the efficient derivation of coherent links.
For each of the contributions, we provide a wide range of experiments on various text corpora as well as different sources of UGC.
The evaluation shows the added value that the usage of these sources provides and confirms the appropriateness of leveraging user-generated content to serve different information needs.
Neuroinflammatory and neurodegenerative diseases such as Parkinson's disease (PD) and multiple sclerosis (MS) often result in a severe impairment of the patients' quality of life. Effective therapies are currently not available, which results in a high socio-economic burden. Due to the heterogeneity of the disease subtypes, stratification is particularly difficult in the early phase of the disease and is mainly based on clinical parameters such as neurophysiological tests and imaging of the central nervous system. Due to their good accessibility and stability, blood and cerebrospinal fluid (CSF) metabolite markers could serve as surrogates for neurodegenerative processes. This can lead to an improved mechanistic understanding of these diseases, and such markers could further be used as "treatment response" biomarkers in preclinical and clinical development programs. Therefore, this thesis aims to identify plasma and CSF metabolite profiles that allow differentiation of PD patients from healthy controls, association of PD with dementia (PDD), and differentiation of PD subtypes such as akinetic-rigid and tremor-dominant PD. In addition, plasma metabolites for the diagnosis of primary progressive MS (PPMS) are to be investigated and tested for their specificity relative to relapsing-remitting MS (RRMS) and for their development during PPMS progression.
By applying untargeted high-resolution metabolomics to PD patient samples and using random forest and partial least squares machine learning algorithms, this study identified 20 plasma and 14 CSF metabolite biomarkers. These differentiate PD patients from healthy individuals with an AUC of 0.8 and 0.9, respectively. We also identified ten PDD-specific serum metabolites, which differentiate PDD patients from healthy individuals and from PD patients without dementia, each with an AUC of 1.0. Furthermore, 23 akinetic-rigid-specific plasma markers were identified, which differentiate akinetic-rigid patients from tremor-dominant PD patients with an AUC of 0.94 and from healthy individuals with an AUC of 0.98. These findings also suggest a more severe disease pathology in akinetic-rigid PD than in tremor-dominant PD. In the analysis of MS patient samples, a partial least squares analysis yielded predictive models for the classification of PPMS and resulted in 20 PPMS-specific metabolites. In another MS study, previously unknown changes in human metabolism were identified after administration of the multiple sclerosis drug dimethylfumarate, which is used for the treatment of RRMS. These results make it possible to describe and understand the hitherto completely unknown mechanism of action of this new drug and to use these findings for the further development of new drugs and targets against RRMS.
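For readers unfamiliar with the AUC values reported above, the following is a minimal sketch of how the area under the ROC curve can be computed for a binary classifier. It is not the pipeline used in the thesis: the function name `roc_auc` and the toy scores and labels are hypothetical and purely illustrative. The sketch uses the pairwise interpretation of AUC, namely the fraction of (patient, control) pairs in which the patient receives the higher score, with ties counted as half.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class (e.g. patient), 0 for the negative
    class (e.g. healthy control). Returns the fraction of
    positive/negative pairs ranked correctly; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, invented for illustration only.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
```

An AUC of 1.0 means every patient sample is scored above every control sample, while 0.5 corresponds to chance-level ranking. The quadratic pairwise loop is chosen for readability; for large cohorts a rank-based computation would be more efficient.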
In conclusion, these results have the potential to improve the diagnosis of these diseases and their mechanistic understanding, as multiple deregulated pathways were identified. Moreover, the novel dimethylfumarate targets can be used to aid drug development and improve treatment efficiency. Overall, metabolite profiling in combination with machine learning was identified as a promising approach for biomarker discovery and mode-of-action elucidation.