004 Datenverarbeitung; Informatik
Refine
Has Fulltext
- no (13) (remove)
Year of publication
- 2018 (13) (remove)
Document Type
- Article (8)
- Other (4)
- Doctoral Thesis (1)
Language
- English (13)
Is part of the Bibliography
- yes (13)
Keywords
- Bystander (1)
- Clusteranalyse (1)
- Deep learning (1)
- Entitätsverknüpfung (1)
- IOPS (1)
- Imperative calculi (1)
- Klassifikation (1)
- LSTM (1)
- O (1)
- Sharing (1)
In recent years, the ever-growing amount of documents on the Web as well as in closed systems for private or business contexts led to a considerable increase of valuable textual information about topics, events, and entities. It is a truism that the majority of information (i.e., business-relevant data) is only available in unstructured textual form. The text mining research field comprises various practice areas that have the common goal of harvesting high-quality information from textual data. These information help addressing users' information needs.
In this thesis, we utilize the knowledge represented in user-generated content (UGC) originating from various social media services to improve text mining results. These social media platforms provide a plethora of information with varying focuses. In many cases, an essential feature of such platforms is to share relevant content with a peer group. Thus, the data exchanged in these communities tend to be focused on the interests of the user base. The popularity of social media services is growing continuously and the inherent knowledge is available to be utilized. We show that this knowledge can be used for three different tasks.
Initially, we demonstrate that when searching persons with ambiguous names, the information from Wikipedia can be bootstrapped to group web search results according to the individuals occurring in the documents. We introduce two models and different means to handle persons missing in the UGC source. We show that the proposed approaches outperform traditional algorithms for search result clustering. Secondly, we discuss how the categorization of texts according to continuously changing community-generated folksonomies helps users to identify new information related to their interests. We specifically target temporal changes in the UGC and show how they influence the quality of different tag recommendation approaches. Finally, we introduce an algorithm to attempt the entity linking problem, a necessity for harvesting entity knowledge from large text collections. The goal is the linkage of mentions within the documents with their real-world entities. A major focus lies on the efficient derivation of coherent links.
For each of the contributions, we provide a wide range of experiments on various text corpora as well as different sources of UGC.
The evaluation shows the added value that the usage of these sources provides and confirms the appropriateness of leveraging user-generated content to serve different information needs.
Answer Set Programming faces an increasing popularity for problem solving in various domains. While its modeling language allows us to express many complex problems in an easy way, its solving technology enables their effective resolution. In what follows, we detail some of the key factors of its success. Answer Set Programming [ASP; Brewka et al. Commun ACM 54(12):92–103, (2011)] is seeing a rapid proliferation in academia and industry due to its easy and flexible way to model and solve knowledge-intense combinatorial (optimization) problems. To this end, ASP offers a high-level modeling language paired with high-performance solving technology. As a result, ASP systems provide out-off-the-box, general-purpose search engines that allow for enumerating (optimal) solutions. They are represented as answer sets, each being a set of atoms representing a solution. The declarative approach of ASP allows a user to concentrate on a problem’s specification rather than the computational means to solve it. This makes ASP a prime candidate for rapid prototyping and an attractive tool for teaching key AI techniques since complex problems can be expressed in a succinct and elaboration tolerant way. This is eased by the tuning of ASP’s modeling language to knowledge representation and reasoning (KRR). The resulting impact is nicely reflected by a growing range of successful applications of ASP [Erdem et al. AI Mag 37(3):53–68, 2016; Falkner et al. Industrial applications of answer set programming. K++nstliche Intelligenz (2018)]
Business processes constantly generate, manipulate, and consume data that are managed by organizational databases. Despite being central to process modeling and execution, the link between processes and data is often handled by developers when the process is implemented, thus leaving the connection unexplored during the conceptual design. In this paper, we introduce, formalize, and evaluate a novel conceptual view that bridges the gap between process and data models, and show some kinds of interesting insights that can be derived from this novel proposal.
The aim of our project design space exploration with answer set programming is to develop a general framework based on Answer Set Programming (ASP) that finds valid solutions to the system design problem and simultaneously performs Design Space Exploration (DSE) to find the most favorable alternatives. We leverage recent developments in ASP solving that allow for tight integration of background theories to create a holistic framework for effective DSE.
Generating a novel and descriptive caption of an image is drawing increasing interests in computer vision, natural language processing, and multimedia communities. In this work, we propose an end-to-end trainable deep bidirectional LSTM (Bi-LSTM (Long Short-Term Memory)) model to address the problem. By combining a deep convolutional neural network (CNN) and two separate LSTM networks, our model is capable of learning long-term visual-language interactions by making use of history and future context information at high-level semantic space. We also explore deep multimodal bidirectional models, in which we increase the depth of nonlinearity transition in different ways to learn hierarchical visual-language embeddings. Data augmentation techniques such as multi-crop, multi-scale, and vertical mirror are proposed to prevent over-fitting in training deep models. To understand how our models "translate" image to sentence, we visualize and qualitatively analyze the evolution of Bi-LSTM internal states over time. The effectiveness and generality of proposed models are evaluated on four benchmark datasets: Flickr8K, Flickr30K, MSCOCO, and Pascal1K datasets. We demonstrate that Bi-LSTM models achieve highly competitive performance on both caption generation and image-sentence retrieval even without integrating an additional mechanism (e.g., object detection, attention model). Our experiments also prove that multi-task learning is beneficial to increase model generality and gain performance. We also demonstrate the performance of transfer learning of the Bi-LSTM model significantly outperforms previous methods on the Pascal1K dataset.
While the role of and consequences of being a bystander to face-to-face bullying has received some attention in the literature, to date, little is known about the effects of being a bystander to cyberbullying. It is also unknown how empathy might impact the negative consequences associated with being a bystander of cyberbullying. The present study focused on examining the longitudinal association between bystander of cyberbullying depression, and anxiety, and the moderating role of empathy in the relationship between bystander of cyberbullying and subsequent depression and anxiety. There were 1,090 adolescents (M-age = 12.19; 50% female) from the United States included at Time 1, and they completed questionnaires on empathy, cyberbullying roles (bystander, perpetrator, victim), depression, and anxiety. One year later, at Time 2, 1,067 adolescents (M-age = 13.76; 51% female) completed questionnaires on depression and anxiety. Results revealed a positive association between bystander of cyberbullying and depression and anxiety. Further, empathy moderated the positive relationship between bystander of cyberbullying and depression, but not for anxiety. Implications for intervention and prevention programs are discussed.