Refine
Has Fulltext
- no (6) (remove)
Document Type
- Doctoral Thesis (6) (remove)
Language
- English (6) (remove)
Is part of the Bibliography
- yes (6)
Keywords
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (6) (remove)
Organizations continue to assemble and rely upon teams of remote workers as an essential element of their business strategy; however, knowledge processing is particular difficult in such isolated, largely digitally mediated settings. The great challenge for a knowledge-based organization lies not in how individuals should interact using technology but in how to achieve effective cooperation and knowledge exchange. Currently more attention has been paid to technology and the difficulties machines have processing natural language and less to studies of the human aspect—the influence of our own individual cognitive abilities and preferences on the processing of information when interacting online. This thesis draws on four scientific domains involved in the process of interpreting and processing massive, unstructured data—knowledge management, linguistics, cognitive science, and artificial intelligence—to build a model that offers a reliable way to address the ambiguous nature of language and improve workers’ digitally mediated interactions. Human communication can be discouragingly imprecise and is characterized by a strong linguistic ambiguity; this represents an enormous challenge for the computer analysis of natural language. In this thesis, I propose and develop a new data interpretation layer for the processing of natural language based on the human cognitive preferences of the conversants themselves. Such a semantic analysis merges information derived both from the content and from the associated social and individual contexts, as well as the social dynamics that emerge online. At the same time, assessment taxonomies are used to analyze online comportment at the individual and community level in order to successfully identify characteristics leading to greater effectiveness of communication. Measurement patterns for identifying effective methods of individual interaction with regard to individual cognitive and learning preferences are also evaluated; a novel Cyber-Cognitive Identity (CCI)—a perceptual profile of an individual’s cognitive and learning styles—is proposed. Accommodation of such cognitive preferences can greatly facilitate knowledge management in the geographically dispersed and collaborative digital environment. Use of the CCI is proposed for cognitively labeled Latent Dirichlet Allocation (CLLDA), a novel method for automatically labeling and clustering knowledge that does not rely solely on probabilistic methods, but rather on a fusion of machine learning algorithms and the cognitive identities of the associated individuals interacting in a digitally mediated environment. Advantages include: a greater perspicuity of dynamic and meaningful cognitive rules leading to greater tagging accuracy and a higher content portability at the sentence, document, and corpus level with respect to digital communication.
In recent years, the ever-growing amount of documents on the Web as well as in closed systems for private or business contexts led to a considerable increase of valuable textual information about topics, events, and entities. It is a truism that the majority of information (i.e., business-relevant data) is only available in unstructured textual form. The text mining research field comprises various practice areas that have the common goal of harvesting high-quality information from textual data. These information help addressing users' information needs.
In this thesis, we utilize the knowledge represented in user-generated content (UGC) originating from various social media services to improve text mining results. These social media platforms provide a plethora of information with varying focuses. In many cases, an essential feature of such platforms is to share relevant content with a peer group. Thus, the data exchanged in these communities tend to be focused on the interests of the user base. The popularity of social media services is growing continuously and the inherent knowledge is available to be utilized. We show that this knowledge can be used for three different tasks.
Initially, we demonstrate that when searching persons with ambiguous names, the information from Wikipedia can be bootstrapped to group web search results according to the individuals occurring in the documents. We introduce two models and different means to handle persons missing in the UGC source. We show that the proposed approaches outperform traditional algorithms for search result clustering. Secondly, we discuss how the categorization of texts according to continuously changing community-generated folksonomies helps users to identify new information related to their interests. We specifically target temporal changes in the UGC and show how they influence the quality of different tag recommendation approaches. Finally, we introduce an algorithm to attempt the entity linking problem, a necessity for harvesting entity knowledge from large text collections. The goal is the linkage of mentions within the documents with their real-world entities. A major focus lies on the efficient derivation of coherent links.
For each of the contributions, we provide a wide range of experiments on various text corpora as well as different sources of UGC.
The evaluation shows the added value that the usage of these sources provides and confirms the appropriateness of leveraging user-generated content to serve different information needs.
3D geovisualization systems (3DGeoVSs) that use 3D geovirtual environments as a conceptual and technical framework are increasingly used for various applications. They facilitate obtaining insights from ubiquitous geodata by exploiting human abilities that other methods cannot provide. 3DGeoVSs are often complex and evolving systems required to be adaptable and to leverage distributed resources. Designing a 3DGeoVS based on service-oriented architectures, standards, and image-based representations (SSI) facilitates resource sharing and the agile and efficient construction and change of interoperable systems. In particular, exploiting image-based representations (IReps) of 3D views on geodata supports taking full advantage of the potential of such system designs by providing an efficient, decoupled, interoperable, and increasingly applied representation.
However, there is insufficient knowledge on how to build service-oriented, standards-based 3DGeoVSs that exploit IReps. This insufficiency is substantially due to technology and interoperability gaps between the geovisualization domain and further domains that such systems rely on.
This work presents a coherent framework of contributions that support designing the software architectures of targeted systems and exploiting IReps for providing, styling, and interacting with geodata. The contributions uniquely integrate existing concepts from multiple domains and novel contributions for identified limitations. The proposed software reference architecture (SRA) for 3DGeoVSs based on SSI facilitates designing concrete software architectures of such systems. The SRA describes the decomposition of 3DGeoVSs into a network of services and integrates the following contributions to facilitate exploiting IReps effectively and efficiently. The proposed generalized visualization pipeline model generalizes the prevalent visualization pipeline model and overcomes its expressiveness limitations with respect to transforming IReps. The proposed approach for image-based provisioning enables generating and supplying service consumers with image-based views (IViews). IViews act as first-class data entities in the communication between services and provide a suitable IRep and encoding of geodata. The proposed approach for image-based styling separates concerns of styling from image generation and enables styling geodata uniformly represented as IViews specified as algebraic compositions of high-level styling operators. The proposed approach for interactive image-based novel view generation enables generating new IViews from existing IViews in response to interactive manipulations of the viewing camera and includes an architectural pattern that generalizes common novel view generation. The proposed interactive assisting, constrained 3D navigation technique demonstrates how a navigation technique can be built that supports users in navigating multiscale virtual 3D city models, operates in 3DGeoVSs based on SSI as an application of the SRA, can exploit IReps, and can support collaborating services in exploiting IReps.
The validity of the contributions is supported by proof-of-concept prototype implementations and applications and effectiveness and efficiency studies including a user study. Results suggest that this work promises to support designing 3DGeoVSs based on SSI that are more effective and efficient and that can exploit IReps effectively and efficiently. This work presents a template software architecture and key building blocks for building novel IT solutions and applications for geodata, e.g., as components of spatial data infrastructures.