Refine
Has Fulltext
- yes (3) (remove)
Document Type
- Doctoral Thesis (3)
Language
- English (3)
Keywords
- Kontext (3) (remove)
In this work the concept of 'context' is considered in five main points. First, context is seen as always necessary for an adequate explication of the concepts of meaning and understanding. Context always plays a role and is not merely brought into consideration when handling a special class of statements or terms, or when there is doubt and clarification is necessary. Second, context cannot be completely reduced to some system of representation. The reason for this is the presence of humans, which is always an important component of a context. Humans experience situations in ways that are not always reducible to symbolic representation. Third, contexts are in principle open. In normal cases they cannot be determined or described in advance. A context is not to be equated with a set of information. Fourth, we understand the parameters of a context pragmatically, which is why we are not led into doubt or even to meaning skepticism by the open nature of a context. This pragmatic knowledge belongs to the category of an ability. Fifth, contexts are, in principle, accessible. This denies the idea that some contexts are incommensurable. There are a number of pragmatic ways of accessing unfamiliar contexts. Some of these are here examined in light of the so-called 'culture wars' in the U.S.A.
This thesis gives formal definitions of discourse-givenness, coreference and reference, and reports on experiments with computational models of discourse-givenness of noun phrases for English and German. Definitions are based on Bach's (1987) work on reference, Kibble and van Deemter's (2000) work on coreference, and Kamp and Reyle's Discourse Representation Theory (1993). For the experiments, the following corpora with coreference annotation were used: MUC-7, OntoNotes and ARRAU for Englisch, and TueBa-D/Z for German. As for classification algorithms, they cover J48 decision trees, the rule based learner Ripper, and linear support vector machines. New features are suggested, representing the noun phrase's specificity as well as its context, which lead to a significant improvement of classification quality.
The Semantic Web provides information contained in the World Wide Web as machine-readable facts. In comparison to a keyword-based inquiry, semantic search enables a more sophisticated exploration of web documents. By clarifying the meaning behind entities, search results are more precise and the semantics simultaneously enable an exploration of semantic relationships. However, unlike keyword searches, a semantic entity-focused search requires that web documents are annotated with semantic representations of common words and named entities. Manual semantic annotation of (web) documents is time-consuming; in response, automatic annotation services have emerged in recent years. These annotation services take continuous text as input, detect important key terms and named entities and annotate them with semantic entities contained in widely used semantic knowledge bases, such as Freebase or DBpedia. Metadata of video documents require special attention. Semantic analysis approaches for continuous text cannot be applied, because information of a context in video documents originates from multiple sources possessing different reliabilities and characteristics. This thesis presents a semantic analysis approach consisting of a context model and a disambiguation algorithm for video metadata. The context model takes into account the characteristics of video metadata and derives a confidence value for each metadata item. The confidence value represents the level of correctness and ambiguity of the textual information of the metadata item. The lower the ambiguity and the higher the prospective correctness, the higher the confidence value. The metadata items derived from the video metadata are analyzed in a specific order from high to low confidence level. Previously analyzed metadata are used as reference points in the context for subsequent disambiguation. The contextually most relevant entity is identified by means of descriptive texts and semantic relationships to the context. The context is created dynamically for each metadata item, taking into account the confidence value and other characteristics. The proposed semantic analysis follows two hypotheses: metadata items of a context should be processed in descendent order of their confidence value, and the metadata that pertains to a context should be limited by content-based segmentation boundaries. The evaluation results support the proposed hypotheses and show increased recall and precision for annotated entities, especially for metadata that originates from sources with low reliability. The algorithms have been evaluated against several state-of-the-art annotation approaches. The presented semantic analysis process is integrated into a video analysis framework and has been successfully applied in several projects for the purpose of semantic video exploration of videos.