ANNIS
(2004)
In this paper, we discuss the design and implementation of our first version of the database "ANNIS" ("ANNotation of Information Structure"). For research based on empirical data, ANNIS provides a uniform environment for storing this data together with its linguistic annotations. A central database promotes standardized annotation, which facilitates interpretation and comparison of the data. ANNIS is used through a standard web browser and offers tier-based visualization of data and annotations, as well as search facilities that allow for cross-level and cross-sentential queries. The paper motivates the design of the system, characterizes its user interface, and provides an initial technical evaluation of ANNIS with respect to data size and query processing.
Coherence relations are typically taken to link two clauses or larger units and to be signaled at the text surface by conjunctions and certain adverbials. Relations, however, can also hold within clauses, indicated by prepositions like despite, due to, or in case of, when these have an internal argument denoting an eventuality. Although these prepositions act as reliable cues to indicate a specific relation, others are lexically more neutral. We investigated this situation for the German preposition bei, which turns out to be highly ambiguous. We demonstrate the range of readings in a corpus study, proposing six more specific prepositions as a comprehensive substitution set. All these uses of bei share a common kernel meaning, which is missed by the standard accounts that assume lexical polysemy. We examine the range of coherence relations that can be signaled by bei and provide some factors that support the disambiguation task in a framework of discourse interpretation.
Empirical studies of text coherence often use tree-like structures in the spirit of Rhetorical Structure Theory (RST) as a representational device. This paper identifies several sources of ambiguity in RST-inspired trees and argues that such structures are therefore not as explanatory as a text representation should be. As an alternative, an approach toward multi-level annotation (MLA) of texts is proposed, which separates the information into distinct levels of representation, in particular: referential structure, thematic structure, conjunctive relations, and intentional structure. Levels are conceptually built upon each other, and human annotators can produce them using a dedicated software environment. We argue that the resulting multi-level corpora are descriptively more adequate and, as a resource, more useful than RST-style treebanks.