Refine
Year of publication
Document Type
- Article (24)
- Monograph/Edited Volume (4)
- Other (1)
- Postprint (1)
Keywords
- Festschrift (2)
- Informationsstruktur (2)
- Linguistic annotation (2)
- Linguistik (2)
- Morphologie (2)
- Syntax (2)
- argument mining (2)
- argumentation structure (2)
- festschrift (2)
- information structure (2)
The meaning of linguistic connectives has often been characterized in terms of their position in a bipartite (semantic, pragmatic) or a tripartite (content, epistemic, speech act) structure of domains, depending on what kinds of entities are being connected (largely: propositions or speech acts). This paper argues that a more fine-grained analysis can be achieved by directing some more attention to the characterization of the entities being related. We propose an inventory of categories of illocutionary status for labelling the spans that are being connected. On this basis, the distinction between the content and the epistemic domain, in particular, can be made more explicit. Focusing on the group of causal connectives in German, we conducted a corpus annotation study from which we derived distinct pragmatic 'usage profiles' of the most frequent causal connectives. Finally, we offer some suggestions on the role of illocutions in relation-based accounts of discourse structure.
Of Trees and Birds
(2019)
Gisbert Fanselow’s work has been invaluable and inspiring to many researchers working on syntax, morphology, and information structure, both from a theoretical and from an experimental perspective. This volume comprises a collection of articles dedicated to Gisbert on the occasion of his 60th birthday, covering a range of topics from these areas and beyond. The contributions have in common that in a broad sense they have to do with language structures (and thus trees), and that in a more specific sense they have to do with birds. They thus cover two of Gisbert’s major interests in- and outside of the linguistic world (and perhaps even at the interface).
We present a lexicon-based approach to extracting sentiment from text. The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation. SO-CAL is applied to the polarity classification task, the process of assigning a positive or negative label to a text that captures the text's opinion towards its main subject matter. We show that SO-CAL's performance is consistent across domains and on completely unseen data. Additionally, we describe the process of dictionary creation, and our use of Mechanical Turk to check dictionaries for consistency and reliability.
Annotating linguistic data has become a major field of interest, both for supplying the necessary data for machine learning approaches to NLP applications, and as a research issue in its own right. This comprises issues of technical formats, tools, and methodologies of annotation. We provide a brief overview of these notions and then introduce the papers assembled in this special issue.
Handbuch Textannotation
(2015)
Das Potsdamer Kommentarkorpus ist eine Sammlung von Zeitungstexten, die dem Genre ‘Kommentar' zuzuordnen sind. Der öffentlich verfügbare Teil besteht aus 175 Texten aus der Märkischen Allgemeinen Zeitung, die hinsichtlich Syntax, Koreferenz, Konnektoren und Rhetorische Struktur manuell annotiert wurden. Weitere Ebenen werden bei zukünftigen Korpusversionen hinzukommen. Dieses Buch enthält die Annotationsrichtlinien, die der Bearbeitung des öffentlichen Teils des Korpus zugrunde lagen, sowie auch anderer Teile, bei denen mit weiteren Annotationsebenen experimentiert wurde. Die meisten der Richtlinien werden auch für ähnliche Text-Genres und für andere Sprachen verwendbar sein.
The notion of coherence relations is quite widely accepted in general, but concrete proposals differ considerably on the questions of how they should be motivated, which relations are to be assumed, and how they should be defined. This paper takes a "bottom-up" perspective by assessing the contribution made by linguistic signals (connectives), using insights from the relevant literature as well as verification by practical text annotation. We work primarily with the German language here and focus on the realm of contrast. Thus, we suggest a new inventory of contrastive connective functions and discuss their relationship to contrastive coherence relations that have been proposed in earlier work.
Empirical studies of text coherence often use tree-like structures in the spirit of Rhetorical Structure Theory (RST) as representational device. This paper identifies several sources of ambiguity in RST-inspired trees and argues that such structures are therefore not as explanatory as a text representation should be. As an alternative, an approach toward multi-level annotation (MLA) of texts is proposed, which separates the information into distinct levels of representation, in particular: referential structure, thematic structure, conjunctive relations, and intentional structure. Levels are conceptually built upon each other, and human annotators can produce them using a dedicated software environment. We argue that the resulting multi-level corpora are descriptively more adequate, and as a resource are more useful than RST-style treebanks.
Connective-Lex
(2019)
In this paper, we present a tangible outcome of the TextLink network: a joint online database project displaying and linking existing and newly-created lexicons of discourse connectives in multiple languages. We discuss the definition and demarcation of the class of connectives that should be included in such a resource, and present the syntactic, semantic/pragmatic, and lexicographic information we collected. Further, the technical implementation of the database and the search functionality are presented. We discuss how the multilingual integration of several connective lexicons provides added value for linguistic researchers and other users interested in connectives, by allowing crosslinguistic comparison and a direct linking between discourse relational devices in different languages. Finally, we provide pointers for possible future extensions both in breadth (i.e., by adding lexicons for additional languages) and depth (by extending the information provided for each connective item and by strengthening the crosslinguistic links).