publish.UP Search

2 results

By all these lovely tokens... Merging conflicting tokenizations (2012)

Chiarcos, Christian ; Ritz, Julia ; Stede, Manfred

Given the contemporary trend toward modular NLP architectures and multiple annotation frameworks, the existence of concurrent tokenizations of the same text represents a pervasive problem in everyday NLP practice and poses a non-trivial theoretical problem for the integration of linguistic annotations and their interpretability in general. This paper describes a solution for integrating different tokenizations using a standoff XML format, and discusses the consequences from a corpus-linguistic perspective.

Inter-operability and reusability: the science of annotation (2012)

Stede, Manfred ; Huang, Chu-Ren

Annotating linguistic data has become a major field of interest, both for supplying the necessary data for machine learning approaches to NLP applications, and as a research issue in its own right. This comprises issues of technical formats, tools, and methodologies of annotation. We provide a brief overview of these notions and then introduce the papers assembled in this special issue.
