Developing a finite-state morphological analyzer for Urdu and Hindi


We introduce and discuss a number of issues that arise in building a finite-state morphological analyzer for Urdu, in particular potential ambiguity and non-concatenative morphology. Our approach allows for an underlyingly similar treatment of both Urdu and Hindi via a cascade of finite-state transducers that transliterates the very different scripts into a common ASCII transcription system. Because this transliteration system is built with the same XFST tools in which the common Urdu/Hindi morphological analyzer is implemented, no compatibility problems arise.
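The idea of a transliteration cascade can be illustrated with a minimal sketch. This is plain Python rather than XFST, and it is not the authors' implementation: the helper names (`compose`, `table_stage`), the two tiny character tables, and the ASCII symbols chosen (for example `A` for a long a) are assumptions made purely for illustration. Each stage is a function from string to string, and composing the stages mimics the composition of transducers.

```python
import unicodedata
from functools import reduce


def compose(*stages):
    """Chain string-to-string stages left to right, like a transducer cascade."""
    def run(text):
        return reduce(lambda acc, stage: stage(acc), stages, text)
    return run


def table_stage(table):
    """Build a stage that rewrites each character via a lookup table.

    Characters missing from the table pass through unchanged.
    """
    def stage(text):
        return "".join(table.get(ch, ch) for ch in text)
    return stage


def nfc(text):
    """Normalize the input to Unicode NFC before any table lookup."""
    return unicodedata.normalize("NFC", text)


# Hypothetical toy tables; the real transcription scheme is not given here.
URDU_TABLE = {
    "\u06a9": "k",  # ARABIC LETTER KEHEH
    "\u0650": "i",  # ARABIC KASRA (short i diacritic)
    "\u062a": "t",  # ARABIC LETTER TEH
    "\u0627": "A",  # ARABIC LETTER ALEF
    "\u0628": "b",  # ARABIC LETTER BEH
}
HINDI_TABLE = {
    "\u0915": "k",  # DEVANAGARI LETTER KA
    "\u093f": "i",  # DEVANAGARI VOWEL SIGN I
    "\u0924": "t",  # DEVANAGARI LETTER TA
    "\u093e": "A",  # DEVANAGARI VOWEL SIGN AA
    "\u092c": "b",  # DEVANAGARI LETTER BA
}

urdu_to_ascii = compose(nfc, table_stage(URDU_TABLE))
hindi_to_ascii = compose(nfc, table_stage(HINDI_TABLE))

# The word for "book", written with the short-vowel diacritic in Urdu.
urdu_word = "\u06a9\u0650\u062a\u0627\u0628"
hindi_word = "\u0915\u093f\u0924\u093e\u092c"

print(urdu_to_ascii(urdu_word))
print(hindi_to_ascii(hindi_word))
```

In this toy setup both inputs map to the same ASCII string, so a single downstream analyzer could in principle consume either script. A character-by-character table ignores context-dependent phenomena, such as short vowels that Urdu text usually leaves unwritten, which is one place where the ambiguity mentioned in the abstract can arise.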

Metadata
Authors:	Tina Bögel, Miriam Butt, Annette Hautli, Sebastian Sulger
URN:	urn:nbn:de:kobv:517-opus-27155
Publication type:	Conference publication
Language:	English
Year of publication:	2008
Publishing institution:	Universität Potsdam
Release date:	11 December 2008
Organizational units:	External / External
DDC classification:	4 Language / 40 Language / 400 Language
Collection(s):	Universität Potsdam / Conference proceedings (non-serial) / Finite-state methods and natural language processing : 6th International Workshop, FSMNLP 2007 / II Regular Papers
License (German):	No public license: protected by copyright
External note:	The complete edition of the proceedings "Finite-state methods and natural language processing : 6th International Workshop, FSMNLP 2007 ; Revised Papers" is available: URN urn:nbn:de:kobv:517-opus-23812