Finite state methods for natural language processing often require the construction and the intersection of several automata. In this paper, we investigate the question of determining the best order in which these intersections should be performed. We take as an example lexical disambiguation in polarity grammars. We show that there is no efficient way to minimize the state complexity of these intersections.
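As a rough illustration of why the intersection order can matter, the sketch below intersects three toy automata with the standard product construction and compares the size of the intermediate result. It is not the paper's method and does not model polarity grammars; the automata `EVEN_A`, `ODD_B`, `ONLY_B` and the helpers `intersect`, `num_states`, `accepts` are hypothetical names introduced for this example only.

```python
# A DFA is a triple (start, accepting_states, transitions), where transitions
# maps (state, symbol) -> state. Missing entries mean "reject".

def intersect(a, b):
    """Product construction, restricted to states reachable from the start pair."""
    start_a, acc_a, delta_a = a
    start_b, acc_b, delta_b = b
    start = (start_a, start_b)
    delta, accepting = {}, set()
    seen, stack = {start}, [start]
    while stack:
        p, q = stack.pop()
        if p in acc_a and q in acc_b:
            accepting.add((p, q))
        for (src, sym), dst_a in delta_a.items():
            if src != p:
                continue
            dst_b = delta_b.get((q, sym))
            if dst_b is None:
                continue  # b has no move on this symbol
            nxt = (dst_a, dst_b)
            delta[((p, q), sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return start, accepting, delta

def num_states(dfa):
    """Count the start state plus every state mentioned in a transition."""
    start, _, delta = dfa
    states = {start}
    for (src, _), dst in delta.items():
        states.update((src, dst))
    return len(states)

def accepts(dfa, word):
    start, accepting, delta = dfa
    state = start
    for sym in word:
        state = delta.get((state, sym))
        if state is None:
            return False
    return state in accepting

# Toy automata over {a, b} (assumptions for illustration only).
EVEN_A = (0, {0}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1})
ODD_B = (0, {1}, {(0, "b"): 1, (1, "b"): 0, (0, "a"): 0, (1, "a"): 1})
ONLY_B = (0, {1}, {(0, "b"): 1})  # accepts exactly the word "b"

# Two evaluation orders for the same three-way intersection.
first_ab = intersect(EVEN_A, ODD_B)    # intermediate result: 4 states
first_ac = intersect(EVEN_A, ONLY_B)   # intermediate result: 2 states
left = intersect(first_ab, ONLY_B)
right = intersect(first_ac, ODD_B)

print(num_states(first_ab), num_states(first_ac))
print(accepts(left, "b"), accepts(right, "b"))
```

Both orders recognise the same language, but intersecting with the most restrictive automaton first keeps the intermediate automaton smaller here. The abstract's point is that choosing such an order optimally is hard in general.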
Since Harris’ parser in the late 50s, multiword units have been progressively integrated in parsers. Nevertheless, for the most part, they are still restricted to compound words, which are more stable and less numerous. In fact, language is full of semi-fixed expressions that also form basic semantic units: semi-fixed adverbial expressions (e.g. time), collocations. As with compounds, identifying these structures limits the combinatorial complexity induced by lexical ambiguity. In this paper, we detail an experiment that largely integrates these notions in a finite-state procedure of segmentation into super-chunks, preliminary to a parser. We show that the chunker, developed for French, reaches 92.9% precision and 98.7% recall. Moreover, multiword units realize 36.6% of the attachments within nominal and prepositional phrases.
This paper describes a two-level formalism where feature structures are used in contextual rules. Whereas usual two-level grammars describe rational sets over symbol pairs, this new formalism uses tree-structured regular expressions. They allow an explicit and precise definition of the scope of feature structures. A given surface form may be described using several feature structures. Feature unification is expressed in contextual rules using variables, as in a unification grammar. Grammars are compiled into finite-state multi-tape transducers.
This article describes an HMM-based word-alignment method that can selectively enforce a contiguity constraint. This method has a direct application in the extraction of a bilingual terminological lexicon from a parallel corpus, but can also be used as a preliminary step for the extraction of phrase pairs in a Phrase-Based Statistical Machine Translation system. Contiguous source words composing terms are aligned to contiguous target language words. The HMM is transformed into a Weighted Finite State Transducer (WFST) and contiguity constraints are enforced by specific multi-tape WFSTs. The proposed method is especially well suited to cases where basic linguistic resources (morphological analyzers, part-of-speech taggers and term extractors) are available for the source language only.
Nested complementation plays an important role in expressing counter-free (i.e. star-free and first-order definable) languages and their hierarchies. In addition, methods that compile phonological rules into finite-state networks use double-nested complementation or “double negation”. This paper reviews how double-nested complementation extends to a relatively new operation, generalized restriction (GR), coined by the author (Yli-Jyrä and Koskenniemi 2004). This operation encapsulates a double-nested complementation and the elimination of a concatenation marker, diamond, whose finite occurrences align concatenations in the arguments of the operation. The paper demonstrates that the GR operation has interesting potential for expressing regular languages, various kinds of grammars, bimorphisms and relations. This motivates further study of an optimized implementation of the operator.
Observational evidence exists that winds of massive stars are clumped. Many massive star systems are known as non-thermal particle production sites, as indicated by their synchrotron emission in the radio band. As a consequence, they are also considered candidate sites for non-thermal high-energy photon production up to gamma-ray energies. The present work considers the expected effects of wind clumpiness on the emitting relativistic particle spectrum in colliding wind systems, built up from the pool of thermal wind particles through diffusive particle acceleration, and taking into account inverse Compton and synchrotron losses. In comparison to a homogeneous wind, a clumpy wind causes flux variations of the emitting particle spectrum when the clump enters the wind collision region. It is found that the spectral features associated with this variability move temporally from low to high energy bands, with the time shift between any two spectral bands depending on clump size, filling factor, and the energy dependence of particle energy gains and losses.
The most massive stars are those with the shortest but most active life. One group of massive stars, the Luminous Blue Variables (LBVs), of which only a few objects are known, is of particular interest concerning the stability of stars. They have a high mass loss rate and are close to being unstable. This is even more likely as rotation becomes an important factor in the stellar evolution of these stars. Through massive stellar winds and sometimes giant eruptions, LBV nebulae are formed. Various aspects of the evolution in the LBV phase lead, besides the large-scale morphological and kinematical differences, to a diversity of small structures like clumps, rims, and outflows in these nebulae.