publish.UP Search

‘What will we learn from the current crisis?’ (2021)

Björk, Jennie ; Hölzle, Katharina ; Boer, Harry

X-tracking the usage interest on web sites (2011)

Wang, Long

The exponential expanding of the numbers of web sites and Internet users makes WWW the most important global information resource. From information publishing and electronic commerce to entertainment and social networking, the Web allows an inexpensive and efficient access to the services provided by individuals and institutions. The basic units for distributing these services are the web sites scattered throughout the world. However, the extreme fragility of web services and content, the high competence between similar services supplied by different sites, and the wide geographic distributions of the web users drive the urgent requirement from the web managers to track and understand the usage interest of their web customers. This thesis, "X-tracking the Usage Interest on Web Sites", aims to fulfill this requirement. "X" stands two meanings: one is that the usage interest differs from various web sites, and the other is that usage interest is depicted from multi aspects: internal and external, structural and conceptual, objective and subjective. "Tracking" shows that our concentration is on locating and measuring the differences and changes among usage patterns. This thesis presents the methodologies on discovering usage interest on three kinds of web sites: the public information portal site, e-learning site that provides kinds of streaming lectures and social site that supplies the public discussions on IT issues. On different sites, we concentrate on different issues related with mining usage interest. The educational information portal sites were the first implementation scenarios on discovering usage patterns and optimizing the organization of web services. In such cases, the usage patterns are modeled as frequent page sets, navigation paths, navigation structures or graphs. However, a necessary requirement is to rebuild the individual behaviors from usage history. We give a systematic study on how to rebuild individual behaviors. Besides, this thesis shows a new strategy on building content clusters based on pair browsing retrieved from usage logs. The difference between such clusters and the original web structure displays the distance between the destinations from usage side and the expectations from design side. Moreover, we study the problem on tracking the changes of usage patterns in their life cycles. The changes are described from internal side integrating conceptual and structure features, and from external side for the physical features; and described from local side measuring the difference between two time spans, and global side showing the change tendency along the life cycle. A platform, Web-Cares, is developed to discover the usage interest, to measure the difference between usage interest and site expectation and to track the changes of usage patterns. E-learning site provides the teaching materials such as slides, recorded lecture videos and exercise sheets. We focus on discovering the learning interest on streaming lectures, such as real medias, mp4 and flash clips. Compared to the information portal site, the usage on streaming lectures encapsulates the variables such as viewing time and actions during learning processes. The learning interest is discovered in the form of answering 6 questions, which covers finding the relations between pieces of lectures and the preference among different forms of lectures. We prefer on detecting the changes of learning interest on the same course from different semesters. The differences on the content and structure between two courses leverage the changes on the learning interest. We give an algorithm on measuring the difference on learning interest integrated with similarity comparison between courses. A search engine, TASK-Moniminer, is created to help the teacher query the learning interest on their streaming lectures on tele-TASK site. Social site acts as an online community attracting web users to discuss the common topics and share their interesting information. Compared to the public information portal site and e-learning web site, the rich interactions among users and web content bring the wider range of content quality, on the other hand, provide more possibilities to express and model usage interest. We propose a framework on finding and recommending high reputation articles in a social site. We observed that the reputation is classified into global and local categories; the quality of the articles having high reputation is related with the content features. Based on these observations, our framework is implemented firstly by finding the articles having global or local reputation, and secondly clustering articles based on their content relations, and then the articles are selected and recommended from each cluster based on their reputation ranks.

What's creative about sentences? (2022)

Weinstein, Theresa Julia ; Ceh, Simon Majed ; Meinel, Christoph ; Benedek, Mathias

Evaluating creativity of verbal responses or texts is a challenging task due to psychometric issues associated with subjective ratings and the peculiarities of textual data. We explore an approach to objectively assess the creativity of responses in a sentence generation task to 1) better understand what language-related aspects are valued by human raters and 2) further advance the developments toward automating creativity evaluations. Over the course of two prior studies, participants generated 989 four-word sentences based on a four-letter prompt with the instruction to be creative. We developed an algorithm that scores each sentence on eight different metrics including 1) general word infrequency, 2) word combination infrequency, 3) context-specific word uniqueness, 4) syntax uniqueness, 5) rhyme, 6) phonetic similarity, and similarity of 7) sequence spelling and 8) semantic meaning to the cue. The text metrics were then used to explain the averaged creativity ratings of eight human raters. We found six metrics to be significantly correlated with the human ratings, explaining a total of 16% of their variance. We conclude that the creative impression of sentences is partly driven by different aspects of novelty in word choice and syntax, as well as rhythm and sound, which are amenable to objective assessment.

What are we talking about when we talk about technology pivots? (2020)

Bohn, Nicolai ; Kundisch, Dennis

Technology pivots were designed to help digital startups make adjustments to the technology underpinning their products and services. While academia and the media make liberal use of the term "technology pivot," they rarely align themselves to Ries' foundational conceptualization. Recent research suggests that a more granulated conceptualization of technology pivots is required. To scientifically derive a comprehensive conceptualization, we conduct a Delphi study with a panel of 38 experts drawn from academia and practice to explore their understanding of "technology pivots." Our study thus makes an important contribution to advance the seminal work by Ries on technology pivots.

Wendepunkt für Gesundheit (2019)

Böttinger, Erwin

Weight-based strategy for an I/O-intensive application at a cloud data center (2018)

Peng, Junjie ; Liu, Danxu ; Wang, Yingtao ; Zeng, Ying ; Cheng, Feng ; Zhang, Wenqiang

Applications with different characteristics in the cloud may have different resources preferences. However, traditional resource allocation and scheduling strategies rarely take into account the characteristics of applications. Considering that an I/O-intensive application is a typical type of application and that frequent I/O accesses, especially small files randomly accessing the disk, may lead to an inefficient use of resources and reduce the quality of service (QoS) of applications, a weight allocation strategy is proposed based on the available resources that a physical server can provide as well as the characteristics of the applications. Using the weight obtained, a resource allocation and scheduling strategy is presented based on the specific application characteristics in the data center. Extensive experiments show that the strategy is correct and can guarantee a high concurrency of I/O per second (IOPS) in a cloud data center with high QoS. Additionally, the strategy can efficiently improve the utilization of the disk and resources of the data center without affecting the service quality of applications.

Web-based development in the lively kernel (2011)

The World Wide Web as an application platform becomes increasingly important. However, the development of Web applications is often more complex than for the desktop. Web-based development environments like Lively Webwerkstatt can mitigate this problem by making the development process more interactive and direct. By moving the development environment into the Web, applications can be developed collaboratively in a Wiki-like manner. This report documents the results of the project seminar on Web-based Development Environments 2010. In this seminar, participants extended the Web-based development environment Lively Webwerkstatt. They worked in small teams on current research topics from the field of Web-development and tool support for programmers and implemented their results in the Webwerkstatt environment.

Weak conformance between process models and synchronized object life cycles (2014)

Meyer, Andreas ; Weske, Mathias

Process models specify behavioral execution constraints between activities as well as between activities and data objects. A data object is characterized by its states and state transitions represented as object life cycle. For process execution, all behavioral execution constraints must be correct. Correctness can be verified via soundness checking which currently only considers control flow information. For data correctness, conformance between a process model and its object life cycles is checked. Current approaches abstract from dependencies between multiple data objects and require fully specified process models although, in real-world process repositories, often underspecified models are found. Coping with these issues, we introduce the concept of synchronized object life cycles and we define a mapping of data constraints of a process model to Petri nets extending an existing mapping. Further, we apply the notion of weak conformance to process models to tell whether each time an activity needs to access a data object in a particular state, it is guaranteed that the data object is in or can reach the expected state. Then, we introduce an algorithm for an integrated verification of control flow correctness and weak data conformance using soundness checking.

Von Biopolymeren und Biokunststoffen : Antrittsvorlesung 2009-06-04 (2009)

Fink, Hans-Peter

Prof. Fink wird zum einen auf die industriell schon lange genutzten natürlichen Polymere wie Cellulose, Stärke und Lignin eingehen, zum anderen auf neue Entwicklungen bei biobasierten Kunststoffen. Von besonderer Bedeutung ist dabei die Aufklärung von Zusammenhängen zwischen Prozessparametern, Strukturen und Eigenschaften.

VMI-PL: A monitoring language for virtual platforms using virtual machine introspection (2014)

Westphal, Florian ; Axelsson, Stefan ; Neuhaus, Christian ; Polze, Andreas

With the growth of virtualization and cloud computing, more and more forensic investigations rely on being able to perform live forensics on a virtual machine using virtual machine introspection (VMI). Inspecting a virtual machine through its hypervisor enables investigation without risking contamination of the evidence, crashing the computer, etc. To further access to these techniques for the investigator/researcher we have developed a new VMI monitoring language. This language is based on a review of the most commonly used VMI-techniques to date, and it enables the user to monitor the virtual machine's memory, events and data streams. A prototype implementation of our monitoring system was implemented in KVM, though implementation on any hypervisor that uses the common x86 virtualization hardware assistance support should be straightforward. Our prototype outperforms the proprietary VMWare VProbes in many cases, with a maximum performance loss of 18% for a realistic test case, which we consider acceptable. Our implementation is freely available under a liberal software distribution license. (C) 2014 Digital Forensics Research Workshop. Published by Elsevier Ltd. All rights reserved.

Visually specifying compliance rules and explaining their violations for business processes (2011)

Awad, Ahmed Mahmoud Hany Aly ; Weidlich, Matthias ; Weske, Mathias

A business process is a set of steps designed to be executed in a certain order to achieve a business value. Such processes are often driven by and documented using process models. Nowadays, process models are also applied to drive process execution. Thus, correctness of business process models is a must. Much of the work has been devoted to check general, domain-independent correctness criteria, such as soundness. However, business processes must also adhere to and show compliance with various regulations and constraints, the so-called compliance requirements. These are domain-dependent requirements. In many situations, verifying compliance on a model level is of great value, since violations can be resolved in an early stage prior to execution. However, this calls for using formal verification techniques, e.g., model checking, that are too complex for business experts to apply. In this paper, we utilize a visual language. BPMN-Q to express compliance requirements visually in a way similar to that used by business experts to build process models. Still, using a pattern based approach, each BPMN-Qgraph has a formal temporal logic expression in computational tree logic (CTL). Moreover, the user is able to express constraints, i.e., compliance rules, regarding control flow and data flow aspects. In order to provide valuable feedback to a user in case of violations, we depend on temporal logic querying approaches as well as BPMN-Q to visually highlight paths in a process model whose execution causes violations.

Visualizing movement dynamics in virtual urban environments (2006)

Nienhaus, Marc ; Gooch, Bruce ; Döllner, Jürgen Roland Friedrich

Dynamics in urban environments encompasses complex processes and phenomena such as related to movement (e.g.,traffic, people) and development (e.g., construction, settlement). This paper presents novel methods for creating human-centric illustrative maps for visualizing the movement dynamics in virtual 3D environments. The methods allow a viewer to gain rapid insight into traffic density and flow. The illustrative maps represent vehicle behavior as light threads. Light threads are a familiar visual metaphor caused by moving light sources producing streaks in a long-exposure photograph. A vehicle’s front and rear lights produce light threads that convey its direction of motion as well as its velocity and acceleration. The accumulation of light threads allows a viewer to quickly perceive traffic flow and density. The light-thread technique is a key element to effective visualization systems for analytic reasoning, exploration, and monitoring of geospatial processes.

Visualization techniques for the analysis of software behavior and related structures (2014)

Trümper, Jonas

Software maintenance encompasses any changes made to a software system after its initial deployment and is thereby one of the key phases in the typical software-engineering lifecycle. In software maintenance, we primarily need to understand structural and behavioral aspects, which are difficult to obtain, e.g., by code reading. Software analysis is therefore a vital tool for maintaining these systems: It provides - the preferably automated - means to extract and evaluate information from their artifacts such as software structure, runtime behavior, and related processes. However, such analysis typically results in massive raw data, so that even experienced engineers face difficulties directly examining, assessing, and understanding these data. Among other things, they require tools with which to explore the data if no clear question can be formulated beforehand. For this, software analysis and visualization provide its users with powerful interactive means. These enable the automation of tasks and, particularly, the acquisition of valuable and actionable insights into the raw data. For instance, one means for exploring runtime behavior is trace visualization. This thesis aims at extending and improving the tool set for visual software analysis by concentrating on several open challenges in the fields of dynamic and static analysis of software systems. This work develops a series of concepts and tools for the exploratory visualization of the respective data to support users in finding and retrieving information on the system artifacts concerned. This is a difficult task, due to the lack of appropriate visualization metaphors; in particular, the visualization of complex runtime behavior poses various questions and challenges of both a technical and conceptual nature. This work focuses on a set of visualization techniques for visually representing control-flow related aspects of software traces from shared-memory software systems: A trace-visualization concept based on icicle plots aids in understanding both single-threaded as well as multi-threaded runtime behavior on the function level. The concept’s extensibility further allows the visualization and analysis of specific aspects of multi-threading such as synchronization, the correlation of such traces with data from static software analysis, and a comparison between traces. Moreover, complementary techniques for simultaneously analyzing system structures and the evolution of related attributes are proposed. These aim at facilitating long-term planning of software architecture and supporting management decisions in software projects by extensions to the circular-bundle-view technique: An extension to 3-dimensional space allows for the use of additional variables simultaneously; interaction techniques allow for the modification of structures in a visual manner. The concepts and techniques presented here are generic and, as such, can be applied beyond software analysis for the visualization of similarly structured data. The techniques' practicability is demonstrated by several qualitative studies using subject data from industry-scale software systems. The studies provide initial evidence that the techniques' application yields useful insights into the subject data and its interrelationships in several scenarios.

Visual suggestions for improvements in business process diagrams (2011)

Laue, Ralf ; Awad, Ahmed Mahmoud Hany Aly

Business processes are commonly modeled using a graphical modeling language. The most widespread notation for this purpose is business process diagrams in the Business Process Modeling Notation (BPMN). In this article, we use the visual query language BPMN-Q for expressing patterns that are related to possible problems in such business process diagrams. We discuss two classes of problems that can be found frequently in real-world models: sequence flow errors and model fragments that can make the model difficult to understand. By using a query processor, a business process modeler is able to identify possible errors in business process diagrams. Moreover, the erroneous parts of the business process diagram can be highlighted when an instance of an error pattern is found. This way, the modeler gets an easy-to-understand feedback in the visual modeling language he or she is familiar with. This is an advantage over current validation methods, which usually lack this kind of intuitive feedback.

Virtualisierung und Cloud Computing : Konzepte, Technologiestudie, Marktübersicht (2011)

Meinel, Christoph ; Willems, Christian ; Roschke, Sebastian ; Schnjakin, Maxim

Virtualisierung und Cloud Computing gehören derzeit zu den wichtigsten Schlagworten für Betreiber von IT Infrastrukturen. Es gibt eine Vielzahl unterschiedlicher Technologien, Produkte und Geschäftsmodelle für vollkommen verschiedene Anwendungsszenarien. Die vorliegende Studie gibt zunächst einen detaillierten Überblick über aktuelle Entwicklungen in Konzepten und Technologien der Virtualisierungstechnologie – von klassischer Servervirtualisierung über Infrastrukturen für virtuelle Arbeitsplätze bis zur Anwendungsvirtualisierung und macht den Versuch einer Klassifikation der Virtualisierungsvarianten. Bei der Betrachtung des Cloud Computing-Konzepts werden deren Grundzüge sowie verschiedene Architekturvarianten und Anwendungsfälle eingeführt. Die ausführliche Untersuchung von Vorteilen des Cloud Computing sowie möglicher Bedenken, die bei der Nutzung von Cloud-Ressourcen im Unternehmen beachtet werden müssen, zeigt, dass Cloud Computing zwar große Chancen bietet, aber nicht für jede Anwendung und nicht für jeden rechtlichen und wirtschaftlichen Rahmen in Frage kommt.. Die anschließende Marktübersicht für Virtualisierungstechnologie zeigt, dass die großen Hersteller – Citrix, Microsoft und VMware – jeweils Produkte für fast alle Virtualisierungsvarianten anbieten und hebt entscheidende Unterschiede bzw. die Stärken der jeweiligen Anbieter heraus. So ist beispielsweise die Lösung von Citrix für Virtual Desktop Infrastructures sehr ausgereift, während Microsoft hier nur auf Standardtechnologie zurückgreifen kann. VMware hat als Marktführer die größte Verbreitung in Rechenzentren gefunden und bietet als einziger Hersteller echte Fehlertoleranz. Microsoft hingegen punktet mit der nahtlosen Integration ihrer Virtualisierungsprodukte in bestehende Windows-Infrastrukturen. Im Bereich der Cloud Computing-Systeme zeigen sich einige quelloffene Softwareprojekte, die durchaus für den produktiven Betrieb von sogenannten privaten Clouds geeignet sind.

Virtual prototypes for the model-based elicitation and validation of collaborative scenarios (2013)

Berg, Gregor

Requirements engineers have to elicit, document, and validate how stakeholders act and interact to achieve their common goals in collaborative scenarios. Only after gathering all information concerning who interacts with whom to do what and why, can a software system be designed and realized which supports the stakeholders to do their work. To capture and structure requirements of different (groups of) stakeholders, scenario-based approaches have been widely used and investigated. Still, the elicitation and validation of requirements covering collaborative scenarios remains complicated, since the required information is highly intertwined, fragmented, and distributed over several stakeholders. Hence, it can only be elicited and validated collaboratively. In times of globally distributed companies, scheduling and conducting workshops with groups of stakeholders is usually not feasible due to budget and time constraints. Talking to individual stakeholders, on the other hand, is feasible but leads to fragmented and incomplete stakeholder scenarios. Going back and forth between different individual stakeholders to resolve this fragmentation and explore uncovered alternatives is an error-prone, time-consuming, and expensive task for the requirements engineers. While formal modeling methods can be employed to automatically check and ensure consistency of stakeholder scenarios, such methods introduce additional overhead since their formal notations have to be explained in each interaction between stakeholders and requirements engineers. Tangible prototypes as they are used in other disciplines such as design, on the other hand, allow designers to feasibly validate and iterate concepts and requirements with stakeholders. This thesis proposes a model-based approach for prototyping formal behavioral specifications of stakeholders who are involved in collaborative scenarios. By simulating and animating such specifications in a remote domain-specific visualization, stakeholders can experience and validate the scenarios captured so far, i.e., how other stakeholders act and react. This interactive scenario simulation is referred to as a model-based virtual prototype. Moreover, through observing how stakeholders interact with a virtual prototype of their collaborative scenarios, formal behavioral specifications can be automatically derived which complete the otherwise fragmented scenarios. This, in turn, enables requirements engineers to elicit and validate collaborative scenarios in individual stakeholder sessions – decoupled, since stakeholders can participate remotely and are not forced to be available for a joint session at the same time. This thesis discusses and evaluates the feasibility, understandability, and modifiability of model-based virtual prototypes. Similarly to how physical prototypes are perceived, the presented approach brings behavioral models closer to being tangible for stakeholders and, moreover, combines the advantages of joint stakeholder sessions and decoupled sessions.

Virtual machine integrity verification in Crowd-Resourcing Virtual Laboratory (2019)

Sianipar, Johannes Harungguan ; Willems, Christian ; Meinel, Christoph

In cloud computing, users are able to use their own operating system (OS) image to run a virtual machine (VM) on a remote host. The virtual machine OS is started by the user using some interfaces provided by a cloud provider in public or private cloud. In peer to peer cloud, the VM is started by the host admin. After the VM is running, the user could get a remote access to the VM to install, configure, and run services. For the security reasons, the user needs to verify the integrity of the running VM, because a malicious host admin could modify the image or even replace the image with a similar image, to be able to get sensitive data from the VM. We propose an approach to verify the integrity of a running VM on a remote host, without using any specific hardware such as Trusted Platform Module (TPM). Our approach is implemented on a Linux platform where the kernel files (vmlinuz and initrd) could be replaced with new files, while the VM is running. kexec is used to reboot the VM with the new kernel files. The new kernel has secret codes that will be used to verify whether the VM was started using the new kernel files. The new kernel is used to further measuring the integrity of the running VM.

Vierter Deutscher IPv6 Gipfel 2011 (2012)

Am 1. und 2. Dezember 2011 fand am Hasso-Plattner-Institut für Softwaresystemtechnik GmbH in Potsdam der 4. Deutsche IPv6 Gipfel 2011 statt, dessen Dokumentation der vorliegende technische Report dient. Wie mit den vorhergegangenen nationalen IPv6-Gipfeln verfolgte der Deutsche IPv6-Rat auch mit dem 4. Gipfel, der unter dem Motto „Online on the Road - Der neue Standard IPv6 als Treiber der mobilen Kommunikation” stand, das Ziel, Einblicke in aktuelle Entwicklungen rund um den Einsatz von IPv6 diesmal mit einem Fokus auf die automobile Vernetzung zu geben. Gleichzeitig wurde betont, den effizienten und flächendeckenden Umstieg auf IPv6 voranzutreiben, Erfahrungen mit dem Umstieg auf und dem Einsatz von IPv6 auszutauschen, Wirtschaft und öffentliche Verwaltung zu ermutigen und motivieren, IPv6-basierte Lösungen einzusetzen und das öffentliche Problembewusstsein für die Notwendigkeit des Umstiegs auf IPv6 zu erhöhen. Ehrengast war in diesem Jahr die EU-Kommissarin für die Digitale Agenda, Neelie Kroes deren Vortrag von weiteren Beiträgen hochrangiger Vertretern aus Politik, Wissenschaft und Wirtschaft ergänzt wurde.

Vereinfachung der Entwicklung von Geschäftsanwendungen durch Konsolidierung von Programmierkonzepten und -technologien (2013)

Berov, Leonid ; Henning, Johannes ; Mattis, Toni ; Rein, Patrick ; Schreiber, Robin ; Seckler, Eric ; Steinert, Bastian ; Hirschfeld, Robert

Die Komplexität heutiger Geschäftsabläufe und die Menge der zu verwaltenden Daten stellen hohe Anforderungen an die Entwicklung und Wartung von Geschäftsanwendungen. Ihr Umfang entsteht unter anderem aus der Vielzahl von Modellentitäten und zugehörigen Nutzeroberflächen zur Bearbeitung und Analyse der Daten. Dieser Bericht präsentiert neuartige Konzepte und deren Umsetzung zur Vereinfachung der Entwicklung solcher umfangreichen Geschäftsanwendungen. Erstens: Wir schlagen vor, die Datenbank und die Laufzeitumgebung einer dynamischen objektorientierten Programmiersprache zu vereinen. Hierzu organisieren wir die Speicherstruktur von Objekten auf die Weise einer spaltenorientierten Hauptspeicherdatenbank und integrieren darauf aufbauend Transaktionen sowie eine deklarative Anfragesprache nahtlos in dieselbe Laufzeitumgebung. Somit können transaktionale und analytische Anfragen in derselben objektorientierten Hochsprache implementiert werden, und dennoch nah an den Daten ausgeführt werden. Zweitens: Wir beschreiben Programmiersprachkonstrukte, welche es erlauben, Nutzeroberflächen sowie Nutzerinteraktionen generisch und unabhängig von konkreten Modellentitäten zu beschreiben. Um diese abstrakte Beschreibung nutzen zu können, reichert man die Domänenmodelle um vormals implizite Informationen an. Neue Modelle müssen nur um einige Informationen erweitert werden um bereits vorhandene Nutzeroberflächen und -interaktionen auch für sie verwenden zu können. Anpassungen, die nur für ein Modell gelten sollen, können unabhängig vom Standardverhalten, inkrementell, definiert werden. Drittens: Wir ermöglichen mit einem weiteren Programmiersprachkonstrukt die zusammenhängende Beschreibung von Abläufen der Anwendung, wie z.B. Bestellprozesse. Unser Programmierkonzept kapselt Nutzerinteraktionen in synchrone Funktionsaufrufe und macht somit Prozesse als zusammenhängende Folge von Berechnungen und Interaktionen darstellbar. Viertens: Wir demonstrieren ein Konzept, wie Endnutzer komplexe analytische Anfragen intuitiver formulieren können. Es basiert auf der Idee, dass Endnutzer Anfragen als Konfiguration eines Diagramms sehen. Entsprechend beschreibt ein Nutzer eine Anfrage, indem er beschreibt, was sein Diagramm darstellen soll. Nach diesem Konzept beschriebene Diagramme enthalten ausreichend Informationen, um daraus eine Anfrage generieren zu können. Hinsichtlich der Ausführungsdauer sind die generierten Anfragen äquivalent zu Anfragen, die mit konventionellen Anfragesprachen formuliert sind. Das Anfragemodell setzen wir in einem Prototypen um, der auf den zuvor eingeführten Konzepten aufsetzt.

Using Hidden Markov Models for the accurate linguistic analysis of process model activity labels (2019)

Leopold, Henrik ; van der Aa, Han ; Offenberg, Jelmer ; Reijers, Hajo A.

Many process model analysis techniques rely on the accurate analysis of the natural language contents captured in the models’ activity labels. Since these labels are typically short and diverse in terms of their grammatical style, standard natural language processing tools are not suitable to analyze them. While a dedicated technique for the analysis of process model activity labels was proposed in the past, it suffers from considerable limitations. First of all, its performance varies greatly among data sets with different characteristics and it cannot handle uncommon grammatical styles. What is more, adapting the technique requires in-depth domain knowledge. We use this paper to propose a machine learning-based technique for activity label analysis that overcomes the issues associated with this rule-based state of the art. Our technique conceptualizes activity label analysis as a tagging task based on a Hidden Markov Model. By doing so, the analysis of activity labels no longer requires the manual specification of rules. An evaluation using a collection of 15,000 activity labels demonstrates that our machine learning-based technique outperforms the state of the art in all aspects.

Refine

Has Fulltext

Author

Year of publication

Document Type

Language

Is part of the Bibliography

Keywords

Institute

382 search hits