TY  - JOUR
A1  - Aa, Han van der
A1  - Rebmann, Adrian
A1  - Leopold, Henrik
T1  - Natural language-based detection of semantic execution anomalies in event logs
JF  - Information systems : IS ; an international journal ; data bases
N2  - Anomaly detection in process mining aims to recognize outlying or unexpected behavior in event logs for purposes such as the removal of noise and identification of conformance violations. Existing techniques for this task are primarily frequency-based, arguing that behavior is anomalous because it is uncommon. However, such techniques ignore the semantics of recorded events and, therefore, do not take the meaning of potential anomalies into consideration. In this work, we overcome this caveat and focus on the detection of anomalies from a semantic perspective, arguing that anomalies can be recognized when process behavior does not make sense. To achieve this, we propose an approach that exploits the natural language associated with events. Our key idea is to detect anomalous process behavior by identifying semantically inconsistent execution patterns. To detect such patterns, we first automatically extract business objects and actions from the textual labels of events. We then compare these against a process-independent knowledge base. By populating this knowledge base with patterns from various kinds of resources, our approach can be used in a range of contexts and domains. We demonstrate the capability of our approach to successfully detect semantic execution anomalies through an evaluation based on a set of real-world and synthetic event logs and show the complementary nature of semantics-based anomaly detection to existing frequency-based techniques.
KW  - Process mining
KW  - Natural language processing
KW  - Anomaly detection
Y1  - 2021
U6  - https://doi.org/10.1016/j.is.2021.101824
SN  - 0306-4379
SN  - 1873-6076
VL  - 102
PB  - Elsevier
CY  - Amsterdam
ER  - 
TY  - BOOK
A1  - Adriano, Christian
A1  - Bleifuß, Tobias
A1  - Cheng, Lung-Pan
A1  - Diba, Kiarash
A1  - Fricke, Andreas
A1  - Grapentin, Andreas
A1  - Jiang, Lan
A1  - Kovacs, Robert
A1  - Krejca, Martin Stefan
A1  - Mandal, Sankalita
A1  - Marwecki, Sebastian
A1  - Matthies, Christoph
A1  - Mattis, Toni
A1  - Niephaus, Fabio
A1  - Pirl, Lukas
A1  - Quinzan, Francesco
A1  - Ramson, Stefan
A1  - Rezaei, Mina
A1  - Risch, Julian
A1  - Rothenberger, Ralf
A1  - Roumen, Thijs
A1  - Stojanovic, Vladeta
A1  - Wolf, Johannes
ED  - Meinel, Christoph
ED  - Plattner, Hasso
ED  - Döllner, Jürgen Roland Friedrich
ED  - Weske, Mathias
ED  - Polze, Andreas
ED  - Hirschfeld, Robert
ED  - Naumann, Felix
ED  - Giese, Holger
ED  - Baudisch, Patrick
ED  - Friedrich, Tobias
ED  - Böttinger, Erwin
ED  - Lippert, Christoph
T1  - Technical report
BT  - Fall Retreat 2018
N2  - The design and implementation of service-oriented architectures raise a large number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for the design and realization of complex web-based systems. Both approaches allow for dynamic application adaptation as well as integration of enterprise applications.

Commonly used technologies, such as J2EE and .NET, form de facto standards for the realization of complex distributed systems. Evolution of component systems has led to web services and service-based architectures. This has been manifested in a multitude of industry standards and initiatives such as XML, WSDL, UDDI, SOAP, etc. All these achievements lead to a new and promising paradigm in IT systems engineering which proposes to design complex software solutions as collaboration of contractually defined software services.

Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns.

The annual Ph.D. Retreat of the Research School provides each member the opportunity to present the current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the research school, this technical report covers a wide range of topics. These include but are not limited to: Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; and Services Specification, Composition, and Enactment.
N2  - Der Entwurf und die Realisierung dienstbasierender Architekturen wirft eine Vielzahl von Forschungsfragestellungen aus den Gebieten der Softwaretechnik, der Systemmodellierung und -analyse, sowie der Adaptierbarkeit und Integration von Applikationen auf. Komponentenorientierung und WebServices sind zwei Ansätze für den effizienten Entwurf und die Realisierung komplexer Web-basierender Systeme. Sie ermöglichen die Reaktion auf wechselnde Anforderungen ebenso, wie die Integration großer komplexer Softwaresysteme.

Heute übliche Technologien, wie J2EE und .NET, sind de facto Standards für die Entwicklung großer verteilter Systeme. Die Evolution solcher Komponentensysteme führt über WebServices zu dienstbasierenden Architekturen. Dies manifestiert sich in einer Vielzahl von Industriestandards und Initiativen wie XML, WSDL, UDDI, SOAP. All diese Schritte führen letztlich zu einem neuen, vielversprechenden Paradigma für IT Systeme, nach dem komplexe Softwarelösungen durch die Integration vertraglich vereinbarter Software-Dienste aufgebaut werden sollen.

"Service-Oriented Systems Engineering" repräsentiert die Symbiose bewährter Praktiken aus den Gebieten der Objektorientierung, der Komponentenprogrammierung, des verteilten Rechnen sowie der Geschäftsprozesse und berücksichtigt auch die Integration von Geschäftsanliegen und Informationstechnologien.

Die Klausurtagung des Forschungskollegs "Service-oriented Systems Engineering" findet einmal jährlich statt und bietet allen Kollegiaten die Möglichkeit den Stand ihrer aktuellen Forschung darzulegen. Bedingt durch die Querschnittstruktur des Kollegs deckt dieser Bericht ein weites Spektrum aktueller Forschungsthemen ab. Dazu zählen unter anderem Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; sowie Services Specification, Composition, and Enactment.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 129 
KW  - Hasso Plattner Institute
KW  - research school
KW  - Ph.D. retreat
KW  - service-oriented systems engineering
KW  - Hasso-Plattner-Institut
KW  - Forschungskolleg
KW  - Klausurtagung
KW  - Service-oriented Systems Engineering
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-427535
SN  - 978-3-86956-465-4
SN  - 1613-5652
SN  - 2191-1665
IS  - 129
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - THES
A1  - Afifi, Haitham
T1  - Wireless In-Network Processing for Multimedia Applications
T1  - Drahtlose In-Network-Verarbeitung für Multimedia-Anwendungen
N2  - With the recent growth of sensors, cloud computing handles the data processing of many applications. Processing some of this data on the cloud raises, however, many concerns regarding, e.g., privacy, latency, or single points of failure. Alternatively, thanks to the development of embedded systems, smart wireless devices can share their computation capacity, creating a local wireless cloud for in-network processing. In this context, the processing of an application is divided into smaller jobs so that a device can run one or more jobs.
The contribution of this thesis to this scenario is divided into three parts. In part one, I focus on wireless aspects, such as power control and interference management, for deciding which jobs to run on which node and how to route data between nodes. Hence, I formulate optimization problems and develop heuristic and meta-heuristic algorithms to allocate wireless and computation resources. Additionally, to deal with multiple applications competing for these resources, I develop a reinforcement learning (RL) admission controller to decide which application should be admitted. Next, I look into acoustic applications to improve wireless throughput by using microphone clock synchronization to synchronize wireless transmissions.
In the second part, I jointly work with colleagues from the acoustic processing field to optimize both network and application (i.e., acoustic) qualities. My contribution focuses on the network part, where I study the relation between acoustic and network qualities when selecting a subset of microphones for collecting audio data or selecting a subset of optional jobs for processing these data; too many microphones or too many jobs can lessen quality by unnecessary delays. Hence, I develop RL solutions to select the subset of microphones under network constraints when the speaker is moving while still providing good acoustic quality. Furthermore, I show that autonomous vehicles carrying microphones improve the acoustic qualities of different applications. Accordingly, I develop RL solutions (single and multi-agent ones) for controlling these vehicles.
In the third part, I close the gap between theory and practice. I describe the features of my open-source framework used as a proof of concept for wireless in-network processing. Next, I demonstrate how to run some algorithms developed by colleagues from acoustic processing using my framework. I also use the framework for studying in-network delays (wireless and processing) using different distributions of jobs and network topologies.
N2  - Mit der steigenden Anzahl von Sensoren übernimmt Cloud Computing die Datenverarbeitung vieler Anwendungen. Dies wirft jedoch viele Bedenken auf, z. B. in Bezug auf Datenschutz, Latenzen oder Fehlerquellen. Alternativ und dank der Entwicklung eingebetteter Systeme können drahtlose intelligente Geräte für die lokale Verarbeitung verwendet werden, indem sie ihre Rechenkapazität gemeinsam nutzen und so eine lokale drahtlose Cloud für die netzinterne Verarbeitung schaffen. In diesem Zusammenhang wird eine Anwendung in kleinere Aufgaben unterteilt, so dass ein Gerät eine oder mehrere Aufgaben ausführen kann. Der Beitrag dieser Arbeit zu diesem Szenario gliedert sich in drei Teile.

 Im ersten Teil konzentriere ich mich auf drahtlose Aspekte wie Leistungssteuerung und Interferenzmanagement um zu entscheiden, welche Aufgaben auf welchem Knoten ausgeführt werden sollen und wie die Daten zwischen den Knoten weitergeleitet werden sollen. Daher formuliere ich Optimierungsprobleme und entwickle heuristische und metaheuristische Algorithmen zur Zuweisung von Ressourcen eines drahtlosen Netzwerks. Um mit mehreren Anwendungen, die um diese Ressourcen konkurrieren, umgehen zu können, entwickle ich außerdem einen Reinforcement Learning (RL) Admission Controller, um zu entscheiden, welche Anwendung zugelassen werden soll. Als Nächstes untersuche ich akustische Anwendungen zur Verbesserung des drahtlosen Durchsatzes, indem ich Mikrofon-Taktsynchronisation zur Synchronisierung drahtloser Übertragungen verwende.

Im zweiten Teil arbeite ich mit Kollegen aus dem Bereich der Akustikverarbeitung zusammen, um sowohl die Netzwerk- als auch die Anwendungsqualitäten (d.h. die akustischen) zu optimieren. Mein Beitrag konzentriert sich auf den Netzwerkteil, wo ich die Beziehung zwischen akustischen und Netzwerkqualitäten bei der Auswahl einer Teilmenge von Mikrofonen für die Erfassung von Audiodaten oder der Auswahl einer Teilmenge von optionalen Aufgaben für die Verarbeitung dieser Daten untersuche; zu viele Mikrofone oder zu viele Aufgaben können die Qualität durch unnötige Verzögerungen verringern. Daher habe ich RL-Lösungen entwickelt, um die Teilmenge der Mikrofone unter Netzwerkbeschränkungen auszuwählen, wenn sich der Sprecher bewegt, und dennoch eine gute akustische Qualität gewährleistet. Außerdem zeige ich, dass autonome Fahrzeuge, die Mikrofone mit sich führen, die akustische Qualität verschiedener Anwendungen verbessern. Dementsprechend entwickle ich RL-Lösungen (Einzel- und Multi-Agenten-Lösungen) für die Steuerung dieser Fahrzeuge.

Im dritten Teil schließe ich die Lücke zwischen Theorie und Praxis. Ich beschreibe die Eigenschaften meines Open-Source-Frameworks, das als Prototyp für die drahtlose netzinterne Verarbeitung verwendet wird. Anschließend zeige ich, wie einige Algorithmen, die von Kollegen aus der Akustikverarbeitung entwickelt wurden, mit meinem Framework ausgeführt werden können. Außerdem verwende ich das Framework für die Untersuchung von netzinternen Verzögerungen unter Verwendung verschiedener Aufgabenverteilungen und Netzwerktopologien.
KW  - wireless networks
KW  - reinforcement learning
KW  - network optimization
KW  - Netzoptimierung
KW  - bestärkendes Lernen
KW  - drahtloses Netzwerk
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-604371
ER  - 
TY  - JOUR
A1  - Alario Hoyos, Carlos
A1  - Delgado Kloos, Carlos
A1  - Kiendl, Doris
A1  - Terzieva, Liliya
ED  - Meinel, Christoph
ED  - Schweiger, Stefanie
ED  - Staubitz, Thomas
ED  - Conrad, Robert
ED  - Alario Hoyos, Carlos
ED  - Ebner, Martin
ED  - Sancassani, Susanna
ED  - Żur, Agnieszka
ED  - Friedl, Christian
ED  - Halawa, Sherif
ED  - Gamage, Dilrukshi
ED  - Scott, Jeffrey
ED  - Kristine Jonson Carlon, May
ED  - Deville, Yves
ED  - Gaebel, Michael
ED  - Delgado Kloos, Carlos
ED  - von Schmieden, Karen
T1  - Innovat MOOC
BT  - teacher training on educational innovation in higher education
JF  - EMOOCs 2023 : Post-Covid Prospects for Massive Open Online Courses - Boost or Backlash?
N2  - The COVID-19 pandemic has revealed the importance for university teachers to have adequate pedagogical and technological competences to cope with the various possible educational scenarios (face-to-face, online, hybrid, etc.), making use of appropriate active learning methodologies and supporting technologies to foster a more effective learning environment. In this context, the InnovaT project has been an important initiative to support the development of pedagogical and technological competences of university teachers in Latin America through several trainings aiming to promote teacher innovation. These trainings combined synchronous online training through webinars and workshops with asynchronous online training through the MOOC “Innovative Teaching in Higher Education.” This MOOC was released twice. The first run took place right during the lockdown of 2020, when Latin American teachers needed urgent training to move to emergency remote teaching overnight. The second run took place in 2022 with the return to face-to-face teaching and the implementation of hybrid educational models. This article shares the results of the design of the MOOC considering the constraints derived from the lockdowns applied in each country, the lessons learned from the delivery of such a MOOC to Latin American university teachers, and the results of the two runs of the MOOC.
KW  - Digitale Bildung
KW  - Kursdesign
KW  - MOOC
KW  - Micro Degree
KW  - Online-Lehre
KW  - Onlinekurs
KW  - Onlinekurs-Produktion
KW  - digital education
KW  - e-learning
KW  - micro degree
KW  - micro-credential
KW  - online course creation
KW  - online course design
KW  - online teaching
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-624560
SP  - 229
EP  - 237
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - GEN
A1  - Albert, Justin Amadeus
A1  - Owolabi, Victor
A1  - Gebel, Arnd
A1  - Brahms, Clemens Markus
A1  - Granacher, Urs
A1  - Arnrich, Bert
T1  - Evaluation of the Pose Tracking Performance of the Azure Kinect and Kinect v2 for Gait Analysis in Comparison with a Gold Standard
BT  - A Pilot Study
T2  - Postprints der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Gait analysis is an important tool for the early detection of neurological diseases and for the assessment of risk of falling in elderly people. The availability of low-cost camera hardware on the market today and recent advances in Machine Learning enable a wide range of clinical and health-related applications, such as patient monitoring or exercise recognition at home. In this study, we evaluated the motion tracking performance of the latest generation of the Microsoft Kinect camera, Azure Kinect, compared to its predecessor Kinect v2 in terms of treadmill walking using a gold standard Vicon multi-camera motion capturing system and the 39 marker Plug-in Gait model. Five young and healthy subjects walked on a treadmill at three different velocities while data were recorded simultaneously with all three camera systems. An easy-to-administer camera calibration method developed here was used to spatially align the 3D skeleton data from both Kinect cameras and the Vicon system. With this calibration, the spatial agreement of joint positions between the two Kinect cameras and the reference system was evaluated. In addition, we compared the accuracy of certain spatio-temporal gait parameters, i.e., step length, step time, step width, and stride time calculated from the Kinect data, with the gold standard system. Our results showed that the improved hardware and the motion tracking algorithm of the Azure Kinect camera led to a significantly higher accuracy of the spatial gait parameters than the predecessor Kinect v2, while no significant differences were found between the temporal parameters. Furthermore, we explain in detail how this experimental setup could be used to continuously monitor the progress during gait rehabilitation in older people.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 3 
KW  - motion capture
KW  - evaluation
KW  - human motion
KW  - RGB-D cameras
KW  - digital health
Y1  - 2020
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-484130
IS  - 3
ER  - 
TY  - JOUR
A1  - Albert, Justin Amadeus
A1  - Owolabi, Victor
A1  - Gebel, Arnd
A1  - Brahms, Clemens Markus
A1  - Granacher, Urs
A1  - Arnrich, Bert
T1  - Evaluation of the Pose Tracking Performance of the Azure Kinect and Kinect v2 for Gait Analysis in Comparison with a Gold Standard
BT  - A Pilot Study
JF  - Sensors
N2  - Gait analysis is an important tool for the early detection of neurological diseases and for the assessment of risk of falling in elderly people. The availability of low-cost camera hardware on the market today and recent advances in Machine Learning enable a wide range of clinical and health-related applications, such as patient monitoring or exercise recognition at home. In this study, we evaluated the motion tracking performance of the latest generation of the Microsoft Kinect camera, Azure Kinect, compared to its predecessor Kinect v2 in terms of treadmill walking using a gold standard Vicon multi-camera motion capturing system and the 39 marker Plug-in Gait model. Five young and healthy subjects walked on a treadmill at three different velocities while data were recorded simultaneously with all three camera systems. An easy-to-administer camera calibration method developed here was used to spatially align the 3D skeleton data from both Kinect cameras and the Vicon system. With this calibration, the spatial agreement of joint positions between the two Kinect cameras and the reference system was evaluated. In addition, we compared the accuracy of certain spatio-temporal gait parameters, i.e., step length, step time, step width, and stride time calculated from the Kinect data, with the gold standard system. Our results showed that the improved hardware and the motion tracking algorithm of the Azure Kinect camera led to a significantly higher accuracy of the spatial gait parameters than the predecessor Kinect v2, while no significant differences were found between the temporal parameters. Furthermore, we explain in detail how this experimental setup could be used to continuously monitor the progress during gait rehabilitation in older people.
KW  - motion capture
KW  - evaluation
KW  - human motion
KW  - RGB-D cameras
KW  - digital health
Y1  - 2020
U6  - https://doi.org/10.3390/s20185104
SN  - 1424-8220
VL  - 20
IS  - 18
PB  - MDPI
CY  - Basel
ER  - 
TY  - THES
A1  - Alhosseini Almodarresi Yasin, Seyed Ali
T1  - Classification, prediction and evaluation of graph neural networks on online social media platforms
T1  - Klassifizierung, Vorhersage und Bewertung graphischer neuronaler Netze auf Online-Social-Media-Plattformen
N2  - The vast amount of data generated on social media platforms has made them a valuable source of information for businesses, governments and researchers. Social media data can provide insights into user behavior, preferences, and opinions. In this work, we address two important challenges in social media analytics. Predicting user engagement with online content has become a critical task for content creators to increase user engagement and reach larger audiences. Traditional user engagement prediction approaches rely solely on features derived from the user and content. However, a new class of deep learning methods based on graphs captures not only the content features but also the graph structure of social media networks.

This thesis proposes a novel Graph Neural Network (GNN) approach to predict user interaction with tweets. The proposed approach combines the features of users, tweets and their engagement graphs. The tweet text features are extracted using pre-trained embeddings from language models, and a GNN layer is used to embed the user in a vector space. The GNN model then combines the features and graph structure to predict user engagement. The proposed approach achieves an accuracy value of 94.22% in classifying user interactions, including likes, retweets, replies, and quotes.

Another major challenge in social media analysis is detecting and classifying social bot accounts. Social bots are automated accounts used to manipulate public opinion by spreading misinformation or generating fake interactions. Detecting social bots is critical to prevent their negative impact on public opinion and trust in social media. In this thesis, we classify social bots on Twitter by applying Graph Neural Networks. The proposed approach uses a combination of both the features of a node and an aggregation of the features of a node’s neighborhood to classify social bot accounts. Our final results indicate a 6% improvement in the area under the curve score in the final predictions through the utilization of GNN.

Overall, our work highlights the importance of social media data and the potential of new methods such as GNNs to predict user engagement and detect social bots. These methods have important implications for improving the quality and reliability of information on social media platforms and mitigating the negative impact of social bots on public opinion and discourse.
N2  - Die riesige Menge an Daten, die auf Social-Media-Plattformen generiert wird, hat sie zu einer wertvollen Informationsquelle für Unternehmen, Regierungen und Forscher gemacht. Daten aus sozialen Medien können Einblicke in das Verhalten, die Vorlieben und die Meinungen der Nutzer geben. In dieser Arbeit befassen wir uns mit zwei wichtigen Herausforderungen im Bereich der Social-Media-Analytik. Die Vorhersage des Nutzerinteresses an Online-Inhalten ist zu einer wichtigen Aufgabe für die Ersteller von Inhalten geworden, um das Nutzerengagement zu steigern und ein größeres Publikum zu erreichen. Herkömmliche Ansätze zur Vorhersage des Nutzerengagements stützen sich ausschließlich auf Merkmale, die aus dem Nutzer und dem Inhalt abgeleitet werden. Eine neue Klasse von Deep-Learning-Methoden, die auf Graphen basieren, erfasst jedoch nicht nur die Inhaltsmerkmale, sondern auch die Graphenstruktur von Social-Media-Netzwerken.

In dieser Arbeit wird ein neuartiger Graph Neural Network (GNN)-Ansatz zur Vorhersage der Nutzerinteraktion mit Tweets vorgeschlagen. Der vorgeschlagene Ansatz kombiniert die Merkmale von Nutzern, Tweets und deren Engagement-Graphen. Die Textmerkmale der Tweets werden mit Hilfe von vortrainierten Einbettungen aus Sprachmodellen extrahiert, und eine GNN-Schicht wird zur Einbettung des Nutzers in einen Vektorraum verwendet. Das GNN-Modell kombiniert dann die Merkmale und die Graphenstruktur, um das Nutzerengagement vorherzusagen. Der vorgeschlagene Ansatz erreicht eine Genauigkeit von 94,22% bei der Klassifizierung von Benutzerinteraktionen, einschließlich Likes, Retweets, Antworten und Zitaten.

Eine weitere große Herausforderung bei der Analyse sozialer Medien ist die Erkennung und Klassifizierung von Social-Bot-Konten. Social Bots sind automatisierte Konten, die dazu dienen, die öffentliche Meinung zu manipulieren, indem sie Fehlinformationen verbreiten oder gefälschte Interaktionen erzeugen. Die Erkennung von Social Bots ist entscheidend, um ihre negativen Auswirkungen auf die öffentliche Meinung und das Vertrauen in soziale Medien zu verhindern. In dieser Arbeit klassifizieren wir Social Bots auf Twitter mit Hilfe von Graph Neural Networks. Der vorgeschlagene Ansatz verwendet eine Kombination aus den Merkmalen eines Knotens und einer Aggregation der Merkmale der Nachbarschaft eines Knotens, um Social-Bot-Konten zu klassifizieren. Unsere Endergebnisse zeigen eine 6%ige Verbesserung der Fläche unter der Kurve bei den endgültigen Vorhersagen durch die Verwendung von GNN.

Insgesamt unterstreicht unsere Arbeit die Bedeutung von Social-Media-Daten und das Potenzial neuer Methoden wie GNNs zur Vorhersage des Nutzer-Engagements und zur Erkennung von Social Bots. Diese Methoden haben wichtige Auswirkungen auf die Verbesserung der Qualität und Zuverlässigkeit von Informationen auf Social-Media-Plattformen und die Abschwächung der negativen Auswirkungen von Social Bots auf die öffentliche Meinung und den Diskurs.
KW  - graph neural networks
KW  - social bot detection
KW  - user engagement
KW  - graphische neuronale Netze
KW  - Social Bots erkennen
KW  - Nutzer-Engagement
Y1  - 2024
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-626421
ER  - 
TY  - GEN
A1  - Alviano, Mario
A1  - Romero Davila, Javier
A1  - Schaub, Torsten H.
T1  - Preference Relations by Approximation
T2  - Sixteenth International Conference on Principles of Knowledge Representation and Reasoning
N2  - Declarative languages for knowledge representation and reasoning provide constructs to define preference relations over the set of possible interpretations, so that preferred models represent optimal solutions of the encoded problem. We introduce the notion of approximation for replacing preference relations with stronger preference relations, that is, relations comparing more pairs of interpretations. Our aim is to accelerate the computation of a non-empty subset of the optimal solutions by means of highly specialized algorithms. We implement our approach in Answer Set Programming (ASP), where problems involving quantitative and qualitative preference relations can be addressed by ASPRIN, implementing a generic optimization algorithm. In contrast, chains of approximations allow us to reduce several preference relations to the preference relations associated with ASP’s native weak constraints and heuristic directives. In this way, ASPRIN can now take advantage of several highly optimized algorithms implemented by ASP solvers for computing optimal solutions.
Y1  - 2018
SP  - 2
EP  - 11
PB  - AAAI Conference on Artificial Intelligence
CY  - Palo Alto
ER  - 
TY  - JOUR
A1  - Ambassa, Pacome L.
A1  - Kayem, Anne Voluntas dei Massah
A1  - Wolthusen, Stephen D.
A1  - Meinel, Christoph
T1  - Inferring private user behaviour based on information leakage
JF  - Smart Micro-Grid Systems Security and Privacy
N2  - In rural/remote areas, resource-constrained smart micro-grid (RCSMG) architectures can provide a cost-effective power supply alternative in cases when connectivity to the national power grid is impeded by factors such as load shedding. RCSMG architectures can be designed to handle communications over a distributed lossy network in order to minimise operation costs. However, due to the unreliable nature of lossy networks, communication data can be distorted by noise additions that alter the veracity of the data. In this chapter, we consider cases in which an adversary who is internal to the RCSMG deliberately distorts communicated data to gain an unfair advantage over the RCSMG’s users. The adversary’s goal is to mask malicious data manipulations as distortions caused by additive noise from an unreliable communication channel. Distinguishing malicious data distortions from benign distortions is important in ensuring trustworthiness of the RCSMG. Perturbation data anonymisation algorithms can be used to alter transmitted data to ensure that adversarial manipulation of the data reveals no information that the adversary can take advantage of. However, because existing data perturbation anonymisation algorithms operate by using additive noise to anonymise data, using these algorithms in the RCSMG context is challenging. This is due to the fact that distinguishing benign noise additions from malicious noise additions is a difficult problem. In this chapter, we present a brief survey of cases of privacy violations due to inferences drawn from observed power consumption patterns in RCSMGs, and propose a method of mitigating these risks. The lesson here is that while RCSMGs give users more control over power management and distribution, good anonymisation is essential to protecting personal information on RCSMGs.
KW  - Approximation algorithms
KW  - Electrical products
KW  - Home appliances
KW  - Load modeling
KW  - Monitoring
KW  - Power demand
KW  - Wireless sensor networks
KW  - Distributed snapshot algorithm
KW  - Micro-grid networks
KW  - Power consumption characterization
KW  - Sensor networks
Y1  - 2018
SN  - 978-3-319-91427-5
SN  - 978-3-319-91426-8
U6  - https://doi.org/10.1007/978-3-319-91427-5_7
VL  - 71
SP  - 145
EP  - 159
PB  - Springer
CY  - Dordrecht
ER  - 
TY  - THES
A1  - Amirkhanyan, Aragats
T1  - Methods and frameworks for GeoSpatioTemporal data analytics
T1  - Methoden und Frameworks für geo-raumzeitliche Datenanalysen
N2  - In the era of social networks, the internet of things and location-based services, many online services produce huge amounts of data that contain valuable objective information, such as geographic coordinates and date and time. These characteristics (parameters), in combination with a textual parameter, pose the challenge of discovering geospatiotemporal knowledge. This challenge requires efficient methods for clustering and pattern mining in spatial, temporal and textual spaces.

In this thesis, we address the challenge of providing methods and frameworks for geospatiotemporal data analytics. As an initial step, we address the challenges of geospatial data processing: data gathering, normalization, geolocation, and storage. That initial step is the foundation for tackling the next challenge: geospatial clustering. The first step of this challenge is to design a method for online clustering of georeferenced data. This algorithm can be used as a server-side clustering algorithm for online maps that visualize massive georeferenced data. As the second step, we develop an extension of this method that additionally considers the temporal aspect of the data. For that, we propose a density- and intensity-based geospatiotemporal clustering algorithm with fixed distance and time radius. Each version of the clustering algorithm has its own use case, which we show in the thesis.

In the next chapter of the thesis, we look at spatiotemporal analytics from the perspective of the sequential rule mining challenge. We design and implement a framework that transforms data into textual geospatiotemporal data, that is, data that contain geographic coordinates, time and textual parameters. In this way, we address the challenge of applying pattern/rule mining algorithms in geospatiotemporal space. As an applicable use case study, we propose spatiotemporal crime analytics: the discovery of spatiotemporal patterns of crimes in publicly available crime data.

We dedicate the second part of the thesis to applications and use case studies. We design and implement an application that uses the proposed clustering algorithms to discover knowledge in data. Together with the application, we propose use case studies for the analysis of georeferenced data in terms of situational and public safety awareness.
N2  - Heute ist die Zeit der sozialen Netzwerke, des Internets der Dinge und der standortbezogenen Dienste (Location-Based Services). Viele Online-Dienste erzeugen eine riesige Datenmenge, die wertvolle Informationen enthält, wie z. B. geographische Koordinaten und Datum sowie Zeit. Diese Informationen (Parameter) in Kombination mit einem Textparameter stellen die Herausforderung für die Entdeckung von geo-raumzeitlichem (geospatiotemporal) Wissen dar. Diese Herausforderung erfordert effiziente Methoden zum Clustering und Pattern-Mining in räumlichen, zeitlichen und textlichen Aspekten.

In dieser Dissertation stellen wir uns der Herausforderung, Methoden und Frameworks für geo-raumzeitliche Datenanalysen bereitzustellen. Im ersten Schritt gehen wir auf die Herausforderungen der Geodatenverarbeitung ein: Datenerfassung, -normalisierung, -ortung und -speicherung. Dieser Schritt ist der Grundstein für die nächste Herausforderung – das geographische Clustering. Es erfordert das Entwerfen einer Methode für das Online-Clustering georeferenzierter Daten. Dieser Algorithmus kann als serverseitiger Clustering-Algorithmus für Online-Karten verwendet werden, die massive georeferenzierte Daten visualisieren. Im zweiten Schritt entwickeln wir die Erweiterung dieser Methode, die zusätzlich den zeitlichen Aspekt der Daten berücksichtigt. Dazu schlagen wir den dichte- und intensitätsbasierten geo-raumzeitlichen Clustering-Algorithmus mit festem Abstand und Zeitradius vor. Jede Version des Clustering-Algorithmus hat einen eigenen Anwendungsfall, den wir in dieser Doktorarbeit zeigen.

Im nächsten Kapitel dieser Arbeit betrachten wir die raumzeitliche Analyse aus der Perspektive der sequentiellen Regel-Mining-Herausforderung. Wir entwerfen und implementieren ein Framework, das Daten in textliche raumzeitliche Daten umwandelt. Solche Daten enthalten geographische Koordinaten, Zeit und Textparameter. Auf diese Weise stellen wir uns der Herausforderung, Muster- / Regel-Mining-Algorithmen auf geo-raumzeitliche Daten anzuwenden. Als Anwendungsfallstudie schlagen wir raumzeitliche Verbrechensanalysen vor – Entdeckung raumzeitlicher Muster von Verbrechen in öffentlich zugänglichen Datenbanken.

Im zweiten Teil der Arbeit diskutieren wir über die Anwendung und die Fallstudien. Wir entwerfen und implementieren eine Anwendungssoftware, die die vorgeschlagenen Clustering-Algorithmen verwendet, um das Wissen in Daten zu entdecken. Gemeinsam mit der Anwendungssoftware betrachten wir Anwendungsbeispiele für die Analyse georeferenzierter Daten im Hinblick auf das Situationsbewusstsein.
KW  - geospatial data
KW  - data analytics
KW  - clustering
KW  - situational awareness
KW  - Geodaten
KW  - Datenanalyse
KW  - Clustering
KW  - Situationsbewusstsein
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-441685
ER  - 
TY  - GEN
A1  - Andjelkovic, Marko
A1  - Babic, Milan
A1  - Li, Yuanqing
A1  - Schrape, Oliver
A1  - Krstić, Miloš
A1  - Kraemer, Rolf
T1  - Use of decoupling cells for mitigation of SET effects in CMOS combinational gates
T2  - 2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS)
N2  - This paper investigates the applicability of CMOS decoupling cells for mitigating the Single Event Transient (SET) effects in standard combinational gates. The concept is based on the insertion of two decoupling cells between the gate's output and the power/ground terminals. To verify the proposed hardening approach, extensive SPICE simulations have been performed with standard combinational cells designed in IHP's 130 nm bulk CMOS technology. Obtained simulation results have shown that the insertion of decoupling cells results in the increase of the gate's critical charge, thus reducing the gate's soft error rate (SER). Moreover, the decoupling cells facilitate the suppression of SET pulses propagating through the gate. It has been shown that the decoupling cells may be a competitive alternative to gate upsizing and gate duplication for hardening the gates with lower critical charge and multiple (3 or 4) inputs, as well as for filtering the short SET pulses induced by low-LET particles.
KW  - decoupling cells
KW  - radiation hardening
KW  - SET effects
KW  - CMOS technology
KW  - combinational logic
Y1  - 2019
SN  - 978-1-5386-9562-3
U6  - https://doi.org/10.1109/ICECS.2018.8617996
SP  - 361
EP  - 364
PB  - IEEE
CY  - New York
ER  - 
TY  - GEN
A1  - Aranda, Juan
A1  - Schölzel, Mario
A1  - Mendez, Diego
A1  - Carrillo, Henry
T1  - An energy consumption model for multiModal wireless sensor networks based on wake-up radio receivers
T2  - 2018 IEEE Colombian Conference on Communications and Computing (COLCOM)
N2  - Energy consumption is a major concern in Wireless Sensor Networks. A significant waste of energy occurs due to the idle listening and overhearing problems, which are typically avoided by turning off the radio while no transmission is ongoing. The classical approach for allowing the reception of messages in such situations is to use a low-duty-cycle protocol and to turn on the radio periodically, which reduces the idle listening problem but requires timers and usually causes unnecessary wake-ups. A better solution is to turn on the radio only on demand by using a Wake-up Radio Receiver (WuRx). In this paper, an energy model is presented to estimate the energy saving in various multi-hop network topologies under several use cases when a WuRx is used instead of a classical low-duty-cycling protocol. The presented model also allows for estimating the benefit of various WuRx properties, such as whether addressing is used.
KW  - Energy efficiency
KW  - multimodal wireless sensor network
KW  - low-duty-cycling
KW  - wake-up radio
Y1  - 2018
SN  - 978-1-5386-6820-7
U6  - https://doi.org/10.1109/ColComCon.2018.8466728
PB  - IEEE
CY  - New York
ER  - 
TY  - BOOK
A1  - Baltzer, Wanda
A1  - Hradilak, Theresa
A1  - Pfennigschmidt, Lara
A1  - Prestin, Luc Maurice
A1  - Spranger, Moritz
A1  - Stadlinger, Simon
A1  - Wendt, Leo
A1  - Lincke, Jens
A1  - Rein, Patrick
A1  - Church, Luke
A1  - Hirschfeld, Robert
T1  - An individual-centered approach to visualize people’s opinions and demographic information
N2  - The noble way to substantiate decisions that affect many people is to ask these people for their opinions. For governments that run whole countries, this means asking all citizens for their views to consider their situations and needs.

Organizations such as Africa's Voices Foundation, which want to facilitate communication between decision-makers and citizens of a country, have difficulty mediating between these groups. To enable understanding, statements need to be summarized and visualized. Accomplishing these goals in a way that does justice to the citizens' voices and situations proves challenging. Standard charts do not help this cause as they fail to create empathy for the people behind their graphical abstractions. Furthermore, these charts do not create trust in the data they are representing as there is no way to see or navigate back to the underlying code and the original data. To fulfill these functions, visualizations would benefit greatly from interactions to explore the displayed data, which standard charts often provide only to a limited extent.

To help improve the understanding of people's voices, we developed and categorized 80 ideas for new visualizations, new interactions, and better connections between different charts, which we present in this report. From those ideas, we implemented 10 prototypes and two systems that integrate different visualizations. We show that this integration allows consistent appearance and behavior of visualizations. The visualizations all share the same main concept: representing each individual with a single dot. To realize this idea, we discuss technologies that efficiently allow the rendering of a large number of these dots. With these visualizations, direct interactions with representations of individuals are achievable by clicking on them or by dragging a selection around them. This direct interaction is only possible with a bidirectional connection from the visualization to the data it displays. We discuss different strategies for bidirectional mappings and the trade-offs involved. Having unified behavior across visualizations enhances exploration. For our prototypes, that includes grouping, filtering, highlighting, and coloring of dots. Our prototyping work was enabled by the development environment Lively4. We explain which parts of Lively4 facilitated our prototyping process. Finally, we evaluate our approach to domain problems and our developed visualization concepts.

Our work provides inspiration and a starting point for visualization development in this domain. Our visualizations can improve communication between citizens and their government and motivate empathetic decisions. Our approach, combining low-level entities to create visualizations, provides value to an explorative and empathetic workflow. We show that the design space for visualizing this kind of data has a lot of potential and that it is possible to combine qualitative and quantitative approaches to data analysis.
N2  - Der noble Weg, Entscheidungen, die viele Menschen betreffen, zu begründen, besteht darin, diese Menschen nach ihrer Meinung zu fragen. Für Regierungen, die ganze Länder führen, bedeutet dies, alle Bürger nach ihrer Meinung zu fragen, um ihre Situationen und Bedürfnisse zu berücksichtigen.

Organisationen wie die Africa's Voices Foundation, die die Kommunikation zwischen Entscheidungsträgern und Bürgern eines Landes erleichtern wollen, haben Schwierigkeiten, zwischen diesen Gruppen zu vermitteln. Um Verständnis zu ermöglichen, müssen die Aussagen zusammengefasst und visualisiert werden. Diese Ziele auf eine Weise zu erreichen, die den Stimmen und Situationen der Bürgerinnen und Bürger gerecht wird, erweist sich als Herausforderung. Standardgrafiken helfen dabei nicht weiter, da es ihnen nicht gelingt, Empathie für die Menschen hinter ihren grafischen Abstraktionen zu schaffen. Darüber hinaus schaffen diese Diagramme kein Vertrauen in die Daten, die sie darstellen, da es keine Möglichkeit gibt, den verwendeten Code und die Originaldaten zu sehen oder zu ihnen zurück zu navigieren. Um diese Funktionen zu erfüllen, würden Visualisierungen sehr von Interaktionen zur Erkundung der angezeigten Daten profitieren, die Standardgrafiken oft nur begrenzt bieten.

Um das Verständnis der Stimmen der Menschen zu verbessern, haben wir 80 Ideen für neue Visualisierungen, neue Interaktionen und bessere Verbindungen zwischen verschiedenen Diagrammen entwickelt und kategorisiert, die wir in diesem Bericht vorstellen. Aus diesen Ideen haben wir 10 Prototypen und zwei Systeme implementiert, die verschiedene Visualisierungen integrieren. Wir zeigen, dass diese Integration ein einheitliches Erscheinungsbild und Verhalten der Visualisierungen ermöglicht. Die Visualisierungen haben alle das gleiche Grundkonzept: Jedes Individuum wird durch einen einzigen Punkt dargestellt. Um diese Idee zu verwirklichen, diskutieren wir Technologien, die die effiziente Darstellung einer großen Anzahl dieser Punkte ermöglichen. Mit diesen Visualisierungen sind direkte Interaktionen mit Darstellungen von Individuen möglich, indem man auf sie klickt oder eine Auswahl um sie herumzieht. Diese direkte Interaktion ist nur mit einer bidirektionalen Verbindung von der Visualisierung zu den angezeigten Daten möglich. Wir diskutieren verschiedene Strategien für bidirektionale Mappings und die damit verbundenen Kompromisse. Ein einheitliches Verhalten über Visualisierungen hinweg verbessert die Exploration. Für unsere Prototypen umfasst dies Gruppierung, Filterung, Hervorhebung und Einfärbung von Punkten. Unsere Arbeit an den Prototypen wurde durch die Entwicklungsumgebung Lively4 ermöglicht. Wir erklären, welche Teile von Lively4 unseren Prototyping-Prozess erleichtert haben. Schließlich bewerten wir unsere Herangehensweise an Domänenprobleme und die von uns entwickelten Visualisierungskonzepte.

Unsere Arbeit liefert Inspiration und einen Ausgangspunkt für die Entwicklung von Visualisierungen in diesem Bereich. Unsere Visualisierungen können die Kommunikation zwischen Bürgern und ihrer Regierung verbessern und einfühlsame Entscheidungen motivieren. Unser Ansatz, bei dem wir niedrigstufige Entitäten zur Erstellung von Visualisierungen kombinieren, bietet einen wertvollen Ansatz für einen explorativen und einfühlsamen Arbeitsablauf. Wir zeigen, dass der Designraum für die Visualisierung dieser Art von Daten ein großes Potenzial hat und dass es möglich ist, qualitative und quantitative Ansätze zur Datenanalyse zu kombinieren.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 136 
KW  - data visualization
KW  - demographic information
KW  - visualization concept exploration
KW  - web-based development environment
KW  - Datenvisualisierung
KW  - demografische Informationen
KW  - Visualisierungskonzept-Exploration
KW  - web-basierte Entwicklungsumgebung
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-491457
SN  - 978-3-86956-504-0
SN  - 1613-5652
SN  - 2191-1665
IS  - 136
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - THES
A1  - Bano, Dorina
T1  - Discovering data models from event logs
T1  - Entdecken von Datenmodellen aus Ereignisprotokollen
N2  - In the last two decades, process mining has developed from a niche discipline to a significant research area with considerable impact on academia and industry. Process mining enables organisations to identify the running business processes from historical execution data. The first requirement of any process mining technique is an event log, an artifact that represents concrete business process executions in the form of a sequence of events. These logs can be extracted from the organization's information systems and are used by process experts to retrieve deep insights from the organization's running processes. Considering the events pertaining to such logs, the process models can be automatically discovered and enhanced or annotated with performance-related information. Besides behavioral information, event logs contain domain-specific data, albeit implicitly. However, such data are usually overlooked and, thus, not utilized to their full potential.

Within the process mining area, we address in this thesis the research gap of discovering, from event logs, the contextual information that cannot be captured by applying existing process mining techniques. Within this research gap, we identify four key problems and tackle them by looking at an event log from different angles. First, we address the problem of deriving an event log in the absence of proper database access and domain knowledge. The second problem is related to the under-utilization of the implicit domain knowledge present in an event log that can increase the understandability of the discovered process model. Next, there is a lack of a holistic representation of the historical data manipulation at the process model level of abstraction. Last but not least, each process model is presumed to be independent of other process models when discovered from an event log, thus ignoring possible data dependencies between processes within an organization.

For each of the problems mentioned above, this thesis proposes a dedicated method. The first method provides a solution to extract an event log only from the transactions performed on the database that are stored in the form of redo logs. The second method deals with discovering the underlying data model that is implicitly embedded in the event log, thus, complementing the discovered process model with important domain knowledge information. The third method captures, on the process model level, how the data affects the running process instances. Lastly, the fourth method is about the discovery of the relations between business processes (i.e., how they exchange data) from a set of event logs and explicitly representing such complex interdependencies in a business process architecture.

All the methods introduced in this thesis are implemented as prototypes, and their feasibility is demonstrated by applying them to real-life event logs.
N2  - In den letzten zwei Jahrzehnten hat sich Process Mining von einer Nischendisziplin zu einem bedeutenden Forschungsgebiet mit erheblichen Auswirkungen auf Wissenschaft und Industrie entwickelt. Process Mining ermöglicht es Unternehmen, die laufenden Geschäftsprozesse anhand historischer Ausführungsdaten zu identifizieren. Die erste Voraussetzung für jede Process-Mining-Technik ist ein Ereignisprotokoll (Event Log), ein Artefakt, das konkrete Geschäftsprozessausführungen in Form einer Abfolge von Ereignissen darstellt. Diese Protokolle (Logs) können aus den Informationssystemen der Unternehmen extrahiert werden und ermöglichen es Prozessexperten, tiefe Einblicke in die laufenden Unternehmensprozesse zu gewinnen. Unter Berücksichtigung der Abfolge der Ereignisse in diesen Protokollen (Logs) können Prozessmodelle automatisch entdeckt und mit leistungsbezogenen Informationen erweitert werden. Neben verhaltensbezogenen Informationen enthalten Ereignisprotokolle (Event Logs) auch domänenspezifische Daten, wenn auch nur implizit. Solche Daten werden jedoch in der Regel nicht in vollem Umfang genutzt. Diese Arbeit befasst sich im Bereich Process Mining mit der Forschungslücke der Extraktion von Kontextinformationen aus Ereignisprotokollen (Event Logs), die von bestehenden Process Mining-Techniken nicht erfasst werden.

Innerhalb dieser Forschungslücke identifizieren wir vier Schlüsselprobleme, bei denen wir die Ereignisprotokolle (Event Logs) aus verschiedenen Perspektiven betrachten. Zunächst befassen wir uns mit dem Problem der Erfassung eines Ereignisprotokolls (Event Logs) ohne hinreichenden Datenbankzugang. Das zweite Problem ist die unzureichende Nutzung des in Ereignisprotokollen (Event Logs) enthaltenen Domänenwissens, das zum besseren Verständnis der generierten Prozessmodelle beitragen kann. Außerdem mangelt es an einer ganzheitlichen Darstellung der historischen Datenmanipulation auf Prozessmodellebene. Nicht zuletzt werden Prozessmodelle häufig unabhängig von anderen Prozessmodellen betrachtet, wenn sie aus Ereignisprotokollen (Event Logs) ermittelt wurden. Dadurch können mögliche Datenabhängigkeiten zwischen Prozessen innerhalb einer Organisation übersehen werden.

Für jedes der oben genannten Probleme schlägt diese Arbeit eine eigene Methode vor. Die erste Methode ermöglicht es, ein Ereignisprotokoll (Event Log) ausschließlich anhand der Historie der auf einer Datenbank durchgeführten Transaktionen zu extrahieren, die in Form von Redo-Logs gespeichert ist. Die zweite Methode befasst sich mit der Entdeckung des zugrundeliegenden Datenmodells, das implizit in dem jeweiligen Ereignisprotokoll (Event Log) eingebettet ist, und ergänzt somit das entdeckte Prozessmodell mit wichtigen, domänenspezifischen Informationen. Bei der dritten Methode wird auf der Ebene des Prozessmodells erfasst, wie sich die Daten auf die laufenden Prozessinstanzen auswirken. Die vierte Methode befasst sich schließlich mit der Entdeckung der Beziehungen zwischen Geschäftsprozessen (d.h. deren Datenaustausch) auf Basis der jeweiligen Ereignisprotokolle (Event Logs), sowie mit der expliziten Darstellung solcher komplexen Abhängigkeiten in einer Geschäftsprozessarchitektur.

Alle in dieser Arbeit vorgestellten Methoden sind als Prototyp implementiert und ihre Anwendbarkeit wird anhand ihrer Anwendung auf reale Ereignisprotokolle (Event Logs) nachgewiesen.
KW  - process mining
KW  - data models
KW  - business process architectures
KW  - Datenmodelle
KW  - Geschäftsprozessarchitekturen
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-585427
ER  - 
TY  - BOOK
A1  - Barkowsky, Matthias
A1  - Giese, Holger
T1  - Modular and incremental global model management with extended generalized discrimination networks
T1  - Modulares und inkrementelles Globales Modellmanagement mit erweiterten Generalized Discrimination Networks
N2  - Complex projects developed under the model-driven engineering paradigm nowadays often involve several interrelated models, which are automatically processed via a multitude of model operations. Modular and incremental construction and execution of such networks of models and model operations are required to accommodate efficient development with potentially large-scale models. The underlying problem is also called Global Model Management.


In this report, we propose an approach to modular and incremental Global Model Management via an extension to the existing technique of Generalized Discrimination Networks (GDNs). In addition to further generalizing the notion of query operations employed in GDNs, we adapt the previously query-only mechanism to operations with side effects to integrate model transformation and model synchronization. We provide incremental algorithms for the execution of the resulting extended Generalized Discrimination Networks (eGDNs), as well as a prototypical implementation for a number of example eGDN operations.


Based on this prototypical implementation, we experiment with an application scenario from the software development domain to empirically evaluate our approach with respect to scalability and conceptually demonstrate its applicability in a typical scenario. Initial results confirm that the presented approach can indeed be employed to realize efficient Global Model Management in the considered scenario.
N2  - Komplexe Projekte, die unter dem Paradigma der modellgetriebenen Entwicklung entwickelt werden, nutzen heutzutage oft mehrere miteinander in Beziehung stehende Modelle, die durch eine Vielzahl von Modelloperationen automatisch verarbeitet werden. Die modulare und inkrementelle Konstruktion und Ausführung solcher Netzwerke von Modelloperationen ist eine Voraussetzung für effiziente Entwicklung mit potenziell sehr großen Modellen. Das zugrunde liegende Forschungsproblem heißt auch Globales Modellmanagement.

In diesem Bericht schlagen wir einen Ansatz für modulares und inkrementelles Globales Modellmanagement vor, der auf einer Erweiterung der existierenden Technik der Generalized Discrimination Networks (GDNs) basiert. Neben einer weiteren Verallgemeinerung des Konzepts der Anfrageoperationen in GDNs erweitern wir den zuvor rein lesenden Mechanismus auf Operationen mit Seiteneffekten, um Modelltransformationen und Modellsynchronisationen zu integrieren. Wir präsentieren inkrementelle Algorithmen für die Ausführung der resultierenden erweiterten GDNs (eGDNs) sowie eine prototypische Implementierung von Beispieloperationen für eGDNs.

Mithilfe dieser prototypischen Implementierung evaluieren wir unsere Lösung hinsichtlich ihrer Skalierbarkeit in einem Anwendungsszenario aus dem Bereich der Softwareentwicklung. Außerdem demonstrieren wir die Anwendbarkeit der entwickelten Technik konzeptionell anhand eines typischen Anwendungsszenarios. Unsere ersten Ergebnisse bestätigen, dass die Lösung genutzt werden kann, um effizientes Globales Modellmanagement im betrachteten Szenario zu realisieren.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 154 
KW  - global model management
KW  - generalized discrimination networks
KW  - globales Modellmanagement
KW  - Generalized Discrimination Networks
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-573965
SN  - 978-3-86956-555-2
SN  - 1613-5652
SN  - 2191-1665
IS  - 154
SP  - 63
EP  - 63
ER  - 
TY  - BOOK
A1  - Barkowsky, Matthias
A1  - Giese, Holger
T1  - Triple graph grammars for multi-version models
N2  - Like conventional software projects, projects in model-driven software engineering require adequate management of multiple versions of development artifacts, which, importantly, must allow living with temporary inconsistencies. In the case of model-driven software engineering, employed versioning approaches also have to handle situations where different artifacts, that is, different models, are linked via automatic model transformations.

In this report, we propose a technique for jointly handling the transformation of multiple versions of a source model into corresponding versions of a target model, which enables the use of a more compact representation that may afford improved execution time of both the transformation and further analysis operations. Our approach is based on the well-known formalism of triple graph grammars and a previously introduced encoding of model version histories called multi-version models. In addition to showing the correctness of our approach with respect to the standard semantics of triple graph grammars, we conduct an empirical evaluation that demonstrates the potential benefit regarding execution time performance.
N2  - Ähnlich zu konventionellen Softwareprojekten erfordern Projekte im Bereich der modellgetriebenen Softwareentwicklung eine adäquate Verwaltung mehrerer Versionen von Entwicklungsartefakten. Eine solche Versionsverwaltung muss es insbesondere ermöglichen, zeitweise mit Inkonsistenzen zu leben. Im Fall der modellgetriebenen Softwareentwicklung muss ein verwendeter Ansatz zusätzlich mit Situationen umgehen können, in denen verschiedene Entwicklungsartefakte, das heißt verschiedene Modelle, durch automatische Modelltransformationen verknüpft sind.

In diesem Bericht schlagen wir eine Technik für die integrierte Transformation mehrerer Versionen eines Quellmodells in entsprechende Versionen eines Zielmodells vor. Dies ermöglicht die Verwendung einer kompakteren Repräsentation der Modelle, was zu verbesserten Laufzeiteigenschaften der Transformation und weiterführender Operationen führen kann. Unser Ansatz basiert auf dem bekannten Formalismus der Tripel-Graph-Grammatiken und einer in früheren Arbeiten eingeführten Kodierung von Versionshistorien von Modellen. Neben einem Beweis der Korrektheit des Ansatzes in Bezug auf die standardmäßige Semantik von Tripel-Graph-Grammatiken führen wir eine empirische Evaluierung durch, die den potenziellen Performancevorteil der Technik demonstriert.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 155 
KW  - triple graph grammars
KW  - multi-version models
KW  - Tripel-Graph-Grammatiken
KW  - Modelle mit mehreren Versionen
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-573994
SN  - 978-3-86956-556-9
SN  - 1613-5652
SN  - 2191-1665
IS  - 155
SP  - 28
EP  - 28
ER  - 
TY  - THES
A1  - Bartz, Christian
T1  - Reducing the annotation burden: deep learning for optical character recognition using less manual annotations
N2  - Text is a ubiquitous entity in our world and daily life. We encounter it nearly everywhere in shops, on the street, or in our flats. Nowadays, more and more text is contained in digital images. These images are either taken using cameras, e.g., smartphone cameras, or taken using scanning devices such as document scanners. The sheer amount of available data, e.g., millions of images taken by Google Streetview, prohibits manual analysis and metadata extraction. Although much progress was made in the area of optical character recognition (OCR) for printed text in documents, broad areas of OCR are still not fully explored and hold many research challenges. With the mainstream usage of machine learning and especially deep learning, one of the most pressing problems is the availability and acquisition of annotated ground truth for the training of machine learning models because obtaining annotated training data using manual annotation mechanisms is time-consuming and costly. In this thesis, we address how we can reduce the costs of acquiring ground truth annotations for the application of state-of-the-art machine learning methods to optical character recognition pipelines. To this end, we investigate how we can reduce the annotation cost by using only a fraction of the typically required ground truth annotations, e.g., for scene text recognition systems. We also investigate how we can use synthetic data to reduce the need for manual annotation work, e.g., in the area of document analysis for archival material. In the area of scene text recognition, we have developed a novel end-to-end scene text recognition system that can be trained using inexact supervision and shows competitive/state-of-the-art performance on standard benchmark datasets for scene text recognition. Our method consists of two independent neural networks, combined using spatial transformer networks. Both networks learn together to perform text localization and text recognition at the same time while only using annotations for the recognition task. We apply our model to end-to-end scene text recognition (meaning localization and recognition of words) and pure scene text recognition without any changes in the network architecture.

In the second part of this thesis, we introduce novel approaches for using and generating synthetic data to analyze handwriting in archival data. First, we propose a novel preprocessing method to determine whether a given document page contains any handwriting. We propose a novel data synthesis strategy to train a classification model and show that our data synthesis strategy is viable by evaluating the trained model on real images from an archive. Second, we introduce the new analysis task of handwriting classification. Handwriting classification entails classifying a given handwritten word image into classes such as date, word, or number. Such an analysis step allows us to select the best fitting recognition model for subsequent text recognition; it also allows us to reason about the semantic content of a given document page without the need for fine-grained text recognition and further analysis steps, such as Named Entity Recognition. We show that our proposed approaches work well when trained on synthetic data. Further, we propose a flexible metric learning approach to allow zero-shot classification of classes unseen during the network’s training. Last, we propose a novel data synthesis algorithm to train off-the-shelf pixel-wise semantic segmentation networks for documents. Our data synthesis pipeline is based on the well-known StyleGAN architecture and can synthesize realistic document images with their corresponding segmentation annotation without the need for any annotated data.
N2  - Text umgibt uns überall. Wir finden Text in allen Lebenslagen, z.B. in einem Geschäft, an Gebäuden, oder in unserer Wohnung. Viele dieser Textentitäten können heutzutage auch in digitalen Bildern gefunden werden, welche auf verschiedene Art und Weise erstellt werden können, z.B. mittels einer Kamera in einem Smartphone oder durch einen Dokumentenscanner. Die Anzahl verfügbarer digitaler Bilder, z.B. Millionen – wenn nicht Milliarden von Bildern – in Google Streetview, macht eine manuelle Analyse der Bilddaten unmöglich. Obwohl es im Gebiet der Optical Character Recognition (OCR) in den letzten Jahren viel Fortschritt gab, gibt es doch noch viele Bereiche, die noch nicht vollständig erforscht worden sind. Der immer zunehmende Einsatz von Methoden des maschinellen Lernens, insbesondere der Einsatz von Deep Learning Technologien, im Bereich der OCR, führt zu dem großen Problem der Verfügbarkeit von annotierten Trainingsdaten. Die Beschaffung annotierter Daten mittels manueller Annotation ist zeitintensiv und sehr teuer. In dieser Arbeit zeigen wir neue Wege und Verfahren auf, wie das Problem der Beschaffung annotierter Daten für die Anwendung von modernsten Deep Learning Verfahren im Bereich der OCR gelöst werden könnte. Hierbei zeigen wir neue Verfahren in zwei Unterbereichen der OCR. Einerseits untersuchen wir, wie wir die Annotationskosten reduzieren könnten, indem wir inexakte Annotationen benutzen um z.B. die Kosten der Annotation von echten Daten im Bereich der Texterkennung aus natürlichen Bildern zu reduzieren. Dieses System wird mittels weak supervision trainiert und erreicht Ergebnisse, die auf dem Stand der Technik bzw. darüber liegen. Unsere Methode basiert auf zwei unabhängigen neuronalen Netzwerken, die mittels eines Spatial Transformers verbunden werden. Beide Netzwerke werden zusammen trainiert und lernen zusammen, wie Text gefunden und gelesen werden kann. 
Dabei nutzen wir aber nur Annotationen und Supervision für das Lesen (recognition) des Textes, nicht für die Textfindung. Wir zeigen weiterhin, dass unser System für eine Mehrzahl von Aufgaben im Bereich der Texterkennung aus natürlichen Bildern genutzt werden kann, ohne Veränderungen im Netzwerk vornehmen zu müssen. Andererseits untersuchen wir, wie wir Verfahren zur Erstellung von synthetischen Daten benutzen können, um die Kosten und den Aufwand der manuellen Annotation zu verringern und zeigen Ergebnisse aus dem Bereich der Analyse von Handschrift in historischen Archivdokumenten. Zuerst präsentieren wir ein System zur Erkennung, ob ein Bild überhaupt Handschrift enthält. Hier schlagen wir eine neue Datengenerierungsmethode vor. Die generierten Daten werden zum Training eines Klassifizierungsmodells genutzt. Unsere experimentellen Ergebnisse belegen, dass unsere Idee auch auf echten Daten aus einem Archiv eingesetzt werden kann.

Als Zweites führen wir einen neuen Schritt in einer Dokumentenanalyseplattform ein: Handschriftklassifizierung. Hier ordnen wir Bilder einzelner handgeschriebener Wörter anhand ihrer visuellen Struktur in Klassen wie Zahlen, Datumsangaben oder Wörter ein. Die Einführung dieses Analyseschrittes erlaubt es uns, den besten Algorithmus für den nächsten Schritt, die eigentliche Handschrifterkennung, zu finden. Der Analyseschritt erlaubt es uns auch, bereits Aussagen über den semantischen Inhalt eines Dokumentes zu treffen, ohne weitere Analyseschritte, wie Named Entity Recognition, durchführen zu müssen. Wir zeigen, dass unser Ansatz sehr gut funktioniert, wenn er auf synthetischen Daten trainiert wird; wir zeigen weiterhin, dass unser Ansatz auch für Zero-Shot-Klassifikation eingesetzt werden kann. Zum Schluss präsentieren wir ein neues Verfahren zur Generierung von Trainingsdaten für die pixelgenaue semantische Segmentierung in Bildern von Dokumenten. Unser Verfahren basiert auf der bekannten StyleGAN-Architektur und ist in der Lage, Bilder mit entsprechender Annotation automatisch zu generieren. Hierbei werden keine echten annotierten Daten benötigt und das Verfahren kann auf jeder Form von Dokumenten eingesetzt werden.
KW  - computer vision
KW  - optical character recognition
KW  - archive analysis
KW  - data synthesis
KW  - weak supervision
KW  - Archivanalyse
KW  - maschinelles Sehen
KW  - Datensynthese
KW  - Texterkennung
KW  - schwach überwachtes maschinelles Lernen
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-555407
ER  - 
TY  - BOOK
A1  - Bartz, Christian
A1  - Krestel, Ralf
T1  - Deep learning for computer vision in the art domain
BT  - proceedings of the master seminar on practical introduction to deep learning for computer vision, HPI WS 20/21
N2  - In recent years, computer vision algorithms based on machine learning have seen rapid development. In the past, research mostly focused on solving computer vision problems such as image classification or object detection on images displaying natural scenes. Nowadays, other fields, such as cultural heritage, where an abundance of data is available, are also coming into the focus of research. In line with current research endeavours, we collaborated with the Getty Research Institute, which provided us with a challenging dataset containing images of paintings and drawings. In this technical report, we present the results of the seminar "Deep Learning for Computer Vision". In this seminar, students of the Hasso Plattner Institute evaluated state-of-the-art approaches for image classification, object detection and image recognition on the dataset of the Getty Research Institute. The main challenge when applying modern computer vision methods to the available data is the availability of annotated training data, as the dataset provided by the Getty Research Institute does not contain a sufficient amount of annotated samples for the training of deep neural networks. However, throughout the report we show that it is possible to achieve satisfactory to very good results when using further publicly available datasets, such as the WikiArt dataset, for the training of machine learning models.
N2  - Methoden zur Anwendung von maschinellem Lernen für das maschinelle Sehen haben sich in den letzten Jahren stark weiterentwickelt. Dabei konzentrierte sich die Forschung hauptsächlich auf die Lösung von Problemen im Bereich der Bildklassifizierung oder der Objekterkennung aus Bildern mit natürlichen Motiven. Mehr und mehr kommen zusätzlich auch andere Inhaltsbereiche, vor allem aus dem kulturellen Umfeld, in den Fokus der Forschung. Kulturforschungsinstitute, wie das Getty Research Institute, besitzen eine Vielzahl von digitalisierten Dokumenten, die bisher noch nicht analysiert wurden. Im Rahmen einer Zusammenarbeit überließ das Getty Research Institute uns einen Datensatz, bestehend aus Photos von Kunstwerken. In diesem technischen Bericht präsentieren wir die Ergebnisse des Masterseminars "Deep Learning for Computer Vision", in dem Studierende des Hasso-Plattner-Instituts den Stand der Kunst bei der Anwendung von Bildklassifizierungs-, Objekterkennungs- und Image-Retrieval-Algorithmen evaluierten. Eine besondere Schwierigkeit war, dass es nicht möglich ist, bestehende Verfahren direkt auf dem Datensatz anzuwenden, da keine bzw. kaum Annotationen für das Training von Machine Learning Modellen verfügbar sind. In den einzelnen Teilen des Berichts zeigen wir jedoch, dass es möglich ist, unter Zuhilfenahme von weiteren öffentlich verfügbaren Datensätzen, wie dem WikiArt Datensatz, zufriedenstellende bis sehr gute Ergebnisse für die einzelnen Analyseaufgaben zu erreichen.
T3  - Technische Berichte des Hasso-Plattner-Instituts für Digital Engineering an der Universität Potsdam - 139 
KW  - computer vision
KW  - cultural heritage
KW  - art analysis
KW  - maschinelles Sehen
KW  - kulturelles Erbe
KW  - Kunstanalyse
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-512906
SN  - 978-3-86956-514-9
SN  - 1613-5652
SN  - 2191-1665
IS  - 139
PB  - Universitätsverlag Potsdam
CY  - Potsdam
ER  - 
TY  - GEN
A1  - Bartz, Christian
A1  - Yang, Haojin
A1  - Meinel, Christoph
T1  - SEE: Towards semi-supervised end-to-end scene text recognition
T2  - Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, Thirtieth Innovative Applications of Artificial Intelligence Conference, Eighth Symposium on Educational Advances in Artificial Intelligence
N2  - Detecting and recognizing text in natural scene images is a challenging task that is not yet completely solved. In recent years, several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been proposed. In this paper, we present SEE, a step towards semi-supervised neural networks for scene text detection and recognition that can be optimized end-to-end. Most existing works consist of multiple deep neural networks and several pre-processing steps. In contrast to this, we propose to use a single deep neural network that learns to detect and recognize text from natural images in a semi-supervised way. SEE is a network that integrates and jointly learns a spatial transformer network, which can learn to detect text regions in an image, and a text recognition network that takes the identified text regions and recognizes their textual content. We introduce the idea behind our novel approach and show its feasibility by performing a range of experiments on standard benchmark datasets, where we achieve competitive results.
Y1  - 2018
SN  - 978-1-57735-800-8
VL  - 10
SP  - 6674
EP  - 6681
PB  - Association for the Advancement of Artificial Intelligence
CY  - Palo Alto
ER  - 
TY  - THES
A1  - Batoulis, Kimon
T1  - Sound integration of process and decision models
T1  - Korrekte Integration von Prozess- und Entscheidungsmodellen
N2  - Business process management is an established technique for business organizations to manage and support their processes. Those processes are typically represented by graphical models designed with modeling languages, such as the Business Process Model and Notation (BPMN).
Since process models do not only serve the purpose of documentation but are also a basis for implementation and automation of the processes, they have to satisfy certain correctness requirements. In this regard, the notion of soundness of workflow nets was developed, which can be applied to BPMN process models in order to verify their correctness. Because the original soundness criteria are very restrictive regarding the behavior of the model, different variants of the soundness notion have been developed for situations in which certain violations are not harmful.
All of those notions, however, consider only the control-flow structure of a process model. This poses a problem, given that with the recent release and the ongoing development of the Decision Model and Notation (DMN) standard, an increasing number of process models are complemented by respective decision models. DMN is a dedicated modeling language for decision logic and separates the concerns of process and decision logic into two different models, process and decision models respectively.
Hence, this thesis is concerned with the development of decision-aware soundness notions, i.e., notions of soundness that build upon the original soundness ideas for process models, but additionally take into account complementary decision models. Similar to the various notions of workflow net soundness, this thesis investigates different notions of decision soundness that can be applied depending on the desired degree of restrictiveness. Since decision tables are a standardized means of DMN to represent decision logic, this thesis also puts special focus on decision tables, discussing how they can be translated into an unambiguous format and how their possible output values can be efficiently determined.
Moreover, a prototypical implementation is described that supports checking a basic version of decision soundness. The decision soundness notions were also empirically evaluated on models from participants of an online course on process and decision modeling as well as from a process management project of a large insurance company. The evaluation demonstrates that violations of decision soundness indeed occur and can be detected with our approach.
N2  - Das Prozessmanagement ist eine etablierte Methode für Unternehmen zur Verwaltung und Unterstützung ihrer Geschäftsprozesse. Solche Prozesse werden typischerweise durch graphische Modelle dargestellt, welche mit Modellierungssprachen wie etwa der Business Process Model and Notation (BPMN) erstellt werden.
Da Prozessmodelle nicht nur der Dokumentation der Prozesse dienen, sondern auch die Grundlage für deren Implementierung und Automatisierung sind, müssen sie bestimmte Korrektheitsanforderungen erfüllen. In dieser Hinsicht wurde der Begriff der Soundness eines Workflow-Netzes entwickelt, welcher auch auf BPMN-Prozessmodelle angewendet werden kann, um deren Korrektheit zu prüfen. Da die ursprünglichen Soundness-Kriterien sehr restriktiv bezüglich des Verhaltens des Modells sind, wurden zudem Varianten des Soundness-Begriffs entwickelt. Diese können in Situationen verwendet werden, in denen bestimmte Verletzungen der Kriterien tolerabel sind.
Diese Soundness-Begriffe berücksichtigen allerdings ausschließlich den Kontrollfluss der Prozessmodelle. Dies stellt ein Problem dar, weil viele Prozessmodelle heutzutage durch Entscheidungsmodelle ergänzt werden. In diesem Kontext ist die Decision Model and Notation (DMN) eine dedizierte Sprache zur Modellierung von Entscheidungen und unterstützt die Trennung von Kontrollfluss- und Entscheidungslogik.
Die vorliegende Dissertation befasst sich daher mit der Entwicklung von erweiterten Soundness-Begriffen, die sowohl Prozess- als auch Entscheidungsmodelle berücksichtigen. Ähnlich zu den bestehenden Soundness-Varianten werden in dieser Arbeit Varianten des erweiterten Soundness-Begriffs untersucht, die je nach gewünschtem Restriktionsgrad angewendet werden können. Da Entscheidungstabellen eine in der DMN standardisierte Form sind, um Entscheidungslogik auszudrücken, fokussiert sich diese Arbeit insbesondere auf Entscheidungstabellen. So wird diskutiert, wie DMN-Tabellen in ein eindeutiges Format übersetzt werden können und wie sich deren mögliche Rückgabewerte effizient bestimmen lassen.
Ferner beschreibt die Arbeit eine prototypische Implementierung, die das Prüfen einer elementaren Variante des erweiterten Soundness-Begriffs erlaubt. Die Begriffe wurden außerdem empirisch evaluiert. Dazu dienten zum einen Modelle von Teilnehmern eines Online-Kurses zur Prozess- und Entscheidungsmodellierung. Zum anderen wurden Modelle eines Versicherungsunternehmens analysiert. Die Evaluierung zeigt, dass Verstöße gegen den erweiterten Soundness-Begriff in der Tat auftreten und durch den hier beschriebenen Ansatz erkannt werden können.
KW  - decision-aware process models
KW  - soundness
KW  - decision soundness
KW  - formal verification
KW  - entscheidungsbewusste Prozessmodelle
KW  - Korrektheit
KW  - Entscheidungskorrektheit
KW  - formale Verifikation
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-437386
ER  -