004 Datenverarbeitung; Informatik
Refine
Year of publication
- 2022 (62) (remove)
Document Type
- Article (29)
- Monograph/Edited Volume (12)
- Doctoral Thesis (12)
- Postprint (6)
- Conference Proceeding (3)
Is part of the Bibliography
- yes (62)
Keywords
- machine learning (7)
- Business Process Management (2)
- Data profiling (2)
- MOOC (2)
- YouTube (2)
- authorship attribution (2)
- digital enlightenment (2)
- digital learning platform (2)
- digitale Aufklärung (2)
- digitale Lernplattform (2)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (25)
- Wirtschaftswissenschaften (9)
- Fachgruppe Betriebswirtschaftslehre (7)
- Hasso-Plattner-Institut für Digital Engineering gGmbH (6)
- Digital Engineering Fakultät (4)
- Institut für Informatik und Computational Science (4)
- Bürgerliches Recht (3)
- Sozialwissenschaften (2)
- Fachgruppe Soziologie (1)
- Institut für Biochemie und Biologie (1)
“Broadcast your gender.”
(2022)
Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as sociodemographic characteristics of agents are often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. By using the example of a random sample of 200 YouTube channels, we compare several classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These different methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution will evaluate the share of identifiable channels, accuracy and meaningfulness of classification, as well as limits and benefits of each approach. We aim to address methodological challenges connected to classifying gender attributes for YouTube channels as well as related to reinforcing stereotypes and ethical implications.
“Broadcast your gender.”
(2022)
Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as sociodemographic characteristics of agents are often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. By using the example of a random sample of 200 YouTube channels, we compare several classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These different methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution will evaluate the share of identifiable channels, accuracy and meaningfulness of classification, as well as limits and benefits of each approach. We aim to address methodological challenges connected to classifying gender attributes for YouTube channels as well as related to reinforcing stereotypes and ethical implications.
Knowledge-intensive business processes are flexible and data-driven. Therefore, traditional process modeling languages do not meet their requirements: These languages focus on highly structured processes in which data plays a minor role. As a result, process-oriented information systems fail to assist knowledge workers on executing their processes. We propose a novel case management approach that combines flexible activity-centric processes with data models, and we provide a joint semantics using colored Petri nets. The approach is suited to model, verify, and enact knowledge-intensive processes and can aid the development of information systems that support knowledge work.
Knowledge-intensive processes are human-centered, multi-variant, and data-driven. Typical domains include healthcare, insurances, and law. The processes cannot be fully modeled, since the underlying knowledge is too vast and changes too quickly. Thus, models for knowledge-intensive processes are necessarily underspecified. In fact, a case emerges gradually as knowledge workers make informed decisions. Knowledge work imposes special requirements on modeling and managing respective processes. They include flexibility during design and execution, ad-hoc adaption to unforeseen situations, and the integration of behavior and data. However, the predominantly used process modeling languages (e.g., BPMN) are unsuited for this task.
Therefore, novel modeling languages have been proposed. Many of them focus on activities' data requirements and declarative constraints rather than imperative control flow. Fragment-Based Case Management, for example, combines activity-centric imperative process fragments with declarative data requirements. At runtime, fragments can be combined dynamically, and new ones can be added. Yet, no integrated semantics for flexible activity-centric process models and data models exists.
In this thesis, Wickr, a novel case modeling approach extending fragment-based Case Management, is presented. It supports batch processing of data, sharing data among cases, and a full-fledged data model with associations and multiplicity constraints. We develop a translational semantics for Wickr targeting (colored) Petri nets. The semantics assert that a case adheres to the constraints in both the process fragments and the data models. Among other things, multiplicity constraints must not be violated. Furthermore, the semantics are extended to multiple cases that operate on shared data. Wickr shows that the data structure may reflect process behavior and vice versa. Based on its semantics, prototypes for executing and verifying case models showcase the feasibility of Wickr. Its applicability to knowledge-intensive and to data-centric processes is evaluated using well-known requirements from related work.
Dynamic resource management is an essential requirement for private and public cloud computing environments. With dynamic resource management, the physical resources assignment to the cloud virtual resources depends on the actual need of the applications or the running services, which enhances the cloud physical resources utilization and reduces the offered services cost. In addition, the virtual resources can be moved across different physical resources in the cloud environment without an obvious impact on the running applications or services production. This means that the availability of the running services and applications in the cloud is independent on the hardware resources including the servers, switches and storage failures. This increases the reliability of using cloud services compared to the classical data-centers environments.
In this thesis we briefly discuss the dynamic resource management topic and then deeply focus on live migration as the definition of the compute resource dynamic management. Live migration is a commonly used and an essential feature in cloud and virtual data-centers environments. Cloud computing load balance, power saving and fault tolerance features are all dependent on live migration to optimize the virtual and physical resources usage. As we will discuss in this thesis, live migration shows many benefits to cloud and virtual data-centers environments, however the cost of live migration can not be ignored. Live migration cost includes the migration time, downtime, network overhead, power consumption increases and CPU overhead.
IT admins run virtual machines live migrations without an idea about the migration cost. So, resources bottlenecks, higher migration cost and migration failures might happen. The first problem that we discuss in this thesis is how to model the cost of the virtual machines live migration. Secondly, we investigate how to make use of machine learning techniques to help the cloud admins getting an estimation of this cost before initiating the migration for one of multiple virtual machines. Also, we discuss the optimal timing for a specific virtual machine before live migration to another server. Finally, we propose practical solutions that can be used by the cloud admins to be integrated with the cloud administration portals to answer the raised research questions above.
Our research methodology to achieve the project objectives is to propose empirical models based on using VMware test-beds with different benchmarks tools. Then we make use of the machine learning techniques to propose a prediction approach for virtual machines live migration cost. Timing optimization for live migration is also proposed in this thesis based on using the cost prediction and data-centers network utilization prediction. Live migration with persistent memory clusters is also discussed at the end of the thesis. The cost prediction and timing optimization techniques proposed in this thesis could be practically integrated with VMware vSphere cluster portal such that the IT admins can now use the cost prediction feature and timing optimization option before proceeding with a virtual machine live migration.
Testing results show that our proposed approach for VMs live migration cost prediction shows acceptable results with less than 20% prediction error and can be easily implemented and integrated with VMware vSphere as an example of a commonly used resource management portal for virtual data-centers and private cloud environments. The results show that using our proposed VMs migration timing optimization technique also could save up to 51% of migration time of the VMs migration time for memory intensive workloads and up to 27% of the migration time for network intensive workloads. This timing optimization technique can be useful for network admins to save migration time with utilizing higher network rate and higher probability of success.
At the end of this thesis, we discuss the persistent memory technology as a new trend in servers memory technology. Persistent memory modes of operation and configurations are discussed in detail to explain how live migration works between servers with different memory configuration set up. Then, we build a VMware cluster with persistent memory inside server and also with DRAM only servers to show the live migration cost difference between the VMs with DRAM only versus the VMs with persistent memory inside.
Identity management is at the forefront of applications’ security posture. It separates the unauthorised user from the legitimate individual. Identity management models have evolved from the isolated to the centralised paradigm and identity federations. Within this advancement, the identity provider emerged as a trusted third party that holds a powerful position. Allen postulated the novel self-sovereign identity paradigm to establish a new balance. Thus, extensive research is required to comprehend its virtues and limitations. Analysing the new paradigm, initially, we investigate the blockchain-based self-sovereign identity concept structurally. Moreover, we examine trust requirements in this context by reference to patterns. These shapes comprise major entities linked by a decentralised identity provider. By comparison to the traditional models, we conclude that trust in credential management and authentication is removed. Trust-enhancing attribute aggregation based on multiple attribute providers provokes a further trust shift. Subsequently, we formalise attribute assurance trust modelling by a metaframework. It encompasses the attestation and trust network as well as the trust decision process, including the trust function, as central components. A secure attribute assurance trust model depends on the security of the trust function. The trust function should consider high trust values and several attribute authorities. Furthermore, we evaluate classification, conceptual study, practical analysis and simulation as assessment strategies of trust models. For realising trust-enhancing attribute aggregation, we propose a probabilistic approach. The method exerts the principle characteristics of correctness and validity. These values are combined for one provider and subsequently for multiple issuers. We embed this trust function in a model within the self-sovereign identity ecosystem. To practically apply the trust function and solve several challenges for the service provider that arise from adopting self-sovereign identity solutions, we conceptualise and implement an identity broker. The mediator applies a component-based architecture to abstract from a single solution. Standard identity and access management protocols build the interface for applications. We can conclude that the broker’s usage at the side of the service provider does not undermine self-sovereign principles, but fosters the advancement of the ecosystem. The identity broker is applied to sample web applications with distinct attribute requirements to showcase usefulness for authentication and attribute-based access control within a case study.
Scrollytellings are an innovative form of web content. Combining the benefits of books, images, movies, and video games, they are a tool to tell compelling stories and provide excellent learning opportunities. Due to their multi-modality, creating high-quality scrollytellings is not an easy task. Different professions, such as content designers, graphics designers, and developers, need to collaborate to get the best out of the possibilities the scrollytelling format provides. Collaboration unlocks great potential. However, content designers cannot create scrollytellings directly and always need to consult with developers to implement their vision. This can result in misunderstandings. Often, the resulting scrollytelling will not match the designer’s vision sufficiently, causing unnecessary iterations. Our project partner Typeshift specializes in the creation of individualized scrollytellings for their clients. Examined existing solutions for authoring interactive content are not optimally suited for creating highly customized scrollytellings while still being able to manipulate all their elements programmatically. Based on their experience and expertise, we developed an editor to author scrollytellings in the lively.next live-programming environment. In this environment, a graphical user interface for content design is combined with powerful possibilities for programming behavior with the morphic system. The editor allows content designers to take on large parts of the creation process of scrollytellings on their own, such as creating the visible elements, animating content, and fine-tuning the scrollytelling. Hence, developers can focus on interactive elements such as simulations and games. Together with Typeshift, we evaluated the tool by recreating an existing scrollytelling and identified possible future enhancements. Our editor streamlines the creation process of scrollytellings. Content designers and developers can now both work on the same scrollytelling. Due to the editor inside of the lively.next environment, they can both work with a set of tools familiar to them and their traits. Thus, we mitigate unnecessary iterations and misunderstandings by enabling content designers to realize large parts of their vision of a scrollytelling on their own. Developers can add advanced and individual behavior. Thus, developers and content designers benefit from a clearer distribution of tasks while keeping the benefits of collaboration.
Traditional organizations are strongly encouraged by emerging digital customer behavior and digital competition to transform their businesses for the digital age. Incumbents are particularly exposed to the field of tension between maintaining and renewing their business model. Banking is one of the industries most affected by digitalization, with a large stream of digital innovations around Fintech. Most research contributions focus on digital innovations, such as Fintech, but there are only a few studies on the related challenges and perspectives of incumbent organizations, such as traditional banks. Against this background, this dissertation examines the specific causes, effects and solutions for traditional banks in digital transformation − an underrepresented research area so far.
The first part of the thesis examines how digitalization has changed the latent customer expectations in banking and studies the underlying technological drivers of evolving business-to-consumer (B2C) business models. Online consumer reviews are systematized to identify latent concepts of customer behavior and future decision paths as strategic digitalization effects. Furthermore, the service attribute preferences, the impact of influencing factors and the underlying customer segments are uncovered for checking accounts in a discrete choice experiment. The dissertation contributes here to customer behavior research in digital transformation, moving beyond the technology acceptance model. In addition, the dissertation systematizes value proposition types in the evolving discourse around smart products and services as key drivers of business models and market power in the platform economy.
The second part of the thesis focuses on the effects of digital transformation on the strategy development of financial service providers, which are classified along with their firm performance levels. Standard types are derived based on fuzzy-set qualitative comparative analysis (fsQCA), with facade digitalization as one typical standard type for low performing incumbent banks that lack a holistic strategic response to digital transformation. Based on this, the contradictory impact of digitalization measures on key business figures is examined for German savings banks, confirming that the shift towards digital customer interaction was not accompanied by new revenue models diminishing bank profitability. The dissertation further contributes to the discourse on digitalized work designs and the consequences for job perceptions in banking customer advisory. The threefold impact of the IT support perceived in customer interaction on the job satisfaction of customer advisors is disentangled.
In the third part of the dissertation, solutions are developed design-oriented for core action areas of digitalized business models, i.e., data and platforms. A consolidated taxonomy for data-driven business models and a future reference model for digital banking have been developed. The impact of the platform economy is demonstrated here using the example of the market entry by Bigtech. The role-based e3-value modeling is extended by meta-roles and role segments and linked to value co-creation mapping in VDML. In this way, the dissertation extends enterprise modeling research on platform ecosystems and value co-creation using the example of banking.
The intensity of cosmic radiation may differ over five orders of magnitude within a few hours or days during the Solar Particle Events (SPEs), thus increasing for several orders of magnitude the probability of Single Event Upsets (SEUs) in space-borne electronic systems. Therefore, it is vital to enable the early detection of the SEU rate changes in order to ensure timely activation of dynamic radiation hardening measures. In this paper, an embedded approach for the prediction of SPEs and SRAM SEU rate is presented. The proposed solution combines the real-time SRAM-based SEU monitor, the offline-trained machine learning model and online learning algorithm for the prediction. With respect to the state-of-the-art, our solution brings the following benefits: (1) Use of existing on-chip data storage SRAM as a particle detector, thus minimizing the hardware and power overhead, (2) Prediction of SRAM SEU rate one hour in advance, with the fine-grained hourly tracking of SEU variations during SPEs as well as under normal conditions, (3) Online optimization of the prediction model for enhancing the prediction accuracy during run-time, (4) Negligible cost of hardware accelerator design for the implementation of selected machine learning model and online learning algorithm. The proposed design is intended for a highly dependable and self-adaptive multiprocessing system employed in space applications, allowing to trigger the radiation mitigation mechanisms before the onset of high radiation levels.
Complex networks like the Internet or social networks are fundamental parts of our everyday lives. It is essential to understand their structural properties and how these networks are formed. A game-theoretic approach to network design problems has become of high interest in the last decades. The reason is that many real-world networks are the outcomes of decentralized strategic behavior of independent agents without central coordination. Fabrikant, Luthra, Maneva, Papadimitriou, and Schenker proposed a game-theoretic model aiming to explain the formation of the Internet-like networks. In this model, called the Network Creation Game, agents are associated with nodes of a network. Each agent seeks to maximize her centrality by establishing costly connections to other agents. The model is relatively simple but shows a high potential in modeling complex real-world networks. In this thesis, we contribute to the line of research on variants of the Network Creation Games. Inspired by real-world networks, we propose and analyze several novel network creation models. We aim to understand the impact of certain realistic modeling assumptions on the structure of the created networks and the involved agents’ behavior.
The first natural additional objective that we consider is the network’s robustness. We consider a game where the agents seek to maximize their centrality and, at the same time, the stability of the created network against random edge failure.
Our second point of interest is a model that incorporates an underlying geometry. We consider a network creation model where the agents correspond to points in some underlying space and where edge lengths are equal to the distances between the endpoints in that space. The geometric setting captures many physical real-world networks like transport networks and fiber-optic communication networks.
We focus on the formation of social networks and consider two models that incorporate particular realistic behavior observed in real-world networks. In the first model, we embed the anti-preferential attachment link formation. Namely, we assume that the cost of the connection is proportional to the popularity of the targeted agent. Our second model is based on the observation that the probability of two persons to connect is inversely proportional to the length of their shortest chain of mutual acquaintances.
For each of the four models above, we provide a complete game-theoretical analysis. In particular, we focus on distinctive structural properties of the equilibria, the hardness of computing a best response, the quality of equilibria in comparison to the centrally designed socially optimal networks. We also analyze the game dynamics, i.e., the process of sequential strategic improvements by the agents, and analyze the convergence to an equilibrium state and its properties.