TY - JOUR
A1 - Combi, Carlo
A1 - Oliboni, Barbara
A1 - Weske, Mathias
A1 - Zerbato, Francesca
T1 - Seamless conceptual modeling of processes with transactional and analytical data
JF - Data & knowledge engineering
N2 - In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. Especially in data- and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support.
KW - Conceptual modeling
KW - Business process modeling
KW - BPMN
KW - Data modeling
KW - Data warehouse
KW - Decision support
Y1 - 2021
U6 - https://doi.org/10.1016/j.datak.2021.101895
SN - 0169-023X
SN - 1872-6933
VL - 134
PB - Elsevier
CY - Amsterdam
ER -
TY - JOUR
A1 - Schladebach, Marcus
T1 - Satelliten-Megakonstellationen im Weltraumrecht
JF - Kommunikation & Recht : K & R / Beihefter
Y1 - 2022
SN - 1434-6354
IS - 2
SP - 26
EP - 29
PB - dfv-Mediengruppe
CY - Frankfurt am Main
ER -
TY - JOUR
A1 - Körppen, Tim
A1 - Ullrich, André
A1 - Bertheau, Clementine
T1 - Durchblick statt Bauchgefühl – Transformation zur Data-Driven Organization
JF - Wirtschaftsinformatik & Management
N2 - To compete in the digitalized economy, the company, the market, and especially customers must be understood in detail. Next to the "big players" from Silicon Valley, the German Mittelstand, which still largely operates on legacy IT infrastructures and processes, often looks outdated. To avoid being left behind entirely in the coming years, a fundamental shift is necessary. Both the service creation processes and the range of services offered must be made transparent and data-based. Only then can business transactions, market developments, and the actions of the players be assessed in an integrated way and well-founded decisions be made. This article presents the concept of the Data-Driven Organization and shows how companies can determine their own analytics maturity level and raise it through an iterative transformation process.
Y1 - 2021
U6 - https://doi.org/10.1365/s35764-021-00370-7
SN - 1867-5905
VL - 13
IS - 6
SP - 452
EP - 459
PB - Springer Gabler
CY - Wiesbaden
ER -
TY - JOUR
A1 - Ullrich, André
A1 - Teichmann, Malte
A1 - Gronau, Norbert
T1 - Fast trainable capabilities in software engineering-skill development in learning factories
JF - Ji suan ji jiao yu = Computer Education / Qing hua da xue
N2 - The increasing demand for software engineers cannot completely be fulfilled by university education and conventional training approaches due to limited capacities. Accordingly, an alternative approach is necessary where potential software engineers are being educated in software engineering skills using new methods. We suggest micro tasks combined with theoretical lessons to overcome existing skill deficits and acquire fast trainable capabilities. This paper addresses the gap between demand and supply of software engineers by introducing an action-oriented and scenario-based didactical approach, which enables non-computer scientists to code. Therein, the learning content is provided in small tasks and embedded in learning factory scenarios. Therefore, different requirements for software engineers from the market side and from an academic viewpoint are analyzed and synthesized into an integrated, yet condensed skills catalogue. This enables the development of training and education units that focus on the most important skills demanded on the market. To achieve this objective, individual learning scenarios are developed. Of course, proper basic skills in coding cannot be learned overnight but software programming is also no sorcery.
KW - learning factory
KW - programming skills
KW - software engineering
KW - training
Y1 - 2021
U6 - https://doi.org/10.16512/j.cnki.jsjjy.2020.12.002
SN - 1672-5913
IS - 12
SP - 2
EP - 10
PB - [publisher not identified]
CY - Bei jing shi
ER -
TY - JOUR
A1 - Roostapour, Vahid
A1 - Neumann, Aneta
A1 - Neumann, Frank
A1 - Friedrich, Tobias
T1 - Pareto optimization for subset selection with dynamic cost constraints
JF - Artificial intelligence
N2 - We consider the subset selection problem for function f with constraint bound B that changes over time. Within the area of submodular optimization, various greedy approaches are commonly used. For dynamic environments we observe that the adaptive variants of these greedy approaches are not able to maintain their approximation quality. Investigating the recently introduced POMC Pareto optimization approach, we show that this algorithm efficiently computes a $\phi = (\alpha_f/2)(1 - 1/e^{\alpha_f})$-approximation, where $\alpha_f$ is the submodularity ratio of f, for each possible constraint bound $b \le B$. Furthermore, we show that POMC is able to adapt its set of solutions quickly in the case that B increases. Our experimental investigations for the influence maximization in social networks show the advantage of POMC over generalized greedy algorithms. We also consider EAMC, a new evolutionary algorithm with polynomial expected time guarantee to maintain a $\phi$ approximation ratio, and NSGA-II with two different population sizes as an advanced multi-objective optimization algorithm, to demonstrate their challenges in optimizing the maximum coverage problem. Our empirical analysis shows that, within the same number of evaluations, POMC is able to perform as well as NSGA-II under linear constraint, while EAMC performs significantly worse than all considered algorithms in most cases.
KW - Subset selection
KW - Submodular function
KW - Multi-objective optimization
KW - Runtime analysis
Y1 - 2022
U6 - https://doi.org/10.1016/j.artint.2021.103597
SN - 0004-3702
SN - 1872-7921
VL - 302
PB - Elsevier
CY - Amsterdam
ER -
TY - CHAP
A1 - Krasnova, Hanna
A1 - Gundlach, Jana
A1 - Baumann, Annika
T1 - Coming back for more
BT - the effect of news feed serendipity on social networking site usage
T2 - PACIS 2022 proceedings
N2 - Recent spikes in social networking site (SNS) usage times have launched investigations into reasons for excessive SNS usage. Extending research on social factors (i.e., fear of missing out), this study considers the News Feed setup. More specifically, we suggest that the order of the News Feed (chronological vs. algorithmically assembled posts) affects usage behaviors. Against the background of the variable reward schedule, this study hypothesizes that the different orders exert serendipity differently. Serendipity, termed as unexpected lucky encounters with information, resembles variable rewards. Studies have evidenced a relation between variable rewards and excessive behaviors. Similarly, we hypothesize that order-induced serendipitous encounters affect SNS usage times and explore this link in a two-wave survey with an experimental setup (users using either chronological or algorithmic News Feeds). While theoretically extending explanations for increased SNS usage times by considering the News Feed order, practically the study will offer recommendations for relevant stakeholders.
Y1 - 2022
UR - https://aisel.aisnet.org/pacis2022/271
SN - 9781958200018
PB - AIS Electronic Library (AISeL)
CY - [place of publication not identified]
ER -
TY - CHAP
A1 - Abramova, Olga
A1 - Gundlach, Jana
A1 - Bilda, Juliane
T1 - Understanding the role of newsfeed clutter in stereotype activation
BT - the case of Facebook
T2 - PACIS 2021 proceedings
N2 - Despite the phenomenal growth of Big Data Analytics in the last few years, little research is done to explicate the relationship between Big Data Analytics Capability (BDAC) and indirect strategic value derived from such digital capabilities. We attempt to address this gap by proposing a conceptual model of the BDAC - Innovation relationship using dynamic capability theory. The work expands on BDAC business value research and extends the nominal research done on BDAC – innovation. We focus on BDAC's relationship with different innovation objects, namely product, business process, and business model innovation, impacting all value chain activities. The insights gained will stimulate academic and practitioner interest in explicating strategic value generated from BDAC and serve as a framework for future research on the subject.
Y1 - 2021
UR - https://aisel.aisnet.org/pacis2021/79
SN - 978-1-7336325-7-7
IS - 473
PB - AIS Electronic Library (AISeL)
CY - [place of publication not identified]
ER -
TY - JOUR
A1 - Ndashimye, Felix
A1 - Hebie, Oumarou
A1 - Tjaden, Jasper
T1 - Effectiveness of WhatsApp for measuring migration in follow-up phone surveys
BT - lessons from a mode experiment in two low-income countries during COVID contact restrictions
JF - Social science computer review
N2 - Phone surveys have increasingly become important data collection tools in developing countries, particularly in the context of sudden contact restrictions due to the COVID-19 pandemic. So far, there is limited evidence regarding the potential of the messenger service WhatsApp for remote data collection despite its large global coverage and expanding membership. WhatsApp may offer advantages in terms of reducing panel attrition and cutting survey costs. WhatsApp may offer additional benefits to migration scholars interested in cross-border migration behavior which is notoriously difficult to measure using conventional face-to-face surveys. In this field experiment, we compared the response rates between WhatsApp and interactive voice response (IVR) modes using a sample of 8446 contacts in Senegal and Guinea. At 12%, WhatsApp survey response rates were nearly eight percentage points lower than IVR survey response rates. However, WhatsApp offers higher survey completion rates, substantially lower costs and does not introduce more sample selection bias compared to IVR. We discuss the potential of WhatsApp surveys in low-income contexts and provide practical recommendations for field implementation.
KW - WhatsApp
KW - survey mode
KW - migration
KW - Covid
KW - phone
Y1 - 2022
U6 - https://doi.org/10.1177/08944393221111340
SN - 0894-4393
SN - 1552-8286
PB - Sage
CY - Thousand Oaks
ER -
TY - JOUR
A1 - Spiekermann, Sarah
A1 - Krasnova, Hanna
A1 - Hinz, Oliver
A1 - Baumann, Annika
A1 - Benlian, Alexander
A1 - Gimpel, Henner
A1 - Heimbach, Irina
A1 - Koester, Antonia
A1 - Maedche, Alexander
A1 - Niehaves, Bjoern
A1 - Risius, Marten
A1 - Trenz, Manuel
T1 - Values and ethics in information systems
BT - a state-of-the-art analysis and avenues for future research
JF - Business & information systems engineering
Y1 - 2022
U6 - https://doi.org/10.1007/s12599-021-00734-8
SN - 2363-7005
SN - 1867-0202
VL - 64
IS - 2
SP - 247
EP - 264
PB - Springer Gabler
CY - Wiesbaden
ER -
TY - JOUR
A1 - Cseh, Ágnes
A1 - Juhos, Attila
T1 - Pairwise preferences in the stable marriage problem
JF - ACM Transactions on Economics and Computation / Association for Computing Machinery
N2 - We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability (weak, strong, and super-stability) under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases.
KW - Stable marriage
KW - intransitivity
KW - acyclic preferences
KW - poset
KW - weakly stable matching
KW - strongly stable matching
KW - super stable matching
Y1 - 2021
U6 - https://doi.org/10.1145/3434427
SN - 2167-8375
SN - 2167-8383
VL - 9
IS - 1
PB - Association for Computing Machinery
CY - New York
ER -
TY - JOUR
A1 - Cseh, Ágnes
A1 - Kavitha, Telikepalli
T1 - Popular matchings in complete graphs
JF - Algorithmica : an international journal in computer science
N2 - Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M': here each vertex casts a vote for the matching in {M, M'} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity.
KW - Popular matching
KW - Complexity
KW - Stable matching
Y1 - 2021
U6 - https://doi.org/10.1007/s00453-020-00791-7
SN - 0178-4617
SN - 1432-0541
VL - 83
IS - 5
SP - 1493
EP - 1523
PB - Springer
CY - New York
ER -
TY - JOUR
A1 - Brede, Nuria
A1 - Botta, Nicola
T1 - On the correctness of monadic backward induction
JF - Journal of functional programming
N2 - In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman's backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as a solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper, we define a notion of correctness for monadic SDPs and identify three conditions that allow us to prove a correctness result for monadic backward induction that is comparable to textbook correctness proofs for ordinary backward induction. The conditions that we impose are fairly general and can be cast in category-theoretical terms using the notion of Eilenberg-Moore algebra. They hold in familiar settings like those of deterministic or stochastic SDPs, but we also give examples in which they fail. Our results show that backward induction can safely be employed for a broader class of SDPs than usually treated in textbooks. However, they also rule out certain instances that were considered admissible in the context of Botta et al.'s generic framework. Our development is formalised in Idris as an extension of the Botta et al. framework and the sources are available as supplementary material.
Y1 - 2021
U6 - https://doi.org/10.1017/S0956796821000228
SN - 1469-7653
SN - 0956-7968
VL - 31
PB - Cambridge University Press
CY - Cambridge
ER -
TY - JOUR
A1 - Stauffer, Maxime
A1 - Mengesha, Isaak
A1 - Seifert, Konrad
A1 - Krawczuk, Igor
A1 - Fischer, Jens
A1 - Serugendo, Giovanna Di Marzo
T1 - A computational turn in policy process studies
BT - coevolving network dynamics of policy change
JF - Complexity
N2 - The past three decades of policy process studies have seen the emergence of a clear intellectual lineage with regard to complexity. Implicitly or explicitly, scholars have employed complexity theory to examine the intricate dynamics of collective action in political contexts. However, the methodological counterparts to complexity theory, such as computational methods, are rarely used and, even if they are, they are often detached from established policy process theory. Building on a critical review of the application of complexity theory to policy process studies, we present and implement a baseline model of policy processes using the logic of coevolving networks. Our model suggests that an actor's influence depends on their environment and on exogenous events facilitating dialogue and consensus-building. Our results validate previous opinion dynamics models and generate novel patterns. Our discussion provides ground for further research and outlines the path for the field to achieve a computational turn.
Y1 - 2022
U6 - https://doi.org/10.1155/2022/8210732
SN - 1076-2787
SN - 1099-0526
VL - 2022
PB - Wiley-Hindawi
CY - London
ER -
TY - JOUR
A1 - Wendering, Philipp
A1 - Nikoloski, Zoran
T1 - COMMIT
BT - Consideration of metabolite leakage and community composition improves microbial community reconstructions
JF - PLoS Computational Biology : a new community journal / publ. by the Public Library of Science (PLoS) in association with the International Society for Computational Biology (ISCB)
N2 - Composition and functions of microbial communities affect important traits in diverse hosts, from crops to humans. Yet, mechanistic understanding of how metabolism of individual microbes is affected by the community composition and metabolite leakage is lacking. Here, we first show that the consensus of automatically generated metabolic reconstructions improves the quality of the draft reconstructions, measured by comparison to reference models. We then devise an approach for gap filling, termed COMMIT, that considers metabolites for secretion based on their permeability and the composition of the community. By applying COMMIT with two soil communities from the Arabidopsis thaliana culture collection, we could significantly reduce the gap-filling solution in comparison to filling gaps in individual reconstructions without affecting the genomic support. Inspection of the metabolic interactions in the soil communities allows us to identify microbes with community roles of helpers and beneficiaries. Therefore, COMMIT offers a versatile fully automated solution for large-scale modelling of microbial communities for diverse biotechnological applications.
Author summary: Microbial communities are important in ecology, human health, and crop productivity. However, detailed information on the interactions within natural microbial communities is hampered by the community size, lack of detailed information on the biochemistry of single organisms, and the complexity of interactions between community members. Metabolic models are comprised of biochemical reaction networks based on the genome annotation, and can provide mechanistic insights into community functions. Previous analyses of microbial community models have been performed with high-quality reference models or models generated using a single reconstruction pipeline. However, these models do not contain information on the composition of the community that determines the metabolites exchanged between the community members. In addition, the quality of metabolic models is affected by the reconstruction approach used, with direct consequences on the inferred interactions between community members. Here, we use fully automated consensus reconstructions from four approaches to arrive at functional models with improved genomic support while considering the community composition. We applied our pipeline to two soil communities from the Arabidopsis thaliana culture collection, providing only genome sequences. Finally, we show that the obtained models have 90% genomic support and demonstrate that the derived interactions are corroborated by independent computational predictions.
Y1 - 2022
U6 - https://doi.org/10.1371/journal.pcbi.1009906
SN - 1553-734X
SN - 1553-7358
VL - 18
IS - 3
PB - Public Library of Science
CY - San Francisco
ER -
TY - JOUR
A1 - Benlian, Alexander
A1 - Wiener, Martin
A1 - Cram, W. Alec
A1 - Krasnova, Hanna
A1 - Maedche, Alexander
A1 - Mohlmann, Mareike
A1 - Recker, Jan
A1 - Remus, Ulrich
T1 - Algorithmic management
BT - bright and dark sides, practical implications, and research opportunities
JF - Business & information systems engineering
Y1 - 2022
U6 - https://doi.org/10.1007/s12599-022-00764-w
SN - 2363-7005
SN - 1867-0202
VL - 64
IS - 6
SP - 825
EP - 839
PB - Springer Gabler
CY - Wiesbaden
ER -
TY - JOUR
A1 - Benson, Lawrence
A1 - Makait, Hendrik
A1 - Rabl, Tilmann
T1 - Viper
BT - An Efficient Hybrid PMem-DRAM Key-Value Store
JF - Proceedings of the VLDB Endowment
N2 - Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and the efficient sequential-write performance of PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance.
KW - memory
Y1 - 2021
U6 - https://doi.org/10.14778/3461535.3461543
SN - 2150-8097
VL - 14
IS - 9
SP - 1544
EP - 1556
PB - Association for Computing Machinery
CY - New York
ER -
TY - CHAP
A1 - Sultanow, Eldar
A1 - Chircu, Alina
A1 - Wüstemann, Stefanie
A1 - Schwan, André
A1 - Lehmann, Andreas
A1 - Sept, André
A1 - Szymaski, Oliver
A1 - Venkatesan, Sripriya
A1 - Ritterbusch, Georg David
A1 - Teichmann, Malte Rolf
T1 - Metaverse opportunities for the public sector
T2 - International Conference on Information Systems 2022 : Special Interest Group on Big Data : Proceedings
N2 - The metaverse is envisioned as a virtual shared space facilitated by emerging technologies such as virtual reality (VR), augmented reality (AR), the Internet of Things (IoT), 5G, artificial intelligence (AI), big data, spatial computing, and digital twins (Allam et al., 2022; Dwivedi et al., 2022; Ravenscraft, 2022; Wiles, 2022). While still a nascent concept, the metaverse has the potential to “transform the physical world, as well as transport or extend physical activities to a virtual world” (Wiles, 2022). Big data technologies will also be essential in managing the enormous amounts of data created in the metaverse (Sun et al., 2022). Metaverse technologies can offer the public sector a host of benefits, such as simplified information exchange, stronger communication with citizens, better access to public services, or benefiting from a new virtual economy. Implementations are underway in several cities around the world (Geraghty et al., 2022). In this paper, we analyze metaverse opportunities for the public sector and explore their application in the context of Germany’s Federal Employment Agency. Based on an analysis of academic literature and practical examples, we create a capability map for potential metaverse business capabilities for different areas of the public sector (broadly defined). 
These include education (virtual training and simulation, digital campuses that offer not just online instruction but a holistic university campus experience, etc.), tourism (virtual travel to remote locations and museums, virtual festival participation, etc.), health (employee training – as for emergency situations, virtual simulations for patient treatment – for example, for depression or anxiety, etc.), military (virtual training to experience operational scenarios without being exposed to real-world threats, practice strategic decision-making, or gain technical knowledge for operating and repairing equipment, etc.), administrative services (document processing, virtual consultations for citizens, etc.), judiciary (AI decision-making aids, virtual proceedings, etc.), public safety (virtual training for procedural issues, special operations, or unusual situations, etc.), emergency management (training for natural disasters, etc.), and city planning (visualization of future development projects and interactive feedback, traffic management, attraction gamification, etc.), among others. We further identify several metaverse application areas for Germany's Federal Employment Agency. These applications can help it realize the goals of the German government for digital transformation that enables faster, more effective, and innovative government services. They include training of employees, training of customers, and career coaching for customers. These applications can be implemented using interactive learning games with AI agents, virtual representations of the organizational spaces, and avatars interacting with each other in these spaces. Metaverse applications will both use big data (to design the virtual environments) and generate big data (from virtual interactions).
Issues related to data availability, quality, storage, processing (and related computing power requirements), interoperability, sharing, privacy and security will need to be addressed in these emerging metaverse applications (Sun et al., 2022). Special attention is needed to understand the potential for power inequities (wealth inequity, algorithmic bias, digital exclusion) due to technologies such as VR (Egliston & Carter, 2021), harmful surveillance practices (Bibri & Allam, 2022), and undesirable user behavior or negative psychological impacts (Dwivedi et al., 2022). The results of this exploratory study can inform public sector organizations of emerging metaverse opportunities and enable them to develop plans for action as more of the metaverse technologies become a reality. While the metaverse body of research is still small and research agendas are only now starting to emerge (Dwivedi et al., 2022), this study offers a building block for future development and analysis of metaverse applications.
Y1 - 2022
UR - https://aisel.aisnet.org/sigbd2022/5/
PB - AIS
CY - Atlanta
ER -
TY - CHAP
A1 - Krause, Hannes-Vincent
A1 - Baumann, Annika
T1 - The devil in disguise
BT - malicious envy’s impact on harmful interactions between social networking site users
T2 - ICIS 2021: user behaviors, engagement, and consequences
N2 - Envy constitutes a serious issue on Social Networking Sites (SNSs), as this painful emotion can severely diminish individuals' well-being. With prior research mainly focusing on the affective consequences of envy in the SNS context, its behavioral consequences remain puzzling. While negative interactions among SNS users are an alarming issue, it remains unclear to which extent the harmful emotion of malicious envy contributes to these toxic dynamics. This study constitutes a first step in understanding malicious envy’s causal impact on negative interactions within the SNS sphere. Within an online experiment, we experimentally induce malicious envy and measure its immediate impact on users’ negative behavior towards other users. Our findings show that malicious envy seems to be an essential factor fueling negativity among SNS users and further illustrate that this effect is especially pronounced when users are provided an objective factor to mask their envy and justify their norm-violating negative behavior.
Y1 - 2021
UR - https://aisel.aisnet.org/icis2021/user_behaivors/user_behaivors/21
PB - AIS Electronic Library (AISeL)
CY - [place of publication not identified]
ER -
TY - JOUR
A1 - Seewann, Lena
A1 - Verwiebe, Roland
A1 - Buder, Claudia
A1 - Fritsch, Nina-Sophie
T1 - “Broadcast your gender.”
BT - A comparison of four text-based classification methods of German YouTube channels
JF - Frontiers in Big Data
N2 - Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as sociodemographic characteristics of agents is often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. By using the example of a random sample of 200 YouTube channels, we compare several classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These different methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution will evaluate the share of identifiable channels, accuracy and meaningfulness of classification, as well as limits and benefits of each approach. We aim to address methodological challenges connected to classifying gender attributes for YouTube channels as well as related to reinforcing stereotypes and ethical implications.
KW - text based classification methods
KW - gender
KW - YouTube
KW - machine learning
KW - authorship attribution
Y1 - 2022
U6 - https://doi.org/10.3389/fdata.2022.908636
SN - 2624-909X
IS - 5
PB - Frontiers
CY - Lausanne, Switzerland
ER -
TY - JOUR
A1 - Wright, Michelle F.
A1 - Wachs, Sebastian
A1 - Harper, Bridgette D.
T1 - The moderation of empathy in the longitudinal association between witnessing cyberbullying, depression, and anxiety
JF - Journal of Psychosocial Research on Cyberspace
N2 - While the role of and consequences of being a bystander to face-to-face bullying have received some attention in the literature, to date, little is known about the effects of being a bystander to cyberbullying. It is also unknown how empathy might impact the negative consequences associated with being a bystander of cyberbullying. The present study focused on examining the longitudinal association between bystander of cyberbullying, depression, and anxiety, and the moderating role of empathy in the relationship between bystander of cyberbullying and subsequent depression and anxiety. There were 1,090 adolescents (M-age = 12.19; 50% female) from the United States included at Time 1, and they completed questionnaires on empathy, cyberbullying roles (bystander, perpetrator, victim), depression, and anxiety. One year later, at Time 2, 1,067 adolescents (M-age = 13.76; 51% female) completed questionnaires on depression and anxiety. Results revealed a positive association between bystander of cyberbullying and depression and anxiety. Further, empathy moderated the positive relationship between bystander of cyberbullying and depression, but not for anxiety. Implications for intervention and prevention programs are discussed.
KW - Bystander
KW - cyberbullying
KW - empathy
KW - depression
KW - anxiety
KW - longitudinal
Y1 - 2018
U6 - https://doi.org/10.5817/CP2018-4-6
SN - 1802-7962
VL - 12
IS - 4
PB - Masarykova Univ.
CY - Brno
ER -
TY - JOUR
A1 - Xu, Rudan
A1 - Razaghi-Moghadam, Zahra
A1 - Nikoloski, Zoran
T1 - Maximization of non-idle enzymes improves the coverage of the estimated maximal in vivo enzyme catalytic rates in Escherichia coli
JF - Bioinformatics
N2 - Motivation:
Constraint-based modeling approaches allow the estimation of maximal in vivo enzyme catalytic rates that can serve as proxies for enzyme turnover numbers. Yet, genome-scale flux profiling remains a challenge in deploying these approaches to catalogue proxies for enzyme catalytic rates across organisms.
Results:
Here, we formulate a constraint-based approach, termed NIDLE-flux, to estimate fluxes at a genome-scale level by using the principle of efficient usage of expressed enzymes. Using proteomics data from Escherichia coli, we show that the fluxes estimated by NIDLE-flux and the existing approaches are in excellent qualitative agreement (Pearson correlation > 0.9). We also find that the maximal in vivo catalytic rates estimated by NIDLE-flux exhibit a Pearson correlation of 0.74 with in vitro enzyme turnover numbers. However, NIDLE-flux results in a 1.4-fold increase in the size of the estimated maximal in vivo catalytic rates in comparison to the contenders. Integration of the maximum in vivo catalytic rates with publicly available proteomics and metabolomics data provides a better match to fluxes estimated by NIDLE-flux. Therefore, NIDLE-flux facilitates more effective usage of proteomics data to estimate proxies for kcatomes.
Y1 - 2021
U6 - https://doi.org/10.1093/bioinformatics/btab575
SN - 1367-4803
SN - 1460-2059
VL - 37
IS - 21
SP - 3848
EP - 3855
PB - Oxford Univ. Press
CY - Oxford
ER -
TY - JOUR
A1 - Angeleska, Angela
A1 - Omranian, Sara
A1 - Nikoloski, Zoran
T1 - Coherent network partitions
BT - Characterizations with cographs and prime graphs
JF - Theoretical computer science : the journal of the EATCS
N2 - We continue to study coherent partitions of graphs whereby the vertex set is partitioned into subsets that induce biclique spanned subgraphs. The problem of identifying the minimum number of edges to obtain biclique spanned connected components (CNP), called the coherence number, is NP-hard even on bipartite graphs. Here, we propose a graph transformation geared towards obtaining an O(log n)-approximation algorithm for the CNP on a bipartite graph with n vertices. The transformation is inspired by a new characterization of biclique spanned subgraphs. In addition, we study coherent partitions on prime graphs, and show that finding coherent partitions reduces to the problem of finding coherent partitions in a prime graph. Therefore, these results provide future directions for approximation algorithms for the coherence number of a given graph.
KW - Graph partitions
KW - Network clustering
KW - Cographs
KW - Coherent partition
KW - Prime graphs
Y1 - 2021
U6 - https://doi.org/10.1016/j.tcs.2021.10.002
SN - 0304-3975
VL - 894
SP - 3
EP - 11
PB - Elsevier
CY - Amsterdam [u.a.]
ER -
TY - JOUR
A1 - Chen, Junchao
A1 - Lange, Thomas
A1 - Andjelkovic, Marko
A1 - Simevski, Aleksandar
A1 - Lu, Li
A1 - Krstić, Miloš
T1 - Solar particle event and single event upset prediction from SRAM-based monitor and supervised machine learning
JF - IEEE transactions on emerging topics in computing / IEEE Computer Society, Institute of Electrical and Electronics Engineers
N2 - The intensity of cosmic radiation may differ over five orders of magnitude within a few hours or days during the Solar Particle Events (SPEs), thus increasing for several orders of magnitude the probability of Single Event Upsets (SEUs) in space-borne electronic systems. Therefore, it is vital to enable the early detection of the SEU rate changes in order to ensure timely activation of dynamic radiation hardening measures. In this paper, an embedded approach for the prediction of SPEs and SRAM SEU rate is presented. The proposed solution combines the real-time SRAM-based SEU monitor, the offline-trained machine learning model and online learning algorithm for the prediction. With respect to the state-of-the-art, our solution brings the following benefits: (1) Use of existing on-chip data storage SRAM as a particle detector, thus minimizing the hardware and power overhead, (2) Prediction of SRAM SEU rate one hour in advance, with the fine-grained hourly tracking of SEU variations during SPEs as well as under normal conditions, (3) Online optimization of the prediction model for enhancing the prediction accuracy during run-time, (4) Negligible cost of hardware accelerator design for the implementation of selected machine learning model and online learning algorithm. The proposed design is intended for a highly dependable and self-adaptive multiprocessing system employed in space applications, allowing to trigger the radiation mitigation mechanisms before the onset of high radiation levels.
KW - Machine learning
KW - Single event upsets
KW - Random access memory
KW - monitoring
KW - machine learning algorithms
KW - predictive models
KW - space missions
KW - solar particle event
KW - single event upset
KW - online learning
KW - hardware accelerator
KW - reliability
KW - self-adaptive multiprocessing system
Y1 - 2022
U6 - https://doi.org/10.1109/TETC.2022.3147376
SN - 2168-6750
VL - 10
IS - 2
SP - 564
EP - 580
PB - Institute of Electrical and Electronics Engineers
CY - [New York, NY]
ER -
TY - THES
A1 - Hosp, Sven
T1 - Modifizierte Cross-Party Codes zur schnellen Mehrbit-Fehlerkorrektur
Y1 - 2015
ER -
TY - JOUR
A1 - Taleb, Aiham
A1 - Rohrer, Csaba
A1 - Bergner, Benjamin
A1 - De Leon, Guilherme
A1 - Rodrigues, Jonas Almeida
A1 - Schwendicke, Falk
A1 - Lippert, Christoph
A1 - Krois, Joachim
T1 - Self-supervised learning methods for label-efficient dental caries classification
JF - Diagnostics : open access journal
N2 - High annotation costs are a substantial bottleneck in applying deep learning architectures to clinically relevant use cases, substantiating the need for algorithms to learn from unlabeled data.
In this work, we propose employing self-supervised methods. To that end, we trained with three self-supervised algorithms on a large corpus of unlabeled dental images, which contained 38K bitewing radiographs (BWRs). We then applied the learned neural network representations on tooth-level dental caries classification, for which we utilized labels extracted from electronic health records (EHRs). Finally, a holdout test-set was established, which consisted of 343 BWRs and was annotated by three dental professionals and approved by a senior dentist.
This test-set was used to evaluate the fine-tuned caries classification models. Our experimental results demonstrate the obtained gains by pretraining models using self-supervised algorithms. These include improved caries classification performance (6 p.p. increase in sensitivity) and, most importantly, improved label-efficiency.
In other words, the resulting models can be fine-tuned using few labels (annotations).
Our results show that using as few as 18 annotations can produce >= 45% sensitivity, which is comparable to human-level diagnostic performance.
This study shows that self-supervision can provide gains in medical image analysis, particularly when obtaining labels is costly.
KW - unsupervised methods
KW - self-supervised learning
KW - representation learning
KW - dental caries classification
KW - data driven approaches
KW - annotation
KW - efficient deep learning
Y1 - 2022
U6 - https://doi.org/10.3390/diagnostics12051237
SN - 2075-4418
VL - 12
IS - 5
PB - MDPI
CY - Basel
ER -
TY - CHAP
A1 - Vladova, Gergana
A1 - Ullrich, André
A1 - Sultanow, Eldar
A1 - Tobolla, Marinho
A1 - Sebrak, Sebastian
A1 - Czarnecki, Christian
A1 - Brockmann, Carsten
ED - Klein, Maike
ED - Krupka, Daniel
ED - Winter, Cornelia
ED - Wohlgemuth, Volker
T1 - Visual analytics for knowledge management
BT - advantages for organizations and interorganizational teams
T2 - Informatik 2023
N2 - The management of knowledge in organizations considers both established long-term processes and cooperation in agile project teams. Since knowledge can be both tacit and explicit, its transfer from the individual to the organizational knowledge base poses a challenge in organizations. This challenge increases when the fluctuation of knowledge carriers is exceptionally high. Especially in large projects in which external consultants are involved, there is a risk that critical, company-relevant knowledge generated in the project will leave the company with the external knowledge carrier and thus be lost. In this paper, we show the advantages of an early warning system for knowledge management to avoid this loss. In particular, the potential of visual analytics in the context of knowledge management systems is presented and discussed. We present a project for the development of a business-critical software system and discuss the first implementations and results.
KW - knowledge management
KW - visual analytics
KW - knowledge transfer
KW - teamwork
KW - knowledge management system
KW - tacit knowledge
KW - explicit knowledge
Y1 - 2023
SN - 978-3-88579-731-9
U6 - https://doi.org/10.18420/inf2023_187
SN - 1617-5468
SP - 1851
EP - 1870
PB - Gesellschaft für Informatik e.V. (GI)
CY - Bonn
ER -
TY - JOUR
A1 - Hagemann, Linus
A1 - Abramova, Olga
T1 - Emotions and information diffusion on social media
BT - a replication in the context of political communication on Twitter
JF - AIS transactions on replication research
N2 - This paper presents a methodological and conceptual replication of Stieglitz and Dang-Xuan’s (2013) investigation of the role of sentiment in information-sharing behavior on social media. Whereas Stieglitz and Dang-Xuan (2013) focused on Twitter communication prior to the state parliament elections in the German states Baden-Württemberg, Rheinland-Pfalz, and Berlin in 2011, we test their theoretical propositions in the context of the state parliament elections in Saxony-Anhalt (Germany) 2021. We confirm the positive link between sentiment in a political Twitter message and its number of retweets in a methodological replication. In a conceptual replication, where sentiment was assessed with the alternative dictionary-based tool LIWC, the sentiment was negatively associated with the retweet volume. In line with the original study, the strength of association between sentiment and retweet time lag insignificantly differs between tweets with negative sentiment and tweets with positive sentiment. We also found that the number of an author’s followers was an essential determinant of sharing behavior. However, two hypotheses supported in the original study did not hold for our sample. Precisely, the total amount of sentiments was insignificantly linked to the time lag to the first retweet. Finally, in our data, we do not observe that the association between the overall sentiment and retweet quantity is stronger for tweets with negative sentiment than for those with positive sentiment.
KW - Twitter
KW - information diffusion
KW - sentiment
KW - elections
Y1 - 2023
U6 - https://doi.org/10.17705/1atrr.00079
SN - 2473-3458
VL - 9
IS - 1
SP - 1
EP - 19
PB - AIS
CY - Atlanta
ER -
TY - JOUR
A1 - Steinrötter, Björn
T1 - Das Konzept einer datenaltruistischen Organisation
JF - Datenschutz und Datensicherheit
N2 - That technologies such as machine learning applications or big and smart data methods require data in sufficient quantity and quality now seems a truism. Against this background, the EU legislator in particular has recently discovered a new field of activity, attempting in various ways to create incentives for data sharing in order to foster innovation. This includes a constellation labeled, rather euphoniously, "data altruism". The article presents the related regulatory considerations at the supranational level and offers an initial analysis.
KW - coding and information theory
KW - computer science
KW - general
KW - cryptology
KW - data structures and information theory
Y1 - 2021
U6 - https://doi.org/10.1007/s11623-021-1539-6
SN - 1862-2607
SN - 1614-0702
VL - 45
IS - 12
SP - 794
EP - 798
PB - Springer
CY - Berlin
ER -
TY - JOUR
A1 - Puri, Manish
A1 - Varde, Aparna S.
A1 - Melo, Gerard de
T1 - Commonsense based text mining on urban policy
JF - Language resources and evaluation
N2 - Local laws on urban policy, i.e., ordinances, directly affect our daily life in various ways (health, business etc.), yet in practice, for many citizens they remain impervious and complex. This article focuses on an approach to make urban policy more accessible and comprehensible to the general public and to government officials, while also addressing pertinent social media postings. Due to the intricacies of the natural language, ranging from complex legalese in ordinances to informal lingo in tweets, it is practical to harness human judgment here. To this end, we mine ordinances and tweets via reasoning based on commonsense knowledge so as to better account for pragmatics and semantics in the text. Ours is pioneering work in ordinance mining, and thus there is no prior labeled training data available for learning. This gap is filled by commonsense knowledge, a prudent choice in situations involving a lack of adequate training data. The ordinance mining can be beneficial to the public in fathoming policies and to officials in assessing policy effectiveness based on public reactions. This work contributes to smart governance, leveraging transparency in governing processes via public involvement. We focus significantly on ordinances contributing to smart cities, hence an important goal is to assess how well an urban region heads towards a smart city as per its policies mapping with smart city characteristics, and the corresponding public satisfaction.
KW - Commonsense reasoning
KW - Opinion mining
KW - Ordinances
KW - Smart cities
KW - Social media
KW - Text mining
Y1 - 2022
U6 - https://doi.org/10.1007/s10579-022-09584-6
SN - 1574-020X
SN - 1574-0218
VL - 57
SP - 733
EP - 763
PB - Springer
CY - Dordrecht [u.a.]
ER -
TY - JOUR
A1 - Kurpiers, Jona
A1 - Neher, Dieter
T1 - Dispersive Non-Geminate Recombination in an Amorphous Polymer:Fullerene Blend
JF - Scientific reports
N2 - Recombination of free charge is a key process limiting the performance of solar cells. For low mobility materials, such as organic semiconductors, the kinetics of non-geminate recombination (NGR) is strongly linked to the motion of charges. As these materials possess significant disorder, thermalization of photogenerated carriers in the inhomogeneously broadened density of state distribution is an unavoidable process. Despite its general importance, knowledge about the kinetics of NGR in complete organic solar cells is rather limited. We employ time delayed collection field (TDCF) experiments to study the recombination of photogenerated charge in the high-performance polymer:fullerene blend PCDTBT:PCBM. NGR in the bulk of this amorphous blend is shown to be highly dispersive, with a continuous reduction of the recombination coefficient throughout the entire time scale, until all charge carriers have either been extracted or recombined. Rapid, contact-mediated recombination is identified as an additional loss channel, which, if not properly taken into account, would erroneously suggest a pronounced field dependence of charge generation. These findings are in stark contrast to the results of TDCF experiments on photovoltaic devices made from ordered blends, such as P3HT:PCBM, where non-dispersive recombination was proven to dominate the charge carrier dynamics under application relevant conditions.
Y1 - 2016
U6 - https://doi.org/10.1038/srep26832
SN - 2045-2322
VL - 6
PB - Nature Publishing Group
CY - London
ER -
TY - JOUR
A1 - Neher, Dieter
A1 - Kniepert, Juliane
A1 - Elimelech, Arik
A1 - Koster, L. Jan Anton
T1 - A New Figure of Merit for Organic Solar Cells with Transport-limited Photocurrents
JF - Scientific reports
N2 - Compared to their inorganic counterparts, organic semiconductors suffer from relatively low charge carrier mobilities. Therefore, expressions derived for inorganic solar cells to correlate characteristic performance parameters to material properties are prone to fail when applied to organic devices. This is especially true for the classical Shockley-equation commonly used to describe current-voltage (JV)-curves, as it assumes a high electrical conductivity of the charge transporting material. Here, an analytical expression for the JV-curves of organic solar cells is derived based on a previously published analytical model. This expression, bearing a similar functional dependence as the Shockley-equation, delivers a new figure of merit α to express the balance between free charge recombination and extraction in low mobility photoactive materials. This figure of merit is shown to determine critical device parameters such as the apparent series resistance and the fill factor.
KW - Electronic and spintronic devices
KW - Semiconductors
Y1 - 2016
U6 - https://doi.org/10.1038/srep24861
SN - 2045-2322
VL - 6
PB - Nature Publishing Group
CY - London
ER -
TY - JOUR
A1 - Schindler, Daniel
A1 - Moldenhawer, Ted
A1 - Stange, Maike
A1 - Lepro, Valentino
A1 - Beta, Carsten
A1 - Holschneider, Matthias
A1 - Huisinga, Wilhelm
T1 - Analysis of protrusion dynamics in amoeboid cell motility by means of regularized contour flows
JF - PLoS Computational Biology : a new community journal
N2 - Amoeboid cell motility is essential for a wide range of biological processes including wound healing, embryonic morphogenesis, and cancer metastasis. It relies on complex dynamical patterns of cell shape changes that pose long-standing challenges to mathematical modeling and raise a need for automated and reproducible approaches to extract quantitative morphological features from image sequences. Here, we introduce a theoretical framework and a computational method for obtaining smooth representations of the spatiotemporal contour dynamics from stacks of segmented microscopy images. Based on a Gaussian process regression we propose a one-parameter family of regularized contour flows that allows us to continuously track reference points (virtual markers) between successive cell contours. We use this approach to define a coordinate system on the moving cell boundary and to represent different local geometric quantities in this frame of reference. In particular, we introduce the local marker dispersion as a measure to identify localized membrane expansions and provide a fully automated way to extract the properties of such expansions, including their area and growth time. The methods are available as an open-source software package called AmoePy, a Python-based toolbox for analyzing amoeboid cell motility (based on time-lapse microscopy data), including a graphical user interface and detailed documentation. Due to the mathematical rigor of our framework, we envision it to be of use for the development of novel cell motility models. We mainly use experimental data of the social amoeba Dictyostelium discoideum to illustrate and validate our approach.
Author summary: Amoeboid motion is a crawling-like cell migration that plays a key role in multiple biological processes such as wound healing and cancer metastasis. This type of cell motility results from expanding and simultaneously contracting parts of the cell membrane. From fluorescence images, we obtain a sequence of points, representing the cell membrane, for each time step. By using regression analysis on these sequences, we derive smooth representations, so-called contours, of the membrane. Since the number of measurements is discrete and often limited, the question is raised of how to link consecutive contours with each other. In this work, we present a novel mathematical framework in which these links are described by regularized flows allowing a certain degree of concentration or stretching of neighboring reference points on the same contour. This stretching rate, the so-called local dispersion, is used to identify expansions and contractions of the cell membrane providing a fully automated way of extracting properties of these cell shape changes. We applied our methods to time-lapse microscopy data of the social amoeba Dictyostelium discoideum.
Y1 - 2021
U6 - https://doi.org/10.1371/journal.pcbi.1009268
SN - 1553-734X
SN - 1553-7358
VL - 17
IS - 8
PB - PLoS
CY - San Francisco
ER -
TY - JOUR
A1 - Panzer, Marcel
A1 - Bender, Benedict
A1 - Gronau, Norbert
T1 - Neural agent-based production planning and control
BT - an architectural review
JF - Journal of Manufacturing Systems
N2 - Nowadays, production planning and control must cope with mass customization, increased fluctuations in demand, and high competition pressures. Despite prevailing market risks, planning accuracy and increased adaptability in the event of disruptions or failures must be ensured, while simultaneously optimizing key process indicators. To manage that complex task, neural networks that can process large quantities of high-dimensional data in real time have been widely adopted in recent years. Although these are already extensively deployed in production systems, a systematic review of applications and implemented agent embeddings and architectures has not yet been conducted. The main contribution of this paper is to provide researchers and practitioners with an overview of applications and applied embeddings and to motivate further research in neural agent-based production. Findings indicate that neural agents are not only deployed in diverse applications, but are also increasingly implemented in multi-agent environments or in combination with conventional methods — leveraging performances compared to benchmarks and reducing dependence on human experience. This not only implies a more sophisticated focus on distributed production resources, but also broadening the perspective from a local to a global scale. Nevertheless, future research must further increase scalability and reproducibility to guarantee a simplified transfer of results to reality.
KW - production planning and control
KW - machine learning
KW - neural networks
KW - systematic literature review
KW - taxonomy
Y1 - 2022
U6 - https://doi.org/10.1016/j.jmsy.2022.10.019
SN - 0278-6125
SN - 1878-6642
VL - 65
SP - 743
EP - 766
PB - Elsevier
CY - Amsterdam
ER -
TY - JOUR
A1 - Monti, Remo
A1 - Rautenstrauch, Pia
A1 - Ghanbari, Mahsa
A1 - Rani James, Alva
A1 - Kirchler, Matthias
A1 - Ohler, Uwe
A1 - Konigorski, Stefan
A1 - Lippert, Christoph
T1 - Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes
JF - Nature Communications
N2 - Here we present an exome-wide rare genetic variant association study for 30 blood biomarkers in 191,971 individuals in the UK Biobank. We compare gene-based association tests for separate functional variant categories to increase interpretability and identify 193 significant gene-biomarker associations. Genes associated with biomarkers were ~4.5-fold enriched for conferring Mendelian disorders. In addition to performing weighted gene-based variant collapsing tests, we design and apply variant-category-specific kernel-based tests that integrate quantitative functional variant effect predictions for missense variants, splicing and the binding of RNA-binding proteins. For these tests, we present a computationally efficient combination of the likelihood-ratio and score tests that found 36% more associations than the score test alone while also controlling the type-1 error. Kernel-based tests identified 13% more associations than their gene-based collapsing counterparts and had advantages in the presence of gain of function missense variants. We introduce local collapsing by amino acid position for missense variants and use it to interpret associations and identify potential novel gain of function variants in PIEZO1. Our results show the benefits of investigating different functional mechanisms when performing rare-variant association tests, and demonstrate pervasive rare-variant contribution to biomarker variability.
Y1 - 2022
U6 - https://doi.org/10.1038/s41467-022-32864-2
SN - 2041-1723
VL - 13
PB - Nature Publishing Group UK
CY - London
ER -
TY - JOUR
A1 - Tavakoli, Hamad
A1 - Alirezazadeh, Pendar
A1 - Hedayatipour, Ava
A1 - Nasib, A. H. Banijamali
A1 - Landwehr, Niels
T1 - Leaf image-based classification of some common bean cultivars using discriminative convolutional neural networks
JF - Computers and electronics in agriculture : COMPAG online ; an international journal
N2 - In recent years, many efforts have been made to apply image processing techniques for plant leaf identification. However, categorizing leaf images at the cultivar/variety level, because of the very low inter-class variability, is still a challenging task. In this research, we propose an automatic discriminative method based on convolutional neural networks (CNNs) for classifying 12 different cultivars of common beans that belong to three different species. We show that employing advanced loss functions, such as Additive Angular Margin Loss and Large Margin Cosine Loss, instead of the standard softmax loss function for the classification can yield better discrimination between classes and thereby mitigate the problem of low inter-class variability. The method was evaluated by classifying species (level I), cultivars from the same species (level II), and cultivars from different species (level III), based on images from the leaf foreside and backside. The results indicate that the performance of the classification algorithm on the leaf backside image dataset is superior. The maximum mean classification accuracies of 95.86, 91.37 and 86.87% were obtained at the levels I, II and III, respectively. The proposed method outperforms the previous relevant works and provides a reliable approach for plant cultivar identification.
KW - Bean
KW - Plant identification
KW - Digital image analysis
KW - VGG16
KW - Loss functions
Y1 - 2021
U6 - https://doi.org/10.1016/j.compag.2020.105935
SN - 0168-1699
SN - 1872-7107
VL - 181
PB - Elsevier
CY - Amsterdam [u.a.]
ER -
TY - JOUR
A1 - Pfitzner, Bjarne
A1 - Steckhan, Nico
A1 - Arnrich, Bert
T1 - Federated learning in a medical context
BT - a systematic literature review
JF - ACM transactions on internet technology : TOIT / Association for Computing
N2 - Data privacy is a very important issue. Especially in fields like medicine, it is paramount to abide by the existing privacy regulations to preserve patients' anonymity. However, data is required for research and training machine learning models that could help gain insight into complex correlations or personalised treatments that may otherwise stay undiscovered. Those models generally scale with the amount of data available, but the current situation often prohibits building large databases across sites. So it would be beneficial to be able to combine similar or related data from different sites all over the world while still preserving data privacy. Federated learning has been proposed as a solution for this, because it relies on the sharing of machine learning models, instead of the raw data itself. That means private data never leaves the site or device it was collected on. Federated learning is an emerging research area, and many domains have been identified for the application of those methods. This systematic literature review provides an extensive look at the concept of and research into federated learning and its applicability for confidential healthcare datasets.
KW - Federated learning
Y1 - 2021
U6 - https://doi.org/10.1145/3412357
SN - 1533-5399
SN - 1557-6051
VL - 21
IS - 2
SP - 1
EP - 31
PB - Association for Computing Machinery
CY - New York
ER -
TY - JOUR
A1 - Garrels, Tim
A1 - Khodabakhsh, Athar
A1 - Renard, Bernhard Y.
A1 - Baum, Katharina
T1 - LazyFox: fast and parallelized overlapping community detection in large graphs
JF - PEERJ Computer Science
N2 - The detection of communities in graph datasets provides insight about a graph's underlying structure and is an important tool for various domains such as social sciences, marketing, traffic forecast, and drug discovery. While most existing algorithms provide fast approaches for community detection, their results usually contain strictly separated communities. However, most datasets would semantically allow for or even require overlapping communities that can only be determined at much higher computational cost. We build on an efficient algorithm, FOX, that detects such overlapping communities. FOX measures the closeness of a node to a community by approximating the count of triangles which that node forms with that community. We propose LAZYFOX, a multi-threaded adaptation of the FOX algorithm, which provides even faster detection without an impact on community quality. This allows for the analyses of significantly larger and more complex datasets. LAZYFOX enables overlapping community detection on complex graph datasets with millions of nodes and billions of edges in days instead of weeks. As part of this work, LAZYFOX's implementation was published and is available as a tool under an MIT licence at https://github.com/TimGarrels/LazyFox.
KW - Overlapping community detection
KW - Large networks
KW - Weighted clustering coefficient
KW - Heuristic triangle estimation
KW - Parallelized algorithm
KW - C++ tool
KW - Runtime improvement
KW - Open source
KW - Graph algorithm
KW - Community analysis
Y1 - 2023
U6 - https://doi.org/10.7717/peerj-cs.1291
SN - 2376-5992
VL - 9
PB - PeerJ Inc.
CY - London
ER -
TY - JOUR
A1 - Bonnet, Philippe
A1 - Dong, Xin Luna
A1 - Naumann, Felix
A1 - Tözün, Pınar
T1 - VLDB 2021
BT - Designing a hybrid conference
JF - SIGMOD record
N2 - The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
Y1 - 2021
U6 - https://doi.org/10.1145/3516431.3516447
SN - 0163-5808
SN - 1943-5835
VL - 50
IS - 4
SP - 50
EP - 53
PB - Association for Computing Machinery
CY - New York
ER -
TY - CHAP
A1 - Hagemann, Linus
A1 - Abramova, Olga
T1 - Crafting audience engagement in social media conversations
BT - evidence from the U.S. 2020 presidential elections
T2 - Proceedings of the 55th Hawaii International Conference on System Sciences
N2 - Observing inconsistent results in prior studies, this paper applies the elaboration likelihood model to investigate the impact of affective and cognitive cues embedded in social media messages on audience engagement during a political event. Leveraging a rich dataset in the context of the 2020 U.S. presidential elections containing more than 3 million tweets, we found the prominence of both cue types. For the overall sample, positivity and sentiment are negatively related to engagement. In contrast, the post-hoc sub-sample analysis of tweets from famous users shows that emotionally charged content is more engaging. The role of sentiment decreases when the number of followers grows and ultimately becomes insignificant for Twitter participants with a vast number of followers. Prosocial orientation (“we-talk”) is consistently associated with more likes, comments, and retweets in the overall sample and sub-samples.
KW - mediated conversation
KW - big data
KW - engagement
KW - sentiment analysis
KW - social media
Y1 - 2022
SN - 978-0-9981331-5-7
SP - 3222
EP - 3231
PB - HICSS Conference Office University of Hawaii at Manoa
CY - Honolulu
ER -
TY - CHAP
A1 - Abramova, Olga
T1 - Does a smile open all doors?
BT - understanding the impact of appearance disclosure on accommodation sharing platforms
T2 - Proceedings of the 53rd Hawaii International Conference on System Sciences
N2 - Online photographs govern an individual’s choices across a variety of contexts. In sharing arrangements, facial appearance has been shown to affect the desire to collaborate, interest to explore a listing, and even willingness to pay for a stay. Because of the ubiquity of online images and their influence on social attitudes, it seems crucial to be able to control these aspects. The present study examines the effect of different photographic self-disclosures on the provider’s perceptions and willingness to accept a potential co-sharer. The findings from our experiment in the accommodation-sharing context suggest social attraction mediates the effect of photographic self-disclosures on willingness to host. Implications of the results for IS research and practitioners are discussed.
KW - The Sharing Economy
KW - airbnb
KW - online photographs
KW - self-disclosure
KW - sharing economy
KW - social attraction
Y1 - 2020
SN - 978-0-9981331-3-3
SP - 831
EP - 840
PB - HICSS Conference Office University of Hawaii at Manoa
CY - Honolulu
ER -
TY - JOUR
A1 - Nguyen, Dong Hai Phuong
A1 - Georgie, Yasmin Kim
A1 - Kayhan, Ezgi
A1 - Eppe, Manfred
A1 - Hafner, Verena Vanessa
A1 - Wermter, Stefan
T1 - Sensorimotor representation learning for an "active self" in robots
BT - a model survey
JF - Künstliche Intelligenz : KI ; Forschung, Entwicklung, Erfahrungen ; Organ des Fachbereichs 1 Künstliche Intelligenz der Gesellschaft für Informatik e.V., GI / Fachbereich 1 der Gesellschaft für Informatik e.V
N2 - Safe human-robot interactions require robots to be able to learn how to behave appropriately in spaces populated by people and thus to cope with the challenges posed by our dynamic and unstructured environment, rather than being provided a rigid set of rules for operations. In humans, these capabilities are thought to be related to our ability to perceive our body in space, sensing the location of our limbs during movement, being aware of other objects and agents, and controlling our body parts to interact with them intentionally. Toward the next generation of robots with bio-inspired capacities, in this paper, we first review the developmental processes of underlying mechanisms of these abilities: The sensory representations of body schema, peripersonal space, and the active self in humans. Second, we provide a survey of robotics models of these sensory representations and robotics models of the self; and we compare these models with the human counterparts. Finally, we analyze what is missing from these robotics models and propose a theoretical computational framework, which aims to allow the emergence of the sense of self in artificial agents by developing sensory representations through self-exploration.
KW - Developmental robotics
KW - Body schema
KW - Peripersonal space
KW - Agency
KW - Robot learning
Y1 - 2021
U6 - https://doi.org/10.1007/s13218-021-00703-z
SN - 0933-1875
SN - 1610-1987
VL - 35
IS - 1
SP - 9
EP - 35
PB - Springer
CY - Berlin
ER -
TY - JOUR
A1 - Wiemker, Veronika
A1 - Bunova, Anna
A1 - Neufeld, Maria
A1 - Gornyi, Boris
A1 - Yurasova, Elena
A1 - Konigorski, Stefan
A1 - Kalinina, Anna
A1 - Kontsevaya, Anna
A1 - Ferreira-Borges, Carina
A1 - Probst, Charlotte
T1 - Pilot study to evaluate usability and acceptability of the 'Animated Alcohol Assessment Tool' in Russian primary healthcare
JF - Digital health
N2 - Background and aims: Accurate and user-friendly assessment tools quantifying alcohol consumption are a prerequisite to effective prevention and treatment programmes, including Screening and Brief Intervention. Digital tools offer new potential in this field. We developed the ‘Animated Alcohol Assessment Tool’ (AAA-Tool), a mobile app providing an interactive version of the World Health Organization's Alcohol Use Disorders Identification Test (AUDIT) that facilitates the description of individual alcohol consumption via culturally informed animation features. This pilot study evaluated the Russia-specific version of the Animated Alcohol Assessment Tool with regard to (1) its usability and acceptability in a primary healthcare setting, (2) the plausibility of its alcohol consumption assessment results and (3) the adequacy of its Russia-specific vessel and beverage selection. Methods: Convenience samples of 55 patients (47% female) and 15 healthcare practitioners (80% female) in 2 Russian primary healthcare facilities self-administered the Animated Alcohol Assessment Tool and rated their experience on the Mobile Application Rating Scale – User Version. Usage data was automatically collected during app usage, and additional feedback on regional content was elicited in semi-structured interviews. Results: On average, patients completed the Animated Alcohol Assessment Tool in 6:38 min (SD = 2.49, range = 3.00–17.16). User satisfaction was good, with all subscale Mobile Application Rating Scale – User Version scores averaging >3 out of 5 points. A majority of patients (53%) and practitioners (93%) would recommend the tool to ‘many people’ or ‘everyone’. Assessed alcohol consumption was plausible, with a low number (14%) of logically impossible entries. Most patients reported the Animated Alcohol Assessment Tool to reflect all vessels (78%) and all beverages (71%) they typically used. Conclusion: High acceptability ratings by patients and healthcare practitioners, acceptable completion time, plausible alcohol usage assessment results and perceived adequacy of region-specific content underline the Animated Alcohol Assessment Tool's potential to provide a novel approach to alcohol assessment in primary healthcare. After its validation, the Animated Alcohol Assessment Tool might contribute to reducing alcohol-related harm by facilitating Screening and Brief Intervention implementation in Russia and beyond.
KW - Alcohol use assessment
KW - Alcohol Use Disorders Identification Test
KW - screening tools
KW - digital health
KW - mobile applications
KW - Russia
KW - primary healthcare
KW - usability
KW - acceptability
Y1 - 2022
U6 - https://doi.org/10.1177/20552076211074491
SN - 2055-2076
VL - 8
PB - Sage Publications
CY - London
ER -
TY - JOUR
A1 - Omranian, Sara
A1 - Angeleska, Angela
A1 - Nikoloski, Zoran
T1 - PC2P
BT - parameter-free network-based prediction of protein complexes
JF - Bioinformatics
N2 - Motivation: Prediction of protein complexes from protein-protein interaction (PPI) networks is an important problem in systems biology, as they control different cellular functions. The existing solutions employ algorithms for network community detection that identify dense subgraphs in PPI networks. However, gold standards in yeast and human indicate that protein complexes can also induce sparse subgraphs, introducing further challenges in protein complex prediction. Results: To address this issue, we formalize protein complexes as biclique spanned subgraphs, which include both sparse and dense subgraphs. We then cast the problem of protein complex prediction as a network partitioning into biclique spanned subgraphs with removal of a minimum number of edges, called coherent partition. Since finding a coherent partition is a computationally intractable problem, we devise a parameter-free greedy approximation algorithm, termed Protein Complexes from Coherent Partition (PC2P), based on key properties of biclique spanned subgraphs. Through comparison with nine contenders, we demonstrate that PC2P: (i) successfully identifies modular structure in networks, as a prerequisite for protein complex prediction, (ii) outperforms the existing solutions with respect to a composite score of five performance measures on 75% and 100% of the analyzed PPI networks and gold standards in yeast and human, respectively, and (iii,iv) does not compromise GO semantic similarity and enrichment score of the predicted protein complexes. Therefore, our study demonstrates that clustering of networks in terms of biclique spanned subgraphs is a promising framework for detection of complexes in PPI networks.
Y1 - 2021
U6 - https://doi.org/10.1093/bioinformatics/btaa1089
SN - 1367-4811
VL - 37
IS - 1
SP - 73
EP - 81
PB - Oxford Univ. Press
CY - Oxford
ER -
TY - JOUR
A1 - Ulrich, Jens-Uwe
A1 - Lutfi, Ahmad
A1 - Rutzen, Kilian
A1 - Renard, Bernhard Y.
T1 - ReadBouncer
BT - precise and scalable adaptive sampling for nanopore sequencing
JF - Bioinformatics
N2 - Motivation: Nanopore sequencers allow targeted sequencing of interesting nucleotide sequences by rejecting other sequences from individual pores. This feature facilitates the enrichment of low-abundant sequences by depleting overrepresented ones in-silico. Existing tools for adaptive sampling either apply signal alignment, which cannot handle human-sized reference sequences, or apply read mapping in sequence space relying on fast graphics processing unit (GPU) base callers for real-time read rejection. Using nanopore long-read mapping tools is also not optimal when mapping shorter reads as usually analyzed in adaptive sampling applications. Results: Here, we present a new approach for nanopore adaptive sampling that combines fast CPU and GPU base calling with read classification based on Interleaved Bloom Filters. ReadBouncer improves the potential enrichment of low abundance sequences by its high read classification sensitivity and specificity, outperforming existing tools in the field. It robustly removes even reads belonging to large reference sequences while running on commodity hardware without GPUs, making adaptive sampling accessible for in-field researchers. ReadBouncer also provides a user-friendly interface and installer files for end-users without a bioinformatics background.
Y1 - 2022
U6 - https://doi.org/10.1093/bioinformatics/btac223
SN - 1367-4803
SN - 1367-4811
VL - 38
IS - SUPPL 1
SP - 153
EP - 160
PB - Oxford Univ. Press
CY - Oxford
ER -
TY - JOUR
A1 - Wittig, Alice
A1 - Miranda, Fabio Malcher
A1 - Hölzer, Martin
A1 - Altenburg, Tom
A1 - Bartoszewicz, Jakub Maciej
A1 - Beyvers, Sebastian
A1 - Dieckmann, Marius Alfred
A1 - Genske, Ulrich
A1 - Giese, Sven Hans-Joachim
A1 - Nowicka, Melania
A1 - Richard, Hugues
A1 - Schiebenhoefer, Henning
A1 - Schmachtenberg, Anna-Juliane
A1 - Sieben, Paul
A1 - Tang, Ming
A1 - Tembrockhaus, Julius
A1 - Renard, Bernhard Y.
A1 - Fuchs, Stephan
T1 - CovRadar
BT - continuously tracking and filtering SARS-CoV-2 mutations for genomic surveillance
JF - Bioinformatics
N2 - The ongoing pandemic caused by SARS-CoV-2 emphasizes the importance of genomic surveillance to understand the evolution of the virus, to monitor the viral population, and to plan epidemiological responses. Detailed analysis, easy visualization and intuitive filtering of the latest viral sequences are powerful for this purpose. We present CovRadar, a tool for genomic surveillance of the SARS-CoV-2 Spike protein. CovRadar consists of an analytical pipeline and a web application that enable the analysis and visualization of hundreds of thousands of sequences. First, CovRadar extracts the regions of interest using local alignment, then builds a multiple sequence alignment, infers variants and consensus and finally presents the results in an interactive app, making accessing and reporting simple, flexible and fast.
Y1 - 2022
U6 - https://doi.org/10.1093/bioinformatics/btac411
SN - 1367-4803
SN - 1367-4811
VL - 38
IS - 17
SP - 4223
EP - 4225
PB - Oxford Univ. Press
CY - Oxford
ER -
TY - JOUR
A1 - Trautmann, Justin
A1 - Zhou, Lin
A1 - Brahms, Clemens Markus
A1 - Tunca, Can
A1 - Ersoy, Cem
A1 - Granacher, Urs
A1 - Arnrich, Bert
T1 - TRIPOD
BT - A treadmill walking dataset with IMU, pressure-distribution and photoelectric data for gait analysis
JF - Data : open access ʻData in scienceʼ journal
N2 - Inertial measurement units (IMUs) enable easy-to-operate and low-cost data recording for gait analysis. When combined with treadmill walking, a large number of steps can be collected in a controlled environment without the need for a dedicated gait analysis laboratory. In order to evaluate existing and novel IMU-based gait analysis algorithms for treadmill walking, a reference dataset that includes IMU data as well as reliable ground truth measurements for multiple participants and walking speeds is needed. This article provides a reference dataset consisting of 15 healthy young adults who walked on a treadmill at three different speeds. Data were acquired using seven IMUs placed on the lower body, two different reference systems (Zebris FDMT-HQ and OptoGait), and two RGB cameras. Additionally, in order to validate an existing IMU-based gait analysis algorithm using the dataset, an adaptable modular data analysis pipeline was built. Our results show agreement between the pressure-sensitive Zebris and the photoelectric OptoGait system (r = 0.99), demonstrating the quality of our reference data. As a use case, the performance of an algorithm originally designed for overground walking was tested on treadmill data using the data pipeline. The accuracy of stride length and stride time estimations was comparable to that reported in other studies with overground data, indicating that the algorithm is equally applicable to treadmill data. The Python source code of the data pipeline is publicly available, and the dataset will be provided by the authors upon request, enabling future evaluations of IMU gait analysis algorithms without the need for recording new data.
KW - inertial measurement unit
KW - gait analysis algorithm
KW - OptoGait
KW - Zebris
KW - data pipeline
KW - public dataset
Y1 - 2021
U6 - https://doi.org/10.3390/data6090095
SN - 2306-5729
VL - 6
IS - 9
PB - MDPI
CY - Basel
ER -
TY - JOUR
A1 - Rosin, Paul L.
A1 - Lai, Yu-Kun
A1 - Mould, David
A1 - Yi, Ran
A1 - Berger, Itamar
A1 - Doyle, Lars
A1 - Lee, Seungyong
A1 - Li, Chuan
A1 - Liu, Yong-Jin
A1 - Semmo, Amir
A1 - Shamir, Ariel
A1 - Son, Minjung
A1 - Winnemöller, Holger
T1 - NPRportrait 1.0: A three-level benchmark for non-photorealistic rendering of portraits
JF - Computational visual media
N2 - Recently, there has been an upsurge of activity in image-based non-photorealistic rendering (NPR), and in particular portrait image stylisation, due to the advent of neural style transfer (NST). However, the state of performance evaluation in this field is poor, especially compared to the norms in the computer vision and machine learning communities. Unfortunately, the task of evaluating image stylisation is thus far not well defined, since it involves subjective, perceptual, and aesthetic aspects. To make progress towards a solution, this paper proposes a new structured, three-level, benchmark dataset for the evaluation of stylised portrait images. Rigorous criteria were used for its construction, and its consistency was validated by user studies. Moreover, a new methodology has been developed for evaluating portrait stylisation algorithms, which makes use of the different benchmark levels as well as annotations provided by user studies regarding the characteristics of the faces. We perform evaluation for a wide variety of image stylisation methods (both portrait-specific and general purpose, and also both traditional NPR approaches and NST) using the new benchmark dataset.
KW - non-photorealistic rendering (NPR)
KW - image stylization
KW - style transfer
KW - portrait
KW - evaluation
KW - benchmark
Y1 - 2022
U6 - https://doi.org/10.1007/s41095-021-0255-3
SN - 2096-0433
SN - 2096-0662
VL - 8
IS - 3
SP - 445
EP - 465
PB - Springer Nature
CY - London
ER -
TY - CHAP
A1 - Rojahn, Marcel
A1 - Gronau, Norbert
ED - Bui, Tung X.
T1 - Openness indicators for the evaluation of digital platforms between the launch and maturity phase
T2 - Proceedings of the 57th Annual Hawaii International Conference on System Sciences
N2 - In recent years, the evaluation of digital platforms has become an important focus in the field of information systems science. The identification of influential indicators that drive changes in digital platforms, specifically those related to openness, is still an unresolved issue. This paper addresses the challenge of identifying measurable indicators and characterizing the transition from launch to maturity in digital platforms. It proposes a systematic analytical approach to identify relevant openness indicators for evaluation purposes. The main contributions of this study are the following: (1) the development of a comprehensive procedure for analyzing indicators, (2) the categorization of indicators as evaluation metrics within a multidimensional grid-box model, (3) the selection and evaluation of relevant indicators, (4) the identification and assessment of digital platform architectures during the launch-to-maturity transition, and (5) the evaluation of the applicability of the conceptualization and design process for digital platform evaluation.
KW - federated industrial platform ecosystems
KW - technologies
KW - business models
KW - data-driven artifacts
KW - design-science research
KW - digital platform openness
KW - evaluation
KW - morphological analysis
Y1 - 2024
SN - 978-0-99813-317-1
SP - 4516
EP - 4525
PB - Department of IT Management Shidler College of Business University of Hawaii
CY - Honolulu, HI
ER -
TY - JOUR
A1 - Cabalar, Pedro
A1 - Fandiño, Jorge
A1 - Fariñas del Cerro, Luis
T1 - Splitting epistemic logic programs
JF - Theory and practice of logic programming / publ. for the Association for Logic Programming
N2 - Epistemic logic programs constitute an extension of the stable model semantics to deal with new constructs called subjective literals. Informally speaking, a subjective literal allows checking whether some objective literal is true in all or some stable models. As can be imagined, the associated semantics has proved to be non-trivial, since the truth of subjective literals may interfere with the set of stable models it is supposed to query. As a consequence, no clear agreement has been reached and different semantic proposals have been made in the literature. Unfortunately, comparison among these proposals has been limited to a study of their effect on individual examples, rather than identifying general properties to be checked. In this paper, we propose an extension of the well-known splitting property for logic programs to the epistemic case. We formally define when an arbitrary semantics satisfies the epistemic splitting property and examine some of the consequences that can be derived from that, including its relation to conformant planning and to epistemic constraints. Interestingly, we prove (through counterexamples) that most of the existing approaches fail to fulfill the epistemic splitting property, except the original semantics proposed by Gelfond (1991) and a recent proposal by the authors, called Founded Autoepistemic Equilibrium Logic.
KW - knowledge representation and nonmonotonic reasoning
KW - logic programming methodology and applications
KW - theory
Y1 - 2021
U6 - https://doi.org/10.1017/S1471068420000058
SN - 1471-0684
SN - 1475-3081
VL - 21
IS - 3
SP - 296
EP - 316
PB - Cambridge Univ. Press
CY - Cambridge [u.a.]
ER -
TY - JOUR
A1 - De Freitas, Jessica K.
A1 - Johnson, Kipp W.
A1 - Golden, Eddye
A1 - Nadkarni, Girish N.
A1 - Dudley, Joel T.
A1 - Böttinger, Erwin
A1 - Glicksberg, Benjamin S.
A1 - Miotto, Riccardo
T1 - Phe2vec
BT - Automated disease phenotyping based on unsupervised embeddings from electronic health records
JF - Patterns
N2 - Robust phenotyping of patients from electronic health records (EHRs) at scale is a challenge in clinical informatics. Here, we introduce Phe2vec, an automated framework for disease phenotyping from EHRs based on unsupervised learning, and assess its effectiveness against standard rule-based algorithms from Phenotype KnowledgeBase (PheKB). Phe2vec is based on pre-computing embeddings of medical concepts and patients' clinical history. Disease phenotypes are then derived from a seed concept and its neighbors in the embedding space. Patients are linked to a disease if their embedded representation is close to the disease phenotype. Comparing Phe2vec and PheKB cohorts head-to-head using chart review, Phe2vec performed on par or better in nine out of ten diseases. Unlike other approaches, it can scale to any condition and was validated against widely adopted expert-based standards. Phe2vec aims to optimize clinical informatics research by augmenting current frameworks to characterize patients by condition and derive reliable disease cohorts.
Y1 - 2021
U6 - https://doi.org/10.1016/j.patter.2021.100337
SN - 2666-3899
VL - 2
IS - 9
PB - Elsevier
CY - Amsterdam
ER -