Refine
Year of publication
- 2021 (68) (remove)
Document Type
- Article (43)
- Doctoral Thesis (13)
- Monograph/Edited Volume (6)
- Postprint (5)
- Conference Proceeding (1)
Language
- English (68)
Keywords
- MOOC (3)
- blockchain (3)
- business process management (3)
- machine learning (3)
- smart contracts (3)
- 3D-Visualisierung (2)
- Feature selection (2)
- Gene expression (2)
- OptoGait (2)
- Prior knowledge (2)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (68) (remove)
Anomaly detection in process mining aims to recognize outlying or unexpected behavior in event logs for purposes such as the removal of noise and identification of conformance violations. Existing techniques for this task are primarily frequency-based, arguing that behavior is anomalous because it is uncommon. However, such techniques ignore the semantics of recorded events and, therefore, do not take the meaning of potential anomalies into consideration. In this work, we overcome this caveat and focus on the detection of anomalies from a semantic perspective, arguing that anomalies can be recognized when process behavior does not make sense. To achieve this, we propose an approach that exploits the natural language associated with events. Our key idea is to detect anomalous process behavior by identifying semantically inconsistent execution patterns. To detect such patterns, we first automatically extract business objects and actions from the textual labels of events. We then compare these against a process-independent knowledge base. By populating this knowledge base with patterns from various kinds of resources, our approach can be used in a range of contexts and domains. We demonstrate the capability of our approach to successfully detect semantic execution anomalies through an evaluation based on a set of real-world and synthetic event logs and show the complementary nature of semantics-based anomaly detection to existing frequency-based techniques.
The noble way to substantiate decisions that affect many people is to ask these people for their opinions. For governments that run whole countries, this means asking all citizens for their views to consider their situations and needs.
Organizations such as Africa's Voices Foundation, who want to facilitate communication between decision-makers and citizens of a country, have difficulty mediating between these groups. To enable understanding, statements need to be summarized and visualized. Accomplishing these goals in a way that does justice to the citizens' voices and situations proves challenging. Standard charts do not help this cause as they fail to create empathy for the people behind their graphical abstractions. Furthermore, these charts do not create trust in the data they are representing as there is no way to see or navigate back to the underlying code and the original data. To fulfill these functions, visualizations would highly benefit from interactions to explore the displayed data, which standard charts often only limitedly provide.
To help improve the understanding of people's voices, we developed and categorized 80 ideas for new visualizations, new interactions, and better connections between different charts, which we present in this report. From those ideas, we implemented 10 prototypes and two systems that integrate different visualizations. We show that this integration allows consistent appearance and behavior of visualizations. The visualizations all share the same main concept: representing each individual with a single dot. To realize this idea, we discuss technologies that efficiently allow the rendering of a large number of these dots. With these visualizations, direct interactions with representations of individuals are achievable by clicking on them or by dragging a selection around them. This direct interaction is only possible with a bidirectional connection from the visualization to the data it displays. We discuss different strategies for bidirectional mappings and the trade-offs involved. Having unified behavior across visualizations enhances exploration. For our prototypes, that includes grouping, filtering, highlighting, and coloring of dots. Our prototyping work was enabled by the development environment Lively4. We explain which parts of Lively4 facilitated our prototyping process. Finally, we evaluate our approach to domain problems and our developed visualization concepts.
Our work provides inspiration and a starting point for visualization development in this domain. Our visualizations can improve communication between citizens and their government and motivate empathetic decisions. Our approach, combining low-level entities to create visualizations, provides value to an explorative and empathetic workflow. We show that the design space for visualizing this kind of data has a lot of potential and that it is possible to combine qualitative and quantitative approaches to data analysis.
In recent years, computer vision algorithms based on machine learning have seen rapid development. In the past, research mostly focused on solving computer vision problems such as image classification or object detection on images displaying natural scenes. Nowadays other fields such as the field of cultural heritage, where an abundance of data is available, also get into the focus of research. In the line of current research endeavours, we collaborated with the Getty Research Institute which provided us with a challenging dataset, containing images of paintings and drawings. In this technical report, we present the results of the seminar "Deep Learning for Computer Vision". In this seminar, students of the Hasso Plattner Institute evaluated state-of-the-art approaches for image classification, object detection and image recognition on the dataset of the Getty Research Institute. The main challenge when applying modern computer vision methods to the available data is the availability of annotated training data, as the dataset provided by the Getty Research Institute does not contain a sufficient amount of annotated samples for the training of deep neural networks. However, throughout the report we show that it is possible to achieve satisfying to very good results, when using further publicly available datasets, such as the WikiArt dataset, for the training of machine learning models.
Confidence Counts
(2021)
The increasing reliance on online learning in higher education has been further expedited by the on-going Covid-19 pandemic. Students need to be supported as they adapt to this new learning environment. Research has established that learners with positive online learning self-efficacy beliefs are more likely to persevere and achieve their higher education goals when learning online. In this paper, we explore how MOOC design can contribute to the four sources of self-efficacy beliefs posited by Bandura [4]. Specifically, we will explore, drawing on learner reflections, whether design elements of the MOOC, The Digital Edge: Essentials for the Online Learner, provided participants with the necessary mastery experiences, vicarious experiences, verbal persuasion, and affective regulation opportunities, to evaluate and develop their online learning self-efficacy beliefs. Findings from a content analysis of discussion forum posts show that learners referenced three of the four information sources when reflecting on their experience of the MOOC. This paper illustrates the potential of MOOCs as a pedagogical tool for enhancing online learning self-efficacy among students.
CrashNet
(2021)
Destructive car crash tests are an elaborate, time-consuming, and expensive necessity of the automotive development process. Today, finite element method (FEM) simulations are used to reduce costs by simulating car crashes computationally. We propose CrashNet, an encoder-decoder deep neural network architecture that reduces costs further and models specific outcomes of car crashes very accurately. We achieve this by formulating car crash events as time series prediction enriched with a set of scalar features. Traditional sequence-to-sequence models are usually composed of convolutional neural network (CNN) and CNN transpose layers. We propose to concatenate those with an MLP capable of learning how to inject the given scalars into the output time series. In addition, we replace the CNN transpose with 2D CNN transpose layers in order to force the model to process the hidden state of the set of scalars as one time series. The proposed CrashNet model can be trained efficiently and is able to process scalars and time series as input in order to infer the results of crash tests. CrashNet produces results faster and at a lower cost compared to destructive tests and FEM simulations. Moreover, it represents a novel approach in the car safety management domain.
Viper
(2021)
Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and efficient sequential-write performance PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance.
Viper
(2021)
Key-value stores (KVSs) have found wide application in modern software systems. For persistence, their data resides in slow secondary storage, which requires KVSs to employ various techniques to increase their read and write performance from and to the underlying medium. Emerging persistent memory (PMem) technologies offer data persistence at close-to-DRAM speed, making them a promising alternative to classical disk-based storage. However, simply drop-in replacing existing storage with PMem does not yield good results, as block-based access behaves differently in PMem than on disk and ignores PMem's byte addressability, layout, and unique performance characteristics. In this paper, we propose three PMem-specific access patterns and implement them in a hybrid PMem-DRAM KVS called Viper. We employ a DRAM-based hash index and a PMem-aware storage layout to utilize the random-write speed of DRAM and efficient sequential-write performance PMem. Our evaluation shows that Viper significantly outperforms existing KVSs for core KVS operations while providing full data persistence. Moreover, Viper outperforms existing PMem-only, hybrid, and disk-based KVSs by 4-18x for write workloads, while matching or surpassing their get performance.
TransPipe
(2021)
Online learning environments, such as Massive Open Online Courses (MOOCs), often rely on videos as a major component to convey knowledge. However, these videos exclude potential participants who do not understand the lecturer’s language, regardless of whether that is due to language unfamiliarity or aural handicaps. Subtitles and/or interactive transcripts solve this issue, ease navigation based on the content, and enable indexing and retrieval by search engines. Although there are several automated speech-to-text converters and translation tools, their quality varies and the process of integrating them can be quite tedious. Thus, in practice, many videos on MOOC platforms only receive subtitles after the course is already finished (if at all) due to a lack of resources. This work describes an approach to tackle this issue by providing a dedicated tool, which is closing this gap between MOOC platforms and transcription and translation tools and offering a simple workflow that can easily be handled by users with a less technical background. The proposed method is designed and evaluated by qualitative interviews with three major MOOC providers.
Learning During COVID-19
(2021)
During the COVID-19 pandemic, learning in higher education and beyond shifted en masse to online formats, with the short- and long-term consequences for Massive Open Online Course (MOOC) platforms, learners, and creators still under evaluation. In this paper, we sought to determine whether the COVID-19 pandemic and this shift to online learning led to increased learner engagement and attainment in a single introductory biology MOOC through evaluating enrollment, proportional and individual engagement, and verification and performance data. As this MOOC regularly operates each year, we compared these data collected from two course runs during the pandemic to three pre-pandemic runs. During the first pandemic run, the number and rate of learners enrolling in the course doubled when compared to prior runs, while the second pandemic run indicated a gradual return to pre-pandemic enrollment. Due to higher enrollment, more learners viewed videos, attempted problems, and posted to the discussion forums during the pandemic. Participants engaged with forums in higher proportions in both pandemic runs, but the proportion of participants who viewed videos decreased in the second pandemic run relative to the prior runs. A higher percentage of learners chose to pursue a certificate via the verified track in each pandemic run, though a smaller proportion earned certification in the second pandemic run. During the pandemic, more enrolled learners did not necessarily correlate to greater engagement by all metrics. While verified-track learner performance varied widely during each run, the effects of the pandemic were not uniform for learners, much like in other aspects of life. As such, individual engagement trends in the first pandemic run largely resemble pre-pandemic metrics but with more learners overall, while engagement trends in the second pandemic run are less like pre-pandemic metrics, hinting at learner “fatigue”. This study serves to highlight the life-long learning opportunity that MOOCs offer is even more critical when traditional education modes are disrupted and more people are at home or unemployed. This work indicates that this boom in MOOC participation may not remain at a high level for the longer term in any one course, but overall, the number of MOOCs, programs, and learners continues to grow.
VLDB 2021
(2021)
The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
MOOCs have been produced using a variety of instructional design approaches and frameworks. This paper presents experiences from the instructional approach based on the ADDIE model applied to designing and producing MOOCs in the Erasmus+ strategic partnership on Open Badge Ecosystem for Research Data Management (OBERRED). Specifically, this paper describes the case study of the production of the MOOC “Open Badges for Open Science”, delivered on the European MOOC platform EMMA. The key goal of this MOOC is to help learners develop a capacity to use Open Badges in the field of Research Data Management (RDM). To produce the MOOC, the ADDIE model was applied as a generic instructional design model and a systematic approach to the design and development following the five design phases: Analysis, Design, Development, Implementation, Evaluation. This paper outlines the MOOC production including methods, templates and tools used in this process including the interactive micro-content created with H5P in form of Open Educational Resources and digital credentials created with Open Badges and issued to MOOC participants upon successful completion of MOOC levels. The paper also outlines the results from qualitative evaluation, which applied the cognitive walkthrough methodology to elicit user requirements. The paper ends with conclusions about pros and cons of using the ADDIE model in MOOC production and formulates recommendations for further work in this area.
The COVID-19 pandemic emergency has forced a profound reshape of our lives. Our way of working and studying has been disrupted with the result of an acceleration of the shift to the digital world. To properly adapt to this change, we need to outline and implement new urgent strategies and approaches which put learning at the center, supporting workers and students to further develop “future proof” skills. In the last period, universities and educational institutions have demonstrated that they can play an important role in this context, also leveraging on the potential of Massive Open Online Courses (MOOCs) which proved to be an important vehicle of flexibility and adaptation in a general context characterised by several constraints. From March 2020 till now, we have witnessed an exponential growth of MOOCs enrollments numbers, with “traditional” students interested in different topics not necessarily integrated to their curricular studies. To support students and faculty development during the spreading of the pandemic, Politecnico di Milano focused on one main dimension: faculty development for a better integration of digital tools and contents in the e-learning experience. The current discussion focuses on how to improve the integration of MOOCs in the in-presence activities to create meaningful learning and teaching experiences, thereby leveraging blended learning approaches to engage both students and external stakeholders to equip them with future job relevance skills.
In the field of Business Process Management (BPM), modeling business processes and related data is a critical issue since process activities need to manage data stored in databases. The connection between processes and data is usually handled at the implementation level, even if modeling both processes and data at the conceptual level should help designers in improving business process models and identifying requirements for implementation. Especially in data -and decision-intensive contexts, business process activities need to access data stored both in databases and data warehouses. In this paper, we complete our approach for defining a novel conceptual view that bridges process activities and data. The proposed approach allows the designer to model the connection between business processes and database models and define the operations to perform, providing interesting insights on the overall connected perspective and hints for identifying activities that are crucial for decision support.
In the context of the Fostering Women to STEM MOOCs (FOSTWOM) project, we present here the general ideas of a gender balance Toolkit, i.e. a collection of recommendations and resources for instructional designers, visual designers, and teaching staff to apply while designing and preparing storyboards for MOOCs and their visual components, so that future STEM online courses have a greater chance to be more inclusive and gender-balanced. Overall, The FOSTWOM project intends to use the inclusive potential of Massive Open Online Courses to propose STEM subjects free of stereotyping assumptions on gender abilities. Moreover, the consortium is interested in attracting girls and young women to science and technology careers, through accessible online content, which can include role models’ interviews, relevant real-world situations, and strong conceptual frameworks.
We study the classical, two-sided stable marriage problem under pairwise preferences. In the most general setting, agents are allowed to express their preferences as comparisons of any two of their edges, and they also have the right to declare a draw or even withdraw from such a comparison. This freedom is then gradually restricted as we specify six stages of orderedness in the preferences, ending with the classical case of strictly ordered lists. We study all cases occurring when combining the three known notions of stability-weak, strong, and super-stability-under the assumption that each side of the bipartite market obtains one of the six degrees of orderedness. By designing three polynomial algorithms and two NP-completeness proofs, we determine the complexity of all cases not yet known and thus give an exact boundary in terms of preference structure between tractable and intractable cases.
Our input is a complete graph G on n vertices where each vertex has a strict ranking of all other vertices in G. The goal is to construct a matching in G that is popular. A matching M is popular if M does not lose a head-to-head election against any matching M ': here each vertex casts a vote for the matching in {M,M '} in which it gets a better assignment. Popular matchings need not exist in the given instance G and the popular matching problem is to decide whether one exists or not. The popular matching problem in G is easy to solve for odd n. Surprisingly, the problem becomes NP-complete for even n, as we show here. This is one of the few graph theoretic problems efficiently solvable when n has one parity and NP-complete when n has the other parity.
Universitat Politècnica de València’s Experience with EDX MOOC Initiatives During the Covid Lockdown
(2021)
In March 2020, when massive lockdowns started to be enforced around the world to contain the spread of the COVID-19 pandemic, edX launched two initiatives to help students around the world providing free certificates for its courses, RAP, for member institutions and OCE, for any accredited academic institution. In this paper we analyze how Universitat Poltècnica de València contributed with its courses to both initiatives, providing almost 14,000 free certificate codes in total, and how UPV used the RAP initiative as a customer, describing the mechanism used to distribute more than 22,000 codes for free certificates to more than 7,000 UPV community members, what led to the achievement of more than 5,000 free certificates. We also comment the results of a post initiative survey answered by 1,612 UPV members about 3,241 edX courses, in which they communicated a satisfaction of 4,69 over 5 with the initiative.
An ever-increasing number of prediction models is published every year in different medical specialties. Prognostic or diagnostic in nature, these models support medical decision making by utilizing one or more items of patient data to predict outcomes of interest, such as mortality or disease progression. While different computer tools exist that support clinical predictive modeling, I observed that the state of the art is lacking in the extent to which the needs of research clinicians are addressed. When it comes to model development, current support tools either 1) target specialist data engineers, requiring advanced coding skills, or 2) cater to a general-purpose audience, therefore not addressing the specific needs of clinical researchers. Furthermore, barriers to data access across institutional silos, cumbersome model reproducibility and extended experiment-to-result times significantly hampers validation of existing models. Similarly, without access to interpretable explanations, which allow a given model to be fully scrutinized, acceptance of machine learning approaches will remain limited. Adequate tool support, i.e., a software artifact more targeted at the needs of clinical modeling, can help mitigate the challenges identified with respect to model development, validation and interpretation. To this end, I conducted interviews with modeling practitioners in health care to better understand the modeling process itself and ascertain in what aspects adequate tool support could advance the state of the art. The functional and non-functional requirements identified served as the foundation for a software artifact that can be used for modeling outcome and risk prediction in health research. To establish the appropriateness of this approach, I implemented a use case study in the Nephrology domain for acute kidney injury, which was validated in two different hospitals. Furthermore, I conducted user evaluation to ascertain whether such an approach provides benefits compared to the state of the art and the extent to which clinical practitioners could benefit from it. Finally, when updating models for external validation, practitioners need to apply feature selection approaches to pinpoint the most relevant features, since electronic health records tend to contain several candidate predictors. Building upon interpretability methods, I developed an explanation-driven recursive feature elimination approach. This method was comprehensively evaluated against state-of-the art feature selection methods. Therefore, this thesis' main contributions are three-fold, namely, 1) designing and developing a software artifact tailored to the specific needs of the clinical modeling domain, 2) demonstrating its application in a concrete case in the Nephrology context and 3) development and evaluation of a new feature selection approach applicable in a validation context that builds upon interpretability methods. In conclusion, I argue that appropriate tooling, which relies on standardization and parametrization, can support rapid model prototyping and collaboration between clinicians and data scientists in clinical predictive modeling.
Despite advances in machine learning-based clinical prediction models, only few of such models are actually deployed in clinical contexts. Among other reasons, this is due to a lack of validation studies. In this paper, we present and discuss the validation results of a machine learning model for the prediction of acute kidney injury in cardiac surgery patients initially developed on the MIMIC-III dataset when applied to an external cohort of an American research hospital. To help account for the performance differences observed, we utilized interpretability methods based on feature importance, which allowed experts to scrutinize model behavior both at the global and local level, making it possible to gain further insights into why it did not behave as expected on the validation cohort. The knowledge gleaned upon derivation can be potentially useful to assist model update during validation for more generalizable and simpler models. We argue that interpretability methods should be considered by practitioners as a further tool to help explain performance differences and inform model update in validation studies.
Peer assessment in MOOCs
(2021)
We report on a systematic review of the landscape of peer assessment in massive open online courses (MOOCs) with papers from 2014 to 2020 in 20 leading education technology publication venues across four databases containing education technology-related papers, addressing three research issues: the evolution of peer assessment in MOOCs during the period 2014 to 2020, the methods used in MOOCs to assess peers, and the challenges of and future directions in MOOC peer assessment. We provide summary statistics and a review of methods across the corpus and highlight three directions for improving the use of peer assessment in MOOCs: the need for focusing on scaling learning through peer evaluations, the need for scaling and optimizing team submissions in team peer assessments, and the need for embedding a social process for peer assessment.
Massive Open Online Courses (MOOCs) offer online courses at low cost for anyone with an internet access. At its early days, the MOOC movement raised the flag of democratizing education, but soon enough, this utopian idea collided with the need to find sustainable business models. Moving from open access to a new financially sustainable certification and monetization policy in December 2015 we aim at this change-point and observe the completion rates before and after this monetary change. In this study we investigate the impact of the change on learners from countries of different development status. Our findings suggest that this change has lowered the completion rates among learners from developing countries, increasing gaps that already existed between global learners from countries of low and high development status. This suggests that more inclusive monetization policies may help MOOCs benefits to spread more equally among global learners.
The goal of this paper is to study the demand factors driving enrollment in massive open online courses. Using course level data from a French MOOC platform, we study the course, teacher and institution related characteristics that influence the enrollment decision of students, in a setting where enrollment is open to all students without administrative barriers. Coverage from social and traditional media done around the course is a key driver. In addition, the language of instruction and the (estimated) amount of work needed to complete the course also have a significant impact. The data also suggests that the presence of same-side externalities is limited. Finally, preferences of national and of international students tend to differ on several dimensions.
EMOOCs 2021
(2021)
From June 22 to June 24, 2021, Hasso Plattner Institute, Potsdam, hosted the seventh European MOOC Stakeholder Summit (EMOOCs 2021) together with the eighth ACM Learning@Scale Conference.
Due to the COVID-19 situation, the conference was held fully online.
The boost in digital education worldwide as a result of the pandemic was also one of the main topics of this year’s EMOOCs. All institutions of learning have been forced to transform and redesign their educational methods, moving from traditional models to hybrid or completely online models at scale. The learnings, derived from practical experience and research, have been explored in EMOOCs 2021 in six tracks and additional workshops, covering various aspects of this field. In this publication, we present papers from the conference’s Experience Track, the Policy Track, the Business Track, the International Track, and the Workshops.
Aside from providing instructional materials to the public, developing massive open online courses (MOOCs) can benefit institutions in different ways. Some examples include providing training opportunities for their students aspiring to work in the online learning space, strengthening its brand recognition through courses appealing to enthusiasts, and enabling online linkages with other universities. One such example is the monozukuri MOOC offered by the Tokyo Institute of Technology on edX, which initially presented the Japanese philosophy of making things in the context of a mechanical engineering course. In this paper, we describe the importance of involving a course development team with a diverse background. The monozukuri MOOC and its revision enabled us to showcase an otherwise distinctively Japanese topic (philosophy) as an intersection of various topics of interest to learners with an equally diverse background. The revision resulted in discussing monozukuri in a mechanical engineering lesson and how monozukuri is actively being practiced in the Japanese workplace and academic setting while juxtaposing it to the relatively Western concept of experiential learning. Aside from presenting the course with a broader perspective, the revision had been an exercise for its team members on working in a multicultural environment within a Japanese institution, thus developing their project management and communication skills.
There are a plethora of ways to guide and support people to learn about MOOC (massive open online course) development, from their first interest, sourcing supportive resources, methods and tools to better aid their understanding of the concepts and pedagogical approaches of MOOC design, to becoming a MOOC developer. This contribution highlights tools and methods that are openly available and re-usable under Creative Commons licenses. Our collection builds upon the experiences from three MOOC development and hosting teams with joint experiences of several hundred MOOCs (University of Applied Sciences in Lübeck, Graz University of Technology, University of Glasgow) in three European countries, which are Germany, Austria and the UK. The contribution recommends and shares experiences with short articles and poster for first information sharing a Monster MOOC assignment for beginners, a MOOC canvas for first sketches, the MOOC design kit for details of instructional design and a MOOC for MOOC makers and a MOOC map as introduction into a certain MOOC platform.
Clustering in education is important in identifying groups of objects in order to find linked patterns of correlations in educational datasets. As such, MOOCs provide a rich source of educational datasets which enable a wide selection of options to carry out clustering and an opportunity for cohort analyses. In this experience paper, five research studies on clustering in MOOCs are reviewed, drawing out several reasonings, methods, and students’ clusters that reflect certain kinds of learning behaviours. The collection of the varied clusters shows that each study identifies and defines clusters according to distinctive engagement patterns. Implications and a summary are provided at the end of the paper.
Recent trends in ubiquitous computing have led to a proliferation of studies that focus on human activity recognition (HAR) utilizing inertial sensor data that consist of acceleration, orientation and angular velocity. However, the performances of such approaches are limited by the amount of annotated training data, especially in fields where annotating data is highly time-consuming and requires specialized professionals, such as in healthcare. In image classification, this limitation has been mitigated by powerful oversampling techniques such as data augmentation. Using this technique, this work evaluates to what extent transforming inertial sensor data into movement trajectories and into 2D heatmap images can be advantageous for HAR when data are scarce. A convolutional long short-term memory (ConvLSTM) network that incorporates spatiotemporal correlations was used to classify the heatmap images. Evaluation was carried out on Deep Inertial Poser (DIP), a known dataset composed of inertial sensor data. The results obtained suggest that for datasets with large numbers of subjects, using state-of-the-art methods remains the best alternative. However, a performance advantage was achieved for small datasets, which is usually the case in healthcare. Moreover, movement trajectories provide a visual representation of human activities, which can help researchers to better interpret and analyze motion patterns.
The COVID-19 pandemic has accelerated the pace of digital transformation, which has forced people to quickly adapt to working and collaborating online. Learning in digital environments has without a doubt gained increased significance during this rather unique time and, therefore, Massive Open Online Courses (MOOCs) have more potential to attract a wider target audience. This has also brought about more possibilities for global collaboration among learners as learning is not limited to physical spaces. Despite the wide interest in MOOCs, there is a need for further research on the global collaboration potential they offer. The aim of this paper is to adopt an action research approach to study how a hybrid MOOC design enables learners’ global collaboration. During the years 2019–2020 together with an international consortium called Corship (Corporate Edupreneurship) we jointly designed, created and implemented a hybrid model MOOC, called the “Co-innovation Journey for Startups and Corporates”. It was targeted towards startup entrepreneurs, corporate representatives and higher education students and it was funded by the EU. The MOOC started with 2,438 enrolled learners and the completion rate for the first four weeks was 29.7%. Out of these 208 learners enrolled for the last two weeks, which in turn had a completion rate of 58%. These figures were clearly above the general average for MOOCs. According to our findings, we argue that a hybrid MOOC design may foster global collaboration within a learning community even beyond the course boundaries. The course included four weeks of independent learning, an xMOOC part, and two weeks of collaborative learning, a cMOOC part. The xMOOC part supported learners in creating a shared knowledge base, which enhanced the collaborative learning when entering the cMOOC part of the course.
In Systems Medicine, in addition to high-throughput molecular data (*omics), the wealth of clinical characterization plays a major role in the overall understanding of a disease. Unique problems and challenges arise from the heterogeneity of data and require new solutions to software and analysis methods. The SMART and EurValve studies establish a Systems Medicine approach to valvular heart disease -- the primary cause of subsequent heart failure.
With the aim to ascertain a holistic understanding, different *omics as well as the clinical picture of patients with aortic stenosis (AS) and mitral regurgitation (MR) are collected. Our task within the SMART consortium was to develop an IT platform for Systems Medicine as a basis for data storage, processing, and analysis as a prerequisite for collaborative research. Based on this platform, this thesis deals on the one hand with the transfer of the used Systems Biology methods to their use in the Systems Medicine context and on the other hand with the clinical and biomolecular differences of the two heart valve diseases. To advance differential expression/abundance (DE/DA) analysis software for use in Systems Medicine, we state 21 general software requirements and features of automated DE/DA software, including a novel concept for the simple formulation of experimental designs that can represent complex hypotheses, such as comparison of multiple experimental groups, and demonstrate our handling of the wealth of clinical data in two research applications DEAME and Eatomics. In user interviews, we show that novice users are empowered to formulate and test their multiple DE hypotheses based on clinical phenotype. Furthermore, we describe insights into users' general impression and expectation of the software's performance and show their intention to continue using the software for their work in the future. Both research applications cover most of the features of existing tools or even extend them, especially with respect to complex experimental designs. Eatomics is freely available to the research community as a user-friendly R Shiny application.
Eatomics continued to help drive the collaborative analysis and interpretation of the proteomic profile of 75 human left myocardial tissue samples from the SMART and EurValve studies. Here, we investigate molecular changes within the two most common types of valvular heart disease: aortic valve stenosis (AS) and mitral valve regurgitation (MR). Through DE/DA analyses, we explore shared and disease-specific protein alterations, particularly signatures that could only be found in the sex-stratified analysis. In addition, we relate changes in the myocardial proteome to parameters from clinical imaging. We find comparable cardiac hypertrophy but differences in ventricular size, the extent of fibrosis, and cardiac function. We find that AS and MR show many shared remodeling effects, the most prominent of which is an increase in the extracellular matrix and a decrease in metabolism. Both effects are stronger in AS. In muscle and cytoskeletal adaptations, we see a greater increase in mechanotransduction in AS and an increase in cortical cytoskeleton in MR. The decrease in proteostasis proteins is mainly attributable to the signature of female patients with AS. We also find relevant therapeutic targets.
In addition to the new findings, our work confirms several concepts from animal and heart failure studies by providing the largest collection of human tissue from in vivo collected biopsies to date. Our dataset contributing a resource for isoform-specific protein expression in two of the most common valvular heart diseases. Apart from the general proteomic landscape, we demonstrate the added value of the dataset by showing proteomic and transcriptomic evidence for increased expression of the SARS-CoV-2- receptor at pressure load but not at volume load in the left ventricle and also provide the basis of a newly developed metabolic model of the heart.
Patent document collections are an immense source of knowledge for research and innovation communities worldwide. The rapid growth of the number of patent documents poses an enormous challenge for retrieving and analyzing information from this source in an effective manner. Based on deep learning methods for natural language processing, novel approaches have been developed in the field of patent analysis. The goal of these approaches is to reduce costs by automating tasks that previously only domain experts could solve. In this article, we provide a comprehensive survey of the application of deep learning for patent analysis. We summarize the state-of-the-art techniques and describe how they are applied to various tasks in the patent domain. In a detailed discussion, we categorize 40 papers based on the dataset, the representation, and the deep learning architecture that were used, as well as the patent analysis task that was targeted. With our survey, we aim to foster future research at the intersection of patent analysis and deep learning and we conclude by listing promising paths for future work.
Smart contracts promise to reform the legal domain by automating clerical and procedural work, and minimizing the risk of fraud and manipulation. Their core idea is to draft contract documents in a way which allows machines to process them, to grasp the operational and non-operational parts of the underlying legal agreements, and to use tamper-proof code execution alongside established judicial systems to enforce their terms. The implementation of smart contracts has been largely limited by the lack of an adequate technological foundation which does not place an undue amount of trust in any contract party or external entity. Only recently did the emergence of Decentralized Applications (DApps) change this: Stored and executed via transactions on novel distributed ledger and blockchain networks, powered by complex integrity and consensus protocols, DApps grant secure computation and immutable data storage while at the same time eliminating virtually all assumptions of trust.
However, research on how to effectively capture, deploy, and most of all enforce smart contracts with DApps in mind is still in its infancy. Starting from the initial expression of a smart contract's intent and logic, to the operation of concrete instances in practical environments, to the limits of automatic enforcement---many challenges remain to be solved before a widespread use and acceptance of smart contracts can be achieved.
This thesis proposes a model-driven smart contract management approach to tackle some of these issues. A metamodel and semantics of smart contracts are presented, containing concepts such as legal relations, autonomous and non-autonomous actions, and their interplay. Guided by the metamodel, the notion and a system architecture of a Smart Contract Management System (SCMS) is introduced, which facilitates smart contracts in all phases of their lifecycle. Relying on DApps in heterogeneous multi-chain environments, the SCMS approach is evaluated by a proof-of-concept implementation showing both its feasibility and its limitations.
Further, two specific enforceability issues are explored in detail: The performance of fully autonomous tamper-proof behavior with external off-chain dependencies and the evaluation of temporal constraints within DApps, both of which are essential for smart contracts but challenging to support in the restricted transaction-driven and closed environment of blockchain networks. Various strategies of implementing or emulating these capabilities, which are ultimately applicable to all kinds of DApp projects independent of smart contracts, are presented and evaluated.
Which event happened first?
(2021)
First come, first served: Critical choices between alternative actions are often made based on events external to an organization, and reacting promptly to their occurrence can be a major advantage over the competition. In Business Process Management (BPM), such deferred choices can be expressed in process models, and they are an important aspect of process engines. Blockchain-based process execution approaches are no exception to this, but are severely limited by the inherent properties of the platform: The isolated environment prevents direct access to external entities and data, and the non-continual runtime based entirely on atomic transactions impedes the monitoring and detection of events. In this paper we provide an in-depth examination of the semantics of deferred choice, and transfer them to environments such as the blockchain. We introduce and compare several oracle architectures able to satisfy certain requirements, and show that they can be implemented using state-of-the-art blockchain technology.
First come, first served: Critical choices between alternative actions are often made based on events external to an organization, and reacting promptly to their occurrence can be a major advantage over the competition. In Business Process Management (BPM), such deferred choices can be expressed in process models, and they are an important aspect of process engines. Blockchain-based process execution approaches are no exception to this, but are severely limited by the inherent properties of the platform: The isolated environment prevents direct access to external entities and data, and the non-continual runtime based entirely on atomic transactions impedes the monitoring and detection of events. In this paper we provide an in-depth examination of the semantics of deferred choice, and transfer them to environments such as the blockchain. We introduce and compare several oracle architectures able to satisfy certain requirements, and show that they can be implemented using state-of-the-art blockchain technology.
In this paper, we take a closer look at the development of Massive Open Online Courses (MOOC) in Norway. We want to contribute to nuancing the image of a sound and sustainable policy for flexible and lifelong learning at national and institutional levels and point to some critical areas of improvement in higher education institutions (HEI). 10 semistructured qualitative interviews were carried out in the autumn 2020 at ten different HE institutions across Norway. The informants were strategically selected among employees involved in MOOC-technology, MOOCproduction and MOOC-support over a period of time stretching from 2010–2020. A main finding is that academics engaged in MOOCs find that their entrepreneurial ideas and results, to a large extent, are overlooked at higher institutional levels, and that progress is frustratingly slow. So far, there seems to be little common understanding of the MOOC-concept and the disruptive and transformative effect that MOOC-technology may have at HEIs. At national levels, digital strategies, funding and digital infrastructure are mainly provided in governmental silos. We suggest that governmental bodies and institutional stake holders pay more attention to entrepreneurial MOOC-initiatives to develop sustainability in flexible and lifelong learning in HEIs. This involves connecting the generous funding of digital projects to the provision of a national portal and platform for Open Access to education. To facilitate sustainable lifelong learning in and across HEIs, more quality control to enhance the legitimacy of MOOC certificates and micro-credentials is also a necessary measure.
As part of our everyday life we consume breaking news and interpret it based on our own viewpoints and beliefs. We have easy access to online social networking platforms and news media websites, where we inform ourselves about current affairs and often post about our own views, such as in news comments or social media posts. The media ecosystem enables opinions and facts to travel from news sources to news readers, from news article commenters to other readers, from social network users to their followers, etc. The views of the world many of us have depend on the information we receive via online news and social media. Hence, it is essential to maintain accurate, reliable and objective online content to ensure democracy and verity on the Web. To this end, we contribute to a trustworthy media ecosystem by analyzing news and social media in the context of politics to ensure that media serves the public interest. In this thesis, we use text mining, natural language processing and machine learning techniques to reveal underlying patterns in political news articles and political discourse in social networks.
Mainstream news sources typically cover a great amount of the same news stories every day, but they often place them in a different context or report them from different perspectives. In this thesis, we are interested in how distinct and predictable newspaper journalists are, in the way they report the news, as a means to understand and identify their different political beliefs. To this end, we propose two models that classify text from news articles to their respective original news source, i.e., reported speech and also news comments. Our goal is to capture systematic quoting and commenting patterns by journalists and news commenters respectively, which can lead us to the newspaper where the quotes and comments are originally published. Predicting news sources can help us understand the potential subjective nature behind news storytelling and the magnitude of this phenomenon. Revealing this hidden knowledge can restore our trust in media by advancing transparency and diversity in the news.
Media bias can be expressed in various subtle ways in the text and it is often challenging to identify these bias manifestations correctly, even for humans. However, media experts, e.g., journalists, are a powerful resource that can help us overcome the vague definition of political media bias and they can also assist automatic learners to find the hidden bias in the text. Due to the enormous technological advances in artificial intelligence, we hypothesize that identifying political bias in the news could be achieved through the combination of sophisticated deep learning modelsxi and domain expertise. Therefore, our second contribution is a high-quality and reliable news dataset annotated by journalists for political bias and a state-of-the-art solution for this task based on curriculum learning. Our aim is to discover whether domain expertise is necessary for this task and to provide an automatic solution for this traditionally manually-solved problem. User generated content is fundamentally different from news articles, e.g., messages are shorter, they are often personal and opinionated, they refer to specific topics and persons, etc. Regarding political and socio-economic news, individuals in online communities make use of social networks to keep their peers up-to-date and to share their own views on ongoing affairs. We believe that social media is also an as powerful instrument for information flow as the news sources are, and we use its unique characteristic of rapid news coverage for two applications. We analyze Twitter messages and debate transcripts during live political presidential debates to automatically predict the topics that Twitter users discuss. Our goal is to discover the favoured topics in online communities on the dates of political events as a way to understand the political subjects of public interest. With the up-to-dateness of microblogs, an additional opportunity emerges, namely to use social media posts and leverage the real-time verity about discussed individuals to find their locations.
That is, given a person of interest that is mentioned in online discussions, we use the wisdom of the crowd to automatically track her physical locations over time. We evaluate our approach in the context of politics, i.e., we predict the locations of US politicians as a proof of concept for important use cases, such as to track people that
are national risks, e.g., warlords and wanted criminals.
Modern knowledge bases contain and organize knowledge from many different topic areas. Apart from specific entity information, they also store information about their relationships amongst each other. Combining this information results in a knowledge graph that can be particularly helpful in cases where relationships are of central importance. Among other applications, modern risk assessment in the financial sector can benefit from the inherent network structure of such knowledge graphs by assessing the consequences and risks of certain events, such as corporate insolvencies or fraudulent behavior, based on the underlying network structure. As public knowledge bases often do not contain the necessary information for the analysis of such scenarios, the need arises to create and maintain dedicated domain-specific knowledge bases.
This thesis investigates the process of creating domain-specific knowledge bases from structured and unstructured data sources. In particular, it addresses the topics of named entity recognition (NER), duplicate detection, and knowledge validation, which represent essential steps in the construction of knowledge bases.
As such, we present a novel method for duplicate detection based on a Siamese neural network that is able to learn a dataset-specific similarity measure which is used to identify duplicates. Using the specialized network architecture, we design and implement a knowledge transfer between two deduplication networks, which leads to significant performance improvements and a reduction of required training data.
Furthermore, we propose a named entity recognition approach that is able to identify company names by integrating external knowledge in the form of dictionaries into the training process of a conditional random field classifier. In this context, we study the effects of different dictionaries on the performance of the NER classifier. We show that both the inclusion of domain knowledge as well as the generation and use of alias names results in significant performance improvements.
For the validation of knowledge represented in a knowledge base, we introduce Colt, a framework for knowledge validation based on the interactive quality assessment of logical rules. In its most expressive implementation, we combine Gaussian processes with neural networks to create Colt-GP, an interactive algorithm for learning rule models. Unlike other approaches, Colt-GP uses knowledge graph embeddings and user feedback to cope with data quality issues of knowledge bases. The learned rule model can be used to conditionally apply a rule and assess its quality.
Finally, we present CurEx, a prototypical system for building domain-specific knowledge bases from structured and unstructured data sources. Its modular design is based on scalable technologies, which, in addition to processing large datasets, ensures that the modules can be easily exchanged or extended. CurEx offers multiple user interfaces, each tailored to the individual needs of a specific user group and is fully compatible with the Colt framework, which can be used as part of the system.
We conduct a wide range of experiments with different datasets to determine the strengths and weaknesses of the proposed methods. To ensure the validity of our results, we compare the proposed methods with competing approaches.
The MOOC-CEDIA Observatory
(2021)
In the last few years, an important amount of Massive Open Online Courses (MOOCS) has been made available to the worldwide community, mainly by European and North American universities (i.e. United States). Since its emergence, the adoption of these educational resources has been widely studied by several research groups and universities with the aim of understanding their evolution and impact in educational models, through the time. In the case of Latin America, data from the MOOC-UC Observatory (updated until 2018) shows that, the adoption of these courses by universities in the region has been slow and heterogeneous. In the specific case of Ecuador, although some data is available, there is lack of information regarding the construction, publication and/or adoption of such courses by universities in the country. Moreover, there are not updated studies designed to identify and analyze the barriers and factors affecting the adoption of MOOCs in the country. The aim of this work is to present the MOOC-CEDIA Observatory, a web platform that offers interactive visualizations on the adoption of MOOCs in Ecuador. The main results of the study show that: (1) until 2020 there have been 99 MOOCs in Ecuador, (2) the domains of MOOCs are mostly related to applied sciences, social sciences and natural sciences, with the humanities being the least covered, (3) Open edX and Moodle are the most widely used platforms to deploy such courses. It is expected that the conclusions drawn from this analysis, will allow the design of recommendations aimed to promote the creation and use of quality MOOCs in Ecuador and help institutions to chart the route for their adoption, both for internal use by their community but also by society in general.
Virtualizing physical space
(2021)
The true cost for virtual reality is not the hardware, but the physical space it requires, as a one-to-one mapping of physical space to virtual space allows for the most immersive way of navigating in virtual reality. Such “real-walking” requires physical space to be of the same size and the same shape of the virtual world represented. This generally prevents real-walking applications from running on any space that they were not designed for.
To reduce virtual reality’s demand for physical space, creators of such applications let users navigate virtual space by means of a treadmill, altered mappings of physical to virtual space, hand-held controllers, or gesture-based techniques. While all of these solutions succeed at reducing virtual reality’s demand for physical space, none of them reach the same level of immersion that real-walking provides.
Our approach is to virtualize physical space: instead of accessing physical space directly, we allow applications to express their need for space in an abstract way, which our software systems then map to the physical space available. We allow real-walking applications to run in spaces of different size, different shape, and in spaces containing different physical objects. We also allow users immersed in different virtual environments to share the same space.
Our systems achieve this by using a tracking volume-independent representation of real-walking experiences — a graph structure that expresses the spatial and logical relationships between virtual locations, virtual elements contained within those locations, and user interactions with those elements. When run in a specific physical space, this graph representation is used to define a custom mapping of the elements of the virtual reality application and the physical space by parsing the graph using a constraint solver. To re-use space, our system splits virtual scenes and overlap virtual geometry. The system derives this split by means of hierarchically clustering of our virtual objects as nodes of our bi-partite directed graph that represents the logical ordering of events of the experience. We let applications express their demands for physical space and use pre-emptive scheduling between applications to have them share space. We present several application examples enabled by our system. They all enable real-walking, despite being mapped to physical spaces of different size and shape, containing different physical objects or other users.
We see substantial real-world impact in our systems. Today’s commercial virtual reality applications are generally designing to be navigated using less immersive solutions, as this allows them to be operated on any tracking volume. While this is a commercial necessity for the developers, it misses out on the higher immersion offered by real-walking. We let developers overcome this hurdle by allowing experiences to bring real-walking to any tracking volume, thus potentially bringing real-walking to consumers.
Die eigentlichen Kosten für Virtual Reality Anwendungen entstehen nicht primär durch die erforderliche Hardware, sondern durch die Nutzung von physischem Raum, da die eins-zu-eins Abbildung von physischem auf virtuellem Raum die immersivste Art von Navigation ermöglicht. Dieses als „Real-Walking“ bezeichnete Erlebnis erfordert hinsichtlich Größe und Form eine Entsprechung von physischem Raum und virtueller Welt. Resultierend daraus können Real-Walking-Anwendungen nicht an Orten angewandt werden, für die sie nicht entwickelt wurden.
Um den Bedarf an physischem Raum zu reduzieren, lassen Entwickler von Virtual Reality-Anwendungen ihre Nutzer auf verschiedene Arten navigieren, etwa mit Hilfe eines Laufbandes, verfälschten Abbildungen von physischem zu virtuellem Raum, Handheld-Controllern oder gestenbasierten Techniken. All diese Lösungen reduzieren zwar den Bedarf an physischem Raum, erreichen jedoch nicht denselben Grad an Immersion, den Real-Walking bietet.
Unser Ansatz zielt darauf, physischen Raum zu virtualisieren: Anstatt auf den physischen Raum direkt zuzugreifen, lassen wir Anwendungen ihren Raumbedarf auf abstrakte Weise formulieren, den unsere Softwaresysteme anschließend auf den verfügbaren physischen Raum abbilden. Dadurch ermöglichen wir Real-Walking-Anwendungen Räume mit unterschiedlichen Größen und Formen und Räume, die unterschiedliche physische Objekte enthalten, zu nutzen. Wir ermöglichen auch die zeitgleiche Nutzung desselben Raums durch mehrere Nutzer verschiedener Real-Walking-Anwendungen.
Unsere Systeme erreichen dieses Resultat durch eine Repräsentation von Real-Walking-Erfahrungen, die unabhängig sind vom gegebenen Trackingvolumen – eine Graphenstruktur, die die räumlichen und logischen Beziehungen zwischen virtuellen Orten, den virtuellen Elementen innerhalb dieser Orte, und Benutzerinteraktionen mit diesen Elementen, ausdrückt. Bei der Instanziierung der Anwendung in einem bestimmten physischen Raum wird diese Graphenstruktur und ein Constraint Solver verwendet, um eine individuelle Abbildung der virtuellen Elemente auf den physischen Raum zu erreichen. Zur mehrmaligen Verwendung des Raumes teilt unser System virtuelle Szenen und überlagert virtuelle Geometrie. Das System leitet diese Aufteilung anhand eines hierarchischen Clusterings unserer virtuellen Objekte ab, die als Knoten unseres bi-partiten, gerichteten Graphen die logische Reihenfolge aller Ereignisse repräsentieren. Wir verwenden präemptives Scheduling zwischen den Anwendungen für die zeitgleiche Nutzung von physischem Raum. Wir stellen mehrere Anwendungsbeispiele vor, die Real-Walking ermöglichen – in physischen Räumen mit unterschiedlicher Größe und Form, die verschiedene physische Objekte oder weitere Nutzer enthalten.
Wir sehen in unseren Systemen substantielles Potential. Heutige Virtual Reality-Anwendungen sind bisher zwar so konzipiert, dass sie auf einem beliebigen Trackingvolumen betrieben werden können, aber aus kommerzieller Notwendigkeit kein Real-Walking beinhalten. Damit entgeht Entwicklern die Gelegenheit eine höhere Immersion herzustellen. Indem wir es ermöglichen, Real-Walking auf jedes Trackingvolumen zu bringen, geben wir Entwicklern die Möglichkeit Real-Walking zu ihren Nutzern zu bringen.
Information technology and digital solutions as enablers in the tourism sector require continuous development of skills, as digital transformation is characterized by fast change, complexity and uncertainty. This research investigates how a cMOOC concept could support the tourism industry. A consortium of three universities, a tourism association, and a tourist attraction investigates online learning needs and habits of tourism industry stakeholders in the field of digitalization in a cross-border study in the Baltic Sea region. The multi-national survey (n = 244) reveals a high interest in participating in an online learning community, with two-thirds of respondents seeing opportunities to contributing to such community apart from consuming knowledge. The paper demonstrates preferred ways of learning, motivational and hampering aspects as well as types of possible contributions.
The formal modeling and analysis is of crucial importance for software development processes following the model based approach. We present the formalism of Interval Probabilistic Timed Graph Transformation Systems (IPTGTSs) as a high-level modeling language. This language supports structure dynamics (based on graph transformation), timed behavior (based on clocks, guards, resets, and invariants as in Timed Automata (TA)), and interval probabilistic behavior (based on Discrete Interval Probability Distributions). That is, for the probabilistic behavior, the modeler using IPTGTSs does not need to provide precise probabilities, which are often impossible to obtain, but rather provides a probability range instead from which a precise probability is chosen nondeterministically. In fact, this feature on capturing probabilistic behavior distinguishes IPTGTSs from Probabilistic Timed Graph Transformation Systems (PTGTSs) presented earlier.
Following earlier work on Interval Probabilistic Timed Automata (IPTA) and PTGTSs, we also provide an analysis tool chain for IPTGTSs based on inter-formalism transformations. In particular, we provide in our tool AutoGraph a translation of IPTGTSs to IPTA and rely on a mapping of IPTA to Probabilistic Timed Automata (PTA) to allow for the usage of the Prism model checker. The tool Prism can then be used to analyze the resulting PTA w.r.t. probabilistic real-time queries asking for worst-case and best-case probabilities to reach a certain set of target states in a given amount of time.
Proceedings of the HPI Research School on Service-oriented Systems Engineering 2020 Fall Retreat
(2021)
Design and Implementation of service-oriented architectures imposes a huge number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for design and realization of complex web-based system. Both approaches allow for dynamic application adaptation as well as integration of enterprise application.
Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns.
The annual Ph.D. Retreat of the Research School provides each member the opportunity to present his/her current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the research school, this technical report covers a wide range of topics. These include but are not limited to: Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; and Services Specification, Composition, and Enactment.
Graphs play an important role in many areas of Computer Science. In particular, our work is motivated by model-driven software development and by graph databases. For this reason, it is very important to have the means to express and to reason about the properties that a given graph may satisfy. With this aim, in this paper we present a visual logic that allows us to describe graph properties, including navigational properties, i.e., properties about the paths in a graph. The logic is equipped with a deductive tableau method that we have proved to be sound and complete.
Compound values are not universally supported in virtual machine (VM)-based programming systems and languages. However, providing data structures with value characteristics can be beneficial. On one hand, programming systems and languages can adequately represent physical quantities with compound values and avoid inconsistencies, for example, in representation of large numbers. On the other hand, just-in-time (JIT) compilers, which are often found in VMs, can rely on the fact that compound values are immutable, which is an important property in optimizing programs. Considering this, compound values have an optimization potential that can be put to use by implementing them in VMs in a way that is efficient in memory usage and execution time. Yet, optimized compound values in VMs face certain challenges: to maintain consistency, it should not be observable by the program whether compound values are represented in an optimized way by a VM; an optimization should take into account, that the usage of compound values can exhibit certain patterns at run-time; and that necessary value-incompatible properties due to implementation restrictions should be reduced.
We propose a technique to detect and compress common patterns of compound value usage at run-time to improve memory usage and execution speed. Our approach identifies patterns of frequent compound value references and introduces abbreviated forms for them. Thus, it is possible to store multiple inter-referenced compound values in an inlined memory representation, reducing the overhead of metadata and object references. We extend our approach by a notion of limited mutability, using cells that act as barriers for our approach and provide a location for shared, mutable access with the possibility of type specialization. We devise an extension to our approach that allows us to express automatic unboxing of boxed primitive data types in terms of our initial technique. We show that our approach is versatile enough to express another optimization technique that relies on values, such as Booleans, that are unique throughout a programming system. Furthermore, we demonstrate how to re-use learned usage patterns and optimizations across program runs, thus reducing the performance impact of pattern recognition.
We show in a best-case prototype that the implementation of our approach is feasible and can also be applied to general purpose programming systems, namely implementations of the Racket language and Squeak/Smalltalk. In several micro-benchmarks, we found that our approach can effectively reduce memory consumption and improve execution speed.
In an attempt to pave the way for more extensive Computer Science Education (CSE) coverage in K-12, this research developed and made a preliminary evaluation of a blended-learning Introduction to CS program based on an academic MOOC. Using an academic MOOC that is pedagogically effective and engaging, such a program may provide teachers with disciplinary scaffolds and allow them to focus their attention on enhancing students’ learning experience and nurturing critical 21st-century skills such as self-regulated learning. As we demonstrate, this enabled us to introduce an academic level course to middle-school students. In this research, we developed the principals and initial version of such a program, targeting ninth-graders in science-track classes who learn CS as part of their standard curriculum. We found that the middle-schoolers who participated in the program achieved academic results on par with undergraduate students taking this MOOC for academic credit. Participating students also developed a more accurate perception of the essence of CS as a scientific discipline. The unplanned school closure due to the COVID19 pandemic outbreak challenged the research but underlined the advantages of such a MOOCbased blended learning program above classic pedagogy in times of global or local crises that lead to school closure. While most of the science track classes seem to stop learning CS almost entirely, and the end-of-year MoE exam was discarded, the program’s classes smoothly moved to remote learning mode, and students continued to study at a pace similar to that experienced before the school shut down.
Comprior
(2021)
Background
Reproducible benchmarking is important for assessing the effectiveness of novel feature selection approaches applied on gene expression data, especially for prior knowledge approaches that incorporate biological information from online knowledge bases. However, no full-fledged benchmarking system exists that is extensible, provides built-in feature selection approaches, and a comprehensive result assessment encompassing classification performance, robustness, and biological relevance. Moreover, the particular needs of prior knowledge feature selection approaches, i.e. uniform access to knowledge bases, are not addressed. As a consequence, prior knowledge approaches are not evaluated amongst each other, leaving open questions regarding their effectiveness.
Results
We present the Comprior benchmark tool, which facilitates the rapid development and effortless benchmarking of feature selection approaches, with a special focus on prior knowledge approaches. Comprior is extensible by custom approaches, offers built-in standard feature selection approaches, enables uniform access to multiple knowledge bases, and provides a customizable evaluation infrastructure to compare multiple feature selection approaches regarding their classification performance, robustness, runtime, and biological relevance.
Conclusion
Comprior allows reproducible benchmarking especially of prior knowledge approaches, which facilitates their applicability and for the first time enables a comprehensive assessment of their effectiveness
Background
Reproducible benchmarking is important for assessing the effectiveness of novel feature selection approaches applied on gene expression data, especially for prior knowledge approaches that incorporate biological information from online knowledge bases. However, no full-fledged benchmarking system exists that is extensible, provides built-in feature selection approaches, and a comprehensive result assessment encompassing classification performance, robustness, and biological relevance. Moreover, the particular needs of prior knowledge feature selection approaches, i.e. uniform access to knowledge bases, are not addressed. As a consequence, prior knowledge approaches are not evaluated amongst each other, leaving open questions regarding their effectiveness.
Results
We present the Comprior benchmark tool, which facilitates the rapid development and effortless benchmarking of feature selection approaches, with a special focus on prior knowledge approaches. Comprior is extensible by custom approaches, offers built-in standard feature selection approaches, enables uniform access to multiple knowledge bases, and provides a customizable evaluation infrastructure to compare multiple feature selection approaches regarding their classification performance, robustness, runtime, and biological relevance.
Conclusion
Comprior allows reproducible benchmarking especially of prior knowledge approaches, which facilitates their applicability and for the first time enables a comprehensive assessment of their effectiveness
This paper aims to present the results of a higher education experience promoted by the research centres INTELLECT (University of Modena and Reggio Emilia) and CDM (University of Roma Tre), as part of difference master’s degrees programme of the academic years 2018/2019, 2019/2020, and 2020/2021. Through different online activities, 37 students attended and evaluated a MOOC on museum education content, such promoting their professionals and transverse skills, such as critical thinking, and developing their knowledge relative to OERs, within culture and heritage education contexts. Moreover, results from the online evaluation activities support the implementation of the MOOC in a collaborative way: during the academic years, evaluation data have been used by researcher to make changes to the course modules, thus realizing a more effective online path from and educational point of view.