004 Datenverarbeitung; Informatik
Refine
Year of publication
- 2021 (45) (remove)
Document Type
- Article (32)
- Monograph/Edited Volume (6)
- Doctoral Thesis (4)
- Postprint (2)
- Conference Proceeding (1)
Language
- English (45) (remove)
Keywords
- blockchain (2)
- business process management (2)
- business processes (2)
- cyber-physical systems (2)
- deferred choice (2)
- formal semantics (2)
- memory (2)
- oracles (2)
- probabilistic timed systems (2)
- qualitative Analyse (2)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (45) (remove)
If taking a flipped learning approach, MOOC content can be used for online pre-class instruction. After which students can put the knowledge they gained from the MOOC into practice either synchronously or asynchronously. This study examined one such, asynchronous, course in teacher education. The course ran with 40 students over 13 weeks from February to May 2020. A case study approach was followed using mixed methods to assess the efficacy of the course. Quantitative data was gathered on achievement of learning outcomes, online engagement, and satisfaction. Qualitative data was gathered via student interviews from which a thematic analysis was undertaken. From a combined analysis of the data, three themes emerged as pertinent to course efficacy: quality and quantity of communication and collaboration; suitability of the MOOC; and significance for career development.
3D point clouds are a universal and discrete digital representation of three-dimensional objects and environments. For geospatial applications, 3D point clouds have become a fundamental type of raw data acquired and generated using various methods and techniques. In particular, 3D point clouds serve as raw data for creating digital twins of the built environment.
This thesis concentrates on the research and development of concepts, methods, and techniques for preprocessing, semantically enriching, analyzing, and visualizing 3D point clouds for applications around transport infrastructure. It introduces a collection of preprocessing techniques that aim to harmonize raw 3D point cloud data, such as point density reduction and scan profile detection. Metrics such as, e.g., local density, verticality, and planarity are calculated for later use. One of the key contributions tackles the problem of analyzing and deriving semantic information in 3D point clouds. Three different approaches are investigated: a geometric analysis, a machine learning approach operating on synthetically generated 2D images, and a machine learning approach operating on 3D point clouds without intermediate representation.
In the first application case, 2D image classification is applied and evaluated for mobile mapping data focusing on road networks to derive road marking vector data. The second application case investigates how 3D point clouds can be merged with ground-penetrating radar data for a combined visualization and to automatically identify atypical areas in the data. For example, the approach detects pavement regions with developing potholes. The third application case explores the combination of a 3D environment based on 3D point clouds with panoramic imagery to improve visual representation and the detection of 3D objects such as traffic signs.
The presented methods were implemented and tested based on software frameworks for 3D point clouds and 3D visualization. In particular, modules for metric computation, classification procedures, and visualization techniques were integrated into a modular pipeline-based C++ research framework for geospatial data processing, extended by Python machine learning scripts. All visualization and analysis techniques scale to large real-world datasets such as road networks of entire cities or railroad networks.
The thesis shows that some use cases allow taking advantage of established image vision methods to analyze images rendered from mobile mapping data efficiently. The two presented semantic classification methods working directly on 3D point clouds are use case independent and show similar overall accuracy when compared to each other. While the geometry-based method requires less computation time, the machine learning-based method supports arbitrary semantic classes but requires training the network with ground truth data. Both methods can be used in combination to gradually build this ground truth with manual corrections via a respective annotation tool.
This thesis contributes results for IT system engineering of applications, systems, and services that require spatial digital twins of transport infrastructure such as road networks and railroad networks based on 3D point clouds as raw data. It demonstrates the feasibility of fully automated data flows that map captured 3D point clouds to semantically classified models. This provides a key component for seamlessly integrated spatial digital twins in IT solutions that require up-to-date, object-based, and semantically enriched information about the built environment.
TRIPOD
(2021)
Inertial measurement units (IMUs) enable easy to operate and low-cost data recording for gait analysis. When combined with treadmill walking, a large number of steps can be collected in a controlled environment without the need of a dedicated gait analysis laboratory. In order to evaluate existing and novel IMU-based gait analysis algorithms for treadmill walking, a reference dataset that includes IMU data as well as reliable ground truth measurements for multiple participants and walking speeds is needed. This article provides a reference dataset consisting of 15 healthy young adults who walked on a treadmill at three different speeds. Data were acquired using seven IMUs placed on the lower body, two different reference systems (Zebris FDMT-HQ and OptoGait), and two RGB cameras. Additionally, in order to validate an existing IMU-based gait analysis algorithm using the dataset, an adaptable modular data analysis pipeline was built. Our results show agreement between the pressure-sensitive Zebris and the photoelectric OptoGait system (r = 0.99), demonstrating the quality of our reference data. As a use case, the performance of an algorithm originally designed for overground walking was tested on treadmill data using the data pipeline. The accuracy of stride length and stride time estimations was comparable to that reported in other studies with overground data, indicating that the algorithm is equally applicable to treadmill data. The Python source code of the data pipeline is publicly available, and the dataset will be provided by the authors upon request, enabling future evaluations of IMU gait analysis algorithms without the need of recording new data.
Generative adversarial networks (GANs) have been broadly applied to a wide range of application domains since their proposal. In this thesis, we propose several methods that aim to tackle different existing problems in GANs. Particularly, even though GANs are generally able to generate high-quality samples, the diversity of the generated set is often sub-optimal. Moreover, the common increase of the number of models in the original GANs framework, as well as their architectural sizes, introduces additional costs. Additionally, even though challenging, the proper evaluation of a generated set is an important direction to ultimately improve the generation process in GANs. We start by introducing two diversification methods that extend the original GANs framework to multiple adversaries to stimulate sample diversity in a generated set. Then, we introduce a new post-training compression method based on Monte Carlo methods and importance sampling to quantize and prune the weights and activations of pre-trained neural networks without any additional training. The previous method may be used to reduce the memory and computational costs introduced by increasing the number of models in the original GANs framework. Moreover, we use a similar procedure to quantize and prune gradients during training, which also reduces the communication costs between different workers in a distributed training setting. We introduce several topology-based evaluation methods to assess data generation in different settings, namely image generation and language generation. Our methods retrieve both single-valued and double-valued metrics, which, given a real set, may be used to broadly assess a generated set or separately evaluate sample quality and sample diversity, respectively. Moreover, two of our metrics use locality-sensitive hashing to accurately assess the generated sets of highly compressed GANs. The analysis of the compression effects in GANs paves the way for their efficient employment in real-world applications. Given their general applicability, the methods proposed in this thesis may be extended beyond the context of GANs. Hence, they may be generally applied to enhance existing neural networks and, in particular, generative frameworks.
CoFeeMOOC-v.2
(2021)
Providing adequate support to MOOC participants is often a challenging task due to massiveness of the learners’ population and the asynchronous communication among peers and MOOC practitioners. This workshop aims at discussing common learners’ problems reported in the literature and reflect on designing adequate feedback interventions with the use of learning data. Our aim is three-fold: a) to pinpoint MOOC aspects that impact the planning of feedback, b) to explore the use of learning data in designing feedback strategies, and c) to propose design guidelines for developing and delivering scaffolding interventions for personalized feedback in MOOCs. To do so, we will carry out hands-on activities that aim to involve participants in interpreting learning data and using them to design adaptive feedback. This workshop appeals to researchers, practitioners and MOOC stakeholders who aim to providing contextualized scaffolding. We envision that this workshop will provide insights for bridging the gap between pedagogical theory and practice when it comes to feedback interventions in MOOCs.
Developing highly skilled researchers is essential to accelerate the economic progress of developing countries such as Cambodia in South East Asia. While there is continuing research investigating Cambodia’s potential to cultivate such a workforce, the circumstances of undergraduate students in public provincial universities do not receive ample attention. This is crucial as numerous multinational corporations are participating via foreign direct investments in special economic zones at the border provinces and need talented human resources in Cambodia as well as in neighboring Southeast Asian countries such as Thailand and Vietnam. Student’s research capability growth starts with one’s belief in their capacity to use the necessary information tools and their potential to succeed in research. In this research paper, we look at how such beliefs, specifically research self-efficacy and information literacy, can be developed through a short-term intervention that uses MOOCs and assess their long-term effects. Our previous research has shown that short-term training intervention has immediate positive effects on the undergraduate students’ self-efficacies in Cambodian public provincial universities. In this paper, we present the follow-up study results conducted sixteen months after the said short-term training intervention. Results reveal that from follow-up evaluations that while student’s self-efficacies were significantly higher than before the short-term intervention was completed, they were lower than immediately after the intervention. Thus, while perfunctory interventions such as merely introducing the students to MOOCs and other relevant research tools over as little as three weeks can have significant positive effects, efforts must be made to sustain the benefits gained. This implication is essential to developing countries such as Cambodia that need low-cost solutions with immediate positive results in developing human resources to conduct research, particularly in areas far from more developed capital cities.
Crochet is a popular handcraft all over the world. While other techniques such as knitting or weaving have received technical support over the years through machines, crochet is still a purely manual craft. Not just the act of crochet itself is manual but also the process of creating instructions for new crochet patterns, which is barely supported by domain specific digital solutions. This leads to unstructured and often also ambiguous and erroneous pattern instructions. In this report, we propose a concept to digitally represent crochet patterns. This format incorporates crochet techniques which allows domain specific support for crochet pattern designers during the pattern creation and instruction writing process. As contributions, we present a thorough domain analysis, the concept of a graph structure used as domain specific language to specify crochet patterns and a prototype of a projectional editor using the graph as representation format of patterns and a diagramming system to visualize them in 2D and 3D. By analyzing the domain, we learned about crochet techniques and pain points of designers in their pattern creation workflow. These insights are the basis on which we defined the pattern representation. In order to evaluate our concept, we built a prototype by which the feasibility of the concept is shown and we tested the software with professional crochet designers who approved of the concept.
Cyber-physical systems often encompass complex concurrent behavior with timing constraints and probabilistic failures on demand. The analysis whether such systems with probabilistic timed behavior adhere to a given specification is essential. When the states of the system can be represented by graphs, the rule-based formalism of Probabilistic Timed Graph Transformation Systems (PTGTSs) can be used to suitably capture structure dynamics as well as probabilistic and timed behavior of the system. The model checking support for PTGTSs w.r.t. properties specified using Probabilistic Timed Computation Tree Logic (PTCTL) has been already presented. Moreover, for timed graph-based runtime monitoring, Metric Temporal Graph Logic (MTGL) has been developed for stating metric temporal properties on identified subgraphs and their structural changes over time. In this paper, we (a) extend MTGL to the Probabilistic Metric Temporal Graph Logic (PMTGL) by allowing for the specification of probabilistic properties, (b) adapt our MTGL satisfaction checking approach to PTGTSs, and (c) combine the approaches for PTCTL model checking and MTGL satisfaction checking to obtain a Bounded Model Checking (BMC) approach for PMTGL. In our evaluation, we apply an implementation of our BMC approach in AutoGraph to a running example.
Learning analytics at scale
(2021)
Digital technologies are paving the way for innovative educational approaches. The learning format of Massive Open Online Courses (MOOCs) provides a highly accessible path to lifelong learning while being more affordable and flexible than face-to-face courses. Thereby, thousands of learners can enroll in courses mostly without admission restrictions, but this also raises challenges. Individual supervision by teachers is barely feasible, and learning persistence and success depend on students' self-regulatory skills. Here, technology provides the means for support. The use of data for decision-making is already transforming many fields, whereas in education, it is still a young research discipline. Learning Analytics (LA) is defined as the measurement, collection, analysis, and reporting of data about learners and their learning contexts with the purpose of understanding and improving learning and learning environments. The vast amount of data that MOOCs produce on the learning behavior and success of thousands of students provides the opportunity to study human learning and develop approaches addressing the demands of learners and teachers.
The overall purpose of this dissertation is to investigate the implementation of LA at the scale of MOOCs and to explore how data-driven technology can support learning and teaching in this context. To this end, several research prototypes have been iteratively developed for the HPI MOOC Platform. Hence, they were tested and evaluated in an authentic real-world learning environment. Most of the results can be applied on a conceptual level to other MOOC platforms as well. The research contribution of this thesis thus provides practical insights beyond what is theoretically possible. In total, four system components were developed and extended:
(1) The Learning Analytics Architecture: A technical infrastructure to collect, process, and analyze event-driven learning data based on schema-agnostic pipelining in a service-oriented MOOC platform. (2) The Learning Analytics Dashboard for Learners: A tool for data-driven support of self-regulated learning, in particular to enable learners to evaluate and plan their learning activities, progress, and success by themselves. (3) Personalized Learning Objectives: A set of features to better connect learners' success to their personal intentions based on selected learning objectives to offer guidance and align the provided data-driven insights about their learning progress. (4) The Learning Analytics Dashboard for Teachers: A tool supporting teachers with data-driven insights to enable the monitoring of their courses with thousands of learners, identify potential issues, and take informed action.
For all aspects examined in this dissertation, related research is presented, development processes and implementation concepts are explained, and evaluations are conducted in case studies. Among other findings, the usage of the learner dashboard in combination with personalized learning objectives demonstrated improved certification rates of 11.62% to 12.63%. Furthermore, it was observed that the teacher dashboard is a key tool and an integral part for teaching in MOOCs. In addition to the results and contributions, general limitations of the work are discussed—which altogether provide a solid foundation for practical implications and future research.