Document Type
- Article (51)
- Other (36)
- Monograph/Edited Volume (29)
- Conference Proceeding (4)
- Postprint (2)
- Part of a Book (1)
- Report (1)
Keywords
- MOOC (10)
- digital education (8)
- e-learning (8)
- Cloud Computing (7)
- E-Learning (7)
- openHPI (7)
- Onlinekurs (6)
- MOOCs (5)
- Identitätsmanagement (4)
- Security (4)
Digital technology offers significant political, economic, and societal opportunities. At the same time, the notion of digital sovereignty has become a leitmotif in German discourse: the state’s capacity to assume its responsibilities and safeguard society’s – and individuals’ – ability to shape the digital transformation in a self-determined way. The education sector is exemplary for the challenge faced by Germany, and indeed Europe, of harnessing the benefits of digital technology while navigating concerns around sovereignty. It encompasses education as a core public good, a rapidly growing field of business, and growing pools of highly sensitive personal data. The report describes pathways to mitigating the tension between digitalization and sovereignty at three different levels – state, economy, and individual – through the lens of concrete technical projects in the education sector: the HPI Schul-Cloud (state sovereignty), the MERLOT data spaces (economic sovereignty), and the openHPI platform (individual sovereignty).
Design thinking is a well-established practical and educational approach to fostering high-level creativity and innovation that has been refined since the 1950s with the participation of experts like Joy Paul Guilford and Abraham Maslow. Through real-world projects, trainees learn to optimize their creative outcomes by developing and practicing creative cognition and metacognition. This paper provides a holistic perspective on creativity, enabling the formulation of a comprehensive theoretical framework of creative metacognition. It focuses on the design thinking approach to creativity and explores the role of metacognition in four areas of creativity expertise: Products, Processes, People, and Places. The analysis includes task-outcome relationships (product metacognition), the monitoring of strategy effectiveness (process metacognition), an understanding of individual or group strengths and weaknesses (people metacognition), and an examination of the mutual impact between environments and creativity (place metacognition). It also reviews measures taken in design thinking education, including a distribution of cognition and metacognition, to support students in their development of creative mastery. On these grounds, we propose extended methods for measuring creative metacognition with the goal of enhancing comprehensive assessments of the phenomenon. The proposed methodological advancements include accuracy sub-scales, experimental tasks in which examinees explore problem and solution spaces, combinations of naturalistic observations with capability testing, and physiological assessments as indirect measures of creative metacognition.
In an effort to describe and produce different formats for video instruction, the research community in technology-enhanced learning, and MOOC scholars in particular, have focused on the general style of video production: whether it is a digitally scripted “talk-and-chalk” or a “talking head” version of a learning unit. Since these production styles comprise various sub-elements, this paper deconstructs the constituent elements of video production in the context of educational live-streams. Analyzing over 700 videos from both synchronous and asynchronous modalities of large video-based platforms (YouTube and Twitch), we identified 92 features in eight categories of video production. These include commonly analyzed features, such as the use of a green screen and a visible instructor, but also less studied features, such as social media connections and changing the camera perspective depending on the topic being covered. Overall, the research results enable an analysis of common video production styles and provide a toolbox for categorizing new formats – independent of their final (a)synchronous use in MOOCs. Keywords: video production, MOOC video styles, live-streaming.
With the growing number of online learning resources, it becomes increasingly difficult and overwhelming to keep track of the latest developments and to find orientation in the plethora of offers. AI-driven services to recommend standalone learning resources or even complete learning paths are discussed as a possible solution for this challenge. To function properly, such services require a well-defined set of metadata provided by the learning resource. During the last few years, the so-called MOOChub metadata format has been established as a de-facto standard by a group of MOOC providers in German-speaking countries. This format, which is based on schema.org, already delivers a quite comprehensive set of metadata. So far, this set has been sufficient to list, display, sort, filter, and search for courses on several MOOC and open educational resources (OER) aggregators. AI recommendation services and further automated integration, beyond a plain listing, have special requirements, however. To optimize the format for proper support of such systems, several extensions and modifications have to be applied. We herein report on a set of suggested changes to prepare the format for this task.
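To make this concrete, the sketch below shows what a schema.org-based course record in the spirit of the MOOChub format might look like, plus the kind of recommender-oriented fields the suggested extensions point toward. It is a minimal illustration: the exact field set of the MOOChub format and its extensions is defined by the specification, not by this example.

```python
import json

# Minimal sketch of a schema.org-based course record, loosely modeled on the
# MOOChub format. The recommender-oriented fields below are standard
# schema.org properties used for illustration, not the normative MOOChub spec.
course = {
    "@context": "https://schema.org",
    "@type": "Course",
    "name": "Introduction to Python Programming",
    "description": "A beginner-friendly programming MOOC.",
    "inLanguage": "en",
    "provider": {"@type": "Organization", "name": "openHPI"},
    # Fields a recommendation service would typically need: prerequisites,
    # workload, and learning outcomes.
    "coursePrerequisites": ["basic computer literacy"],
    "timeRequired": "PT25H",  # ISO 8601 duration: 25 hours total workload
    "teaches": ["Python syntax", "control flow", "functions"],
}

print(json.dumps(course, indent=2))
```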
EMOOCs 2023
(2023)
From June 14 to June 16, 2023, Hasso Plattner Institute, Potsdam, hosted the eighth European MOOC Stakeholder Summit (EMOOCs 2023).
The pandemic is fortunately over, but it has once again shown how important digital education is. How well prepared a country was could be seen in its schools, universities, and companies. The problems manifested themselves differently in different countries, and the measures and approaches to solving them varied accordingly. Digital education, whether micro-credentials, MOOCs, blended learning formats, or other e-learning tools, received a major boost.
EMOOCs 2023 focuses on the effects of this emergency situation. How has it affected the development and delivery of MOOCs and other e-learning offerings all over Europe? Which projects can serve as models for successful digital learning and teaching? What roles can MOOCs and micro-credentials play in the current business transformation? Is there a return to the routines we knew from pre-Corona times? Or have many things become firmly established in the meantime, e.g. remote work, hybrid conferences, etc.?
Furthermore, EMOOCs 2023 takes a closer look at the development and formalization of digital learning. Micro-credentials are just the starting point. Further steps in this direction would be complete online study programs or full online universities.
Another main topic is the networking of learning offerings and the standardization of formats and metadata. Examples of fruitful cooperation include the MOOChub, the European MOOC Consortium, and the Common Micro-Credential Framework.
The learnings derived from practical experience and research are explored at EMOOCs 2023 in four tracks and additional workshops covering various aspects of this field. In this publication, we present papers from the conference’s Research & Experience Track, the Business Track, and the International Track.
About 15 years ago, the first Massive Open Online Courses (MOOCs) appeared and revolutionized online education with more interactive and engaging course designs. Yet, keeping learners motivated and ensuring high satisfaction is one of the challenges today's course designers face. Therefore, many MOOC providers have employed gamification elements that only briefly boost extrinsic motivation and are limited by platform support. In this article, we introduce and evaluate a gameful learning design we used in several iterations of computer science education courses. For each of the courses on the fundamentals of the Java programming language, we developed a self-contained, continuous story that accompanies learners through their learning journey and helps visualize key concepts. Furthermore, we share our approach to creating the surrounding story in our MOOCs and provide a guideline for educators to develop their own stories. Our data and the long-term evaluation spanning four Java courses between 2017 and 2021 indicate the openness of learners toward storified programming courses in general and highlight those elements that had the highest impact. While only a few learners did not like the story at all, most learners consumed the additional story elements we provided. However, learners' interest in influencing the story through majority voting was negligible and did not show a considerable positive impact, so we continued with a fixed story instead. We did not find evidence that learners participated in the narrative merely because they worked on all materials. Instead, for 10-16% of learners, the story was their main course motivation. We also investigated differences in the presentation format and concluded that learners preferred several longer audio-book-style videos over animated videos and various textual formats. Surprisingly, the availability of a coherent story embedding examples and providing context for the practical programming exercises also led to a slightly higher ranking in the perceived quality of the learning material (by 4%). With our research in the context of storified MOOCs, we advance gameful learning designs, foster learner engagement and satisfaction in online courses, and help educators ease knowledge transfer for their learners.
Digital media have become an integral part of our everyday lives. One of the most central areas for our society, school education, must not be left behind. Whenever the use of digitally supported tools makes pedagogical sense, it must be possible within a secure framework. The HPI Schul-Cloud has followed this vision, which was initiated by the 2016 National IT Summit and which prefaces this report. Over the past five years, it has evolved from a pilot project into an indispensable IT infrastructure for numerous schools. During the Corona pandemic, it provided important support to many thousands of schools in fulfilling their educational mandate. It has thus more than achieved its goal of providing a future-proof and privacy-compliant infrastructure for the digital support of teaching. Currently, around 1.4 million teachers and students across Germany and at German schools abroad use the HPI Schul-Cloud.
openHPI
(2022)
On the occasion of the 10th openHPI anniversary, this technical report provides information about the HPI MOOC platform, including its core features, technology, and architecture.
In an introduction, the platform family with all partner platforms is presented; these now amount to nine platforms, including openHPI. This section introduces openHPI as an advisor and research partner in various projects.
In the second chapter, the functionalities and common course formats of the platform are presented. The functionalities are divided into learner and admin features. The learner features section provides detailed information about performance records, courses, and the learning materials of which a course is composed: videos, texts, and quizzes. In addition, the learning materials can be enriched by adding external exercise tools that communicate with the HPI MOOC platform via the Learning Tools Interoperability (LTI) standard. Furthermore, the concept of peer assessments completes the range of possible learning materials.
The section then proceeds with further information on the discussion forum, a fundamental feature distinguishing MOOCs from traditional e-learning offerings. The section concludes with a description of the quiz recap, learning objectives, mobile applications, gameful learning, and the help desk.
The next part of this chapter deals with the admin features. The description is restricted to news and announcements, dashboards and statistics, reporting capabilities, research options with A/B testing, the course feed, and the TransPipe tool, which supports the process of creating automated or manual subtitles. The platform supports a large variety of additional features, but a detailed description of these goes beyond the scope of this report.
The chapter then elaborates on common course formats and openHPI teaching activities at the HPI. The chapter concludes with some best practices for course design and delivery.
The third chapter provides insights into the technology and architecture behind openHPI. A special characteristic of the openHPI project is the conscious decision to operate the complete application in-house, from bare metal to platform development. Hence, the chapter starts with a section about the openHPI Cloud, including detailed information about the data center and devices, the cloud software used (OpenStack and Ceph), and the openHPI Cloud Service provided for the HPI.
Afterward, a section on the application technology stack and development tooling describes the application infrastructure components, the used automation, the deployment pipeline, and the tools used for monitoring and alerting. The chapter is concluded with detailed information about the technology stack and concrete platform implementation details. The section describes the service-oriented Ruby on Rails application, inter-service communication, and public APIs. It also provides more information on the design system and components used in the application. The section concludes with a discussion of the original microservice architecture, where we share our insights and reasoning for migrating back to a monolithic application.
The last chapter provides a summary and an outlook on the future of digital education.
Many participants in Massive Open Online Courses are full-time employees seeking greater flexibility in their time commitment and the available learning paths. We recently addressed these requirements by splitting up our 6-week courses into three 2-week modules followed by a separate exam. Modularizing courses offers many advantages: shorter modules are more sustainable and can be combined, reused, and incorporated into learning paths more easily. Time flexibility for learners is also improved, as exams can now be offered multiple times per year, while the learning content is available independently. In this article, we answer the question of what impact this modularization has on key learning metrics, such as course completion rates, learning success, and no-show rates. Furthermore, we investigate the influence of longer breaks between modules on these metrics. According to our analysis, course modules facilitate more selective learning behaviors that encourage learners to focus on the topics they are most interested in. At the same time, participation in overarching exams across all modules seems to be less appealing compared to an integrated exam in a 6-week course. While breaks between the modules increase the distinctive appearance of individual modules, a break before the final exam further reduces initial interest in the exams. We further reveal that participation in self-paced courses as a preparation for the final exam is unlikely to attract new learners to the course offerings, even though learners' performance is comparable to instructor-paced courses. The results of our long-term study on course modularization provide a solid foundation for future research and enable educators to make informed decisions about the design of their courses.
Evaluating creativity of verbal responses or texts is a challenging task due to psychometric issues associated with subjective ratings and the peculiarities of textual data. We explore an approach to objectively assess the creativity of responses in a sentence generation task to 1) better understand what language-related aspects are valued by human raters and 2) further advance the developments toward automating creativity evaluations. Over the course of two prior studies, participants generated 989 four-word sentences based on a four-letter prompt with the instruction to be creative. We developed an algorithm that scores each sentence on eight different metrics including 1) general word infrequency, 2) word combination infrequency, 3) context-specific word uniqueness, 4) syntax uniqueness, 5) rhyme, 6) phonetic similarity, and similarity of 7) sequence spelling and 8) semantic meaning to the cue. The text metrics were then used to explain the averaged creativity ratings of eight human raters. We found six metrics to be significantly correlated with the human ratings, explaining a total of 16% of their variance. We conclude that the creative impression of sentences is partly driven by different aspects of novelty in word choice and syntax, as well as rhythm and sound, which are amenable to objective assessment.
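To illustrate how such metrics lend themselves to objective computation, here is a toy sketch of two of the eight metrics, general word infrequency and rhyme. The corpus frequencies are invented stand-ins; the actual scoring relies on real corpus statistics and, for the semantic metrics, embedding models.

```python
import math

# Invented stand-in frequencies; a real implementation uses corpus statistics.
corpus_freq = {"cats": 8e-4, "often": 4e-3, "sing": 5e-4, "operas": 3e-5}

def word_infrequency(sentence, freq, floor=1e-6):
    # General word infrequency: mean negative log corpus frequency,
    # so rarer word choices score higher.
    words = sentence.lower().split()
    return sum(-math.log(freq.get(w, floor)) for w in words) / len(words)

def rhyme_score(sentence, suffix_len=2):
    # Rhyme, crudely approximated by repeated word endings.
    endings = [w[-suffix_len:] for w in sentence.lower().split() if len(w) >= suffix_len]
    return (len(endings) - len(set(endings))) / max(len(endings), 1)

sentence = "Cats often sing operas"   # a four-word response to a C-O-S-O cue
print(word_infrequency(sentence, corpus_freq), rhyme_score(sentence))
```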
openHPI
(2022)
On the occasion of the 10th openHPI anniversary, this technical report provides information about the HPI MOOC platform, including its core features, technology, and architecture.
In an introduction, the platform family with all partner platforms is presented; including openHPI, these currently amount to nine platforms. This section also shows how openHPI acts as an advisor and research partner in various projects.
In the second chapter, the functionalities and common course formats of the platform are presented. The functionalities are divided into learner and admin features. The learner features section provides detailed information about performance records, courses, and the learning materials of which a course is composed: videos, texts, and quizzes. In addition, the learning materials can be enriched with external exercise tools that communicate with the HPI MOOC platform via the Learning Tools Interoperability (LTI) standard. The concept of peer assessments completes the range of possible learning materials.
The section then goes into more detail on the discussion forum, which constitutes a fundamental difference between MOOCs and traditional e-learning offerings. The section closes with a description of the quiz recap, learning objectives, mobile applications, gameful learning, and the help desk.
The next part of this chapter deals with the admin features. The description of functionality is restricted to news and announcements, dashboards and statistics, reporting capabilities, research options with A/B testing, the course feed, and the TransPipe tool, which supports the creation of automatic or manual subtitles. The platform also supports a large variety of additional features, but a detailed description of these would go beyond the scope of this report.
The chapter then covers common course formats and openHPI teaching activities at the HPI, before closing with some best practices for course design and delivery.
To conclude the technical report, the last chapter provides a summary and an outlook on the future of digital education.
A special characteristic of the openHPI project is the conscious decision to operate the complete application independently, from the physical network components to platform development. The German variant at hand is an abridged translation of Technical Report 148 that omits the insights into the technology and architecture of openHPI. Interested readers can consult Technical Report 148 (the complete English version) for detailed information about the data center and devices, the cloud software, and the openHPI Cloud Service, as well as about infrastructure application components such as development tools, automation, the deployment pipeline, and monitoring. It also contains further information about the technology stack and concrete implementation details of the platform, including the service-oriented Ruby on Rails application, inter-service communication, public APIs, and the design system and components. The section closes with a discussion of the original microservice architecture and the migration to a monolithic application.
The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners.
The HPI Future SOC Lab provides researchers with free-of-charge access to a complete infrastructure of state-of-the-art hardware and software. This infrastructure includes components that might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB of main memory. The offerings address researchers, particularly but not exclusively from the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and in-memory technologies.
This technical report presents the results of research projects executed in 2018. Selected projects presented their results on April 17 and November 14, 2018, at the Future SOC Lab Day events.
Proceedings of the HPI Research School on Service-oriented Systems Engineering 2020 Fall Retreat
(2021)
The design and implementation of service-oriented architectures raises a huge number of research questions from the fields of software engineering, system analysis and modeling, adaptability, and application integration. Component orientation and web services are two approaches for the design and realization of complex web-based systems. Both approaches allow for dynamic application adaptation as well as the integration of enterprise applications.
Service-Oriented Systems Engineering represents a symbiosis of best practices in object-orientation, component-based development, distributed computing, and business process management. It provides integration of business and IT concerns.
The annual Ph.D. Retreat of the Research School provides each member with the opportunity to present the current state of their research and to give an outline of a prospective Ph.D. thesis. Due to the interdisciplinary structure of the research school, this technical report covers a wide range of topics. These include but are not limited to: Human Computer Interaction and Computer Vision as Service; Service-oriented Geovisualization Systems; Algorithm Engineering for Service-oriented Systems; Modeling and Verification of Self-adaptive Service-oriented Systems; Tools and Methods for Software Engineering in Service-oriented Systems; Security Engineering of Service-based IT Systems; Service-oriented Information Systems; Evolutionary Transition of Enterprise Applications to Service Orientation; Operating System Abstractions for Service-oriented Computing; and Services Specification, Composition, and Enactment.
EMOOCs 2021
(2021)
From June 22 to June 24, 2021, Hasso Plattner Institute, Potsdam, hosted the seventh European MOOC Stakeholder Summit (EMOOCs 2021) together with the eighth ACM Learning@Scale Conference.
Due to the COVID-19 situation, the conference was held fully online.
The boost in digital education worldwide as a result of the pandemic was also one of the main topics of this year’s EMOOCs. All institutions of learning have been forced to transform and redesign their educational methods, moving from traditional models to hybrid or completely online models at scale. The learnings, derived from practical experience and research, have been explored in EMOOCs 2021 in six tracks and additional workshops, covering various aspects of this field. In this publication, we present papers from the conference’s Experience Track, the Policy Track, the Business Track, the International Track, and the Workshops.
TransPipe
(2021)
Online learning environments, such as Massive Open Online Courses (MOOCs), often rely on videos as a major component to convey knowledge. However, these videos exclude potential participants who do not understand the lecturer’s language, whether due to unfamiliarity with the language or hearing impairments. Subtitles and/or interactive transcripts solve this issue, ease content-based navigation, and enable indexing and retrieval by search engines. Although there are several automated speech-to-text converters and translation tools, their quality varies, and the process of integrating them can be quite tedious. Thus, in practice, many videos on MOOC platforms only receive subtitles after the course is already finished (if at all) due to a lack of resources. This work describes an approach to tackle this issue by providing a dedicated tool that closes the gap between MOOC platforms and transcription and translation tools and offers a simple workflow that can easily be handled by users with a less technical background. The proposed method is designed and evaluated through qualitative interviews with three major MOOC providers.
ATIB
(2021)
Identity management is a principal component of securing online services. Throughout the evolution of traditional identity management patterns, the identity provider has remained a Trusted Third Party (TTP): among other demands, the service provider and the user need to trust a particular identity provider to supply correct attributes. This paradigm changed with the invention of blockchain-based Self-Sovereign Identity (SSI) solutions that primarily focus on the users. SSI reduces the functional scope of the identity provider to that of an attribute provider while enabling attribute aggregation. However, the development of new protocols that disregard established ones, together with a significantly fragmented landscape of SSI solutions, poses considerable challenges for adoption by service providers. We propose an Attribute Trust-enhancing Identity Broker (ATIB) to leverage the potential of SSI for trust-enhancing attribute aggregation. Furthermore, ATIB abstracts from any dedicated SSI solution and offers standard protocols, thereby facilitating adoption by service providers. Despite the brokered integration approach, we show that ATIB provides a high security posture. Additionally, ATIB does not compromise the ten foundational SSI principles for the users.
CloudStrike
(2020)
Most cyber-attacks and data breaches in cloud infrastructure are due to human error and misconfiguration vulnerabilities. Cloud customer-centric tools are imperative for mitigating these issues; however, existing cloud security models are largely unable to tackle these security challenges. Novel security mechanisms are therefore imperative, and we propose Risk-driven Fault Injection (RDFI) techniques to address them. RDFI applies the principles of chaos engineering to cloud security and leverages feedback loops to execute, monitor, analyze, and plan security fault injection campaigns based on a knowledge base. The knowledge base consists of fault models designed from secure baselines, cloud security best practices, and observations derived during iterative fault injection campaigns. These observations are helpful for identifying vulnerabilities while verifying the correctness of security attributes (integrity, confidentiality, and availability). Furthermore, RDFI proactively supports risk analysis and security hardening efforts by sharing security information with security mechanisms. We have designed and implemented the RDFI strategies, including various chaos engineering algorithms, as a software tool: CloudStrike. Several evaluations have been conducted with CloudStrike against infrastructure deployed on two major public cloud platforms: Amazon Web Services and Google Cloud Platform. The time overhead increases linearly, proportional to increasing attack rates. Also, the analysis of vulnerabilities detected via security fault injection has been used to harden the security of cloud resources, demonstrating the effectiveness of the security information provided by CloudStrike. We therefore believe that our approaches are suitable for overcoming contemporary cloud security issues.
Recurrent generative adversarial network for learning imbalanced medical image semantic segmentation
(2020)
We propose a new recurrent generative adversarial architecture named RNN-GAN to mitigate the data imbalance problem in medical image semantic segmentation, where the number of pixels belonging to the desired object is significantly lower than the number belonging to the background. A model trained with imbalanced data tends to be biased towards healthy data, which is not desired in clinical applications, and the outputs predicted by such networks have high precision but low recall. To mitigate the impact of imbalanced training data, we train RNN-GAN with the proposed complementary segmentation masks in addition to the ordinary segmentation masks. The RNN-GAN consists of two components: a generator and a discriminator. The generator is trained on sequences of medical images to learn the corresponding segmentation label map plus the proposed complementary label, both at the pixel level, while the discriminator is trained to distinguish whether a segmentation image comes from the ground truth or from the generator network. Both the generator and the discriminator employ bidirectional LSTM units to enhance temporal consistency and to obtain inter- and intra-slice representations of the features. We show evidence that the proposed framework is applicable to different types of medical images of varied sizes. In our experiments on the ACDC-2017, HVSMR-2016, and LiTS-2017 benchmarks, we find consistently improved results, demonstrating the efficacy of our approach.
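The complementary-mask idea itself is easy to state: for a binary segmentation mask, every non-object pixel becomes a positive target in a second channel, so the loss signal no longer rests on the rare foreground class alone. Below is a minimal numpy sketch of just this step; the adversarial training with bidirectional LSTMs is beyond the snippet.

```python
import numpy as np

def complementary_mask(mask):
    """mask: binary array, 1 = object (rare class), 0 = background."""
    return 1 - mask

mask = np.array([[0, 0, 1],
                 [0, 1, 1],
                 [0, 0, 0]], dtype=np.uint8)

# Training targets stack the ordinary and the complementary label per pixel,
# e.g. with shape (2, H, W), so background pixels also act as positives.
targets = np.stack([mask, complementary_mask(mask)])
print(targets.shape)  # (2, 3, 3)
```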
Generative multi-adversarial network for striking the right balance in abdominal image segmentation
(2020)
Purpose: The identification of abnormalities that are relatively rare within otherwise normal anatomy is a major challenge for deep learning in the semantic segmentation of medical images. The small number of samples of the minority classes in the training data makes the learning of optimal classification challenging, while the more frequently occurring samples of the majority class hamper the generalization of the classification boundary between infrequently occurring target objects and classes. In this paper, we developed a novel generative multi-adversarial network, called Ensemble-GAN, for mitigating this class imbalance problem in the semantic segmentation of abdominal images. Method: The Ensemble-GAN framework is composed of a single-generator and a multi-discriminator variant for handling the class imbalance problem to provide a better generalization than existing approaches. The ensemble model aggregates the estimates of multiple models by training from different initializations and losses from various subsets of the training data. The single generator network analyzes the input image as a condition to predict a corresponding semantic segmentation image by use of feedback from the ensemble of discriminator networks. To evaluate the framework, we trained our framework on two public datasets, with different imbalance ratios and imaging modalities: the Chaos 2019 and the LiTS 2017. Result: In terms of the F1 score, the accuracies of the semantic segmentation of healthy spleen, liver, and left and right kidneys were 0.93, 0.96, 0.90 and 0.94, respectively. The overall F1 scores for simultaneous segmentation of the lesions and liver were 0.83 and 0.94, respectively. Conclusion: The proposed Ensemble-GAN framework demonstrated outstanding performance in the semantic segmentation of medical images in comparison with other approaches on popular abdominal imaging benchmarks. The Ensemble-GAN has the potential to segment abdominal images more accurately than human experts.
Social networking sites (SNS) are a rich source of latent information about individual characteristics. Crawling and analyzing this content provides a new approach for enterprises to personalize services and put forward product recommendations. In the past few years, commercial brands have made a gradual appearance on social media platforms for advertising, customer support, and public relations purposes, and by now such a presence has become a necessity across all industries. This online identity can be represented as a brand personality that reflects how a brand is perceived by its customers. We exploited recent research in text analysis and personality detection to build an automatic brand personality prediction model on top of Five-Factor Model and Linguistic Inquiry and Word Count features extracted from publicly available benchmarks. A predictive evaluation on brands' accounts reveals that the Facebook platform provides a slight advantage over Twitter in offering more self-disclosure for users to express their emotions, especially their demographic and psychological traits. The results also confirm the wider perspective that the same social media account carries quite similar and comparable personality scores across different social media platforms. To evaluate our prediction results on actual brands' accounts, we crawled the Facebook API and the Twitter API for 100k posts from the pages of the most valuable brands in the USA; we visualize exemplary comparison results and present suggestions for future directions.
The “HPI Future SOC Lab” is a cooperation of the Hasso Plattner Institute (HPI) and industry partners. Its mission is to enable and promote exchange and interaction between the research community and the industry partners.
The HPI Future SOC Lab provides researchers with free-of-charge access to a complete infrastructure of state-of-the-art hardware and software. This infrastructure includes components that might be too expensive for an ordinary research environment, such as servers with up to 64 cores and 2 TB of main memory. The offerings address researchers, particularly but not exclusively from the areas of computer science and business information systems. Main areas of research include cloud computing, parallelization, and in-memory technologies.
This technical report presents the results of research projects executed in 2017. Selected projects presented their results on April 25 and November 15, 2017, at the Future SOC Lab Day events.
User-generated content on social media platforms is a rich source of latent information about individual variables. Crawling and analyzing this content provides a new approach for enterprises to personalize services and put forward product recommendations. In the past few years, brands have made a gradual appearance on social media platforms for advertising, customer support, and public relations purposes, and by now such a presence has become a necessity across all industries. This online identity can be represented as a brand personality that reflects how a brand is perceived by its customers. We exploited recent research in text analysis and personality detection to build an automatic brand personality prediction model on top of Five-Factor Model and Linguistic Inquiry and Word Count features extracted from publicly available benchmarks. The proposed model achieved significant accuracy in predicting specific personality traits from brands. To evaluate our prediction results on actual brands, we crawled the Facebook API for 100k posts from the pages of the most valuable brands in the USA; we visualize exemplary comparison results and present suggestions for future directions.
In cloud computing, users are able to use their own operating system (OS) image to run a virtual machine (VM) on a remote host. In a public or private cloud, the VM's OS is started by the user via interfaces provided by the cloud provider; in a peer-to-peer cloud, the VM is started by the host admin. Once the VM is running, the user can access it remotely to install, configure, and run services. For security reasons, the user needs to verify the integrity of the running VM, because a malicious host admin could modify the image or even replace it with a similar one in order to extract sensitive data from the VM. We propose an approach to verify the integrity of a running VM on a remote host without using any specific hardware such as a Trusted Platform Module (TPM). Our approach is implemented on a Linux platform, where the kernel files (vmlinuz and initrd) can be replaced with new files while the VM is running. kexec is used to reboot the VM with the new kernel files. The new kernel contains secret codes that are used to verify that the VM was indeed started with the new kernel files; the new kernel is then used to further measure the integrity of the running VM.
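A minimal sketch of the verification idea follows, assuming a simple challenge-response exchange keyed by a secret embedded in the new kernel files; the paper's concrete transport and measurement steps are not reproduced here.

```python
import hmac, hashlib, os

# Assumed setup: this secret is compiled into the new kernel files, so only a
# VM actually rebooted (via kexec) into the new kernel can derive responses.
SECRET = b"secret-embedded-in-new-kernel"

def vm_response(challenge: bytes) -> bytes:
    # Computed inside the VM after kexec into the new kernel.
    return hmac.new(SECRET, challenge, hashlib.sha256).digest()

def user_verify(challenge: bytes, response: bytes) -> bool:
    # Computed by the user, who also knows the secret: a correct answer
    # shows the VM was started with the new kernel files.
    expected = hmac.new(SECRET, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(response, expected)

nonce = os.urandom(16)                         # fresh challenge against replays
print(user_verify(nonce, vm_response(nonce)))  # True
```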
High-dimensional data is particularly useful for data analytics research. In the healthcare domain, for instance, high-dimensional data analytics has been used successfully for drug discovery. Yet, in order to adhere to privacy legislation, data analytics service providers must guarantee anonymity for data owners. In the context of high-dimensional data, ensuring privacy is challenging because increased data dimensionality must be matched by exponential growth in the size of the data to avoid sparse datasets. Anonymising sparse datasets syntactically, with methods that rely on statistical significance, makes obtaining sound and reliable results a challenge. As such, strong privacy is only achievable at the cost of high information loss, rendering the data unusable for data analytics. In this paper, we make two contributions to addressing this problem, from both the privacy and the information loss perspectives. First, we show that by identifying dependencies between attribute subsets we can eliminate privacy-violating attributes from the anonymised dataset. Second, to minimise information loss, we employ a greedy search algorithm to determine and eliminate maximal partial unique attribute combinations; one then only needs to find the minimal set of identifying attributes to prevent re-identification. Experiments on a health cloud based on the SAP HANA platform, using a semi-synthetic medical history dataset comprising 109 attributes, demonstrate the effectiveness of our approach.
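To convey the flavor of the greedy step, the toy sketch below repeatedly suppresses the attribute whose removal most reduces the number of records that are unique on the remaining attributes. It is a simplification: the paper's algorithm specifically targets maximal partial unique attribute combinations.

```python
from collections import Counter

def unique_count(rows, attrs):
    # Number of records uniquely identified by the given attribute subset.
    combos = Counter(tuple(r[a] for a in attrs) for r in rows)
    return sum(1 for r in rows if combos[tuple(r[a] for a in attrs)] == 1)

def greedy_suppress(rows, attrs):
    attrs = list(attrs)
    while attrs and unique_count(rows, attrs) > 0:
        # Greedy step: drop the attribute whose removal leaves the fewest
        # uniquely identifiable records.
        best = min(attrs, key=lambda a: unique_count(rows, [x for x in attrs if x != a]))
        attrs.remove(best)
    return attrs  # attributes releasable without uniquely identified records

rows = [
    {"age": 34, "zip": "14482", "sex": "F"},
    {"age": 34, "zip": "14482", "sex": "M"},
    {"age": 51, "zip": "14482", "sex": "M"},
]
print(greedy_suppress(rows, ["age", "zip", "sex"]))  # ['zip']
```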
Devices on the Internet of Things (IoT) are usually battery-powered and have limited resources. Hence, energy-efficient and lightweight protocols were designed for IoT devices, such as the popular Constrained Application Protocol (CoAP). Yet, CoAP itself does not include any defenses against denial-of-sleep attacks, which aim at preventing victim devices from entering low-power sleep modes. For example, a denial-of-sleep attack against an IoT device that runs a CoAP server is to send plenty of CoAP messages to it, thereby forcing the IoT device to expend energy on receiving and processing them. All current security solutions for CoAP, namely Datagram Transport Layer Security (DTLS), IPsec, and OSCORE, fail to prevent such attacks. To fill this gap, Seitz et al. proposed a method for filtering out inauthentic and replayed CoAP messages "en route" on 6LoWPAN border routers. In this paper, we expand on Seitz et al.'s proposal in two ways. First, we revise Seitz et al.'s software architecture so that 6LoWPAN border routers can not only check the authenticity and freshness of CoAP messages, but can also perform a wide range of further checks. Second, we propose a couple of such further checks, which, compared to Seitz et al.'s original checks, more reliably protect IoT devices that run CoAP servers from remote denial-of-sleep attacks, as well as from remote exploits. We prototyped our solution and successfully tested its compatibility with Contiki-NG's CoAP implementation.
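Conceptually, the en-route filtering is a small decision procedure on the border router. The sketch below shows an authenticity check, a freshness check, and an assumed rate limit as one example of a "further check"; the checks and parameters are illustrative, whereas the actual implementation runs inside the 6LoWPAN border router against real CoAP messages.

```python
import hmac, hashlib, time

KEY = b"shared-integrity-key"   # assumed key shared with legitimate clients
MIN_INTERVAL = 0.5              # assumed per-client rate-limit policy (seconds)
last_seq, last_seen = {}, {}

def accept(client, seq, payload, tag, now=None):
    now = time.monotonic() if now is None else now
    expected = hmac.new(KEY, f"{client}:{seq}".encode() + payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False   # inauthentic: drop before it wakes the CoAP server
    if seq <= last_seq.get(client, -1):
        return False   # replayed or stale
    if now - last_seen.get(client, float("-inf")) < MIN_INTERVAL:
        return False   # further check: too many messages from this client
    last_seq[client], last_seen[client] = seq, now
    return True

tag = hmac.new(KEY, b"node1:1" + b"GET /sensor", hashlib.sha256).digest()
print(accept("node1", 1, b"GET /sensor", tag))  # True
print(accept("node1", 1, b"GET /sensor", tag))  # False (replay)
```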
LoANs
(2019)
Recently, deep neural networks have achieved remarkable performance on the task of object detection and recognition. The reason for this success is mainly grounded in the availability of large scale, fully annotated datasets, but the creation of such a dataset is a complicated and costly task. In this paper, we propose a novel method for weakly supervised object detection that simplifies the process of gathering data for training an object detector. We train an ensemble of two models that work together in a student-teacher fashion. Our student (localizer) is a model that learns to localize an object, the teacher (assessor) assesses the quality of the localization and provides feedback to the student. The student uses this feedback to learn how to localize objects and is thus entirely supervised by the teacher, as we are using no labels for training the localizer. In our experiments, we show that our model is very robust to noise and reaches competitive performance compared to a state-of-the-art fully supervised approach. We also show the simplicity of creating a new dataset, based on a few videos (e.g. downloaded from YouTube) and artificially generated data.
A Cloud Storage Broker (CSB) provides value-added cloud storage services for enterprise usage by leveraging a multi-cloud storage architecture. However, this raises several challenges for managing resources and their access control across multiple Cloud Service Providers (CSPs) for authorized CSB stakeholders. In this paper, we propose a unified cloud access control model that provides an abstraction of the CSPs' services for centralized and automated management of cloud resources and access control across multiple CSPs. Our proposal offers role-based access control for CSB stakeholders, assigning the necessary privileges to stakeholders and access control lists to cloud resources, following the privilege separation concept and the principle of least privilege. We implement our unified model in a CSB system called CloudRAID for Business (CfB); the evaluation shows that it provides system- and cloud-level security services for CfB as well as centralized resource and access control management across multiple CSPs.
Many universities record the lectures being held in their facilities to preserve knowledge and to make it available to their students and, at least for some universities and classes, to the broad public. The approach with the least effort is to record the whole lecture, which in our case is usually 90 minutes long. This saves the labor and time of cutting and rearranging lecture scenes to produce the short learning videos known from Massive Open Online Courses (MOOCs). Many lecturers fear that recording their lectures and providing them via an online platform might lead to lower attendance at the actual lecture. Many teachers also fear that the recordings are not used with the same focus and dedication as lectures in a lecture hall. In this work, we show that, in our experience, full lecture recordings have an average watching duration of just a few minutes, explain the reasons for this, and argue why, in most cases, teachers do not have to worry about it.
A Fuzzy Rule-Based Model for Remote Monitoring of Preterm in the Intensive Care Unit of Hospitals
(2019)
The use of remote patient monitoring (RPM) systems to monitor critically ill patients in the Intensive Care Unit (ICU) has enabled quality, real-time healthcare management. Fuzzy logic, as an approach to designing RPM systems, provides a means for encapsulating the subjective decision-making process of medical experts in an algorithm suitable for computer implementation. In this paper, a remote monitoring system for preterm infants in neonatal ICU incubators is modeled and simulated. The model was designed with four input variables (body temperature, heart rate, respiratory rate, and oxygen saturation level) and one output variable (the action performed, represented as ACT). ACT decides whether an alert is generated or not and also determines the message displayed when a notification is required. ACT classifies the clinical priority of the monitored preterm infant into five different fields: code blue, code red, code yellow, code green, and code black. The model was simulated using the Fuzzy Logic Toolbox of MATLAB R2015a. About 216 IF-THEN rules were formulated to monitor the input data fed into the model. The performance of the model was evaluated using a confusion matrix to determine the model's accuracy, precision, sensitivity, specificity, and false alarm rate. The experimental results obtained show that the fuzzy-based system is capable of producing satisfactory results when used for monitoring and classifying the clinical statuses of neonates in ICU incubators.
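To give a flavor of the rule-based core, here is a tiny sketch with triangular membership functions and three toy IF-THEN rules mapping fuzzified vital signs to a priority code. The membership ranges and rule set are invented for illustration; the actual model comprises about 216 rules built with MATLAB's Fuzzy Logic Toolbox.

```python
def tri(x, a, b, c):
    # Triangular membership function peaking at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def classify(temp_c, heart_rate, resp_rate, spo2):
    # Fuzzification with assumed, illustrative ranges.
    temp_high = tri(temp_c, 37.5, 39.0, 41.0)
    hr_high   = tri(heart_rate, 160, 190, 220)
    rr_high   = tri(resp_rate, 60, 80, 100)
    spo2_low  = tri(spo2, 70, 80, 90)
    # Three toy IF-THEN rules (AND = min, OR = max).
    rules = {
        "code red":    min(temp_high, hr_high),
        "code yellow": 0.5 * max(temp_high, spo2_low, rr_high),
        "code green":  1.0 - max(temp_high, hr_high, spo2_low, rr_high),
    }
    return max(rules, key=rules.get)

print(classify(38.8, 185, 55, 95))  # -> "code red"
```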
While the IEEE 802.15.4 radio standard has many features that meet the requirements of Internet of things applications, IEEE 802.15.4 leaves the whole issue of key management unstandardized. To address this gap, Krentz et al. proposed the Adaptive Key Establishment Scheme (AKES), which establishes session keys for use in IEEE 802.15.4 security. Yet, AKES does not cover all aspects of key management. In particular, AKES comprises no means for key revocation and rekeying. Moreover, existing protocols for key revocation and rekeying seem limited in various ways. In this paper, we hence propose a key revocation and rekeying protocol, which is designed to overcome various limitations of current protocols for key revocation and rekeying. For example, our protocol seems unique in that it routes around IEEE 802.15.4 nodes whose keys are being revoked. We successfully implemented and evaluated our protocol using the Contiki-NG operating system and aiocoap.
Detect me if you can
(2019)
Spam bots have become a threat to online social networks with their malicious behavior, posting misinformation and influencing online platforms to fulfill their motives. As spam bots have become more advanced over time, creating algorithms to identify them remains an open challenge. Learning low-dimensional embeddings for nodes in graph-structured data has proven useful in various domains. In this paper, we propose a model based on graph convolutional neural networks (GCNNs) for spam bot detection. Our hypothesis is that, to better detect spam bots, the social graph must be taken into consideration in addition to a defined feature set. GCNNs are able to leverage both the features of a node and the aggregated features of a node's neighborhood. We compare our approach with two methods that work solely on a feature set or on the structure of the graph. To our knowledge, this work is the first attempt to use graph convolutional neural networks for spam bot detection.
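The core operation this hypothesis relies on can be written in a few lines: each GCN layer mixes a node's own features with its neighborhood's features through a normalized adjacency matrix. Below is a minimal numpy sketch with toy sizes; the paper's actual architecture, features, and training are not reproduced here.

```python
import numpy as np

def gcn_layer(A, H, W):
    # One graph-convolution step, H' = ReLU(A_norm @ H @ W).
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0)

A = np.array([[0, 1, 0],                      # toy follower graph, 3 accounts
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.rand(3, 4)                      # per-account feature set
W = np.random.rand(4, 2)                      # layer weights (random here)
print(gcn_layer(A, H, W))                     # mixed node/neighborhood features
```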
The ability to work in teams is an important skill in today's work environments. In MOOCs, however, teamwork, team tasks, and graded team-based assignments play only a marginal role. To close this gap, we have been exploring ways to integrate graded team-based assignments into MOOCs. Among the goals of our work are determining simple criteria for matching teams in a volatile environment and enabling frictionless online collaboration for the participants within our MOOC platform. The high dropout rates in MOOCs pose particular challenges for teamwork in this context. By now, we have conducted 15 MOOCs containing graded team-based assignments on a variety of topics. The paper at hand presents a study that aims to establish a solid understanding of the participants in the team tasks. Furthermore, we attempt to determine which team compositions are particularly successful. Finally, we examine how several modifications to our platform's collaborative toolset have affected the dropout rates and performance of the teams.
The "Bachelor Project"
(2019)
One of the challenges of educating the next generation of computer scientists is to teach them to become team players who are able to communicate and interact not only with different IT systems, but also with coworkers and customers from a non-IT background. The “bachelor project” is a project format based on teamwork and close collaboration with selected industry partners. The authors have hosted some of the teams since the spring term of 2014/15. In the paper at hand, we explain and discuss this concept and evaluate its success based on the students' evaluations and reports. Furthermore, the technology stack used by the teams is evaluated to understand how self-organized students work in IT-related projects. We show that, and why, the bachelor project is the most successful educational format in the perception of the students, and how these positive results can be further improved by the mentors.
MOOCs in Secondary Education
(2019)
Computer science education in German schools is often less than optimal. It is only mandatory in a few of the federal states and there is a lack of qualified teachers. As a MOOC (Massive Open Online Course) provider with a German background, we developed the idea to implement a MOOC addressing pupils in secondary schools to fill this gap. The course targeted high school pupils and enabled them to learn the Python programming language. In 2014, we successfully conducted the first iteration of this MOOC with more than 7000 participants. However, the share of pupils in the course was not quite satisfactory. So we conducted several workshops with teachers to find out why they had not used the course to the extent that we had imagined. The paper at hand explores and discusses the steps we have taken in the following years as a result of these workshops.
Electronic health is one of the most popular applications of information and communication technologies, and it has contributed immensely to health delivery through the provision of quality health services and ubiquitous access at lower cost. Even though this mode of health service is increasingly becoming known and used in developing nations, these countries face a myriad of challenges when implementing and deploying e-health services on both small and large scales. It is estimated that the African population alone carries the highest share of the global disease burden, despite a certain level of e-health adoption. This paper analyzes the progress made so far and the current state of e-health in developing countries, particularly in Africa, and proposes a framework for further improvement.
The emergence of cloud computing allows users to easily host their virtual machines with no up-front investment and a guarantee of availability anytime, anywhere. But when a Virtual Machine (VM) is hosted outside the user's premises, the user loses physical control of the VM, as it could be running on an untrusted host machine in the cloud. A malicious host administrator could launch live memory dumping, Spectre, or Meltdown attacks in order to extract sensitive information from the VM's memory, e.g. passwords or cryptographic keys of applications running in the VM. In this paper, inspired by the moving target defense (MTD) scheme, we propose a novel approach to increase the security of an application's sensitive data in the VM by continuously moving the sensitive data among several memory allocations (blocks) in Random Access Memory (RAM). A movement function is added to the application source code so that it runs concurrently with the application's main function. Our approach can reduce the possibility of the VM's sensitive data in memory being leaked into a memory dump file by 25% and secures the sensitive data against Spectre and Meltdown attacks. The overhead of our approach depends on the number and the size of the sensitive data items.
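The movement function can be pictured as a background loop that keeps relocating the secret and wiping its previous location, so that any single memory snapshot is unlikely to contain it. The Python sketch below only illustrates this idea; a real implementation operates on the application's raw memory, and the block count and interval here are arbitrary assumptions.

```python
import random, threading

BLOCKS = [bytearray(32) for _ in range(8)]    # candidate memory blocks
current = 0

def place(secret: bytes):
    BLOCKS[current][: len(secret)] = secret

def move_loop(stop, interval=0.05):
    global current
    while not stop.is_set():
        nxt = random.randrange(len(BLOCKS))
        BLOCKS[nxt][:] = BLOCKS[current]      # copy secret to a new block
        if nxt != current:
            BLOCKS[current][:] = bytes(32)    # wipe the old location
        current = nxt
        stop.wait(interval)                   # runs alongside the main logic

stop = threading.Event()
place(b"api-key-0123456789abcdef")
mover = threading.Thread(target=move_loop, args=(stop,), daemon=True)
mover.start()
stop.wait(0.3)      # ... the application's main function would run here ...
stop.set(); mover.join()
print("secret currently in block", current)
```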
Die HPI Schul-Cloud
(2019)
The digital transformation permeates all levels and fields of society, not least the education system. The system is hardly prepared for these changes and responds to them primarily through the personal commitment of its teachers. Structural responses to the lack of high-quality professional development, to poorly equipped classrooms, and to computer systems without professional maintenance have only emerged recently. And even though inertia is widespread among educators, the transformation of the school system also requires a new mentality and new forms of working and cooperating.
Modern teaching requires modern technology and up-to-date IT architectures. Only systems that are readily available to teachers and students, user-friendly, and didactically flexible gain acceptance in schools. For this purpose, we developed the HPI Schul-Cloud. It provides simple access to the latest professionally maintained applications and a wide range of digital media, connects different learning venues, and enables the legally compliant use of communication and collaboration tools.
The development of the HPI Schul-Cloud is all the more necessary because legal requirements, in particular those arising from the EU General Data Protection Regulation, make it impossible for schools to use the cloud applications that are widespread in the working world. Applications common in the education sector are largely technically outdated and not user-friendly.
This forces the federal states into costly in-house developments, with expenditures in the tens of millions, and some of these projects have failed. Thanks to its modular microservice architecture, the federal states can in the future use the HPI Schul-Cloud as the technical basis for their own or joint projects. To this end, a sustainable structure for the further development of the open-source software HPI Schul-Cloud must be created.
This report describes the state of development and the further perspectives of the HPI Schul-Cloud project as of January 2019. 96 schools throughout Germany use the HPI Schul-Cloud, provided by the Hasso Plattner Institute. A further 45 schools and teacher training seminars use the Niedersächsische Bildungscloud, which is technically based on the HPI Schul-Cloud. The project, funded by the Federal Ministry of Education and Research, runs in its current roll-out phase until July 31, 2021. Together with our cooperation partner MINT-EC, we aim to deploy the HPI Schul-Cloud at as many schools of the network as possible.
Live migration is an important feature in modern software-defined data centers and cloud computing environments. Dynamic resource management, load balancing, power saving, and fault tolerance all depend on it. Despite its importance, the cost of live migration cannot be ignored and may result in degraded service availability. This cost includes migration time, downtime, CPU overhead, network traffic, and power consumption. Many research articles discuss the problem of live migration cost from different angles: analyzing the cost and relating it to the parameters that control it, proposing new migration algorithms that minimize it, and predicting it. To the best of our knowledge, most papers that discuss the migration cost problem focus on open-source hypervisors. Among the articles that focus on VMware environments, none has proposed models of migration time, network overhead, and power consumption for single and multiple VM live migration. In this paper, we propose empirical models for live migration time, network overhead, and power consumption for single and multiple VM migration. The proposed models are obtained using a VMware-based testbed.
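For orientation, a commonly used first-order model of iterative pre-copy live migration relates migration time and downtime to the VM memory size, link bandwidth, and page dirty rate. This is a standard textbook model, not necessarily the empirical model fitted in the paper:

```latex
% First-order pre-copy model (illustrative; not the paper's fitted empirical model).
% M: VM memory size, B: link bandwidth, D: page dirty rate, n: pre-copy rounds.
% Round i retransmits the pages dirtied during round i-1.
\begin{align*}
  r &= \frac{D}{B}, \qquad V_i = M\, r^{\,i} \quad (i = 0, \dots, n),\\
  T_{\mathrm{precopy}} &= \sum_{i=0}^{n} \frac{V_i}{B}
                        = \frac{M}{B}\cdot\frac{1 - r^{\,n+1}}{1 - r},
  \qquad T_{\mathrm{down}} \approx \frac{M\, r^{\,n+1}}{B}.
\end{align*}
```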
Blockchain
(2018)
The term blockchain has recently become a buzzword, but only a few know what exactly lies behind this approach. According to a survey issued in the first quarter of 2017, the term is known by only 35 percent of German medium-sized enterprise representatives. Nevertheless, blockchain technology is of great interest to the mass media because of its rapid development and its global capture of different markets.
For example, many see blockchain technology either as an all-purpose weapon that only a few have access to, or as a hacker technology for secret deals on the darknet. The innovation of blockchain technology lies in its successful combination of already existing approaches: decentralized networks, cryptography, and consensus models. This innovative concept makes it possible to exchange values in a decentralized system. At the same time, there is no requirement for trust between its nodes (e.g. users).
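A minimal sketch of the cryptographic linking at the heart of this combination (illustrative only; real blockchains add consensus, digital signatures, and peer-to-peer replication on top of this):

```python
import hashlib
import json
import time

def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def new_block(prev_hash: str, data: str) -> dict:
    return {"prev_hash": prev_hash, "timestamp": time.time(), "data": data}

# Each block commits to its predecessor's hash, so altering any earlier
# block changes every later hash and is immediately detectable.
chain = [new_block("0" * 64, "genesis")]
chain.append(new_block(block_hash(chain[-1]), "Alice pays Bob 5"))
chain.append(new_block(block_hash(chain[-1]), "Bob pays Carol 2"))

for i in range(1, len(chain)):
    assert chain[i]["prev_hash"] == block_hash(chain[i - 1])
print("chain verified")
```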
With this study the Hasso Plattner Institute would like to help readers form their own opinion about blockchain technology, and to distinguish between truly innovative properties and hype.
The authors of the present study analyze the positive and negative properties of the blockchain architecture and suggest possible solutions that can contribute to the efficient use of the technology. We recommend that every company define a clear target for the intended application, achievable at a reasonable cost-benefit ratio, before deciding on this technology. Both the possibilities and the limitations of blockchain technology need to be considered. The relevant steps that must be taken in this respect are summarized for the reader in this study.
Furthermore, this study elaborates on pressing problems such as the scalability of the blockchain, the choice of an appropriate consensus algorithm, and security, including various types of possible attacks and their countermeasures. New blockchains, for example, run the risk of reduced security, as changes to existing technology can lead to security gaps and failures.
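To make the tension between consensus and scalability concrete, here is a toy proof-of-work loop: the higher the difficulty, the more hashing work each block costs, which limits throughput by design (illustrative only; production consensus algorithms vary widely):

```python
import hashlib

def mine(prev_hash: str, data: str, difficulty: int = 4) -> tuple[int, str]:
    """Toy proof-of-work: find a nonce whose block hash starts with
    `difficulty` hex zeros. Higher difficulty means more work per block."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{prev_hash}{data}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1

nonce, digest = mine("0" * 64, "Alice pays Bob 5")
print(nonce, digest)
```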
After discussing the innovative properties and problems of blockchain technology, we turn to its implementation. Companies interested in realizing a blockchain have many implementation options available. The numerous applications either build on their own blockchain or use existing, widespread blockchain systems. Various consortia and projects offer "blockchain-as-a-service" and help other companies to develop, test, and deploy their own applications.
This study gives a detailed overview of diverse relevant applications and projects in the field of blockchain technology. As the technology is still relatively young and developing fast, it lacks uniform standards that would allow different systems to cooperate and to which all developers could adhere. Currently, developers orient themselves toward the Bitcoin, Ethereum, and Hyperledger systems, which serve as the basis for many other blockchain applications.
The goal is to give readers a clear and comprehensive overview of blockchain technology and its capabilities.
Beware of SMOMBIES
(2018)
Several studies have evaluated a user's style of walking for the verification of a claimed identity and showed high authentication accuracy in many settings. In this paper we present a system that successfully verifies a user's identity based on many real-world smartphone placements, including interactions while walking that had not been considered so far. Our contribution is the partitioning of all considered activities into three distinct subsets and a specific one-class Support Vector Machine per subset. Using sensor data of 30 participants collected in a semi-supervised study, we show that unsupervised verification is possible with very low false-acceptance and false-rejection rates. We furthermore show that these subsets can be distinguished with high accuracy and demonstrate that the system can be deployed on off-the-shelf smartphones.
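A minimal sketch of the core idea, one one-class SVM per activity subset, assuming scikit-learn and pre-extracted gait feature vectors (subset names, feature dimensions, and hyperparameters are illustrative, not the paper's):

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Hypothetical activity subsets; the paper partitions all considered
# placements/interactions into three such groups.
SUBSETS = ["in_pocket", "in_hand", "interacting"]

rng = np.random.default_rng(0)
train = {s: rng.normal(size=(200, 16)) for s in SUBSETS}  # genuine-user features

# One one-class model per subset: each learns only the legitimate
# user's gait distribution for that placement/interaction group.
models = {s: OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X)
          for s, X in train.items()}

def verify(subset: str, window: np.ndarray) -> bool:
    """Accept the walker as the enrolled user if the window is an inlier."""
    return models[subset].predict(window.reshape(1, -1))[0] == 1

print(verify("in_pocket", rng.normal(size=16)))        # genuine-like: likely True
print(verify("in_pocket", rng.normal(5.0, 1.0, 16)))   # impostor-like: likely False
```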
ASEDS
(2018)
The massive adoption of social media has provided new ways for individuals to express their opinions and emotions online. In 2016, Facebook introduced a new feature that allows users to express their emotions regarding published content using so-called Facebook reactions. In this paper, a framework for predicting the distribution of Facebook post reactions is presented. For this purpose, we collected a large corpus of Facebook posts together with their reaction labels using the proposed scalable Facebook crawler. The training process utilizes 3 million labeled posts from more than 64,000 unique Facebook pages in diverse categories. The evaluation on standard benchmarks using the proposed features shows promising results compared to previous research. The final model is able to predict the reaction distribution on Facebook posts with a recall score of 0.90 for the "Joy" emotion.
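A toy sketch of the prediction task itself, mapping post text to a normalized reaction distribution (the paper's features and model are richer; reaction names, training data, and the regression setup here are illustrative assumptions):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline

REACTIONS = ["like", "love", "haha", "wow", "sad", "angry"]

# Toy training data: post text -> reaction distribution (rows sum to 1).
posts = ["what a wonderful day", "this is outrageous", "so funny I cried"]
dists = np.array([[0.60, 0.30, 0.05, 0.05, 0.00, 0.00],
                  [0.20, 0.00, 0.00, 0.10, 0.10, 0.60],
                  [0.30, 0.10, 0.55, 0.05, 0.00, 0.00]])

model = make_pipeline(TfidfVectorizer(),
                      MultiOutputRegressor(Ridge()))
model.fit(posts, dists)

pred = np.clip(model.predict(["such a funny moment"])[0], 0, None)
# Renormalize the raw regression output into a proper distribution.
pred = pred / pred.sum() if pred.sum() > 0 else np.full(len(REACTIONS), 1 / len(REACTIONS))
print(dict(zip(REACTIONS, pred.round(2))))
```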
Generating a novel and descriptive caption of an image is drawing increasing interest in the computer vision, natural language processing, and multimedia communities. In this work, we propose an end-to-end trainable deep bidirectional LSTM (Bi-LSTM (Long Short-Term Memory)) model to address the problem. By combining a deep convolutional neural network (CNN) and two separate LSTM networks, our model is capable of learning long-term visual-language interactions by making use of history and future context information in a high-level semantic space. We also explore deep multimodal bidirectional models, in which we increase the depth of the nonlinear transitions in different ways to learn hierarchical visual-language embeddings. Data augmentation techniques such as multi-crop, multi-scale, and vertical mirroring are proposed to prevent over-fitting when training deep models. To understand how our models "translate" an image into a sentence, we visualize and qualitatively analyze the evolution of the Bi-LSTM internal states over time. The effectiveness and generality of the proposed models are evaluated on four benchmark datasets: Flickr8K, Flickr30K, MSCOCO, and Pascal1K. We demonstrate that Bi-LSTM models achieve highly competitive performance on both caption generation and image-sentence retrieval even without integrating an additional mechanism (e.g., object detection, attention model). Our experiments also show that multi-task learning is beneficial to increasing model generality and gaining performance. We further demonstrate that transfer learning with the Bi-LSTM model significantly outperforms previous methods on the Pascal1K dataset.
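A skeletal PyTorch sketch of the architecture's core. For brevity it collapses the paper's two separate forward/backward LSTMs into one bidirectional module, and the dimensions, vocabulary size, and the way the pooled CNN feature conditions the sequence are illustrative assumptions:

```python
import torch
import torch.nn as nn

class BiLSTMCaptioner(nn.Module):
    """A CNN image feature conditions a bidirectional LSTM over caption tokens."""
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.img_proj = nn.Linear(2048, embed_dim)   # e.g. a pooled ResNet feature
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, img_feat, tokens):
        # Prepend the projected image feature as a pseudo-token so both
        # LSTM directions see the visual context.
        img = self.img_proj(img_feat).unsqueeze(1)           # (B, 1, E)
        seq = torch.cat([img, self.embed(tokens)], dim=1)    # (B, 1+T, E)
        hidden, _ = self.lstm(seq)                           # (B, 1+T, 2H)
        return self.out(hidden[:, 1:, :])                    # per-token vocab scores

model = BiLSTMCaptioner()
img_feat = torch.randn(4, 2048)            # batch of pooled CNN features
tokens = torch.randint(0, 10000, (4, 12))  # caption token ids
print(model(img_feat, tokens).shape)       # torch.Size([4, 12, 10000])
```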
Coordinated sampled listening (CSL) is a standardized medium access control protocol for IEEE 802.15.4 networks. Unfortunately, CSL comes without any protection against so-called denial-of-sleep attacks. Such attacks deprive energy-constrained devices of entering low-power sleep modes, thereby draining their charge. Repercussions of denial-of-sleep attacks include long outages, violated quality-of-service guarantees, and reduced customer satisfaction. However, while CSL has no built-in denial-of-sleep defenses, such defenses already exist for a predecessor of CSL, namely ContikiMAC. In this paper, we make two main contributions. First, motivated by the fact that CSL has many advantages over ContikiMAC, we tailor the existing denial-of-sleep defenses for ContikiMAC to CSL. Second, we propose several security enhancements to these defenses. In effect, our denial-of-sleep defenses for CSL mitigate denial-of-sleep attacks significantly better and protect against a larger range of such attacks than the existing defenses for ContikiMAC. We show the soundness of our defenses both analytically and empirically, using an entirely new implementation of CSL.
The relevance of identity data leaks on the Internet is more present than ever. Almost every week we read in the news about leaked databases with more than a million users. Smaller but no less dangerous leaks happen even multiple times a day. The public availability of such leaked data is a major threat to the victims, but it also creates the opportunity to learn not only about the security of service providers but also about the behavior of users when choosing passwords. Our goal is to analyze this data and generate knowledge that can be used to increase both security awareness and security itself. This paper presents a novel approach to the processing and analysis of a large number of bigger and smaller leaks. We evolved from a semi-manual to a fully automated process that requires a minimum of human interaction. Our contribution is the concept and a prototype implementation of a leak-processing workflow that includes the extraction of digital identities from structured and unstructured leak files, the identification of hash routines, and a quality control to ensure leak authenticity. By making use of parallel and distributed programming, we are able to make leaks available for analysis and notification almost immediately after they have been published. Based on the data collected, this paper reveals how easy it is for criminals to collect large numbers of passwords that are stored in plain text or only weakly hashed. We publish these results in the hope of increasing not only the security awareness of Internet users but also security on a technical level on the service-provider side.
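One small step of such a workflow, identifying the likely hash routine of a leaked credential by its format, can be sketched with simple heuristics (illustrative; the paper's identification is one stage of a larger automated pipeline):

```python
import re

# Format-based heuristics for common password hash routines (illustrative).
HASH_PATTERNS = [
    (re.compile(r"^\$2[aby]\$\d{2}\$[./A-Za-z0-9]{53}$"), "bcrypt"),
    (re.compile(r"^\$1\$[^$]{1,8}\$[./A-Za-z0-9]{22}$"), "md5crypt"),
    (re.compile(r"^[0-9a-fA-F]{32}$"), "MD5 (or NTLM)"),
    (re.compile(r"^[0-9a-fA-F]{40}$"), "SHA-1"),
    (re.compile(r"^[0-9a-fA-F]{64}$"), "SHA-256"),
]

def identify_hash(value: str) -> str:
    """Guess the hash routine from the string format alone."""
    for pattern, name in HASH_PATTERNS:
        if pattern.match(value):
            return name
    return "unknown (possibly plain text)"

for sample in ["5f4dcc3b5aa765d61d8327deb882cf99", "hunter2"]:
    print(sample, "->", identify_hash(sample))
```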
Detecting and recognizing text in natural scene images is a challenging, yet not completely solved, task. In recent years, several new systems that try to solve at least one of the two sub-tasks (text detection and text recognition) have been proposed. In this paper we present SEE, a step towards semi-supervised neural networks for scene text detection and recognition that can be optimized end-to-end. Most existing works consist of multiple deep neural networks and several pre-processing steps. In contrast, we propose to use a single deep neural network that learns to detect and recognize text in natural images in a semi-supervised way. SEE integrates and jointly learns a spatial transformer network, which learns to detect text regions in an image, and a text recognition network, which takes the identified text regions and recognizes their textual content. We introduce the idea behind our novel approach and show its feasibility by performing a range of experiments on standard benchmark datasets, where we achieve competitive results.
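The spatial-transformer component can be illustrated compactly as a differentiable crop: a small localization network predicts an affine transform and the region is resampled from the input image. This is a generic STN sketch in PyTorch, not SEE's actual (substantially larger) localization and recognition networks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextRegionSampler(nn.Module):
    """Predicts an affine transform from the image and samples the
    corresponding region, so region detection stays differentiable."""
    def __init__(self):
        super().__init__()
        self.localization = nn.Sequential(
            nn.Conv2d(1, 8, 5), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(), nn.Linear(8 * 4 * 4, 6),
        )
        # Initialize to the identity transform so training starts stably.
        self.localization[-1].weight.data.zero_()
        self.localization[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):
        theta = self.localization(x).view(-1, 2, 3)          # affine parameters
        grid = F.affine_grid(theta, [x.size(0), 1, 32, 100],
                             align_corners=False)             # 32x100 text crop
        return F.grid_sample(x, grid, align_corners=False)

images = torch.randn(2, 1, 64, 200)   # grayscale scene images
crops = TextRegionSampler()(images)
print(crops.shape)                    # torch.Size([2, 1, 32, 100])
```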