Refine
Has Fulltext
- yes (2) (remove)
Document Type
- Doctoral Thesis (2) (remove)
Language
- English (2)
Is part of the Bibliography
- yes (2)
Keywords
- Deep Learning (2) (remove)
Institute
- Hasso-Plattner-Institut für Digital Engineering GmbH (2) (remove)
In this era of high-speed informatization and globalization, online education is no longer an exquisite concept in the ivory tower, but a rapidly developing industry closely relevant to people's daily lives. Numerous lectures are recorded in form of multimedia data, uploaded to the Internet and made publicly accessible from anywhere in this world. These lectures are generally addressed as e-lectures. In recent year, a new popular form of e-lectures, the Massive Open Online Courses (MOOCs), boosts the growth of online education industry and somehow turns "learning online" into a fashion.
As an e-learning provider, besides to keep improving the quality of e-lecture content, to provide better learning environment for online learners is also a highly important task. This task can be preceded in various ways, and one of them is to enhance and upgrade the learning materials provided: e-lectures could be more than videos. Moreover, this process of enhancement or upgrading should be done automatically, without giving extra burdens to the lecturers or teaching teams, and this is the aim of this thesis.
The first part of this thesis is an integrated framework of multi-lingual subtitles production, which can help online learners penetrate the language barrier. The framework consists of Automatic Speech Recognition (ASR), Sentence Boundary Detection (SBD) and Machine Translation (MT), among which the proposed SBD solution is major technical contribution, building on Deep Neural Network (DNN) and Word Vector (WV) and achieving state-of-the-art performance. Besides, a quantitative evaluation with dozens of volunteers is also introduced to measure how these auto-generated subtitles could actually help in context of e-lectures.
Secondly, a technical solution "TOG" (Tree-Structure Outline Generation) is proposed to extract textual content from the displaying slides recorded in video and re-organize them into a hierarchical lecture outline, which may serve in multiple functions, such like preview, navigation and retrieval. TOG runs adaptively and can be roughly divided into intra-slide and inter-slides phases. Table detection and lecture video segmentation can be implemented as sub- or post-application in these two phases respectively. Evaluation on diverse e-lectures shows that all the outlines, tables and segments achieved are trustworthily accurate.
Based on the subtitles and outlines previously created, lecture videos can be further split into sentence units and slide-based segment units. A lecture highlighting process is further applied on these units, in order to capture and mark the most important parts within the corresponding lecture, just as what people do with a pen when reading paper books. Sentence-level highlighting depends on the acoustic analysis on the audio track, while segment-level highlighting focuses on exploring clues from the statistical information of related transcripts and slide content. Both objective and subjective evaluations prove that the proposed lecture highlighting solution is with decent precision and welcomed by users.
All above enhanced e-lecture materials have been already implemented in actual use or made available for implementation by convenient interfaces.
One of the key challenges in modern Facility Management (FM) is to digitally reflect the current state of the built environment, referred to as-is or as-built versus as-designed representation. While the use of Building Information Modeling (BIM) can address the issue of digital representation, the generation and maintenance of BIM data requires a considerable amount of manual work and domain expertise. Another key challenge is being able to monitor the current state of the built environment, which is used to provide feedback and enhance decision making. The need for an integrated solution for all data associated with the operational life cycle of a building is becoming more pronounced as practices from Industry 4.0 are currently being evaluated and adopted for FM use. This research presents an approach for digital representation of indoor environments in their current state within the life cycle of a given building. Such an approach requires the fusion of various sources of digital data. The key to solving such a complex issue of digital data integration, processing and representation is with the use of a Digital Twin (DT). A DT is a digital duplicate of the physical environment, states, and processes. A DT fuses as-designed and as-built digital representations of built environment with as-is data, typically in the form of floorplans, point clouds and BIMs, with additional information layers pertaining to the current and predicted states of an indoor environment or a complete building (e.g., sensor data). The design, implementation and initial testing of prototypical DT software services for indoor environments is presented and described. These DT software services are implemented within a service-oriented paradigm, and their feasibility is presented through functioning and tested key software components within prototypical Service-Oriented System (SOS) implementations. The main outcome of this research shows that key data related to the built environment can be semantically enriched and combined to enable digital representations of indoor environments, based on the concept of a DT. Furthermore, the outcomes of this research show that digital data, related to FM and Architecture, Construction, Engineering, Owner and Occupant (AECOO) activity, can be combined, analyzed and visualized in real-time using a service-oriented approach. This has great potential to benefit decision making related to Operation and Maintenance (O&M) procedures within the scope of the post-construction life cycle stages of typical office buildings.