004 Datenverarbeitung; Informatik
Refine
Year of publication
Document Type
- Article (343)
- Monograph/Edited Volume (168)
- Doctoral Thesis (161)
- Conference Proceeding (61)
- Postprint (50)
- Master's Thesis (10)
- Other (10)
- Preprint (3)
- Part of a Book (2)
- Bachelor Thesis (1)
Language
- English (616)
- German (193)
- Multiple languages (2)
Keywords
- Informatik (21)
- machine learning (20)
- Didaktik (15)
- Hochschuldidaktik (14)
- Ausbildung (13)
- Cloud Computing (13)
- answer set programming (13)
- cloud computing (13)
- maschinelles Lernen (11)
- Forschungsprojekte (10)
Institute
- Institut für Informatik und Computational Science (271)
- Hasso-Plattner-Institut für Digital Engineering gGmbH (214)
- Hasso-Plattner-Institut für Digital Engineering GmbH (138)
- Extern (65)
- Fachgruppe Betriebswirtschaftslehre (40)
- Mathematisch-Naturwissenschaftliche Fakultät (24)
- Wirtschaftswissenschaften (19)
- Institut für Mathematik (16)
- Bürgerliches Recht (12)
- Digital Engineering Fakultät (8)
Intuitive Modelle der Informatik sind gedankliche Vorstellungen über informatische Konzepte, die mit subjektiver Gewissheit verbunden sind. Menschen verwenden sie, wenn sie die Arbeitsweise von Computerprogrammen nachvollziehen oder anderen erklären, die logische Korrektheit eines Programms prüfen oder in einem kreativen Prozess selbst Programme entwickeln. Intuitive Modelle können auf verschiedene Weise repräsentiert und kommuniziert werden, etwa verbal-abstrakt, durch ablauf- oder strukturorientierte Abbildungen und Filme oder konkrete Beispiele. Diskutiert werden in dieser Arbeit grundlegende intuitive Modelle für folgende inhaltliche Aspekte einer Programmausführung: Allokation von Aktivität bei einer Programmausführung, Benennung von Entitäten, Daten, Funktionen, Verarbeitung, Kontrollstrukturen zur Steuerung von Programmläufen, Rekursion, Klassen und Objekte. Mit Hilfe eines Systems von Online-Spielen, der Python Visual Sandbox, werden die psychische Realität verschiedener intuitiver Modelle bei Programmieranfängern nachgewiesen und fehlerhafte Anwendungen (Fehlvorstellungen) identifiziert.
This thesis addresses real-time rendering techniques for 3D information lenses based on the focus & context metaphor. It analyzes, conceives, implements, and reviews its applicability to objects and structures of virtual 3D city models. In contrast to digital terrain models, the application of focus & context visualization to virtual 3D city models is barely researched. However, the purposeful visualization of contextual data of is extreme importance for the interactive exploration and analysis of this field. Programmable hardware enables the implementation of new lens techniques, that allow the augmentation of the perceptive and cognitive quality of the visualization compared to classical perspective projections. A set of 3D information lenses is integrated into a 3D scene-graph system: • Occlusion lenses modify the appearance of virtual 3D city model objects to resolve their occlusion and consequently facilitate the navigation. • Best-view lenses display city model objects in a priority-based manner and mediate their meta information. Thus, they support exploration and navigation of virtual 3D city models. • Color and deformation lenses modify the appearance and geometry of 3D city models to facilitate their perception. The presented techniques for 3D information lenses and their application to virtual 3D city models clarify their potential for interactive visualization and form a base for further development.
Diese Arbeit befasst sich mit der Entwicklung einer Applikation, welche Infrastrukturdaten über Eisenbahnnetze generiert. Dabei bildet die Erzeugung der topologischen Informationen den Schwerpunkt dieser Arbeit. Der Anwender charakterisiert hierfür vorab das gewünschte Eisenbahnnetz, wobei die geforderten Eigenschaften die Randbedingungen darstellen, die bei der Synthese zu beachten sind. Zur Einhaltung dieser Bedingungen wird die Constraint-Programmierung eingesetzt, welche durch ihr spezielles Programmierparadigma konsistente Lösungen effizient erzeugt. Dies wird u.a. durch die Nachnutzung so genannter globaler Constraints erreicht. Aus diesem Grund wird insbesondere auf den Einsatz der Constraint-Programmierung bei der Modellierung und Implementierung der Applikation eingegangen.
Aktuelle Softwaresysteme erlauben die verteilte Authentifizierung von Benutzern über Ver-zeichnisdienste, die sowohl im Intranet als auch im Extranet liegen und die über Domänen-grenzen hinweg die Kooperation mit Partnern ermöglichen. Der nächste Schritt ist es nun, die Autorisierung ebenfalls aus der lokalen Anwendung auszulagern und diese extern durchzu-führen – vorzugsweise unter dem Einfluss der Authentifizierungspartner. Basierend auf der Analyse des State-of-the-Art wird in dieser Arbeit ein Framework vorges-tellt, das die verteilte Autorisierung von ADFS (Active Directory Federation Services) authenti-fizierten Benutzern auf Basis ihrer Gruppen oder ihrer persönlichen Identität ermöglicht. Es wird eine prototypische Implementation mit Diensten entwickelt, die für authentifizierte Be-nutzer Autorisierungsanfragen extern delegieren, sowie ein Dienst, der diese Autorisierungs-anfragen verarbeitet. Zusätzlich zeigt die Arbeit eine Integration dieses Autorisierungs-Frameworks in das .NET Framework, um die praxistaugliche Verwendbarkeit in einer aktuel-len Entwicklungsumgebung zu demonstrieren. Abschließend wird ein Ausblick auf weitere Fragestellungen und Folgearbeiten gegeben.
This work introduces novel internal and external memory algorithms for computing voxel skeletons of massive voxel objects with complex network-like architecture and for converting these voxel skeletons to piecewise linear geometry, that is triangle meshes and piecewise straight lines. The presented techniques help to tackle the challenge of visualizing and analyzing 3d images of increasing size and complexity, which are becoming more and more important in, for example, biological and medical research. Section 2.3.1 contributes to the theoretical foundations of thinning algorithms with a discussion of homotopic thinning in the grid cell model. The grid cell model explicitly represents a cell complex built of faces, edges, and vertices shared between voxels. A characterization of pairs of cells to be deleted is much simpler than characterizations of simple voxels were before. The grid cell model resolves topologically unclear voxel configurations at junctions and locked voxel configurations causing, for example, interior voxels in sets of non-simple voxels. A general conclusion is that the grid cell model is superior to indecomposable voxels for algorithms that need detailed control of topology. Section 2.3.2 introduces a noise-insensitive measure based on the geodesic distance along the boundary to compute two-dimensional skeletons. The measure is able to retain thin object structures if they are geometrically important while ignoring noise on the object's boundary. This combination of properties is not known of other measures. The measure is also used to guide erosion in a thinning process from the boundary towards lines centered within plate-like structures. Geodesic distance based quantities seem to be well suited to robustly identify one- and two-dimensional skeletons. Chapter 6 applies the method to visualization of bone micro-architecture. Chapter 3 describes a novel geometry generation scheme for representing voxel skeletons, which retracts voxel skeletons to piecewise linear geometry per dual cube. The generated triangle meshes and graphs provide a link to geometry processing and efficient rendering of voxel skeletons. The scheme creates non-closed surfaces with boundaries, which contain fewer triangles than a representation of voxel skeletons using closed surfaces like small cubes or iso-surfaces. A conclusion is that thinking specifically about voxel skeleton configurations instead of generic voxel configurations helps to deal with the topological implications. The geometry generation is one foundation of the applications presented in Chapter 6. Chapter 5 presents a novel external memory algorithm for distance ordered homotopic thinning. The presented method extends known algorithms for computing chamfer distance transformations and thinning to execute I/O-efficiently when input is larger than the available main memory. The applied block-wise decomposition schemes are quite simple. Yet it was necessary to carefully analyze effects of block boundaries to devise globally correct external memory variants of known algorithms. In general, doing so is superior to naive block-wise processing ignoring boundary effects. Chapter 6 applies the algorithms in a novel method based on confocal microscopy for quantitative study of micro-vascular networks in the field of microcirculation.
Die vorliegende Arbeit befasst sich mit der wissensbasierten Modellierung von Audio-Signal-Klassifikatoren (ASK) für die Bioakustik. Sie behandelt ein interdisziplinäres Problem, das viele Facetten umfasst. Zu diesen gehören artspezifische bioakustische Fragen, mathematisch-algorithmische Details und Probleme der Repräsentation von Expertenwissen. Es wird eine universelle praktisch anwendbare Methode zur wissensbasierten Modellierung bioakustischer ASK dargestellt und evaluiert. Das Problem der Modellierung von ASK wird dabei durchgängig aus KDD-Perspektive (Knowledge Discovery in Databases) betrachtet. Der grundlegende Ansatz besteht darin, mit Hilfe von modifizierten KDD-Methoden und Data-Mining-Verfahren die Modellierung von ASK wesentlich zu erleichtern. Das etablierte KDD-Paradigma wird mit Hilfe eines detaillierten formalen Modells auf den Bereich der Modellierung von ASK übertragen. Neunzehn elementare KDD-Verfahren bilden die Grundlage eines umfassenden Systems zur wissensbasierten Modellierung von ASK. Methode und Algorithmen werden evaluiert, indem eine sehr umfangreiche Sammlung akustischer Signale des Großen Tümmlers mit ihrer Hilfe untersucht wird. Die Sammlung wurde speziell für diese Arbeit in Eilat (Israel) angefertigt. Insgesamt werden auf Grundlage dieses Audiomaterials vier empirische Einzelstudien durchgeführt: - Auf der Basis von oszillographischen und spektrographischen Darstellungen wird ein phänomenologisches Klassifikationssystem für die vielfältigen Laute des Großen Tümmlers dargestellt. - Mit Hilfe eines Korpus halbsynthetischer Audiodaten werden verschiedene grundlegende Verfahren zur Modellierung und Anwendung von ASK in Hinblick auf ihre Genauigkeit und Robustheit untersucht. - Mit einem speziell entwickelten Clustering-Verfahren werden mehrere Tausend natürliche Pfifflaute des Großen Tümmlers untersucht. Die Ergebnisse werden visualisiert und diskutiert. - Durch maschinelles mustererkennungsbasiertes akustisches Monitoring wird die Emissionsdynamik verschiedener Lauttypen im Verlaufe von vier Wochen untersucht. Etwa 2.5 Millionen Klicklaute werden im Anschluss auf ihre spektralen Charakteristika hin untersucht. Die beschriebene Methode und die dargestellten Algorithmen sind in vielfältiger Hinsicht erweiterbar, ohne dass an ihrer grundlegenden Architektur etwas geändert werden muss. Sie lassen sich leicht in dem gesamten Gebiet der Bioakustik einsetzen. Hiermit besitzen sie auch für angrenzende Disziplinen ein hohes Potential, denn exaktes Wissen über die akustischen Kommunikations- und Sonarsysteme der Tiere wird in der theoretischen Biologie, in den Kognitionswissenschaften, aber auch im praktischen Naturschutz, in Zukunft eine wichtige Rolle spielen.
The requirements of modern e-learning techniques change. Aspects such as community interaction, flexibility, pervasive learning and increasing mobility in communication habits become more important. To meet these challenges e-learning platforms must provide support on mobile learning. Most approaches try to adopt centralised and static e-learning mechanisms to mobile devices. However, often technically it is not possible for all kinds of devices to be connected to a central server. Therefore we introduce an application of a mobile e-learning network which operates totally decentralised with the help of an underlying ad hoc network architecture. Furthermore the concept of ad hoc messaging network (AMNET) is used as basis system architecture for our approach to implement a platform for pervasive mobile e-learning.
This contribution presents a quantitative evaluation procedure for Information Retrieval models and the results of this procedure applied on the enhanced Topic-based Vector Space Model (eTVSM). Since the eTVSM is an ontology-based model, its effectiveness heavily depends on the quality of the underlaying ontology. Therefore the model has been tested with different ontologies to evaluate the impact of those ontologies on the effectiveness of the eTVSM. On the highest level of abstraction, the following results have been observed during our evaluation: First, the theoretically deduced statement that the eTVSM has a similar effecitivity like the classic Vector Space Model if a trivial ontology (every term is a concept and it is independet of any other concepts) is used has been approved. Second, we were able to show that the effectiveness of the eTVSM raises if an ontology is used which is only able to resolve synonyms. We were able to derive such kind of ontology automatically from the WordNet ontology. Third, we observed that more powerful ontologies automatically derived from the WordNet, dramatically dropped the effectiveness of the eTVSM model even clearly below the effectiveness level of the Vector Space Model. Fourth, we were able to show that a manually created and optimized ontology is able to raise the effectiveness of the eTVSM to a level which is clearly above the best effectiveness levels we have found in the literature for the Latent Semantic Index model with compareable document sets.
1. Design and Composition of 3D Geoinformation Services Benjamin Hagedorn 2. Operating System Abstractions for Service-Based Systems Michael Schöbel 3. A Task-oriented Approach to User-centered Design of Service-Based Enterprise Applications Matthias Uflacker 4. A Framework for Adaptive Transport in Service- Oriented Systems based on Performance Prediction Flavius Copaciu 5. Asynchronicity and Loose Coupling in Service-Oriented Architectures Nikola Milanovic
The innovation of information techniques has changed many aspects of our life. In health care field, we can obtain, manage and communicate high-quality large volumetric image data by computer integrated devices, to support medical care. In this dissertation I propose several promising methods that could assist physicians in processing, observing and communicating the image data. They are included in my three research aspects: telemedicine integration, medical image visualization and image segmentation. And these methods are also demonstrated by the demo software that I developed. One of my research point focuses on medical information storage standard in telemedicine, for example DICOM, which is the predominant standard for the storage and communication of medical images. I propose a novel 3D image data storage method, which was lacking in current DICOM standard. I also created a mechanism to make use of the non-standard or private DICOM files. In this thesis I present several rendering techniques on medical image visualization to offer different display manners, both 2D and 3D, for example, cut through data volume in arbitrary degree, rendering the surface shell of the data, and rendering the semi-transparent volume of the data. A hybrid segmentation approach, designed for semi-automated segmentation of radiological image, such as CT, MRI, etc, is proposed in this thesis to get the organ or interested area from the image. This approach takes advantage of the region-based method and boundary-based methods. Three steps compose the hybrid approach: the first step gets coarse segmentation by fuzzy affinity and generates homogeneity operator; the second step divides the image by Voronoi Diagram and reclassifies the regions by the operator to refine segmentation from the previous step; the third step handles vague boundary by level set model. Topics for future research are mentioned in the end, including new supplement for DICOM standard for segmentation information storage, visualization of multimodal image information, and improvement of the segmentation approach to higher dimension.
Answer Set Programming (ASP) emerged in the late 1990s as a new logic programming paradigm, having its roots in nonmonotonic reasoning, deductive databases, and logic programming with negation as failure. The basic idea of ASP is to represent a computational problem as a logic program whose answer sets correspond to solutions, and then to use an answer set solver for finding answer sets of the program. ASP is particularly suited for solving NP-complete search problems. Among these, we find applications to product configuration, diagnosis, and graph-theoretical problems, e.g. finding Hamiltonian cycles. On different lines of ASP research, many extensions of the basic formalism have been proposed. The most intensively studied one is the modelling of preferences in ASP. They constitute a natural and effective way of selecting preferred solutions among a plethora of solutions for a problem. For example, preferences have been successfully used for timetabling, auctioning, and product configuration. In this thesis, we concentrate on preferences within answer set programming. Among several formalisms and semantics for preference handling in ASP, we concentrate on ordered logic programs with the underlying D-, W-, and B-semantics. In this setting, preferences are defined among rules of a logic program. They select preferred answer sets among (standard) answer sets of the underlying logic program. Up to now, those preferred answer sets have been computed either via a compilation method or by meta-interpretation. Hence, the question comes up, whether and how preferences can be integrated into an existing ASP solver. To solve this question, we develop an operational graph-based framework for the computation of answer sets of logic programs. Then, we integrate preferences into this operational approach. We empirically observe that our integrative approach performs in most cases better than the compilation method or meta-interpretation. Another research issue in ASP are optimization methods that remove redundancies, as also found in database query optimizers. For these purposes, the rather recently suggested notion of strong equivalence for ASP can be used. If a program is strongly equivalent to a subprogram of itself, then one can always use the subprogram instead of the original program, a technique which serves as an effective optimization method. Up to now, strong equivalence has not been considered for logic programs with preferences. In this thesis, we tackle this issue and generalize the notion of strong equivalence to ordered logic programs. We give necessary and sufficient conditions for the strong equivalence of two ordered logic programs. Furthermore, we provide program transformations for ordered logic programs and show in how far preferences can be simplified. Finally, we present two new applications for preferences within answer set programming. First, we define new procedures for group decision making, which we apply to the problem of scheduling a group meeting. As a second new application, we reconstruct a linguistic problem appearing in German dialects within ASP. Regarding linguistic studies, there is an ongoing debate about how unique the rule systems of language are in human cognition. The reconstruction of grammatical regularities with tools from computer science has consequences for this debate: if grammars can be modelled this way, then they share core properties with other non-linguistic rule systems.
QuantPrime
(2008)
Background
Medium- to large-scale expression profiling using quantitative polymerase chain reaction (qPCR) assays are becoming increasingly important in genomics research. A major bottleneck in experiment preparation is the design of specific primer pairs, where researchers have to make several informed choices, often outside their area of expertise. Using currently available primer design tools, several interactive decisions have to be made, resulting in lengthy design processes with varying qualities of the assays.
Results
Here we present QuantPrime, an intuitive and user-friendly, fully automated tool for primer pair design in small- to large-scale qPCR analyses. QuantPrime can be used online through the internet http://www.quantprime.de/ or on a local computer after download; it offers design and specificity checking with highly customizable parameters and is ready to use with many publicly available transcriptomes of important higher eukaryotic model organisms and plant crops (currently 295 species in total), while benefiting from exon-intron border and alternative splice variant information in available genome annotations. Experimental results with the model plant Arabidopsis thaliana, the crop Hordeum vulgare and the model green alga Chlamydomonas reinhardtii show success rates of designed primer pairs exceeding 96%.
Conclusion
QuantPrime constitutes a flexible, fully automated web application for reliable primer design for use in larger qPCR experiments, as proven by experimental data. The flexible framework is also open for simple use in other quantification applications, such as hydrolyzation probe design for qPCR and oligonucleotide probe design for quantitative in situ hybridization. Future suggestions made by users can be easily implemented, thus allowing QuantPrime to be developed into a broad-range platform for the design of RNA expression assays.
Bio-jETI
(2008)
Background: With Bio-jETI, we introduce a service platform for interdisciplinary work on biological application domains and illustrate its use in a concrete application concerning statistical data processing in R and xcms for an LC/MS analysis of FAAH gene knockout.
Methods: Bio-jETI uses the jABC environment for service-oriented modeling and design as a graphical process modeling tool and the jETI service integration technology for remote tool execution.
Conclusions: As a service definition and provisioning platform, Bio-jETI has the potential to become a core technology in interdisciplinary service orchestration and technology transfer. Domain experts, like biologists not trained in computer science, directly define complex service orchestrations as process models and use efficient and complex bioinformatics tools in a simple and intuitive way.
One of the main problems in machine learning is to train a predictive model from training data and to make predictions on test data. Most predictive models are constructed under the assumption that the training data is governed by the exact same distribution which the model will later be exposed to. In practice, control over the data collection process is often imperfect. A typical scenario is when labels are collected by questionnaires and one does not have access to the test population. For example, parts of the test population are underrepresented in the survey, out of reach, or do not return the questionnaire. In many applications training data from the test distribution are scarce because they are difficult to obtain or very expensive. Data from auxiliary sources drawn from similar distributions are often cheaply available. This thesis centers around learning under differing training and test distributions and covers several problem settings with different assumptions on the relationship between training and test distributions-including multi-task learning and learning under covariate shift and sample selection bias. Several new models are derived that directly characterize the divergence between training and test distributions, without the intermediate step of estimating training and test distributions separately. The integral part of these models are rescaling weights that match the rescaled or resampled training distribution to the test distribution. Integrated models are studied where only one optimization problem needs to be solved for learning under differing distributions. With a two-step approximation to the integrated models almost any supervised learning algorithm can be adopted to biased training data. In case studies on spam filtering, HIV therapy screening, targeted advertising, and other applications the performance of the new models is compared to state-of-the-art reference methods.
We propose a network structure-based model for heterosis, and investigate it relying on metabolite profiles from Arabidopsis. A simple feed-forward two-layer network model (the Steinbuch matrix) is used in our conceptual approach. It allows for directly relating structural network properties with biological function. Interpreting heterosis as increased adaptability, our model predicts that the biological networks involved show increasing connectivity of regulatory interactions. A detailed analysis of metabolite profile data reveals that the increasing-connectivity prediction is true for graphical Gaussian models in our data from early development. This mirrors properties of observed heterotic Arabidopsis phenotypes. Furthermore, the model predicts a limit for increasing hybrid vigor with increasing heterozygosity—a known phenomenon in the literature.
Proceedings of the 2nd International Workshop on e-learning and Virtual and Remote Laboratories
(2008)
Content Session 1: Architecture of Virtual & Remote Laboratory Infrastructures (I) An Internet-Based Laboratory Course in Chemical Reaction Engineering and Unit Operations Internet Based Laboratory for Experimentation with Multilevel Medium-Power Converters Session 2: Architecture of Virtual & Remote Laboratory Infrastructures (II) Content management and architectural issues of a remote learning laboratory Distributed Software Architecture and Applications for Remote Laboratories Tele-Lab IT-Security: an architecture for an online virtual IT security lab Session 3: New e-learning Techniques for Virtual & Remote Laboratories NeOS: Neuchˆatel Online System A Flexible Instructional Electronics Laboratory with Local and Remote LabWorkbenches in a Grid Simulation of an Intelligent Network - Basic Call State Model Remote Laboratory Session 4: Service-Orientation in Virtual & Remote Laboratories SOA Meets Robots - A Service-Based Software Infrastructure For Remote Laboratories Service Orientation in Education - Intelligent Networks for eLearning / mLearning
Duplicate detection consists in determining different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations to improve duplicate detection. In the general case where relationships form a graph, research has mainly focused on duplicate detection quality/effectiveness. Scalability has been neglected so far, even though it is crucial for large real-world duplicate detection tasks. In this paper we scale up duplicate detection in graph data (DDG) to large amounts of data and pairwise comparisons, using the support of a relational database system. To this end, we first generalize the process of DDG. We then present how to scale algorithms for DDG in space (amount of data processed with limited main memory) and in time. Finally, we explore how complex similarity computation can be performed efficiently. Experiments on data an order of magnitude larger than data considered so far in DDG clearly show that our methods scale to large amounts of data not residing in main memory.
Contents: Artem Polyvanny, Sergey Smirnow, and Mathias Weske The Triconnected Abstraction of Process Models 1 Introduction 2 Business Process Model Abstraction 3 Preliminaries 4 Triconnected Decomposition 4.1 Basic Approach for Process Component Discovery 4.2 SPQR-Tree Decomposition 4.3 SPQR-Tree Fragments in the Context of Process Models 5 Triconnected Abstraction 5.1 Abstraction Rules 5.2 Abstraction Algorithm 6 Related Work and Conclusions