Data integration aims to combine data from different sources and to provide users with a unified view of these data. This task is as challenging as it is valuable. In this thesis we propose algorithms for dependency discovery that provide the information necessary for data integration. We focus on inclusion dependencies (INDs) in general and on a special form named conditional inclusion dependencies (CINDs): (i) INDs enable the discovery of structure in a given schema. (ii) INDs and CINDs support the discovery of cross-references or links between schemas.

An IND “A in B” simply states that all values of attribute A are included in the set of values of attribute B. We propose an algorithm that discovers all inclusion dependencies in a relational data source. The challenge of this task lies in the complexity of testing all attribute pairs and, further, of comparing all values of each attribute pair. The complexity of existing approaches depends on the number of attribute pairs, while ours depends only on the number of attributes. Thus, our algorithm makes it possible to profile entirely unknown data sources with large schemas by discovering all INDs. Further, we provide an approach to extract foreign keys from the identified INDs. We extend our IND discovery algorithm to also find three special types of INDs: (i) composite INDs, such as “AB in CD”, (ii) approximate INDs that allow a certain number of values of A not to be included in B, and (iii) prefix and suffix INDs that represent special cross-references between schemas.

Conditional inclusion dependencies are inclusion dependencies with a limited scope defined by conditions over several attributes. Only the matching part of the instance must adhere to the dependency. We generalize the definition of CINDs, distinguishing covering and completeness conditions, and define quality measures for conditions. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. The challenge for this task is twofold: (i) Which (and how many) attributes should be used for the conditions? (ii) Which attribute values should be chosen for the conditions? Previous approaches rely on pre-selected condition attributes or can only discover conditions that meet quality thresholds of 100%.

Our approaches were motivated by two application domains: data integration in the life sciences and link discovery for linked open data. We show the efficiency and the benefits of our approaches for use cases in these domains.
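
To make the IND definition concrete, the following is a minimal Python sketch of unary IND checking. It is not the algorithm proposed in the thesis: it deliberately uses the naive pairwise comparison whose cost grows with the number of attribute pairs, which is the baseline the abstract contrasts itself with. The function names, the table layout (a dict of row lists), and the `min_ratio` parameter are assumptions made for illustration. Measuring approximate INDs as the share of distinct values of A found in B is likewise one possible reading, not a definition taken from the source.

```python
from itertools import permutations


def column_values(rows, attribute):
    """Collect the distinct non-null values of one attribute."""
    return {row[attribute] for row in rows if row[attribute] is not None}


def inclusion_ratio(dependent, referenced):
    """Share of distinct dependent values that also occur in referenced.

    An empty dependent set is trivially included (ratio 1.0).
    """
    if not dependent:
        return 1.0
    return len(dependent & referenced) / len(dependent)


def discover_unary_inds(tables, min_ratio=1.0):
    """Naive pairwise discovery of unary INDs "A in B".

    tables: hypothetical layout, {table_name: [row_dict, ...]}.
    min_ratio: 1.0 yields exact INDs; lower values yield approximate
    INDs under the assumed distinct-value measure.
    """
    values = {}
    for table_name, rows in tables.items():
        attributes = rows[0].keys() if rows else []
        for attribute in attributes:
            values[(table_name, attribute)] = column_values(rows, attribute)

    # Every ordered attribute pair is tested: quadratic in the number
    # of attributes, which is the cost the thesis aims to avoid.
    return [
        (a, b)
        for a, b in permutations(values, 2)
        if inclusion_ratio(values[a], values[b]) >= min_ratio
    ]


tables = {
    "orders": [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 2}],
    "customers": [{"id": 1}, {"id": 2}, {"id": 3}],
}
exact_inds = discover_unary_inds(tables)
approximate_inds = discover_unary_inds(tables, min_ratio=0.6)
```

In this toy instance, `exact_inds` contains only the pair stating that `orders.customer_id` is included in `customers.id`, a typical foreign key candidate. With `min_ratio=0.6` the reverse direction is also reported, because two of the three distinct customer ids appear in `orders`.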