Functional dependencies (FDs) play an important role in maintaining data quality: they can be used to enforce data consistency and to guide repairs over a database. In this work, we investigate the problem of missing values and its impact on FD discovery. With existing FD discovery algorithms, some genuine FDs cannot be detected because of missing values, while some non-genuine FDs are discovered only because missing values are interpreted under a particular NULL semantics. We define a notion of genuineness and propose algorithms to compute the genuineness score of a discovered FD. This score can be used to identify the genuine FDs among the set of all valid dependencies that hold on the data. We evaluate the quality of our method in extensive experiments over various real-world and semi-synthetic datasets. The results show that our method performs well for relatively large FD sets and accurately captures genuine FDs.
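To illustrate why the choice of NULL semantics matters for FD discovery, the following minimal Python sketch checks a single candidate FD under two common interpretations of missing values: all NULLs are equal to each other, or every NULL is distinct from every other value. It is not the genuineness score proposed in the work. The function name `fd_holds`, its parameters, and the toy `rows` table are assumptions made up for this example.

```python
from itertools import count


def fd_holds(rows, lhs, rhs, null_equals_null):
    """Check whether the FD lhs -> rhs holds on rows (a list of dicts).

    None stands for a missing value. This is a hypothetical helper for
    illustration only, not an algorithm taken from the paper.
    """
    fresh = count()  # source of unique ids for the "NULL != NULL" case

    def key(row, attrs):
        parts = []
        for attr in attrs:
            value = row[attr]
            if value is None:
                # NULL = NULL: every missing value maps to the same marker.
                # NULL != NULL: every missing value gets its own marker.
                marker = ("NULL",) if null_equals_null else ("NULL", next(fresh))
                parts.append(marker)
            else:
                parts.append(("VAL", value))
        return tuple(parts)

    seen = {}  # maps each left-hand-side key to its first right-hand-side key
    for row in rows:
        left, right = key(row, lhs), key(row, rhs)
        if left in seen and seen[left] != right:
            return False  # same left-hand side, different right-hand side
        seen.setdefault(left, right)
    return True


# Toy table (made up): two rows are missing the zip code.
rows = [
    {"zip": "14482", "city": "Potsdam"},
    {"zip": None, "city": "Berlin"},
    {"zip": None, "city": "Potsdam"},
]

print(fd_holds(rows, ["zip"], ["city"], null_equals_null=True))   # False
print(fd_holds(rows, ["zip"], ["city"], null_equals_null=False))  # True
```

On this toy table the same candidate FD, zip -> city, is rejected under one semantics and accepted under the other. Its validity on the data therefore says little about whether it is genuine, which is the kind of ambiguity a genuineness score is meant to resolve.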