VLDB 2021
(2021)
The 47th International Conference on Very Large Databases (VLDB'21) was held on August 16-20, 2021 as a hybrid conference. It attracted 180 in-person attendees in Copenhagen and 840 remote attendees. In this paper, we describe our key decisions as general chairs and program committee chairs and share the lessons we learned.
Extract-Transform-Load (ETL) tools are used for the creation, maintenance, and evolution of data warehouses, data marts, and operational data stores. ETL workflows populate those systems with data from various data sources by specifying and executing a DAG of transformations. Over time, hundreds of individual workflows evolve as new sources and new requirements are integrated into the system. The maintenance and evolution of large-scale ETL systems require much time and manual effort. A key problem is to understand the meaning of unfamiliar attribute labels in source and target databases and ETL transformations. Hard-to-understand attribute labels lead to frustration and to extra time spent developing and understanding ETL workflows. We present a schema decryption technique to support ETL developers in understanding cryptic schemata of sources, targets, and ETL transformations. For a given ETL system, our recommender-like approach leverages the large number of mapped attribute labels in existing ETL workflows to produce good and meaningful decryptions. In this way we are able to decrypt attribute labels consisting of a number of unfamiliar few-letter abbreviations, such as UNP_PEN_INT, which we can decrypt to UNPAID_PENALTY_INTEREST. We evaluate our schema decryption approach on three real-world repositories of ETL workflows and show that our approach is able to suggest high-quality decryptions for cryptic attribute labels in a given schema.
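The idea of decrypting a label from previously mapped labels can be illustrated with a minimal sketch. It is not the authors' technique: it assumes that labels are split on underscores, that a cryptic label and its readable counterpart align token by token, and that the most frequent expansion wins. The function names and the sample mappings are hypothetical.

```python
from collections import Counter, defaultdict


def learn_expansions(mapped_labels):
    """Count how often each abbreviated token was mapped to each full token.

    mapped_labels: iterable of (cryptic_label, readable_label) pairs, as they
    might be harvested from existing workflows (hypothetical input format).
    """
    counts = defaultdict(Counter)
    for cryptic, readable in mapped_labels:
        short_tokens = cryptic.split("_")
        long_tokens = readable.split("_")
        if len(short_tokens) != len(long_tokens):
            continue  # assumption: only use pairs that align token by token
        for short, full in zip(short_tokens, long_tokens):
            counts[short][full] += 1
    return counts


def decrypt(label, counts):
    """Replace each token by its most frequent known expansion, if any."""
    decoded = []
    for token in label.split("_"):
        known = counts.get(token)
        # Unknown tokens are kept unchanged rather than guessed.
        decoded.append(known.most_common(1)[0][0] if known else token)
    return "_".join(decoded)


# Hypothetical mappings; INT is ambiguous and resolved by frequency.
mappings = [
    ("UNP_AMT", "UNPAID_AMOUNT"),
    ("PEN_FEE", "PENALTY_FEE"),
    ("INT_RATE", "INTEREST_RATE"),
    ("INT_AMT", "INTEREST_AMOUNT"),
    ("INT_ID", "INTERNAL_ID"),
]
expansions = learn_expansions(mappings)
print(decrypt("UNP_PEN_INT", expansions))
```

A frequency vote is the simplest possible ranking; a recommender-like system as described in the abstract would presumably also weigh context, which this sketch ignores.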
Duplicate detection algorithms produce clusters of database records, each cluster representing a single real-world entity. As most of these algorithms use pairwise comparisons, the resulting (transitive) clusters can be inconsistent: not all records within a cluster are sufficiently similar to be classified as duplicates. Thus, one of many subsequent clustering algorithms can further improve the result. We explain in detail, compare, and evaluate many of these algorithms and introduce three new clustering algorithms in the specific context of duplicate detection. Two of our three new algorithms use the structure of the input graph to create consistent clusters. Our third algorithm, and many other clustering algorithms, focus on the edge weights instead. For evaluation, in contrast to related work, we experiment on true real-world datasets, and in addition examine in great detail various pair-selection strategies used in practice. While no overall winner emerges, we are able to identify best approaches for different situations. In scenarios with larger clusters, our proposed algorithm, Extended Maximum Clique Clustering (EMCC), and Markov Clustering show the best results. EMCC especially outperforms Markov Clustering regarding the precision of the results and additionally has the advantage that it can also be used in scenarios where edge weights are not available.
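The inconsistency of transitive clusters mentioned above can be made concrete with a small sketch. It only builds the transitive closure of pairwise duplicate decisions with a union-find structure and then checks whether every pair inside a cluster was actually classified as a duplicate. It does not implement EMCC or Markov Clustering, and the record identifiers are hypothetical.

```python
from itertools import combinations


def transitive_clusters(records, duplicate_pairs):
    """Group records by the transitive closure of pairwise duplicate decisions."""
    parent = {r: r for r in records}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in duplicate_pairs:
        parent[find(a)] = find(b)

    clusters = {}
    for r in records:
        clusters.setdefault(find(r), set()).add(r)
    return sorted(sorted(c) for c in clusters.values())


def is_consistent(cluster, duplicate_pairs):
    """True if every pair of records in the cluster was classified as duplicate."""
    linked = {frozenset(p) for p in duplicate_pairs}
    return all(frozenset(p) in linked for p in combinations(cluster, 2))


# Hypothetical records: a~b and b~c were matched, but a~c was not.
records = ["a", "b", "c", "d"]
pairs = [("a", "b"), ("b", "c")]
for cluster in transitive_clusters(records, pairs):
    print(cluster, is_consistent(cluster, pairs))
```

In this example the cluster containing a, b, and c is reported as inconsistent, which is the situation a subsequent clustering step is meant to repair.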
The task of expert finding is to rank the experts in the search space given a field of expertise as an input query. In this paper, we propose a topic modeling approach for this task. The proposed model uses latent Dirichlet allocation (LDA) to induce probabilistic topics. In the first step of our algorithm, the main topics of a document collection are extracted using LDA. The extracted topics represent the connection between expert candidates and user queries. In the second step, the topics are used as a bridge to find the probability of selecting each candidate for a given query. The candidates are then ranked based on these probabilities. The experimental results on the Text REtrieval Conference (TREC) Enterprise track for 2005 and 2006 show that the proposed topic-based approach outperforms the state-of-the-art profile- and document-based models, which use information retrieval methods to rank experts. Moreover, we show that the proposed topic-based approach is superior to the improved document-based expert finding systems, which consider additional information such as local context, candidate prior, and query expansion.
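The second step, in which topics bridge queries and candidates, can be sketched as follows. The exact form of the model is not given in the abstract, so this sketch assumes the common mixture score(c, q) = Σ_z P(c | z) · P(z | q), and it takes both distributions as given instead of running LDA. All names and probabilities are hypothetical.

```python
def rank_candidates(p_topic_given_query, p_candidate_given_topic):
    """Rank candidates by sum over topics of P(candidate | topic) * P(topic | query).

    p_topic_given_query: list of P(z | q), one entry per topic.
    p_candidate_given_topic: dict mapping candidate -> list of P(c | z).
    Both would come from a fitted topic model; here they are plain inputs.
    """
    scores = {
        candidate: sum(p_cz * p_zq
                       for p_cz, p_zq in zip(per_topic, p_topic_given_query))
        for candidate, per_topic in p_candidate_given_topic.items()
    }
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Hypothetical distributions over three topics.
p_zq = [0.7, 0.2, 0.1]
p_cz = {
    "alice": [0.6, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.1],
    "carol": [0.3, 0.1, 0.6],
}
ranking, scores = rank_candidates(p_zq, p_cz)
print(ranking)
```

Because the query puts most of its weight on the first topic, the candidate most strongly associated with that topic is ranked first.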
The gamma-ray spectrum of the low-frequency-peaked BL Lac (LBL) object AP Librae is studied, following the discovery of very-high-energy (VHE; E > 100 GeV) gamma-ray emission up to the TeV range by the H.E.S.S. experiment. This makes AP Librae one of the few VHE emitters of the LBL type. The measured spectrum yields a flux of $(8.8 \pm 1.5_{\rm stat} \pm 1.8_{\rm sys}) \times 10^{-12}$ cm$^{-2}$ s$^{-1}$ above 130 GeV and a spectral index of $\Gamma = 2.65 \pm 0.19_{\rm stat} \pm 0.20_{\rm sys}$. This study also makes use of Fermi-LAT observations in the high energy (HE, E > 100 MeV) range, providing the longest continuous light curve (5 years) ever published on this source. The source underwent a flaring event between MJD 56306-56376 in the HE range, with a flux increase of a factor of 3.5 in the 14 day bin light curve and no significant variation in spectral shape with respect to the low-flux state. While the H.E.S.S. and (low state) Fermi-LAT fluxes are in good agreement where they overlap, a spectral curvature between the steep VHE spectrum and the Fermi-LAT spectrum is observed. The maximum of the gamma-ray emission in the spectral energy distribution is located below the GeV energy range.
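As a worked illustration of what the quoted flux and spectral index imply, the sketch below rescales the integral flux to another energy threshold. It assumes a pure power law dN/dE ∝ E^(-Γ) above 130 GeV, for which F(>E) = F(>E0) · (E/E0)^(-(Γ-1)); it uses only the central values from the abstract and ignores the statistical and systematic uncertainties. The function name is hypothetical.

```python
# Central values quoted in the abstract (uncertainties ignored in this sketch).
FLUX_ABOVE_E0 = 8.8e-12   # integral flux above E0, in cm^-2 s^-1
E0_GEV = 130.0            # threshold energy in GeV
GAMMA = 2.65              # photon spectral index


def integral_flux_above(energy_gev):
    """Integral flux above energy_gev for a pure power law (assumption).

    For dN/dE proportional to E^(-Gamma) with Gamma > 1, the integral flux
    scales as E^(-(Gamma - 1)).
    """
    return FLUX_ABOVE_E0 * (energy_gev / E0_GEV) ** (-(GAMMA - 1.0))


for threshold in (130.0, 260.0, 1000.0):
    print(f"F(>{threshold:.0f} GeV) = {integral_flux_above(threshold):.2e} cm^-2 s^-1")
```

The steep index means that doubling the threshold already reduces the integral flux by roughly a factor of three.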
The 2010 very high energy gamma-ray flare and 10 years of multi-wavelength observations of M 87
(2012)
The giant radio galaxy M 87 with its proximity (16 Mpc), famous jet, and very massive black hole ($(3-6) \times 10^{9}\,M_\odot$) provides a unique opportunity to investigate the origin of very high energy (VHE; E > 100 GeV) gamma-ray emission generated in relativistic outflows and the surroundings of supermassive black holes. M 87 has been established as a VHE gamma-ray emitter since 2006. The VHE gamma-ray emission displays strong variability on timescales as short as a day. In this paper, results from a joint VHE monitoring campaign on M 87 by the MAGIC and VERITAS instruments in 2010 are reported. During the campaign, a flare at VHE was detected triggering further observations at VHE (H.E.S.S.), X-rays (Chandra), and radio (43 GHz Very Long Baseline Array, VLBA). The excellent sampling of the VHE gamma-ray light curve enables one to derive a precise temporal characterization of the flare: the single, isolated flare is well described by a two-sided exponential function with significantly different flux rise and decay times of $\tau_d^{\rm rise} = (1.69 \pm 0.30)$ days and $\tau_d^{\rm decay} = (0.611 \pm 0.080)$ days, respectively. While the overall variability pattern of the 2010 flare appears somewhat different from that of previous VHE flares in 2005 and 2008, they share very similar timescales ($\sim$ day), peak fluxes ($\Phi_{>0.35\,{\rm TeV}} \simeq (1-3) \times 10^{-11}$ photons cm$^{-2}$ s$^{-1}$), and VHE spectra. VLBA radio observations at 43 GHz of the inner jet regions indicate no enhanced flux in 2010 in contrast to observations in 2008, where an increase of the radio flux of the innermost core regions coincided with a VHE flare. On the other hand, Chandra X-ray observations taken $\sim$3 days after the peak of the VHE gamma-ray emission reveal an enhanced flux from the core (flux increased by a factor of $\sim$2; variability timescale <2 days).
The long-term (2001-2010) multi-wavelength (MWL) light curve of M 87, spanning from radio to VHE and including data from Hubble Space Telescope, Liverpool Telescope, Very Large Array, and European VLBI Network, is used to further investigate the origin of the VHE gamma-ray emission. No unique, common MWL signature of the three VHE flares has been identified. In the outer kiloparsec jet region, in particular in HST-1, no enhanced MWL activity was detected in 2008 and 2010, disfavoring it as the origin of the VHE flares during these years. Shortly after two of the three flares (2008 and 2010), the X-ray core was observed to be at a higher flux level than its characteristic range (determined from more than 60 monitoring observations: 2002-2009). In 2005, the strong flux dominance of HST-1 could have suppressed the detection of such a feature. Published models for VHE gamma-ray emission from M 87 are reviewed in the light of the new data.
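The two-sided exponential description of the 2010 flare can be sketched as below. The rise and decay times are the central values quoted in the abstract; the peak time and peak flux are normalised placeholders, and the absence of a constant baseline term is an assumption of this sketch, not a statement about the published fit.

```python
import math

# Central values from the abstract, in days.
TAU_RISE_DAYS = 1.69
TAU_DECAY_DAYS = 0.611


def flare_flux(t_days, t_peak=0.0, peak_flux=1.0):
    """Two-sided exponential flare profile (no baseline term, by assumption).

    The flux rises as exp(-(t_peak - t) / tau_rise) before the peak and
    falls as exp(-(t - t_peak) / tau_decay) after it.
    t_peak and peak_flux are placeholder values, not fitted quantities.
    """
    if t_days < t_peak:
        return peak_flux * math.exp(-(t_peak - t_days) / TAU_RISE_DAYS)
    return peak_flux * math.exp(-(t_days - t_peak) / TAU_DECAY_DAYS)


for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"t = {t:+.1f} d  relative flux = {flare_flux(t):.3f}")
```

With these values the profile is asymmetric: one day before the peak the flux is still more than half of the maximum, while one day after the peak it has dropped to about a fifth.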
TeV gamma-ray observations of the young synchrotron-dominated SNRs G1.9+0.3 and G330.2+1.0 with HESS
(2014)
The non-thermal nature of the X-ray emission from the shell-type supernova remnants (SNRs) G1.9+0.3 and G330.2+1.0 is an indication of intense particle acceleration in the shock fronts of both objects. This suggests that the SNRs are prime candidates for very-high-energy (VHE; E > 0.1 TeV) gamma-ray observations. G1.9+0.3, recently established as the youngest known SNR in the Galaxy, also offers a unique opportunity to study the earliest stages of SNR evolution in the VHE domain. The purpose of this work is to probe the level of VHE gamma-ray emission from both SNRs and use this to constrain their physical properties. Observations were conducted with the H.E.S.S. (High Energy Stereoscopic System) Cherenkov Telescope Array over a more than six-year period spanning 2004-2010. The obtained data have effective livetimes of 67 h for G1.9+0.3 and 16 h for G330.2+1.0. The data are analysed in the context of the multiwavelength observations currently available and in the framework of both leptonic and hadronic particle acceleration scenarios. No significant gamma-ray signal from G1.9+0.3 or G330.2+1.0 was detected. Upper limits (99 per cent confidence level) to the TeV flux from G1.9+0.3 and G330.2+1.0 for the assumed spectral index $\Gamma = 2.5$ were set at $5.6 \times 10^{-13}$ cm$^{-2}$ s$^{-1}$ above 0.26 TeV and $3.2 \times 10^{-12}$ cm$^{-2}$ s$^{-1}$ above 0.38 TeV, respectively. In a one-zone leptonic scenario, these upper limits imply lower limits on the interior magnetic field to $B_{\rm G1.9} \gtrsim 12\,\mu$G for G1.9+0.3 and to $B_{\rm G330} \gtrsim 8\,\mu$G for G330.2+1.0. In a hadronic scenario, the low ambient densities and the large distances to the SNRs result in very low predicted fluxes, for which the H.E.S.S. upper limits are not constraining.
Duplicate detection consists in determining different representations of real-world objects in a database. Recent research has considered the use of relationships among object representations to improve duplicate detection. In the general case where relationships form a graph, research has mainly focused on duplicate detection quality/effectiveness. Scalability has been neglected so far, even though it is crucial for large real-world duplicate detection tasks. In this paper we scale up duplicate detection in graph data (DDG) to large amounts of data and pairwise comparisons, using the support of a relational database system. To this end, we first generalize the process of DDG. We then present how to scale algorithms for DDG in space (amount of data processed with limited main memory) and in time. Finally, we explore how complex similarity computation can be performed efficiently. Experiments on data an order of magnitude larger than data considered so far in DDG clearly show that our methods scale to large amounts of data not residing in main memory.
Context. Globular clusters (GCs) are established emitters of high-energy (HE, 100 MeV < E < 100 GeV) gamma-ray radiation which could originate from the cumulative emission of the numerous millisecond pulsars (msPSRs) in the clusters’ cores or from inverse Compton (IC) scattering of relativistic leptons accelerated in the GC environment. These stellar clusters could also constitute a new class of sources in the very-high-energy (VHE, E > 100 GeV) gamma-ray regime, judging from the recent detection of a signal from the direction of Terzan 5 with the H.E.S.S. telescope array. Aims. To search for VHE gamma-ray sources associated with other GCs, and to put constraints on leptonic emission models, we systematically analyzed the observations towards 15 GCs taken with the H.E.S.S. array of imaging atmospheric Cherenkov telescopes. Methods. We searched for point-like and extended VHE gamma-ray emission from each GC in our sample and also performed a stacking analysis combining the data from all GCs to investigate the hypothesis of a population of faint emitters. Assuming IC emission as the origin of the VHE gamma-ray signal from the direction of Terzan 5, we calculated the expected gamma-ray flux from each of the 15 GCs, based on their number of millisecond pulsars, their optical brightness and the energy density of background photon fields. Results. We did not detect significant VHE gamma-ray emission from any of the 15 GCs in either of the two analyses. Given the uncertainties related to the parameter determinations, the obtained flux upper limits allow us to rule out the simple IC/msPSR scaling model for NGC6388 and NGC7078. The upper limits derived from the stacking analyses are factors between 2 and 50 below the flux predicted by the simple leptonic scaling model, depending on the assumed source extent and the dominant target photon fields.
Therefore, Terzan 5 still remains exceptional among all GCs, as the VHE gamma-ray emission either arises from extraordinarily efficient leptonic processes, or from a recent catastrophic event, or is even unrelated to the GC itself.
Search for TeV Gamma-ray emission from GRB 100621A, an extremely bright GRB in X-rays, with HESS
(2014)
The long gamma-ray burst (GRB) 100621A, at the time the brightest X-ray transient ever detected by Swift-XRT in the 0.3-10 keV range, has been observed with the H.E.S.S. imaging air Cherenkov telescope array, sensitive to gamma radiation in the very-high-energy (VHE, >100 GeV) regime. Due to its relatively small redshift of $z \sim 0.5$, the favourable position in the southern sky and the relatively short follow-up time (<700 s after the satellite trigger) of the H.E.S.S. observations, this GRB could be within the sensitivity reach of the H.E.S.S. instrument. The analysis of the H.E.S.S. data shows no indication of emission and yields an integral flux upper limit above $\sim$380 GeV of $4.2 \times 10^{-12}$ cm$^{-2}$ s$^{-1}$ (95% confidence level), assuming a simple Band function extension model. A comparison to a spectral-temporal model, normalised to the prompt flux at sub-MeV energies, constrains the existence of a temporally extended and strong additional hard power law, as has been observed in the other bright X-ray GRB 130427A. A comparison between the H.E.S.S. upper limit and the contemporaneous energy output in X-rays constrains the ratio between the X-ray and VHE gamma-ray fluxes to be greater than 0.4. This value is an important quantity for modelling the afterglow and can constrain leptonic emission scenarios, where leptons are responsible for the X-ray emission and might produce VHE gamma rays.