The task of expert finding is to rank the experts in a search space given a field of expertise as an input query. In this paper, we propose a topic modeling approach for this task. The proposed model uses latent Dirichlet allocation (LDA) to induce probabilistic topics. In the first step of our algorithm, the main topics of a document collection are extracted using LDA. The extracted topics represent the connection between expert candidates and user queries. In the second step, the topics are used as a bridge to find the probability of selecting each candidate for a given query. The candidates are then ranked based on these probabilities. The experimental results on the Text REtrieval Conference (TREC) Enterprise track for 2005 and 2006 show that the proposed topic-based approach outperforms the state-of-the-art profile- and document-based models, which use information retrieval methods to rank experts. Moreover, we demonstrate the superiority of the proposed topic-based approach over improved document-based expert finding systems that incorporate additional information such as local context, candidate priors, and query expansion.
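The second step described in the abstract, using topics as a bridge between queries and candidates, can be sketched as follows. This is a minimal illustration assuming the LDA distributions have already been inferred; all names and probabilities below are toy values, not from the paper.

```python
# Sketch of the topic-bridge ranking step: score each candidate by
# p(candidate | query) = sum over topics t of p(candidate | t) * p(t | query).
# The topic distributions here are assumed to come from a prior LDA run.

def rank_experts(p_topic_given_query, p_candidate_given_topic):
    """Return candidates sorted by descending p(candidate | query)."""
    scores = {}
    for cand, topic_probs in p_candidate_given_topic.items():
        scores[cand] = sum(p_topic_given_query[t] * p
                           for t, p in topic_probs.items())
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: two topics, three candidates (illustrative numbers).
p_t_q = {0: 0.7, 1: 0.3}                       # topic mixture of the query
p_c_t = {
    "alice": {0: 0.6, 1: 0.1},
    "bob":   {0: 0.2, 1: 0.8},
    "carol": {0: 0.2, 1: 0.1},
}
ranking = rank_experts(p_t_q, p_c_t)
```

Here "alice" ranks first because her topic profile aligns best with the query's topic mixture.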
Context. Globular clusters (GCs) are established emitters of high-energy (HE, 100 MeV < E < 100 GeV) gamma-ray radiation, which could originate from the cumulative emission of the numerous millisecond pulsars (msPSRs) in the clusters' cores or from inverse Compton (IC) scattering of relativistic leptons accelerated in the GC environment. These stellar clusters could also constitute a new class of sources in the very-high-energy (VHE, E > 100 GeV) gamma-ray regime, judging from the recent detection of a signal from the direction of Terzan 5 with the H.E.S.S. telescope array. Aims. To search for VHE gamma-ray sources associated with other GCs, and to put constraints on leptonic emission models, we systematically analyzed the observations towards 15 GCs taken with the H.E.S.S. array of imaging atmospheric Cherenkov telescopes. Methods. We searched for point-like and extended VHE gamma-ray emission from each GC in our sample and also performed a stacking analysis combining the data from all GCs to investigate the hypothesis of a population of faint emitters. Assuming IC emission as the origin of the VHE gamma-ray signal from the direction of Terzan 5, we calculated the expected gamma-ray flux from each of the 15 GCs, based on their number of millisecond pulsars, their optical brightness, and the energy density of background photon fields. Results. We did not detect significant VHE gamma-ray emission from any of the 15 GCs in either of the two analyses. Given the uncertainties related to the parameter determinations, the obtained flux upper limits allow us to rule out the simple IC/msPSR scaling model for NGC 6388 and NGC 7078. The upper limits derived from the stacking analyses are factors of 2 to 50 below the flux predicted by the simple leptonic scaling model, depending on the assumed source extent and the dominant target photon fields.
Therefore, Terzan 5 remains exceptional among all GCs, as its VHE gamma-ray emission either arises from extraordinarily efficient leptonic processes, or from a recent catastrophic event, or is even unrelated to the GC itself.
The quasar PKS 1510-089 (z = 0.361) was observed with the H.E.S.S. array of imaging atmospheric Cherenkov telescopes during high states in the optical and GeV bands, to search for very high energy (VHE, defined as E >= 0.1 TeV) emission. VHE gamma-rays were detected with a statistical significance of 9.2 standard deviations in 15.8 h of H.E.S.S. data taken during March and April 2009. A VHE integral flux of I(0.15 TeV < E < 1.0 TeV) = (1.0 +/- 0.2(stat) +/- 0.2(sys)) x 10(-11) cm(-2) s(-1) is measured. The best-fit power law to the VHE data has a photon index of Gamma = 5.4 +/- 0.7(stat) +/- 0.3(sys). The GeV and optical light curves show pronounced variability during the period of the H.E.S.S. observations. However, there is insufficient evidence to claim statistically significant variability in the VHE data. Because of its relatively high redshift, the VHE flux from PKS 1510-089 should suffer considerable attenuation in intergalactic space due to the extragalactic background light (EBL). Hence, the measured gamma-ray spectrum is used to derive upper limits on the opacity due to the EBL, which are found to be comparable with the limits previously derived from relatively nearby BL Lac objects. Unlike typical VHE-detected blazars, where the broadband spectrum is dominated by nonthermal radiation at all wavelengths, the quasar PKS 1510-089 has a bright thermal component in the optical to UV frequency band. Among all VHE-detected blazars, PKS 1510-089 has the most luminous broad line region. The detection of VHE emission from this quasar indicates a low level of gamma-gamma absorption on the internal optical to UV photon field.
Axionlike particles (ALPs) are hypothetical light (sub-eV) bosons predicted in some extensions of the Standard Model of particle physics. In astrophysical environments comprising high-energy gamma rays and turbulent magnetic fields, the existence of ALPs can modify the energy spectrum of the gamma rays for a sufficiently large coupling between ALPs and photons. This modification would take the form of an irregular behavior of the energy spectrum in a limited energy range. Data from the H.E.S.S. observations of the distant BL Lac object PKS 2155-304 (z = 0.116) are used to derive upper limits at the 95% C.L. on the strength of the ALP coupling to photons, g(gamma a) < 2.1 x 10(-11) GeV(-1), for an ALP mass between 15 and 60 neV. The results depend on assumptions about the magnetic field around the source, which are chosen conservatively. The derived constraints apply to both light pseudoscalar and scalar bosons that couple to the electromagnetic field.
Gamma-ray line signatures can be expected in the very-high-energy (E-gamma > 100 GeV) domain due to self-annihilation or decay of dark matter (DM) particles in space. Such a signal would be readily distinguishable from astrophysical gamma-ray sources, which in most cases produce continuous spectra that span several orders of magnitude in energy. Using data collected with the H.E.S.S. gamma-ray instrument, upper limits on line-like emission are obtained in the energy range between ~500 GeV and ~25 TeV for the central part of the Milky Way halo and for extragalactic observations, complementing recent limits obtained with the Fermi-LAT instrument at lower energies. No statistically significant signal could be found. For monochromatic gamma-ray line emission, flux limits of (2 x 10(-7)-2 x 10(-5)) m(-2) s(-1) sr(-1) and (1 x 10(-8)-2 x 10(-6)) m(-2) s(-1) sr(-1) are obtained for the central part of the Milky Way halo and extragalactic observations, respectively. For a DM particle mass of 1 TeV, limits on the velocity-averaged DM annihilation cross section <sigma v>(chi chi -> gamma gamma) reach ~10(-27) cm(3) s(-1), based on the Einasto parametrization of the Galactic DM halo density profile. DOI: 10.1103/PhysRevLett.110.041301
Introducing the CTA concept
(2013)
The Cherenkov Telescope Array (CTA) is a new observatory for very high-energy (VHE) gamma rays. CTA has ambitious science goals, for which it is necessary to achieve full-sky coverage, to improve the sensitivity by about an order of magnitude, and to span about four decades in energy, from a few tens of GeV to above 100 TeV, with enhanced angular and energy resolution over existing VHE gamma-ray observatories. An international collaboration has formed with more than 1000 members from 27 countries in Europe, Asia, Africa, and North and South America. In 2010 the CTA Consortium completed a Design Study and started a three-year Preparatory Phase leading to production readiness of CTA in 2014. In this paper we introduce the science goals and the concept of CTA, and provide an overview of the project.
Discovery of high and very high-energy emission from the BL Lacertae object SHBL J001355.9-185406
(2013)
The detection of the high-frequency peaked BL Lac object (HBL) SHBL J001355.9-185406 (z = 0.095) at high energies (HE; 100 MeV < E < 300 GeV) and very high energies (VHE; E > 100 GeV) with the Fermi Large Area Telescope (LAT) and the High Energy Stereoscopic System (H.E.S.S.) is reported. Dedicated observations were performed with the H.E.S.S. telescopes, leading to a detection at the 5.5 sigma significance level. The measured flux above 310 GeV is (8.3 +/- 1.7(stat) +/- 1.7(sys)) x 10(-13) photons cm(-2) s(-1) (about 0.6% of that of the Crab Nebula), and the power-law spectrum has a photon index of Gamma = 3.4 +/- 0.5(stat) +/- 0.2(sys). Using 3.5 years of publicly available Fermi-LAT data, a faint counterpart has been detected in the LAT data at the 5.5 sigma significance level, with an integrated flux above 300 MeV of (9.3 +/- 3.4(stat) +/- 0.8(sys)) x 10(-10) photons cm(-2) s(-1) and a photon index of Gamma = 1.96 +/- 0.20(stat) +/- 0.08(sys). X-ray observations with Swift-XRT allow the synchrotron peak energy in vF(v) representation to be located at ~1.0 keV. The broadband spectral energy distribution is modelled with a one-zone synchrotron self-Compton (SSC) model, with the optical data described by black-body emission representing the thermal emission of the host galaxy. The derived parameters are typical of HBLs detected at VHE, with a particle-dominated jet.
Discovery of very high energy gamma-ray emission from the BL Lacertae object PKS 0301-243 with H.E.S.S.
(2013)
The active galactic nucleus PKS 0301-243 (z = 0.266) is a high-synchrotron-peaked BL Lac object that is detected at high energies (HE, 100 MeV < E < 100 GeV) by Fermi/LAT. This paper reports on the discovery of PKS 0301-243 at very high energies (E > 100 GeV) by the High Energy Stereoscopic System (H.E.S.S.) from observations between September 2009 and December 2011 for a total live time of 34.9 h. Gamma rays above 200 GeV are detected at a significance of 9.4 sigma. A hint of variability at the 2.5 sigma level is found. An integral flux I(E > 200GeV) = (3.3 +/- 1.1(stat) +/- 0.7(syst)) x 10(-12) ph cm(-2) s(-1) and a photon index Gamma = 4.6 +/- 0.7(stat) +/- 0.2(syst) are measured. Multi-wavelength light curves in HE, X-ray and optical bands show strong variability, and a minimal variability timescale of eight days is estimated from the optical light curve. A single-zone leptonic synchrotron self-Compton scenario satisfactorily reproduces the multi-wavelength data. In this model, the emitting region is out of equipartition and the jet is particle dominated. Because of its high redshift compared to other sources observed at TeV energies, the very high energy emission from PKS 0301-243 is attenuated by the extragalactic background light (EBL) and the measured spectrum is used to derive an upper limit on the opacity of the EBL.
HESS observations of the binary system PSR B1259-63/LS 2883 around the 2010/2011 periastron passage
(2013)
Aims. We present very high energy (VHE; E > 100 GeV) data from the gamma-ray binary system PSR B1259-63/LS 2883, taken around its periastron passage on 15 December 2010 with the High Energy Stereoscopic System (H.E.S.S.) array of Cherenkov telescopes. We aim to search for a possible TeV counterpart of the GeV flare detected by the Fermi-LAT. In addition, we aim to study the current periastron passage in the context of previous observations taken at similar orbital phases, testing the repetitive behaviour of the source.
Methods. Observations at VHEs were conducted with H.E.S.S. from 9 to 16 January 2011. The total dataset amounts to ~6 h of observing time. The data taken around the 2004 periastron passage were also re-analysed with the current analysis techniques in order to extend the energy spectrum above 3 TeV and to fully compare the observational results from 2004 and 2011.
Results. The source is detected in the 2011 data at a significance level of 11.5 sigma, revealing an average integral flux above 1 TeV of (1.01 +/- 0.18(stat) +/- 0.20(sys)) x 10(-12) cm(-2) s(-1). The differential energy spectrum follows a power-law shape with a spectral index Gamma = 2.92 +/- 0.30(stat) +/- 0.20(sys) and a flux normalisation at 1 TeV of N-0 = (1.95 +/- 0.32(stat) +/- 0.39(sys)) x 10(-12) TeV(-1) cm(-2) s(-1). The measured light curve does not show any evidence for variability of the source on daily timescales. The re-analysis of the 2004 data yields results compatible with the published ones. The differential energy spectrum measured up to ~10 TeV is consistent with a power law with a spectral index Gamma = 2.81 +/- 0.10(stat) +/- 0.20(sys) and a flux normalisation at 1 TeV of N-0 = (1.29 +/- 0.08(stat) +/- 0.26(sys)) x 10(-12) TeV(-1) cm(-2) s(-1).
Conclusions. The measured integral flux and the spectral shape of the 2011 data are compatible with the results obtained around previous periastron passages. The absence of variability in the H.E.S.S. data indicates that the GeV flare observed by the Fermi-LAT in the period also covered by the H.E.S.S. observations originates from a different physical scenario than the TeV emission. Moreover, the comparison of the new results to those from the 2004 observations, made at a similar orbital phase, provides stronger evidence for the repetitive behaviour of the source.
A deep observation campaign carried out on Centaurus A by the High Energy Stereoscopic System (H.E.S.S.) enabled the discovery of gamma-rays from the blazar 1ES 1312-423, 2 degrees away from the radio galaxy. With a differential flux at 1 TeV of phi(1 TeV) = (1.9 +/- 0.6(stat) +/- 0.4(sys)) x 10(-13) cm(-2) s(-1) TeV(-1), corresponding to 0.5 per cent of the Crab Nebula differential flux, and a spectral index Gamma = 2.9 +/- 0.5(stat) +/- 0.2(sys), 1ES 1312-423 is one of the faintest sources ever detected in the very high energy (E > 100 GeV) extragalactic sky. A careful analysis using three and a half years of Fermi Large Area Telescope (Fermi-LAT) data allows the discovery at high energies (E > 100 MeV) of a hard-spectrum (Gamma = 1.4 +/- 0.4(stat) +/- 0.2(sys)) source coincident with 1ES 1312-423. Radio, optical, UV and X-ray observations complete the spectral energy distribution of this blazar, now covering 16 decades in energy. The emission is successfully fitted with a synchrotron self-Compton model for the non-thermal component, combined with a blackbody spectrum for the optical emission from the host galaxy.
Duplicate detection is the task of identifying all groups of records within a data set that represent the same real-world entity. This task is difficult because (i) representations might differ slightly, so some similarity measure must be defined to compare pairs of records, and (ii) data sets might have a high volume, making a pair-wise comparison of all records infeasible. To tackle the second problem, many algorithms have been suggested that partition the data set and compare all record pairs only within each partition. One well-known such approach is the Sorted Neighborhood Method (SNM), which sorts the data according to some key and then advances a window over the data, comparing only records that appear within the same window. We propose several variations of SNM that have in common a varying window size and advancement. The general intuition of such adaptive windows is that there might be regions of high similarity suggesting a larger window size and regions of lower similarity suggesting a smaller window size. We propose and thoroughly evaluate several adaptation strategies, some of which are provably better than the original SNM in terms of efficiency (same results with fewer comparisons).
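The adaptive-window idea can be sketched in a few lines. This is a minimal illustration, not one of the specific strategies evaluated in the paper: the window here simply grows while the record at its boundary still looks similar to the window's first record, and the similarity function is a crude positional character match.

```python
# Sketch of a Sorted Neighborhood variant with an adaptive window.
# The adaptation rule and similarity measure are illustrative assumptions.

def similarity(a, b):
    """Crude similarity: fraction of aligned positions with equal characters."""
    n = max(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 1.0

def adaptive_snm(records, key=lambda r: r, min_window=2, threshold=0.8):
    """Return candidate duplicate pairs from one sorted pass; the window
    expands while the boundary record remains similar to the window start."""
    srt = sorted(records, key=key)
    pairs = []
    for i in range(len(srt) - 1):
        w = min_window
        # Grow the window in a region of high similarity.
        while i + w < len(srt) and similarity(key(srt[i]), key(srt[i + w])) >= threshold:
            w += 1
        for j in range(i + 1, min(i + w, len(srt))):
            pairs.append((srt[i], srt[j]))
    return pairs

recs = ["jones", "jonas", "smith", "smyth", "zhao"]
cands = adaptive_snm(recs)
```

In regions of dissimilar keys the window stays at its minimum, so far fewer than all n(n-1)/2 pairs are compared.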
Extract-Transform-Load (ETL) tools are used for the creation, maintenance, and evolution of data warehouses, data marts, and operational data stores. ETL workflows populate those systems with data from various data sources by specifying and executing a DAG of transformations. Over time, hundreds of individual workflows evolve as new sources and new requirements are integrated into the system. The maintenance and evolution of large-scale ETL systems requires much time and manual effort. A key problem is to understand the meaning of unfamiliar attribute labels in source and target databases and ETL transformations. Hard-to-understand attribute labels lead to frustration and time wasted developing and understanding ETL workflows. We present a schema decryption technique to support ETL developers in understanding cryptic schemata of sources, targets, and ETL transformations. For a given ETL system, our recommender-like approach leverages the large number of mapped attribute labels in existing ETL workflows to produce good and meaningful decryptions. In this way we are able to decrypt attribute labels consisting of a number of unfamiliar few-letter abbreviations, such as UNP_PEN_INT, which we can decrypt to UNPAID_PENALTY_INTEREST. We evaluate our schema decryption approach on three real-world repositories of ETL workflows and show that our approach is able to suggest high-quality decryptions for cryptic attribute labels in a given schema.
Data dependencies, or integrity constraints, are used to improve the quality of a database schema, to optimize queries, and to ensure consistency in a database. In recent years, conditional dependencies have been introduced to analyze and improve data quality. In short, a conditional dependency is a dependency with a limited scope defined by conditions over one or more attributes; only the matching part of the instance must adhere to the dependency. In this paper we focus on conditional inclusion dependencies (CINDs). We generalize the definition of CINDs, distinguishing covering and completeness conditions. We present a new use case for such CINDs, showing their value for solving complex data quality tasks. Further, we define quality measures for conditions, inspired by precision and recall. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. Our algorithms choose not only the condition values but also the condition attributes automatically. Finally, we show that our approach efficiently provides meaningful and helpful results for our use case.
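The precision/recall analogy for condition quality can be sketched on a toy instance. This is an illustrative simplification under assumed definitions: a condition is a predicate over tuples, and a tuple "satisfies" the dependency if its foreign-key value is included in the referenced column; the paper's exact measures may differ.

```python
# Sketch of precision/recall-style quality measures for a condition
# of a conditional inclusion dependency (illustrative definitions).

def condition_quality(tuples, condition, satisfies):
    selected = [t for t in tuples if condition(t)]   # tuples in the condition's scope
    valid = [t for t in tuples if satisfies(t)]      # tuples satisfying the inclusion
    tp = [t for t in selected if satisfies(t)]
    precision = len(tp) / len(selected) if selected else 1.0  # covering quality
    recall = len(tp) / len(valid) if valid else 1.0           # completeness quality
    return precision, recall

# Toy instance: only "web" orders are supposed to reference the customer table.
customers = {"c1", "c2"}
orders = [
    {"cust": "c1", "type": "web"},
    {"cust": "c2", "type": "web"},
    {"cust": "c8", "type": "web"},    # dangling reference inside the scope
    {"cust": "c9", "type": "phone"},  # outside the condition's scope
]
p, r = condition_quality(
    orders,
    condition=lambda t: t["type"] == "web",
    satisfies=lambda t: t["cust"] in customers,
)
```

The condition covers all satisfying tuples (recall 1.0) but also selects one dangling reference, so its precision is 2/3.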
The 2010 very high energy gamma-ray flare and 10 years of multi-wavelength observations of M 87
(2012)
The giant radio galaxy M 87 with its proximity (16 Mpc), famous jet, and very massive black hole ((3-6) x 10(9) M-circle dot) provides a unique opportunity to investigate the origin of very high energy (VHE; E > 100 GeV) gamma-ray emission generated in relativistic outflows and the surroundings of supermassive black holes. M 87 has been established as a VHE gamma-ray emitter since 2006. The VHE gamma-ray emission displays strong variability on timescales as short as a day. In this paper, results from a joint VHE monitoring campaign on M 87 by the MAGIC and VERITAS instruments in 2010 are reported. During the campaign, a flare at VHE was detected, triggering further observations at VHE (H.E.S.S.), X-rays (Chandra), and radio (43 GHz Very Long Baseline Array, VLBA). The excellent sampling of the VHE gamma-ray light curve enables one to derive a precise temporal characterization of the flare: the single, isolated flare is well described by a two-sided exponential function with significantly different flux rise and decay times of tau(rise) = (1.69 +/- 0.30) days and tau(decay) = (0.611 +/- 0.080) days, respectively. While the overall variability pattern of the 2010 flare appears somewhat different from that of previous VHE flares in 2005 and 2008, they share very similar timescales (~1 day), peak fluxes (Phi(>0.35 TeV) ~ (1-3) x 10(-11) photons cm(-2) s(-1)), and VHE spectra. VLBA radio observations at 43 GHz of the inner jet regions indicate no enhanced flux in 2010, in contrast to observations in 2008, where an increase of the radio flux of the innermost core regions coincided with a VHE flare. On the other hand, Chandra X-ray observations taken ~3 days after the peak of the VHE gamma-ray emission reveal an enhanced flux from the core (flux increased by a factor of ~2; variability timescale <2 days).
The long-term (2001-2010) multi-wavelength (MWL) light curve of M 87, spanning from radio to VHE and including data from Hubble Space Telescope, Liverpool Telescope, Very Large Array, and European VLBI Network, is used to further investigate the origin of the VHE gamma-ray emission. No unique, common MWL signature of the three VHE flares has been identified. In the outer kiloparsec jet region, in particular in HST-1, no enhanced MWL activity was detected in 2008 and 2010, disfavoring it as the origin of the VHE flares during these years. Shortly after two of the three flares (2008 and 2010), the X-ray core was observed to be at a higher flux level than its characteristic range (determined from more than 60 monitoring observations: 2002-2009). In 2005, the strong flux dominance of HST-1 could have suppressed the detection of such a feature. Published models for VHE gamma-ray emission from M 87 are reviewed in the light of the new data.
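The two-sided exponential profile used above to characterize the 2010 flare can be written down directly. The rise and decay times below are the values quoted in the text; the peak time and peak flux are placeholder defaults for illustration.

```python
# Sketch of the two-sided exponential flare profile:
# F(t) = F_peak * exp((t - t_peak)/tau_rise)  for t <= t_peak,
# F(t) = F_peak * exp(-(t - t_peak)/tau_decay) for t > t_peak.
import math

TAU_RISE = 1.69    # days (value quoted in the text)
TAU_DECAY = 0.611  # days (value quoted in the text)

def flare_flux(t, t_peak=0.0, f_peak=1.0):
    """Flux at time t (days) relative to a placeholder peak time and flux."""
    if t <= t_peak:
        return f_peak * math.exp((t - t_peak) / TAU_RISE)
    return f_peak * math.exp(-(t - t_peak) / TAU_DECAY)
```

Because tau(decay) < tau(rise), the profile falls off faster after the peak than it climbs before it.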
Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution, HCA-GORDIAN, combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations.
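The level-wise "Apriori-based" style of discovery can be sketched as follows. This is a deliberately naive illustration of the problem and the superset-pruning idea, not the optimized candidate generation or statistics-based pruning of the paper.

```python
# Sketch of level-wise discovery of minimal unique column combinations:
# test combinations by increasing size and prune supersets of known uniques.
from itertools import combinations

def is_unique(rows, cols):
    """A column combination is unique if its projection has no duplicate tuples."""
    proj = [tuple(r[c] for c in cols) for r in rows]
    return len(set(proj)) == len(proj)

def minimal_uniques(rows, columns):
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            # Prune: a superset of a known unique cannot be minimal.
            if any(set(u) <= set(combo) for u in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found

rows = [
    {"first": "Ann", "last": "Lee", "city": "Berlin"},
    {"first": "Ann", "last": "Ray", "city": "Berlin"},
    {"first": "Bob", "last": "Lee", "city": "Potsdam"},
]
uniques = minimal_uniques(rows, ["first", "last", "city"])
```

No single column is unique here, so the minimal uniques are the pairs (first, last) and (last, city); the full triple is pruned as a superset.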
Ground-based gamma-ray astronomy has had a major breakthrough with the impressive results obtained using systems of imaging atmospheric Cherenkov telescopes. Ground-based gamma-ray astronomy has a huge potential in astrophysics, particle physics, and cosmology. CTA is an international initiative to build the next-generation instrument, with a factor of 5-10 improvement in sensitivity in the 100 GeV-10 TeV range and an extension to energies well below 100 GeV and above 100 TeV. CTA will consist of two arrays (one in the north, one in the south) for full-sky coverage and will be operated as an open observatory. The design of CTA is based on currently available technology. This document reports on the status and presents the major design concepts of CTA.
Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values for independently extracting value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
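The core idea of extracting an attribute value from article text can be illustrated with a single hand-written pattern. This is only a toy: iPopulator itself learns structured extractors per attribute from the value structure, rather than relying on fixed patterns like the one below, and the example sentence and numbers are invented.

```python
# Toy illustration of attribute-value extraction from article text
# (hand-written pattern; iPopulator learns such extractors instead).
import re

def extract_population(text):
    """Pull a population figure out of free text, normalizing separators."""
    m = re.search(r"population of ([\d,]+)", text)
    return m.group(1).replace(",", "") if m else None

val = extract_population("Potsdam has a population of 183,154 as of 2020.")
```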
Data obtained from foreign data sources often come with only superficial structural information, such as relation names and attribute names. Other types of metadata that are important for effective integration and meaningful querying of such data sets are missing. In particular, relationships among attributes, such as foreign keys, are crucial metadata for understanding the structure of an unknown database. The discovery of such relationships is difficult, because in principle, for each pair of attributes in the database, each pair of data values must be compared. A precondition for a foreign key is an inclusion dependency (IND) between the key and the foreign key attributes. We present Spider, an algorithm that efficiently finds all INDs in a given relational database. It leverages the sorting facilities of the DBMS but performs the actual comparisons outside of the database to save computation. Spider analyzes very large databases up to an order of magnitude faster than previous approaches. We also evaluate in detail the effectiveness of several heuristics to reduce the number of necessary comparisons. Furthermore, we generalize Spider to find composite INDs covering multiple attributes, and partial INDs, which are true INDs for all but a certain number of values. This last type is particularly relevant when integrating dirty data, as is often the case in the life sciences domain - our driving motivation.
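What is being discovered can be shown with a brute-force baseline: a unary IND holds when one column's value set is contained in another's. This sketch is only the naive quadratic check that Spider is designed to avoid with its sort-merge-style pass; the table layout is an assumption for illustration.

```python
# Naive unary inclusion dependency discovery: test set containment for
# every ordered pair of columns (the brute force that Spider improves on).

def find_unary_inds(tables):
    """tables: {table: {column: list of values}} ->
    list of ((dep_table, dep_col), (ref_table, ref_col)) INDs."""
    cols = {(t, c): set(vs)
            for t, tbl in tables.items()
            for c, vs in tbl.items()}
    inds = []
    for dep, dep_vals in cols.items():
        for ref, ref_vals in cols.items():
            if dep != ref and dep_vals <= ref_vals:
                inds.append((dep, ref))
    return inds

tables = {
    "orders":    {"cust_id": [1, 2, 2]},
    "customers": {"id": [1, 2, 3]},
}
inds = find_unary_inds(tables)
```

Here orders.cust_id ⊆ customers.id holds (a foreign-key candidate), but not the reverse.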
"Data of low quality is ubiquitous in both commercial and scientific databases. This can lead to considerable economic problems," explains the 35-year-old computer science professor, pointing to duplicates as an example. These can arise when different customer data sets are merged within a company, but the integration leaves behind multiple records for the same customer. "Finding such duplicate entries is difficult for two reasons: first, the volume of data is often very large; second, entries about the same person can differ slightly," says Prof. Naumann, describing frequently occurring problems. In his inaugural lecture he will present two approaches to a solution: first, the definition of suitable similarity measures, and second, the use of algorithms that avoid comparing every record with every other one. He will also address fundamental aspects of the comprehensibility, objectivity, completeness, and erroneousness of data.
Duplicate detection is the task of identifying the different representations of the same real-world objects in a database. Recent research has considered the use of relationships among object representations to improve duplicate detection. In the general case, where relationships form a graph, research has mainly focused on duplicate detection quality/effectiveness. Scalability has been neglected so far, even though it is crucial for large real-world duplicate detection tasks. In this paper we scale up duplicate detection in graph data (DDG) to large amounts of data and pairwise comparisons, using the support of a relational database system. To this end, we first generalize the process of DDG. We then present how to scale algorithms for DDG in space (amount of data processed with limited main memory) and in time. Finally, we explore how complex similarity computation can be performed efficiently. Experiments on data an order of magnitude larger than the data considered so far in DDG clearly show that our methods scale to large amounts of data not residing in main memory.
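The control flow behind relationship-aware duplicate detection can be sketched in memory: when a pair is declared a duplicate, dependent pairs are (re-)considered because their similarity may have increased. This is only an illustrative skeleton of the iteration; the disk-based scaling techniques that are the paper's contribution are omitted, and the similarity model and data are invented.

```python
# Sketch of the DDG iteration: duplicate decisions propagate along
# relationships to dependent candidate pairs (in-memory toy version).
from collections import deque

def ddg(pairs, influences, is_duplicate):
    """pairs: initial candidate pairs; influences: pair -> dependent pairs."""
    queue = deque(pairs)
    seen = set(pairs)
    duplicates = set()
    while queue:
        pair = queue.popleft()
        if is_duplicate(pair, duplicates):
            duplicates.add(pair)
            for dep in influences.get(pair, []):
                if dep not in seen:  # enqueue newly relevant comparisons
                    seen.add(dep)
                    queue.append(dep)
    return duplicates

# Toy model: a duplicate paper pair boosts the similarity of its author pair.
base_sim = {("p1", "p2"): 0.9, ("a1", "a2"): 0.6}
influences = {("p1", "p2"): [("a1", "a2")]}

def is_duplicate(pair, dups):
    boost = 0.3 if any(pair in influences.get(d, []) for d in dups) else 0.0
    return base_sim[pair] + boost >= 0.8

dups = ddg([("p1", "p2"), ("a1", "a2")], influences, is_duplicate)
```

The author pair alone falls below the threshold (0.6), but the already-confirmed paper duplicate lifts it over the line, which is the relationship effect the abstract describes.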