Institut für Informatik und Computational Science
Refine
Year of publication
- 2013 (58) (remove)
Document Type
- Article (37)
- Doctoral Thesis (15)
- Monograph/Edited Volume (2)
- Conference Proceeding (2)
- Master's Thesis (1)
- Preprint (1)
Keywords
Evaluating the quality of ranking functions is a core task in web search and other information retrieval domains. Because query distributions and item relevance change over time, ranking models often cannot be evaluated accurately on held-out training data. Instead, considerable effort is spent on manually labeling the relevance of query results for test queries in order to track ranking performance. We address the problem of estimating ranking performance as accurately as possible on a fixed labeling budget. Estimates are based on a set of most informative test queries selected by an active sampling distribution. Query labeling costs depend on the number of result items as well as item-specific attributes such as document length. We derive cost-optimal sampling distributions for the commonly used performance measures Discounted Cumulative Gain and Expected Reciprocal Rank. Experiments on web search engine data illustrate significant reductions in labeling costs.
The course timetabling problem can be generally defined as the task of assigning a number of lectures to a limited set of timeslots and rooms, subject to a given set of hard and soft constraints. The modeling language for course timetabling is required to be expressive enough to specify a wide variety of soft constraints and objective functions. Furthermore, the resulting encoding is required to be extensible for capturing new constraints and for switching them between hard and soft, and to be flexible enough to deal with different formulations. In this paper, we propose to make effective use of ASP as a modeling language for course timetabling. We show that our ASP-based approach can naturally satisfy the above requirements, through an ASP encoding of the curriculum-based course timetabling problem proposed in the third track of the second international timetabling competition (ITC-2007). Our encoding is compact and human-readable, since each constraint is individually expressed by either one or two rules. Each hard constraint is expressed by using integrity constraints and aggregates of ASP. Each soft constraint S is expressed by rules in which the head is the form of penalty (S, V, C), and a violation V and its penalty cost C are detected and calculated respectively in the body. We carried out experiments on four different benchmark sets with five different formulations. We succeeded either in improving the bounds or producing the same bounds for many combinations of problem instances and formulations, compared with the previous best known bounds.
Proposing relevant perturbations to biological signaling networks is central to many problems in biology and medicine because it allows for enabling or disabling certain biological outcomes. In contrast to quantitative methods that permit fine-grained (kinetic) analysis, qualitative approaches allow for addressing large-scale networks. This is accomplished by more abstract representations such as logical networks. We elaborate upon such a qualitative approach aiming at the computation of minimal interventions in logical signaling networks relying on Kleene's three-valued logic and fixpoint semantics. We address this problem within answer set programming and show that it greatly outperforms previous work using dedicated algorithms.
Motivation: Logic modeling is a useful tool to study signal transduction across multiple pathways. Logic models can be generated by training a network containing the prior knowledge to phospho-proteomics data. The training can be performed using stochastic optimization procedures, but these are unable to guarantee a global optima or to report the complete family of feasible models. This, however, is essential to provide precise insight in the mechanisms underlaying signal transduction and generate reliable predictions.
Results: We propose the use of Answer Set Programming to explore exhaustively the space of feasible logic models. Toward this end, we have developed caspo, an open-source Python package that provides a powerful platform to learn and characterize logic models by leveraging the rich modeling language and solving technologies of Answer Set Programming. We illustrate the usefulness of caspo by revisiting a model of pro-growth and inflammatory pathways in liver cells. We show that, if experimental error is taken into account, there are thousands (11 700) of models compatible with the data. Despite the large number, we can extract structural features from the models, such as links that are always (or never) present or modules that appear in a mutual exclusive fashion. To further characterize this family of models, we investigate the input-output behavior of the models. We find 91 behaviors across the 11 700 models and we suggest new experiments to discriminate among them. Our results underscore the importance of characterizing in a global and exhaustive manner the family of feasible models, with important implications for experimental design.
If sites, cities, and landscapes are captured at different points in time using technology such as LiDAR, large collections of 3D point clouds result. Their efficient storage, processing, analysis, and presentation constitute a challenging task because of limited computation, memory, and time resources. In this work, we present an approach to detect changes in massive 3D point clouds based on an out-of-core spatial data structure that is designed to store data acquired at different points in time and to efficiently attribute 3D points with distance information. Based on this data structure, we present and evaluate different processing schemes optimized for performing the calculation on the CPU and GPU. In addition, we present a point-based rendering technique adapted for attributed 3D point clouds, to enable effective out-of-core real-time visualization of the computation results. Our approach enables conclusions to be drawn about temporal changes in large highly accurate 3D geodata sets of a captured area at reasonable preprocessing and rendering times. We evaluate our approach with two data sets from different points in time for the urban area of a city, describe its characteristics, and report on applications.
Basic to information and communication technology design, simplicity as a driving concept receives little formal attention from the ICT community. A recent literature review and survey of scholars, researchers, and practitioners conducted through the Information Technology Simply Works (ITSy) European Support Action reveals key findings about current perceptions of and future directions for simplicity in ICT.
Simplicity is a mindset, a way of looking at solutions, an extremely wide-ranging philosophical stance on the world, and thus a deeply rooted cultural paradigm. The culture of "less" can be profoundly disruptive, cutting out existing "standard" elements from products and business models, thereby revolutionizing entire markets.