Refine
Has Fulltext
- yes (3)
Document Type
Language
- English (3)
Is part of the Bibliography
- yes (3)
Keywords
- Time Series Analysis (3) (remove)
The separation of natural and anthropogenically caused climatic changes is an important task of contemporary climate research. For this purpose, a detailed knowledge of the natural variability of the climate during warm stages is a necessary prerequisite. Beside model simulations and historical documents, this knowledge is mostly derived from analyses of so-called climatic proxy data like tree rings or sediment as well as ice cores. In order to be able to appropriately interpret such sources of palaeoclimatic information, suitable approaches of statistical modelling as well as methods of time series analysis are necessary, which are applicable to short, noisy, and non-stationary uni- and multivariate data sets. Correlations between different climatic proxy data within one or more climatological archives contain significant information about the climatic change on longer time scales. Based on an appropriate statistical decomposition of such multivariate time series, one may estimate dimensions in terms of the number of significant, linear independent components of the considered data set. In the presented work, a corresponding approach is introduced, critically discussed, and extended with respect to the analysis of palaeoclimatic time series. Temporal variations of the resulting measures allow to derive information about climatic changes. For an example of trace element abundances and grain-size distributions obtained near the Cape Roberts (Eastern Antarctica), it is shown that the variability of the dimensions of the investigated data sets clearly correlates with the Oligocene/Miocene transition about 24 million years before present as well as regional deglaciation events. Grain-size distributions in sediments give information about the predominance of different transportation as well as deposition mechanisms. Finite mixture models may be used to approximate the corresponding distribution functions appropriately. In order to give a complete description of the statistical uncertainty of the parameter estimates in such models, the concept of asymptotic uncertainty distributions is introduced. The relationship with the mutual component overlap as well as with the information missing due to grouping and truncation of the measured data is discussed for a particular geological example. An analysis of a sequence of grain-size distributions obtained in Lake Baikal reveals that there are certain problems accompanying the application of finite mixture models, which cause an extended climatological interpretation of the results to fail. As an appropriate alternative, a linear principal component analysis is used to decompose the data set into suitable fractions whose temporal variability correlates well with the variations of the average solar insolation on millenial to multi-millenial time scales. The abundance of coarse-grained material is obviously related to the annual snow cover, whereas a significant fraction of fine-grained sediments is likely transported from the Taklamakan desert via dust storms in the spring season.
Time series analysis
(2004)
In this work we consider statistical learning problems. A learning machine aims to extract information from a set of training examples such that it is able to predict the associated label on unseen examples. We consider the case where the resulting classification or regression rule is a combination of simple rules - also called base hypotheses. The so-called boosting algorithms iteratively find a weighted linear combination of base hypotheses that predict well on unseen data. We address the following issues: o The statistical learning theory framework for analyzing boosting methods. We study learning theoretic guarantees on the prediction performance on unseen examples. Recently, large margin classification techniques emerged as a practical result of the theory of generalization, in particular Boosting and Support Vector Machines. A large margin implies a good generalization performance. Hence, we analyze how large the margins in boosting are and find an improved algorithm that is able to generate the maximum margin solution. o How can boosting methods be related to mathematical optimization techniques? To analyze the properties of the resulting classification or regression rule, it is of high importance to understand whether and under which conditions boosting converges. We show that boosting can be used to solve large scale constrained optimization problems, whose solutions are well characterizable. To show this, we relate boosting methods to methods known from mathematical optimization, and derive convergence guarantees for a quite general family of boosting algorithms. o How to make Boosting noise robust? One of the problems of current boosting techniques is that they are sensitive to noise in the training sample. In order to make boosting robust, we transfer the soft margin idea from support vector learning to boosting. We develop theoretically motivated regularized algorithms that exhibit a high noise robustness. o How to adapt boosting to regression problems? Boosting methods are originally designed for classification problems. To extend the boosting idea to regression problems, we use the previous convergence results and relations to semi-infinite programming to design boosting-like algorithms for regression problems. We show that these leveraging algorithms have desirable theoretical and practical properties. o Can boosting techniques be useful in practice? The presented theoretical results are guided by simulation results either to illustrate properties of the proposed algorithms or to show that they work well in practice. We report on successful applications in a non-intrusive power monitoring system, chaotic time series analysis and a drug discovery process. --- Anmerkung: Der Autor ist Träger des von der Mathematisch-Naturwissenschaftlichen Fakultät der Universität Potsdam vergebenen Michelson-Preises für die beste Promotion des Jahres 2001/2002.