We prove statistical rates of convergence for kernel-based least squares regression from i.i.d. data using a conjugate gradient algorithm, where regularization against overfitting is obtained by early stopping. This method is related to Kernel Partial Least Squares, a regression method that combines supervised dimensionality reduction with least squares projection. Following the setting introduced in earlier related literature, we study so-called "fast convergence rates" depending on the regularity of the target regression function (measured by a source condition in terms of the kernel integral operator) and on the effective dimensionality of the data mapped into the kernel space. We obtain upper bounds, essentially matching known minimax lower bounds, for the L^2 (prediction) norm as well as for the stronger Hilbert norm, if the true regression function belongs to the reproducing kernel Hilbert space. If the latter assumption is not fulfilled, we obtain similar convergence rates for appropriate norms, provided additional unlabeled data are available.
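The role of early stopping can be illustrated with a minimal sketch. It assumes a Gaussian kernel and plain conjugate gradient applied to the linear system K α = y; the abstract does not specify the kernel or the exact conjugate gradient variant, so the function names, the kernel width, and the choice of system are illustrative assumptions only. The number of iterations `n_iter` plays the role of the regularization parameter.

```python
import math

def gaussian_kernel(x, z, width=0.5):
    # Assumed kernel; the abstract does not fix a particular one.
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(K, v):
    return [dot(row, v) for row in K]

def kernel_cg(xs, ys, n_iter, width=0.5):
    """Run at most n_iter conjugate gradient steps on K alpha = y, from alpha = 0.

    Stopping after few iterations acts as regularization (early stopping).
    """
    K = [[gaussian_kernel(a, b, width) for b in xs] for a in xs]
    alpha = [0.0] * len(xs)
    r = list(ys)              # residual y - K alpha (alpha starts at zero)
    p = list(r)               # first search direction
    rs = dot(r, r)
    for _ in range(n_iter):
        if rs < 1e-20:        # already converged; avoid dividing by ~0
            break
        Kp = matvec(K, p)
        step = rs / dot(p, Kp)
        alpha = [a + step * pi for a, pi in zip(alpha, p)]
        r = [ri - step * kpi for ri, kpi in zip(r, Kp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return alpha

def predict(xs, alpha, x_new, width=0.5):
    # Kernel expansion f(x) = sum_i alpha_i k(x_i, x)
    return sum(a * gaussian_kernel(x, x_new, width) for a, x in zip(alpha, xs))
```

In practice `n_iter` would be chosen by a data-driven rule such as a hold-out set. With as many iterations as data points the fit interpolates the training data, which is exactly the overfitting that early stopping is meant to prevent.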
Using an algorithm based on a retrospective rejection sampling scheme, we propose an exact simulation of a Brownian diffusion whose drift admits several jumps. We treat explicitly and extensively the case of two jumps, providing numerical simulations. Our main contribution is to manage the technical difficulty due to the presence of two jumps thanks to a new explicit expression of the transition density of the skew Brownian motion with two semipermeable barriers and a constant drift.
When trying to extend the Hodge theory for elliptic complexes on compact closed manifolds to the case of compact manifolds with boundary, one is led to a boundary value problem for the Laplacian of the complex, which is usually referred to as the Neumann problem. We study the Neumann problem for a larger class of sequences of differential operators on a compact manifold with boundary. These are sequences of small curvature, i.e., sequences with the property that the composition of any two neighbouring operators has order less than two.
In many statistical applications, the aim is to model the relationship between covariates and some outcomes. The choice of an appropriate model depends on the outcome and the research objectives: linear models for continuous outcomes, logistic models for binary outcomes and the Cox model for time-to-event data. In epidemiological, medical, biological, societal and economic studies, logistic regression is widely used to describe the relationship between a binary response variable and a set of explanatory covariates. However, epidemiologic cohort studies are quite expensive in terms of data management, since following up a large number of individuals takes a long time. Therefore, the case-cohort design is applied to reduce the cost and time of data collection. Case-cohort sampling draws a small random sample from the entire cohort, which is called the subcohort. The advantage of this design is that the covariate and follow-up data are recorded only for the subcohort and for all cases (all members of the cohort who develop the event of interest during the follow-up process).
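The sampling scheme described above can be sketched as follows. This is a minimal illustration, assuming the cohort is a list of dictionaries with a boolean `"case"` field; the function name, the field names and the fixed-size simple random subcohort are assumptions made for the example, not details taken from the thesis.

```python
import random

def case_cohort_sample(cohort, subcohort_fraction, seed=0):
    """Return the individuals whose covariates would be recorded.

    These are the members of a simple random subcohort plus every case,
    whether or not the case was drawn into the subcohort.
    """
    rng = random.Random(seed)
    n_sub = round(subcohort_fraction * len(cohort))
    subcohort_ids = set(rng.sample(range(len(cohort)), n_sub))
    selected = []
    for i, person in enumerate(cohort):
        in_sub = i in subcohort_ids
        if in_sub or person["case"]:
            # Keep the original fields and note how the person was selected.
            selected.append({**person, "id": i, "in_subcohort": in_sub})
    return selected

# Hypothetical cohort: 1000 individuals, every tenth one is a case.
cohort = [{"case": i % 10 == 0} for i in range(1000)]
sample = case_cohort_sample(cohort, subcohort_fraction=0.25, seed=1)
```

Covariates are then needed only for the individuals in `sample`, which is much smaller than the full cohort whenever the event is rare and the subcohort fraction is small.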
In this thesis, we investigate estimation in the logistic model under the case-cohort design. First, a model with a binary response and a binary covariate is considered. The maximum likelihood estimator (MLE) is described and its asymptotic properties are established. An estimator for the asymptotic variance of the estimator based on the maximum likelihood approach is proposed; this estimator differs slightly from the estimator introduced by Prentice (1986). Simulation results for several proportions of the subcohort show that the proposed estimator gives lower empirical bias and empirical variance than Prentice's estimator.
Then the MLE in logistic regression with a discrete covariate under the case-cohort design is studied. Here the approach of the binary covariate model is extended. Asymptotic normality of the estimators is proved, from which standard errors for the estimators can be derived. The simulation study demonstrates the estimation procedure for the logistic regression model with a one-dimensional discrete covariate. Simulation results for several proportions of the subcohort and different choices of the underlying parameters indicate that the estimator developed here performs reasonably well. Moreover, a comparison between theoretical values and simulation results for the asymptotic variance of the estimator is presented.
Clearly, logistic regression is sufficient when the binary outcome is available for all subjects and for a fixed time interval. Nevertheless, in practice, the observations in clinical trials are frequently collected over different time periods, and subjects may drop out or relapse from other causes during follow-up. Hence, logistic regression is not appropriate for incomplete follow-up data; for example, an individual may drop out of the study before the end of data collection, or may not have experienced the event of interest by the end of the study. These observations are called censored observations. Survival analysis is needed to handle these problems. Moreover, it takes the time to the occurrence of the event of interest into account. The Cox model, which can effectively handle censored data, has been widely used in survival analysis. Cox (1972) proposed this model, which focuses on the hazard function. The Cox model is assumed to be
λ(t|x) = λ0(t) exp(β^T x)
where λ0(t) is an unspecified baseline hazard at time t, x is the vector of covariates, and β is a p-dimensional vector of coefficients.
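Because the baseline hazard λ0(t) is left unspecified, β is usually estimated by maximizing the partial likelihood, in which λ0(t) cancels out. The following minimal sketch evaluates the log partial likelihood for right-censored data. It assumes there are no tied event times, and the function and argument names are illustrative rather than taken from the thesis.

```python
import math

def cox_partial_loglik(beta, times, events, covariates):
    """Log partial likelihood of the Cox model, assuming no tied event times.

    beta       : list of p coefficients
    times      : observed times (event or censoring)
    events     : 1 if the event was observed, 0 if censored
    covariates : list of covariate vectors, one per individual
    """
    def linear_predictor(x):
        return sum(b * xi for b, xi in zip(beta, x))

    loglik = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored observations contribute only via risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_sum = sum(
            math.exp(linear_predictor(covariates[j]))
            for j, t_j in enumerate(times)
            if t_j >= t_i
        )
        loglik += linear_predictor(covariates[i]) - math.log(risk_sum)
    return loglik
```

At β = 0 every individual in a risk set has the same weight, so the value reduces to minus the sum of the logarithms of the risk set sizes at the observed event times. The maximizer of this function over β is the maximum partial likelihood estimator (MPLE).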
In this thesis, the Cox model is considered from the viewpoint of experimental design. The estimability of the parameter β0 in the Cox model, where β0 denotes the true value of β, and the choice of optimal covariates are investigated. We give new representations of the observed information matrix I_n(β) and extend results for the Cox model of Andersen and Gill (1982). In this way, conditions for the estimability of β0 are formulated. Under some regularity conditions, Σ is the inverse of the asymptotic variance matrix of the MPLE of β0 in the Cox model, and some properties of the asymptotic variance matrix of the MPLE are highlighted. Based on the results on asymptotic estimability, the calculation of locally optimal covariates is considered and illustrated in examples. In a sensitivity analysis, the efficiency of given covariates is calculated. Efficiencies are then determined for neighborhoods of the exponential models. It appears that, for fixed parameters β0, the efficiencies do not change very much across different baseline hazard functions. Some proposals for applicable optimal covariates and a calculation procedure for finding optimal covariates are discussed.
Furthermore, the extension of the Cox model in which time-dependent coefficients are allowed is investigated. In this situation, the maximum local partial likelihood estimator for estimating the coefficient function β(·) is described. Based on this estimator, we formulate a new test procedure for testing whether a one-dimensional coefficient function β(·) has a prespecified parametric form, say β(·; ϑ). The score function derived from the local constant partial likelihood function at d distinct grid points is considered. It is shown that the distribution of the properly standardized quadratic form of this d-dimensional vector under the null hypothesis tends to a Chi-squared distribution. Moreover, the limit statement remains true when the unknown ϑ0 is replaced by the MPLE in the hypothetical model, and an asymptotic α-test is given by the quantiles or p-values of the limiting Chi-squared distribution. Finally, we propose a bootstrap version of this test. The bootstrap test is only defined for the special case of testing whether the coefficient function is constant. A simulation study illustrates the behavior of the bootstrap test under the null hypothesis and a special alternative. It gives quite good results for the chosen underlying model.
References
P. K. Andersen and R. D. Gill. Cox's regression model for counting processes: a large sample study. Ann. Statist., 10(4):1100-1120, 1982.
D. R. Cox. Regression models and life-tables. J. Roy. Statist. Soc. Ser. B, 34:187-220, 1972.
R. L. Prentice. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika, 73(1):1-11, 1986.
We consider a statistical inverse learning problem, where we observe the image of a function f through a linear operator A at i.i.d. random design points X_i, superposed with additional noise. The distribution of the design points is unknown and can be very general. We analyze simultaneously the direct (estimation of Af) and the inverse (estimation of f) learning problems. In this general framework, we obtain strong and weak minimax optimal rates of convergence (as the number of observations n grows large) for a large class of spectral regularization methods over regularity classes defined through appropriate source conditions. This improves on or completes previous results obtained in related settings. The optimality of the obtained rates is shown not only in the exponent in n but also in the explicit dependence of the constant factor on the variance of the noise and the radius of the source condition set.
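Spectral regularization methods replace the unstable inverse 1/σ of each eigenvalue σ by a bounded filter value. The sketch below shows three standard filters (Tikhonov, spectral cut-off and Landweber iteration) acting on a problem that is assumed to be already diagonalized; the abstract does not list which methods it covers, so the choice of filters and all names here are illustrative assumptions.

```python
def tikhonov(sigma, lam):
    # Tikhonov filter: 1 / (sigma + lambda)
    return 1.0 / (sigma + lam)

def spectral_cutoff(sigma, lam):
    # Invert only eigenvalues at or above the threshold lambda.
    return 1.0 / sigma if sigma >= lam else 0.0

def landweber(sigma, n_steps):
    # Landweber iteration with unit step size; assumes 0 < sigma <= 1.
    # Equals (1 - (1 - sigma) ** n_steps) / sigma.
    return sum((1.0 - sigma) ** k for k in range(n_steps))

def regularized_solution(eigenvalues, coefficients, filter_fn):
    """Apply a spectral filter to the data coefficients, eigenvalue by eigenvalue."""
    return [filter_fn(s) * c for s, c in zip(eigenvalues, coefficients)]

# Hypothetical diagonal problem with one well-conditioned and one tiny eigenvalue.
eigenvalues = [1.0, 0.01]
coefficients = [2.0, 0.02]
solution = regularized_solution(
    eigenvalues, coefficients, lambda s: spectral_cutoff(s, 0.1)
)
```

In all three filters the regularization parameter (λ, or the inverse of the number of steps) controls how strongly components belonging to small eigenvalues are damped, which is where the noise is amplified most.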
The aim of this paper is to bring together two areas which are of great importance for the study of overdetermined boundary value problems. The first area is homological algebra, which is the main tool in constructing the formal theory of overdetermined problems. The second area is the global calculus of pseudodifferential operators, which allows one to develop explicit analysis.
This article assesses the distance between the laws of stochastic differential equations with multiplicative Lévy noise on path space in terms of their characteristics. The notion of transportation distance on the set of Lévy kernels introduced by Kosenkova and Kulik yields a natural and statistically tractable upper bound on the noise sensitivity. This extends recent results for the additive case in terms of coupling distances to the multiplicative case. The strength of this notion is shown in a statistical implementation for simulations and the example of a benchmark time series in paleoclimate.
We elaborate a boundary Fourier method for studying an analogue of the Hilbert problem for analytic functions within the framework of generalised Cauchy-Riemann equations. The boundary value problem need not satisfy the Shapiro-Lopatinskij condition and so it fails to be Fredholm in Sobolev spaces. We show a solvability condition of the Hilbert problem, which looks like those for ill-posed problems, and construct an explicit formula for approximate solutions.