Robust Simultaneous Inference for Functional Data
Costa Lima, Italo Raony
Type of Degree: PhD Dissertation
Department: Mathematics and Statistics
Advances in modern technology have enabled the collection of complex, high-dimensional datasets, such as curves, 2D or 3D images, and other objects living in a functional space, thus stimulating the investigation of functional data. This phenomenon affects all fields involving applied statistics, such as Geophysics (Ferraty et al., 2005), Environmetrics (Febrero et al., 2008), Ecology (Embling et al., 2012), Chemometrics (Daszykowski et al., 2007), and others; see also Ferraty and Vieu (2006). These references reveal not only the wide variety of fields in which functional data methods can be applied, but also the broad scope of statistical problems, such as regression and inference, that arise in the context of functional data. Hence, functional data analysis (FDA) has become one of the most active fields of research in statistics during the last ten years. In this dissertation, our objective is to develop outlier-resistant methods for estimating the mean curve of a functional dataset that remain valid even in the presence of a significant proportion of outlier curves. We also propose a method for computing a simultaneous confidence band for the mean curve that is robust to outliers. Our work is based on B-spline smoothing, together with LAD-based and M-based estimation techniques. We extend both methods to the estimation of the difference between the mean functions of two populations, and obtain a robust test statistic for that difference. We analyze the asymptotic properties of the proposed estimators, proving weak consistency and, for the M-based estimator, asymptotic normality. We conduct an extensive numerical simulation study to evaluate the performance of the proposed methods, and we demonstrate their applicability using real datasets.
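To make the idea of an M-based B-spline estimate of the mean curve concrete, the following is a minimal illustrative sketch in Python. It is not the dissertation's estimator: the Huber loss, the tuning constant c = 1.345, the pooled MAD scale, the knot placement, the iteratively reweighted least squares loop, and the function names bspline_basis and huber_mean_curve are all assumptions chosen for illustration. It also assumes every curve is observed on a common grid in [0, 1].

```python
import numpy as np
from scipy.interpolate import BSpline


def bspline_basis(x, n_interior=6, degree=3):
    """Evaluate a clamped B-spline basis on [0, 1] at the points x (hypothetical helper)."""
    interior = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    n_basis = len(knots) - degree - 1
    # Identity coefficients make the spline return every basis function at once.
    return BSpline(knots, np.eye(n_basis), degree)(x)


def huber_mean_curve(x, Y, c=1.345, n_iter=50, tol=1e-8):
    """Huber-type M-estimate of the mean curve; Y has shape (n_curves, n_points)."""
    B = bspline_basis(x)
    # Least-squares start: on a common grid this equals the pooled LS fit.
    beta = np.linalg.lstsq(B, Y.mean(axis=0), rcond=None)[0]
    for _ in range(n_iter):
        R = Y - B @ beta                                   # residuals, one row per curve
        scale = np.median(np.abs(R - np.median(R))) / 0.6745  # pooled MAD scale (assumption)
        U = np.abs(R) / max(scale, 1e-12)
        W = np.where(U <= c, 1.0, c / np.maximum(U, 1e-12))   # Huber weights
        w_col = W.sum(axis=0)                              # total weight at each grid point
        lhs = B.T @ (w_col[:, None] * B)                   # weighted normal equations
        rhs = B.T @ (W * Y).sum(axis=0)
        new_beta = np.linalg.solve(lhs, rhs)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return B @ beta


# Synthetic example: 50 noisy curves, the first 10 shifted upward as outliers.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 101)
truth = np.sin(2 * np.pi * x)
Y = truth + 0.2 * rng.standard_normal((50, x.size))
Y[:10] += 5.0

B = bspline_basis(x)
ls_fit = B @ np.linalg.lstsq(B, Y.mean(axis=0), rcond=None)[0]
robust_fit = huber_mean_curve(x, Y)

ls_err = np.max(np.abs(ls_fit - truth))
robust_err = np.max(np.abs(robust_fit - truth))
print(f"max error, least squares: {ls_err:.3f}")
print(f"max error, Huber M-type:  {robust_err:.3f}")
```

In this synthetic setting the least-squares fit is pulled toward the shifted curves, while the reweighted fit stays closer to the underlying mean. A LAD-type variant could be sketched in the same way by replacing the Huber weights with weights proportional to the inverse absolute residual. Constructing a simultaneous confidence band is a separate step that this sketch does not attempt.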