This discovery appears somewhat detrimental to the commonly used linearity and constant variance conditions. Broadly viewed, dimension reduction has always been a central statistical concept. Even with the proof, to understand this phenomenon intuitively is not simple at all.
All our derivations are performed without using the linearity or constant variance condition that is often assumed in the dimension reduction literature. However, numerical experiments repeatedly support the counter-intuitive phenomenon that efficiency improves when conditions 5 and 6 are given up even when they hold, and the ability to perform inference finally allows a formal investigation of this issue.
However, it is also important to recognize the practical usefulness of these conditions. Simulation studies are conducted in Section 4 to demonstrate the finite sample performance and the method is implemented in a real data example in Section 5.
Despite the various estimation methods, it is unclear if any of these estimators are optimal in the sense that they can exhaustively estimate the entire central subspace and have the minimum possible asymptotic estimation variance.
The resulting efficient estimator can exhaustively estimate the central subspace without imposing any distributional assumptions.
One difference is that sufficient statistics are observable, while a sufficient reduction may contain unknown parameters and thus needs to be estimated. This allows us to derive the estimation procedures and perform inference using semiparametric tools. Although large-p regressions are perhaps mainly responsible for renewed interest, dimension-reduction methodology can be useful regardless of the size of p.
Efficiency issues are also considered in more complex semiparametric problems such as regressions with missing covariates [23], skewed distribution families [18, 19], measurement error models [15, 25], partially linear models [16], the Cox model [24], the accelerated failure model [27] or other general survival models [28] and latent variable models [17].
We also derive the efficient score function. Such bounds quantify the minimum efficiency loss that results from generalizing one restrictive model to a more flexible one, and hence they can be important in making the decision of which model to use.
Therefore, the determination of d becomes a problem of determining the number of non-zero eigenvalues of the corresponding matrix, termed kernel matrix in the literature. We conduct simulation studies and a real data analysis to demonstrate the finite sample performance in comparison with several existing methods.
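The eigenvalue view of choosing d can be sketched numerically. The block below is a minimal illustration, not the method of the source: it assumes sliced inverse regression (SIR) as the kernel matrix, and the function names, the number of slices, and the ad hoc eigenvalue `threshold` are all hypothetical choices made for the example.

```python
import numpy as np


def sir_kernel_matrix(x, y, n_slices=10):
    """Return the SIR kernel matrix: the weighted covariance of slice means
    of the standardized covariates (an assumed choice of kernel matrix)."""
    n, p = x.shape
    # Standardize x so that it has zero mean and identity sample covariance.
    cov = np.cov(x, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = (x - x.mean(axis=0)) @ inv_sqrt
    # Slice the observations by the order of y and average z within slices.
    order = np.argsort(y)
    kernel = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        slice_mean = z[idx].mean(axis=0)
        kernel += (len(idx) / n) * np.outer(slice_mean, slice_mean)
    return kernel


def estimate_dimension(kernel, threshold=0.05):
    """Count eigenvalues above a hypothetical, ad hoc threshold."""
    evals = np.linalg.eigvalsh(kernel)[::-1]  # sorted in descending order
    return int(np.sum(evals > threshold)), evals
```

In practice a fixed threshold would usually be replaced by a formal sequential test or an information criterion; the cut-off here only makes the idea of counting non-zero eigenvalues concrete.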
One approach consists of regressing Y on X in two steps. However, deciding the amount of penalty through a data-driven procedure is usually difficult in practice. A potential advantage of sufficient dimension reduction is that predictions based on an estimated R may be substantially less variable than those based on X, without introducing worrisome bias.
In this paper we study the estimation and inference in sufficient dimension reduction.
We assume that the central subspace exists throughout this paper, and use d to denote its dimension. To the best of our knowledge, the efficiency issue has never been discussed in the context of sufficient dimension reduction.
For this goal, p may be regarded as large when it exceeds 2 or 3 because these bounds represent the limits of our ability to view a dataset in full using computer graphics.
In summary, we provide an efficient estimator which can exhaustively estimate the central subspace without imposing any distributional assumptions on the covariate x. Formulated specifically for regression, the following definition (Cook) of a sufficient reduction will help in our pursuit of methods for reducing the dimension of X while en route to estimating E(Y | X).
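The definition referred to here does not survive in this text. For orientation, the standard formulation of a sufficient reduction can be written as follows; it is supplied as an assumption about what the passage intends, with R mapping the p-dimensional X to q dimensions, q ≤ p.

```latex
% Assumed standard formulation: R : \mathbb{R}^p \to \mathbb{R}^q, q \le p,
% is a sufficient reduction if any one of the following statements holds.
\begin{align*}
\text{(i)}\quad   & Y \mid X \sim Y \mid R(X), \\
\text{(ii)}\quad  & X \mid (Y, R(X)) \sim X \mid R(X), \\
\text{(iii)}\quad & X \perp\!\!\!\perp Y \mid R(X).
\end{align*}
```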
In general estimation procedures, using a true quantity to replace an estimated quantity does not necessarily increase or decrease the variability of the estimation of the parameter of interest. Our proposed efficient estimation also provides a possibility for making inference of parameters that uniquely identify the central subspace.
In the literature, vast and significant effort has been devoted to studying the semiparametric efficiency bounds for consistent estimators in semiparametric models. We can think of R(X) as a function that concentrates the relevant information in X. Consequently, the prediction goal is often specialized immediately to the task of estimating the conditional mean function E(Y | X) from the regression of Y on X.
The variance of an estimator could go either way, and in our case, using a known form increased the variability.
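That a known quantity can increase variability is easy to see in a much simpler problem than dimension reduction. The sketch below uses an analogous setting chosen purely for illustration (it is not the setting of the source): estimating a mean from responses that are observed with known probability `pi`. One estimator weights by the true `pi`, the other by the observed fraction. The function name and all parameter values are hypothetical.

```python
import numpy as np


def compare_known_vs_estimated(n=200, reps=5000, mu=2.0, sigma=1.0, pi=0.5, seed=0):
    """Monte Carlo variances of two estimators of the mean mu when each
    response is observed independently with probability pi."""
    rng = np.random.default_rng(seed)
    y = rng.normal(mu, sigma, size=(reps, n))
    observed = rng.random((reps, n)) < pi
    total = (observed * y).sum(axis=1)
    # Weighting by the true observation probability.
    known = total / (n * pi)
    # Weighting by the estimated probability, i.e. the complete-case mean.
    estimated = total / observed.sum(axis=1)
    return known.var(), estimated.var()
```

Under these assumed settings the version that plugs in the true probability should show the larger Monte Carlo variance whenever `mu` is non-zero, which mirrors the direction of the effect described above.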
Consequently, the inferential target in sufficient dimension reduction is often taken to be the central subspace, defined as the intersection of all dimension-reduction subspaces (Cook). One typical semiparametric tool is to obtain estimators by deriving the corresponding influence functions.
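In symbols, the central subspace described above can be written as follows. The notation (a basis matrix β and the symbol S_{Y|X}) is the customary one and is assumed here rather than taken from this text.

```latex
% Assumed customary notation: beta is a p x d matrix, S_{Y|X} the central subspace.
\begin{align*}
& Y \perp\!\!\!\perp X \mid \beta^{\mathsf T} X
  \quad\Longrightarrow\quad
  \operatorname{span}(\beta) \text{ is a dimension-reduction subspace}, \\
& \mathcal{S}_{Y \mid X}
  = \bigcap \{\mathcal{S} : \mathcal{S} \text{ is a dimension-reduction subspace}\},
  \qquad d = \dim(\mathcal{S}_{Y \mid X}).
\end{align*}
```

The intersection is not guaranteed to be a dimension-reduction subspace in general, which is why the existence of the central subspace is stated as an assumption.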
We further construct an efficient estimator, which reaches the minimum asymptotic estimation variance bound among all possible consistent estimators. We finish the paper with a brief discussion in Section 6. When an efficient estimator is obtained, the procedure of estimation can be considered to have reached certain optimality.
Sufficient dimension reduction based on normal and Wishart inverse models. A thesis submitted to the Faculty of the Graduate School of the University of Minnesota.
The goal of sufficient dimension reduction is to estimate the column space of For example, when Y is continuous, we can propose a simple conditional normal model for Supplement to “Efficient estimation in sufficient dimension reduction.
Sufficient dimension reduction and prediction in in §4, where we discuss four inverse regression models, describe the prediction methodology that stems from them and give simulation results to illustrate a non-singular multivariate normal distribution, then R(X)=E(Y |X) is a.
A linearity condition is required for all the existing sufficient dimension reduction methods that deal with missing data. To remove the linearity condition, two new estimating equation procedures are proposed to handle missing response in sufficient dimension reduction: the complete-case estimating equation approach and the inverse probability weighted estimating equation approach.
Summarizing the effect of many covariates through a few linear combinations is an effective way of reducing covariate dimension and is the backbone of (sufficient) dimension reduction. Because the replacement of high-dimensional covariates by low-dimensional linear combinations is.
Sufficient dimension reduction and prediction in regression. by using normal models for the conditional distribution of X | Y. The models in.