Winston Wei Dou

Ensemble Subsampling for Imbalanced Multivariate Two-Sample Tests

Joint with Lisha Chen, Zhihua Qiao

Journal of the American Statistical Association (2013) Volume 108, Issue 504


Some existing nonparametric two-sample tests for equality of multivariate distributions perform unsatisfactorily when the two sample sizes are unbalanced. In particular, the power of these tests tends to diminish with increasingly unbalanced sample sizes. In this article, we propose a new testing procedure to solve this problem. The proposed test, based on the nearest neighbor method by Schilling, employs a novel ensemble subsampling scheme to remedy this issue. More specifically, the test statistic is a weighted average of a collection of statistics, each associated with a randomly selected subsample of the data. We derive the asymptotic distribution of the test statistic under the null hypothesis and show that the new test is consistent against all alternatives when the ratio of the sample sizes either goes to a finite limit or tends to infinity. Via simulated data examples we demonstrate that the new test has increasing power with increasing sample size ratio when the size of the smaller sample is fixed. The test is applied to a real-data example in the field of corporate finance. Supplementary materials for this article are available online.