ScholarBank@NUS
https://scholarbank.nus.edu.sg
The DSpace digital repository system captures, stores, indexes, preserves, and distributes digital research material.
Thu, 09 Jul 2020 15:07:45 GMT
- Statistical analysis for rounded datahttps://scholarbank.nus.edu.sg/handle/10635/105388Title: Statistical analysis for rounded data
Authors: Bai, Z.; Zheng, S.; Zhang, B.; Hu, G.
Abstract: When random variables do not take discrete values, observed data are often the rounded values of continuous random variables. Errors caused by rounding of data are often neglected by classical statistical theories. While some pioneers have identified the problem and made suggestions to rectify it, few suitable approaches have been proposed. In this paper, we propose an approximate MLE (AMLE) procedure to estimate the parameters and discuss the consistency and asymptotic normality of the estimates. For illustration, we consider the estimates of the parameters in AR(p) and MA(q) models for rounded data. © 2009 Elsevier B.V. All rights reserved.
Sat, 01 Aug 2009 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053882009-08-01T00:00:00Z
- On limit theorem for the eigenvalues of product of two random matriceshttps://scholarbank.nus.edu.sg/handle/10635/105257Title: On limit theorem for the eigenvalues of product of two random matrices
Authors: Bai, Z.D.; Miao, B.; Jin, B.
Abstract: The existence of the limiting spectral distribution (LSD) of the product of two random matrices is proved. One of the random matrices is a sample covariance matrix and the other is an arbitrary Hermitian matrix. In particular, the density function of the LSD of $S_n W_n$ is established, where $S_n$ is a sample covariance matrix and $W_n$ is a Wigner matrix. © 2006 Elsevier Inc. All rights reserved.
Mon, 01 Jan 2007 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052572007-01-01T00:00:00Z
- On estimation of the population spectral distribution from a high-dimensional sample covariance matrixhttps://scholarbank.nus.edu.sg/handle/10635/105253Title: On estimation of the population spectral distribution from a high-dimensional sample covariance matrix
Authors: Bai, Z.; Chen, J.; Yao, J.
Abstract: Sample covariance matrices play a central role in numerous popular statistical methodologies, for example principal components analysis, Kalman filtering and independent component analysis. However, modern random matrix theory indicates that, when the dimension of a random vector is not negligible with respect to the sample size, the sample covariance matrix demonstrates significant deviations from the underlying population covariance matrix. There is an urgent need to develop new estimation tools in such cases with high-dimensional data to recover the characteristics of the population covariance matrix from the observed sample covariance matrix. We propose a novel solution to this problem based on the method of moments. When the parametric dimension of the population spectrum is finite and known, we prove that the proposed estimator is strongly consistent and asymptotically Gaussian. Otherwise, we combine the first estimation method with a cross-validation procedure to select the unknown model dimension. Simulation experiments demonstrate the consistency of the proposed procedure. We also indicate possible extensions of the proposed estimator to the case where the population spectrum has a density. © 2010 Australian Statistical Publishing Association Inc.
Wed, 01 Dec 2010 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052532010-12-01T00:00:00Z
- An adaptive design for multi-arm clinical trialshttps://scholarbank.nus.edu.sg/handle/10635/104989Title: An adaptive design for multi-arm clinical trials
Authors: Bai, Z.D.; Hu, F.; Shen, L.
Abstract: The randomized play-the-winner (RPW) rule is very useful in clinical trials for patient allocation with two treatments. L. J. Wei (1979, Ann. Statist. 7, 291-296) introduces the generalized Friedman's urn (GFU) model to clinical trials of K treatments (as an extension of the RPW). In this paper, we propose a new adaptive design for multi-arm clinical trials. The proposed adaptive design proportionally depends on the success rates of each treatment, so that a treatment which is doing well is more likely to be assigned in future trials than a treatment which is doing poorly. The new design is more reasonable although it is no longer a GFU model. In the paper we show that the new design has some desirable asymptotic properties and that it has wider and easier applications in practice. Some simulations also support this new design. © 2001 Elsevier Science (USA).
Tue, 01 Jan 2002 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1049892002-01-01T00:00:00Z
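For readers unfamiliar with the baseline mentioned in the abstract above, the following is a minimal sketch of the classical two-arm randomized play-the-winner urn, not the multi-arm design the authors propose. The function name `simulate_rpw`, its parameters, and the success probabilities are illustrative assumptions. A success adds balls for the same arm and a failure adds balls for the other arm, so the better arm tends to be assigned more often.

```python
import random


def simulate_rpw(p_success, n_patients, initial_balls=1, added_balls=1, seed=0):
    """Simulate a two-arm randomized play-the-winner urn; return allocation counts."""
    rng = random.Random(seed)
    urn = [initial_balls, initial_balls]  # balls of type 0 and type 1
    allocations = [0, 0]
    for _ in range(n_patients):
        # Draw a ball with probability proportional to the urn composition.
        arm = 0 if rng.random() < urn[0] / (urn[0] + urn[1]) else 1
        allocations[arm] += 1
        success = rng.random() < p_success[arm]
        # A success reinforces the same arm; a failure reinforces the other arm.
        urn[arm if success else 1 - arm] += added_balls
    return allocations


# Hypothetical success probabilities: arm 0 is the better treatment.
allocations = simulate_rpw(p_success=(0.8, 0.4), n_patients=5000)
share_arm_0 = allocations[0] / sum(allocations)
print(f"share of patients on arm 0: {share_arm_0:.3f}")
```

With failure probabilities q_0 = 0.2 and q_1 = 0.6, the classical limit of the allocation share for arm 0 is q_1 / (q_0 + q_1) = 0.75, so a long simulated run should land in that neighbourhood.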
- Asymptotic error bounds for kernel-based Nyström low-rank approximation matriceshttps://scholarbank.nus.edu.sg/handle/10635/105025Title: Asymptotic error bounds for kernel-based Nyström low-rank approximation matrices
Authors: Chang, L.-B.; Bai, Z.; Huang, S.-Y.; Hwang, C.-R.
Abstract: Many kernel-based learning algorithms have a computational load that scales with the sample size $n$, due to the column size of a full kernel Gram matrix $K$. This article considers the Nyström low-rank approximation. It uses a reduced kernel $\hat{K}$, which is $n \times m$, consisting of $m$ columns (say columns $i_1, i_2, \dots, i_m$) randomly drawn from $K$. This approximation takes the form $K \approx \hat{K} U^{-1} \hat{K}^{T}$, where $U$ is the reduced $m \times m$ matrix formed by rows $i_1, i_2, \dots, i_m$ of $\hat{K}$. Often $m$ is much smaller than the sample size $n$, resulting in a thin rectangular reduced kernel, and this leads to learning algorithms that scale with the column size $m$. The quality of matrix approximations can be assessed by the closeness of their eigenvalues and eigenvectors. In this article, asymptotic error bounds on eigenvalues and eigenvectors are derived for the Nyström low-rank approximation matrix. © 2013 Elsevier Inc.
Sun, 01 Sep 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050252013-09-01T00:00:00Z
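The Nyström construction described in the abstract above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' code: the Gaussian (RBF) kernel, the helper names `rbf_gram` and `nystrom`, the sample sizes, and the use of a pseudo-inverse in place of the plain inverse of U (for numerical safety) are all choices made here.

```python
import numpy as np


def rbf_gram(X, gamma=1.0):
    """Full n x n Gaussian kernel Gram matrix K (the kernel choice is an assumption)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def nystrom(K, idx):
    """Nystrom approximation K_hat U^+ K_hat^T built from the columns in idx."""
    K_hat = K[:, idx]   # reduced kernel, n x m
    U = K_hat[idx, :]   # m x m block formed by rows idx of K_hat
    # The pseudo-inverse stands in for U^{-1} in case U is ill-conditioned.
    return K_hat @ np.linalg.pinv(U) @ K_hat.T


rng = np.random.default_rng(0)
X = 3.0 * rng.normal(size=(30, 2))               # n = 30 sample points
K = rbf_gram(X)
idx = rng.choice(30, size=10, replace=False)     # m = 10 random columns
K_approx = nystrom(K, idx)
rel_err = np.linalg.norm(K - K_approx) / np.linalg.norm(K)
print(f"relative Frobenius error with m = 10: {rel_err:.3f}")
```

Only the n × m block is needed to form the approximation, which is why downstream computations can scale with m instead of n. The selected columns of K are reproduced exactly (up to rounding), and the error is carried by the remaining columns.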
- The exact and limiting distributions for the number of successes in success runs within a sequence of Markov-dependent two-state trialshttps://scholarbank.nus.edu.sg/handle/10635/105419Title: The exact and limiting distributions for the number of successes in success runs within a sequence of Markov-dependent two-state trials
Authors: Fu, J.C.; Lou, W.Y.W.; Bai, Z.-D.; Li, G.
Abstract: The total number of successes in success runs of length greater than or equal to k in a sequence of n two-state trials is a statistic that has been broadly used in statistics and probability. For Bernoulli trials with k equal to one, this statistic has been shown to have binomial and normal distributions as exact and limiting distributions, respectively. For the case of Markov-dependent two-state trials with k greater than one, its exact and limiting distributions have never been considered in the literature. In this article, the finite Markov chain imbedding technique and the invariance principle are used to obtain, in general, the exact and limiting distributions of this statistic under Markov dependence, respectively. Numerical examples are given to illustrate the theoretical results.
Tue, 01 Jan 2002 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054192002-01-01T00:00:00Z
- Estimation of the population spectral distribution from a large dimensional sample covariance matrixhttps://scholarbank.nus.edu.sg/handle/10635/105135Title: Estimation of the population spectral distribution from a large dimensional sample covariance matrix
Authors: Li, W.; Chen, J.; Qin, Y.; Bai, Z.; Yao, J.
Abstract: This paper introduces a new method to estimate the spectral distribution of a population covariance matrix from high-dimensional data. The method is founded on a meaningful generalization of the seminal Marčenko-Pastur equation, originally defined in the complex plane, to the real line. Beyond its easy implementation and the established asymptotic consistency, the new estimator outperforms two existing estimators from the literature in almost all the situations tested in a simulation experiment. An application to the analysis of the correlation matrix of S&P 500 daily stock returns is also given. © 2013 Elsevier B.V.
Fri, 01 Nov 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051352013-11-01T00:00:00Z
- Convergence rates of spectral distributions of large sample covariance matriceshttps://scholarbank.nus.edu.sg/handle/10635/105072Title: Convergence rates of spectral distributions of large sample covariance matrices
Authors: Bai, Z.D.; Miao, B.; Yao, J.-F.
Abstract: In this paper, we improve known results on the convergence rates of spectral distributions of large-dimensional sample covariance matrices of size $p \times n$. Using the Stieltjes transform, we first prove that the expected spectral distribution converges to the limiting Marčenko-Pastur distribution with the dimension to sample size ratio $y = y_n = p/n$ at a rate of $O(n^{-1/2})$ if $y$ keeps away from 0 and 1, under the assumption that the entries have a finite eighth moment. Furthermore, the rates for both the convergence in probability and the almost sure convergence are shown to be $O_p(n^{-2/5})$ and $o_{a.s.}(n^{-2/5+\eta})$, respectively, when $y$ is away from 1. It is interesting that the rate in all senses is $O(n^{-1/8})$ when $y$ is close to 1.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050722004-01-01T00:00:00Z
- Weighted W test for normality and asymptotics a revisit of Chen-Shapiro test for normalityhttps://scholarbank.nus.edu.sg/handle/10635/105465Title: Weighted W test for normality and asymptotics a revisit of Chen-Shapiro test for normality
Authors: Bai, Z.D.; Chen, L.
Abstract: Chen and Shapiro (J. Statist. Comput. Simulation 53 (1995) 269) proposed the QH test for normality, based upon normalized spacings, which is easy to compute and has been shown by simulations to be as powerful as or superior to the original W-test. In this paper, we propose a generalized version of the W-type tests, named the weighted W-test, which includes as special cases most versions of W-type tests. The limiting behavior of the weighted W statistics and the normalized version RH of the QH statistic are investigated. The relationship between QH and W is further discussed, which explains why the power of the QH test is closer to that of the W test than to that of the W′ test. © 2002 Published by Elsevier Science B.V.
Thu, 01 May 2003 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054652003-05-01T00:00:00Z
- Rooted edges of a minimal directed spanning tree on random pointshttps://scholarbank.nus.edu.sg/handle/10635/105341Title: Rooted edges of a minimal directed spanning tree on random points
Authors: Bai, Z.D.; Lee, S.; Penrose, M.D.
Abstract: For $n$ independent, identically distributed uniform points in $[0, 1]^d$, $d \ge 2$, let $L_n$ be the total distance from the origin to all the minimal points under the coordinatewise partial order (this is also the total length of the rooted edges of a minimal directed spanning tree on the given random points). For $d \ge 3$, we establish the asymptotics of the mean and the variance of $L_n$, and show that $L_n$ satisfies a central limit theorem, unlike in the case $d = 2$. © Applied Probability Trust 2006.
Wed, 01 Mar 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053412006-03-01T00:00:00Z
- Multivariate linear and nonlinear causality testshttps://scholarbank.nus.edu.sg/handle/10635/105233Title: Multivariate linear and nonlinear causality tests
Authors: Bai, Z.; Wong, W.-K.; Zhang, B.
Abstract: The traditional linear Granger test has been widely used to examine the linear causality among several time series in bivariate settings as well as multivariate settings. Hiemstra and Jones [19] develop a nonlinear Granger causality test in bivariate settings to investigate the nonlinear causality between stock prices and trading volume. This paper extends their work by developing a nonlinear causality test in multivariate settings. © 2010 IMACS.
Wed, 01 Sep 2010 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052332010-09-01T00:00:00Z
- Revisit of Sheppard corrections in linear regressionhttps://scholarbank.nus.edu.sg/handle/10635/105333Title: Revisit of Sheppard corrections in linear regression
Authors: Liu, T.Q.; Zhang, B.X.; Hu, G.R.; Bai, Z.D.
Abstract: Dempster and Rubin (D&R) in their JRSSB paper considered the statistical error caused by data rounding in a linear regression model and compared the Sheppard correction, the BRB correction and the ordinary LSE by simulations. Some asymptotic results when the rounding scale tends to 0 were also presented. In previous research, we found that the ordinary sample variance of rounded data from normal populations is always inconsistent, while the sample mean of rounded data is consistent if and only if the true mean is a multiple of half the rounding scale. In the light of these results, in this paper we further investigate the rounding errors in linear regressions. We note that these results explain why the Sheppard corrections perform better than other methods in the D&R examples, and that the D&R conclusion is incorrect in general. Examples in which the Sheppard correction works worse than the BRB correction are also given. Furthermore, we propose a new approach to estimate the parameters, called the "two-stage estimator", and establish the consistency and asymptotic normality of the new estimators. © Science China Press and Springer-Verlag Berlin Heidelberg 2010.
Fri, 01 Jan 2010 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053332010-01-01T00:00:00Z
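As background for the abstract above, the sketch below illustrates only the classical Sheppard variance correction, which subtracts h²/12 from the variance of data rounded to a grid of width h. It does not implement the authors' two-stage estimator or the BRB correction. The function name, the normal population, its mean and scale, and the grid width are assumptions chosen for illustration; the sketch covers one favourable setting and says nothing about the general cases the paper addresses.

```python
import numpy as np


def sheppard_corrected_variance(rounded, h):
    """Sample variance of rounded data minus the classical Sheppard term h^2 / 12."""
    return np.var(rounded, ddof=1) - h**2 / 12.0


rng = np.random.default_rng(0)
sigma, h = 1.0, 1.0                       # assumed true scale and rounding grid width
x = rng.normal(loc=0.3, scale=sigma, size=200_000)
rounded = h * np.round(x / h)             # observed data: rounded to multiples of h

raw_var = np.var(rounded, ddof=1)
corrected_var = sheppard_corrected_variance(rounded, h)
print(f"raw variance of rounded data: {raw_var:.4f}")
print(f"Sheppard-corrected variance:  {corrected_var:.4f}")
```

In this setting the raw variance overshoots the true value of 1 by roughly h²/12, which is the kind of inconsistency of the ordinary sample variance that the abstract refers to.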
- Corrections to LRT on large-dimensional covariance matrix by RMThttps://scholarbank.nus.edu.sg/handle/10635/105074Title: Corrections to LRT on large-dimensional covariance matrix by RMT
Authors: Bai, Z.; Jiang, D.; Yao, J.-F.; Zheng, S.
Abstract: In this paper, we explain the failure of two likelihood ratio procedures for testing covariance matrices from Gaussian populations when the dimension $p$ is large compared to the sample size $n$. Next, using recent central limit theorems for linear spectral statistics of sample covariance matrices and of random F-matrices, we propose necessary corrections for these LR tests to cope with high-dimensional effects. The asymptotic distributions of these corrected tests under the null are given. Simulations demonstrate that the corrected LR tests yield a realized size close to the nominal level for both moderate $p$ (around 20) and high dimension, while the traditional LR tests with $\chi^2$ approximation fail. Another contribution of the paper is that, for testing the equality between two covariance matrices, the proposed correction applies equally to non-Gaussian populations, yielding a valid pseudo-likelihood ratio test. © Institute of Mathematical Statistics, 2009.
Tue, 01 Dec 2009 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050742009-12-01T00:00:00Z
- Functional CLT of eigenvectors for large sample covariance matriceshttps://scholarbank.nus.edu.sg/handle/10635/123769Title: Functional CLT of eigenvectors for large sample covariance matrices
Authors: Xia, Ningning; Bai, Zhidong
Thu, 01 Jan 2015 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1237692015-01-01T00:00:00Z
- On the theory of ranked-set sampling and its ramificationshttps://scholarbank.nus.edu.sg/handle/10635/105282Title: On the theory of ranked-set sampling and its ramifications
Authors: Bai, Z.; Chen, Z.
Abstract: We consider in this article ranked-set sampling (RSS) and its ramifications including RSS with imperfect ranking, RSS by ranking a concomitant variable and RSS with multivariate samples, etc. We deal with a unified sampling scheme which is referred to as generalized ranked-set sampling (GRSS) and which includes RSS and its ramifications as special cases. We develop a general theory for GRSS in both parametric and nonparametric settings. In a parametric setting, it is shown that the Fisher information matrix about the unknown parameters of a GRSS sample minus that of an SRS sample of the same size is always positive definite. In a nonparametric setting, a particular model, the smooth-function-of-means model, is considered and it is proved that the method-of-moment estimates of parameters based on a GRSS sample will always have smaller asymptotic variances than those based on an SRS sample of the same size. An example for RSS with multivariate samples is treated in detail and a simulation study is reported. Some other issues and open problems such as those involving optimal designs for the GRSS are also discussed. © 2002 Elsevier Science B.V. All rights reserved.
Wed, 01 Jan 2003 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052822003-01-01T00:00:00Z
- Super efficient frequency estimationhttps://scholarbank.nus.edu.sg/handle/10635/105397Title: Super efficient frequency estimation
Authors: Kundu, D.; Bai, Z.; Nandi, S.; Bai, L.
Abstract: In this paper we propose a modified Newton-Raphson method to obtain super efficient estimators of the frequencies of a sinusoidal signal in the presence of stationary noise. It is observed that if we start from an initial estimator with convergence rate $O_p(n^{-1})$ and use the Newton-Raphson algorithm with proper step factor modification, then it produces a super efficient frequency estimator in the sense that its asymptotic variance is lower than the asymptotic variance of the corresponding least squares estimator. The proposed frequency estimator is consistent and it has the same rate of convergence, namely $O_p(n^{-3/2})$, as the least squares estimator. Monte Carlo simulations are performed to observe the performance of the proposed estimator for different sample sizes and for different models. The results are quite satisfactory. One real data set has been analyzed for illustrative purposes. © 2011 Elsevier B.V.
Mon, 01 Aug 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053972011-08-01T00:00:00Z
- The performance of commodity trading advisors: A mean-variance-ratio test approachhttps://scholarbank.nus.edu.sg/handle/10635/105427Title: The performance of commodity trading advisors: A mean-variance-ratio test approach
Authors: Bai, Z.; Phoon, K.F.; Wang, K.; Wong, W.-K.
Abstract: In this paper, we provide evidence that the mean-variance-ratio (MVR) test is superior to the Sharpe ratio (SR) test by applying both tests to analyze the performance of commodity trading advisors (CTAs). Our findings show that while the SR test concludes that most of the CTA funds being analyzed are indistinguishable in their performance, the MVR statistic shows that some funds outperformed others. Moreover, the SR statistic indicates that one fund significantly outperformed another even when the difference between the two funds was insignificant or even changed directions over sub-periods. Conversely, the MVR statistic can detect such changes when they occur in the sub-periods. In addition, we have conducted simulations to show that the MVR test possesses good power. © 2012 Elsevier Inc.
Thu, 01 Aug 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054272013-08-01T00:00:00Z
- Functional CLT for sample covariance matriceshttps://scholarbank.nus.edu.sg/handle/10635/105154Title: Functional CLT for sample covariance matrices
Authors: Bai, Z.; Wang, X.; Zhou, W.
Abstract: Using Bernstein polynomial approximations, we prove the central limit theorem for linear spectral statistics of sample covariance matrices, indexed by a set of functions with continuous fourth order derivatives on an open interval including $[(1 - \sqrt{y})^2, (1 + \sqrt{y})^2]$, the support of the Marčenko-Pastur law. We also derive the explicit expressions for asymptotic mean and covariance functions. © 2010 ISI/BS.
Mon, 01 Nov 2010 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051542010-11-01T00:00:00Z
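The support interval quoted in the abstract above can be made concrete with a small sketch. It assumes the standard Marčenko-Pastur law with unit variance and ratio 0 < y < 1, whose density on the support [a, b] is sqrt((b − x)(x − a)) / (2πyx). The function names and the midpoint-rule integration are illustrative choices, not taken from the paper.

```python
import math


def mp_support(y):
    """Edges a = (1 - sqrt(y))^2 and b = (1 + sqrt(y))^2 of the Marcenko-Pastur support."""
    return (1.0 - math.sqrt(y)) ** 2, (1.0 + math.sqrt(y)) ** 2


def mp_density(x, y):
    """Marcenko-Pastur density for unit variance and ratio 0 < y < 1 (assumed setting)."""
    a, b = mp_support(y)
    if x <= a or x >= b:
        return 0.0
    return math.sqrt((b - x) * (x - a)) / (2.0 * math.pi * y * x)


def integrate_density(y, steps=200_000):
    """Midpoint-rule integral of the density over its support; should be close to 1."""
    a, b = mp_support(y)
    width = (b - a) / steps
    return sum(mp_density(a + (k + 0.5) * width, y) for k in range(steps)) * width


a, b = mp_support(0.25)
print(f"support for y = 0.25: [{a:.2f}, {b:.2f}]")
print(f"total mass: {integrate_density(0.25):.4f}")
```

For y = 0.25 the edges are (1 − 0.5)² and (1 + 0.5)², so the test functions in the theorem need four continuous derivatives on an open interval containing that range.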
- The broken sample problemhttps://scholarbank.nus.edu.sg/handle/10635/105410Title: The broken sample problem
Authors: Bai, Z.; Hsing, T.
Abstract: Suppose that $(X_i, Y_i)$, $i = 1, 2, \dots, n$, are i.i.d. random vectors with uniform marginals and a certain joint distribution $F_\rho$, where $\rho$ is a parameter and $\rho = \rho_o$ corresponds to the independence case. However, the X's and Y's are observed separately, so the pairing information is missing. Can $\rho$ be consistently estimated? This is an extension of a problem considered in (1980) which focused on the bivariate normal distribution with $\rho$ being the correlation. In this paper we show that consistent discrimination between two distinct parameter values $\rho_1$ and $\rho_2$ is impossible if the density $f_\rho$ of $F_\rho$ is square integrable and the second largest singular value of the linear operator [InlineMediaObject not available: see fulltext.] is strictly less than 1 for $\rho = \rho_1$ and $\rho_2$. We also consider this result from the perspective of a bivariate empirical process which contains information equivalent to that of the broken sample. © Springer-Verlag 2004.
Fri, 01 Apr 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054102005-04-01T00:00:00Z
- Limit theorems for functions of marginal quantileshttps://scholarbank.nus.edu.sg/handle/10635/105196Title: Limit theorems for functions of marginal quantiles
Authors: Babu, G.J.; Bai, Z.; Choi, K.P.; Mangalam, V.
Abstract: Multivariate distributions are explored using the joint distributions of marginal sample quantiles. Limit theory for the mean of a function of order statistics is presented. The results include a multivariate central limit theorem and a strong law of large numbers. A result similar to Bahadur's representation of quantiles is established for the mean of a function of the marginal quantiles. In particular, it is shown that $\sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} \phi\left(X^{(1)}_{n:i}, \dots, X^{(d)}_{n:i}\right) - \bar{y}\right) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} Z_{n,i} + o_P(1)$ as $n \to \infty$, where $\bar{y}$ is a constant and $Z_{n,i}$ are i.i.d. random variables for each $n$. This leads to the central limit theorem. Weak convergence to a Gaussian process using equicontinuity of functions is indicated. The results are established under very general conditions. These conditions are shown to be satisfied in many commonly occurring situations. © 2011 ISI/BS.
Sun, 01 May 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051962011-05-01T00:00:00Z
- Important ECG diagnosis-aiding indices of ventricular septal defect children with or without congestive heart failurehttps://scholarbank.nus.edu.sg/handle/10635/105176Title: Important ECG diagnosis-aiding indices of ventricular septal defect children with or without congestive heart failure
Authors: Guo, M.; Huang, M.-N.L.; Bai, Z.; Hsieh, K.-S.
Abstract: In this paper we perform a statistical study of the conventional RR intervals and two newly defined PR' and RT intervals of ECG data. A quadratic classification rule is applied to extract several important ECG diagnosis-aiding indices among normal children and children with ventricular septal defect (VSD) with or without congestive heart failure (CHF). The results show that certain statistics computed from PR', RR and RT intervals are important diagnosis-aiding indices. Best classification vectors are searched for pairwise classification. Two methods, minimum distance criterion and a two-stage classification procedure, are considered for three-way classification. Furthermore, logistic regression models based on transformations of these important diagnosis-aiding indices are proposed. The receiver operating characteristic curves of the proposed models show better performance than those of linear and quadratic logistic models. In order to proceed with this study, a computer algorithm to automatically detect the three intervals is developed and the related ECG data are collected and analysed. The algorithm is also enhanced with an outlier detection procedure for the automatic measurements of the PR' and RT intervals. Copyright © 2001 John Wiley & Sons, Ltd.
Mon, 01 Jan 2001 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051762001-01-01T00:00:00Z
- Asymptotics in randomized urn modelshttps://scholarbank.nus.edu.sg/handle/10635/105032Title: Asymptotics in randomized urn models
Authors: Bai, Z.-D.; Hu, F.
Abstract: This paper studies a very general urn model stimulated by designs in clinical trials, where the number of balls of different types added to the urn at trial $n$ depends on a random outcome directed by the composition at trials $1, 2, \dots, n-1$. Patient treatments are allocated according to types of balls. We establish the strong consistency and asymptotic normality for both the urn composition and the patient allocation under general assumptions on random generating matrices which determine how balls are added to the urn. Also we obtain explicit forms of the asymptotic variance-covariance matrices of both the urn composition and the patient allocation. The conditions on the nonhomogeneity of generating matrices are mild and widely satisfied in applications. Several applications are also discussed. © Institute of Mathematical Statistics, 2005.
Tue, 01 Feb 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050322005-02-01T00:00:00Z
- The deepest regression methodhttps://scholarbank.nus.edu.sg/handle/10635/105412Title: The deepest regression method
Authors: Mukherjee, K.; Bai, Z.D.
Abstract: Deepest regression (DR) is a method for linear regression introduced by P. J. Rousseeuw and M. Hubert (1999, J. Amer. Statis. Assoc. 94, 388-402). The DR method is defined as the fit with largest regression depth relative to the data. In this paper we show that DR is a robust method, with breakdown value that converges almost surely to 1/3 in any dimension. We construct an approximate algorithm for fast computation of DR in more than two dimensions. From the distribution of the regression depth we derive tests for the true unknown parameters in the linear regression model. Moreover, we construct simultaneous confidence regions based on bootstrapped estimates. We also use the maximal regression depth to construct a test for linearity versus convexity/concavity. We extend regression depth and deepest regression to more general models. We apply DR to polynomial regression and show that the deepest polynomial regression has breakdown value 1/5. Finally, DR is applied to the Michaelis-Menten model of enzyme kinetics, where it resolves a long-standing ambiguity. © 2001 Elsevier Science (USA).
Tue, 01 Jan 2002 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054122002-01-01T00:00:00Z
- The limiting spectral distribution of the product of the Wigner matrix and a nonnegative definite matrixhttps://scholarbank.nus.edu.sg/handle/10635/105422Title: The limiting spectral distribution of the product of the Wigner matrix and a nonnegative definite matrix
Authors: Bai, Z.D.; Zhang, L.X.
Abstract: Let $W_n$ be an $n \times n$ Hermitian matrix whose entries on and above the diagonal are independent complex random variables satisfying the Lindeberg type condition. Let $T_n$ be an $n \times n$ nonnegative definite matrix that is independent of $W_n$. Assume that almost surely, as $n \to \infty$, the empirical distribution of the eigenvalues of $T_n$ converges weakly to a non-random probability distribution. Let $A_n = n^{-1/2} T_n^{1/2} W_n T_n^{1/2}$. Then with the aid of the Stieltjes transforms, we show that almost surely, as $n \to \infty$, the empirical distribution of the eigenvalues of $A_n$ also converges weakly to a non-random probability distribution, with a system of two equations determining the Stieltjes transform of the limiting distribution. Important analytic properties of this limiting spectral distribution are then derived by means of those equations. It is shown that the limiting spectral distribution is continuously differentiable everywhere on the real line except only at the origin and that a necessary and sufficient condition is available for determining its support. At the end, the density function of the limiting spectral distribution is calculated for two important cases of $T_n$, when $T_n$ is a sample covariance matrix and when $T_n$ is the inverse of a sample covariance matrix. © 2010 Elsevier Inc.
Fri, 01 Oct 2010 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054222010-10-01T00:00:00Z
- On the convergence of the spectral empirical process of Wigner matriceshttps://scholarbank.nus.edu.sg/handle/10635/105269Title: On the convergence of the spectral empirical process of Wigner matrices
Authors: Bai, Z.D.; Yao, J.
Abstract: It is well known that the spectral distribution Fn of a Wigner matrix converges to Wigner's semicircle law. We consider the empirical process indexed by a set of functions analytic on an open domain of the complex plane including the support of the semicircle law. Under fourth-moment conditions, we prove that this empirical process converges to a Gaussian process. Explicit formulae for the mean function and the covariance function of the limit process are provided. © 2005 ISI/BS.
Thu, 01 Dec 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052692005-12-01T00:00:00Z
- Central limit theorems for eigenvalues in a spiked population modelhttps://scholarbank.nus.edu.sg/handle/10635/52816Title: Central limit theorems for eigenvalues in a spiked population model
Authors: Bai, Z.; Yao, J.-F.
Abstract: In a spiked population model, the population covariance matrix has all its eigenvalues equal to one except for a few fixed eigenvalues (spikes). This model was proposed by Johnstone to cope with empirical findings on various data sets. The question is to quantify the effect of the perturbation caused by the spike eigenvalues. A recent work by Baik and Silverstein establishes the almost sure limits of the extreme sample eigenvalues associated with the spike eigenvalues when the population and the sample sizes become large. This paper establishes the limiting distributions of these extreme sample eigenvalues. As another important result of the paper, we provide a central limit theorem on random sesquilinear forms. © Association des Publications de l'Institut Henri Poincaré, 2008.
Sun, 01 Jun 2008 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/528162008-06-01T00:00:00Z
- Maxima in hypercubeshttps://scholarbank.nus.edu.sg/handle/10635/105212Title: Maxima in hypercubes
Authors: Bai, Z.-D.; Devroye, L.; Hwang, H.-K.; Tsai, T.-H.
Abstract: We derive a Berry-Esseen bound, essentially of the order of the square of the standard deviation, for the number of maxima in random samples from $(0, 1)^d$. The bound is, although not optimal, the first of its kind for the number of maxima in dimensions higher than two. The proof uses Poisson processes and Stein's method. We also propose a new method for computing the variance and derive an asymptotic expansion. The methods of proof we propose are of some generality and applicable to other regions such as the $d$-dimensional simplex. © 2005 Wiley Periodicals, Inc.
Sat, 01 Oct 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052122005-10-01T00:00:00Z
- Multivariate causality tests with simulation and applicationhttps://scholarbank.nus.edu.sg/handle/10635/105232Title: Multivariate causality tests with simulation and application
Authors: Bai, Z.; Li, H.; Wong, W.-K.; Zhang, B.
Abstract: This paper extends the test established by Hiemstra and Jones (1994) to develop a nonlinear causality test in a multivariate setting. A Monte Carlo simulation is conducted to demonstrate the superiority of our proposed multivariate test over its bivariate counterpart. In addition, we illustrate the applicability of our proposed test for analyzing the relationships among different Chinese stock market indices. © 2011 Elsevier B.V.
Mon, 01 Aug 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052322011-08-01T00:00:00Z
- The mean-variance ratio test-A complement to the coefficient of variation test and the Sharpe ratio testhttps://scholarbank.nus.edu.sg/handle/10635/105424Title: The mean-variance ratio test-A complement to the coefficient of variation test and the Sharpe ratio test
Authors: Bai, Z.; Wang, K.; Wong, W.-K.
Abstract: To circumvent the limitations of the tests for coefficients of variation and Sharpe ratios, we develop the mean-variance ratio statistic for testing the equality of mean-variance ratios, and prove that our proposed statistic is the uniformly most powerful unbiased statistic. In addition, we illustrate the applicability of our proposed test for comparing the performances of stock indices. © 2011 Elsevier B.V.
Mon, 01 Aug 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054242011-08-01T00:00:00Z
- The simultaneous estimation of the number of signals and frequencies of multiple sinusoids when some observations are missing: I. Asymptoticshttps://scholarbank.nus.edu.sg/handle/10635/105430Title: The simultaneous estimation of the number of signals and frequencies of multiple sinusoids when some observations are missing: I. Asymptotics
Authors: Bai, Z.; Rao, C.R.; Wu, Y.; Zen, M.-M.; Zhao, L.
Abstract: The problem of simultaneous estimation of the number of signals and frequencies of multiple sinusoids is considered in the case when some observations are missing. The number of signals is estimated with an information theoretic criterion, and the frequencies are estimated with eigen-variation linear prediction. The strong consistency of the estimates of the number of signals and the frequencies is established and the rate of convergence of these estimates is provided. Besides, the limiting distributions of various estimates are given.
Tue, 28 Sep 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1054301999-09-28T00:00:00Z
- Normal approximations of the number of records in geometrically distributed random variableshttps://scholarbank.nus.edu.sg/handle/10635/103639Title: Normal approximations of the number of records in geometrically distributed random variables
Authors: Bai, Z.-D.; Hwang, H.-K.; Liang, W.-Q.
Abstract: We establish the asymptotic normality of the number of upper records in a sequence of iid geometric random variables. Large deviations and local limit theorems as well as approximation theorems for the number of lower records are also derived. © 1998 John Wiley & Sons, Inc. Random Struct. Alg., 13, 319-334, 1998.
Thu, 01 Oct 1998 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1036391998-10-01T00:00:00Z
- On the variance of the number of maxima in random vectors and its applicationshttps://scholarbank.nus.edu.sg/handle/10635/103847Title: On the variance of the number of maxima in random vectors and its applications
Authors: Bai, Z.-D.; Chao, C.-C.; Hwang, H.-K.; Liang, W.-Q.
Abstract: We derive a general asymptotic formula for the variance of the number of maxima in a set of independent and identically distributed random vectors in ℝ^d, where the components of each vector are independently and continuously distributed. Applications of the results to algorithmic analysis are also indicated.
Sat, 01 Aug 1998 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1038471998-08-01T00:00:00Z
- Some results on two-stage clinical trialshttps://scholarbank.nus.edu.sg/handle/10635/105379Title: Some results on two-stage clinical trials
Authors: Chen, Y.-M.; Chen, G.-J.; Bai, Z.-D.; Hu, F.-F.
Abstract: Among the variety of adaptive designs, the stage-wise design, and especially the two-stage design, is an important one because in some situations patient responses are not available immediately but arrive in batches or intermittently. In this paper, a general formula for the asymptotically optimal worth is given by a Bayesian method, and the first-stage length of some optimal designs for two-stage trials is obtained in several important cases. © Springer-Verlag 2003.
Wed, 01 Jan 2003 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053792003-01-01T00:00:00Z
- Rank tests for independence - With a weighted contamination alternativehttps://scholarbank.nus.edu.sg/handle/10635/105320Title: Rank tests for independence - With a weighted contamination alternative
Authors: Shieh, G.S.; Bai, Z.; Tsai, W.-Y.
Abstract: Two rank tests for independence of bivariate random variables against an alternative model with weighted contamination are proposed. The model may emphasize the association of X and Y on items with high ranks in one variable (say X) and generalizes an alternative in Hájek and Šidák (1967). The model may be applied to both complete paired data and paired data which is truncated in one variable. We derive the locally most powerful rank (LMPR) test under the alternative setting. The proposed tests turn out to be asymptotic LMPR tests under Logistic and Extreme Value families. Under the null hypothesis of independence, both rank statistics have limiting normal distributions. An application to a data set from a special education program in Taiwan and a simulation study are presented. We also apply the Shapiro-Francia test to find the minimum sample sizes for approximate normality of exact distributions of the proposed test statistics.
Sat, 01 Apr 2000 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053202000-04-01T00:00:00Z
- Statistics with estimated parametershttps://scholarbank.nus.edu.sg/handle/10635/105395Title: Statistics with estimated parameters
Authors: Yang, Z.L.; Tse, Y.K.; Bai, Z.D.
Abstract: This paper studies a general problem of making inferences for functions of two sets of parameters where, when the first set is given, there exists a statistic with a known distribution. We study the distribution of this statistic when the first set of parameters is unknown and is replaced by an estimator. We show that under mild conditions the variance of the statistic is inflated when the unconstrained maximum likelihood estimator (MLE) is used, but deflated when the constrained MLE is used. The results are shown to be useful in hypothesis testing and confidence interval construction in providing simpler and improved inference methods than do the standard large sample likelihood inference theories. We provide three applications of our theories, namely Box-Cox regression, dynamic regression, and spatial regression, to illustrate the generality and versatility of our results.
Sun, 01 Apr 2007 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1053952007-04-01T00:00:00Z
- Exact separation of eigenvalues of large dimensional sample covariance matriceshttps://scholarbank.nus.edu.sg/handle/10635/105140Title: Exact separation of eigenvalues of large dimensional sample covariance matrices
Authors: Bai, Z.D.; Silverstein, J.W.
Abstract: Let B_n = (1/N) T_n^{1/2} X_n X_n^* T_n^{1/2}, where X_n is n × N with i.i.d. complex standardized entries having finite fourth moment, and T_n^{1/2} is a Hermitian square root of the nonnegative definite Hermitian matrix T_n. It was shown in an earlier paper by the authors that, under certain conditions on the eigenvalues of T_n, with probability 1 no eigenvalues lie in any interval which is outside the support of the limiting empirical distribution (known to exist) for all large n. For these n the interval corresponds to one that separates the eigenvalues of T_n. The aim of the present paper is to prove exact separation of eigenvalues; that is, with probability 1, the number of eigenvalues of B_n and T_n lying on one side of their respective intervals are identical for all large n.
Thu, 01 Jul 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051401999-07-01T00:00:00Z
- Limit theorems for the number of maxima in random samples from planar regionshttps://scholarbank.nus.edu.sg/handle/10635/105197Title: Limit theorems for the number of maxima in random samples from planar regions
Authors: Bai, Z.-D.; Hwang, H.-K.; Liang, W.-Q.; Tsai, T.-H.
Abstract: We prove that the number of maximal points in a random sample taken uniformly and independently from a convex polygon is asymptotically normal in the sense of convergence in distribution. Many new results for other planar regions are also derived. In particular, precise Poisson approximation results are given for the number of maxima in regions bounded above by a nondecreasing curve.
Mon, 22 Jan 2001 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051972001-01-22T00:00:00Z
- Methodologies in spectral analysis of large dimensional random matrices, a reviewhttps://scholarbank.nus.edu.sg/handle/10635/105220Title: Methodologies in spectral analysis of large dimensional random matrices, a review
Authors: Bai, Z.D.
Abstract: In this paper, we give a brief review of the theory of spectral analysis of large dimensional random matrices. Most of the existing work in the literature has been stated for real matrices but the corresponding results for the complex case are also of interest, especially for researchers in Electrical and Electronic Engineering. Thus, we convert almost all results to the complex case, whenever possible. Only the latest results, including some new ones, are stated as theorems here. The main purpose of the paper is to show how important methodologies, or mathematical tools, have helped to develop the theory. Some unsolved problems are also stated.
Thu, 01 Jul 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052201999-07-01T00:00:00Z
- On the Markowitz mean-variance analysis of self-financing portfolioshttps://scholarbank.nus.edu.sg/handle/10635/105278Title: On the Markowitz mean-variance analysis of self-financing portfolios
Authors: Bai, Z.; Liu, H.; Wong, W.-K.
Abstract: This paper extends the work of Markowitz (1952), Korkie and Turtle (2002) and others by first proving that the traditional estimate for the optimal return of self-financing portfolios always over-estimates its theoretic value. To circumvent the problem, we develop a bootstrap estimate for the optimal return of self-financing portfolios and prove that this estimate is consistent with its counterpart parameter. We further demonstrate the superiority of our proposed estimate over the traditional estimate by simulation. © 2009 - IOS Press and the authors. All rights reserved.
Thu, 01 Jan 2009 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052782009-01-01T00:00:00Z
- On constrained M-estimation and its recursive analog in multivariate linear regression modelshttps://scholarbank.nus.edu.sg/handle/10635/105252Title: On constrained M-estimation and its recursive analog in multivariate linear regression models
Authors: Bai, Z.; Chen, X.; Wu, Y.
Abstract: In this paper, the constrained M-estimation of the regression coefficients and scatter parameters in a multivariate linear regression model is considered. Robustness and asymptotic behavior are investigated. Since constrained M-estimation is not easy to compute, an up-dating recursion procedure is proposed to simplify the computation of the estimators when a new observation is obtained. We show that, under mild conditions, the recursion estimates are strongly consistent. A Monte Carlo simulation study of the recursion estimates is also provided.
Tue, 01 Apr 2008 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052522008-04-01T00:00:00Z
- Multi-step prediction for nonlinear autoregressive models based on empirical distributionshttps://scholarbank.nus.edu.sg/handle/10635/105231Title: Multi-step prediction for nonlinear autoregressive models based on empirical distributions
Authors: Guo, M.; Bai, Z.; An, H.Z.
Abstract: A multi-step prediction procedure for nonlinear autoregressive (NLAR) models based on empirical distributions is proposed. Calculations involved in this prediction scheme are rather simple. It is shown that the proposed predictors are asymptotically equivalent to the exact least squares multi-step predictors, which are computable only when the innovation distribution has a simple known form. Simulation studies are conducted for two- and three-step predictors of two NLAR models.
Thu, 01 Apr 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1052311999-04-01T00:00:00Z
- Gaussian approximation theorems for urn models and their applicationshttps://scholarbank.nus.edu.sg/handle/10635/105156Title: Gaussian approximation theorems for urn models and their applications
Authors: Bai, Z.D.; Hu, F.; Zhang, L.-X.
Abstract: We consider weak and strong Gaussian approximations for a two-color generalized Friedman's urn model with homogeneous and nonhomogeneous generating matrices. In particular, the functional central limit theorems and the laws of iterated logarithm are obtained. As an application, we obtain the asymptotic properties for the randomized-play-the-winner rule. Based on the Gaussian approximations, we also get some variance estimators for the urn model.
Fri, 01 Nov 2002 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051562002-11-01T00:00:00Z
- Large sample covariance matrices without independence structures in columnshttps://scholarbank.nus.edu.sg/handle/10635/105190Title: Large sample covariance matrices without independence structures in columns
Authors: Bai, Z.; Zhou, W.
Abstract: The limiting spectral distribution of large sample covariance matrices is derived under dependence conditions. As applications, we obtain the limiting spectral distributions of Spearman's rank correlation matrices, sample correlation matrices, sample covariance matrices from finite populations, and sample covariance matrices from causal AR(1) models.
Tue, 01 Apr 2008 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1051902008-04-01T00:00:00Z
- Berry-Esseen bounds for the number of maxima in planar regionshttps://scholarbank.nus.edu.sg/handle/10635/105042Title: Berry-Esseen bounds for the number of maxima in planar regions
Authors: Bai, Z.-D.; Hwang, H.-K.; Tsai, T.-H.
Abstract: We derive the optimal convergence rate O(n^{-1/4}) in the central limit theorem for the number of maxima in random samples chosen uniformly at random from a right triangle. A local limit theorem with rate is also derived. The result is then applied to the number of maxima in general planar regions (upper-bounded by some smooth decreasing curves) for which a near-optimal convergence rate to the normal distribution is established.
Tue, 10 Jun 2003 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050422003-06-10T00:00:00Z
- Central limit theorems for eigenvalues in a spiked population modelhttps://scholarbank.nus.edu.sg/handle/10635/105050Title: Central limit theorems for eigenvalues in a spiked population model
Authors: Bai, Z.; Yao, J.-F.
Abstract: In a spiked population model, the population covariance matrix has all its eigenvalues equal to units except for a few fixed eigenvalues (spikes). This model is proposed by Johnstone to cope with empirical findings on various data sets. The question is to quantify the effect of the perturbation caused by the spike eigenvalues. A recent work by Baik and Silverstein establishes the almost sure limits of the extreme sample eigenvalues associated to the spike eigenvalues when the population and the sample sizes become large. This paper establishes the limiting distributions of these extreme sample eigenvalues. As another important result of the paper, we provide a central limit theorem on random sesquilinear forms. © Association des Publications de l'Institut Henri Poincaré, 2008.
Sun, 01 Jun 2008 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050502008-06-01T00:00:00Z
- Asymptotic properties of adaptive designs for clinical trials with delayed responsehttps://scholarbank.nus.edu.sg/handle/10635/105026Title: Asymptotic properties of adaptive designs for clinical trials with delayed response
Authors: Bai, Z.D.; Hu, F.; Rosenberger, W.F.
Abstract: For adaptive clinical trials using a generalized Friedman's urn design, we derive the limiting distribution of the urn composition under staggered entry and delayed response. The stochastic delay mechanism is assumed to depend on both the treatment assigned and the patient's response. A very general setup is employed with K treatments and L responses. When L = K = 2, one example of a generalized Friedman's urn design is the randomized play-the-winner rule. An application of this rule occurred in a clinical trial of depression, which had staggered entry and delayed response. We show that maximum likelihood estimators from such a trial have the usual asymptotic properties.
Fri, 01 Feb 2002 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050262002-02-01T00:00:00Z
- Asymptotic theorems for urn models with nonhomogeneous generating matriceshttps://scholarbank.nus.edu.sg/handle/10635/105029Title: Asymptotic theorems for urn models with nonhomogeneous generating matrices
Authors: Bai, Z.D.; Hu, F.
Abstract: The generalized Friedman's urn (GFU) model has been extensively applied to biostatistics. However, in the literature, all the asymptotic results concerning the GFU are established under the assumption of a homogeneous generating matrix, whereas, in practical applications, the generating matrices are often nonhomogeneous. On the other hand, even for the homogeneous case, the generating matrix is assumed in the literature to have a diagonal Jordan form and to satisfy λ > 2Re(λ_1), where λ is the largest eigenvalue and λ_1 is the eigenvalue with the second largest real part of the generating matrix (see Smythe, 1996, Stochastic Process. Appl. 65, 115-137). In this paper, we study the asymptotic properties of the GFU model associated with nonhomogeneous generating matrices. The results are applicable to a variety of settings, such as the adaptive allocation rules with time trends in clinical trials and those with covariates. These results also apply to the case of a homogeneous generating matrix with a general Jordan form as well as the case where λ = 2Re(λ_1).
Mon, 01 Mar 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050291999-03-01T00:00:00Z
- Asymptotic distributions of the maximal depth estimators for regression and multivariate locationhttps://scholarbank.nus.edu.sg/handle/10635/105023Title: Asymptotic distributions of the maximal depth estimators for regression and multivariate location
Authors: Bai, Z.-D.; He, X.
Abstract: We derive the asymptotic distribution of the maximal depth regression estimator recently proposed in Rousseeuw and Hubert. The estimator is obtained by maximizing a projection-based depth and the limiting distribution is characterized through a max - min operation of a continuous process. The same techniques can be used to obtain the limiting distribution of some other depth estimators including Tukey's deepest point based on half-space depth. Results for the special case of two-dimensional problems have been available, but the earlier arguments have relied on some special geometric properties in the low-dimensional space. This paper completes the extension to higher dimensions for both regression and multivariate location models.
Fri, 01 Oct 1999 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1050231999-10-01T00:00:00Z
- A kind of urn model for adaptive sequential designhttps://scholarbank.nus.edu.sg/handle/10635/104934Title: A kind of urn model for adaptive sequential design
Authors: Bai, Z.; Chen, G.; Hu, F.
Abstract: This paper proposes a new kind of generalized Friedman's urn model with an adaptive nonhomogeneous generating matrix. This model may be applied in sequential medical experiments. For this model, some limit theorems (strong consistency and asymptotic normality) are obtained.
Sun, 01 Apr 2001 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1049342001-04-01T00:00:00Z
- An efficient algorithm for estimating the parameters of superimposed exponential signalshttps://scholarbank.nus.edu.sg/handle/10635/104993Title: An efficient algorithm for estimating the parameters of superimposed exponential signals
Authors: Bai, Z.D.; Rao, C.R.; Chow, M.; Kundu, D.
Abstract: An efficient computational algorithm is proposed for estimating the parameters of undamped exponential signals, when the parameters are complex valued. Such data arise in several areas of applications including telecommunications, radio location of objects, seismic signal processing and computer assisted medical diagnostics. It is observed that the proposed estimators are consistent and the dispersion matrix of these estimators is asymptotically the same as that of the least squares estimators. Moreover, the asymptotic variances of the proposed estimators attain the Cramer-Rao lower bounds, when the errors are Gaussian. © 2001 Elsevier Science B.V.
Wed, 15 Jan 2003 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1049932003-01-15T00:00:00Z