Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/171708
DC Field | Value
---|---
dc.title | Efficient Statistics for Sparse Graphical Models from Truncated Samples
dc.contributor.author | ARNAB BHATTACHARYYA
dc.contributor.author | Desai, Rathin
dc.contributor.author | Nagarajan, Sai Ganesh
dc.contributor.author | Panageas, Ioannis
dc.date.accessioned | 2020-07-27T04:06:48Z
dc.date.available | 2020-07-27T04:06:48Z
dc.date.issued | 2020
dc.identifier.citation | ARNAB BHATTACHARYYA, Desai, Rathin, Nagarajan, Sai Ganesh, Panageas, Ioannis (2020). Efficient Statistics for Sparse Graphical Models from Truncated Samples. CoRR abs/2006.09735. ScholarBank@NUS Repository.
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/171708
dc.description.abstract | In this paper, we study high-dimensional estimation from truncated samples. We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models. (i) For Gaussian graphical models, suppose $d$-dimensional samples ${\bf x}$ are generated from a Gaussian $\mathcal{N}(\mu,\Sigma)$ and observed only if they belong to a subset $S \subseteq \mathbb{R}^d$. We show that $\mu$ and $\Sigma$ can be estimated with error $\epsilon$ in the Frobenius norm, using $\tilde{O}\left(\frac{\textrm{nz}(\Sigma^{-1})}{\epsilon^2}\right)$ samples from a truncated $\mathcal{N}(\mu,\Sigma)$ and having access to a membership oracle for $S$. The set $S$ is assumed to have non-trivial measure under the unknown distribution but is otherwise arbitrary. (ii) For sparse linear regression, suppose samples $({\bf x},y)$ are generated where $y = {\bf x}^\top \Omega^* + \mathcal{N}(0,1)$ and $({\bf x}, y)$ is seen only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}$. We consider the case that $\Omega^*$ is sparse with a support set of size $k$. Our main result is to establish precise conditions on the problem dimension $d$, the support size $k$, the number of observations $n$, and properties of the samples and the truncation that are sufficient to recover the support of $\Omega^*$. Specifically, we show that under some mild assumptions, only $O(k^2 \log d)$ samples are needed to estimate $\Omega^*$ in the $\ell_\infty$-norm up to a bounded error. For both problems, our estimator minimizes the sum of the finite population negative log-likelihood function and an $\ell_1$-regularization term.
dc.source | Elements
dc.subject | stat.ML
dc.subject | cs.DS
dc.subject | cs.LG
dc.subject | math.ST
dc.subject | stat.CO
dc.subject | stat.TH
dc.type | Article
dc.date.updated | 2020-07-23T09:20:43Z
dc.contributor.department | DEPARTMENT OF COMPUTER SCIENCE
dc.description.sourcetitle | CoRR
dc.description.volume | abs/2006.09735
dc.published.state | Published
Appears in Collections: Staff Publications; Elements
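
The abstract above states only that, for both problems, the estimator minimizes the finite population negative log-likelihood plus an $\ell_1$-regularization term, given a membership oracle for the truncation set $S$. The sketch below illustrates one plausible way to instantiate this for the truncated sparse linear regression problem (ii): proximal stochastic gradient descent, where the gradient of the truncated log-likelihood is estimated by rejection sampling through the membership oracle and the $\ell_1$ term is handled by soft-thresholding. The function names, the rejection-sampling subroutine, the step size, and the penalty weight are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def sample_conditional(mean, oracle, rng, max_tries=1000):
    """Rejection-sample t ~ N(mean, 1) conditioned on membership in S."""
    for _ in range(max_tries):
        t = rng.normal(mean, 1.0)
        if oracle(t):
            return t
    raise RuntimeError("truncation set appears to have very small mass at this mean")


def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1; keeps the iterate sparse."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)


def truncated_sparse_regression(X, y, oracle, lam=0.05, lr=0.01, epochs=50, seed=0):
    """Proximal SGD on the truncated negative log-likelihood + lam * ||w||_1.

    For a sample (x_i, y_i) and current weights w, an unbiased stochastic
    gradient of the truncated NLL is (t - y_i) * x_i, where t is drawn from
    N(x_i @ w, 1) conditioned on the truncation set S.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            t = sample_conditional(X[i] @ w, oracle, rng)
            grad = (t - y[i]) * X[i]                     # stochastic NLL gradient
            w = soft_threshold(w - lr * grad, lr * lam)  # proximal l1 step
    return w


if __name__ == "__main__":
    # Toy run: S = [0, inf), a 3-sparse signal in d = 50 dimensions.
    rng = np.random.default_rng(1)
    d, n = 50, 2000
    w_star = np.zeros(d)
    w_star[:3] = [1.5, -2.0, 1.0]
    oracle = lambda t: t >= 0.0                          # membership oracle for S
    X_all = rng.normal(size=(10 * n, d))
    y_all = X_all @ w_star + rng.normal(size=10 * n)
    keep = y_all >= 0.0                                  # only truncated samples are observed
    X, y = X_all[keep][:n], y_all[keep][:n]
    w_hat = truncated_sparse_regression(X, y, oracle)
    print("estimated support:", np.flatnonzero(np.abs(w_hat) > 0.3))
```

The identity behind the stochastic gradient: the per-sample truncated negative log-likelihood is $\tfrac{1}{2}(y - {\bf x}^\top w)^2 + \log \int_S e^{-(t - {\bf x}^\top w)^2/2}\,dt$, whose gradient in $w$ equals $(\mathbb{E}[t] - y)\,{\bf x}$ with $t \sim \mathcal{N}({\bf x}^\top w, 1)$ conditioned on $t \in S$; drawing a single such $t$ via the oracle gives an unbiased estimate.
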
Files in This Item:
File | Description | Size | Format | Access Settings | Version
---|---|---|---|---|---
2006.09735v1.pdf | | 918.03 kB | Adobe PDF | OPEN | Pre-print