Recent and Selected Publications
Estimation and Inference for the Wasserstein Distance between mixing measures in topic models (2023), Xin Bing, Florentina Bunea and Jon Niles-Weed. Previously titled: The Sketched Wasserstein Distance for mixture distributions (2022). Submitted. https://arxiv.org/abs/2206.12768
Asymptotic confidence sets for random linear programs (2023), Shuyu Liu, Florentina Bunea, and Jon Niles-Weed; Submitted; https://arxiv.org/pdf/2302.12364.pdf
Interpolating predictors in high-dimensional factor regression (2022), Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp. Journal of Machine Learning Research, Vol 23. [ArXiv].
Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations (2022), Xin Bing, Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp. Forthcoming in the Annals of Statistics. [ArXiv]
Detecting approximate replicate components of a high-dimensional random vector with latent structure (2023), Xin Bing, Florentina Bunea and Marten Wegkamp. Bernoulli, Vol. 29, pages 1368-1392 [ArXiv]
Inference in latent factor regression with clusterable features (2022), Xin Bing, Florentina Bunea and Marten Wegkamp. Bernoulli, Vol 28. [ArXiv].
Essential Regression – a generalizable framework for inferring causal latent factors from multi-omic human datasets (2022), Xin Bing, Tyler Lovelace, Florentina Bunea, Marten Wegkamp, Harinder Singh, Panayiotis V Benos, Jishnu Das. Forthcoming in Patterns-Cell Press.
Prediction in latent factor regression: Adaptive PCR and beyond (2021), Xin Bing, Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp. Journal of Machine Learning Research [ArXiv].
Optimal estimation of sparse topic models (2020), Xin Bing, Florentina Bunea, Marten Wegkamp. Journal of Machine Learning Research, Vol. 21, 1-45. [ArXiv].
A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics (2020), Xin Bing, Florentina Bunea and Marten Wegkamp. Bernoulli, Vol. 26 (3), 1765-1796. [ArXiv] (Python code is coming soon. For the beta-version of the code, please contact xb43@cornell.edu)
Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering (2020), Xin Bing, Florentina Bunea, Yang Ning and Marten Wegkamp. The Annals of Statistics, Vol. 48(4), 2055-2081. [ArXiv] (R package is coming soon. For the beta-version of the code, please contact xb43@cornell.edu)
High-Dimensional Inference for Cluster-Based Graphical Models (2020), C. Eisenach, F. Bunea, Y. Ning and C. Dinicu, Journal of Machine Learning Research, Vol. 21, 1-55. [ArXiv].
Model-assisted variable clustering: minimax-optimal recovery and algorithms (2020), Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer and Nicolas Verzelen, The Annals of Statistics, Vol. 48 (1), 111-137. [ArXiv].
Essential Regression (2019), Xin Bing, Florentina Bunea, Marten Wegkamp and Seth Strimas-Mackey. [ArXiv].
Latent model-based clustering for biological discovery (2019), Xin Bing, Florentina Bunea, Martin Royer, Jishnu Das. iScience ISSN 2589-0042. [PDF].
PECOK: a convex optimization approach to variable clustering (2017), Florentina Bunea, Christophe Giraud, Martin Royer, and Nicolas Verzelen. [ArXiv].
Minimax Optimal Variable Clustering in G-models via Cord (2016), Florentina Bunea, Christophe Giraud and Xi Luo. [ArXiv].
Convex banding of the covariance matrix (2016), J. Bien, F. Bunea and L. Xiao, Journal of the American Statistical Association, Vol. 111, 834-845. [ArXiv]
On the sample covariance matrix estimator of reduced effective rank population matrices, with applications to fPCA (2015), F. Bunea and L. Xiao, Bernoulli, Vol. 21, 1200-1230. [ArXiv]
The square root group lasso: theoretical properties and fast algorithms (2014), F. Bunea, J. Lederer and Y. She, IEEE Transactions on Information Theory, Vol. 60, 1313-1325, [ArXiv]; For Matlab Code, see http://stat.fsu.edu/~yshe/code/g-sqrtlasso.zip
Joint variable and rank selection for parsimonious estimation of high dimensional matrices (2012), F. Bunea, Y. She and M. Wegkamp, The Annals of Statistics, Vol. 40, 2359-2388, [ArXiv]
Optimal selection of reduced rank estimators of high-dimensional matrices (2011), F. Bunea, Y. She and M. Wegkamp, The Annals of Statistics, Vol. 39, 1282 – 1309, [ArXiv]; For Matlab Code, see http://stat.fsu.edu/~yshe/code/rsc.zip
Spades and Mixture Models (2010), F. Bunea, M. Wegkamp, A. Tsybakov and A. Barbu, The Annals of Statistics, Vol. 38, No. 4, 2525 – 2558, [ArXiv]
Honest variable selection in linear and logistic regression models via l1 and l1 + l2 penalization (2008), F. Bunea, The Electronic Journal of Statistics, Vol. 2, 1153-1194. [ArXiv]
Aggregation for Gaussian Regression (2007), F. Bunea, M. Wegkamp and A. Tsybakov, The Annals of Statistics, 35 (4), 1674 – 1697. [ArXiv]
Sparsity oracle inequalities for the lasso (2007), F. Bunea, A. Tsybakov and M. Wegkamp, The Electronic Journal of Statistics, 169 – 194. [ArXiv]
Consistent Covariate Selection and Post Model Selection Inference in Semiparametric Regression (2004), F. Bunea, The Annals of Statistics, Vol. 32, No. 3, 898-927. [ArXiv]