Recent and Selected Publications

SLIDE: Significant Latent Factor Interaction Discovery and Exploration across biological domains (2023), J. Rahimikollu, H. Xiao, Anna E. Rosengart, Tracy Tabib, Paul Zdinak, Kun He, Xin Bing, Florentina Bunea, Marten Wegkamp, Amanda C. Poholek, Alok V Joglekar, Robert A Lafyatis, Jishnu Das. Nature Methods. Accepted.

Estimation and Inference for the Wasserstein Distance between mixing measures in topic models (2023), Xin Bing, Florentina Bunea and Jon Niles-Weed. Old title: The Sketched Wasserstein Distance for mixture distributions. Submitted. https://arxiv.org/abs/2206.12768

Asymptotic confidence sets for random linear programs (2023), Shuyu Liu, Florentina Bunea, and Jon Niles-Weed; Conference on Learning Theory (COLT) 2023; https://arxiv.org/pdf/2302.12364.pdf

Interpolating predictors in high-dimensional factor regression  (2022), Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp.  Journal of Machine Learning Research, Vol 23. [ArXiv].

Likelihood estimation of sparse topic distributions in topic models and its applications to Wasserstein document distance calculations (2022), Xin Bing, Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp. Forthcoming in the Annals of Statistics. [ArXiv]

Detecting approximate replicate components of a high-dimensional random vector with latent structure (2023), Xin Bing, Florentina Bunea and Marten Wegkamp. Bernoulli, Vol. 29, pages 1368-1392. [ArXiv]

Inference  in  latent factor regression with clusterable features (2022), Xin Bing, Florentina Bunea and Marten Wegkamp.  Bernoulli, Vol 28. [ArXiv].

Essential Regression – a generalizable framework for inferring causal latent factors from multi-omic human datasets (2022), Xin Bing,  Tyler Lovelace, Florentina Bunea, Marten Wegkamp,  Harinder Singh, Panayiotis V Benos, Jishnu Das. Forthcoming in Patterns-Cell Press.

Prediction in latent factor regression: Adaptive PCR and beyond (2021), Xin Bing, Florentina Bunea, Seth Strimas-Mackey and Marten Wegkamp. Journal of Machine Learning Research  [ArXiv].

Optimal estimation of sparse topic models (2020),  Xin Bing, Florentina Bunea, Marten Wegkamp. Journal of Machine Learning Research, Vol. 21, 1-45. [ArXiv].

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics (2020), Xin Bing, Florentina Bunea and Marten Wegkamp. Bernoulli, Vol. 26 (3), 1765-1796. [ArXiv] (Python code is coming soon. For the beta version of the code, please contact xb43@cornell.edu)

Adaptive Estimation in Structured Factor Models with Applications to Overlapping Clustering (2020), Xin Bing, Florentina Bunea, Yang Ning and Marten Wegkamp. The Annals of Statistics, Vol. 48 (4), 2055-2081. [ArXiv] (An R package is coming soon. For the beta version of the code, please contact xb43@cornell.edu)

High-Dimensional Inference for Cluster-Based Graphical Models (2020), C. Eisenach, F. Bunea, Y. Ning and C. Dinicu, Journal of Machine Learning Research, Vol. 21, 1-55. [ArXiv].

Model-assisted variable clustering: minimax-optimal recovery and algorithms  (2020), Florentina Bunea,  Christophe Giraud, Xi Luo, Martin Royer and Nicolas Verzelen, The  Annals of Statistics, Vol. 48 (1), 111-137. [ArXiv].

Essential Regression (2019),  Xin Bing, Florentina Bunea, Marten Wegkamp and Seth Strimas-Mackey. [ArXiv].

Latent model-based clustering for biological discovery (2019), Xin Bing, Florentina Bunea, Martin Royer, Jishnu Das. iScience, ISSN 2589-0042. [PDF].

PECOK: a convex optimization approach to variable clustering (2017), Florentina Bunea, Christophe Giraud, Martin Royer, and Nicolas Verzelen. [Arxiv].

Minimax Optimal Variable Clustering in G-models via Cord (2016), Florentina Bunea, Christophe Giraud and Xi Luo. [Arxiv].

Convex banding of the covariance matrix (2016), J. Bien, F. Bunea and L. Xiao, Journal of the American Statistical Association, Volume 111, 834-845. [ArXiv]

On the sample covariance matrix estimator of reduced effective rank population matrices, with applications to fPCA (2015), F. Bunea and L. Xiao,  Bernoulli, Vol. 21, 1200-1230. [ArXiv]

The square root group lasso: theoretical properties and fast algorithms (2014), F. Bunea, J. Lederer and Y. She, IEEE-Information Theory, Vol. 60, 1313-1325, [ArXiv]; For Matlab Code, see http://stat.fsu.edu/~yshe/code/g-sqrtlasso.zip

Joint variable and rank selection for parsimonious estimation of high dimensional matrices (2012), F. Bunea, Y. She and M. Wegkamp, The Annals of Statistics, Vol. 40, 2359-2388, [ArXiv]

Optimal selection of reduced rank estimators of high-dimensional matrices (2011), F. Bunea, Y. She and M. Wegkamp, The Annals of Statistics, Vol. 39, 1282 – 1309, [ArXiv]; For Matlab Code, see http://stat.fsu.edu/~yshe/code/rsc.zip

Spades and Mixture Models (2010), F. Bunea, M. Wegkamp, A. Tsybakov and A. Barbu, The Annals of Statistics, Vol. 38, No. 4, 2525 – 2558, [ArXiv]

Honest variable selection in linear and logistic regression models via l1 and l1 + l2 penalization (2008), F. Bunea, The Electronic Journal of Statistics, Vol. 2, Pages: 1153-1194. [ArXiv]

Aggregation for Gaussian Regression (2007),  F. Bunea, M. Wegkamp and A. Tsybakov,  The Annals of Statistics,  35 (4), 1674 – 1697. [ArXiv]

Sparsity oracle inequalities for the lasso (2007),  F. Bunea, A. Tsybakov and M. Wegkamp, The Electronic Journal of Statistics, 169 – 194. [ArXiv]

Consistent Covariate Selection and Post Model Selection Inference in Semiparametric Regression (2004),  F. Bunea, The Annals of Statistics, Vol. 32, No. 3, 898-927. [ArXiv]