Discovering Structural Patterns in Statistical Models via Regularization: Asymptotics, Exact Recovery and Estimation
Author
Summary, in English
In the linear regression model, when the dimension p is fixed and n→∞, the asymptotic distribution of the estimation error for Lasso-type M-estimators is well established. In contrast, the convergence of the discrete structures (patterns) induced by these regularizers, such as sparsity or clustering, has remained largely unaddressed, even for the standard Lasso penalty. Because these lower-dimensional structures are sensitive to infinitesimal perturbations, the weak convergence of the continuous estimation error does not guarantee the convergence of the induced patterns.
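The sensitivity of patterns to small perturbations can be illustrated with a minimal sketch. It assumes the simplest possible setting, an orthogonal design, where the Lasso solution reduces to coordinate-wise soft-thresholding. The function names and numeric values are illustrative and not taken from the thesis.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: the Lasso solution for one coordinate
    under an orthogonal design (assumed setting for this sketch)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sign_pattern(beta):
    """Sign pattern of a coefficient vector: -1, 0 or +1 per entry."""
    return [(b > 0) - (b < 0) for b in beta]


lam = 1.0
z_a = [2.0, 1.0 - 1e-9, -3.0]  # second entry just below the threshold
z_b = [2.0, 1.0 + 1e-9, -3.0]  # second entry just above the threshold

beta_a = [soft_threshold(z, lam) for z in z_a]
beta_b = [soft_threshold(z, lam) for z in z_b]

# The estimates are almost identical, yet their patterns differ.
max_gap = max(abs(a - b) for a, b in zip(beta_a, beta_b))
print(max_gap)
print(sign_pattern(beta_a), sign_pattern(beta_b))
```

The two estimates differ by a negligible amount in the second coordinate, but one pattern has a zero there and the other does not, which is why closeness of the continuous error does not by itself settle what happens to the pattern.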
This thesis develops a unified theoretical framework to resolve this discrepancy. We establish that for a range of non-differentiable regularizers with a polyhedral component (including the Lasso, Generalized Lasso, SLOPE, and Elastic Net), the estimated patterns converge toward a limiting pattern determined by an explicit asymptotic formula. This limit is characterized by the Fisher Information matrix of the loss, the score covariance matrix, and the directional derivative of the regularizer. In Paper I, this is achieved in the linear model by utilizing the Hausdorff distance as a suitable mode of set convergence involving subdifferentials. In Paper II, we extend the theory from Paper I to general statistical models via a Stochastic Lipschitz Differentiability (SLD) condition, which controls the fluctuations of the Taylor remainder and, beyond differentiable losses, encompasses robust, non-smooth functions such as the Huber and quantile losses.
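For orientation, the classical fixed-p Lasso asymptotics of Knight and Fu have the following shape. This is a sketch under assumed notation (the symbols C, W, \Sigma, \lambda_0 and f are not defined in this summary, and constants depend on how the loss and penalty are normalized); it is not the thesis's exact statement.

```latex
% Assumed notation: C = Fisher information (Hessian) matrix of the loss,
% W = Gaussian limit of the normalized score with covariance \Sigma,
% f = regularizer, \lambda_0 = limit of the rescaled tuning parameter.
\[
  \sqrt{n}\,\bigl(\hat\beta_n - \beta^0\bigr)
  \;\xrightarrow{d}\;
  \hat u = \arg\min_{u \in \mathbb{R}^p}
  \Bigl\{ \tfrac12\, u^\top C u - u^\top W + \lambda_0\, f'(\beta^0; u) \Bigr\},
  \qquad W \sim \mathcal{N}(0, \Sigma),
\]
\[
  f'(\beta^0; u) = \lim_{\varepsilon \downarrow 0}
  \frac{f(\beta^0 + \varepsilon u) - f(\beta^0)}{\varepsilon},
  \qquad
  \text{e.g. for } f = \|\cdot\|_1:\quad
  f'(\beta^0; u) = \sum_{j:\, \beta^0_j \neq 0} \operatorname{sign}(\beta^0_j)\, u_j
  + \sum_{j:\, \beta^0_j = 0} |u_j| .
\]
```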
The asymptotic formula characterizes the asymptotic irrepresentability conditions under which the true signal pattern can be recovered with high probability. To handle cases where these stringent conditions fail (typically with highly correlated predictors), we develop adaptive two-step procedures based on proximal operators. Furthermore, the theory identifies a critical degeneracy of the Fused Lasso, namely its inability to recover its own clusters under orthogonal design, which motivates a Concavified Fused Lasso penalty that resolves this limitation.
Papers III and IV apply this asymptotic theory to graphical models and precision matrix estimation. Paper III establishes exact asymptotic limits for the Graphical SLOPE estimator under elliptically distributed data, revealing how it can outperform the Graphical Lasso when clustering structures are present. Paper IV focuses on the scale-invariant PCGLASSO method; we derive an irrepresentability condition under which the true sparsity structure can be recovered, theoretically justifying the method's empirical performance in discovering network hubs.
Paper V focuses on the high-dimensional regime (p >> n). By introducing a surrogate regularizer, a decomposable penalty that locally matches the original polyhedral regularizer around the true signal, we extend established high-dimensional estimation bounds to non-decomposable penalties. Using this surrogate construction, we derive explicit deterministic error bounds and exact pattern recovery conditions. While our primary application is SLOPE, the surrogate approach also covers penalties such as the Generalized Lasso. Additionally, by linking the linear SLOPE sequence to the Integrated Brownian Bridge, we show that the volume ratio between the dual SLOPE and dual Lasso balls decays at the exact rate p^{-1/4}, and we conjecture a p^{-1/6} Gaussian volume ratio decay rate for the Benjamini-Hochberg sequence.
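The objects named above can be made concrete with a small sketch of the sorted-L1 (SLOPE) norm, its dual norm, and the SLOPE pattern as usually defined in the literature. The function names, the example vectors, and the choice lam_i = p - i + 1 as a "linear" sequence are assumptions made for illustration; the thesis may use a different parametrization.

```python
def slope_norm(b, lam):
    """Sorted-L1 (SLOPE) norm: sum_i lam[i] * |b|_(i), where
    |b|_(1) >= |b|_(2) >= ... and lam is nonincreasing."""
    mags = sorted((abs(x) for x in b), reverse=True)
    return sum(l * m for l, m in zip(lam, mags))


def slope_dual_norm(v, lam):
    """Dual of the sorted-L1 norm:
    max over k of (sum of the k largest |v_i|) / (lam[0] + ... + lam[k-1]).
    Requires lam[0] > 0."""
    mags = sorted((abs(x) for x in v), reverse=True)
    best, num, den = 0.0, 0.0, 0.0
    for m, l in zip(mags, lam):
        num += m
        den += l
        best = max(best, num / den)
    return best


def slope_pattern(b):
    """SLOPE pattern: sign(b_i) times the rank of |b_i| among the
    distinct nonzero magnitudes of b (0 for zero entries)."""
    levels = sorted({abs(x) for x in b if x != 0})
    rank = {m: r + 1 for r, m in enumerate(levels)}
    return [0 if x == 0 else (1 if x > 0 else -1) * rank[abs(x)] for x in b]


p = 4
lam = [float(p - i) for i in range(p)]  # assumed linear sequence: 4, 3, 2, 1

b = [1.5, -1.5, 0.0, 4.0]
print(slope_norm(b, lam))
print(slope_pattern(b))  # encodes zeros, signs, and the cluster {1.5, -1.5}

# A point on the boundary of the dual Lasso ball with radius lam[0] ...
v = [4.0, 4.0, 4.0, 4.0]
lasso_dual = max(abs(x) for x in v) / lam[0]
# ... that lies outside the dual SLOPE ball (dual norm above 1).
print(lasso_dual, slope_dual_norm(v, lam))
```

The last lines show a vector inside the dual Lasso ball but outside the dual SLOPE ball, which is the kind of gap that a volume ratio between the two balls quantifies.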
Department/s
Publishing year
2026-05-19
Language
English
Document type
Dissertation
Publisher
Lund University (Media-Tryck)
Topic
- Probability Theory and Statistics
Keywords
- regularization
- pattern convergence
- exact recovery
- irrepresentability condition
- high-dimensional statistics
- graphical models
- non-decomposable penalization
- Lasso
- Fused Lasso
- SLOPE
Status
Published
Supervisor
- Jonas Wallin
- Malgorzata Bogdan
ISBN/ISSN/Other
- ISBN: 978-91-90202-00-5
- ISBN: 978-91-90202-01-2
Defence date
9 June 2026
Defence time
13:15
Defence place
EC3:207
Opponent
- Piotr Zwiernik (Professor)