A z-plane analysis of Tikhonov regularization in polynomial regression algorithms shows that an increased weight decay factor generally corresponds to eigenvalues of reduced radii. Such a strategy aims to reduce out-of-sample error in regression models, effectively serving as an insurance policy against overfitting. A number of experiments demonstrate the filterbank representation of this technique for a variety of values of the weight decay factor.
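Tikhonov regularization, also known to some statisticians as "ridge regression," is a simple refinement of the linear regression model. While the linear model performs the regression:

$$ w = (X^T X)^{-1} X^T y $$

regularized linear regression performs:

$$ w_{reg} = (X^T X + \lambda I)^{-1} X^T y $$

where $X$ is the matrix of inputs, $y$ is the vector of targets, and the weight decay factor is $\lambda$.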
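As a minimal sketch of the two fits above, the following pure-Python example solves the regularized normal equations (XᵀX + λI)w = Xᵀy for a quadratic polynomial; setting λ = 0 gives ordinary least squares. The sample points, the helper names `solve` and `ridge_fit`, and the value λ = 1 are illustrative assumptions, not values taken from the experiments.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, lam=0.0):
    """Return w = (X^T X + lam * I)^{-1} X^T y for a list-of-rows X."""
    d = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    moment = [sum(row[i] * t for row, t in zip(X, y)) for i in range(d)]
    return solve(gram, moment)


def norm(w):
    """Euclidean length of a weight vector."""
    return sum(c * c for c in w) ** 0.5


# Hypothetical sample points, expanded into quadratic features [1, x, x^2].
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1.9, 0.4, 0.1, 0.6, 2.1]
X = [[1.0, x, x * x] for x in xs]

w_plain = ridge_fit(X, ys, lam=0.0)  # ordinary least squares
w_ridge = ridge_fit(X, ys, lam=1.0)  # weight decay applied
print(norm(w_plain), norm(w_ridge))
```

For any λ > 0 the regularized weights have a smaller Euclidean norm than the unregularized ones, which is where the name "weight decay" comes from.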
By using the linear regression model to fit a transformation to the observed system behavior, and finding the eigenvalues of that transformation matrix, linear prediction algorithms find the coefficients of an all-pole filterbank whose response characterizes the spectral envelope of the signal. In a real-world object, the spectral envelope is distributed in space along complex exponential curves known as eigenmodes, which correspond to the eigenvectors of the decomposition. In this view, an n-dimensional linear model is a Multi-Input-Multi-Output (MIMO) adaptive filterbank. Under the filterbank interpretation, the hypothesis is that regularization "whitens" the constituent second-order filters in the model, flattening their responses and lowering their selectivity. Given the findings of a previous experiment, this increased damping is highly desirable for residual estimation.
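The link between eigenvalue radius and selectivity can be sketched for a single second-order section. The example below assumes a filter of the form H(z) = 1 / (1 - a1 z^-1 - a2 z^-2), whose poles are the eigenvalues of the companion matrix [[a1, a2], [1, 0]]. The `selectivity` measure (largest gain divided by smallest gain over the band) is a crude stand-in chosen for illustration, and the radii and the angle are hypothetical.

```python
import cmath
import math


def poles(a1, a2):
    """Eigenvalues of the companion matrix [[a1, a2], [1, 0]]."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0


def gain(a1, a2, omega):
    """Magnitude of H(z) = 1 / (1 - a1 z^-1 - a2 z^-2) at z = e^{j omega}."""
    z1 = cmath.exp(-1j * omega)
    return 1.0 / abs(1.0 - a1 * z1 - a2 * z1 * z1)


def selectivity(radius, angle, bins=512):
    """Ratio of the largest to the smallest gain sampled over [0, pi]."""
    a1, a2 = 2.0 * radius * math.cos(angle), -radius * radius
    gains = [gain(a1, a2, math.pi * k / bins) for k in range(bins + 1)]
    return max(gains) / min(gains)


# Same pole angle, shrinking radius: the response flattens ("whitens").
for radius in (0.95, 0.8, 0.5):
    a1, a2 = 2.0 * radius * math.cos(math.pi / 4), -radius * radius
    z = poles(a1, a2)[0]
    print(f"radius {abs(z):.2f}  angle {cmath.phase(z):.3f}  "
          f"selectivity {selectivity(radius, math.pi / 4):.1f}")
```

As the radius shrinks toward the origin, the gain approaches 1 at every frequency, so the section stops favoring any part of the spectrum.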
To test this, a synthetic target was used which simulates the modal response of a surface. For each trial, the same target model was driven with independent Gaussian white noise sources. Each trial consisted of 11 tests, with the weight decay parameter stepped along an exponential scale from one test to the next.
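A heavily reduced version of this procedure can be sketched with a single second-order mode in place of the full surface model. Everything specific here is an assumption made for illustration: the target radius and angle, the signal length, the random seed, and the grid of 11 weight decay values (one per decade), which is not the grid used in the trials.

```python
import cmath
import math
import random


def simulate(radius, angle, n, rng):
    """One second-order mode driven by Gaussian white noise."""
    a1, a2 = 2.0 * radius * math.cos(angle), -radius * radius
    x = [0.0, 0.0]
    for _ in range(n):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


def ridge_ar2(x, lam):
    """Solve (X^T X + lam I) a = X^T y with rows [x[n-1], x[n-2]], y = x[n]."""
    idx = range(2, len(x))
    s11 = sum(x[n - 1] ** 2 for n in idx) + lam
    s22 = sum(x[n - 2] ** 2 for n in idx) + lam
    s12 = sum(x[n - 1] * x[n - 2] for n in idx)
    b1 = sum(x[n] * x[n - 1] for n in idx)
    b2 = sum(x[n] * x[n - 2] for n in idx)
    det = s11 * s22 - s12 * s12  # 2x2 system solved by Cramer's rule
    return (b1 * s22 - b2 * s12) / det, (b2 * s11 - b1 * s12) / det


def largest_radius(a1, a2):
    """Largest eigenvalue magnitude of the companion matrix [[a1, a2], [1, 0]]."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return max(abs((a1 + disc) / 2.0), abs((a1 - disc) / 2.0))


rng = random.Random(0)
signal = simulate(radius=0.95, angle=math.pi / 4, n=200, rng=rng)

# Hypothetical grid: 11 weight decay values on an exponential scale.
lams = [10.0 ** k for k in range(-3, 8)]
radii = [largest_radius(*ridge_ar2(signal, lam)) for lam in lams]
for lam, radius in zip(lams, radii):
    print(f"lambda = {lam:9.0e}   largest radius = {radius:.3f}")
```

With a negligible weight decay the estimate sits near the target radius, and at the largest values the coefficients are driven toward zero, pulling the eigenvalues toward the origin. Repeating the run with different seeds would mimic the independent noise sources used for each sample analysis.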
Each test consisted of 4 sample analyses, which were plotted together with the ideal eigenvalues, then concatenated into animations to allow easy visual inspection of the technique's effect on bias and variance for different targets. The source code for each trial can be found here.
The results are the following z-plane diagrams. The target eigenvalues are plotted with circles, whereas each estimate in the trial is a differently colored x. The forcing function vector changes between frames as well. Click each heading to download an archive of each trial's data.
While the effect of this regularization technique on the eigenvalue estimates is nonlinear, it generally increases the damping of the filterbank adaptively where spectral peaks are less pronounced, making the model less sensitive to stochastic and deterministic noise. In certain cases, such as Trial B, this description is almost entirely sufficient to explain the behavior of the coefficients on the z-plane. However, other cases, such as Trial D, are less predictable. It appears that, in such cases, when large radii are penalized, the algorithm must adjust the angles of the eigenvalues to accommodate a least-squares fit.
Future research will explore the effects of regularization on residual (i.e., "forcing function") estimation, using methods similar to those explored for the unregularized case in previous posts.