Normal Operator

Many solvers in RegularizedLeastSquares.jl are based on the normal operator $\mathbf{A}^*\mathbf{A}$ of the linear operator $\mathbf{A}$.

Solvers such as ADMM, FISTA and POGM generally solve optimization problems of the form:

\[\begin{equation} \underset{\mathbf{x}}{argmin} \mathbf{f(x)}+ \mathbf{R(x)} \end{equation}\]

and require the gradient of the function $\mathbf{f(x)}$. In this package we specialize $\mathbf{f(x)}$ to the least squares data term:

\[\begin{equation} \mathbf{f(x)} = \frac{1}{2}\vert\vert \mathbf{A}\mathbf{x}-\mathbf{b} \vert\vert^2_2 \end{equation}\]

The gradient of this function is:

\[\begin{equation} \nabla \mathbf{f(x)} = \mathbf{A}^*(\mathbf{A}\mathbf{x}-\mathbf{b}) = \mathbf{A}^*\mathbf{Ax} - \mathbf{A}^*\mathbf{b} \end{equation}\]
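
This can be verified numerically with a small finite-difference check (a standalone sketch in plain Julia; the problem size, the names A0, b0, x0 and the step size h are arbitrary choices):

using LinearAlgebra
A0 = randn(8, 4); b0 = randn(8); x0 = randn(4)

f(x) = 0.5 * norm(A0 * x - b0)^2
grad = A0' * (A0 * x0 - b0)          # analytic gradient A'(Ax - b)

# central finite difference along the first coordinate
h = 1e-6
e1 = [1.0, 0.0, 0.0, 0.0]
fd = (f(x0 + h * e1) - f(x0 - h * e1)) / (2h)
isapprox(grad[1], fd; atol = 1e-6)   # true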

Similarly, the conjugate gradient normal residual (CGNR) algorithm applies the conjugate gradient method to the normal equations:

\[\begin{equation} \mathbf{A}^*\mathbf{A}\mathbf{x} = \mathbf{A}^*\mathbf{b} \end{equation}\]
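
For a full-rank $\mathbf{A}$, the solution of this system coincides with the least-squares solution, which can be checked directly with dense linear algebra (again a standalone sketch that does not use the package itself):

using LinearAlgebra
A0 = randn(32, 16); b0 = randn(32)

x_normal = (A0' * A0) \ (A0' * b0)   # solve the normal equations directly
x_ls = A0 \ b0                       # least-squares solution via QR
isapprox(x_normal, x_ls; rtol = 1e-6)   # true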

The normal operator can be passed directly to these solvers; otherwise, it is computed internally:

using RegularizedLeastSquares
A = randn(32, 16)
x = randn(16)
b = A*x

solver = createLinearSolver(CGNR, A; AHA = adjoint(A) * A, reg = L2Regularization(0.0001), iterations=32);
x_approx = solve!(solver, b)
16-element Vector{Float64}:
 -0.0068608070013497
  1.510279950134171
  0.49677947685196944
 -0.29238871075776907
 -0.14926922665854075
 -0.8421615508547345
 -1.8313349719860372
 -0.8129575646583957
 -1.1621652734708936
 -0.9202900795547184
 -3.3334695517379638
  0.8755477495614541
  0.2826186041481316
  0.03444547199721013
  0.8938219826009284
  0.6428792462082689

The normal operator can also be computed using the normalOperator function from LinearOperatorCollection.jl. This is useful if the normal operator is not directly available or shouldn't be stored in memory. This function is opinionated and attempts to optimize the resulting operator for iterative applications. Specifying a custom method for a custom operator allows one to control this optimization.
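
For example, a custom operator type could provide its own specialization of normalOperator via multiple dispatch (a hypothetical sketch: MyOperator and its cached field are made-up names, and the exact method signature expected by LinearOperatorCollection.jl should be checked against its documentation):

using LinearOperatorCollection

# Hypothetical operator type that carries a precomputed normal operator.
struct MyOperator{T}
    A::Matrix{T}
    cachedAHA::Matrix{T}
end

# Return the cached product instead of letting normalOperator build one.
LinearOperatorCollection.normalOperator(op::MyOperator) = op.cachedAHA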

An example of such an optimization is a matrix-free weighting of $\mathbf{A}$ as shown in the Weighting example:

using LinearOperatorCollection
weights = rand(32)
WA = ProdOp(WeightingOp(weights), A)
AHA = LinearOperatorCollection.normalOperator(WA)
Linear operator
  nrow: 16
  ncol: 16
  eltype: Float64
  symmetric: false
  hermitian: false
  nprod:   0
  ntprod:  0
  nctprod: 0

Without this optimization, a matrix-free product would apply the following operator in each iteration:

\[\begin{equation} (\mathbf{WA})^*\mathbf{WA} = \mathbf{A}^*\mathbf{W}^*\mathbf{W}\mathbf{A} \end{equation}\]

This is inefficient; instead, the normal operator can be optimized by computing the combined weighting once up front:

\[\begin{equation} \tilde{\mathbf{W}} = \mathbf{W}^*\mathbf{W} \end{equation}\]

and then applying the following operator in each iteration:

\[\begin{equation} \mathbf{A}^*\tilde{\mathbf{W}}\mathbf{A} \end{equation}\]
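
The equivalence of the two formulations can be checked with a dense diagonal weighting; for real-valued weights, $\tilde{\mathbf{W}}$ simply carries the squared weights on its diagonal (a small sketch reusing A, x and weights from above):

using LinearAlgebra
W = Diagonal(weights)
Wtilde = W' * W                      # precomputed once; equals Diagonal(weights .^ 2) here

lhs = A' * (W' * (W * (A * x)))      # naive application in each iteration
rhs = A' * (Wtilde * (A * x))        # optimized application
isapprox(lhs, rhs)                   # true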

The optimized normal operator can then be passed to the solver:

solver = createLinearSolver(CGNR, WA; AHA = AHA, reg = L2Regularization(0.0001), iterations=32);
x_approx2 = solve!(solver, weights .* b)
16-element Vector{Float64}:
 -0.006837192826844945
  1.5102911602957558
  0.49676488524402884
 -0.2924115178495965
 -0.1493748074846615
 -0.8421717429978992
 -1.8313337287492306
 -0.812867349035906
 -1.1621124027528222
 -0.9202872349850199
 -3.3333631998587387
  0.8755119410460656
  0.28257824769546835
  0.03441534824774462
  0.8937820594281263
  0.6429089170204695

Of course, it is also possible to optimize a normal operator by other means and pass it to the solver via the AHA keyword argument.
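
For instance, for an operator of this size the weighted normal operator could simply be precomputed as a dense matrix and passed directly (a minimal sketch reusing A, WA, weights and b from above):

using LinearAlgebra
AHA_dense = A' * Diagonal(abs2.(weights)) * A    # dense 16×16 matrix A'W'WA

solver = createLinearSolver(CGNR, WA; AHA = AHA_dense, reg = L2Regularization(0.0001), iterations = 32);
x_approx3 = solve!(solver, weights .* b)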

It is also possible to supply only the normal operator to these solvers; however, one then needs to supply $\mathbf{A}^*\mathbf{b}$ instead of $\mathbf{b}$.
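
A sketch of this usage with CGNR, assuming the constructor accepts nothing in place of the forward operator when AHA is provided (the individual solver docstrings list the supported argument combinations):

solver = createLinearSolver(CGNR, nothing; AHA = adjoint(A) * A, reg = L2Regularization(0.0001), iterations = 32);
x_approx4 = solve!(solver, adjoint(A) * b)   # note: A'b is passed instead of b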


This page was generated using Literate.jl.