Efficient Kaczmarz
Unlike many of the other solvers provided by RegularizedLeastSquares.jl, the Kaczmarz method does not use matrix-vector products with the operator $\mathbf{A}$ or the normal operator $\mathbf{A}^*\mathbf{A}$. Instead, it updates the solution iteratively using the rows of $\mathbf{A}$. An efficient Kaczmarz implementation therefore requires very efficient dot products with the rows of $\mathbf{A}$. In RegularizedLeastSquares.jl, this is achieved with the `dot_with_matrix_row` function.
using RegularizedLeastSquares
A = randn(256, 256)
x = randn(256)
b = A*x;
The `dot_with_matrix_row` function calculates the dot product between a row of `A` and the current approximate solution `x`:
row = 1
isapprox(RegularizedLeastSquares.dot_with_matrix_row(A, x, row), sum(A[row, :] .* x))
true
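To see why this row dot product dominates the cost, recall that each Kaczmarz step projects the current iterate onto the hyperplane defined by a single row. The following is a minimal, stand-alone sketch of the classical (unregularized) cyclic update in plain Julia, not the package's implementation; the function name `kaczmarz_sweep!` is chosen here for illustration:

```julia
using LinearAlgebra, Random

# One cyclic sweep of the classical Kaczmarz method: for each row aᵢ,
# project x onto the hyperplane aᵢ' x = bᵢ. Every step needs one
# dot product with a row of A -- this is the hot loop.
function kaczmarz_sweep!(x, A, b)
    for i in axes(A, 1)
        ai = view(A, i, :)                      # row access
        x .+= ((b[i] - dot(ai, x)) / dot(ai, ai)) .* ai
    end
    return x
end

Random.seed!(1)
A = randn(200, 20)          # overdetermined, consistent system
xtrue = randn(20)
b = A * xtrue

x = zeros(20)
for _ in 1:50
    kaczmarz_sweep!(x, A, b)
end
norm(A * x - b) / norm(b)   # relative residual shrinks with each sweep
```

Note that each sweep touches every row of `A` once, which is why the memory layout of the rows matters so much below.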
Since Julia stores dense arrays in column-major order, such a row-based operation is quite inefficient. A workaround is to transpose the matrix and then pass the transpose to the Kaczmarz solver.
At = collect(transpose(A))
A_eff = transpose(At)
256×256 transpose(::Matrix{Float64}) with eltype Float64:
  0.989365  -0.58187    1.97371   …  -0.694757  -0.655809  -0.996402
 -1.40978    0.192424  -0.745341     -0.547174   0.652495  -1.07808
  ⋮                               ⋱
 -0.106906   1.26698   -0.397374  …   0.112938   0.414758   0.659097
Note that the `transpose` function returns a lazy transpose object, so we first collect the transpose into a dense matrix. We then transpose that dense matrix again to obtain a representation with the same values as `A`, but whose rows are stored contiguously in memory.
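The double transpose leaves the matrix values unchanged and only flips the memory layout: each row of the lazy transpose is backed by a contiguous column of the underlying dense storage. A small stand-alone check of this (variable names mirror those above for illustration):

```julia
using LinearAlgebra

A     = randn(256, 256)
At    = collect(transpose(A))   # dense copy, stored column-major
A_eff = transpose(At)           # lazy transpose wrapping At

# Same matrix, element for element:
A_eff == A                       # holds

# A_eff is a lazy wrapper, not a copy:
A_eff isa LinearAlgebra.Transpose   # holds

# Row `row` of A_eff is column `row` of At, i.e. a contiguous
# stretch of memory -- exactly what a row dot product wants.
row = 1
A_eff[row, :] == At[:, row]      # holds
```

So iterating over rows of `A_eff` walks sequentially through memory, while iterating over rows of `A` strides across columns.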
We can compare the performance using the BenchmarkTools.jl package. First for the original matrix:
using BenchmarkTools
solver = createLinearSolver(Kaczmarz, A; reg = L2Regularization(0.0001), iterations=100)
@benchmark solve!(solver, b) samples = 100
BenchmarkTools.Trial: 100 samples with 1 evaluation.
 Range (min … max):  25.524 ms … 27.560 ms   ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     27.325 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   27.310 ms ± 189.770 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 25.5 ms          Histogram: frequency by time          27.4 ms <

 Memory estimate: 17.31 KiB, allocs estimate: 505.
And then for the efficient matrix:
solver_eff = createLinearSolver(Kaczmarz, A_eff; reg = L2Regularization(0.0001), iterations=100)
@benchmark solve!(solver_eff, b) samples = 100
BenchmarkTools.Trial: 100 samples with 1 evaluation.
 Range (min … max):  1.826 ms … 2.073 ms   ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.856 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.859 ms ± 29.682 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

 1.83 ms         Histogram: frequency by time         1.92 ms <

 Memory estimate: 17.34 KiB, allocs estimate: 507.
We can also combine the efficient matrix with a weighting matrix, as is shown in the Weighting example.
Custom operators need to implement the `dot_with_matrix_row` function to be used with the Kaczmarz solver. Ideally, such an implementation is allocation-free.
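As a sketch of what such an extension could look like: the wrapper type `RowScaledOp` below is hypothetical, and the stand-alone `dot_with_matrix_row` defined here only mimics the `dot_with_matrix_row(A, x, row)` usage shown above; in the real package one would extend `RegularizedLeastSquares.dot_with_matrix_row` instead.

```julia
using LinearAlgebra

# Stand-alone fallback mirroring the generic behavior shown earlier.
dot_with_matrix_row(A::AbstractMatrix, x, row) = dot(view(A, row, :), x)

# Hypothetical custom operator: a matrix with per-row weights, stored
# transposed so that row dot products run over contiguous memory.
struct RowScaledOp{T}
    At::Matrix{T}   # column j of At holds row j of the operator
    w::Vector{T}    # per-row scaling factors
end

# Allocation-free row dot product: a view plus `dot` create no temporaries.
function dot_with_matrix_row(op::RowScaledOp, x, row)
    return op.w[row] * dot(view(op.At, :, row), x)
end

A = randn(8, 8); w = rand(8); x = randn(8)
op = RowScaledOp(collect(transpose(A)), w)

# Consistent with the equivalent dense row dot product:
dot_with_matrix_row(op, x, 3) ≈ w[3] * dot(A[3, :], x)
```

Storing the operator transposed, as in this sketch, gives the custom type the same contiguous-row access pattern that the double-transpose trick gives dense matrices.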
This page was generated using Literate.jl.