Efficient Kaczmarz
Unlike many of the other solvers provided by RegularizedLeastSquares.jl, the Kaczmarz method does not rely on matrix-vector products with the operator $\mathbf{A}$ or the normal operator $\mathbf{A}^*\mathbf{A}$. Instead, it uses the rows of $\mathbf{A}$ to update the solution iteratively. An efficient Kaczmarz implementation therefore requires very fast dot products with the rows of $\mathbf{A}$. In RegularizedLeastSquares.jl, this is achieved with the `dot_with_matrix_row` function.
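To make the row-based update concrete, here is a minimal, self-contained sketch of one sweep of the classical (cyclic) Kaczmarz iteration. This is an illustration only, not the package's internal implementation, which additionally handles regularization and row selection:

```julia
using LinearAlgebra

# One sweep of the classical Kaczmarz method: project the current
# iterate onto the hyperplane ⟨aᵢ, x⟩ = bᵢ defined by each row aᵢ of A.
function kaczmarz_sweep!(x, A, b)
    for i in axes(A, 1)
        a = view(A, i, :)  # i-th row of A — this row access is the hot loop
        x .+= ((b[i] - dot(a, x)) / dot(a, a)) .* a
    end
    return x
end

A = randn(64, 64); x_true = randn(64); b = A * x_true
x = zeros(64)
for _ in 1:200
    kaczmarz_sweep!(x, A, b)
end
norm(A * x - b)  # shrinks toward zero as the iteration converges
```

Each iteration touches one row of `A`, which is why the memory layout of the rows dominates the solver's runtime.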
```julia
using RegularizedLeastSquares

A = randn(256, 256)
x = randn(256)
b = A * x;
```

The `dot_with_matrix_row` function calculates the dot product between a row of `A` and the current approximate solution `x`:

```julia
row = 1
isapprox(RegularizedLeastSquares.dot_with_matrix_row(A, x, row), sum(A[row, :] .* x))
# true
```

Since Julia stores dense arrays in column-major order, such a row-based operation is quite inefficient. A workaround is to transpose the matrix and then pass the transposed matrix to a Kaczmarz solver.
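The cost of column-major storage can be seen directly: a column view of a dense `Matrix` is contiguous in memory, while a row view strides across the array. A small sketch using fresh variables `B` and `y` (timings are machine-dependent):

```julia
using LinearAlgebra, BenchmarkTools

B = randn(4096, 4096)
y = randn(4096)

# A column view is contiguous in Julia's column-major layout ...
@btime dot(view($B, :, 1), $y);

# ... whereas a row view strides through memory with step 4096,
# which is far less cache-friendly.
@btime dot(view($B, 1, :), $y);
```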
```julia
At = collect(transpose(A))
A_eff = transpose(At)
# 256×256 transpose(::Matrix{Float64}) with eltype Float64
```

Note that `transpose` returns a lazy transpose object, so we first `collect` the transpose into a dense matrix `At`. Transposing `At` again then yields `A_eff`, which is mathematically identical to `A` but stores each row of `A` contiguously in memory.
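The lazy behaviour can be checked from the types involved; here is a quick sketch on a small throwaway matrix `M`:

```julia
M = randn(4, 4)
typeof(transpose(M))        # Transpose{Float64, Matrix{Float64}} – a lazy wrapper, no copy
Mt = collect(transpose(M))  # materializes the transpose into a dense Matrix{Float64}
M_eff = transpose(Mt)       # lazy wrapper again; rows of M_eff are columns of Mt
M_eff == M                  # true – the same matrix, with a row-friendly memory layout
```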
We can compare the performance using the BenchmarkTools.jl package. First for the original matrix:
```julia
using BenchmarkTools
solver = createLinearSolver(Kaczmarz, A; reg = L2Regularization(0.0001), iterations = 100)
@benchmark solve!(solver, b) samples = 100
```

```
BenchmarkTools.Trial: 100 samples with 1 evaluation per sample.
 Range (min … max):  37.972 ms … 38.524 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     38.139 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.157 ms ± 96.939 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%
 Memory estimate: 17.31 KiB, allocs estimate: 505.
```

And then for the efficient matrix:
```julia
solver_eff = createLinearSolver(Kaczmarz, A_eff; reg = L2Regularization(0.0001), iterations = 100)
@benchmark solve!(solver_eff, b) samples = 100
```

```
BenchmarkTools.Trial: 100 samples with 1 evaluation per sample.
 Range (min … max):  1.801 ms … 1.902 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.818 ms             ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.820 ms ± 14.699 μs ┊ GC (mean ± σ):  0.00% ± 0.00%
 Memory estimate: 17.34 KiB, allocs estimate: 507.
```

The transposed representation is roughly twenty times faster here. We can also combine the efficient matrix with a weighting matrix, as shown in the Weighting example.
Custom operators need to implement the `dot_with_matrix_row` function to be used with the Kaczmarz solver. Ideally, such an implementation is allocation-free.
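As an illustration, a custom operator could extend the interface along these lines. The wrapper type `ScaledMatrix` and its fields are invented for this sketch, and only `dot_with_matrix_row` is shown, not any other methods the solvers may require:

```julia
using RegularizedLeastSquares

# Hypothetical operator representing α * B without materializing the product.
struct ScaledMatrix{T}
    α::T
    B::Matrix{T}
end

# Allocation-free row dot product: accumulate along row k of B in a scalar
# and scale once at the end, instead of forming the matrix α * B.
function RegularizedLeastSquares.dot_with_matrix_row(S::ScaledMatrix, x::AbstractVector, k::Integer)
    acc = zero(eltype(S.B))
    @inbounds for j in eachindex(x)
        acc += S.B[k, j] * x[j]
    end
    return S.α * acc
end
```

For a dense backing matrix one would again want to store `B` transposed, as shown above; the point here is only the shape of the interface and that the method allocates nothing.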
This page was generated using Literate.jl.