Efficient Kaczmarz
Unlike many of the other solvers provided by RegularizedLeastSquares.jl, the Kaczmarz method does not use matrix-vector products with the operator $\mathbf{A}$ or the normal operator $\mathbf{A}^*\mathbf{A}$. Instead, it updates the solution iteratively using individual rows of $\mathbf{A}$. An efficient Kaczmarz implementation therefore requires very efficient dot products with the rows of $\mathbf{A}$. In RegularizedLeastSquares.jl, this is achieved with the dot_with_matrix_row function.
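To make the row dependence concrete, here is a minimal sketch of one unregularized Kaczmarz sweep. This is illustrative only; kaczmarz_sweep! is not part of the package, and the actual solver additionally handles regularization and more:

using LinearAlgebra

# One sweep of the classical Kaczmarz iteration (sketch, not library code):
# for each row a_i of A, project x onto the hyperplane ⟨a_i, x⟩ = b_i,
#   x ← x + (b_i - ⟨a_i, x⟩) / ‖a_i‖² ⋅ a_i
function kaczmarz_sweep!(x, A, b)
    for i in axes(A, 1)
        a = @view A[i, :]                           # row access: the hot loop
        x .+= ((b[i] - dot(a, x)) / dot(a, a)) .* a
    end
    return x
end

Every inner step needs a dot product with a row of $\mathbf{A}$, which is exactly what dot_with_matrix_row provides. Let us set up a small test problem: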
using RegularizedLeastSquares
A = randn(256, 256)
x = randn(256)
b = A*x;

The dot_with_matrix_row function calculates the dot product between a row of A and the current approximate solution x:
row = 1
isapprox(RegularizedLeastSquares.dot_with_matrix_row(A, x, row), sum(A[row, :] .* x))

true

Since Julia stores dense arrays in column-major order, such a row-based operation is quite inefficient. A workaround is to transpose the matrix and then pass it to the Kaczmarz solver.
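To see the layout issue concretely, the strides of a dense matrix (a small aside added here for illustration) show how far apart consecutive row elements live in memory:

# Column-major layout: stepping down a column moves 1 element in memory,
# stepping along a row moves a full column length (256 elements) per step.
strides(A)   # (1, 256)

Reading a row therefore touches 256 widely separated memory locations, while reading a column is a contiguous scan.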
At = collect(transpose(A))
A_eff = transpose(At)

256×256 transpose(::Matrix{Float64}) with eltype Float64: [matrix entries omitted]

Note that the transpose function returns a lazy transpose wrapper, so we first collect the transpose into a dense matrix. Then we transpose it again to get the efficient representation of the matrix: a lazy transpose whose rows correspond to contiguous columns of At.
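As a quick sanity check (added for illustration), the doubly transposed matrix is numerically identical to the original; only the underlying memory layout has changed:

using LinearAlgebra

A_eff == A            # true: same entries as A
A_eff isa Transpose   # true: row i of A_eff is the contiguous column i of At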
We can compare the performance using the BenchmarkTools.jl package. First for the original matrix:
using BenchmarkTools
solver = createLinearSolver(Kaczmarz, A; reg = L2Regularization(0.0001), iterations=100)
@benchmark solve!(solver, b) samples = 100

BenchmarkTools.Trial: 100 samples with 1 evaluation per sample.
 Range (min … max):  38.329 ms … 38.600 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     38.450 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.448 ms ± 56.091 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%
 [histogram omitted]
 Memory estimate: 17.31 KiB, allocs estimate: 505.

And then for the efficient matrix:
solver_eff = createLinearSolver(Kaczmarz, A_eff; reg = L2Regularization(0.0001), iterations=100)
@benchmark solve!(solver_eff, b) samples = 100

BenchmarkTools.Trial: 100 samples with 1 evaluation per sample.
 Range (min … max):  1.795 ms … 1.851 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.811 ms             ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.812 ms ± 8.681 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%
 [histogram omitted]
 Memory estimate: 17.34 KiB, allocs estimate: 507.

With the transposed representation, the solver runs roughly twenty times faster on this example. We can also combine the efficient matrix with a weighting matrix, as is shown in the Weighting example.
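Both representations also produce the same solution, which we can verify directly (a check added here for illustration):

x_orig = solve!(solver, b)
x_fast = solve!(solver_eff, b)
isapprox(x_orig, x_fast)   # true up to floating-point noise: same iterates, different memory layout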
Custom operators need to implement the dot_with_matrix_row function to be used with the Kaczmarz solver. Ideally, such an implementation is allocation-free.
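As a rough sketch of what such an implementation could look like (the RowStoredOperator type below is hypothetical and introduced purely for illustration; the semantics mirror the sum(A[row, :] .* x) check from above):

import RegularizedLeastSquares: dot_with_matrix_row

# Hypothetical operator that stores each row contiguously.
struct RowStoredOperator{T}
    rows::Vector{Vector{T}}
end

# Allocation-free row dot product: no views or temporaries are created.
function dot_with_matrix_row(op::RowStoredOperator{T}, x::AbstractVector{T}, k::Integer) where {T}
    r = op.rows[k]
    acc = zero(T)
    @inbounds @simd for j in eachindex(r, x)
        acc += r[j] * x[j]
    end
    return acc
end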
This page was generated using Literate.jl.