Multi-Threading
There are different ways multi-threading can be used with RegularizedLeastSquares.jl. To use multi-threading in Julia, one needs to start the session with multiple threads; see the Julia documentation on multi-threading for more information.
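For example, a session can be started with julia --threads 4 or by setting the JULIA_NUM_THREADS environment variable. Within a running session, the number of available threads can be checked as follows:
Threads.nthreads()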
Solver Based Multi-Threading
This type of multi-threading is transparent to the solver and is applicable if the total solution is composed of individual solutions that can be solved in parallel. In particular, this approach also allows for using solvers with different parameters, such as their operator or regularization parameters.
using RegularizedLeastSquares
As = [rand(32, 16) for _ in 1:4]
xs = [rand(16) for _ in 1:4]
bs = [A*x for (A, x) in zip(As, xs)]
xs_approx = similar(xs)
Threads.@threads for i in 1:4
  solver = createLinearSolver(CGNR, As[i]; iterations=32)
  xs_approx[i] = solve!(solver, bs[i])
end
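Since each task constructs its own solver, the parameters can differ between the individual problems. The following sketch assumes hypothetical per-problem regularization strengths λs and attaches an L2Regularization term to each CGNR solver; it illustrates the pattern rather than a recommended parameter choice:
λs = [1.0, 0.1, 0.01, 0.001] # hypothetical regularization strengths, one per problem
xs_reg = similar(xs)
Threads.@threads for i in 1:4
  solver = createLinearSolver(CGNR, As[i]; reg = L2Regularization(λs[i]), iterations = 32)
  xs_reg[i] = solve!(solver, bs[i])
end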
Operator Based Multi-Threading
This type of multi-threading involves linear operators or proximal maps that can be implemented in parallel. Examples of this include the proximal map of the TV regularization term, which is based on the multi-threaded GradientOp from LinearOperatorCollection. GPU acceleration also falls under this approach; see GPU Acceleration for more information.
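As a rough sketch of how this looks from the user's side, a TV-regularized problem can be set up with a proximal-gradient solver such as FISTA; the threading then happens inside the proximal map rather than in user code. The shape keyword and the parameter values below are assumptions chosen for illustration:
A = rand(32, 16)
b = A * rand(16)
reg = TVRegularization(0.01; shape = (4, 4)) # the TV proximal map uses the multi-threaded GradientOp internally
solver = createLinearSolver(FISTA, A; reg = reg, iterations = 32)
x_tv = solve!(solver, b)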
Measurement Based Multi-Threading
This level of multi-threading applies the same solver (and its parameters) to multiple measurement vectors, or equivalently to a measurement matrix B. This is useful when several measurements can be solved in parallel while reusing the same solver. This approach is not applicable if the operator is stateful.
To use this approach we first build a measurement matrix B and a corresponding solver:
A = first(As)
B = mapreduce(x -> A*x, hcat, xs)
solver = createLinearSolver(CGNR, A; iterations=32)
CGNR{Matrix{Float64}, Matrix{Float64}, L2Regularization{Float64}, Vector{Any}}(…)
We can then simply pass the measurement matrix to the solver. The result will be the same as if we passed each column of B separately:
x_approx = solve!(solver, B)
size(x_approx)
(16, 4)
The previous solve! call was still executed sequentially. To execute it in parallel, we have to specify a multi-threaded scheduler as a keyword argument of the solve! call. RegularizedLeastSquares.jl provides a MultiThreadingState scheduler that can be used for this purpose. This scheduler is based on the Threads.@threads macro:
x_multi = solve!(solver, B; scheduler = MultiThreadingState)
x_approx == x_multi
true
Custom Scheduling
It is possible to implement custom scheduling. The following code shows how to implement this for the Threads.@spawn macro. Usually, one uses this to implement multi-threading with a package such as FLoops.jl or ThreadPools.jl for thread pinning.
Since most solvers have convergence criteria, they can finish after different numbers of iterations. We track this information with a vector of flags marking which states are still active.
mutable struct SpawnState{S, ST <: AbstractSolverState{S}} <: RegularizedLeastSquares.AbstractMatrixSolverState{S}
  states::Vector{ST}   # one inner solver state per measurement vector
  active::Vector{Bool} # flags tracking which states have not yet converged
  SpawnState(states::Vector{ST}) where {S, ST <: AbstractSolverState{S}} = new{S, ST}(states, fill(true, length(states)))
end
To hook into the existing init! code we only have to supply a method that gets a copyable "vector" state. This will invoke our SpawnState constructor with copies of the given state.
prepareMultiStates(solver::AbstractLinearSolver, state::SpawnState, b::AbstractMatrix) = prepareMultiStates(solver, first(state.states), b);
We specialize the iterate function, which is called with the indices of the still-active states:
function Base.iterate(solver::AbstractLinearSolver, state::SpawnState, activeIdx)
  @sync for i in activeIdx
    Threads.@spawn begin
      # advance the i-th state by one iteration and mark it inactive once it has converged
      res = iterate(solver, state.states[i])
      if isnothing(res)
        state.active[i] = false
      end
    end
  end
  return state.active, state
end
Now we can simply use the SpawnState scheduler in the solve! call:
x_custom = solve!(solver, B; scheduler = SpawnState)
x_approx == x_custom
true
This page was generated using Literate.jl.