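Periodic pentadiagonal solve: a non-periodic pentadiagonal Thomas solve followed by a rank-4 Sherman-Morrison-Woodbury correction.
A_cyc = A_np + U C, where U = [e_1 | e_2 | e_{n-1} | e_n] is n x 4 and C is 4 x n and encodes the 6 periodic corner entries. Thread 1 computes the four columns Z_sh(:, k) = A_np^{-1} e_{p_k}, k = 1..4, with p = (1, 2, n-1, n), and shares them with the rest of the block via syncthreads. Each lane then forms M = I + C Z, solves the 4 x 4 system M c = C y, where y = A_np^{-1} b is the non-periodic solution for that lane's right-hand side b, and applies the correction du -= Z c.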
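The correction can be checked on the host with a small reference sketch. The following is a minimal pure-Python illustration, not the kernel itself. It assumes a constant stencil (bet, alp, 1, alp, bet), which the actual kernel need not use, and it substitutes dense Gaussian elimination for the banded Thomas sweep. The helper names `solve_dense`, `band_matrix` and `woodbury_periodic_solve` are hypothetical, and indices are 0-based, so p = (0, 1, n-2, n-1).

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting.

    Stand-in for the banded (Thomas-type) solve; A and b are not modified.
    """
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def band_matrix(n, alp, bet, periodic):
    """Pentadiagonal matrix with an assumed constant stencil (bet, alp, 1, alp, bet)."""
    stencil = {-2: bet, -1: alp, 0: 1.0, 1: alp, 2: bet}
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k, v in stencil.items():
            j = i + k
            if periodic:
                j %= n  # wrap around: this creates the 6 corner entries
            if 0 <= j < n:
                A[i][j] = v
    return A


def woodbury_periodic_solve(n, alp, bet, b):
    """Solve A_cyc x = b as x = y - Z (I + C Z)^{-1} C y."""
    A_np = band_matrix(n, alp, bet, periodic=False)
    A_cyc = band_matrix(n, alp, bet, periodic=True)
    p = [0, 1, n - 2, n - 1]  # rows touched by the periodic corners

    # C (4 x n): rows p of the corner matrix A_cyc - A_np, so that U C = A_cyc - A_np.
    C = [[A_cyc[r][j] - A_np[r][j] for j in range(n)] for r in p]

    # Z[k] is column k of Z = A_np^{-1} U; it does not depend on the right-hand side.
    Z = []
    for r in p:
        e = [0.0] * n
        e[r] = 1.0
        Z.append(solve_dense(A_np, e))

    y = solve_dense(A_np, b)  # non-periodic solution

    # 4 x 4 capacitance system M c = C y with M = I + C Z.
    M = [[(1.0 if j == k else 0.0) + sum(C[j][i] * Z[k][i] for i in range(n))
          for k in range(4)] for j in range(4)]
    rhs = [sum(C[j][i] * y[i] for i in range(n)) for j in range(4)]
    c = solve_dense(M, rhs)

    # Correction step, the analogue of du -= Z c.
    return [y[i] - sum(Z[k][i] * c[k] for k in range(4)) for i in range(n)]
```

Since Z depends only on the matrix and not on the right-hand side, it can be computed once and reused for every right-hand side, which is consistent with a single thread filling Z_sh before the synchronisation. Comparing `woodbury_periodic_solve(n, alp, bet, b)` with a direct `solve_dense` on the periodic matrix is a simple way to validate the corner encoding in C.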
| Type | Intent | Optional | Attributes | Name |
|---|---|---|---|---|
| real(kind=dp) | intent(out) | | device, dimension(:, :, :) | du |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u_s |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u_e |
| integer | intent(in) | | value | n_tds |
| integer | intent(in) | | value | n_rhs |
| real(kind=dp) | intent(in) | | device, dimension(:, :) | coeffs_s |
| real(kind=dp) | intent(in) | | device, dimension(:, :) | coeffs_e |
| real(kind=dp) | intent(in) | | device, dimension(:) | coeffs |
| real(kind=dp) | intent(in) | | device, dimension(:) | ffr |
| real(kind=dp) | intent(in) | | device, dimension(:) | faf |
| real(kind=dp) | intent(in) | | device, dimension(:) | fsa |
| real(kind=dp) | intent(in) | | device, dimension(:) | fbw |
| real(kind=dp) | intent(in) | | value | beta_lhs |
| real(kind=dp) | intent(in) | | value | beta_lhs_s |
| real(kind=dp) | intent(in) | | value | alp |
| real(kind=dp) | intent(in) | | value | bet |