Full (forward + backward) non-periodic pentadiagonal Thomas solve.
First builds the RHS from u using the same 9-element stencil as der_univ_dist (ffr/faf/fsa computed by preprocess_penta_dist). Then performs forward elimination (5-band LU) and backward substitution (upper-3 band) in place.
This is a single-GPU kernel; multi-GPU periodic extension requires a distributed pentadiag reduction (future work).
| Type | Intent | Optional | Attributes | Name | Description |
|---|---|---|---|---|---|
| real(kind=dp) | intent(out) | | device, dimension(:, :, :) | du | |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u | |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u_s | |
| real(kind=dp) | intent(in) | | device, dimension(:, :, :) | u_e | |
| integer | intent(in) | | value | n_tds | |
| integer | intent(in) | | value | n_rhs | |
| real(kind=dp) | intent(in) | | device, dimension(:, :) | coeffs_s | |
| real(kind=dp) | intent(in) | | device, dimension(:, :) | coeffs_e | |
| real(kind=dp) | intent(in) | | device, dimension(:) | coeffs | |
| real(kind=dp) | intent(in) | | device, dimension(:) | ffr | |
| real(kind=dp) | intent(in) | | device, dimension(:) | faf | |
| real(kind=dp) | intent(in) | | device, dimension(:) | fsa | |
| real(kind=dp) | intent(in) | | device, dimension(:) | fbw | |
| real(kind=dp) | intent(in) | | value | beta_lhs | |
| real(kind=dp) | intent(in) | | value | beta_lhs_s | j=1 beta (0 sym=True, 2β sym=False, β default) |
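
The two phases named in the description (forward elimination over five bands, then back substitution over the three upper bands) can be illustrated with a serial, host-side sketch. This is not the kernel itself: it takes an already assembled RHS rather than building it from the stencil, it eliminates the bands directly instead of using precomputed factors such as ffr/faf/fsa/fbw, and the function names `penta_solve` and `penta_matvec` and the band names `e, c, d, a, b` are hypothetical, chosen only for this example. It assumes no pivoting is needed (for instance, a diagonally dominant matrix).

```python
def penta_solve(e, c, d, a, b, rhs):
    """Solve a non-periodic pentadiagonal system without pivoting.

    Row i reads:
        e[i]*x[i-2] + c[i]*x[i-1] + d[i]*x[i] + a[i]*x[i+1] + b[i]*x[i+2] = rhs[i]
    Entries that fall outside the matrix (e[0], e[1], c[0], a[n-1],
    b[n-2], b[n-1]) are never read as matrix coefficients.
    """
    n = len(d)
    # Work on copies so the caller's bands are left untouched.
    c, d, a, y = list(c), list(d), list(a), list(rhs)

    # Forward elimination: use row i to remove x[i] from rows i+1 and i+2.
    for i in range(n - 1):
        m = c[i + 1] / d[i]
        d[i + 1] -= m * a[i]
        if i + 2 < n:
            a[i + 1] -= m * b[i]
        y[i + 1] -= m * y[i]

        if i + 2 < n:
            m = e[i + 2] / d[i]
            c[i + 2] -= m * a[i]
            d[i + 2] -= m * b[i]
            y[i + 2] -= m * y[i]

    # Back substitution: only the diagonal and two upper bands remain.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i]
        if i + 1 < n:
            s -= a[i] * x[i + 1]
        if i + 2 < n:
            s -= b[i] * x[i + 2]
        x[i] = s / d[i]
    return x


def penta_matvec(e, c, d, a, b, x):
    """Multiply the pentadiagonal matrix by x (used to check the solver)."""
    n = len(d)
    out = []
    for i in range(n):
        s = d[i] * x[i]
        if i >= 1:
            s += c[i] * x[i - 1]
        if i >= 2:
            s += e[i] * x[i - 2]
        if i + 1 < n:
            s += a[i] * x[i + 1]
        if i + 2 < n:
            s += b[i] * x[i + 2]
        out.append(s)
    return out


if __name__ == "__main__":
    n = 8
    # Diagonally dominant test bands (illustrative values only).
    e, c, d, a, b = [0.1] * n, [0.3] * n, [2.0] * n, [0.4] * n, [0.2] * n
    x_true = [float(i + 1) for i in range(n)]
    x = penta_solve(e, c, d, a, b, penta_matvec(e, c, d, a, b, x_true))
    print("max error:", max(abs(p - q) for p, q in zip(x, x_true)))
```

In a GPU setting each thread would typically run this recurrence for one independent line of the 3D array; the sketch only shows the per-line arithmetic.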