der_penta_full Subroutine

public subroutine der_penta_full(du, u, u_s, u_e, n_tds, n_rhs, coeffs_s, coeffs_e, coeffs, ffr, faf, fsa, fbw, beta_lhs, beta_lhs_s)

Full (forward + backward) non-periodic pentadiagonal Thomas solve.

It first builds the RHS from u using the same 9-element stencil as der_univ_dist (ffr/faf/fsa are computed by preprocess_penta_dist), then performs forward elimination (5-band LU) and backward substitution (upper 3 bands) in place.

This is a single-GPU kernel; a multi-GPU periodic extension would require a distributed pentadiagonal reduction (future work).
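
The forward elimination and backward substitution described above can be sketched on the host as a serial reference. This is a minimal illustration in Python, not the kernel's CUDA Fortran code: the band names e, c, d, a, b and the function solve_pentadiagonal are hypothetical, the RHS is taken as given rather than built from the 9-element stencil, and no pivoting is done, so the sketch assumes a diagonally dominant matrix.

```python
def solve_pentadiagonal(e, c, d, a, b, r):
    """Solve a non-periodic pentadiagonal system without pivoting.

    Row i reads: e[i]*x[i-2] + c[i]*x[i-1] + d[i]*x[i]
                 + a[i]*x[i+1] + b[i]*x[i+2] = r[i].
    Band entries that fall outside the matrix are ignored.
    All names here are illustrative, not the kernel's own.
    """
    n = len(d)
    # Work on copies so the caller's bands and RHS are left untouched.
    c, d, a, b, x = list(c), list(d), list(a), list(b), list(r)

    # Forward elimination: remove both sub-diagonals (5 bands -> upper 3).
    for i in range(1, n):
        # Eliminate the first sub-diagonal of row i using row i-1.
        w = c[i] / d[i - 1]
        d[i] -= w * a[i - 1]
        if i + 1 < n:
            a[i] -= w * b[i - 1]
        x[i] -= w * x[i - 1]
        # Eliminate the second sub-diagonal of row i+1 using row i-1.
        if i + 1 < n:
            w = e[i + 1] / d[i - 1]
            c[i + 1] -= w * a[i - 1]
            d[i + 1] -= w * b[i - 1]
            x[i + 1] -= w * x[i - 1]

    # Backward substitution over the remaining upper three bands.
    x[n - 1] /= d[n - 1]
    if n > 1:
        x[n - 2] = (x[n - 2] - a[n - 2] * x[n - 1]) / d[n - 2]
    for i in range(n - 3, -1, -1):
        x[i] = (x[i] - a[i] * x[i + 1] - b[i] * x[i + 2]) / d[i]
    return x
```

A reference like this is mainly useful for checking a GPU result on a single line of data: multiply the returned solution by the original five bands and compare against the RHS.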

Arguments

Type Intent Optional Attributes Name
real(kind=dp), intent(out), device, dimension(:, :, :) :: du
real(kind=dp), intent(in), device, dimension(:, :, :) :: u
real(kind=dp), intent(in), device, dimension(:, :, :) :: u_s
real(kind=dp), intent(in), device, dimension(:, :, :) :: u_e
integer, intent(in), value :: n_tds
integer, intent(in), value :: n_rhs
real(kind=dp), intent(in), device, dimension(:, :) :: coeffs_s
real(kind=dp), intent(in), device, dimension(:, :) :: coeffs_e
real(kind=dp), intent(in), device, dimension(:) :: coeffs
real(kind=dp), intent(in), device, dimension(:) :: ffr
real(kind=dp), intent(in), device, dimension(:) :: faf
real(kind=dp), intent(in), device, dimension(:) :: fsa
real(kind=dp), intent(in), device, dimension(:) :: fbw
real(kind=dp), intent(in), value :: beta_lhs
real(kind=dp), intent(in), value :: beta_lhs_s

beta_lhs_s: beta at j=1 (0 if sym=True, 2β if sym=False, β by default)


Called by

m_cuda_exec_dist::exec_dist_penta_compact -> m_cuda_kernels_dist::der_penta_full