trsm_batch

Solves groups of triangular linear systems with multiple right-hand sides.

Syntax

Group API

event trsm_batch(queue &exec_queue, side *left_right, uplo *upper_lower, transpose *trans, diag *unit_diag, std::int64_t *m, std::int64_t *n, T *alpha, T **a, std::int64_t *lda, std::int64_t *stridea, T **b, std::int64_t *ldb, std::int64_t *strideb, std::int64_t group_count, std::int64_t *groupsize, const vector_class<event> &dependencies = {})

Strided API

void trsm_batch(queue &exec_queue, side left_right, uplo upper_lower, transpose trans, diag unit_diag, std::int64_t m, std::int64_t n, T alpha, buffer<T, 1> &a, std::int64_t lda, std::int64_t stridea, buffer<T, 1> &b, std::int64_t ldb, std::int64_t strideb, std::int64_t batch_size)
event trsm_batch(queue &exec_queue, side left_right, uplo upper_lower, transpose trans, diag unit_diag, std::int64_t m, std::int64_t n, T alpha, T *a, std::int64_t lda, std::int64_t stridea, T *b, std::int64_t ldb, std::int64_t strideb, std::int64_t batch_size, const vector_class<event> &dependencies = {})

trsm_batch supports the following precisions and devices.

T                       Devices Supported
float                   Host, CPU, and GPU
double                  Host, CPU, and GPU
std::complex<float>     Host, CPU, and GPU
std::complex<double>    Host, CPU, and GPU

Description

The trsm_batch routines solve a series of equations of the form op(A) * X = alpha * B or X * op(A) = alpha * B. They are similar to the trsm routine counterparts, but the trsm_batch routines solve linear equations with groups of matrices. The groups contain matrices with the same parameters.

The operation for the strided API is defined as

for i = 0 … batch_size – 1
    A and B are matrices at offset i * stridea and i * strideb in a and b.
    if (left_right == mkl::side::L) then
        computes X such that op(A) * X = alpha * B
    else
        computes X such that X * op(A) = alpha * B
    end if
    B := X
end for
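
The strided loop above can be sketched in plain C++ as a reference implementation. This is not the oneMKL implementation: the name trsm_batch_strided_ref is hypothetical, and the sketch assumes column-major storage, side::left, a lower triangular A, no transposition, and a non-unit diagonal. It only illustrates how stridea and strideb select the i-th matrices and how B is overwritten by X.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not a oneMKL function): solves
// A * X = alpha * B for every matrix pair in a strided batch.
// Assumptions: column-major storage, A multiplies X on the left,
// A is lower triangular, not transposed, with a non-unit diagonal.
void trsm_batch_strided_ref(std::int64_t m, std::int64_t n, float alpha,
                            const std::vector<float> &a, std::int64_t lda,
                            std::int64_t stridea,
                            std::vector<float> &b, std::int64_t ldb,
                            std::int64_t strideb, std::int64_t batch_size) {
    assert(lda >= m && ldb >= m);
    for (std::int64_t i = 0; i < batch_size; ++i) {
        const float *A = a.data() + i * stridea;  // i-th A matrix
        float *B = b.data() + i * strideb;        // i-th B matrix
        for (std::int64_t col = 0; col < n; ++col) {
            // Forward substitution on one column of B.
            for (std::int64_t row = 0; row < m; ++row) {
                float x = alpha * B[row + col * ldb];
                for (std::int64_t k = 0; k < row; ++k)
                    x -= A[row + k * lda] * B[k + col * ldb];
                B[row + col * ldb] = x / A[row + row * lda];  // B := X
            }
        }
    }
}
```

Calling the helper with two 2 x 2 systems stored back to back (stridea = 4, strideb = 2) solves both in one pass, leaving each solution in the slot that held its right-hand side.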

The operation for the group API is defined as

idx = 0
for i = 0 … group_count – 1
    left_right, upper_lower, alpha, and group_size are read at position i in their respective arrays.
    for j = 0 … group_size – 1
        A and B are matrices at position idx in A_array and B_array
        if (left_right == mkl::side::L) then
            computes X such that op(A) * X = alpha * B
        else
            computes X such that X * op(A) = alpha * B
        end if
        B := X
        idx := idx + 1
    end for
end for
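
The flat index idx in the group loop can be illustrated with a small plain C++ sketch. The helper group_of_each_matrix is hypothetical and not part of oneMKL. It records, for each position in the pointer arrays, which group supplies the parameters, following the same nested loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a oneMKL function): entry idx of the result is
// the group i whose parameters apply to the matrices at position idx.
std::vector<std::int64_t> group_of_each_matrix(
        const std::vector<std::int64_t> &group_size) {
    std::vector<std::int64_t> owner;
    for (std::size_t i = 0; i < group_size.size(); ++i) {
        assert(group_size[i] >= 0);
        for (std::int64_t j = 0; j < group_size[i]; ++j) {
            // The next free slot in owner plays the role of idx.
            owner.push_back(static_cast<std::int64_t>(i));
        }
    }
    return owner;  // owner.size() is the total number of matrices
}
```

For group sizes {2, 0, 3} the result has five entries, {0, 0, 2, 2, 2}: an empty group contributes no matrices, and the pointer arrays need one entry per element of the result.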

where:

  • op(A) is one of op(A) = A, op(A) = A^T, or op(A) = A^H

  • alpha is a scalar

  • A is a triangular matrix

  • B and X are m x n general matrices

  • For the strided API, the a and b buffers (arrays, for the USM API) contain all the input matrices. The stride between matrices is given either by the exact size of the matrix or by the stride parameter. The total number of matrices in a and b is given by the batch_size parameter.

  • For the group API, the a and b arrays contain pointers to all the input matrices. The total number of matrices in a and b is total_batch_count, the sum of the group sizes over all groups.

A is either m x m or n x n, depending on whether it multiplies X on the left or right. On return, the matrix B is overwritten by the solution matrix X.
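
As a small worked instance of the left-sided case (the numbers are illustrative and not taken from the library documentation), take m = 2, n = 1, alpha = 1, op(A) = A, and a lower triangular A:

```latex
A = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}, \quad
B = \begin{pmatrix} 4 \\ 5 \end{pmatrix}, \quad
A X = \alpha B
\;\Longrightarrow\;
x_1 = \frac{4}{2} = 2, \quad
x_2 = \frac{5 - 1 \cdot x_1}{1} = 3, \quad
X = \begin{pmatrix} 2 \\ 3 \end{pmatrix}
```

Here A is m x m because it multiplies X on the left, and on return B would hold the two entries of X.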

Input Parameters

Strided API

left_right

Specifies whether the matrices A multiply X on the left (side::left) or on the right (side::right). See Data Types for more details.

upper_lower

Specifies whether the matrices A are upper or lower triangular. See Data Types for more details.

trans

Specifies op(A), the transposition operation applied to the matrices A. See Data Types for more details.

unit_diag

Specifies whether the matrices A are assumed to be unit triangular (all diagonal elements are 1.). See Data Types for more details.

m

Number of rows of the B matrices. Must be at least zero.

n

Number of columns of the B matrices. Must be at least zero.

alpha

Scaling factor for the solutions.

a

Buffer holding the input matrices A. Must have size at least stridea*batch_size.

lda

Leading dimension of the matrices A. Must be at least m if left_right = side::left, and at least n if left_right = side::right. Must be positive.

stridea

Stride between the different A matrices.

If left_right = side::left, the matrices A are m-by-m matrices, so stridea must be at least lda*m.

If left_right = side::right, the matrices A are n-by-n matrices, so stridea must be at least lda*n.

b

Buffer holding the input matrices B. Must have size at least strideb*batch_size.

ldb

Leading dimension of the matrices B. If matrices are stored column major, ldb must be at least m. If matrices are stored row major, ldb must be at least n. Must be positive.

strideb

Stride between the different B matrices. If matrices are stored column-major, strideb must be at least ldb*n. If matrices are stored row-major, strideb must be at least ldb*m.
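
The size constraints on lda, stridea, ldb, and strideb can be collected into a small checker. The sketch below is plain C++ and assumes column-major storage. The function names strided_args_ok and min_buffer_elements are hypothetical and not part of oneMKL.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical checker (not a oneMKL function) for the strided-API size
// constraints described above, assuming column-major storage.
bool strided_args_ok(bool left_side, std::int64_t m, std::int64_t n,
                     std::int64_t lda, std::int64_t stridea,
                     std::int64_t ldb, std::int64_t strideb) {
    if (m < 0 || n < 0) return false;
    const std::int64_t k = left_side ? m : n;  // A is k-by-k
    if (lda <= 0 || lda < k) return false;     // lda positive and >= k
    if (stridea < lda * k) return false;       // room for one A matrix
    if (ldb <= 0 || ldb < m) return false;     // column major: ldb >= m
    if (strideb < ldb * n) return false;       // column major: strideb >= ldb*n
    return true;
}

// Minimum number of elements a buffer must hold: stride * batch_size.
std::int64_t min_buffer_elements(std::int64_t stride, std::int64_t batch_size) {
    assert(stride >= 0 && batch_size >= 0);
    return stride * batch_size;
}
```

For example, with m = 3, n = 2, and side::left, the tightest packing is lda = 3, stridea = 9, ldb = 3, strideb = 6. A batch of four systems then needs at least 36 elements in a and 24 in b.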

batch_size

Specifies the number of triangular linear systems to solve.

Group API

left_right

Array of size group_count which specifies whether the matrices A in each group multiply each X in the same group on the left (side::left) or on the right (side::right). See Data Types for more details.

upper_lower

Array of size group_count which specifies whether the matrices A in each group are upper or lower triangular. See Data Types for more details.

trans

Array of size group_count which specifies op(A), the transposition operation applied to the matrices A in each group. See Data Types for more details.

unit_diag

Array of size group_count which specifies whether the matrices A in each group are assumed to be unit triangular (all diagonal elements are 1.). See Data Types for more details.

m

Array of size group_count specifying the number of rows of the B matrices in each group. Each must be at least zero.

n

Array of size group_count specifying the number of columns of the B matrices in each group. Each must be at least zero.

alpha

Array of size group_count containing scaling factors for the solutions in each group.

a

Array of size total_batch_count holding pointers to each A matrix. If left_right[i] = side::left, the matrices A are m[i]-by-m[i] matrices, so stridea[i] must be at least lda[i]*m[i]. If left_right[i] = side::right, the matrices A are n[i]-by-n[i] matrices, so stridea[i] must be at least lda[i]*n[i].

lda

Array of size group_count containing leading dimensions of the matrices A in each group. lda[i] must be at least m[i] if left_right[i] = side::left, and at least n[i] if left_right[i] = side::right. Each must be positive.

b

Array of size total_batch_count holding pointers to each B matrix. If matrices are stored column-major, strideb[i] must be at least ldb[i]*n[i]. If matrices are stored row-major, strideb[i] must be at least ldb[i]*m[i].

ldb

Array of size group_count containing leading dimensions of the matrices B in each group. If matrices are stored column major, each ldb[i] must be at least m[i]. If matrices are stored row major, each ldb[i] must be at least n[i]. Each ldb[i] must be positive.

group_count

Specifies the number of groups.

groupsize

Array of size group_count specifying the number of triangular linear systems to solve in each group.

Output Parameters

Strided API

b (Buffer API)

Output buffer, overwritten by batch_size solution matrices X.

b (USM API)

Output array, overwritten by batch_size solution matrices X.

Group API

b

Output array, containing pointers to arrays overwritten by the total_batch_count solution matrices X.

Notes

If alpha = 0, matrix B is set to zero, and the matrices A and B do not need to be initialized before calling trsm_batch.
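
The alpha = 0 case can be sketched for a single column-major m x n matrix B as follows. The helper zero_b_if_alpha_is_zero is hypothetical and not a oneMKL function. It also assumes that only the m x n part of B is written, which this page does not state explicitly.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical reference helper (not a oneMKL function): applies the
// alpha == 0 shortcut to one column-major m-by-n matrix B.
// Returns true when the shortcut applied; A is never needed in that case.
bool zero_b_if_alpha_is_zero(float alpha, std::int64_t m, std::int64_t n,
                             std::vector<float> &b, std::int64_t ldb) {
    assert(ldb >= m);
    if (alpha != 0.0f) return false;  // a real triangular solve is required
    float *B = b.data();
    for (std::int64_t col = 0; col < n; ++col)
        for (std::int64_t row = 0; row < m; ++row)
            B[row + col * ldb] = 0.0f;  // previous contents are ignored
    return true;
}
```

Because the previous contents of B are never read on this path, B may hold arbitrary values on entry.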