gemm

Computes a matrix-matrix product with general matrices.

Syntax

void gemm(queue &exec_queue, transpose transa, transpose transb, std::int64_t m, std::int64_t n, std::int64_t k, Ts alpha, buffer<Ta, 1> &a, std::int64_t lda, buffer<Tb, 1> &b, std::int64_t ldb, Ts beta, buffer<Tc, 1> &c, std::int64_t ldc)

gemm supports the following precisions and devices.

Ts                    Ta                    Tb                    Tc                    Devices Supported
half                  half                  half                  half                  Host, CPU, and GPU
float                 half                  half                  float                 Host, CPU, and GPU
float                 float                 float                 float                 Host, CPU, and GPU
double                double                double                double                Host, CPU, and GPU
std::complex<float>   std::complex<float>   std::complex<float>   std::complex<float>   Host, CPU, and GPU
std::complex<double>  std::complex<double>  std::complex<double>  std::complex<double>  Host, CPU, and GPU

Description

The gemm routines compute a scalar-matrix-matrix product and add the result to a scalar-matrix product, with general matrices. The operation is defined as

C <- alpha*op(A)*op(B) + beta*C

where:

op(X) is one of op(X) = X, op(X) = X^T (transpose), or op(X) = X^H (conjugate transpose),

alpha and beta are scalars,

A, B and C are matrices:

op(A) is an m-by-k matrix,

op(B) is a k-by-n matrix,

C is an m-by-n matrix.

Input Parameters

exec_queue

The queue where the routine should be executed.

transa

Specifies the form of op(A), the transposition operation applied to A. See Data Types for more details.

transb

Specifies the form of op(B), the transposition operation applied to B. See Data Types for more details.

m

Specifies the number of rows of the matrix op(A) and of the matrix C. The value of m must be at least zero.

n

Specifies the number of columns of the matrix op(B) and the number of columns of the matrix C. The value of n must be at least zero.

k

Specifies the number of columns of the matrix op(A) and the number of rows of the matrix op(B). The value of k must be at least zero.

alpha

Scaling factor for the matrix-matrix product.

a

The buffer holding the input matrix A. If A is not transposed, A is an m-by-k matrix so the array a must have size at least lda*k (respectively, lda*m) if column (respectively, row) major layout is used to store matrices. If A is transposed, A is a k-by-m matrix so the array a must have size at least lda*m (respectively, lda*k) if column (respectively, row) major layout is used to store matrices. See Matrix and Vector Storage for more details.

lda

The leading dimension of A. If matrices are stored using column major layout, lda must be at least m if A is not transposed, and at least k if A is transposed. If matrices are stored using row major layout, lda must be at least k if A is not transposed, and at least m if A is transposed.

b

The buffer holding the input matrix B. If B is not transposed, B is a k-by-n matrix so the array b must have size at least ldb*n (respectively, ldb*k) if column (respectively, row) major layout is used to store matrices. If B is transposed, B is an n-by-k matrix so the array b must have size at least ldb*k (respectively, ldb*n) if column (respectively, row) major layout is used to store matrices. See Matrix and Vector Storage for more details.

ldb

The leading dimension of B. If matrices are stored using column major layout, ldb must be at least k if B is not transposed, and at least n if B is transposed. If matrices are stored using row major layout, ldb must be at least n if B is not transposed, and at least k if B is transposed.

beta

Scaling factor for matrix C.

c

The buffer holding the input/output matrix C. It must have a size of at least ldc*n if column major layout is used to store matrices or at least ldc*m if row major layout is used to store matrices. See Matrix and Vector Storage for more details.

ldc

The leading dimension of C. It must be positive and at least m if column major layout is used to store matrices or at least n if row major layout is used to store matrices.

Output Parameters

c

The buffer holding the output matrix C, overwritten by alpha*op(A)*op(B) + beta*C.

Notes

If beta = 0, matrix C does not need to be initialized before calling gemm.

Example

For examples of gemm usage, see these code examples in the Intel® oneMKL installation directory:

examples/sycl/blas/gemm.cpp