gemmt (USM Version)
Computes a matrix-matrix product with general matrices, but updates only the upper or lower triangular part of the result matrix.
Syntax
event gemmt(queue &exec_queue, uplo upper_lower, transpose transa, transpose transb, std::int64_t n, std::int64_t k, T alpha, const T *a, std::int64_t lda, const T *b, std::int64_t ldb, T beta, T *c, std::int64_t ldc, const vector_class<event> &dependencies = {})
The USM version of gemmt supports the following precisions and devices.
T | Devices Supported |
---|---|
float | Host, CPU, and GPU |
double | Host, CPU, and GPU |
std::complex<float> | Host, CPU, and GPU |
std::complex<double> | Host, CPU, and GPU |
Description
The gemmt routines compute a scalar-matrix-matrix product of general matrices and add it to a scalar-matrix product, updating only the upper or lower triangular part of the result matrix. The operation is defined as:
C <- alpha*op(A)*op(B) + beta*C
where:
- op(X) is one of op(X) = X, or op(X) = X^T, or op(X) = X^H
- alpha and beta are scalars
- A, B, and C are matrices
Here, op(A) is n x k, op(B) is k x n, and C is n x n.
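To make the triangular update concrete, here is a minimal host-side sketch of the operation for the real, non-transposed case (op(A) = A, op(B) = B). It is not the library's implementation. The names `gemmt_reference` and `Triangle` are hypothetical, and column-major storage is assumed, consistent with the leading-dimension rules in the parameter descriptions. Elements outside the selected triangle are left untouched, and C is not read when beta is zero.

```cpp
#include <cstdint>
#include <vector>

// Illustrative stand-in for the library's uplo type (hypothetical name).
enum class Triangle { upper, lower };

// Hypothetical reference routine: C <- alpha*A*B + beta*C, restricted to one
// triangle of C (diagonal included). Assumes column-major storage and
// op(A) = A, op(B) = B, so A is n x k and B is k x n.
void gemmt_reference(Triangle tri, std::int64_t n, std::int64_t k,
                     double alpha, const double *a, std::int64_t lda,
                     const double *b, std::int64_t ldb,
                     double beta, double *c, std::int64_t ldc) {
    for (std::int64_t j = 0; j < n; ++j) {
        // Rows of column j that belong to the requested triangle.
        const std::int64_t i_begin = (tri == Triangle::upper) ? 0 : j;
        const std::int64_t i_end = (tri == Triangle::upper) ? j + 1 : n;
        for (std::int64_t i = i_begin; i < i_end; ++i) {
            double sum = 0.0;
            for (std::int64_t p = 0; p < k; ++p) {
                sum += a[i + p * lda] * b[p + j * ldb];
            }
            // With beta == 0 the old value of C is never read, so C may be
            // uninitialized on entry.
            c[i + j * ldc] = (beta == 0.0)
                                 ? alpha * sum
                                 : alpha * sum + beta * c[i + j * ldc];
        }
    }
}
```

For a 2 x 2 result with `Triangle::upper`, only C(0,0), C(0,1), and C(1,1) are written; C(1,0) keeps whatever value it had before the call.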
Input Parameters
- exec_queue
The queue where the routine should be executed.
- upper_lower
Specifies whether C’s data is stored in its upper or lower triangle. See Data Types for more details.
- transa
Specifies op(A), the transposition operation applied to A. See Data Types for more details.
- transb
Specifies op(B), the transposition operation applied to B. See Data Types for more details.
- n
Number of rows of op(A), columns of op(B), and columns of C. Must be at least zero.
- k
Number of columns of op(A) and rows of op(B). Must be at least zero.
- alpha
Scaling factor for the matrix-matrix product.
- a
Pointer to input matrix A.
If A is not transposed, A is an n-by-k matrix so the array a must have size at least lda*k.
If A is transposed, A is a k-by-n matrix so the array a must have size at least lda*n.
See Matrix Storage for more details.
- lda
Leading dimension of A. Must be at least n if A is not transposed, and at least k if A is transposed. Must be positive.
- b
Pointer to input matrix B.
If B is not transposed, B is a k-by-n matrix so the array b must have size at least ldb*n.
If B is transposed, B is an n-by-k matrix so the array b must have size at least ldb*k.
See Matrix Storage for more details.
- ldb
Leading dimension of B. Must be at least k if B is not transposed, and at least n if B is transposed. Must be positive.
- beta
Scaling factor for matrix C.
- c
Pointer to input/output matrix C. Must have size at least ldc*n. See Matrix Storage for more details.
- ldc
Leading dimension of C. Must be positive and at least n.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
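The size and leading-dimension rules above can be checked before calling gemmt. The sketch below is a hypothetical helper, not part of the library: `GemmtLayout`, `gemmt_min_leading_dims`, and `gemmt_min_sizes` are illustrative names. It assumes that "transposed" covers both the transpose and conjugate-transpose settings, and it reads "must be positive" as a lower bound of 1 on each leading dimension.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical container for three per-matrix values (A, B, C).
struct GemmtLayout {
    std::int64_t a;
    std::int64_t b;
    std::int64_t c;
};

// Smallest valid lda, ldb, and ldc for the given transposition flags.
// Assumption: a flag is true for transpose and for conjugate transpose.
GemmtLayout gemmt_min_leading_dims(bool a_transposed, bool b_transposed,
                                   std::int64_t n, std::int64_t k) {
    const std::int64_t one = 1;  // leading dimensions must be positive
    return {std::max(one, a_transposed ? k : n),
            std::max(one, b_transposed ? n : k),
            std::max(one, n)};
}

// Minimum element counts of the arrays a, b, and c for the chosen
// leading dimensions: leading dimension times the number of stored columns.
GemmtLayout gemmt_min_sizes(bool a_transposed, bool b_transposed,
                            std::int64_t n, std::int64_t k,
                            std::int64_t lda, std::int64_t ldb,
                            std::int64_t ldc) {
    return {lda * (a_transposed ? n : k),
            ldb * (b_transposed ? k : n),
            ldc * n};
}
```

For example, with n = 3, k = 5, A not transposed, and B transposed, the minimum leading dimensions are all 3; choosing lda = 4, ldb = 3, ldc = 3 then requires at least 20, 15, and 9 elements in a, b, and c.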
Output Parameters
- c
Pointer to the output matrix, overwritten by the upper or lower triangular part of alpha*op(A)*op(B) + beta*C.
Notes
If beta = 0, matrix C does not need to be initialized before calling gemmt.
Return Values
Output event to wait on to ensure computation is complete.