axpy_batch
Computes a group of vector-scalar products added to a vector.
Syntax
Group API
event axpy_batch(queue &exec_queue, int64_t *n_array, T *alpha_array, const T **x_array, int64_t *incx_array, T **y_array, int64_t *incy_array, int64_t group_count, int64_t *group_size_array, const vector_class<event> &dependencies = {})
Strided API
event axpy_batch(queue &exec_queue, int64_t n, T alpha, const T *x, int64_t incx, int64_t stridex, T *y, int64_t incy, int64_t stridey, int64_t batch_size, const vector_class<event> &dependencies = {})
void axpy_batch(queue &exec_queue, int64_t n, T alpha, buffer<T, 1> &x, int64_t incx, int64_t stridex, buffer<T, 1> &y, int64_t incy, int64_t stridey, int64_t batch_size)
axpy_batch supports the following precisions and devices.
T | Devices Supported |
---|---|
float | Host, CPU, and GPU |
double | Host, CPU, and GPU |
std::complex<float> | Host, CPU, and GPU |
std::complex<double> | Host, CPU, and GPU |
Description
The axpy_batch routines perform a series of scalar-vector products, each added to a vector. They are similar to their axpy routine counterparts, but the axpy_batch routines operate on groups of vectors.
For the group API, each group contains vectors with the same parameters (size and increment). The operation for the group API is defined as
idx = 0
for i = 0 … group_count – 1
    n, alpha, incx, incy and group_size at position i in n_array, alpha_array, incx_array, incy_array and group_size_array
    for j = 0 … group_size – 1
        x and y are vectors of size n at position idx in x_array and y_array
        y := alpha * x + y
        idx := idx + 1
    end for
end for
The number of entries in x_array and y_array is total_batch_count, the sum of all of the group_size entries.
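The group loop above can be sketched as a plain host-side C++ function. This is only an illustration of the semantics, not the library routine: the name axpy_batch_group_ref is hypothetical, no queue or events are involved, and the sketch assumes positive increments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference (not the library routine): walks the same
// group/vector loop as the pseudocode. Assumes all increments are positive.
// Returns the number of vectors processed, i.e. total_batch_count.
template <typename T>
std::int64_t axpy_batch_group_ref(const std::int64_t *n_array, const T *alpha_array,
                                  const T **x_array, const std::int64_t *incx_array,
                                  T **y_array, const std::int64_t *incy_array,
                                  std::int64_t group_count,
                                  const std::int64_t *group_size_array) {
    std::int64_t idx = 0;  // running position in x_array / y_array
    for (std::int64_t i = 0; i < group_count; ++i) {
        // Parameters shared by every vector pair in group i.
        const std::int64_t n = n_array[i];
        const T alpha = alpha_array[i];
        const std::int64_t incx = incx_array[i];
        const std::int64_t incy = incy_array[i];
        assert(incx > 0 && incy > 0);  // simplification made by this sketch
        for (std::int64_t j = 0; j < group_size_array[i]; ++j) {
            const T *x = x_array[idx];
            T *y = y_array[idx];
            // y := alpha * x + y, element by element.
            for (std::int64_t k = 0; k < n; ++k) {
                y[k * incy] += alpha * x[k * incx];
            }
            ++idx;
        }
    }
    return idx;
}
```

Returning idx makes it easy to check that the pointer arrays hold exactly total_batch_count entries.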
For the strided API, all vectors x (respectively, y) have the same parameters (size, increments) and are stored at a constant stride stridex (respectively, stridey) from each other. The operation for the strided API is defined as
for i = 0 … batch_size – 1
    X and Y are vectors at offset i * stridex and i * stridey in x and y
    Y = alpha * X + Y
end for
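The strided loop can be sketched the same way on host memory. The function name axpy_batch_strided_ref is hypothetical, and the sketch assumes positive increments and omits the queue, buffers, and events of the real routine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference (not the library routine) for the strided
// semantics. Assumes incx and incy are positive.
template <typename T>
void axpy_batch_strided_ref(std::int64_t n, T alpha,
                            const T *x, std::int64_t incx, std::int64_t stridex,
                            T *y, std::int64_t incy, std::int64_t stridey,
                            std::int64_t batch_size) {
    assert(incx > 0 && incy > 0);  // simplification made by this sketch
    for (std::int64_t i = 0; i < batch_size; ++i) {
        // X and Y start at offsets i * stridex and i * stridey.
        const T *X = x + i * stridex;
        T *Y = y + i * stridey;
        // Y = alpha * X + Y, element by element.
        for (std::int64_t k = 0; k < n; ++k) {
            Y[k * incy] += alpha * X[k * incx];
        }
    }
}
```

For example, with n = 2, incx = 1, and stridex = 3, each x vector occupies two elements followed by one unused padding element.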
Input Parameters
Group API
- exec_queue
The queue where the routine should be executed.
- n_array
Array of size group_count. For the group i, n_i = n_array[i] is the number of elements in vectors x and y.
- alpha_array
Array of size group_count. For the group i, alpha_i = alpha_array[i] is the scalar alpha.
- x_array
Array of size total_batch_count of pointers used to store x vectors. The array allocated for each x vector of the group i must be of size at least (1 + (n_i – 1)*abs(incx_i)). See Matrix and Vector Storage for more details.
- incx_array
Array of size group_count. For the group i, incx_i = incx_array[i] is the stride of vector x.
- y_array
Array of size total_batch_count of pointers used to store y vectors. The array allocated for each y vector of the group i must be of size at least (1 + (n_i – 1)*abs(incy_i)). See Matrix and Vector Storage for more details.
- incy_array
Array of size group_count. For the group i, incy_i = incy_array[i] is the stride of vector y.
- group_count
Number of groups. Must be at least 0.
- group_size_array
Array of size group_count. The element group_size_array[i] is the number of vectors in the group i. Each element in group_size_array must be at least 0.
- dependencies
List of events to wait for before starting computation, if any. If omitted, defaults to no dependencies.
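The sizing rules for the group API can be checked before a call with a few lines of host code. The helper names below are hypothetical and not part of the library; the length formula assumes n is at least 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (names are not from the library).

// Sum of all group sizes: the required number of entries in x_array and y_array.
inline std::int64_t total_batch_count(const std::int64_t *group_size_array,
                                      std::int64_t group_count) {
    std::int64_t total = 0;
    for (std::int64_t i = 0; i < group_count; ++i) {
        assert(group_size_array[i] >= 0);  // each group size must be at least 0
        total += group_size_array[i];
    }
    return total;
}

// Minimum allocation, in elements, for one vector with n >= 1 elements and
// increment inc: 1 + (n - 1) * abs(inc).
inline std::int64_t min_vector_length(std::int64_t n, std::int64_t inc) {
    assert(n >= 1);  // assumption of this sketch
    const std::int64_t abs_inc = inc < 0 ? -inc : inc;
    return 1 + (n - 1) * abs_inc;
}
```

For instance, a vector of 3 elements with increment 2 (or -2) needs at least 5 elements of storage.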
Strided API
- exec_queue
The queue where the routine should be executed.
- n
Number of elements in vectors x and y.
- alpha
Specifies the scalar alpha.
- x
Buffer or USM pointer accessible by the queue’s device holding all the input vectors x. The buffer or allocated memory must be of size at least batch_size*stridex.
- incx
Stride between two consecutive elements of the x vectors.
- stridex
Stride between two consecutive x vectors. Must be at least (1 + (n-1)*abs(incx)). See Matrix and Vector Storage for more details.
- y
Buffer or USM pointer accessible by the queue’s device holding all the input vectors y. The buffer or allocated memory must be of size at least batch_size*stridey.
- incy
Stride between two consecutive elements of the y vectors.
- stridey
Stride between two consecutive y vectors. Must be at least (1 + (n-1)*abs(incy)). See Matrix and Vector Storage for more details.
- batch_size
Number of axpy computations to perform, which is also the number of x vectors and of y vectors. Must be at least 0.
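The strided constraints on one operand (x with incx and stridex, or y with incy and stridey) can be bundled into a single pre-call check. This is a minimal sketch: strided_operand_ok and its allocated_elements parameter are hypothetical names, not library API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check (not a library function): returns true when the stride
// and the allocation size satisfy the constraints listed for one operand.
// allocated_elements is the number of T elements available in the buffer or
// USM allocation.
inline bool strided_operand_ok(std::int64_t n, std::int64_t inc, std::int64_t stride,
                               std::int64_t batch_size,
                               std::int64_t allocated_elements) {
    if (batch_size < 0) {
        return false;  // batch_size must be at least 0
    }
    const std::int64_t abs_inc = inc < 0 ? -inc : inc;
    // stride must be at least 1 + (n - 1) * abs(inc).
    const std::int64_t min_stride = 1 + (n - 1) * abs_inc;
    // The allocation must hold at least batch_size * stride elements.
    return stride >= min_stride && allocated_elements >= batch_size * stride;
}
```

Calling it once for x and once for y before submitting the batch catches undersized strides or allocations early.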
Output Parameters
Group API
- y_array
Array of pointers holding the total_batch_count updated vectors y.
Strided API
y |
Array or buffer holding the |