|
|
@@ -821,16 +821,16 @@ indicates that it is only available on Cell SPUs.
|
|
|
Is a function pointer to the CPU implementation of the codelet. Its prototype
|
|
|
must be: @code{void cpu_func(void *buffers[], void *cl_arg)}. The first
|
|
|
argument is the array of data managed by the data management library, and
|
|
|
-the second argument is a pointer to the argument passed from the @code{.cl_arg}
|
|
|
+the second argument is a pointer to the argument passed from the @code{cl_arg}
|
|
|
field of the @code{starpu_task} structure.
|
|
|
The @code{cpu_func} field is ignored if @code{STARPU_CPU} does not appear in
|
|
|
-the @code{.where} field, it must be non-null otherwise.
|
|
|
+the @code{where} field, it must be non-null otherwise.
|
|
|
|
|
|
@item @code{cuda_func} (optional):
|
|
|
Is a function pointer to the CUDA implementation of the codelet. @emph{This
|
|
|
must be a host-function written in the CUDA runtime API}. Its prototype must
|
|
|
be: @code{void cuda_func(void *buffers[], void *cl_arg);}. The @code{cuda_func}
|
|
|
-field is ignored if @code{STARPU_CUDA} does not appear in the @code{.where}
|
|
|
+field is ignored if @code{STARPU_CUDA} does not appear in the @code{where}
|
|
|
field, it must be non-null otherwise.
|
|
|
|
|
|
@item @code{opencl_func} (optional):
|
|
|
@@ -838,7 +838,7 @@ Is a function pointer to the OpenCL implementation of the codelet. Its
|
|
|
prototype must be:
|
|
|
@code{void opencl_func(starpu_data_interface_t *descr, void *arg);}.
|
|
|
This pointer is ignored if @code{STARPU_OPENCL} does not appear in the
|
|
|
-@code{.where} field, it must be non-null otherwise.
|
|
|
+@code{where} field, it must be non-null otherwise.
|
|
|
|
|
|
@item @code{gordon_func} (optional):
|
|
|
This is the index of the Cell SPU implementation within the Gordon library.
|
|
|
@@ -847,7 +847,7 @@ TODO
|
|
|
@item @code{nbuffers}:
|
|
|
Specifies the number of arguments taken by the codelet. These arguments are
|
|
|
managed by the DSM and are accessed from the @code{void *buffers[]}
|
|
|
-array. The constant argument passed with the @code{.cl_arg} field of the
|
|
|
+array. The constant argument passed with the @code{cl_arg} field of the
|
|
|
@code{starpu_task} structure is not counted in this number. This value should
|
|
|
not be above @code{STARPU_NMAXBUFS}.
|
|
|
|
|
|
@@ -884,15 +884,15 @@ TODO
|
|
|
@item @code{cl_arg} (optional) (default = NULL):
|
|
|
This pointer is passed to the codelet through the second argument
|
|
|
of the codelet implementation (e.g. @code{cpu_func} or @code{cuda_func}).
|
|
|
-In the specific case of the Cell processor, see the @code{.cl_arg_size}
|
|
|
+In the specific case of the Cell processor, see the @code{cl_arg_size}
|
|
|
argument.
|
|
|
|
|
|
@item @code{cl_arg_size} (optional, Cell specific):
|
|
|
-In the case of the Cell processor, the @code{.cl_arg} pointer is not directly
|
|
|
+In the case of the Cell processor, the @code{cl_arg} pointer is not directly
|
|
|
given to the SPU function. A buffer of size @code{cl_arg_size} is allocated on
|
|
|
the SPU. This buffer is then filled with the @code{cl_arg_size} bytes starting
|
|
|
at address @code{cl_arg}. In that case, the argument given to the SPU codelet
|
|
|
-is therefore not the @code{.cl_arg} pointer, but the address of the buffer in
|
|
|
+is therefore not the @code{cl_arg} pointer, but the address of the buffer in
|
|
|
local store (LS) instead. This field is ignored for CPU, CUDA and OpenCL
|
|
|
codelets.
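
The copy semantics described for @code{cl_arg_size} can be sketched in plain C. This is only an illustration, not StarPU or Gordon code: @code{copy_cl_arg_to_local_store}, @code{struct scal_params} and @code{demo_cl_arg_copy} are hypothetical names, and @code{malloc} stands in for the allocation in the SPU local store.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parameter block; not part of StarPU. */
struct scal_params {
    float factor;
    int repeat;
};

/* Hypothetical stand-in for the runtime's copy step: allocate
 * cl_arg_size bytes (playing the role of the local store buffer)
 * and fill them with the bytes starting at address cl_arg.
 * Returns NULL if the allocation fails. */
static void *copy_cl_arg_to_local_store(const void *cl_arg, size_t cl_arg_size)
{
    assert(cl_arg != NULL);
    void *ls_buffer = malloc(cl_arg_size);
    if (ls_buffer != NULL)
        memcpy(ls_buffer, cl_arg, cl_arg_size);
    return ls_buffer;
}

/* Returns 1 when the codelet-side pointer differs from the host
 * pointer but designates the same contents, 0 otherwise. */
static int demo_cl_arg_copy(void)
{
    struct scal_params host = { 2.5f, 3 };
    struct scal_params *ls = copy_cl_arg_to_local_store(&host, sizeof(host));
    if (ls == NULL)
        return 0;
    int ok = (ls != &host)
          && ls->factor == host.factor
          && ls->repeat == host.repeat;
    free(ls);
    return ok;
}
```

Because the codelet only sees the copy, anything reachable solely through pointers stored inside the @code{cl_arg} buffer would not be transferred by such a byte-wise copy.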
|
|
|
|
|
|
@@ -1003,7 +1003,7 @@ Note that this function is automatically called by @code{starpu_task_destroy}.
|
|
|
@item @emph{Description}:
|
|
|
Free the resource allocated during @code{starpu_task_create}. This function can be
|
|
|
called automatically after the execution of a task by setting the
|
|
|
-@code{.destroy} flag of the @code{starpu_task} structure (default behaviour).
|
|
|
+@code{destroy} flag of the @code{starpu_task} structure (default behaviour).
|
|
|
Calling this function on a statically allocated task results in undefined
|
|
|
behaviour.
|
|
|
|
|
|
@@ -1268,7 +1268,7 @@ When calling this method, the offloaded function specified by the first argument
|
|
|
executed by every StarPU worker that may execute the function.
|
|
|
The second argument is passed to the offloaded function.
|
|
|
The last argument specifies on which types of processing units the function
|
|
|
-should be executed. Similarly to the @code{.where} field of the
|
|
|
+should be executed. Similarly to the @code{where} field of the
|
|
|
@code{starpu_codelet} structure, it is possible to specify that the function
|
|
|
should be executed on every CUDA device and every CPU by passing
|
|
|
@code{STARPU_CPU|STARPU_CUDA}.
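
As a rough illustration of how such a bitmask can be combined and tested, consider the sketch below. The @code{DEMO_*} constants and @code{may_run_on} are hypothetical; the real @code{STARPU_CPU} and @code{STARPU_CUDA} values come from the StarPU headers and may differ.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. The real
 * STARPU_CPU / STARPU_CUDA constants are defined by StarPU. */
#define DEMO_CPU  ((uint32_t)1 << 1)
#define DEMO_CUDA ((uint32_t)1 << 3)

/* Returns non-zero when the given kind of processing unit
 * appears in the "where" bitmask. */
static int may_run_on(uint32_t where, uint32_t worker_kind)
{
    return (where & worker_kind) != 0;
}
```

With these definitions, @code{may_run_on(DEMO_CPU|DEMO_CUDA, DEMO_CUDA)} is non-zero, whereas @code{may_run_on(DEMO_CPU, DEMO_CUDA)} is zero.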
|
|
|
@@ -1348,19 +1348,19 @@ A codelet is a structure that represents a computational kernel. Such a codelet
|
|
|
may contain an implementation of the same kernel on different architectures
|
|
|
(e.g. CUDA, Cell's SPU, x86, ...).
|
|
|
|
|
|
-The ''@code{.nbuffers}'' field specifies the number of data buffers that are
|
|
|
+The @code{nbuffers} field specifies the number of data buffers that are
|
|
|
manipulated by the codelet: here the codelet does not access or modify any data
|
|
|
that is controlled by our data management library. Note that the argument
|
|
|
-passed to the codelet (the ''@code{.cl_arg}'' field of the @code{starpu_task}
|
|
|
+passed to the codelet (the @code{cl_arg} field of the @code{starpu_task}
|
|
|
structure) does not count as a buffer since it is not managed by our data
|
|
|
management library.
|
|
|
|
|
|
@c TODO need a crossref to the proper description of "where" see bla for more ...
|
|
|
-We create a codelet which may only be executed on the CPUs. The ''@code{.where}''
|
|
|
+We create a codelet which may only be executed on the CPUs. The @code{where}
|
|
|
field is a bitmask that defines where the codelet may be executed. Here, the
|
|
|
@code{STARPU_CPU} value means that only CPUs can execute this codelet
|
|
|
(@pxref{Codelets and Tasks} for more details on that field).
|
|
|
-When a CPU core executes a codelet, it calls the @code{.cpu_func} function,
|
|
|
+When a CPU core executes a codelet, it calls the @code{cpu_func} function,
|
|
|
which @emph{must} have the following prototype:
|
|
|
|
|
|
@code{void (*cpu_func)(void *buffers[], void *cl_arg)}
|
|
|
@@ -1368,7 +1368,7 @@ which @emph{must} have the following prototype:
|
|
|
In this example, we can ignore the first argument of this function which gives a
|
|
|
description of the input and output buffers (e.g. the size and the location of
|
|
|
the matrices). The second argument is a pointer to a buffer passed as an
|
|
|
-argument to the codelet by the means of the ''@code{.cl_arg}'' field of the
|
|
|
+argument to the codelet by the means of the @code{cl_arg} field of the
|
|
|
@code{starpu_task} structure.
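
A function matching this prototype might look like the following sketch. It does not call StarPU: @code{struct demo_params}, @code{describe_params} and @code{demo_cpu_func} are hypothetical names chosen for illustration, and the parameter layout is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parameter block that a task's cl_arg field
 * would point to. */
struct demo_params {
    int i;
    float f;
};

/* Formats the parameters into out and returns the number of
 * characters that snprintf wanted to write. */
static int describe_params(const struct demo_params *p, char *out, size_t len)
{
    return snprintf(out, len, "params = {%d, %.1f}", p->i, p->f);
}

/* Same shape as the required prototype. The buffers argument is
 * ignored here because this codelet accesses no managed data. */
static void demo_cpu_func(void *buffers[], void *cl_arg)
{
    (void)buffers;
    const struct demo_params *params = cl_arg;
    char line[64];
    describe_params(params, line, sizeof(line));
    puts(line);
}
```

The cast-free assignment from @code{void *} relies on C semantics; in C++ an explicit cast would be needed.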
|
|
|
|
|
|
@c TODO rewrite so that it is a little clearer ?
|
|
|
@@ -1428,28 +1428,28 @@ corresponding structure with the default settings (@pxref{starpu_task_create}),
|
|
|
but it does not submit the task to StarPU.
|
|
|
|
|
|
@c not really clear ;)
|
|
|
-The ''@code{.cl}'' field is a pointer to the codelet which the task will
|
|
|
+The @code{cl} field is a pointer to the codelet which the task will
|
|
|
execute: in other words, the codelet structure describes which computational
|
|
|
kernel should be offloaded on the different architectures, and the task
|
|
|
structure is a wrapper containing a codelet and the piece of data on which the
|
|
|
codelet should operate.
|
|
|
|
|
|
-The optional ''@code{.cl_arg}'' field is a pointer to a buffer (of size
|
|
|
-@code{.cl_arg_size}) with some parameters for the kernel
|
|
|
+The optional @code{cl_arg} field is a pointer to a buffer (of size
|
|
|
+@code{cl_arg_size}) with some parameters for the kernel
|
|
|
described by the codelet. For instance, if a codelet implements a computational
|
|
|
kernel that multiplies its input vector by a constant, the constant could be
|
|
|
specified by the means of this buffer.
|
|
|
|
|
|
Once a task has been executed, an optional callback function can be called.
|
|
|
While the computational kernel could be offloaded on various architectures, the
|
|
|
-callback function is always executed on a CPU. The ''@code{.callback_arg}''
|
|
|
+callback function is always executed on a CPU. The @code{callback_arg}
|
|
|
pointer is passed as an argument of the callback. The prototype of a callback
|
|
|
function must be:
|
|
|
@example
|
|
|
void (*callback_function)(void *);
|
|
|
@end example
|
|
|
|
|
|
-If the @code{.synchronous} field is non-null, task submission will be
|
|
|
+If the @code{synchronous} field is non-null, task submission will be
|
|
|
synchronous: the @code{starpu_task_submit} function will not return until the
|
|
|
task has been executed. Note that the @code{starpu_shutdown} method does not
|
|
|
guarantee that asynchronous tasks have been executed before it returns.
|
|
|
@@ -1508,7 +1508,7 @@ Since the factor is constant, it does not need a preliminary declaration, and
|
|
|
can just be passed through the @code{cl_arg} pointer like in the previous
|
|
|
example. The vector parameter is described by its handle.
|
|
|
There are two fields in each element of the @code{buffers} array.
|
|
|
-@code{.handle} is the handle of the data, and @code{.mode} specifies how the
|
|
|
+@code{handle} is the handle of the data, and @code{mode} specifies how the
|
|
|
kernel will access the data (@code{STARPU_R} for read-only, @code{STARPU_W} for
|
|
|
write-only and @code{STARPU_RW} for read and write access).
|
|
|
|
|
|
@@ -1543,7 +1543,7 @@ The second argument of the @code{scal_func} function contains a pointer to the
|
|
|
parameters of the codelet (given in @code{task->cl_arg}), so that we read the
|
|
|
constant factor from this pointer. The first argument is an array that gives
|
|
|
a description of every buffer passed in the @code{task->buffers}@ array. The
|
|
|
-size of this array is given by the @code{.nbuffers} field of the codelet
|
|
|
+size of this array is given by the @code{nbuffers} field of the codelet
|
|
|
structure. For the sake of generality, this array contains pointers to the
|
|
|
different interfaces describing each buffer. In the case of the @b{vector
|
|
|
interface}, the location of the vector (resp. its length) is accessible in the
|