@@ -17,13 +17,13 @@ may need to install a specific <c>-dev</c> package of your distro, such
as <c>gcc-4.6-plugin-dev</c> on Debian and derivatives. In addition,
the plug-in's test suite is only run when <a href="http://www.gnu.org/software/guile/">GNU Guile</a> is found at
<c>configure</c>-time. Building the GCC plug-in
-can be disabled by configuring with <c>--disable-gcc-extensions</c>.
+can be disabled by configuring with \ref disable-gcc-extensions.

Those extensions include syntactic sugar for defining
tasks and their implementations, invoking a task, and manipulating data
buffers. Use of these extensions can be made conditional on the
availability of the plug-in, leading to valid C sequential code when the
-plug-in is not used (\ref Conditional_Extensions).
+plug-in is not used (\ref UsingCExtensionsConditionally).

When StarPU has been installed with its GCC plug-in, programs that use
these extensions can be compiled this way:
@@ -64,10 +64,10 @@ choose any of its implementations.
</ul>

Tasks and their implementations must be <em>declared</em>. These
-declarations are annotated with attributes (@pxref{Attribute
-Syntax, attributes in GNU C,, gcc, Using the GNU Compiler Collection
-(GCC)}): the declaration of a task is a regular C function declaration
-with an additional <c>task</c> attribute, and task implementations are
+declarations are annotated with attributes
+(http://gcc.gnu.org/onlinedocs/gcc/Attribute-Syntax.html#Attribute-Syntax):
+the declaration of a task is a regular C function declaration with an
+additional <c>task</c> attribute, and task implementations are
declared with a <c>task_implementation</c> attribute.

The following function attributes are provided:
@@ -84,17 +84,17 @@ actual definition of a task's body is automatically generated by the
compiler.

Under the hood, declaring a task leads to the declaration of the
-corresponding <c>codelet</c> (\ref Codelet_and_Tasks). If one or
+corresponding <c>codelet</c> (\ref CodeletAndTasks). If one or
more task implementations are declared in the same compilation unit,
then the codelet and the function itself are also defined; they inherit
the scope of the task.

Scalar arguments to the task are passed by value and copied to the
-target device if need be---technically, they are passed as the
-<c>cl_arg</c> buffer (\ref Codelets_and_Tasks).
+target device if need be---technically, they are passed as the buffer
+starpu_task::cl_arg (\ref CodeletAndTasks).

Pointer arguments are assumed to be registered data buffers---the
-buffer argument of a task); <c>const</c>-qualified
+handles argument of a task (starpu_task::handles) ; <c>const</c>-qualified
pointer arguments are viewed as read-only buffers (::STARPU_R), and
non-<c>const</c>-qualified buffers are assumed to be used read-write
(::STARPU_RW). In addition, the <c>output</c> type attribute can be
@@ -181,7 +181,7 @@ matmul (const float *A, const float *B, __output float *C,

Use of implicit CPU task implementations as above has the advantage that
the code is valid sequential code when StarPU's GCC plug-in is not used
-(\ref Conditional_Extensions).
+(\ref UsingCExtensionsConditionally).

CUDA and OpenCL implementations can be declared in a similar way:

@@ -196,8 +196,8 @@ static void matmul_opencl (const float *A, const float *B, float *C,
\endcode

The CUDA and OpenCL implementations typically either invoke a kernel
-written in CUDA or OpenCL (for similar code, \ref CUDA_Kernel, and
-\ref OpenCL_Kernel), or call a library function that uses CUDA or
+written in CUDA or OpenCL (for similar code, \ref CUDAKernel, and
+\ref OpenCLKernel), or call a library function that uses CUDA or
OpenCL under the hood, such as CUBLAS functions:

\code{.c}
@@ -227,7 +227,7 @@ implementation may run in parallel with the continuation of the caller.
The next section describes how memory buffers must be handled in
StarPU-GCC code. For a complete example, see the
<c>gcc-plugin/examples</c> directory of the source distribution, and
-\ref Vector_Scaling_Using_the_C_Extension.
+\ref VectorScalingUsingTheCExtension.


\section InitializationTerminationAndSynchronization Initialization, Termination, and Synchronization
@@ -247,13 +247,13 @@ it provides greater control to user code over its resource usage.
<dt><c>\#pragma starpu shutdown</c></dt>
<dd>
Shut down StarPU, giving it an opportunity to write profiling info to a
-file on disk, for instance (\ref Off-line_performance_feedback).
+file on disk, for instance (\ref Off-linePerformanceFeedback).
</dd>

<dt><c>\#pragma starpu wait</c></dt>
<dd>
Wait for all task invocations to complete, as with
-starpu_wait_for_all().
+starpu_task_wait_for_all().
</dd>
</dl>

@@ -262,7 +262,7 @@ starpu_wait_for_all().
Data buffers such as matrices and vectors that are to be passed to tasks
must be registered. Registration allows StarPU to handle data
transfers among devices---e.g., transferring an input buffer from the
-CPU's main memory to a task scheduled to run a GPU (\ref StarPU_Data_Management_Library).
+CPU's main memory to a task scheduled to run a GPU (\ref StarPUDataManagementLibrary).

The following pragmas are provided:

@@ -332,14 +332,10 @@ these extensions when they are available---leading to hybrid CPU/GPU
code---and discard them when they are not available---leading to valid
sequential code.

-To that end, the GCC plug-in defines a C preprocessor macro when it is
-being used:
-
-@defmac STARPU_GCC_PLUGIN
-Defined for code being compiled with the StarPU GCC plug-in. When
-defined, this macro expands to an integer denoting the version of the
-supported C extensions.
-@end defmac
+To that end, the GCC plug-in defines the C preprocessor macro ---
+<c>STARPU_GCC_PLUGIN</c> --- when it is being used. When defined, this
+macro expands to an integer denoting the version of the supported C
+extensions.

The code below illustrates how to define a task and its implementations
in a way that allows it to be compiled without the GCC plug-in: