|
|
@@ -31,11 +31,13 @@ This manual documents the usage of StarPU
|
|
|
@comment better formatting.
|
|
|
@comment
|
|
|
@menu
|
|
|
-* Introduction:: A basic introduction to using StarPU.
|
|
|
-* Installing StarPU:: How to configure, build and install StarPU
|
|
|
-* StarPU API:: The API to use StarPU
|
|
|
-* Basic Examples:: Basic examples of the use of StarPU
|
|
|
-* Advanced Topics:: Advanced use of StarPU
|
|
|
+* Introduction:: A basic introduction to using StarPU.
|
|
|
+* Installing StarPU:: How to configure, build and install StarPU.
|
|
|
+* Configuration options::      Configuration options.
|
|
|
+* Environment variables:: Environment variables used by StarPU.
|
|
|
+* StarPU API:: The API to use StarPU.
|
|
|
+* Basic Examples:: Basic examples of the use of StarPU.
|
|
|
+* Advanced Topics:: Advanced use of StarPU.
|
|
|
@end menu
|
|
|
|
|
|
@c ---------------------------------------------------------------------
|
|
|
@@ -114,6 +116,14 @@ by expressing dependencies between tags.
|
|
|
@c DSM
|
|
|
@subsection StarPU Data Management Library
|
|
|
|
|
|
+Because StarPU schedules tasks at runtime, data transfers have to be
|
|
|
+done automatically and ``just-in-time'' between processing units,
|
|
|
+relieving the application programmer from explicit data transfers.
|
|
|
+Moreover, to avoid unnecessary transfers, StarPU keeps data
|
|
|
+where it was last needed, even if it was modified there, and it
|
|
|
+allows multiple copies of the same data to reside at the same time on
|
|
|
+several processing units as long as it is not modified.
|
|
|
+
|
|
|
@c ---------------------------------------------------------------------
|
|
|
@c Installing StarPU
|
|
|
@c ---------------------------------------------------------------------
|
|
|
@@ -134,7 +144,7 @@ are using the source code from the svn repository, you first need to generate
|
|
|
the configure scripts and the Makefiles.
|
|
|
|
|
|
@example
|
|
|
-$ autoreconf -i
|
|
|
+$ autoreconf -vfi
|
|
|
@end example
|
|
|
|
|
|
@subsection Configuring StarPU
|
|
|
@@ -143,7 +153,7 @@ $ autoreconf -i
|
|
|
$ ./configure
|
|
|
@end example
|
|
|
|
|
|
-@c TODO enumerate the list of interesting options
|
|
|
+@c TODO enumerate the list of interesting options: refer to a specific section
|
|
|
|
|
|
@section Building and Installing StarPU
|
|
|
|
|
|
@@ -168,7 +178,7 @@ In order to install StarPU at the location that was specified during
|
|
|
configuration:
|
|
|
|
|
|
@example
|
|
|
-# make install
|
|
|
+$ make install
|
|
|
@end example
|
|
|
|
|
|
@subsection pkg-config configuration
|
|
|
@@ -196,6 +206,139 @@ $ pkg-config --libs libstarpu # options for the linker
|
|
|
@end example
|
|
|
|
|
|
@c ---------------------------------------------------------------------
|
|
|
+@c Configuration options
|
|
|
+@c ---------------------------------------------------------------------
|
|
|
+
|
|
|
+@node Configuration options
|
|
|
+@chapter Configuration options
|
|
|
+
|
|
|
+TODO
|
|
|
+
|
|
|
+@c ---------------------------------------------------------------------
|
|
|
+@c Environment variables
|
|
|
+@c ---------------------------------------------------------------------
|
|
|
+
|
|
|
+@node Environment variables
|
|
|
+@chapter Environment variables
|
|
|
+
|
|
|
+@menu
|
|
|
+* Workers:: Configuring workers
|
|
|
+* Scheduling:: Configuring the Scheduling engine
|
|
|
+* Misc:: Miscellaneous and debug
|
|
|
+@end menu
|
|
|
+
|
|
|
+TODO, explicit configuration (passed to starpu_init) overrides env variables.
|
|
|
+
|
|
|
+@node Workers
|
|
|
+@section Configuring workers
|
|
|
+
|
|
|
+@menu
|
|
|
+* NCPUS :: Number of CPU workers
|
|
|
+* NCUDA :: Number of CUDA workers
|
|
|
+* NGORDON :: Number of SPU workers (Cell)
|
|
|
+* WORKERS_CPUID :: Bind workers to specific CPUs
|
|
|
+* WORKERS_GPUID :: Select specific CUDA devices
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node NCPUS
|
|
|
+@subsection @code{NCPUS} -- Number of CPU workers
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node NCUDA
|
|
|
+@subsection @code{NCUDA} -- Number of CUDA workers
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node NGORDON
|
|
|
+@subsection @code{NGORDON} -- Number of SPU workers (Cell)
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+
|
|
|
+@node WORKERS_CPUID
|
|
|
+@subsection @code{WORKERS_CPUID} -- Bind workers to specific CPUs
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node WORKERS_GPUID
|
|
|
+@subsection @code{WORKERS_GPUID} -- Select specific CUDA devices
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node Scheduling
|
|
|
+@section Configuring the Scheduling engine
|
|
|
+
|
|
|
+@menu
|
|
|
+* SCHED :: Scheduling policy
|
|
|
+* CALIBRATE :: Calibrate performance models
|
|
|
+* PREFETCH :: Use data prefetch
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node SCHED
|
|
|
+@subsection @code{SCHED} -- Scheduling policy
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node CALIBRATE
|
|
|
+@subsection @code{CALIBRATE} -- Calibrate performance models
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node PREFETCH
|
|
|
+@subsection @code{PREFETCH} -- Use data prefetch
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node Misc
|
|
|
+@section Miscellaneous and debug
|
|
|
+
|
|
|
+@menu
|
|
|
+* LOGFILENAME :: Select debug file name
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node LOGFILENAME
|
|
|
+@subsection @code{LOGFILENAME} -- Select debug file name
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@c ---------------------------------------------------------------------
|
|
|
@c StarPU API
|
|
|
@c ---------------------------------------------------------------------
|
|
|
|
|
|
@@ -204,6 +347,7 @@ $ pkg-config --libs libstarpu # options for the linker
|
|
|
|
|
|
@menu
|
|
|
* Initialization and Termination:: Initialization and Termination methods
|
|
|
+* Workers' Properties:: Methods to enumerate workers' properties
|
|
|
* Data Library:: Methods to manipulate data
|
|
|
* Codelets and Tasks:: Methods to construct tasks
|
|
|
* Tags:: Task dependencies
|
|
|
@@ -227,9 +371,12 @@ This is StarPU initialization method, which must be called prior to any other
|
|
|
StarPU call. It is possible to specify StarPU's configuration (eg. scheduling
|
|
|
policy, number of cores, ...) by passing a non-null argument. Default
|
|
|
configuration is used if the passed argument is @code{NULL}.
|
|
|
+@item @emph{Return value}:
|
|
|
+Upon successful completion, this function returns 0. Otherwise, @code{-ENODEV}
|
|
|
+indicates that no worker was available (so that StarPU was not initialized).
|
|
|
|
|
|
@item @emph{Prototype}:
|
|
|
-@code{void starpu_init(struct starpu_conf *conf);}
|
|
|
+@code{int starpu_init(struct starpu_conf *conf);}
|
|
|
|
|
|
@end table
|
|
|
|
|
|
@@ -238,9 +385,35 @@ configuration is used if the passed argument is @code{NULL}.
|
|
|
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
-@item @emph{Definition}:
|
|
|
-TODO
|
|
|
+This structure is passed to the @code{starpu_init} function in order to configure
+StarPU. When the default value is used, StarPU automatically selects the number
+of processing units and takes the default scheduling policy. These parameters
+override the equivalent environment variables.
|
|
|
+
|
|
|
+@item @emph{Fields}:
|
|
|
+@table @asis
|
|
|
+@item @code{sched_policy} (default = NULL):
|
|
|
+This is the name of the scheduling policy. This can also be specified with the
|
|
|
+@code{SCHED} environment variable.
|
|
|
+
|
|
|
+@item @code{ncpus} (default = -1):
|
|
|
+This is the maximum number of CPU cores that StarPU can use. This can also be
|
|
|
+specified with the @code{NCPUS} environment variable.
|
|
|
+
|
|
|
+@item @code{ncuda} (default = -1):
|
|
|
+This is the maximum number of CUDA devices that StarPU can use. This can also be
|
|
|
+specified with the @code{NCUDA} environment variable.
|
|
|
+
|
|
|
+@item @code{nspus} (default = -1):
|
|
|
+This is the maximum number of Cell SPUs that StarPU can use. This can also be
|
|
|
+specified with the @code{NGORDON} environment variable.
|
|
|
+
|
|
|
+@item @code{calibrate} (default = 0):
|
|
|
+If this flag is set, StarPU will calibrate the performance models when
|
|
|
+executing tasks. This can also be specified with the @code{CALIBRATE}
|
|
|
+environment variable.
|
|
|
+@end table
|
|
|
+
|
|
|
@end table
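
The precedence described above (an explicit @code{starpu_conf} field takes priority over the matching environment variable, and -1 means the field was left unset) can be sketched as follows. This is a minimal illustration, not StarPU's actual implementation: @code{resolve_ncpus} and its parameters are hypothetical names, and the way the @code{NCPUS} string is parsed is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not a StarPU function): choose the effective number
 * of CPU workers. An explicit setting wins, then the value of the NCPUS
 * environment variable (passed in as a string), then the detected count.
 * A conf_ncpus of -1 means "not set explicitly". */
static int resolve_ncpus(int conf_ncpus, const char *env_value, int detected)
{
    assert(detected >= 0);

    if (conf_ncpus != -1)
        return conf_ncpus;

    if (env_value != NULL && *env_value != '\0') {
        char *end = NULL;
        long value = strtol(env_value, &end, 10);
        /* Accept only a clean, non-negative integer that fits in an int. */
        if (*end == '\0' && value >= 0 && value <= INT_MAX)
            return (int)value;
    }

    return detected;
}
```

A caller would typically pass @code{getenv("NCPUS")} as the second argument; taking the string as a parameter keeps the helper free of global state.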
|
|
|
|
|
|
|
|
|
@@ -259,6 +432,78 @@ garanteed to be available until this method has been called.
|
|
|
|
|
|
@end table
|
|
|
|
|
|
+@node Workers' Properties
|
|
|
+@section Workers' Properties
|
|
|
+
|
|
|
+@menu
|
|
|
+* starpu_get_worker_count:: Get the number of processing units
|
|
|
+* starpu_get_worker_id:: Get the identifier of the current worker
|
|
|
+* starpu_get_worker_type:: Get the type of processing unit associated to a worker
|
|
|
+* starpu_get_worker_name:: Get the name of a worker
|
|
|
+@end menu
|
|
|
+
|
|
|
+@node starpu_get_worker_count
|
|
|
+@subsection @code{starpu_get_worker_count} -- Get the number of processing units
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+This function returns the number of workers (ie. processing units executing
|
|
|
+StarPU tasks). The returned value should be at most @code{STARPU_NMAXWORKERS}.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{unsigned starpu_get_worker_count(void);}
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+
|
|
|
+@node starpu_get_worker_id
|
|
|
+@subsection @code{starpu_get_worker_id} -- Get the identifier of the current worker
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+This function returns the identifier of the worker associated to the calling
|
|
|
+thread. The returned value is either -1 if the current context is not a StarPU
|
|
|
+worker (ie. when called from the application outside a task or a callback), or
|
|
|
+an integer between 0 and @code{starpu_get_worker_count() - 1}.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{int starpu_get_worker_id(void);}
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node starpu_get_worker_type
|
|
|
+@subsection @code{starpu_get_worker_type} -- Get the type of processing unit associated to a worker
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+This function returns the type of worker associated to an identifier (as
|
|
|
+returned by the @code{starpu_get_worker_id} function). The returned value
|
|
|
+indicates the architecture of the worker: @code{STARPU_CORE_WORKER} for a CPU
|
|
|
+core, @code{STARPU_CUDA_WORKER} for a CUDA device, and
|
|
|
+@code{STARPU_GORDON_WORKER} for a Cell SPU. The value returned for an invalid
|
|
|
+identifier is unspecified.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{enum starpu_archtype starpu_get_worker_type(int id);}
|
|
|
+
|
|
|
+@end table
|
|
|
+
|
|
|
+@node starpu_get_worker_name
|
|
|
+@subsection @code{starpu_get_worker_name} -- Get the name of a worker
|
|
|
+@table @asis
|
|
|
+
|
|
|
+@item @emph{Description}:
|
|
|
+StarPU associates a unique human readable string to each processing unit. This
|
|
|
+function copies at most the @code{maxlen} first bytes of the unique string
|
|
|
+associated to a worker identified by its identifier @code{id} into the
|
|
|
+@code{dst} buffer. The caller is responsible for ensuring that the @code{dst}
|
|
|
+is a valid pointer to a buffer of @code{maxlen} bytes at least. Calling this
|
|
|
+function on an invalid identifier results in unspecified behaviour.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{void starpu_get_worker_name(int id, char *dst, size_t maxlen);}
|
|
|
+
|
|
|
+@end table
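
The description above only promises that at most @code{maxlen} bytes are copied; it does not say whether the result is always NUL-terminated. A cautious caller can therefore reserve one byte and terminate the buffer itself. The sketch below uses @code{toy_get_worker_name}, a stand-in written for this example that behaves like @code{strncpy}; both helper names and the fixed worker name are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for starpu_get_worker_name, for illustration only: copies at
 * most maxlen bytes of a fixed name, like strncpy, without guaranteeing
 * a terminating NUL when the name is truncated. */
static void toy_get_worker_name(int id, char *dst, size_t maxlen)
{
    (void)id;
    strncpy(dst, "toy-worker-name", maxlen);
}

/* Caller-side wrapper: always leaves a NUL-terminated string in dst,
 * whatever the callee does, by keeping the last byte for the NUL. */
static void worker_name_terminated(int id, char *dst, size_t size)
{
    assert(dst != NULL && size > 0);
    toy_get_worker_name(id, dst, size - 1);
    dst[size - 1] = '\0';
}
```

With the real function, the same pattern would pass @code{size - 1} as @code{maxlen} and write the final NUL afterwards.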
|
|
|
|
|
|
@node Data Library
|
|
|
@section Data Library
|
|
|
@@ -275,22 +520,238 @@ garanteed to be available until this method has been called.
|
|
|
@section Codelets and Tasks
|
|
|
|
|
|
@menu
|
|
|
+* struct starpu_codelet:: StarPU codelet structure
|
|
|
+* struct starpu_task:: StarPU task structure
|
|
|
+* starpu_task_init:: Initialize a Task
|
|
|
* starpu_task_create:: Allocate and Initialize a Task
|
|
|
+* starpu_task_destroy:: Destroy a dynamically allocated Task
|
|
|
+* starpu_submit_task:: Submit a Task
|
|
|
+* starpu_wait_task:: Wait for the termination of a Task
|
|
|
+* starpu_wait_all_tasks:: Wait for the termination of all Tasks
|
|
|
@end menu
|
|
|
|
|
|
|
|
|
@c struct starpu_task
|
|
|
@c struct starpu_codelet
|
|
|
|
|
|
+@node struct starpu_codelet
|
|
|
+@subsection @code{struct starpu_codelet} -- StarPU codelet structure
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+The codelet structure describes a kernel that is possibly implemented on
|
|
|
+various targets.
|
|
|
+@item @emph{Fields}:
|
|
|
+@table @asis
|
|
|
+@item @code{where}:
|
|
|
+Indicates which types of processing units are able to execute that codelet.
|
|
|
+@code{CORE|CUDA} for instance indicates that the codelet is implemented for
|
|
|
+both CPU cores and CUDA devices while @code{GORDON} indicates that it is only
|
|
|
+available on Cell SPUs.
|
|
|
+
|
|
|
+@item @code{core_func} (optional):
|
|
|
+Is a function pointer to the CPU implementation of the codelet. Its prototype
|
|
|
+must be: @code{void core_func(starpu_data_interface_t *descr, void *arg)}. The
|
|
|
+first argument is the array of data managed by the data management library,
|
|
|
+and the second argument is a pointer to the argument (possibly a copy of it)
|
|
|
+passed from the @code{.cl_arg} field of the @code{starpu_task} structure. This
|
|
|
+pointer is ignored if @code{CORE} does not appear in the @code{.where} field,
|
|
|
+it must be non-null otherwise.
|
|
|
+
|
|
|
+@item @code{cuda_func} (optional):
|
|
|
+Is a function pointer to the CUDA implementation of the codelet. @emph{This
|
|
|
+must be a host-function written in the CUDA runtime API}. Its prototype must
|
|
|
+be: @code{void cuda_func(starpu_data_interface_t *descr, void *arg);}. This
|
|
|
+pointer is ignored if @code{CUDA} does not appear in the @code{.where} field,
|
|
|
+it must be non-null otherwise.
|
|
|
+
|
|
|
+@item @code{gordon_func} (optional):
|
|
|
+This is the index of the Cell SPU implementation within the Gordon library.
|
|
|
+TODO
|
|
|
+
|
|
|
+@item @code{nbuffers}:
|
|
|
+Specifies the number of arguments taken by the codelet. These arguments are
|
|
|
+managed by the DSM and are accessed from the @code{starpu_data_interface_t *}
|
|
|
+array. The constant argument passed with the @code{.cl_arg} field of the
|
|
|
+@code{starpu_task} structure is not counted in this number. This value should
|
|
|
+not be above @code{STARPU_NMAXBUFS}.
|
|
|
+
|
|
|
+@item @code{model} (optional):
+This is a pointer to the performance model associated to this codelet. This
+optional field is ignored when null. TODO
|
|
|
+
|
|
|
+@end table
|
|
|
+@end table
|
|
|
+
|
|
|
+@node struct starpu_task
|
|
|
+@subsection @code{struct starpu_task} -- StarPU task structure
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+The starpu_task structure describes a task that can be offloaded on the various
|
|
|
+processing units managed by StarPU. It instantiates a codelet. It can either be
|
|
|
+allocated dynamically with the @code{starpu_task_create} method, or declared
|
|
|
+statically. In the latter case, the programmer has to zero the
|
|
|
+@code{starpu_task} structure and to fill the different fields properly. The
|
|
|
+indicated default values correspond to the configuration of a task allocated
|
|
|
+with @code{starpu_task_create}.
|
|
|
+
|
|
|
+@item @emph{Fields}:
|
|
|
+@table @asis
|
|
|
+@item @code{cl}:
|
|
|
+Is a pointer to the corresponding @code{starpu_codelet} data structure. This
|
|
|
+describes where the kernel should be executed, and supplies the appropriate
|
|
|
+implementations. When set to @code{NULL}, no code is executed during the task;
+such empty tasks can be useful for synchronization purposes.
|
|
|
+
|
|
|
+@item @code{buffers}:
|
|
|
+TODO
|
|
|
+
|
|
|
+@item @code{cl_arg} (optional) (default = NULL):
|
|
|
+TODO
|
|
|
+
|
|
|
+@item @code{cl_arg_size} (optional):
|
|
|
+TODO
|
|
|
+@c ignored if only executable on CPUs or CUDA ...
|
|
|
+
|
|
|
+@item @code{callback_func} (optional) (default = @code{NULL}):
|
|
|
+This is a function pointer of prototype @code{void (*f)(void *)} which
|
|
|
+specifies a possible callback. If that pointer is non-null, the callback
|
|
|
+function is executed @emph{on the host} after the execution of the task. The
|
|
|
+callback is passed the value contained in the @code{callback_arg} field. No
|
|
|
+callback is executed if that field is null.
|
|
|
+
|
|
|
+@item @code{callback_arg} (optional) (default = @code{NULL}):
|
|
|
+This is the pointer passed to the callback function. This field is ignored if
|
|
|
+the @code{callback_func} is null.
|
|
|
+
|
|
|
+@item @code{use_tag} (optional) (default = 0):
|
|
|
+If set, this flag indicates that the task should be associated with the tag
|
|
|
+contained in the @code{tag_id} field. Tags allow the application to synchronize
|
|
|
+with the task and to express task dependencies easily.
|
|
|
+
|
|
|
+@item @code{tag_id}:
|
|
|
+This field contains the tag associated to the task if the @code{use_tag} field
+was set; it is ignored otherwise.
|
|
|
+
|
|
|
+@item @code{synchronous}:
|
|
|
+If this flag is set, the @code{starpu_submit_task} function is blocking and
|
|
|
+returns only when the task has been executed (or if no worker is able to
|
|
|
+process the task). Otherwise, @code{starpu_submit_task} returns immediately.
|
|
|
+
|
|
|
+@item @code{priority} (optional) (default = @code{DEFAULT_PRIO}):
|
|
|
+This field indicates a level of priority for the task. This is an integer value
|
|
|
+that must be selected between @code{MIN_PRIO} (for the least important tasks)
|
|
|
+and @code{MAX_PRIO} (for the most important tasks) included. Default priority
|
|
|
+is @code{DEFAULT_PRIO}. Scheduling strategies that take priorities into
|
|
|
+account can use this parameter to take better scheduling decisions, but the
|
|
|
+scheduling policy may also ignore it.
|
|
|
+
|
|
|
+@item @code{execute_on_a_specific_worker} (default = 0):
|
|
|
+If this flag is set, StarPU will bypass the scheduler and directly assign this
|
|
|
+task to the worker specified by the @code{workerid} field.
|
|
|
+
|
|
|
+@item @code{workerid} (optional):
|
|
|
+If the @code{execute_on_a_specific_worker} field is set, this field indicates
|
|
|
+which is the identifier of the worker that should process this task (as
|
|
|
+returned by @code{starpu_get_worker_id}). This field is ignored if the
+@code{execute_on_a_specific_worker} field is set to 0.
|
|
|
+
|
|
|
+@item @code{detach} (optional) (default = 1):
|
|
|
+If this flag is set, it is not possible to synchronize with the task
|
|
|
+by means of @code{starpu_wait_task} later on. Internal data structures
+are only guaranteed to be freed once @code{starpu_wait_task} is called
|
|
|
+if that flag is not set.
|
|
|
+
|
|
|
+@item @code{destroy} (optional) (default = 1):
|
|
|
+If that flag is set, the task structure will automatically be freed, either
+after the execution of the callback if the task is detached, or during
+@code{starpu_wait_task} otherwise. If this flag is not set, dynamically allocated data
+structures will not be freed until @code{starpu_task_destroy} is called
+explicitly. Setting this flag for a statically allocated task structure will
|
|
|
+result in undefined behaviour.
|
|
|
+
|
|
|
+@end table
|
|
|
+@end table
|
|
|
+
|
|
|
+@node starpu_task_init
|
|
|
+@subsection @code{starpu_task_init} -- Initialize a Task
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+TODO
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{void starpu_task_init(struct starpu_task *task);}
|
|
|
+@end table
|
|
|
+
|
|
|
@node starpu_task_create
|
|
|
@subsection @code{starpu_task_create} -- Allocate and Initialize a Task
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
TODO
|
|
|
+(Describe the different default fields ...)
|
|
|
@item @emph{Prototype}:
|
|
|
@code{struct starpu_task *starpu_task_create(void);}
|
|
|
@end table
|
|
|
|
|
|
+@node starpu_task_destroy
|
|
|
+@subsection @code{starpu_task_destroy} -- Destroy a dynamically allocated Task
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+Free the resources allocated by @code{starpu_task_create}. This function can
|
|
|
+be called automatically after the execution of a task by setting the
|
|
|
+@code{.destroy} flag of the @code{starpu_task} structure (default behaviour).
|
|
|
+Calling this function on a statically allocated task results in undefined
+behaviour.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{void starpu_task_destroy(struct starpu_task *task);}
|
|
|
+@end table
|
|
|
+
|
|
|
+@node starpu_wait_task
|
|
|
+@subsection @code{starpu_wait_task} -- Wait for the termination of a Task
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+This function blocks until the task has been executed. It is not possible to
+synchronize with a task more than once. It is not possible to wait for
+synchronous or detached tasks.
|
|
|
+@item @emph{Return value}:
|
|
|
+Upon successful completion, this function returns 0. Otherwise, @code{-EINVAL}
|
|
|
+indicates that the waited task was either synchronous or detached.
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{int starpu_wait_task(struct starpu_task *task);}
|
|
|
+@end table
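
The rule above (waiting is refused for synchronous or detached tasks) can be checked by the application before it calls @code{starpu_wait_task}. The sketch below models only the two relevant flags; @code{struct toy_task} and @code{toy_wait_allowed} are hypothetical names introduced for this example and are not part of StarPU.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy mirror of the two starpu_task flags that matter for waiting. */
struct toy_task {
    int synchronous;
    int detach;
};

/* Returns 0 when waiting is permitted and -EINVAL otherwise, mirroring
 * the return values documented for starpu_wait_task. */
static int toy_wait_allowed(const struct toy_task *task)
{
    assert(task != NULL);
    return (task->synchronous || task->detach) ? -EINVAL : 0;
}
```

Since @code{detach} defaults to 1 for tasks obtained from @code{starpu_task_create}, an application that intends to wait for such a task has to clear that flag before submitting it.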
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
+@node starpu_submit_task
|
|
|
+@subsection @code{starpu_submit_task} -- Submit a Task
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+This function submits task @code{task} to StarPU. Calling this function does
|
|
|
+not mean that the task will be executed immediately as there can be data or task
|
|
|
+(tag) dependencies that are not fulfilled yet: StarPU will take care to
|
|
|
+schedule this task with respect to such dependencies.
|
|
|
+This function returns immediately if the @code{synchronous} field of the
|
|
|
+@code{starpu_task} structure was set to 0, and blocks until the termination of
|
|
|
+the task otherwise. It is also possible to synchronize the application with
|
|
|
+asynchronous tasks by the means of tags, using the @code{starpu_tag_wait}
|
|
|
+function for instance.
|
|
|
+
|
|
|
+In case of success, this function returns 0. A return value of @code{-ENODEV}
|
|
|
+means that there is no worker able to process that task (eg. there is no GPU
|
|
|
+available and this task is only implemented on top of CUDA).
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{int starpu_submit_task(struct starpu_task *task);}
|
|
|
+@end table
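
The @code{-ENODEV} case can be pictured as a mismatch between the codelet's @code{where} field and the kinds of workers that are actually present. The following is a toy model under stated assumptions: the @code{TOY_*} flags stand in for @code{CORE}, @code{CUDA} and @code{GORDON} (their real values are not given here), and @code{toy_submit_check} is a hypothetical helper, not how StarPU performs the check internally.

```c
#include <assert.h>
#include <errno.h>

/* Toy bit flags standing in for the CORE, CUDA and GORDON values. */
enum {
    TOY_CORE   = 1 << 0,
    TOY_CUDA   = 1 << 1,
    TOY_GORDON = 1 << 2
};

/* Returns 0 if at least one available worker type can run the codelet,
 * and -ENODEV otherwise. `where` is the codelet's target mask and
 * `available` is the mask of worker types present on the machine. */
static int toy_submit_check(unsigned where, unsigned available)
{
    /* After a successful initialization at least one worker exists. */
    assert(available != 0);
    return (where & available) != 0 ? 0 : -ENODEV;
}
```

For instance, a CUDA-only codelet on a machine with only CPU workers yields @code{-ENODEV}, while a codelet declared for both CPU cores and CUDA devices is accepted.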
|
|
|
+
|
|
|
+@node starpu_wait_all_tasks
|
|
|
+@subsection @code{starpu_wait_all_tasks} -- Wait for the termination of all Tasks
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+This function blocks until all the tasks that were submitted are terminated.
|
|
|
+
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{void starpu_wait_all_tasks(void);}
|
|
|
+@end table
|
|
|
+
|
|
|
|
|
|
|
|
|
|
|
|
@@ -306,33 +767,78 @@ TODO
|
|
|
* starpu_tag_wait:: Block until a Tag is terminated
|
|
|
* starpu_tag_wait_array:: Block until a set of Tags is terminated
|
|
|
* starpu_tag_remove:: Destroy a Tag
|
|
|
+* starpu_tag_notify_from_apps:: Feed a tag explicitly
|
|
|
@end menu
|
|
|
|
|
|
|
|
|
@node starpu_tag_t
|
|
|
@subsection @code{starpu_tag_t} -- Task identifier
|
|
|
-@c mention the tag_id field of the task structure
|
|
|
@table @asis
|
|
|
-@item @emph{Definition}:
|
|
|
-TODO
|
|
|
+@item @emph{Description}:
|
|
|
+It is possible to associate a task with a unique "tag" and to express
|
|
|
+dependencies between tasks by the means of those tags. To do so, fill the
|
|
|
+@code{tag_id} field of the @code{starpu_task} structure with a tag number (can
|
|
|
+be arbitrary) and set the @code{use_tag} field to 1.
|
|
|
+
|
|
|
+If @code{starpu_tag_declare_deps} is called with that tag number, the task will
|
|
|
+not be started until the tasks bearing the declared dependency tags are
|
|
|
+complete.
|
|
|
@end table
|
|
|
|
|
|
@node starpu_tag_declare_deps
|
|
|
@subsection @code{starpu_tag_declare_deps} -- Declare the Dependencies of a Tag
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
+Specify the dependencies of the task identified by tag @code{id}. The first
|
|
|
+argument specifies the tag which is configured, the second argument gives the
|
|
|
+number of tag(s) on which @code{id} depends. The following arguments are the
|
|
|
+tags which have to be terminated to unlock the task.
|
|
|
+
|
|
|
+This function must be called before the associated task is submitted to StarPU
|
|
|
+with @code{starpu_submit_task}.
|
|
|
+
|
|
|
+@item @emph{Remark}
|
|
|
+Because of the variable arity of @code{starpu_tag_declare_deps}, note that the
|
|
|
+last arguments @emph{must} be of type @code{starpu_tag_t}: constant values
|
|
|
+typically need to be explicitly cast. Using the
|
|
|
+@code{starpu_tag_declare_deps_array} function avoids this hazard.
|
|
|
+
|
|
|
@item @emph{Prototype}:
|
|
|
@code{void starpu_tag_declare_deps(starpu_tag_t id, unsigned ndeps, ...);}
|
|
|
+
|
|
|
+@item @emph{Example}:
|
|
|
+@example
|
|
|
+@c @cartouche
|
|
|
+/* Tag 0x1 depends on tags 0x32 and 0x52 */
|
|
|
+starpu_tag_declare_deps((starpu_tag_t)0x1,
|
|
|
+ 2, (starpu_tag_t)0x32, (starpu_tag_t)0x52);
|
|
|
+
|
|
|
+@c @end cartouche
|
|
|
+@end example
|
|
|
+
|
|
|
+
|
|
|
@end table
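
The reason for the casts in the remark above is that a variadic function reads each trailing argument with the exact type it expects, while an uncast integer constant is passed as a plain @code{int}. The self-contained sketch below shows the mechanism with a toy function; @code{toy_tag_t} and @code{toy_sum_deps} are hypothetical names, and the 64-bit width of the tag type is an assumption made for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>

/* Assumption for this sketch: tags are 64-bit unsigned integers. */
typedef uint64_t toy_tag_t;

/* Toy variadic function: sums ndeps trailing arguments, each of which is
 * read as a full toy_tag_t. Passing a plain int constant instead would
 * make va_arg read a value of the wrong type, which is undefined. */
static uint64_t toy_sum_deps(unsigned ndeps, ...)
{
    va_list ap;
    uint64_t sum = 0;

    /* The hazard exists because the tag type is wider than int. */
    assert(sizeof(toy_tag_t) > sizeof(int));

    va_start(ap, ndeps);
    for (unsigned i = 0; i < ndeps; i++)
        sum += va_arg(ap, toy_tag_t);
    va_end(ap);
    return sum;
}
```

A correct call looks like @code{toy_sum_deps(2, (toy_tag_t)0x32, (toy_tag_t)0x52)}; the array-based variant avoids the problem because the element type is fixed by the array declaration.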
|
|
|
|
|
|
@node starpu_tag_declare_deps_array
|
|
|
@subsection @code{starpu_tag_declare_deps_array} -- Declare the Dependencies of a Tag
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
+This function is similar to @code{starpu_tag_declare_deps}, except that it
|
|
|
+does not take a variable number of arguments but an array of tags of size
|
|
|
+@code{ndeps}.
|
|
|
@item @emph{Prototype}:
|
|
|
@code{void starpu_tag_declare_deps_array(starpu_tag_t id, unsigned ndeps, starpu_tag_t *array);}
|
|
|
+@item @emph{Example}:
|
|
|
+@example
|
|
|
+@c @cartouche
|
|
|
+/* Tag 0x1 depends on tags 0x32 and 0x52 */
|
|
|
+starpu_tag_t tag_array[2] = @{0x32, 0x52@};
|
|
|
+starpu_tag_declare_deps_array((starpu_tag_t)0x1, 2, tag_array);
|
|
|
+
|
|
|
+@c @end cartouche
|
|
|
+@end example
|
|
|
+
|
|
|
+
|
|
|
@end table
|
|
|
|
|
|
|
|
|
@@ -340,7 +846,15 @@ TODO
|
|
|
@subsection @code{starpu_tag_wait} -- Block until a Tag is terminated
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
+This function blocks until the task associated to tag @code{id} has been
|
|
|
+executed. This is a blocking call which must therefore not be called within
|
|
|
+tasks or callbacks, but only from the application directly. It is possible to
|
|
|
+synchronize with the same tag multiple times, as long as the
|
|
|
+@code{starpu_tag_remove} function is not called. Note that it is still
|
|
|
+possible to synchronize with a tag associated to a task whose @code{starpu_task}
+data structure was freed (eg. if the @code{destroy} flag of the
|
|
|
+@code{starpu_task} was enabled).
|
|
|
+
|
|
|
@item @emph{Prototype}:
|
|
|
@code{void starpu_tag_wait(starpu_tag_t id);}
|
|
|
@end table
|
|
|
@@ -349,22 +863,40 @@ TODO
|
|
|
@subsection @code{starpu_tag_wait_array} -- Block until a set of Tags is terminated
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
+This function is similar to @code{starpu_tag_wait} except that it blocks until
|
|
|
+@emph{all} the @code{ntags} tags contained in the @code{id} array are
|
|
|
+terminated.
|
|
|
@item @emph{Prototype}:
|
|
|
@code{void starpu_tag_wait_array(unsigned ntags, starpu_tag_t *id);}
|
|
|
@end table
|
|
|
|
|
|
|
|
|
-
|
|
|
@node starpu_tag_remove
|
|
|
@subsection @code{starpu_tag_remove} -- Destroy a Tag
|
|
|
@table @asis
|
|
|
@item @emph{Description}:
|
|
|
-TODO
|
|
|
+This function releases the resources associated to tag @code{id}. It can be
+called once the corresponding task has been executed and when there is no tag
+that depends on that one anymore.
|
|
|
@item @emph{Prototype}:
|
|
|
@code{void starpu_tag_remove(starpu_tag_t id);}
|
|
|
@end table
|
|
|
|
|
|
+@node starpu_tag_notify_from_apps
|
|
|
+@subsection @code{starpu_tag_notify_from_apps} -- Feed a Tag explicitly
|
|
|
+@table @asis
|
|
|
+@item @emph{Description}:
|
|
|
+This function explicitly unlocks tag @code{id}. It may be useful in the
|
|
|
+case of applications which execute part of their computation outside StarPU
|
|
|
+tasks (eg. third-party libraries). It is also provided as a
|
|
|
+convenient tool for the programmer, for instance to entirely construct the task
|
|
|
+DAG before actually giving StarPU the opportunity to execute the tasks.
|
|
|
+@item @emph{Prototype}:
|
|
|
+@code{void starpu_tag_notify_from_apps(starpu_tag_t id);}
|
|
|
+@end table
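
To illustrate how explicitly unlocking a tag interacts with declared dependencies, here is a toy model of the bookkeeping: a tag becomes ready once every tag it depends on has been marked as done. It is a sketch of the semantics only, with small integer indices in place of real tags; all @code{toy_*} names and the fixed table sizes are hypothetical and do not reflect StarPU's internal data structures.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_TAGS 8
#define TOY_MAX_DEPS 4

/* One entry per tag: whether it has terminated, and what it waits for. */
struct toy_tag {
    bool done;
    unsigned ndeps;
    unsigned deps[TOY_MAX_DEPS];
};

static struct toy_tag toy_tags[TOY_MAX_TAGS];

/* Record that tag `id` depends on the `ndeps` tags listed in `array`. */
static void toy_declare_deps_array(unsigned id, unsigned ndeps,
                                   const unsigned *array)
{
    assert(id < TOY_MAX_TAGS && ndeps <= TOY_MAX_DEPS);
    toy_tags[id].ndeps = ndeps;
    for (unsigned i = 0; i < ndeps; i++) {
        assert(array[i] < TOY_MAX_TAGS);
        toy_tags[id].deps[i] = array[i];
    }
}

/* Mark tag `id` as terminated, as an explicit notification would. */
static void toy_notify(unsigned id)
{
    assert(id < TOY_MAX_TAGS);
    toy_tags[id].done = true;
}

/* A tag is ready to run once all of its dependencies are done. */
static bool toy_is_ready(unsigned id)
{
    assert(id < TOY_MAX_TAGS);
    for (unsigned i = 0; i < toy_tags[id].ndeps; i++)
        if (!toy_tags[toy_tags[id].deps[i]].done)
            return false;
    return true;
}
```

In this model the whole dependency graph can be declared first, and nothing becomes ready until the application calls @code{toy_notify} on the root tags, which mirrors the use case described above.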
|
|
|
+
|
|
|
+
|
|
|
+
|
|
|
|
|
|
@section Extensions
|
|
|
|
|
|
@@ -372,8 +904,6 @@ TODO
|
|
|
|
|
|
@c void starpu_malloc_pinned_if_possible(float **A, size_t dim);
|
|
|
|
|
|
-@c subsubsection driver API specific calls
|
|
|
-
|
|
|
@subsection Cell extensions
|
|
|
|
|
|
@c ---------------------------------------------------------------------
|
|
|
@@ -556,8 +1086,8 @@ Programmers can describe the data layout of their application so that StarPU is
|
|
|
responsible for enforcing data coherency and availability accross the machine.
|
|
|
Instead of handling complex (and non-portable) mechanisms to perform data
|
|
|
movements, programmers only declare which piece of data is accessed and/or
|
|
|
-modified by a task, and StarPU makes sure that when a computational kernel starts
|
|
|
-somewhere (eg. on a GPU), its data are available locally.
|
|
|
+modified by a task, and StarPU makes sure that when a computational kernel
|
|
|
+starts somewhere (eg. on a GPU), its data are available locally.
|
|
|
|
|
|
Before submitting those tasks, the programmer first needs to declare the
|
|
|
different pieces of data to StarPU using the @code{starpu_register_*_data}
|