15 years ago · cf79a02451
--- a/README
+++ b/README
@@ -7,15 +7,15 @@
 
 StarPU is a runtime system that offers support for heterogeneous multicore
 machines. While many efforts are devoted to design efficient computation kernels
-for those architectures (eg. to implement BLAS kernels on GPUs or on Cell's
+for those architectures (e.g. to implement BLAS kernels on GPUs or on Cell's
 SPUs), StarPU not only takes care of offloading such kernels (and implementing
-data coherency accross the machine), but it also makes sure the kernels are
+data coherency across the machine), but it also makes sure the kernels are
 executed as efficiently as possible.
 
 +------------------------
 | I.b. What StarPU is not
 
-StarPU is not a new langage, and it does not extends existing langages either.
+StarPU is not a new language, and it does not extends existing languages either.
 StarPU does not help to write computation kernels.
 
 +---------------------------------
--- a/doc/starpu.texi
+++ b/doc/starpu.texi
@@ -59,14 +59,14 @@ This manual documents the usage of StarPU
 The use of specialized hardware such as accelerators or coprocessors offers an
 interesting approach to overcome the physical limits encountered by processor
 architects. As a result, many machines are now equipped with one or several
-accelerators (eg. a GPU), in addition to the usual processor(s). While a lot of
+accelerators (e.g. a GPU), in addition to the usual processor(s). While a lot of
 efforts have been devoted to offload computation onto such accelerators, very
 little attention as been paid to portability concerns on the one hand, and to the
 possibility of having heterogeneous accelerators and processors to interact on the other hand.
 
 StarPU is a runtime system that offers support for heterogeneous multicore
 architectures, it not only offers a unified view of the computational resources
-(ie. CPUs and accelerators at the same time), but it also takes care to
+(i.e. CPUs and accelerators at the same time), but it also takes care to
 efficiently map and execute tasks onto an heterogeneous machine while
 transparently handling low-level issues in a portable fashion.
 
@@ -88,7 +88,7 @@ takes particular care of scheduling those tasks efficiently and allows
 scheduling experts to implement custom scheduling policies in a portable
 fashion.
 
-@c explain the notion of codelet and task (ie. g(A, B)
+@c explain the notion of codelet and task (i.e. g(A, B)
 @subsection Codelet and Tasks
 One of StarPU primary data structure is the @b{codelet}. A codelet describes a
 computational kernel that can possibly be implemented on multiple architectures
@@ -440,7 +440,7 @@ TODO
 
 @item @emph{Description}:
 This is StarPU initialization method, which must be called prior to any other
-StarPU call.  It is possible to specify StarPU's configuration (eg. scheduling
+StarPU call.  It is possible to specify StarPU's configuration (e.g. scheduling
 policy, number of cores, ...) by passing a non-null argument. Default
 configuration is used if the passed argument is @code{NULL}.
 @item @emph{Return value}:
@@ -460,7 +460,7 @@ indicates that no worker was available (so that StarPU was not be initialized).
 This structure is passed to the @code{starpu_init} function in order configure
 StarPU. When the default value is used, StarPU automatically select the number
 of processing units and takes the default scheduling policy. This parameters
-overwrite the equivalent environnement variables. 
+overwrite the equivalent environment variables. 
 
 @item @emph{Fields}:
 @table @asis 
@@ -501,7 +501,7 @@ environment variable.
 @item @emph{Description}:
 This is StarPU termination method. It must be called at the end of the
 application: statistics and other post-mortem debugging information are not
-garanteed to be available until this method has been called.
+guaranteed to be available until this method has been called.
 
 @item @emph{Prototype}:
 @code{void starpu_shutdown(void);}
@@ -527,7 +527,7 @@ garanteed to be available until this method has been called.
 @table @asis
 
 @item @emph{Description}:
-This function returns the number of workers (ie. processing units executing
+This function returns the number of workers (i.e. processing units executing
 StarPU tasks). The returned value should be at most @code{STARPU_NMAXWORKERS}. 
 
 @item @emph{Prototype}:
@@ -589,7 +589,7 @@ This function returns the number of Cell SPUs controlled by StarPU.
 @item @emph{Description}:
 This function returns the identifier of the worker associated to the calling
 thread. The returned value is either -1 if the current context is not a StarPU
-worker (ie. when called from the application outside a task or a callback), or
+worker (i.e. when called from the application outside a task or a callback), or
 an integer between 0 and @code{starpu_get_worker_count() - 1}.
 
 @item @emph{Prototype}:
@@ -705,7 +705,7 @@ Indicates which types of processing units are able to execute that codelet.
 implemented for both CPU cores and CUDA devices while @code{STARPU_GORDON}
 indicates that it is only available on Cell SPUs.
 
-@item @code{cpu_func} (optionnal):
+@item @code{cpu_func} (optional):
 Is a function pointer to the CPU implementation of the codelet. Its prototype
 must be: @code{void cpu_func(void *buffers[], void *cl_arg)}. The first
 argument being the array of data managed by the data management library, and
@@ -714,21 +714,21 @@ field of the @code{starpu_task} structure.
 The @code{cpu_func} field is ignored if @code{STARPU_CPU} does not appear in
 the @code{.where} field, it must be non-null otherwise.
 
-@item @code{cuda_func} (optionnal):
+@item @code{cuda_func} (optional):
 Is a function pointer to the CUDA implementation of the codelet. @emph{This
 must be a host-function written in the CUDA runtime API}. Its prototype must
 be: @code{void cuda_func(void *buffers[], void *cl_arg);}. The @code{cuda_func}
 field is ignored if @code{STARPU_CUDA} does not appear in the @code{.where}
 field, it must be non-null otherwise.
 
-@item @code{opencl_func} (optionnal):
+@item @code{opencl_func} (optional):
 Is a function pointer to the OpenCL implementation of the codelet. Its
 prototype must be:
 @code{void opencl_func(starpu_data_interface_t *descr, void *arg);}.
 This pointer is ignored if @code{OPENCL} does not appear in the
 @code{.where} field, it must be non-null otherwise.
 
-@item @code{gordon_func} (optionnal):
+@item @code{gordon_func} (optional):
 This is the index of the Cell SPU implementation within the Gordon library.
 TODO
 
@@ -739,9 +739,9 @@ array. The constant argument passed with the @code{.cl_arg} field of the
 @code{starpu_task} structure is not counted in this number.  This value should
 not be above @code{STARPU_NMAXBUFS}.
 
-@item @code{model} (optionnal):
+@item @code{model} (optional):
 This is a pointer to the performance model associated to this codelet. This
-optionnal field is ignored when null. TODO
+optional field is ignored when null. TODO
 
 @end table
 @end table
@@ -751,7 +751,7 @@ optionnal field is ignored when null. TODO
 @table @asis
 @item @emph{Description}:
 The starpu_task structure describes a task that can be offloaded on the various
-processing units managed by StarPU. It instanciates a codelet. It can either be
+processing units managed by StarPU. It instantiates a codelet. It can either be
 allocated dynamically with the @code{starpu_task_create} method, or declared
 statically. In the latter case, the programmer has to zero the
 @code{starpu_task} structure and to fill the different fields properly. The
@@ -771,7 +771,7 @@ TODO
 
 @item @code{cl_arg} (optional) (default = NULL):
 This pointer is passed to the codelet through the second argument
-of the codelet implementation (eg. @code{cpu_func} or @code{cuda_func}).
+of the codelet implementation (e.g. @code{cpu_func} or @code{cuda_func}).
 In the specific case of the Cell processor, see the @code{.cl_arg_size}
 argument.
 
@@ -797,7 +797,7 @@ the @code{callback_func} is null.
 
 @item @code{use_tag} (optional) (default = 0):
 If set, this flag indicates that the task should be associated with the tag
-conained in the @code{tag_id} field. Tag allow the application to synchronize
+contained in the @code{tag_id} field. Tag allow the application to synchronize
 with the task and to express task dependencies easily.
 
 @item @code{tag_id}:
@@ -809,7 +809,7 @@ If this flag is set, the @code{starpu_submit_task} function is blocking and
 returns only when the task has been executed (or if no worker is able to
 process the task). Otherwise, @code{starpu_submit_task} returns immediately.
 
-@item @code{priority} (optionnal) (default = @code{STARPU_DEFAULT_PRIO}):
+@item @code{priority} (optional) (default = @code{STARPU_DEFAULT_PRIO}):
 This field indicates a level of priority for the task. This is an integer value
 that must be selected between @code{STARPU_MIN_PRIO} (for the least important
 tasks) and @code{STARPU_MAX_PRIO} (for the most important tasks) included.
@@ -830,7 +830,7 @@ returned by @code{starpu_get_worker_id}). This field is ignored if
 @item @code{detach} (optional) (default = 1):
 If this flag is set, it is not possible to synchronize with the task
 by the means of @code{starpu_wait_task} later on. Internal data structures
-are only garanteed to be liberated once @code{starpu_wait_task} is called
+are only guaranteed to be liberated once @code{starpu_wait_task} is called
 if that flag is not set.
 
 @item @code{destroy} (optional) (default = 1):
@@ -838,7 +838,7 @@ If that flag is set, the task structure will automatically be liberated, either
 after the execution of the callback if the task is detached, or during
 @code{starpu_task_wait} otherwise. If this flag is not set, dynamically allocated data
 structures will not be liberated until @code{starpu_task_destroy} is called
-explicitely. Setting this flag for a statically allocated task structure will
+explicitly. Setting this flag for a statically allocated task structure will
 result in undefined behaviour.
 
 @end table
@@ -848,9 +848,9 @@ result in undefined behaviour.
 @subsection @code{starpu_task_init} -- Initialize a Task
 @table @asis
 @item @emph{Description}:
-Initialize a task structure with default values. This function is implicitely
+Initialize a task structure with default values. This function is implicitly
 called by @code{starpu_task_create}. By default, tasks initialized with
-@code{starpu_task_init} must be deinitialized explicitely with
+@code{starpu_task_init} must be deinitialized explicitly with
 @code{starpu_task_deinit}. Tasks can also be initialized statically, using the
 constant @code{STARPU_TASK_INITIALIZER}.
 @item @emph{Prototype}:
@@ -863,8 +863,8 @@ constant @code{STARPU_TASK_INITIALIZER}.
 @item @emph{Description}:
 Allocate a task structure and initialize it with default values. Tasks
 allocated dynamically with starpu_task_create are automatically liberated when
-the task is terminated. If the destroy flag is explicitely unset, the
-ressources used by the task are liberated by calling
+the task is terminated. If the destroy flag is explicitly unset, the
+resources used by the task are liberated by calling
 @code{starpu_task_destroy}.
 
 @item @emph{Prototype}:
@@ -876,7 +876,7 @@ ressources used by the task are liberated by calling
 @table @asis
 @item @emph{Description}:
 Release all the structures automatically allocated to execute the task. This is
-called implicitely by starpu_task_destroy, but the task structure itself is not
+called implicitly by starpu_task_destroy, but the task structure itself is not
 liberated. This should be used for statically allocated tasks for instance.
 Note that this function is automatically called by @code{starpu_task_destroy}.
 @item @emph{Prototype}:
@@ -889,7 +889,7 @@ Note that this function is automatically called by @code{starpu_task_destroy}.
 @subsection @code{starpu_task_destroy} -- Destroy a dynamically allocated Task
 @table @asis
 @item @emph{Description}:
-Liberate the ressource allocated during starpu_task_create. This function can
+Liberate the resource allocated during starpu_task_create. This function can
 be called automatically after the execution of a task by setting the
 @code{.destroy} flag of the @code{starpu_task} structure (default behaviour).
 Calling this function on a statically allocated task results in an undefined
@@ -920,7 +920,7 @@ indicates that the waited task was either synchronous or detached.
 @table @asis
 @item @emph{Description}:
 This function submits task @code{task} to StarPU. Calling this function does
-not mean that the task will be executed immediatly as there can be data or task
+not mean that the task will be executed immediately as there can be data or task
 (tag) dependencies that are not fulfilled yet: StarPU will take care to
 schedule this task with respect to such dependencies.
 This function returns immediately if the @code{synchronous} field of the
@@ -930,7 +930,7 @@ asynchronous tasks by the means of tags, using the @code{starpu_tag_wait}
 function for instance. 
 
 In case of success, this function returns 0, a return value of @code{-ENODEV}
-means that there is no worker able to process that task (eg. there is no GPU
+means that there is no worker able to process that task (e.g. there is no GPU
 available and this task is only implemented on top of CUDA).
 @item @emph{Prototype}:
 @code{int starpu_submit_task(struct starpu_task *task);}
@@ -961,7 +961,7 @@ This function blocks until all the tasks that were submitted are terminated.
 * starpu_tag_wait::                Block until a Tag is terminated
 * starpu_tag_wait_array::          Block until a set of Tags is terminated
 * starpu_tag_remove::              Destroy a Tag
-* starpu_tag_notify_from_apps::    Feed a tag explicitely
+* starpu_tag_notify_from_apps::    Feed a tag explicitly
 @end menu
 
 
@@ -994,7 +994,7 @@ with @code{starpu_submit_task}.
 @item @emph{Remark}
 Because of the variable arity of @code{starpu_tag_declare_deps}, note that the
 last arguments @emph{must} be of type @code{starpu_tag_t}: constant values
-typically need to be explicitely casted. Using the
+typically need to be explicitly casted. Using the
 @code{starpu_tag_declare_deps_array} function avoids this hazard.
 
 @item @emph{Prototype}:
@@ -1042,8 +1042,8 @@ executed. This is a blocking call which must therefore not be called within
 tasks or callbacks, but only from the application directly.  It is possible to
 synchronize with the same tag multiple times, as long as the
 @code{starpu_tag_remove} function is not called.  Note that it is still
-possible to synchronize wih a tag associated to a task which @code{starpu_task}
-data structure was liberated (eg. if the @code{destroy} flag of the
+possible to synchronize with a tag associated to a task which @code{starpu_task}
+data structure was liberated (e.g. if the @code{destroy} flag of the
 @code{starpu_task} was enabled).
 
 @item @emph{Prototype}:
@@ -1073,12 +1073,12 @@ that depend on that one anymore.
 @end table
 
 @node starpu_tag_notify_from_apps
-@subsection @code{starpu_tag_notify_from_apps} -- Feed a Tag explicitely
+@subsection @code{starpu_tag_notify_from_apps} -- Feed a Tag explicitly
 @table @asis
 @item @emph{Description}:
-This function explicitely unlocks tag @code{id}. It may be useful in the
+This function explicitly unlocks tag @code{id}. It may be useful in the
 case of applications which execute part of their computation outside StarPU
-tasks (eg. third-party libraries).  It is also provided as a
+tasks (e.g. third-party libraries).  It is also provided as a
 convenient tool for the programmer, for instance to entirely construct the task
 DAG before actually giving StarPU the opportunity to execute the tasks.
 @item @emph{Prototype}:
@@ -1235,7 +1235,7 @@ starpu_codelet cl =
 
 A codelet is a structure that represents a computational kernel. Such a codelet
 may contain an implementation of the same kernel on different architectures
-(eg. CUDA, Cell's SPU, x86, ...).
+(e.g. CUDA, Cell's SPU, x86, ...).
 
 The ''@code{.nbuffers}'' field specifies the number of data buffers that are
 manipulated by the codelet: here the codelet does not access or modify any data
@@ -1255,7 +1255,7 @@ which @emph{must} have the following prototype:
 @code{void (*cpu_func)(void *buffers[], void *cl_arg)}
 
 In this example, we can ignore the first argument of this function which gives a
-description of the input and output buffers (eg. the size and the location of
+description of the input and output buffers (e.g. the size and the location of
 the matrices). The second argument is a pointer to a buffer passed as an
 argument to the codelet by the means of the ''@code{.cl_arg}'' field of the
 @code{starpu_task} structure.
@@ -1263,7 +1263,7 @@ argument to the codelet by the means of the ''@code{.cl_arg}'' field of the
 @c TODO rewrite so that it is a little clearer ?
 Be aware that this may be a pointer to a
 @emph{copy} of the actual buffer, and not the pointer given by the programmer:
-if the codelet modifies this buffer, there is no garantee that the initial
+if the codelet modifies this buffer, there is no guarantee that the initial
 buffer will be modified as well: this for instance implies that the buffer
 cannot be used as a synchronization medium.
 
@@ -1350,11 +1350,11 @@ The previous example has shown how to submit tasks. In this section we show how
 StarPU tasks can manipulate data.
 
 Programmers can describe the data layout of their application so that StarPU is
-responsible for enforcing data coherency and availability accross the machine.
+responsible for enforcing data coherency and availability across the machine.
 Instead of handling complex (and non-portable) mechanisms to perform data
 movements, programmers only declare which piece of data is accessed and/or
 modified by a task, and StarPU makes sure that when a computational kernel
-starts somewhere (eg. on a GPU), its data are available locally.
+starts somewhere (e.g. on a GPU), its data are available locally.
 
 Before submitting those tasks, the programmer first needs to declare the
 different pieces of data to StarPU using the @code{starpu_register_*_data}