@@ -94,6 +94,7 @@ fashion.
 @menu
 * Codelet and Tasks::
 * StarPU Data Management Library::
+* Glossary::
 * Research Papers::
 @end menu
 
@@ -117,8 +118,8 @@ such as a CPU, a CUDA device or a Cell's SPU.
 
 Another important data structure is the @b{task}. Executing a StarPU task
 consists in applying a codelet on a data set, on one of the architectures on
-which the codelet is implemented. In addition to the codelet that a task
-useuses, it also describes which data are accessed, and how they are
+which the codelet is implemented. A task thus describes the codelet that it
+uses, but also which data are accessed, and how they are
 accessed during the computation (read and/or write).
 StarPU tasks are asynchronous: submitting a task to StarPU is a non-blocking
 operation. The task structure can also specify a @b{callback} function that is
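The codelet/task pattern described above can be sketched as follows. This is a minimal sketch against the StarPU C API of this era (field names such as @code{cpu_func} and @code{buffers} may differ in other versions); the @code{vector_handle} data handle and the body of @code{scal_cpu_func} are assumed to exist elsewhere in the application:

@example
/* a CPU implementation of the theoretical function */
static void scal_cpu_func(void *buffers[], void *cl_arg);

/* the codelet records where it can run and its implementations */
static starpu_codelet cl = @{
    .where = STARPU_CPU,
    .cpu_func = scal_cpu_func,
    .nbuffers = 1
@};

/* a task applies the codelet to a data set, asynchronously */
struct starpu_task *task = starpu_task_create();
task->cl = &cl;
task->buffers[0].handle = vector_handle; /* assumed registered */
task->buffers[0].mode = STARPU_RW;       /* read and write access */
starpu_task_submit(task);                /* non-blocking */

starpu_task_wait_for_all();
@end example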
@@ -126,10 +127,13 @@ called once StarPU has properly executed the task. It also contains optional
 fields that the application may use to give hints to the scheduler (such as
 priority levels).
 
+By default, StarPU infers task dependencies from data dependencies (sequential
+coherence). The application can however disable sequential coherence for some
+data, in which case dependencies must be expressed by hand.
 A task may be identified by a unique 64-bit number chosen by the application
 which we refer as a @b{tag}.
-Task dependencies can be enforced either by the means of callback functions, by
-expressing dependencies between explicit tasks or by expressing dependencies
+Task dependencies can be enforced by hand either by means of callback functions, by
+submitting other tasks, or by expressing dependencies
 between tags (which can thus correspond to tasks that have not been submitted
 yet).
 
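Expressing dependencies between tags by hand can be sketched as below (a hedged sketch; the tag values are arbitrary application-chosen numbers, and the codelet @code{cl} is assumed to be defined elsewhere):

@example
/* tag 0x32 may only proceed once tags 0x21 and 0x1 complete,
   even if the corresponding tasks are not submitted yet */
starpu_tag_declare_deps((starpu_tag_t)0x32,
                        2, (starpu_tag_t)0x21, (starpu_tag_t)0x1);

struct starpu_task *task = starpu_task_create();
task->cl = &cl;
task->use_tag = 1;      /* associate the task with a tag */
task->tag_id = 0x32;
starpu_task_submit(task);

/* block until the task associated with tag 0x32 terminates */
starpu_tag_wait((starpu_tag_t)0x32);
@end example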
@@ -147,6 +151,59 @@ where it was last needed, even if was modified there, and it
 allows multiple copies of the same data to reside at the same time on
 several processing units as long as it is not modified.
 
+@node Glossary
+@subsection Glossary
+
+A @b{codelet} records pointers to various implementations of the same
+theoretical function.
+
+A @b{memory node} can be either the main RAM or GPU-embedded memory.
+
+A @b{bus} is a link between memory nodes.
+
+A @b{data handle} keeps track of replicates of the same data (@b{registered} by the
+application) over various memory nodes. The data management library keeps
+them coherent.
+
+The @b{home} memory node of a data handle is the memory node from which the data
+was registered (usually the main memory node).
+
+A @b{task} represents a scheduled execution of a codelet on some data handles.
+
+A @b{tag} is a rendez-vous point. Tasks typically have their own tag, and can
+depend on other tags. The value is chosen by the application.
+
+A @b{worker} executes tasks. There is typically one per CPU computation core and
+one per accelerator (for which a whole CPU core is dedicated).
+
+A @b{driver} drives a given kind of worker. There are currently CPU, CUDA,
+OpenCL and Gordon drivers. They usually start several workers to actually drive
+them.
+
+A @b{performance model} is a (dynamic or static) model of the performance of a
+given codelet. Codelets can have an execution time performance model as well as
+a power consumption performance model.
+
+A data @b{interface} describes the layout of the data: for a vector, a pointer
+to the start, the number of elements and the size of each element; for a matrix,
+a pointer to the start, the number of elements per row, the offset between rows,
+and the size of each element; etc. To access their data, codelet functions are
+given interfaces for the local memory node replicates of the data handles of the
+scheduled task.
+
+@b{Partitioning} data means dividing the data of a given data handle (called
+@b{father}) into a series of @b{children} data handles which designate various
+portions of the former.
+
+A @b{filter} is the function which computes children data handles from a father
+data handle, and thus describes how the partitioning should be done (horizontal,
+vertical, etc.).
+
+@b{Acquiring} a data handle can be done from the main application, to safely
+access the data of a data handle from its home node, without having to
+unregister it.
+
+
 @node Research Papers
 @subsection Research Papers
 
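Several of the glossary notions (registration, home node, partitioning with a filter, acquiring) fit together as in the sketch below. Function and structure names follow the StarPU C API of this period (e.g. @code{starpu_vector_data_register}, @code{starpu_data_partition}) and may differ in other versions:

@example
float vector[1024];
starpu_data_handle vector_handle;

/* register: memory node 0 (main RAM) becomes the home node */
starpu_vector_data_register(&vector_handle, 0, (uintptr_t)vector,
                            1024, sizeof(vector[0]));

/* a filter partitions the father handle into children handles */
struct starpu_data_filter f = @{
    .filter_func = starpu_block_filter_func_vector,
    .nchildren = 4
@};
starpu_data_partition(vector_handle, &f);

/* ... submit tasks on the children data handles ... */

starpu_data_unpartition(vector_handle, 0);

/* acquire to safely read the data from the main application,
   without having to unregister the handle */
starpu_data_acquire(vector_handle, STARPU_R);
/* ... read vector ... */
starpu_data_release(vector_handle);

starpu_data_unregister(vector_handle);
@end example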
@@ -3065,8 +3122,8 @@ design their own data interfaces if required.
 @node starpu_malloc
 @subsection @code{starpu_malloc} -- Allocate data and pin it
 @deftypefun int starpu_malloc (void **@var{A}, size_t @var{dim})
-This function allocates data of the given size. It will also try to pin it in
-CUDA or OpenGL, so that data transfers from this buffer can be asynchronous, and
+This function allocates data of the given size in main memory. It will also try to pin it in
+CUDA or OpenCL, so that data transfers from this buffer can be asynchronous, and
 thus permit data transfer and computation overlapping. The allocated buffer must
 be freed thanks to the @code{starpu_free} function.
 @end deftypefun
@@ -4409,6 +4466,7 @@ TODO describe all the different fields
 @subsection An example of data interface
 @table @asis
 TODO
+See @code{src/datawizard/interfaces/vector_interface.c} for now.
 @end table
 
 @node Defining a new scheduling policy