
cross-reference between performance model example and scheduling algorithm selection

Samuel Thibault 14 years ago
parent
commit
faf0486405
1 changed file with 16 additions and 9 deletions

+ 16 - 9
doc/starpu.texi

@@ -1195,10 +1195,13 @@ type). This still assumes performance regularity, but can work with various data
 input sizes, by applying a*n^b+c regression over observed execution times.
 @end itemize
 
+How to use schedulers which can benefit from such performance models is explained
+in section @ref{Task scheduling policy}.
+
 The same can be done for task power consumption estimation, by setting the
 @code{power_model} field the same way as the @code{model} field. Note: for
 now, the application has to give to the power consumption performance model
-a different name.
+a name which is different from the execution time performance model.
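Once measurements have been recorded under these names, they can be inspected from the shell; a sketch, assuming the @code{starpu_perfmodel_display} tool shipped with StarPU and a hypothetical model symbol @code{my_codelet}:

```shell
# List the performance model symbols recorded so far
starpu_perfmodel_display -l

# Show the measurements recorded for one symbol
starpu_perfmodel_display -s my_codelet
```

With distinct symbols for the execution time and power consumption models, each can be displayed separately this way.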
 
 @node Theoretical lower bound on execution time
 @section Theoretical lower bound on execution time
@@ -1441,11 +1444,14 @@ priority information to StarPU.
 
 By default, StarPU uses the @code{eager} simple greedy scheduler. This is
 because it provides correct load balance even if the application codelets do not
-have performance models. If your application codelets have performance models,
+have performance models. If your application codelets have performance models
+(see section @ref{Performance model example} for examples showing how to do it),
 you should change the scheduler thanks to the @code{STARPU_SCHED} environment
 variable. For instance, @code{export STARPU_SCHED=dmda}. Use @code{help} to get
 the list of available schedulers.
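In the shell, that selection looks like the following sketch (@code{./my_app} stands for a hypothetical application binary linked against StarPU):

```shell
# Use the dmda scheduler, which exploits the codelets' performance models
export STARPU_SCHED=dmda
./my_app

# Ask StarPU to print the list of available scheduling policies
STARPU_SCHED=help ./my_app
```

The variable is read at @code{starpu_init} time, so it must be set before the application starts.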
 
+@c TODO: give some details about each scheduler.
+
 Most schedulers are based on an estimation of codelet duration on each kind
 of processing unit. For this to be possible, the application programmer needs
 to configure a performance model for the codelets of the application (see
@@ -1502,15 +1508,16 @@ The power actually consumed by the total execution can be displayed by setting
 @node Profiling
 @section Profiling
 
-Profiling can be enabled by using @code{export STARPU_PROFILING=1} or by
+A quick view of how many tasks each worker has executed can be obtained by setting 
+@code{export STARPU_WORKER_STATS=1} This is a convenient way to check that
+execution did happen on accelerators without penalizing performance with
+the profiling overhead.
+
+More detailed profiling information can be enabled by using @code{export STARPU_PROFILING=1} or by
 calling @code{starpu_profiling_status_set} from the source code.
 Statistics on the execution can then be obtained by using @code{export
-STARPU_BUS_STATS=1} and @code{export STARPU_WORKER_STATS=1} . Workers
-stats will include an approximation of the number of executed tasks even if
-@code{STARPU_PROFILING} is not set. This is a convenient way to check that
-execution did happen on accelerators without penalizing performance with
-the profiling overhead. More details on performance feedback are provided by the
-next chapter.
 STARPU_BUS_STATS=1} and @code{export STARPU_WORKER_STATS=1}.
+More details on performance feedback are provided by the next chapter.
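Putting the variables above together; a sketch, again with a hypothetical @code{./my_app} binary linked against StarPU:

```shell
# Lightweight check that work really ran on the accelerators
STARPU_WORKER_STATS=1 ./my_app

# Full profiling with per-worker and per-bus statistics
export STARPU_PROFILING=1
export STARPU_BUS_STATS=1
export STARPU_WORKER_STATS=1
./my_app
```

The first form avoids the profiling overhead entirely; the second enables it for the whole run.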
 
 @node CUDA-specific optimizations
 @section CUDA-specific optimizations