@@ -1195,10 +1195,13 @@ type). This still assumes performance regularity, but can work with various data
input sizes, by applying a*n^b+c regression over observed execution times.
@end itemize

+How to use schedulers which can benefit from such performance models is
+explained in section @ref{Task scheduling policy}.
+
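+For instance, a codelet can be associated with such a regression-based model
+along the following lines (see section @ref{Performance model example} for a
+complete example):
+
+@cartouche
+@smallexample
+static struct starpu_perfmodel vector_scal_model = @{
+    /* a*n^b+c regression over observed execution times */
+    .type = STARPU_NL_REGRESSION_BASED,
+    .symbol = "vector_scal"
+@};
+
+static struct starpu_codelet cl = @{
+    /* ... kernel functions and buffer description ... */
+    .model = &vector_scal_model
+@};
+@end smallexample
+@end cartouche
+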
The same can be done for task power consumption estimation, by setting the
@code{power_model} field the same way as the @code{model} field. Note: for
now, the application has to give to the power consumption performance model
-a different name.
+a name which is different from the one used for the execution time
+performance model.
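+
+For instance, the codelet above can additionally be given a power consumption
+model as follows, @code{vector_scal_power} being an arbitrary symbol name
+distinct from the execution time model's:
+
+@cartouche
+@smallexample
+static struct starpu_perfmodel vector_scal_power_model = @{
+    .type = STARPU_NL_REGRESSION_BASED,
+    /* must not be the symbol used by the execution time model */
+    .symbol = "vector_scal_power"
+@};
+
+static struct starpu_codelet cl = @{
+    /* ... */
+    .model = &vector_scal_model,
+    .power_model = &vector_scal_power_model
+@};
+@end smallexample
+@end cartouche
+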
@node Theoretical lower bound on execution time
@section Theoretical lower bound on execution time
@@ -1441,11 +1444,14 @@ priority information to StarPU.

By default, StarPU uses the @code{eager} simple greedy scheduler. This is
because it provides correct load balance even if the application codelets do not
-have performance models. If your application codelets have performance models,
+have performance models. If your application codelets have performance models
+(see section @ref{Performance model example} for examples of how to define one),
you should change the scheduler thanks to the @code{STARPU_SCHED} environment
variable. For instance @code{export STARPU_SCHED=dmda} . Use @code{help} to get
the list of available schedulers.

+@c TODO: give some details about each scheduler.
+
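+Note that the scheduling policy can also be selected from the application
+itself, by passing its name in the @code{struct starpu_conf} given to
+@code{starpu_init}, along these lines:
+
+@cartouche
+@smallexample
+struct starpu_conf conf;
+starpu_conf_init(&conf);
+/* same effect as setting STARPU_SCHED=dmda in the environment */
+conf.sched_policy_name = "dmda";
+starpu_init(&conf);
+@end smallexample
+@end cartouche
+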
Most schedulers are based on an estimation of codelet duration on each kind
of processing unit. For this to be possible, the application programmer needs
to configure a performance model for the codelets of the application (see
@@ -1502,15 +1508,16 @@ The power actually consumed by the total execution can be displayed by setting
@node Profiling
@section Profiling

-Profiling can be enabled by using @code{export STARPU_PROFILING=1} or by
+A quick view of how many tasks each worker has executed can be obtained by
+setting @code{export STARPU_WORKER_STATS=1}. This is a convenient way to check
+that execution did happen on accelerators without penalizing performance with
+the profiling overhead.
+
+More detailed profiling information can be enabled by using @code{export STARPU_PROFILING=1} or by
calling @code{starpu_profiling_status_set} from the source code.
Statistics on the execution can then be obtained by using @code{export
-STARPU_BUS_STATS=1} and @code{export STARPU_WORKER_STATS=1} . Workers
-stats will include an approximation of the number of executed tasks even if
-@code{STARPU_PROFILING} is not set. This is a convenient way to check that
-execution did happen on accelerators without penalizing performance with
-the profiling overhead. More details on performance feedback are provided by the
-next chapter.
+STARPU_BUS_STATS=1} and @code{export STARPU_WORKER_STATS=1}.
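+
+Profiling can for instance be enabled from the application itself, before
+tasks get submitted:
+
+@cartouche
+@smallexample
+/* same effect as setting STARPU_PROFILING=1 in the environment */
+starpu_profiling_status_set(STARPU_PROFILING_ENABLE);
+@end smallexample
+@end cartouche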
+
+More details on performance feedback are provided by the next chapter.

@node CUDA-specific optimizations
@section CUDA-specific optimizations