
more comments about task granularity

Samuel Thibault · 12 years ago · commit 701f81abfc
1 changed file with 6 additions and 2 deletions
doc/chapters/perf-optimization.texi

@@ -122,10 +122,14 @@ only when another task writes some value to the handle.
 Like any other runtime, StarPU has some overhead to manage tasks. Since
 it does smart scheduling and data management, that overhead is not always
 negligible. The order of magnitude of the overhead is typically a couple of
-microseconds. The amount of work that a task should do should thus be somewhat
+microseconds, which is actually smaller than the CUDA overhead itself. The
+amount of work that a task performs should thus be somewhat
 bigger, to make sure that the overhead becomes negligible. The offline
 performance feedback can provide a measure of task length, which should thus be
-checked if bad performance are observed.
+checked if bad performance is observed. To get a grasp of the scalability
+achievable depending on task size, one can run
+@code{tests/microbenchs/tasks_size_overhead.sh}, which draws curves of the
+speedup of independent tasks of very small sizes.
 
 @node Task submission
 @section Task submission