
more comments about task granularity

Samuel Thibault, 12 years ago
commit 701f81abfc
1 changed file with 6 additions and 2 deletions

+ 6 - 2
doc/chapters/perf-optimization.texi
@@ -122,10 +122,14 @@ only when another task writes some value to the handle.
 Like any other runtime, StarPU has some overhead to manage tasks. Since
 it does smart scheduling and data management, that overhead is not always
 neglectable. The order of magnitude of the overhead is typically a couple of
-microseconds. The amount of work that a task should do should thus be somewhat
+microseconds, which is actually quite smaller than the CUDA overhead itself. The
+amount of work that a task should do should thus be somewhat
 bigger, to make sure that the overhead becomes neglectible. The offline
 performance feedback can provide a measure of task length, which should thus be
-checked if bad performance are observed.
+checked if bad performance are observed. To get a grasp at the scalability
+possibility according to task size, one can run
+@code{tests/microbenchs/tasks_size_overhead.sh} which draws curves of the
+speedup of independent tasks of very small sizes.
 
 
 @node Task submission
 @section Task submission
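As a back-of-the-envelope illustration of the granularity argument in the patched text: if each task costs a couple of microseconds of runtime overhead, a task's useful work must be proportionally longer for that overhead to become negligible. The sketch below assumes a 2 µs per-task overhead (an illustrative value; the text only says "a couple of microseconds") and computes the minimum task length for a chosen overhead budget.

```python
# Hypothetical illustration (not StarPU API code): relate per-task runtime
# overhead to the task length needed to keep overhead below a target
# fraction of total time.

OVERHEAD_US = 2.0  # assumed per-task overhead in microseconds (illustrative)

def min_task_length_us(max_overhead_fraction):
    """Smallest task duration (in us) such that
    overhead / (overhead + work) <= max_overhead_fraction."""
    return OVERHEAD_US * (1.0 / max_overhead_fraction - 1.0)

if __name__ == "__main__":
    for frac in (0.5, 0.1, 0.01):
        print(f"overhead <= {frac:.0%}: task work >= "
              f"{min_task_length_us(frac):.0f} us")
```

So with a 2 µs overhead, keeping the overhead under 1% already requires roughly 200 µs of work per task, which is the kind of curve the `tests/microbenchs/tasks_size_overhead.sh` script mentioned in the patch lets one measure empirically.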