Mention DriverCopyAsync in the trace.

Samuel Thibault 14 years ago
commit e9e66b6c78
2 changed files with 4 additions and 1 deletions
  1. doc/starpu.texi (+3 -1)
  2. src/sched_policies/heft.c (+1 -0)

+ 3 - 1
doc/starpu.texi

@@ -1570,7 +1570,9 @@ When the application allocates data, whenever possible it should use the
 @code{starpu_malloc} function, which will ask CUDA or
 OpenCL to make the allocation itself and pin the corresponding allocated
 memory. This is needed to permit asynchronous data transfer, i.e. permit data
-transfer to overlap with computations.
+transfer to overlap with computations. Otherwise, the trace will show that the
+@code{DriverCopyAsync} state takes a lot of time, this is because CUDA or OpenCL
+then reverts to synchronous transfers.
 
 By default, StarPU leaves replicates of data wherever they were used, in case they
 will be re-used by other tasks, thus saving the data transfer time. When some

+ 1 - 0
src/sched_policies/heft.c

@@ -315,6 +315,7 @@ static int _heft_push_task(struct starpu_task *task, unsigned prio)
 
 	for (worker = 0; worker < nworkers; worker++)
 	{
+		/* FIXME: multiimpl! */
 		if (!starpu_worker_may_execute_task(worker, task, 0))
 		{
 			/* no one on that queue may execute this task */