@@ -73,7 +73,7 @@ participate to computation.

 While a worker is conducting a scheduling operation, e.g. the worker is in the
 process of selecting a new task to execute, flag state_sched_op_pending is set
-to !0, otherwise it is set to 0.
+to \c !0, otherwise it is set to \c 0.

 While state_sched_op_pending is !0, the following exhaustive list of operations on that
 workers are restricted in the stated way:
@@ -83,29 +83,29 @@ workers are restricted in the stated way:
 - adding the worker to a parallel task team is not allowed;
 - removing the worker from a parallel task team is not allowed;
 - querying state information about the worker is only allowed while
- state_relax_refcnt > 0;
+ <code>state_relax_refcnt > 0</code>;
 - in particular, querying whether the worker is blocked on a parallel team entry is only
- allowed while state_relax_refcnt > 0.
+ allowed while <code>state_relax_refcnt > 0</code>.

 Entering and leaving the state_sched_op_pending state is done through calls to
-_starpu_worker_enter_sched_op() and _starpu_worker_leave_sched_op()
-respectively (see these functions in use in functions _starpu_get_worker_task() and
-_starpu_get_multi_worker_task()). These calls ensure that any pending
+\ref _starpu_worker_enter_sched_op() and \ref _starpu_worker_leave_sched_op()
+respectively (see these functions in use in functions \ref _starpu_get_worker_task() and
+\ref _starpu_get_multi_worker_task()). These calls ensure that any pending
 conflicting operation deferred while the worker was in the
 state_sched_op_pending state is performed in an orderly manner.
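As a rough illustration of the enter/leave bracket described above, here is a single-threaded toy model in C. It is not StarPU's implementation: the `toy_*` names, the fixed-size callback queue, and the absence of locking are all assumptions made for the sketch. It only shows the idea that a conflicting operation submitted while `state_sched_op_pending` is set gets deferred, then performed in submission order when the scheduling operation is left.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_DEFERRED 8

/* Hypothetical deferred operation: a callback plus its argument. */
typedef void (*toy_op_fn)(void *arg);

struct toy_worker {
    int state_sched_op_pending;           /* !0 while a scheduling operation runs */
    toy_op_fn deferred[TOY_MAX_DEFERRED]; /* operations postponed in the meantime */
    void *deferred_arg[TOY_MAX_DEFERRED];
    size_t nb_deferred;
};

/* Example conflicting operation used for demonstration: bump a counter. */
static void toy_increment(void *arg)
{
    ++*(int *)arg;
}

static void toy_enter_sched_op(struct toy_worker *w)
{
    w->state_sched_op_pending = 1;
}

/* Returns 1 if the operation ran immediately, 0 if it was deferred,
 * -1 if the (toy) queue is full. */
static int toy_submit_conflicting_op(struct toy_worker *w, toy_op_fn fn, void *arg)
{
    if (!w->state_sched_op_pending) {
        fn(arg);
        return 1;
    }
    if (w->nb_deferred == TOY_MAX_DEFERRED)
        return -1;
    w->deferred[w->nb_deferred] = fn;
    w->deferred_arg[w->nb_deferred] = arg;
    w->nb_deferred++;
    return 0;
}

/* Clears the flag, then performs deferred operations in submission order. */
static void toy_leave_sched_op(struct toy_worker *w)
{
    size_t i;
    w->state_sched_op_pending = 0;
    for (i = 0; i < w->nb_deferred; i++)
        w->deferred[i](w->deferred_arg[i]);
    w->nb_deferred = 0;
}
```

In this sketch, an operation submitted between `toy_enter_sched_op()` and `toy_leave_sched_op()` has no effect until the leave call, which mirrors the "deferred, then performed in an orderly manner" wording of the documentation.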

 <br>
 <b>Scheduling contexts related states</b>

-Flag state_changing_ctx_notice is set to !0 when a thread is about to
-add to a scheduling context or remove it from a scheduling context, and is
-currently waiting for a safe window to do, until the targeted worker is not in a
-scheduling operation or parallel task operation anymore. This flag set to !0 will also
+Flag \c state_changing_ctx_notice is set to \c !0 when a thread is about to
+add the worker to a scheduling context or remove it from a scheduling context, and is
+currently waiting for a safe window to do so, until the targeted worker is not in a
+scheduling operation or parallel task operation anymore. This flag set to \c !0 will also
 prevent the targeted worker to attempt a fresh scheduling operation or parallel
 task operation to avoid starving conditions. However, a scheduling operation
-that was already in process before the notice is allowed to complete.
+that was already in progress before the notice is allowed to complete.

-Flag state_changing_ctx_waiting is set to !0 when a scheduling context worker
+Flag \c state_changing_ctx_waiting is set to \c !0 when a scheduling context worker
 addition or removal involving the targeted worker is about to occur and the
 worker is currently performing a scheduling operation to tell the targeted
 worker that the initiator thread is waiting for the scheduling operation to
@@ -118,18 +118,18 @@ Any StarPU worker may participate to scheduling operations, and in this process,
 may be forced to observe state information from other workers.
 A StarPU worker thread may therefore be observed by any thread, even
 other StarPU workers. Since workers may observe each other in any order, it is
-not possible to rely exclusively on the sched_mutex of each worker to protect the
+not possible to rely exclusively on the \c sched_mutex of each worker to protect the
 observation of worker state flags by other workers, because
 worker A observing worker B would involve locking workers in (A B) sequence,
 while worker B observing worker A would involve locking workers in (B A)
 sequence, leading to lock inversion deadlocks.

 In consequence, no thread must hold more than one worker's sched_mutex at any time.
-Instead, workers implement a relaxed locking scheme based on the state_relax_refcnt
-counter, itself protected by the worker's sched_mutex. When state_relax_refcnt
-> 0, the targeted worker state flags may be observed, otherwise the thread attempting
-the observation must repeatedly wait on the targeted worker's sched_cond
-condition until state_relax_refcnt > 0.
+Instead, workers implement a relaxed locking scheme based on the \c state_relax_refcnt
+counter, itself protected by the worker's sched_mutex. When <code>state_relax_refcnt
+> 0</code>, the targeted worker state flags may be observed, otherwise the thread attempting
+the observation must repeatedly wait on the targeted worker's \c sched_cond
+condition until <code>state_relax_refcnt > 0</code>.
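The relaxed locking scheme in this paragraph can be sketched with POSIX threads as follows. This is a minimal model under stated assumptions, not StarPU code: `struct toy_worker`, the `toy_*` functions, and the single `state_flag` field are hypothetical stand-ins. Only the field names `sched_mutex`, `sched_cond`, and `state_relax_refcnt` come from the documentation. The observer takes exactly one mutex, the target's, so the (A B) versus (B A) lock inversion cannot occur.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

struct toy_worker {
    pthread_mutex_t sched_mutex;
    pthread_cond_t sched_cond;
    int state_relax_refcnt; /* > 0: state flags may be observed */
    int state_flag;         /* stand-in for any observable state flag */
};

static void toy_worker_init(struct toy_worker *w)
{
    pthread_mutex_init(&w->sched_mutex, NULL);
    pthread_cond_init(&w->sched_cond, NULL);
    w->state_relax_refcnt = 0;
    w->state_flag = 0;
}

/* Worker side: open a window during which observers may proceed. */
static void toy_relax_on(struct toy_worker *w)
{
    pthread_mutex_lock(&w->sched_mutex);
    w->state_relax_refcnt++;
    pthread_cond_broadcast(&w->sched_cond); /* wake waiting observers */
    pthread_mutex_unlock(&w->sched_mutex);
}

/* Worker side: close one level of the relaxed window. */
static void toy_relax_off(struct toy_worker *w)
{
    pthread_mutex_lock(&w->sched_mutex);
    assert(w->state_relax_refcnt > 0);
    w->state_relax_refcnt--;
    pthread_mutex_unlock(&w->sched_mutex);
}

/* Observer side: waits on the target's sched_cond until the target is
 * relaxed, then reads the flag. Holds only the target's sched_mutex. */
static int toy_observe_flag(struct toy_worker *target)
{
    int value;
    pthread_mutex_lock(&target->sched_mutex);
    while (target->state_relax_refcnt == 0)
        pthread_cond_wait(&target->sched_cond, &target->sched_mutex);
    value = target->state_flag;
    pthread_mutex_unlock(&target->sched_mutex);
    return value;
}
```

The `while` loop around `pthread_cond_wait()` corresponds to "must repeatedly wait": the condition is rechecked after every wakeup, which also covers spurious wakeups.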

 The relaxed mode, while on, can actually be seen as a transactional consistency
 model, where concurrent accesses are authorized and potential conflicts are
@@ -147,58 +147,59 @@ available to run other tasks or perform other operations. Such a waiting
 operation can therefore not start while conflicting operations such as
 scheduling operations and scheduling context resizing involving the worker are
 on-going. Conversely, these operations and others may query whether the worker is
-blocked on a parallel task entry with starpu_worker_is_blocked_in_parallel().
+blocked on a parallel task entry with \ref starpu_worker_is_blocked_in_parallel().

-The starpu_worker_is_blocked_in_parallel() function is allowed to proceed while
-and only while state_relax_refcnt > 0. Due to the relaxed worker locking scheme,
-the state_blocked_in_parallel flag of the targeted worker may change after it
+The \ref starpu_worker_is_blocked_in_parallel() function is allowed to proceed while
+and only while <code>state_relax_refcnt > 0</code>. Due to the relaxed worker locking scheme,
+the \c state_blocked_in_parallel flag of the targeted worker may change after it
 has been observed by an observer thread. In consequence, flag
-state_blocked_in_parallel_observed of the targeted worker is set to 1 by the
+\c state_blocked_in_parallel_observed of the targeted worker is set to \c 1 by the
 observer immediately after the observation to "taint" the targeted worker. The
-targeted worker will clear the state_blocked_in_parallel_observed flag tainting
+targeted worker will clear the \c state_blocked_in_parallel_observed flag tainting
 and defer the processing of parallel task related requests until a full
 scheduling operation shot completes without the
-state_blocked_in_parallel_observed flag being tainted again. The purpose of this
+\c state_blocked_in_parallel_observed flag being tainted again. The purpose of this
 tainting flag is to prevent parallel task operations to be started immediately
 after the observation of a transient scheduling state.
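The tainting protocol described in this paragraph might look like the following in a simplified, lock-free toy model. The flag names are taken from the text. The `toy_*` functions and the `pending_parallel_req` field are hypothetical, and the caller is assumed to already hold the worker's `sched_mutex`. It is a sketch of the idea, not the StarPU code path.

```c
#include <assert.h>

struct toy_worker {
    int state_relax_refcnt;
    int state_blocked_in_parallel;
    int state_blocked_in_parallel_observed; /* the "taint" */
    int pending_parallel_req;               /* hypothetical queued request */
};

/* Observer side: only legal while the target is relaxed. Reads the flag,
 * then taints the target immediately after the observation. */
static int toy_is_blocked_in_parallel(struct toy_worker *w)
{
    int blocked;
    assert(w->state_relax_refcnt > 0);
    blocked = w->state_blocked_in_parallel;
    w->state_blocked_in_parallel_observed = 1;
    return blocked;
}

/* Worker side, called at the end of each scheduling operation shot.
 * Returns 1 if a pending parallel task request was processed, 0 if it
 * was deferred (worker was tainted) or if there was nothing to do. */
static int toy_end_of_sched_shot(struct toy_worker *w)
{
    if (w->state_blocked_in_parallel_observed) {
        /* Tainted during this shot: clear the taint and defer. */
        w->state_blocked_in_parallel_observed = 0;
        return 0;
    }
    if (w->pending_parallel_req) {
        /* A full shot completed untainted: safe to process. */
        w->pending_parallel_req = 0;
        return 1;
    }
    return 0;
}
```

With this model, a request that is pending when an observation happens survives one extra shot before being processed, so the parallel task operation cannot start right after an observer saw a transient state.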

 Worker's management of parallel tasks is
 governed by the following set of state flags and counters:

-- state_blocked_in_parallel: set to !0 while the worker is currently blocked on a parallel
+- \c state_blocked_in_parallel: set to \c !0 while the worker is currently blocked on a parallel
 task;
-- state_blocked_in_parallel_observed: set to !0 to taint the worker when a
+
+- \c state_blocked_in_parallel_observed: set to \c !0 to taint the worker when a
 thread has observed the state_blocked_in_parallel flag of this worker while
- its state_relax_refcnt state counter was >0. Any pending request to add or
+ its \c state_relax_refcnt state counter was \c >0. Any pending request to add or
 remove the worker from a parallel task team will be deferred until a whole
 scheduling operation shot completes without being tainted again.

-- state_block_in_parallel_req: set to !0 when a thread is waiting on a request
+- \c state_block_in_parallel_req: set to \c !0 when a thread is waiting on a request
 for the worker to be added to a parallel task team. Must be protected by the
- worker's sched_mutex.
+ worker's \c sched_mutex.

-- state_block_in_parallel_ack: set to !0 by the worker when acknowledging a
+- \c state_block_in_parallel_ack: set to \c !0 by the worker when acknowledging a
 request for being added to a parallel task team. Must be protected by the
- worker's sched_mutex.
+ worker's \c sched_mutex.


-- state_unblock_in_parallel_req: set to !0 when a thread is waiting on a request
+- \c state_unblock_in_parallel_req: set to \c !0 when a thread is waiting on a request
 for the worker to be removed from a parallel task team. Must be protected by the
- worker's sched_mutex.
+ worker's \c sched_mutex.


-- state_unblock_in_parallel_ack: set to !0 by the worker when acknowledging a
+- \c state_unblock_in_parallel_ack: set to \c !0 by the worker when acknowledging a
 request for being removed from a parallel task team. Must be protected by the
- worker's sched_mutex.
+ worker's \c sched_mutex.


-- block_in_parallel_ref_count: counts the number of consecutive pending requests
+- \c block_in_parallel_ref_count: counts the number of consecutive pending requests
 to enter parallel task teams. Only the first of a train of requests for
 entering parallel task teams triggers the transition of the
- state_block_in_parallel_req flag from 0 to 1. Only the last of a train of
+ \c state_block_in_parallel_req flag from \c 0 to \c 1. Only the last of a train of
 requests to leave a parallel task team triggers the transition of flag
- state_unblock_in_parallel_req from 0 to 1. Must be protected by the
- worker's sched_mutex.
+ \c state_unblock_in_parallel_req from \c 0 to \c 1. Must be protected by the
+ worker's \c sched_mutex.
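The "train of requests" rule for `block_in_parallel_ref_count` in the list above can be illustrated with a small C sketch. The three field names come from the list. The `toy_*` functions are hypothetical, the acknowledgement handshake (`state_block_in_parallel_ack`, `state_unblock_in_parallel_ack`) and the resetting of the request flags are left out, and the caller is assumed to hold the worker's `sched_mutex`.

```c
#include <assert.h>

struct toy_worker {
    unsigned block_in_parallel_ref_count;
    int state_block_in_parallel_req;
    int state_unblock_in_parallel_req;
};

/* Request that the worker enter a parallel task team.
 * Returns 1 if this call raised state_block_in_parallel_req, i.e. it
 * was the first request of the train, 0 otherwise. */
static int toy_request_block(struct toy_worker *w)
{
    w->block_in_parallel_ref_count++;
    if (w->block_in_parallel_ref_count == 1) {
        w->state_block_in_parallel_req = 1;
        return 1;
    }
    return 0;
}

/* Request that the worker leave a parallel task team.
 * Returns 1 if this call raised state_unblock_in_parallel_req, i.e. it
 * was the last request of the train, 0 otherwise. */
static int toy_request_unblock(struct toy_worker *w)
{
    assert(w->block_in_parallel_ref_count > 0);
    w->block_in_parallel_ref_count--;
    if (w->block_in_parallel_ref_count == 0) {
        w->state_unblock_in_parallel_req = 1;
        return 1;
    }
    return 0;
}
```

Two nested block requests followed by two unblock requests therefore raise each request flag exactly once: on the first block call and on the second unblock call.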


 \subsubsection CoreEntitiesWorkersOperations Operations