@@ -0,0 +1,191 @@
+/* StarPU --- Runtime system for heterogeneous multicore architectures.
+ *
+ * Copyright (C) 2018 Inria
+ *
+ * StarPU is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published by
+ * the Free Software Foundation; either version 2.1 of the License, or (at
+ * your option) any later version.
+ *
+ * StarPU is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ * See the GNU Lesser General Public License in COPYING.LGPL for more details.
+ */
+
+/*! \page StarPUCore StarPU Core
+
+\section CoreEntities StarPU Core Entities
+
+TODO
+
+\subsection CoreEntitiesOverview Overview
+
+Execution entities:
+- worker: A worker (see \ref CoreEntitiesWorkers, \ref
+  CoreEntitiesWorkersAndContexts) entity is a thread created by StarPU to manage
+  one computing unit. The computing unit can be a local CPU core, an accelerator
+  or GPU device, or --- on the master side when running in master-slave
+  distributed mode --- a remote slave computing node. It is responsible for
+  querying scheduling policies for tasks to execute.
+
+- sched_context: A scheduling context (see \ref CoreEntitiesContexts, \ref
+  CoreEntitiesWorkersAndContexts) is a logical set of workers governed by an
+  instance of a scheduling policy. It defines the computing units to which the
+  scheduling policy instance may assign work entities.
+
+- driver: A driver is the set of hardware-dependent routines used by a
+  worker to initialize its associated computing unit, execute work entities on
+  it, and finalize the computing unit usage at the end of the session.
+
+Work entities:
+- task: TODO
+- job: TODO
+
+Data entities:
+- data handle
+- data replicate: TODO
+
+\subsection CoreEntitiesWorkers Workers
+
+TODO
+
+\subsubsection CoreEntitiesWorkersStates States
+
+Scheduling operations related state
+
+While a worker is conducting a scheduling operation, e.g. while it is in the
+process of selecting a new task to execute, its state_sched_op_pending flag is
+set to !0; otherwise it is set to 0.
+
+While state_sched_op_pending is !0, the following exhaustive list of operations
+on that worker is restricted as stated:
+
+- adding the worker to a context is not allowed;
+- removing the worker from a context is not allowed;
+- adding the worker to a parallel task team is not allowed;
+- removing the worker from a parallel task team is not allowed;
+- querying state information about the worker is only allowed while
+  state_relax_refcnt > 0;
+  - in particular, querying whether the worker is blocked on a parallel team
+    entry is only allowed while state_relax_refcnt > 0.
+
+Entering and leaving the state_sched_op_pending state is done through calls to
+_starpu_worker_enter_sched_op() and _starpu_worker_leave_sched_op()
+respectively (see these functions in use in _starpu_get_worker_task() and
+_starpu_get_multi_worker_task()). These calls ensure that any conflicting
+operation deferred while the worker was in the state_sched_op_pending state is
+performed in an orderly manner.
+
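The enter/leave bracket can be sketched as follows. This is a deliberately simplified, hypothetical model, not the StarPU implementation: only state_sched_op_pending is a real field name, the other fields and functions are invented for illustration, and the worker's sched_mutex protection is omitted for brevity.

```c
#include <assert.h>

/* Hypothetical model of a worker; in StarPU all of this state is
 * protected by the worker's sched_mutex (omitted here). */
struct worker_model
{
	int state_sched_op_pending;  /* !0 while a scheduling operation runs */
	int deferred_ctx_change;     /* conflicting request noted while busy */
	int ctx_changes_applied;     /* context changes actually performed */
};

static void enter_sched_op(struct worker_model *w)
{
	w->state_sched_op_pending = 1;
}

static void leave_sched_op(struct worker_model *w)
{
	w->state_sched_op_pending = 0;
	/* process any conflicting operation deferred during the sched op */
	if (w->deferred_ctx_change)
	{
		w->deferred_ctx_change = 0;
		w->ctx_changes_applied++;
	}
}

/* An initiator thread wanting to change the worker's contexts: apply the
 * change immediately if possible, otherwise defer it until leave time. */
static void request_ctx_change(struct worker_model *w)
{
	if (w->state_sched_op_pending)
		w->deferred_ctx_change = 1;
	else
		w->ctx_changes_applied++;
}
```

A request arriving while the bracket is open is not lost: it is simply replayed when the worker leaves the scheduling operation.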
+
+Scheduling contexts related states
+
+Flag state_changing_ctx_notice is set to !0 when a thread is about to add the
+targeted worker to a scheduling context or remove it from one, and is currently
+waiting for a safe window to do so, that is, until the targeted worker is no
+longer in a scheduling operation or parallel task operation. While this flag is
+!0, the targeted worker will not attempt a fresh scheduling operation or
+parallel task operation, so as to avoid starving the initiator thread. However,
+a scheduling operation that was already in progress before the notice is
+allowed to complete.
+
+Flag state_changing_ctx_waiting is set to !0 when a scheduling context worker
+addition or removal involving the targeted worker is about to occur while that
+worker is performing a scheduling operation. It tells the targeted worker that
+the initiator thread is waiting for the scheduling operation to complete and
+should be woken up when it does.
+
+Relaxed synchronization related states
+
+Any StarPU worker may participate in scheduling operations, and in this process
+may have to observe state information from other workers. A StarPU worker
+thread may therefore be observed by any thread, including other StarPU workers.
+Since workers may observe each other in any order, it is not possible to rely
+exclusively on the sched_mutex of each worker to protect the observation of
+worker state flags by other workers: worker A observing worker B would lock the
+workers in (A, B) order, while worker B observing worker A would lock them in
+(B, A) order, leading to lock inversion deadlocks.
+
+In consequence, no thread must hold more than one worker's sched_mutex at any
+time. Instead, workers implement a relaxed locking scheme based on the
+state_relax_refcnt counter, itself protected by the worker's sched_mutex. When
+state_relax_refcnt > 0, the targeted worker's state flags may be observed;
+otherwise, the thread attempting the observation must wait on the targeted
+worker's sched_cond condition variable until state_relax_refcnt > 0.
+
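The observation side of this scheme can be sketched as below. This is a minimal, hypothetical model assuming POSIX threads: state_relax_refcnt, sched_mutex, sched_cond and state_blocked_in_parallel mirror the names used above, but the helper functions are invented for illustration and do not correspond to the StarPU API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical worker model for the relaxed observation scheme. */
struct worker_model
{
	pthread_mutex_t sched_mutex;
	pthread_cond_t sched_cond;
	int state_relax_refcnt;        /* >0: state may be observed */
	int state_blocked_in_parallel; /* example state flag to observe */
};

/* The worker brackets the sections where its state may safely be observed,
 * waking up any thread waiting for an observation window. */
static void worker_relax_on(struct worker_model *w)
{
	pthread_mutex_lock(&w->sched_mutex);
	w->state_relax_refcnt++;
	pthread_cond_broadcast(&w->sched_cond);
	pthread_mutex_unlock(&w->sched_mutex);
}

static void worker_relax_off(struct worker_model *w)
{
	pthread_mutex_lock(&w->sched_mutex);
	w->state_relax_refcnt--;
	pthread_mutex_unlock(&w->sched_mutex);
}

/* An observer never holds more than this one worker's sched_mutex: it waits
 * on sched_cond until state_relax_refcnt > 0, then reads the flag. */
static int observe_blocked_in_parallel(struct worker_model *w)
{
	int value;
	pthread_mutex_lock(&w->sched_mutex);
	while (w->state_relax_refcnt == 0)
		pthread_cond_wait(&w->sched_cond, &w->sched_mutex);
	value = w->state_blocked_in_parallel;
	pthread_mutex_unlock(&w->sched_mutex);
	return value;
}
```

Because an observer only ever takes the targeted worker's sched_mutex, the (A, B) versus (B, A) lock-ordering problem described above cannot arise.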
+
+Parallel tasks related states
+
+When a worker is scheduled to participate in the execution of a parallel task,
+it must wait for the whole team of workers participating in the execution of
+this task to be ready. While the worker waits for its teammates, it is not
+available to run other tasks or perform other operations. Such a waiting
+operation can therefore not start while conflicting operations such as
+scheduling operations and scheduling context resizing involving the worker are
+ongoing. Conversely, these and other operations may query whether the worker is
+blocked on a parallel task entry with starpu_worker_is_blocked_in_parallel().
+
+The starpu_worker_is_blocked_in_parallel() function is allowed to proceed
+while, and only while, state_relax_refcnt > 0. Due to the relaxed worker
+locking scheme, the state_blocked_in_parallel flag of the targeted worker may
+change after it has been observed by an observer thread. In consequence, the
+state_blocked_in_parallel_observed flag of the targeted worker is set to 1 by
+the observer immediately after the observation, to "taint" the targeted worker.
+The targeted worker will clear the state_blocked_in_parallel_observed taint and
+defer the processing of parallel task related requests until a full scheduling
+operation shot completes without the state_blocked_in_parallel_observed flag
+being set again. The purpose of this tainting flag is to prevent parallel task
+operations from being started immediately after the observation of a transient
+scheduling state.
+
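The taint-and-defer protocol can be sketched as follows. This is a hypothetical single-threaded model for illustration only: state_blocked_in_parallel and state_blocked_in_parallel_observed are the flags described above, while the functions and remaining fields are invented, and the sched_mutex protection is omitted for brevity.

```c
#include <assert.h>

/* Hypothetical worker model for the taint protocol. */
struct worker_model
{
	int state_blocked_in_parallel;
	int state_blocked_in_parallel_observed; /* the taint */
	int pending_parallel_req;               /* deferred team request */
	int parallel_reqs_processed;
};

/* An observer reads the flag, then immediately taints the worker. */
static int observe_blocked(struct worker_model *w)
{
	int value = w->state_blocked_in_parallel;
	w->state_blocked_in_parallel_observed = 1;
	return value;
}

/* One scheduling operation shot as seen from the targeted worker: clear the
 * taint at the beginning, then process deferred parallel task requests at the
 * end only if no observation re-tainted the worker in between. */
static void sched_op_shot(struct worker_model *w, int observed_during_shot)
{
	w->state_blocked_in_parallel_observed = 0;
	if (observed_during_shot)
		observe_blocked(w); /* a concurrent observer taints again */
	if (!w->state_blocked_in_parallel_observed && w->pending_parallel_req)
	{
		w->pending_parallel_req = 0;
		w->parallel_reqs_processed++;
	}
}
```

As long as observers keep tainting the worker, deferred parallel task requests stay pending; the first clean shot drains them.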
+A worker's management of parallel tasks is governed by the following set of
+state flags and counters:
+
+- state_blocked_in_parallel: set to !0 while the worker is currently blocked on
+  a parallel task;
+
+- state_blocked_in_parallel_observed: set to !0 to taint the worker when a
+  thread has observed the state_blocked_in_parallel flag of this worker while
+  its state_relax_refcnt state counter was > 0. Any pending request to add or
+  remove the worker from a parallel task team will be deferred until a whole
+  scheduling operation shot completes without the worker being tainted again.
+
+- state_block_in_parallel_req: set to !0 when a thread is waiting on a request
+  for the worker to be added to a parallel task team. Must be protected by the
+  worker's sched_mutex.
+
+- state_block_in_parallel_ack: set to !0 by the worker when acknowledging a
+  request for being added to a parallel task team. Must be protected by the
+  worker's sched_mutex.
+
+- state_unblock_in_parallel_req: set to !0 when a thread is waiting on a
+  request for the worker to be removed from a parallel task team. Must be
+  protected by the worker's sched_mutex.
+
+- state_unblock_in_parallel_ack: set to !0 by the worker when acknowledging a
+  request for being removed from a parallel task team. Must be protected by the
+  worker's sched_mutex.
+
+- block_in_parallel_ref_count: counts the number of consecutive pending
+  requests to enter parallel task teams. Only the first of a train of requests
+  for entering parallel task teams triggers the transition of the
+  state_block_in_parallel_req flag from 0 to 1. Only the last of a train of
+  requests to leave a parallel task team triggers the transition of flag
+  state_unblock_in_parallel_req from 0 to 1. Must be protected by the worker's
+  sched_mutex.
+
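The request-train counting described in the last item can be sketched as below. This is a hypothetical model: block_in_parallel_ref_count, state_block_in_parallel_req and state_unblock_in_parallel_req are the fields described above, the functions are invented for illustration, and the sched_mutex protection is omitted for brevity.

```c
#include <assert.h>

/* Hypothetical worker model for counting trains of team requests. */
struct worker_model
{
	int block_in_parallel_ref_count;
	int state_block_in_parallel_req;
	int state_unblock_in_parallel_req;
};

static void request_block_in_parallel(struct worker_model *w)
{
	if (w->block_in_parallel_ref_count == 0)
		/* first request of the train: ask the worker to block */
		w->state_block_in_parallel_req = 1;
	w->block_in_parallel_ref_count++;
}

static void request_unblock_in_parallel(struct worker_model *w)
{
	w->block_in_parallel_ref_count--;
	if (w->block_in_parallel_ref_count == 0)
		/* last request of the train: ask the worker to unblock */
		w->state_unblock_in_parallel_req = 1;
}
```

Only the 0 → 1 transition of the counter raises the block request, and only the 1 → 0 transition raises the unblock request; intermediate requests merely adjust the count.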
+
+\subsection CoreEntitiesContexts Scheduling Contexts
+
+TODO
+
+\subsection CoreEntitiesWorkersAndContexts Workers and Scheduling Contexts
+
+TODO
+
+*/
+