/* StarPU --- Runtime system for heterogeneous multicore architectures.
 *
 * Copyright (C) 2013-2020 Université de Bordeaux, CNRS (LaBRI UMR 5800), Inria
 * Copyright (C) 2013 Simon Archipoff
 *
 * StarPU is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or (at
 * your option) any later version.
 *
 * StarPU is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * See the GNU Lesser General Public License in COPYING.LGPL for more details.
 */
/*! \page HowToDefineANewSchedulingPolicy How To Define A New Scheduling Policy

\section NewSchedulingPolicy_Introduction Introduction

StarPU provides two ways of defining a scheduling policy: a basic monolithic
way, and a modular way.

The basic monolithic way is directly connected with the core of StarPU, which
means that the policy then has to handle all performance details, such as data
prefetching, task performance model calibration, worker locking, etc.
<c>examples/scheduler/dummy_sched.c</c> is a trivial example which does not
handle this, and thus e.g. does not achieve any data prefetching or smart
scheduling.

The modular way allows implementing just one component and
reusing existing components to cope with all these details.
<c>examples/scheduler/dummy_modular_sched.c</c> is a trivial example very
similar to <c>dummy_sched.c</c>, but implemented as a component, which makes it
possible to assemble it with other components, and notably get data prefetching
support for free; task performance model calibration is also properly performed,
which makes it easy to extend it to take task durations into account, etc.
\section SchedulingHelpers Helper functions for defining a scheduling policy (Basic or modular)

Make sure to have a look at the \ref API_Scheduling_Policy section, which
provides a complete list of the functions available for writing advanced schedulers.

This includes getting an estimation of the computation time of a task with
starpu_task_expected_length(), of the required data transfers with
starpu_task_expected_data_transfer_time_for(), of the required energy with
starpu_task_expected_energy(), etc. Per-worker variants are also available with
starpu_task_worker_expected_length(), etc. Other
useful functions include starpu_transfer_bandwidth(), starpu_transfer_latency(),
starpu_transfer_predict(), ...

One can also directly test the presence of a data handle with starpu_data_is_on_node().
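These helpers can be combined into a per-worker completion estimate. Below is a
minimal sketch (variable names are illustrative; it assumes the task's codelet
has a calibrated performance model, the expected length being \c NaN otherwise):

\code{.c}
/* Estimate the time to run `task` on worker `workerid` with implementation 0:
 * expected computation time plus expected time to transfer the input data to
 * the worker's memory node. */
struct starpu_perfmodel_arch *arch = starpu_worker_get_perf_archtype(workerid, sched_ctx_id);
double length = starpu_task_expected_length(task, arch, 0);
double transfer = starpu_task_expected_data_transfer_time_for(task, starpu_worker_get_memory_node(workerid));
double estimate = isnan(length) ? 0.0 : length + transfer;
\endcode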
Prefetches can be triggered by calling either starpu_prefetch_task_input_for(),
starpu_idle_prefetch_task_input_for(), starpu_prefetch_task_input_for_prio(), or
starpu_idle_prefetch_task_input_for_prio(). The <c>_prio</c> versions allow
specifying a priority for the transfer (instead of taking the task priority by
default). These prefetches are only processed when there are no fetch requests
(i.e. data requests that a task is actually waiting for) to process. The <c>_idle</c> versions
queue the transfers on the idle prefetch queue, which is only processed when
there are no non-idle prefetches to process.

starpu_get_prefetch_flag() is a convenient helper for checking the value of the
\ref STARPU_PREFETCH environment variable.

When a scheduler does such prefetching, it should set the <c>prefetches</c>
field of its <c>starpu_sched_policy</c> structure to 1, to prevent the core from
triggering its own prefetching.
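For instance, a scheduler which has just decided that <c>task</c> will run on
<c>workerid</c> could trigger the prefetch as follows (a minimal sketch, with
illustrative variable names):

\code{.c}
/* Prefetch the task's input data to the memory node of the chosen worker,
 * unless prefetching was disabled through STARPU_PREFETCH. */
unsigned node = starpu_worker_get_memory_node(workerid);
if (starpu_get_prefetch_flag())
	starpu_prefetch_task_input_for(task, node);
\endcode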
Usual functions can be used on tasks; for instance, one can use the following to
get the data size for a task:

\code{.c}
size_t size = 0;
size_t write = 0;
unsigned i;
if (task->cl)
	for (i = 0; i < STARPU_TASK_GET_NBUFFERS(task); i++)
	{
		starpu_data_handle_t data = STARPU_TASK_GET_HANDLE(task, i);
		size_t datasize = starpu_data_get_size(data);
		size += datasize;
		if (STARPU_TASK_GET_MODE(task, i) & STARPU_W)
			write += datasize;
	}
\endcode

Task queues can be implemented with the starpu_task_list functions.

Access to the \c hwloc topology is available with starpu_worker_get_hwloc_obj().
\section DefiningANewBasicSchedulingPolicy Defining A New Basic Scheduling Policy

A full example showing how to define a new scheduling policy is available in
the StarPU sources in <c>examples/scheduler/dummy_sched.c</c>.

The scheduler has to provide the following methods:

\code{.c}
static struct starpu_sched_policy dummy_sched_policy =
{
	.init_sched = init_dummy_sched,
	.deinit_sched = deinit_dummy_sched,
	.add_workers = dummy_sched_add_workers,
	.remove_workers = dummy_sched_remove_workers,
	.push_task = push_task_dummy,
	.pop_task = pop_task_dummy,
	.policy_name = "dummy",
	.policy_description = "dummy scheduling strategy"
};
\endcode
The idea is that when a task becomes ready for execution, the
starpu_sched_policy::push_task method is called to give the ready task to the
scheduler. When a worker is idle, the starpu_sched_policy::pop_task method is
called to get a task from the scheduler. It is up to the
scheduler to implement what happens in between. A simple eager scheduler can for
instance make starpu_sched_policy::push_task push the task to a global list, and
make starpu_sched_policy::pop_task pop from this list. A scheduler can also use
starpu_push_local_task() to directly push tasks to a per-worker queue, in which
case it does not even need to implement starpu_sched_policy::pop_task.

If there are no ready tasks within the scheduler, it can just return \c NULL, and
the worker will sleep.
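As an illustration, here is a minimal sketch of such an eager push/pop pair,
using a global starpu_task_list protected by a mutex, in the spirit of
<c>dummy_sched.c</c> (structure and variable names are illustrative):

\code{.c}
struct dummy_sched_data
{
	struct starpu_task_list sched_list;
	starpu_pthread_mutex_t policy_mutex;
};

static int push_task_dummy(struct starpu_task *task)
{
	struct dummy_sched_data *data = (struct dummy_sched_data*)starpu_sched_ctx_get_policy_data(task->sched_ctx);

	STARPU_PTHREAD_MUTEX_LOCK(&data->policy_mutex);
	starpu_task_list_push_back(&data->sched_list, task);
	STARPU_PTHREAD_MUTEX_UNLOCK(&data->policy_mutex);
	/* A real scheduler would also wake up an idle worker here, see below. */
	return 0;
}

static struct starpu_task *pop_task_dummy(unsigned sched_ctx_id)
{
	struct dummy_sched_data *data = (struct dummy_sched_data*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
	struct starpu_task *task = NULL;

	STARPU_PTHREAD_MUTEX_LOCK(&data->policy_mutex);
	if (!starpu_task_list_empty(&data->sched_list))
		task = starpu_task_list_pop_front(&data->sched_list);
	STARPU_PTHREAD_MUTEX_UNLOCK(&data->policy_mutex);
	return task;
}
\endcode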
The \ref starpu_sched_policy section provides the exact rules that govern the
methods of the policy.

One can enumerate the workers with this iterator:

\code{.c}
struct starpu_worker_collection *workers = starpu_sched_ctx_get_worker_collection(sched_ctx_id);
struct starpu_sched_ctx_iterator it;

workers->init_iterator(workers, &it);
while(workers->has_next(workers, &it))
{
	unsigned worker = workers->get_next(workers, &it);
	...
}
\endcode
To provide synchronization between workers, a per-worker lock exists to protect
the data structures of a given worker. It is acquired around scheduler methods,
so that the scheduler does not need any additional mutex to protect its per-worker data.

In case the scheduler wants to access another worker's data, it should use
starpu_worker_lock() and starpu_worker_unlock().

Calling \code{.c}starpu_worker_lock(B)\endcode from a worker \c A will however make
worker \c A wait for worker \c B to complete its scheduling method. That may be
a problem if that method takes a long time, because it is e.g. computing a
heuristic or waiting for another mutex, or may even cause deadlocks if worker \c B is
calling \code{.c}starpu_worker_lock(A)\endcode at the same time. In such a case, worker \c B must
call starpu_worker_relax_on() and starpu_worker_relax_off() around the section
which potentially blocks (and does not actually need protection). While a worker
is in relaxed mode, e.g. between a pair of starpu_worker_relax_on() and
starpu_worker_relax_off() calls, its state can be altered by other threads: for
instance, worker \c A can push tasks for worker \c B. Consequently, worker \c B
must re-assess its state after \code{.c}starpu_worker_relax_off(B)\endcode, such as taking
into account possible new tasks pushed to its queue.
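A sketch of this pattern, where the calling worker needs to look at the
scheduling data of another worker whose id is \c other (illustrative):

\code{.c}
/* We may block on the other worker's scheduling lock: enter relaxed mode so
 * that other threads can meanwhile alter our own state (e.g. push us tasks). */
starpu_worker_relax_on();
starpu_worker_lock(other);
/* ... inspect or modify the other worker's scheduling data ... */
starpu_worker_unlock(other);
starpu_worker_relax_off();
/* Our own queue may have changed while we were relaxed: re-check it before
 * concluding that there is nothing to execute. */
\endcode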
When the starpu_sched_policy::push_task method has pushed a task for another
worker, one has to call starpu_wake_worker_relax_light() so that this worker wakes up
and picks it up. If the task was pushed on a shared queue, one may want to only
wake one idle worker, as sketched below. An example doing this is available in
<c>src/sched_policies/eager_central_policy.c</c>.
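\code{.c}
/* Wake up a single idle worker of the context after pushing to a shared queue
 * (a sketch following the pattern of eager_central_policy.c). */
struct starpu_worker_collection *workers = starpu_sched_ctx_get_worker_collection(sched_ctx_id);
struct starpu_sched_ctx_iterator it;

workers->init_iterator(workers, &it);
while (workers->has_next(workers, &it))
{
	unsigned worker = workers->get_next(workers, &it);
	if (starpu_wake_worker_relax_light(worker))
		/* This worker was sleeping and has been woken up: one is enough. */
		break;
}
\endcode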
A pointer to a data structure specific to the scheduler can be set with
starpu_sched_ctx_set_policy_data() and fetched with
starpu_sched_ctx_get_policy_data(). Per-worker data structures can then be
stored in it by allocating a \ref STARPU_NMAXWORKERS -sized array of structures indexed
by workers, as sketched below.
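A minimal sketch of this pattern (structure and function names are illustrative):

\code{.c}
/* One task queue per worker, stored in the policy data of the context. */
struct my_policy_data
{
	struct starpu_task_list queues[STARPU_NMAXWORKERS];
};

static void init_my_policy(unsigned sched_ctx_id)
{
	struct my_policy_data *data = calloc(1, sizeof(*data));
	unsigned i;
	for (i = 0; i < STARPU_NMAXWORKERS; i++)
		starpu_task_list_init(&data->queues[i]);
	starpu_sched_ctx_set_policy_data(sched_ctx_id, data);
}

/* Later, e.g. in the pop_task method, the calling worker's queue is: */
struct my_policy_data *data = (struct my_policy_data*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
struct starpu_task_list *my_queue = &data->queues[starpu_worker_get_id()];
\endcode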
A variety of examples of
advanced schedulers can be read in <c>src/sched_policies</c>, for
instance <c>random_policy.c</c>, <c>eager_central_policy.c</c>, or
<c>work_stealing_policy.c</c>. Code protected by
<c>if (_starpu_get_nsched_ctxs() > 1)</c> can be ignored; it is for scheduling
contexts, which are an experimental feature.
\section DefiningANewModularSchedulingPolicy Defining A New Modular Scheduling Policy

StarPU's Modularized Schedulers are made of individual Scheduling Components
assembled into a Scheduling Tree. Each Scheduling Component has a
unique purpose, such as prioritizing tasks or mapping tasks over resources.
A typical Scheduling Tree is shown below.
<pre>
                                  |
              starpu_push_task    |
                                  |
                                  v
                           Fifo_Component
                                |  ^
                         Push   |  |  Can_Push
                                v  |
                           Eager_Component
                                |  ^
                                |  |
                                v  |
                 --------><-------------------><---------
                 |  ^                          |  ^
           Push  |  |  Can_Push          Push  |  |  Can_Push
                 v  |                          v  |
            Fifo_Component                Fifo_Component
                 |  ^                          |  ^
           Pull  |  |  Can_Pull          Pull  |  |  Can_Pull
                 v  |                          v  |
           Worker_Component               Worker_Component
                 |                             |
 starpu_pop_task |                             |
                 v                             v
</pre>
When a task is pushed by StarPU in a Modularized Scheduler, the task moves from
one Scheduling Component to another, following the hierarchy of the
Scheduling Tree, and is stored in one of the Scheduling Components of the
strategy.

When a worker wants to pop a task from the Modularized Scheduler, the
corresponding Worker Component of the Scheduling Tree tries to pull a task from
its parents, following the hierarchy, and gives it to the worker if it succeeded
in getting one.
\subsection Interface

Each Scheduling Component must follow the following pre-defined Interface
to be able to interact with other Scheduling Components.

- push_task (child_component, Task) \n
  The calling Scheduling Component transfers a task to its
  Child Component. When the Push function returns, the task no longer
  belongs to the calling Component. The Modularized Schedulers'
  model relies on this function to perform prefetching.
  See starpu_sched_component::push_task for more details.
- pull_task (parent_component, caller_component) -> Task \n
  The calling Scheduling Component requests a task from
  its Parent Component. When the Pull function ends, the returned
  task belongs to the calling Component.
  See starpu_sched_component::pull_task for more details.
- can_push (caller_component, parent_component) \n
  The calling Scheduling Component notifies its Parent Component that
  it is ready to accept new tasks.
  See starpu_sched_component::can_push for more details.
- can_pull (caller_component, child_component) \n
  The calling Scheduling Component notifies its Child Component
  that it is ready to give new tasks.
  See starpu_sched_component::can_pull for more details.

The components also provide the following useful methods:

- starpu_sched_component::estimated_load provides an estimated load of
  the component.
- starpu_sched_component::estimated_end provides an estimated date of
  availability of the workers behind the component, after processing the tasks in
  the component and below.
  This is computed only if the estimated fields of the tasks have been set
  before passing them to the component.
\subsection BuildAModularizedScheduler Building a Modularized Scheduler

\subsubsection PreImplementedComponents Pre-implemented Components

StarPU is currently shipped with the following four kinds of Scheduling Components:

- Storage Components: Fifo, Prio \n
  Components which store tasks. They can also prioritize them if
  they have a defined priority. It is possible to define a threshold
  for these Components following two criteria: the number of tasks
  stored in the Component, or the sum of the expected lengths of all
  tasks stored in the Component. When a push operation tries to queue a
  task beyond the threshold, the push fails. When some task leaves the
  queue (and thus possibly more tasks can fit), this component calls
  can_push from its ancestors.
- Resource-Mapping Components: Mct, Heft, Eager, Random, Work-Stealing \n
  The "core" of the Scheduling Strategy, these Components are the
  ones which make scheduling choices between their child components.
- Worker Components: Worker \n
  Each Worker Component models a concrete worker, and copes with the
  technical tricks of interacting with the StarPU core. Modular schedulers
  thus usually have them at the bottom of their component tree.
- Special-Purpose Components: Perfmodel_Select, Best_Implementation \n
  Components dedicated to specific purposes. The Perfmodel_Select
  Component decides which Resource-Mapping Component should be used to
  schedule a task: a component that handles tasks with a calibrated
  performance model; a component for not-yet-calibrated tasks, which
  distributes them to get measurements done as quickly as possible; and a
  component that takes the tasks without performance models.\n
  The Best_Implementation Component chooses which
  implementation of a task should be used on the chosen resource.
\subsubsection ProgressionAndValidationRules Progression And Validation Rules

Some rules must be followed to ensure the correctness of a Modularized
Scheduler:

- At least one Storage Component without a threshold is needed in a
  Modularized Scheduler, to store incoming tasks from StarPU. It can for
  instance be a global component at the top of the tree, or one component
  per worker at the bottom of the tree, or an intermediate assembly. The
  important point is that the starpu_sched_component::push_task call at the top cannot
  fail, so there has to be a storage component without a threshold between
  the top of the tree and the first storage component with a threshold, or
  the workers themselves.
- At least one Resource-Mapping Component is needed in a Modularized
  Scheduler. Resource-Mapping Components are the only ones which can make
  scheduling choices, and so the only ones which can have several children.
\subsubsection ModularizedSchedulerLocking Locking in modularized schedulers

Most often, components do not need to take locks. This allows e.g. the push
operation to be called in parallel when tasks get released in parallel from
different workers which have completed different ancestor tasks.

When a component has internal information which needs to be kept coherent, the
component can define its own lock and take it as it sees fit, e.g. to protect a
task queue. This may however limit the scalability of the scheduler. Conversely,
since push and pull operations will be called concurrently from different
workers, the component might prefer to use a central mutex to serialize all
scheduling decisions, to avoid pathological cases (e.g. all push calls deciding
to put their task on the same target).
\subsubsection ImplementAModularizedScheduler Implementing a Modularized Scheduler

The following code shows how to implement a Tree-Eager-Prefetching Scheduler.

\code{.c}
static void initialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
	/* The eager component will decide for each task which worker will run it,
	 * and we want fifos both above and below the component */
	starpu_sched_component_initialize_simple_scheduler(
		starpu_sched_component_eager_create, NULL,
		STARPU_SCHED_SIMPLE_DECIDE_WORKERS |
		STARPU_SCHED_SIMPLE_FIFO_ABOVE |
		STARPU_SCHED_SIMPLE_FIFOS_BELOW,
		sched_ctx_id);
}

/* Properly destroy the Scheduling Tree and all its Components */
static void deinitialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
	struct starpu_sched_tree *tree = (struct starpu_sched_tree*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
	starpu_sched_tree_destroy(tree);
}

/* Initialize the starpu_sched_policy struct associated to the Modularized
 * Scheduler: only init_sched and deinit_sched need to be defined to
 * implement a Modularized Scheduler */
struct starpu_sched_policy _starpu_sched_tree_eager_prefetching_policy =
{
	.init_sched = initialize_eager_prefetching_center_policy,
	.deinit_sched = deinitialize_eager_prefetching_center_policy,
	.add_workers = starpu_sched_tree_add_workers,
	.remove_workers = starpu_sched_tree_remove_workers,
	.push_task = starpu_sched_tree_push_task,
	.pop_task = starpu_sched_tree_pop_task,
	.pre_exec_hook = starpu_sched_component_worker_pre_exec_hook,
	.post_exec_hook = starpu_sched_component_worker_post_exec_hook,
	.pop_every_task = NULL,
	.policy_name = "tree-eager-prefetching",
	.policy_description = "eager with prefetching tree policy"
};
\endcode
starpu_sched_component_initialize_simple_scheduler() is a helper function which
makes it trivial to assemble a modular scheduler around a scheduling
decision component, as seen above (here, a simple eager decision component). Most
often a modular scheduler can be implemented that way.

A modular scheduler can also be constructed hierarchically with
starpu_sched_component_composed_recipe_create().

That modular scheduler can also be built by hand in the following way:
\code{.c}
#define _STARPU_SCHED_NTASKS_THRESHOLD_DEFAULT 2
#define _STARPU_SCHED_EXP_LEN_THRESHOLD_DEFAULT 1000000000.0

static void initialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
	unsigned ntasks_threshold = _STARPU_SCHED_NTASKS_THRESHOLD_DEFAULT;
	double exp_len_threshold = _STARPU_SCHED_EXP_LEN_THRESHOLD_DEFAULT;

	[...]

	starpu_sched_ctx_create_worker_collection
		(sched_ctx_id, STARPU_WORKER_LIST);

	/* Create the Scheduling Tree */
	struct starpu_sched_tree *t = starpu_sched_tree_create(sched_ctx_id);

	/* The Root Component is a Flow-control Fifo Component */
	t->root = starpu_sched_component_fifo_create(NULL);

	/* The Resource-mapping Component of the strategy is an Eager Component */
	struct starpu_sched_component *eager_component = starpu_sched_component_eager_create(NULL);

	/* Create links between Components: the Eager Component is the child
	 * of the Root Component */
	starpu_sched_component_connect(t->root, eager_component);

	/* A task threshold is set for the Flow-control Components which will
	 * be connected to Worker Components. By doing so, this Modularized
	 * Scheduler will be able to perform some prefetching on the resources */
	struct starpu_sched_component_fifo_data fifo_data =
	{
		.ntasks_threshold = ntasks_threshold,
		.exp_len_threshold = exp_len_threshold,
	};

	unsigned i;
	for(i = 0; i < starpu_worker_get_count() + starpu_combined_worker_get_count(); i++)
	{
		/* Each Worker Component has a Flow-control Fifo Component as
		 * father */
		struct starpu_sched_component *worker_component = starpu_sched_component_worker_new(i);
		struct starpu_sched_component *fifo_component = starpu_sched_component_fifo_create(&fifo_data);
		starpu_sched_component_connect(fifo_component, worker_component);

		/* Each Flow-control Fifo Component associated to a Worker
		 * Component is linked to the Eager Component as one of its
		 * children */
		starpu_sched_component_connect(eager_component, fifo_component);
	}

	starpu_sched_tree_update_workers(t);
	starpu_sched_ctx_set_policy_data(sched_ctx_id, (void*)t);
}

/* Properly destroy the Scheduling Tree and all its Components */
static void deinitialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
	struct starpu_sched_tree *tree = (struct starpu_sched_tree*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
	starpu_sched_tree_destroy(tree);
	starpu_sched_ctx_delete_worker_collection(sched_ctx_id);
}

/* Initialize the starpu_sched_policy struct associated to the Modularized
 * Scheduler: only init_sched and deinit_sched need to be defined to
 * implement a Modularized Scheduler */
struct starpu_sched_policy _starpu_sched_tree_eager_prefetching_policy =
{
	.init_sched = initialize_eager_prefetching_center_policy,
	.deinit_sched = deinitialize_eager_prefetching_center_policy,
	.add_workers = starpu_sched_tree_add_workers,
	.remove_workers = starpu_sched_tree_remove_workers,
	.push_task = starpu_sched_tree_push_task,
	.pop_task = starpu_sched_tree_pop_task,
	.pre_exec_hook = starpu_sched_component_worker_pre_exec_hook,
	.post_exec_hook = starpu_sched_component_worker_post_exec_hook,
	.pop_every_task = NULL,
	.policy_name = "tree-eager-prefetching",
	.policy_description = "eager with prefetching tree policy"
};
\endcode
Other modular scheduler examples can be seen in <c>src/sched_policies/modular_*.c</c>.

For instance, \c modular-heft-prio needs performance models, makes its decision
between memory nodes, uses prioritized fifos above and below, and decides the best
implementation, as sketched below.
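With the starpu_sched_component_initialize_simple_scheduler() helper, that set of
features roughly maps to the following flags (a sketch only; the exact
construction used by \c modular-heft-prio should be checked in its source under
<c>src/sched_policies/</c>):

\code{.c}
starpu_sched_component_initialize_simple_scheduler(
	starpu_sched_component_heft_create, NULL,
	STARPU_SCHED_SIMPLE_DECIDE_MEMNODES |   /* decide between memory nodes rather than workers */
	STARPU_SCHED_SIMPLE_PERFMODEL |         /* needs calibrated performance models */
	STARPU_SCHED_SIMPLE_IMPL |              /* also pick the best implementation */
	STARPU_SCHED_SIMPLE_FIFO_ABOVE | STARPU_SCHED_SIMPLE_FIFO_ABOVE_PRIO |
	STARPU_SCHED_SIMPLE_FIFOS_BELOW | STARPU_SCHED_SIMPLE_FIFOS_BELOW_PRIO,
	sched_ctx_id);
\endcode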
If unsure about the result of the modular scheduler construction, you can run a
simple application with FxT enabled (see \ref GeneratingTracesWithFxT), and open
the generated file \c trace.html in a web browser.
\subsection ModularizedSchedulersAndParallelTasks Management of parallel tasks

At the moment, parallel tasks can be managed in modularized schedulers through
combined workers: instead of connecting a scheduling component to a worker
component, one can connect it to a combined worker component (i.e. a worker
component created with a combined worker id). That component will handle
creating task aliases for parallel execution and pushing them to the different
worker components.
\subsection WriteASchedulingComponent Writing a Scheduling Component

\subsubsection GenericSchedulingComponent Generic Scheduling Component

Each Scheduling Component is instantiated from a Generic Scheduling Component,
which implements a generic version of the Interface. The generic implementations
of the Pull, Can_Pull and Can_Push functions are recursive calls to their parents
(respectively to their children). However, as a Generic Scheduling Component does
not know how many children it will have when it is instantiated, it does
not implement the Push function.
\subsubsection InstantiationRedefineInterface Instantiation: Redefining the Interface

A Scheduling Component must implement all the functions of the Interface. It is
therefore necessary to implement a Push function to instantiate a Scheduling Component.
The implemented Push function is the "fingerprint" of a Scheduling Component.
Depending on the functionalities or properties programmers want to give
to the Scheduling Component they are implementing, it is possible to reimplement
all the functions of the Interface. For example, a Flow-control Component
reimplements the Pull and the Can_Push functions of the Interface, allowing it
to catch the generic recursive calls of these functions. The Pull function of
a Flow-control Component can, for example, pop a task from the local storage
queue of the Component, and give it to the calling Component which asked for it.
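As an illustration, here is a minimal, hypothetical skeleton of such a
storage-like component. The <c>my_</c> names are purely illustrative, and the
exact prototypes of the starpu_sched_component::push_task and
starpu_sched_component::pull_task fields should be checked against the
reference documentation:

\code{.c}
struct my_container_data
{
	struct starpu_task_list list;
	starpu_pthread_mutex_t mutex;
};

static int my_container_push_task(struct starpu_sched_component *component, struct starpu_task *task)
{
	struct my_container_data *data = component->data;
	STARPU_PTHREAD_MUTEX_LOCK(&data->mutex);
	starpu_task_list_push_back(&data->list, task);
	STARPU_PTHREAD_MUTEX_UNLOCK(&data->mutex);
	/* A real component would now notify its children through the can_pull hook. */
	return 0;
}

static struct starpu_task *my_container_pull_task(struct starpu_sched_component *component, struct starpu_sched_component *to)
{
	struct my_container_data *data = component->data;
	struct starpu_task *task = NULL;
	(void)to;
	STARPU_PTHREAD_MUTEX_LOCK(&data->mutex);
	if (!starpu_task_list_empty(&data->list))
		task = starpu_task_list_pop_front(&data->list);
	STARPU_PTHREAD_MUTEX_UNLOCK(&data->mutex);
	return task;
}

struct starpu_sched_component *my_container_create(struct starpu_sched_tree *tree)
{
	struct starpu_sched_component *component = starpu_sched_component_create(tree, "my_container");
	struct my_container_data *data = malloc(sizeof(*data));
	starpu_task_list_init(&data->list);
	STARPU_PTHREAD_MUTEX_INIT(&data->mutex, NULL);
	component->data = data;
	component->push_task = my_container_push_task;
	component->pull_task = my_container_pull_task;
	return component;
}
\endcode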
\subsubsection DetailedProgressionAndValidationRules Detailed Progression and Validation Rules

- A Reservoir is a Scheduling Component which redefines a Push and a Pull
  function, in order to store tasks into it. A Reservoir delimits Scheduling
  Areas in the Scheduling Tree.
- A Pump is the engine source of the Scheduler: it pushes/pulls tasks
  to/from one Scheduling Component to another. The native Pumps of a Scheduling
  Tree are located at the root of the Tree (incoming Push calls from StarPU),
  and at the leaves of the Tree (Pop calls coming from StarPU Workers).
  Pre-implemented Scheduling Components currently shipped with Pumps are
  Flow-Control Components and the Resource-Mapping Component Heft, within
  their defined Can_Push functions.
- A correct Scheduling Tree requires a Pump per Scheduling Area and per
  Execution Flow.

The Tree-Eager-Prefetching Scheduler shown in Section
\ref ImplementAModularizedScheduler follows these rules:
<pre>
                  starpu_push_task
                      <b>Pump</b>
                        |
      Area 1            |
                        |
                        v
 -----------------------Fifo_Component-----------------------------
                      <b>Pump</b>
                        |  ^
                 Push   |  |  Can_Push
                        v  |
      Area 2       Eager_Component
                        |  ^
                        |  |
                        v  |
              --------><-------------------><---------
              |  ^                          |  ^
        Push  |  |  Can_Push          Push  |  |  Can_Push
              v  |                          v  |
 -----Fifo_Component-----------------------Fifo_Component----------
              |  ^                          |  ^
        Pull  |  |  Can_Pull          Pull  |  |  Can_Pull
      Area 3  v  |                          v  |
           <b>Pump</b>                       <b>Pump</b>
        Worker_Component               Worker_Component
</pre>
\section GraphScheduling Graph-based Scheduling

For performance reasons, most of the schedulers shipped with StarPU use simple
list-scheduling heuristics, assuming that the application has already set
priorities. This is why they do their scheduling between when tasks become
available for execution and when a worker becomes idle, without looking at the
task graph.

Other heuristics can however look at the task graph. Recording the task graph
is expensive, so it is not enabled by default; the scheduling heuristic has
to set \c _starpu_graph_record to \c 1 from the initialization function, to make it
available. Then the <c>_starpu_graph*</c> functions can be used.

<c>src/sched_policies/graph_test_policy.c</c> is an example of a simple greedy
policy which automatically computes priorities by bottom-up rank.

The idea is that while the application submits tasks, they are only pushed
to a bag of tasks. When the application is finished submitting tasks,
it calls starpu_do_schedule() (or starpu_task_wait_for_all(), which calls
starpu_do_schedule()), and the starpu_sched_policy::do_schedule method of the
scheduler is called. This method calls \c _starpu_graph_compute_depths() to compute
the bottom-up ranks, and then uses these ranks to set priorities over the tasks.

It then has two priority queues, one for CPUs and one for GPUs, and uses a simple
heuristic based on the duration of the task on CPUs and GPUs to decide between
the two queues. CPU workers can then pop from the CPU priority queue, and GPU
workers from the GPU priority queue.
\section DebuggingScheduling Debugging Scheduling

All the \ref OnlinePerformanceTools and \ref OfflinePerformanceTools can
be used to get information about how well the execution proceeded, and thus the
overall quality of the execution.

Precise debugging can also be performed by using the
\ref STARPU_TASK_BREAK_ON_PUSH, \ref STARPU_TASK_BREAK_ON_SCHED,
\ref STARPU_TASK_BREAK_ON_POP, and \ref STARPU_TASK_BREAK_ON_EXEC environment variables.
By setting the job_id of a task
in these environment variables, StarPU will raise <c>SIGTRAP</c> when the task is being
pushed, scheduled, popped, or executed by the scheduler. This means that when one notices
that a task is being scheduled in a seemingly odd way, one can just re-execute
the application in a debugger, with some of those variables set, and the
execution will stop exactly at the scheduling points of this task, thus allowing
one to inspect the scheduler state, etc.

*/