  1. @c -*-texinfo-*-
  2. @c This file is part of the StarPU Handbook.
  3. @c Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
  4. @c See the file starpu.texi for copying conditions.
  5. @cindex Scheduling Context Hypervisor
StarPU provides a platform for creating, deleting, and dynamically modifying Scheduling Contexts.
A parallel kernel can thus be isolated in a scheduling context, which avoids interference between several parallel kernels.
If the user knows exactly how many workers each scheduling context needs, they can assign them to the contexts at creation time or modify them during the execution of the program.
The Scheduling Context Hypervisor Plugin is intended for users whose applications do not exhibit regular parallelism, who cannot know in advance the exact size of each context, and who need to resize the contexts according to the behavior of the parallel kernel.
The Hypervisor receives information from StarPU concerning the execution of the tasks, the efficiency of the resources, etc., and decides accordingly when and how the contexts can be resized.
Basic strategies for resizing scheduling contexts are already provided, and a platform for implementing additional custom ones is available.
  12. @menu
  13. * Managing the hypervisor:: Initialize the hypervisor
  14. * Registering Scheduling Contexts to the hypervisor:: Contexts have to register to the hypervisor
  15. * The user's input in the resizing process:: The user can help the hypervisor decide how to resize
  16. * Resizing strategies:: Several resizing strategies are proposed
  17. * Performance Counters:: StarPU provides information to the Hypervisor through performance counters
  18. * Defining a new hypervisor policy:: New Policies can be implemented
  19. @end menu
  20. @node Managing the hypervisor
  21. @section Managing the hypervisor
There is a single hypervisor in charge of resizing contexts, and the resizing strategy is chosen when the hypervisor is initialized. Only one resize can be performed at a time.
  23. @deftypefun {struct starpu_performance_counters *} sched_ctx_hypervisor_init ({struct starpu_sched_ctx_hypervisor_policy *} @var{policy})
Initializes the hypervisor to use the strategy provided as parameter and creates the performance counters (@pxref{Performance Counters}).
These performance counters are actually callbacks that the contexts use to notify the hypervisor of the information it needs.
  26. @end deftypefun
  27. Note: The Hypervisor is actually a worker that takes this role once certain conditions trigger the resizing process (there is no additional thread assigned to the hypervisor).
  28. @deftypefun void sched_ctx_hypervisor_shutdown (void)
Frees the hypervisor and all its information. There is no synchronization between this function and @code{starpu_shutdown}. It should therefore be called after @code{starpu_shutdown()},
because the performance counters still need their callback functions to be allocated until then.
  31. @end deftypefun
  32. @node Registering Scheduling Contexts to the hypervisor
  33. @section Registering Scheduling Contexts to the hypervisor
Scheduling Contexts that have to be resized by the hypervisor must first be registered to the hypervisor. To exclude a context from the resizing process, it has to be unregistered from the hypervisor.
  35. @deftypefun void sched_ctx_hypervisor_register_ctx (unsigned @var{sched_ctx}, double @var{total_flops})
Registers the context to the hypervisor and indicates the number of flops the context will execute. This value is needed by the Gflops rate based strategy (@pxref{Resizing strategies}) and by any custom strategy that relies on it; for the other strategies, 0.0 can be passed.
  37. @end deftypefun
  38. @deftypefun void sched_ctx_hypervisor_unregister_ctx (unsigned @var{sched_ctx})
  39. Unregister the context from the hypervisor
  40. @end deftypefun
  41. @node The user's input in the resizing process
  42. @section The user's input in the resizing process
The user can completely forbid the resizing of a given context, and can later allow it again (in which case the resizing is managed by the hypervisor, which can forbid it or allow it).
  44. @deftypefun void sched_ctx_hypervisor_stop_resize (unsigned @var{sched_ctx})
  45. Forbid resizing of a context
  46. @end deftypefun
  47. @deftypefun void sched_ctx_hypervisor_start_resize (unsigned @var{sched_ctx})
  48. Allow resizing of a context
  49. @end deftypefun
  50. The user can then provide information to the hypervisor concerning the conditions of resizing.
  51. @deftypefun void sched_ctx_hypervisor_ioctl (unsigned @var{sched_ctx}, ...)
Passes resizing conditions to the context @code{sched_ctx} using the following arguments. The argument list must be zero-terminated.
  53. @defmac HYPERVISOR_MAX_IDLE
  54. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 3 arguments:
  55. an array of int for the workerids to apply the condition, an int to indicate the size of the array, and a double value indicating
  56. the maximum idle time allowed for a worker before the resizing process should be triggered
  57. @end defmac
  58. @defmac HYPERVISOR_PRIORITY
  59. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 3 arguments:
  60. an array of int for the workerids to apply the condition, an int to indicate the size of the array, and an int value indicating
  61. the priority of the workers previously mentioned.
The workers with the smallest priority are moved first.
  63. @end defmac
  64. @defmac HYPERVISOR_MIN_WORKERS
  65. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument(int) indicating
the minimum number of workers a context should have; below this limit the context cannot execute.
  67. @end defmac
  68. @defmac HYPERVISOR_MAX_WORKERS
  69. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument(int) indicating
the maximum number of workers a context should have; above this limit the context would not be able to scale.
  71. @end defmac
  72. @defmac HYPERVISOR_GRANULARITY
  73. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument(int) indicating
the granularity of the resizing process (the number of workers that should be moved from the context each time it is resized).
This parameter is ignored by the Gflops rate based strategy (@pxref{Resizing strategies}), which computes the number of workers to move itself.
  76. @end defmac
  77. @defmac HYPERVISOR_FIXED_WORKERS
  78. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 2 arguments:
  79. an array of int for the workerids to apply the condition and an int to indicate the size of the array.
  80. These workers are not allowed to be moved from the context.
  81. @end defmac
  82. @defmac HYPERVISOR_MIN_TASKS
  83. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument (int)
indicating the minimum number of tasks that have to be executed before the context can be resized.
  85. This parameter is ignored for the Application Driven strategy @pxref{Resizing strategies} where the user indicates exactly when the resize should be done.
  86. @end defmac
  87. @defmac HYPERVISOR_NEW_WORKERS_MAX_IDLE
  88. This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument, a double value indicating
  89. the maximum idle time allowed for workers that have just been moved from other contexts in the current context.
  90. @end defmac
  91. @defmac HYPERVISOR_TIME_TO_APPLY
This macro is used when calling sched_ctx_hypervisor_ioctl and must be followed by 1 argument (int) indicating the tag
of the task whose execution causes this configuration to be taken into account.
  94. @end defmac
  95. @end deftypefun
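The zero-terminated argument list described above can be illustrated with a minimal sketch of how such a variadic call might be parsed. This is not the hypervisor's implementation: the option codes @code{OPT_*}, the structure @code{demo_config} and the function @code{demo_ioctl} are hypothetical stand-ins, and only three of the options are handled.

```c
#include <stdarg.h>

/* Hypothetical stand-ins: the real macro values and the real
 * configuration layout come from the hypervisor headers. */
enum { OPT_END = 0, OPT_MAX_IDLE = 1, OPT_MIN_WORKERS = 2, OPT_MAX_WORKERS = 3 };
#define DEMO_NMAXWORKERS 16

struct demo_config {
    int min_nworkers;
    int max_nworkers;
    double max_idle[DEMO_NMAXWORKERS];
};

/* Walks a zero-terminated option list and fills the configuration.
 * Returns the number of options consumed, or -1 on an unknown option.
 * Callers must pass each value with the exact type read here
 * (for instance 10000.0, not 10000, for the idle time). */
int demo_ioctl(struct demo_config *cfg, ...)
{
    va_list ap;
    int count = 0;
    int opt;

    va_start(ap, cfg);
    while ((opt = va_arg(ap, int)) != OPT_END) {
        switch (opt) {
        case OPT_MAX_IDLE: {
            /* Three arguments: worker ids, array size, idle limit. */
            int *workerids = va_arg(ap, int *);
            int n = va_arg(ap, int);
            double max_idle = va_arg(ap, double);
            for (int i = 0; i < n; i++)
                if (workerids[i] >= 0 && workerids[i] < DEMO_NMAXWORKERS)
                    cfg->max_idle[workerids[i]] = max_idle;
            break;
        }
        case OPT_MIN_WORKERS:
            cfg->min_nworkers = va_arg(ap, int);
            break;
        case OPT_MAX_WORKERS:
            cfg->max_nworkers = va_arg(ap, int);
            break;
        default:
            va_end(ap);
            return -1;
        }
        count++;
    }
    va_end(ap);
    return count;
}
```

In this sketch the parser reads each option code as an @code{int}, so the list is closed with the integer terminator @code{OPT_END}. Each option consumes exactly the number of arguments listed for it, which is why the argument counts given for each macro matter.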
  96. @node Resizing strategies
  97. @section Resizing strategies
  98. The plugin proposes several strategies for resizing the scheduling context.
The @b{Application driven} strategy relies on the user to indicate when the contexts should be resized.
To do so, the user tags the task that should trigger the resizing process. The tag can be set directly in the @code{hypervisor_tag} field of the @code{starpu_task} data structure, or
passed with the macro @code{STARPU_HYPERVISOR_TAG} in the @code{starpu_insert_task} function.
  102. @cartouche
  103. @smallexample
  104. task.hypervisor_tag = 2;
  105. @end smallexample
  106. @end cartouche
  107. or
  108. @cartouche
  109. @smallexample
  110. starpu_insert_task(&codelet,
  111. ...,
  112. STARPU_HYPERVISOR_TAG, 2,
  113. 0);
  114. @end smallexample
  115. @end cartouche
  116. Then the user has to indicate that when a task with the specified tag is executed the contexts should resize.
  117. @cartouche
  118. @smallexample
  119. sched_ctx_hypervisor_resize(sched_ctx, 2);
  120. @end smallexample
  121. @end cartouche
The user can use the same tag to change the resizing configuration of the contexts if necessary.
  123. @cartouche
  124. @smallexample
  125. sched_ctx_hypervisor_ioctl(sched_ctx,
  126. HYPERVISOR_MIN_WORKERS, 6,
  127. HYPERVISOR_MAX_WORKERS, 12,
  128. HYPERVISOR_TIME_TO_APPLY, 2,
  129. NULL);
  130. @end smallexample
  131. @end cartouche
  132. The @b{Idleness} based strategy resizes the scheduling contexts every time one of their workers stays idle
for a period longer than the one imposed by the user (@pxref{The user's input in the resizing process}).
  134. @cartouche
  135. @smallexample
  136. int workerids[3] = @{1, 3, 10@};
  137. int workerids2[9] = @{0, 2, 4, 5, 6, 7, 8, 9, 11@};
  138. sched_ctx_hypervisor_ioctl(sched_ctx_id,
  139. HYPERVISOR_MAX_IDLE, workerids, 3, 10000.0,
  140. HYPERVISOR_MAX_IDLE, workerids2, 9, 50000.0,
  141. NULL);
  142. @end smallexample
  143. @end cartouche
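The idleness test itself can be sketched as a comparison between each worker's idle counter and its per-worker limit. The function @code{demo_find_idle_worker} is a hypothetical illustration and not part of the plugin; the array names mirror the @code{current_idle_time} and @code{max_idle} fields documented later in this chapter.

```c
/* Returns the id of the first worker of the context whose accumulated
 * idle time exceeds its configured limit, or -1 if none does.
 * Both arrays are indexed by worker id (an assumption of this sketch). */
int demo_find_idle_worker(const int *workers, int nworkers,
                          const double *current_idle_time,
                          const double *max_idle)
{
    for (int i = 0; i < nworkers; i++) {
        int w = workers[i];
        if (current_idle_time[w] > max_idle[w])
            return w;
    }
    return -1;
}
```

With the limits from the example above, a worker from the first group would be reported after 10000.0 units of idle time, while a worker from the second group would only be reported after 50000.0.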
The @b{Gflops rate} based strategy resizes the scheduling contexts so that they all finish at the same time.
The velocity of each context is monitored, and the resizing process is triggered once one of them is significantly slower than the others.
To make these computations possible, the user has to provide the total number of flops to be executed by the
parallel kernels and the number of flops to be executed by each task.
The number of flops to be executed by a context is passed as a parameter when the context is registered to the hypervisor
(@code{sched_ctx_hypervisor_register_ctx(sched_ctx_id, flops)}), and the number to be executed by each task is passed when the task is submitted.
The corresponding field in the @code{starpu_task} data structure is @code{flops}, and
the corresponding macro in the @code{starpu_insert_task} function is @code{STARPU_FLOPS}. When the task is executed,
the resizing process is triggered.
  153. @cartouche
  154. @smallexample
  155. task.flops = 100;
  156. @end smallexample
  157. @end cartouche
  158. or
  159. @cartouche
  160. @smallexample
  161. starpu_insert_task(&codelet,
  162. ...,
  163. STARPU_FLOPS, 100,
  164. 0);
  165. @end smallexample
  166. @end cartouche
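The reasoning behind this strategy can be sketched with a small calculation: the velocity of a context is the number of flops executed per unit of time, and the remaining time is the number of remaining flops divided by that velocity. The code below is only an illustration under these assumptions. The names @code{demo_ctx}, @code{demo_velocity}, @code{demo_remaining_time} and @code{demo_pick_slowest} are hypothetical, and so is the @code{factor} threshold used to decide that a context is significantly slower.

```c
#include <math.h>

struct demo_ctx {
    double total_flops;    /* flops declared when the context was registered */
    double elapsed_flops;  /* flops executed so far */
    double elapsed_time;   /* time since the context started executing */
};

/* Flops executed per unit of time, 0 if nothing can be measured yet. */
double demo_velocity(const struct demo_ctx *c)
{
    return c->elapsed_time > 0.0 ? c->elapsed_flops / c->elapsed_time : 0.0;
}

/* Estimated time still needed at the current velocity. */
double demo_remaining_time(const struct demo_ctx *c)
{
    double remaining = c->total_flops - c->elapsed_flops;
    double v = demo_velocity(c);
    if (remaining <= 0.0)
        return 0.0;
    return v > 0.0 ? remaining / v : INFINITY;
}

/* Returns the index of the context expected to finish last when its
 * remaining time exceeds factor times the shortest remaining time,
 * or -1 when the contexts are considered balanced. */
int demo_pick_slowest(const struct demo_ctx *ctxs, int n, double factor)
{
    int slowest = -1;
    double min_t = INFINITY, max_t = -1.0;

    for (int i = 0; i < n; i++) {
        double t = demo_remaining_time(&ctxs[i]);
        if (t < min_t)
            min_t = t;
        if (t > max_t) {
            max_t = t;
            slowest = i;
        }
    }
    if (n < 2 || !(max_t > factor * min_t))
        return -1;
    return slowest;
}
```

A strategy built on such an estimate would give additional workers to the context returned by @code{demo_pick_slowest}, taking them from contexts expected to finish earlier.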
  167. @node Performance Counters
  168. @section Performance Counters
  169. The Scheduling Context Hypervisor Plugin provides a series of performance counters to StarPU. By incrementing them, StarPU can help the hypervisor in the resizing decision making process.
  170. @deftp {Data Type} {struct starpu_performance_counters}
  171. @anchor{struct starpu_performance_counters}
  172. @table @asis
  173. @item @code{void (*notify_idle_cycle)(unsigned sched_ctx_id, int worker, double idle_time)}
  174. Informs the hypervisor for how long a worker has been idle in the specified context
  175. @item @code{void (*notify_idle_end)(unsigned sched_ctx_id, int worker)}
Informs the hypervisor that, after a period of idleness, the worker has just executed a task in the specified context.
The idle counter is thus reset.
  178. @item @code{void (*notify_pushed_task)(unsigned sched_ctx_id, int worker)}
  179. Notifies the hypervisor a task has been scheduled on the queue of the worker corresponding to the specified context
  180. @item @code{void (*notify_poped_task)(unsigned sched_ctx_id, int worker, double flops)}
Informs the hypervisor that a task executing the specified number of flops has been popped from the worker.
  182. @item @code{void (*notify_post_exec_hook)(unsigned sched_ctx_id, int taskid)}
  183. Notifies the hypervisor a task has just been executed
  184. @end table
  185. @end deftp
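To make the role of these callbacks concrete, here is a minimal sketch of handlers that maintain per-worker counters. All @code{demo_} names are hypothetical stand-ins that mirror the signatures listed above. Accumulating the idle time across successive idle cycles is an assumption of this sketch, and a single context is tracked for brevity.

```c
#define DEMO_NMAXWORKERS 16

/* Hypothetical per-context state updated by the callbacks. */
struct demo_counters {
    double current_idle_time[DEMO_NMAXWORKERS];
    int pushed_tasks[DEMO_NMAXWORKERS];
    int poped_tasks[DEMO_NMAXWORKERS];
    double elapsed_flops[DEMO_NMAXWORKERS];
};

struct demo_counters demo_state; /* zero-initialized, one context only */

void demo_notify_idle_cycle(unsigned sched_ctx_id, int worker, double idle_time)
{
    (void)sched_ctx_id;
    demo_state.current_idle_time[worker] += idle_time; /* assumed to accumulate */
}

void demo_notify_idle_end(unsigned sched_ctx_id, int worker)
{
    (void)sched_ctx_id;
    demo_state.current_idle_time[worker] = 0.0; /* the idle counter is reset */
}

void demo_notify_pushed_task(unsigned sched_ctx_id, int worker)
{
    (void)sched_ctx_id;
    demo_state.pushed_tasks[worker]++;
}

void demo_notify_poped_task(unsigned sched_ctx_id, int worker, double flops)
{
    (void)sched_ctx_id;
    demo_state.poped_tasks[worker]++;
    demo_state.elapsed_flops[worker] += flops;
}

/* Stand-in for the callback table, limited to four of the entries. */
struct demo_performance_counters {
    void (*notify_idle_cycle)(unsigned sched_ctx_id, int worker, double idle_time);
    void (*notify_idle_end)(unsigned sched_ctx_id, int worker);
    void (*notify_pushed_task)(unsigned sched_ctx_id, int worker);
    void (*notify_poped_task)(unsigned sched_ctx_id, int worker, double flops);
};

struct demo_performance_counters demo_perf = {
    .notify_idle_cycle = demo_notify_idle_cycle,
    .notify_idle_end = demo_notify_idle_end,
    .notify_pushed_task = demo_notify_pushed_task,
    .notify_poped_task = demo_notify_poped_task
};
```

The runtime side would then call through the table, for instance @code{demo_perf.notify_idle_cycle(ctx, worker, time)}, without knowing which resizing strategy consumes the counters.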
  186. TODO maybe they should be hidden to the user
  187. @node Defining a new hypervisor policy
  188. @section Defining a new hypervisor policy
  189. @menu
  190. * Hypervisor Policy API:: Hypervisor Policy API
  191. * Hypervisor example::
  192. @end menu
  193. @node Hypervisor Policy API
  194. @subsection Hypervisor Policy API
While the Scheduling Context Hypervisor Plugin comes with a variety of resizing policies (@pxref{Resizing strategies}),
  196. it may sometimes be desirable to implement custom
  197. policies to address specific problems. The API described below allows
  198. users to write their own resizing policy.
  199. @deftp {Data Type} {struct starpu_sched_ctx_hypervisor_policy}
  200. This structure contains all the methods that implement a hypervisor resizing policy.
  201. @table @asis
  202. @item @code{const char* name}
Indicates the name of the policy. If the policy is not custom, the policy corresponding to this name is used by the hypervisor.
  204. @item @code{unsigned custom}
  205. Indicates whether the policy is custom or not
  206. @item @code{void (*handle_idle_cycle)(unsigned sched_ctx_id, int worker)}
  207. It is called whenever the indicated worker executes another idle cycle in @code{sched_ctx}
  208. @item @code{void (*handle_pushed_task)(unsigned sched_ctx_id, int worker)}
  209. It is called whenever a task is pushed on the worker's queue corresponding to the context @code{sched_ctx}
  210. @item @code{void (*handle_poped_task)(unsigned sched_ctx_id, int worker)}
It is called whenever a task is popped from the worker's queue corresponding to the context @code{sched_ctx}
  212. @item @code{void (*handle_idle_end)(unsigned sched_ctx_id, int worker)}
  213. It is called whenever a task is executed on the indicated worker and context after a long period of idle time
  214. @item @code{void (*handle_post_exec_hook)(unsigned sched_ctx_id, struct starpu_htbl32_node* resize_requests, int task_tag)}
It is called whenever a tagged task has just been executed. The table of resize requests is provided as well as the tag.
  216. @end table
  217. @end deftp
The Hypervisor also provides a structure holding the configuration information of each context, which can be used to construct new resize strategies.
  219. @deftp {Data Type} {struct starpu_sched_ctx_hypervisor_policy_config }
  220. This structure contains all configuration information of a context
  221. @table @asis
  222. @item @code{int min_nworkers}
  223. Indicates the minimum number of workers needed by the context
  224. @item @code{int max_nworkers}
  225. Indicates the maximum number of workers needed by the context
  226. @item @code{int granularity}
  227. Indicates the workers granularity of the context
  228. @item @code{int priority[STARPU_NMAXWORKERS]}
  229. Indicates the priority of each worker in the context
  230. @item @code{double max_idle[STARPU_NMAXWORKERS]}
  231. Indicates the maximum idle time accepted before a resize is triggered
  232. @item @code{int fixed_workers[STARPU_NMAXWORKERS]}
  233. Indicates which workers can be moved and which ones are fixed
  234. @item @code{double new_workers_max_idle}
  235. Indicates the maximum idle time accepted before a resize is triggered for the workers that just arrived in the new context
  236. @end table
  237. @end deftp
  238. Additionally, the hypervisor provides a structure with information obtained from StarPU by means of the performance counters
  239. @deftp {Data Type} {struct starpu_sched_ctx_hypervisor_wrapper}
  240. This structure is a wrapper of the contexts available in StarPU
  241. and contains all information about a context obtained by incrementing the performance counters
  242. @table @asis
  243. @item @code{unsigned sched_ctx}
  244. The context wrapped
  245. @item @code{struct starpu_sched_ctx_hypervisor_policy_config *config}
  246. The corresponding resize configuration
  247. @item @code{double current_idle_time[STARPU_NMAXWORKERS]}
  248. The idle time counter of each worker of the context
  249. @item @code{int pushed_tasks[STARPU_NMAXWORKERS]}
  250. The number of pushed tasks of each worker of the context
  251. @item @code{int poped_tasks[STARPU_NMAXWORKERS]}
  252. The number of poped tasks of each worker of the context
  253. @item @code{double total_flops}
The total number of flops to be executed by the context
  255. @item @code{double total_elapsed_flops[STARPU_NMAXWORKERS]}
The number of flops executed by each worker of the context
  257. @item @code{double elapsed_flops[STARPU_NMAXWORKERS]}
The number of flops executed by each worker of the context since the last resize
  259. @item @code{double remaining_flops}
  260. The number of flops that still have to be executed by the workers in the context
  261. @item @code{double start_time}
The time when the context started executing
  263. @item @code{struct starpu_sched_ctx_hypervisor_resize_ack resize_ack}
  264. The structure confirming the last resize finished and a new one can be done
  265. @end table
  266. @end deftp
  267. @deftp {Data Type} {struct starpu_sched_ctx_hypervisor_resize_ack}
This structure is used to check whether the workers moved to another context are actually taken into account in that context
  269. @table @asis
  270. @item @code{int receiver_sched_ctx}
  271. The context receiving the new workers
  272. @item @code{int *moved_workers}
  273. The workers moved to the receiver context
  274. @item @code{int nmoved_workers}
  275. The number of workers moved
  276. @item @code{int *acked_workers}
If the value corresponding to a worker is 1, the worker is taken into account in the new context; if it is 0, it is not yet taken into account.
  278. @end table
  279. @end deftp
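The acknowledgement mechanism can be sketched as follows: a resize is considered finished once every moved worker has been marked as taken into account. The structure @code{demo_resize_ack} mirrors the fields listed above, but the two helper functions are hypothetical, and the sketch assumes that @code{acked_workers} is indexed like @code{moved_workers}.

```c
/* Hypothetical mirror of the fields documented above. */
struct demo_resize_ack {
    int receiver_sched_ctx;
    int *moved_workers;
    int nmoved_workers;
    int *acked_workers; /* assumed parallel to moved_workers: 1 = acked */
};

/* Marks a moved worker as taken into account by the receiver context.
 * Returns 1 if the worker was part of the move, 0 otherwise. */
int demo_ack_worker(struct demo_resize_ack *ack, int worker)
{
    for (int i = 0; i < ack->nmoved_workers; i++) {
        if (ack->moved_workers[i] == worker) {
            ack->acked_workers[i] = 1;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 when every moved worker has been acknowledged,
 * meaning a new resize could be started. */
int demo_resize_completed(const struct demo_resize_ack *ack)
{
    for (int i = 0; i < ack->nmoved_workers; i++)
        if (!ack->acked_workers[i])
            return 0;
    return 1;
}
```

Such a check is one way to enforce the rule that only one resize is performed at a time: a new request is postponed until the previous move has been fully acknowledged.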
  280. The following functions can be used in the resizing strategies.
  281. @deftypefun void sched_ctx_hypervisor_move_workers (unsigned @var{sender_sched_ctx}, unsigned @var{receiver_sched_ctx}, {int *}@var{workers_to_move}, unsigned @var{nworkers_to_move}, unsigned @var{now});
  282. Moves workers from one context to another
  283. @end deftypefun
  284. @deftypefun {struct starpu_sched_ctx_hypervisor_policy_config *} sched_ctx_hypervisor_get_config (unsigned @var{sched_ctx});
  285. Returns the configuration structure of a context
  286. @end deftypefun
  287. @deftypefun {int *} sched_ctx_hypervisor_get_sched_ctxs ();
  288. Gets the contexts managed by the hypervisor
  289. @end deftypefun
  290. @deftypefun int sched_ctx_hypervisor_get_nsched_ctxs ();
  291. Gets the number of contexts managed by the hypervisor
  292. @end deftypefun
  293. @deftypefun {struct starpu_sched_ctx_hypervisor_wrapper *} sched_ctx_hypervisor_get_wrapper (unsigned @var{sched_ctx});
Returns the wrapper corresponding to the context @code{sched_ctx}
  295. @end deftypefun
  296. @deftypefun double sched_ctx_hypervisor_get_elapsed_flops_per_sched_ctx ({struct starpu_sched_ctx_hypervisor_wrapper *} @var{sc_w});
Returns the number of flops executed by a context since the last resize
  298. @end deftypefun
  299. @deftypefun {char *} sched_ctx_hypervisor_get_policy ();
  300. Returns the name of the resizing policy the hypervisor uses
  301. @end deftypefun
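A custom strategy typically has to decide which workers to hand over before calling @code{sched_ctx_hypervisor_move_workers}. The sketch below shows one possible selection rule based on the configuration fields described above: fixed workers are skipped, workers with the smallest priority are chosen first, at most @code{granularity} workers are selected, and the context never drops below @code{min_nworkers}. The function @code{demo_select_workers} is hypothetical, and treating a nonzero @code{fixed_workers} entry as "fixed" is an assumption.

```c
/* Selects the workers to move away from a context.
 * workers/nworkers: ids of the workers currently in the context.
 * priority/fixed_workers: arrays indexed by worker id.
 * out: receives the selected ids (room for granularity entries).
 * Returns the number of workers selected. */
int demo_select_workers(const int *workers, int nworkers,
                        const int *priority, const int *fixed_workers,
                        int min_nworkers, int granularity, int *out)
{
    int budget = nworkers - min_nworkers; /* keep at least min_nworkers */
    int nout = 0;

    if (budget > granularity)
        budget = granularity;

    while (nout < budget) {
        int best = -1;
        for (int i = 0; i < nworkers; i++) {
            int w = workers[i];
            int already = 0;
            if (fixed_workers[w])
                continue; /* fixed workers never leave the context */
            for (int j = 0; j < nout; j++)
                if (out[j] == w)
                    already = 1;
            if (already)
                continue;
            /* smallest priority goes first; ties keep the earlier worker */
            if (best < 0 || priority[w] < priority[best])
                best = w;
        }
        if (best < 0)
            break; /* no movable worker left */
        out[nout++] = best;
    }
    return nout;
}
```

The resulting array and count could then be passed as the @code{workers_to_move} and @code{nworkers_to_move} arguments of the move function.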
  302. @node Hypervisor example
  303. @subsection Hypervisor example
  304. @cartouche
  305. @smallexample
  306. struct starpu_sched_ctx_hypervisor_policy dummy_policy =
  307. @{
  308. .handle_poped_task = dummy_handle_poped_task,
  309. .handle_pushed_task = dummy_handle_pushed_task,
  310. .handle_idle_cycle = dummy_handle_idle_cycle,
  311. .handle_idle_end = dummy_handle_idle_end,
  312. .handle_post_exec_hook = dummy_handle_post_exec_hook,
  313. .custom = 1,
  314. .name = "dummy"
  315. @};
  316. @end smallexample
  317. @end cartouche
  318. @c Local Variables:
  319. @c TeX-master: "../starpu.texi"
  320. @c ispell-local-dictionary: "american"
  321. @c End: