/* StarPU --- Runtime system for heterogeneous multicore architectures.
 *
 * Copyright (C) 2011-2013 Inria
 * Copyright (C) 2010-2017 CNRS
 * Copyright (C) 2009-2011,2014 Université de Bordeaux
 *
 * StarPU is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or (at
 * your option) any later version.
 *
 * StarPU is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * See the GNU Lesser General Public License in COPYING.LGPL for more details.
 */
/*! \page SchedulingContextHypervisor Scheduling Context Hypervisor
\section WhatIsTheHypervisor What Is The Hypervisor
StarPU provides a platform for constructing Scheduling Contexts and for
deleting and modifying them dynamically. A parallel kernel can thus
be isolated in a scheduling context, which avoids interference between
several parallel kernels. If users know exactly how
many workers each scheduling context needs, they can assign them to the
contexts at creation time or modify them during the execution of
the program.

The Scheduling Context Hypervisor Plugin is intended for users
whose applications do not exhibit regular parallelism, who cannot know in
advance the exact size of each context, and who need to resize the contexts
according to the behavior of the parallel kernels.

The Hypervisor receives information from StarPU concerning the
execution of the tasks, the efficiency of the resources, etc. and
decides accordingly when and how the contexts can be resized. Basic
strategies for resizing scheduling contexts already exist, and a
platform for implementing additional custom ones is available.
\section StartTheHypervisor Start the Hypervisor
The Hypervisor must be initialized once at the beginning of the
application. A resizing policy should be specified at this point. The
choice of strategy depends on the information the application is able to provide
to the hypervisor as well as on the accuracy needed for the resizing
procedure. For example, the application may be able to provide an
estimate of the workload of the contexts. In this situation the
hypervisor can decide what resources the contexts need. However, if no
information is provided, the hypervisor evaluates the behavior of the
resources and of the application and makes a guess about the future.
The hypervisor resizes only the registered contexts.
\section InterrogateTheRuntime Interrogate The Runtime
The runtime provides the hypervisor with information concerning the
behavior of the resources and the application. This is done through
the <c>performance_counters</c>, a set of callbacks indicating
when the resources are idle or not efficient, when the application
submits tasks, or when it becomes too slow.
\section TriggerTheHypervisor Trigger the Hypervisor
The resizing is triggered either when the application requires it
(sc_hypervisor_resize_ctxs()) or
when the initial distribution of resources degrades the performance of
the application (the application is too slow or the resources are idle
for too long). If the environment
variable \ref SC_HYPERVISOR_TRIGGER_RESIZE is set to <c>speed</c>,
the monitored speed of the contexts is compared to a theoretical value
computed with a linear program, and the resizing is triggered
whenever the two values do not correspond. Otherwise, if the environment
variable is set to <c>idle</c>, the hypervisor triggers the resizing algorithm
whenever the workers are idle for a period longer than the threshold
indicated by the programmer. When this
happens, different resizing strategies are applied, targeting
the total execution time of the application, the instant speed, or the idle
time of the resources.
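
The two trigger criteria described above can be illustrated with a small
standalone sketch. It does not use the hypervisor API and is not the plugin's
implementation: the function <c>should_trigger_resize</c>, its parameters and
the relative <c>tolerance</c> used to decide that two speeds "do not
correspond" are all assumptions made for this illustration.

\code{.c}
#include <assert.h>
#include <string.h>

// Hypothetical helper, not part of the sc_hypervisor API.
// mode:            trigger setting, "speed" or "idle"
// monitored_speed: speed measured for a context
// expected_speed:  theoretical speed for that context
// tolerance:       accepted relative deviation (assumed parameter)
// idle_time:       time a worker has been idle
// max_idle:        idle threshold chosen by the programmer
// Returns 1 when a resize should be triggered, 0 otherwise.
static int should_trigger_resize(const char *mode,
				 double monitored_speed, double expected_speed,
				 double tolerance,
				 double idle_time, double max_idle)
{
	assert(mode != NULL);
	if (strcmp(mode, "speed") == 0)
	{
		if (expected_speed <= 0.0)
			return 0; // no usable reference value
		double deviation = monitored_speed - expected_speed;
		if (deviation < 0.0)
			deviation = -deviation;
		return deviation / expected_speed > tolerance;
	}
	if (strcmp(mode, "idle") == 0)
		return idle_time > max_idle;
	return 0; // unknown mode: never trigger
}
\endcode

With a tolerance of 0.2, a context measured at half its expected speed would
trigger a resize in <c>speed</c> mode, while one within 5% of it would not.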
\section ResizingStrategies Resizing Strategies
The plugin provides several strategies for resizing the scheduling contexts.

The <b>Application driven</b> strategy relies on user input to decide when the contexts should be resized.
Users tag the task that should trigger the resizing
process. One can set the field starpu_task::hypervisor_tag directly or
use the macro ::STARPU_HYPERVISOR_TAG in the function
starpu_task_insert().
\code{.c}
task.hypervisor_tag = 2;
\endcode

or

\code{.c}
starpu_task_insert(&codelet,
		   ...,
		   STARPU_HYPERVISOR_TAG, 2,
		   0);
\endcode
Users then have to indicate that the contexts should be resized when a task with the specified tag is executed.

\code{.c}
sc_hypervisor_resize(sched_ctx, 2);
\endcode

Users can use the same tag to change the resizing configuration of the contexts if they consider it necessary.

\code{.c}
sc_hypervisor_ctl(sched_ctx,
		  SC_HYPERVISOR_MIN_WORKERS, 6,
		  SC_HYPERVISOR_MAX_WORKERS, 12,
		  SC_HYPERVISOR_TIME_TO_APPLY, 2,
		  NULL);
\endcode
The <b>Idleness</b> based strategy moves workers that are unused in one context to another context that needs them
(see \ref API_SC_Hypervisor_usage).

\code{.c}
int workerids[3] = {1, 3, 10};
int workerids2[9] = {0, 2, 4, 5, 6, 7, 8, 9, 11};
sc_hypervisor_ctl(sched_ctx_id,
		  SC_HYPERVISOR_MAX_IDLE, workerids, 3, 10000.0,
		  SC_HYPERVISOR_MAX_IDLE, workerids2, 9, 50000.0,
		  NULL);
\endcode
The <b>Gflops rate</b> based strategy resizes the scheduling contexts such that they all finish at the same time.
The speed of each context is computed, and once one of them is significantly slower the resizing process is triggered.
In order to do these computations, users have to provide the total number of instructions to be executed by the
parallel kernels and the number of instructions to be executed by each
task.

The number of flops to be executed by a context is passed as a
parameter when the context is registered to the hypervisor,

\code{.c}
sc_hypervisor_register_ctx(sched_ctx_id, flops);
\endcode

and the number of flops
to be executed by each task is passed when the task is submitted.
The corresponding field is starpu_task::flops and the corresponding
macro in the function starpu_task_insert() is ::STARPU_FLOPS
(<b>Caution</b>: take care to pass a double, not an integer;
starpu_task_insert() is variadic, so an integer argument would be read incorrectly). When the task is executed,
the resizing process is triggered.
\code{.c}
task.flops = 100;
\endcode

or

\code{.c}
starpu_task_insert(&codelet,
		   ...,
		   STARPU_FLOPS, (double) 100,
		   0);
\endcode
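
To make the idea of "finishing at the same time" concrete, the following
standalone sketch estimates the remaining time of each context from its
remaining flops and its measured speed, and reports the context that lags
behind. It is only an illustration under stated assumptions: the function
<c>find_lagging_ctx</c> and the <c>ratio</c> used to decide that a context is
"significantly slower" are hypothetical and are not part of the hypervisor.

\code{.c}
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not part of the sc_hypervisor API.
// remaining_flops[c]: flops left to execute in context c
// speed[c]:           measured speed of context c, in flops per second
// ratio:              how many times longer than the fastest context
//                     a context may take before it counts as lagging
// Returns the index of the lagging context, or -1 if the contexts
// are considered balanced.
static int find_lagging_ctx(const double *remaining_flops, const double *speed,
			    size_t nctxs, double ratio)
{
	assert(remaining_flops != NULL && speed != NULL && nctxs > 0);
	double min_time = 0.0, max_time = 0.0;
	int slowest = -1;
	for (size_t c = 0; c < nctxs; c++)
	{
		assert(speed[c] > 0.0);
		double t = remaining_flops[c] / speed[c];
		if (c == 0 || t < min_time)
			min_time = t;
		if (c == 0 || t > max_time)
		{
			max_time = t;
			slowest = (int) c;
		}
	}
	return (max_time > ratio * min_time) ? slowest : -1;
}
\endcode

For two contexts with the same remaining work, one running ten times slower
than the other would be reported with a ratio of 2, and a resizing strategy
could then give it more workers.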
The <b>Feft</b> strategy uses a linear program to predict the best distribution of resources
such that the application finishes in a minimum amount of time. As for the <b>Gflops rate</b>
strategy, the programmer has to indicate the total number of flops to be executed
when registering the context. This number of flops may be updated dynamically during the execution
of the application whenever the initial information is not very accurate.
The function sc_hypervisor_update_diff_total_flops() is called in order to add or to remove
a difference to the flops left to be executed.
Tasks are also given the number of flops they each correspond to. During the
execution of the application the hypervisor monitors the consumed flops and recomputes
the time left and the number of resources to use. The speed of each type of resource
is (re)evaluated and inserted in the linear program in order to better adapt to the
needs of the application.
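
As a rough illustration of the kind of linear program involved, and not
necessarily the exact program solved by the plugin, the allocation problem can
be sketched as follows. All symbols are assumptions introduced here:
\f$F_c\f$ is the number of flops left in context \f$c\f$, \f$v_{c,w}\f$ the
measured speed of one worker of type \f$w\f$ in context \f$c\f$,
\f$N_w\f$ the number of available workers of type \f$w\f$, and
\f$n_{c,w}\f$ the number of workers of type \f$w\f$ assigned to context
\f$c\f$. Minimizing the completion time \f$t_{max}\f$ is expressed as
maximizing \f$x = 1 / t_{max}\f$, which keeps the constraints linear:

\f[
\begin{array}{ll}
\mbox{maximize} & x \\
\mbox{subject to} & \sum_{w} n_{c,w} \, v_{c,w} \ge F_c \, x \quad \mbox{for every context } c \\
 & \sum_{c} n_{c,w} \le N_w \quad \mbox{for every worker type } w \\
 & n_{c,w} \ge 0
\end{array}
\f]

The first constraint states that the workers assigned to a context must be
able to execute its remaining flops within \f$t_{max}\f$, and the second one
that no more workers of a given type are assigned than are available.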
The <b>Teft</b> strategy also uses a linear program, which considers all the types of tasks
and the number of tasks of each type, and tries to allocate resources such that the application
finishes in a minimum amount of time. A previous calibration of StarPU is useful
in order to have good predictions of the execution time of each type of task.

The types of tasks may be determined directly by the hypervisor when they are submitted.
However, some applications do not expose the whole graph of tasks from the beginning.
In this case, in order to let the hypervisor know about all the tasks, the function
sc_hypervisor_set_type_of_task() informs the hypervisor about future tasks
without submitting them right away.
The <b>Ispeed</b> strategy divides the execution of the application into several frames.
For each frame the hypervisor computes the speed of the contexts and tries to make them
run at the same speed. The strategy requires less input from users, as
the hypervisor requires only the size of the frame in terms of flops.

\code{.c}
int workerids[3] = {1, 3, 10};
int workerids2[9] = {0, 2, 4, 5, 6, 7, 8, 9, 11};
sc_hypervisor_ctl(sched_ctx_id,
		  SC_HYPERVISOR_ISPEED_W_SAMPLE, workerids, 3, 2000000000.0,
		  SC_HYPERVISOR_ISPEED_W_SAMPLE, workerids2, 9, 200000000000.0,
		  SC_HYPERVISOR_ISPEED_CTX_SAMPLE, 60000000000.0,
		  NULL);
\endcode
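
The notion of a frame measured in flops can be sketched with the standalone
code below. It is a simplified model rather than the hypervisor's code: the
structure <c>frame_monitor</c> and the function <c>frame_monitor_add</c> are
hypothetical names, and time is assumed to be given in seconds by the caller.

\code{.c}
#include <assert.h>

// Hypothetical bookkeeping for one context; not an sc_hypervisor type.
struct frame_monitor
{
	double sample_flops;  // size of a frame, in flops
	double elapsed_flops; // flops executed since the frame started
	double frame_start;   // time at which the frame started, in seconds
	double last_speed;    // speed of the last completed frame, in flops/s
};

// Records executed flops. Returns 1 when this call completes a frame,
// in which case last_speed is updated and a new frame starts at 'now'.
// Returns 0 while the current frame is still being filled.
static int frame_monitor_add(struct frame_monitor *m, double flops, double now)
{
	assert(m != NULL && m->sample_flops > 0.0);
	m->elapsed_flops += flops;
	if (m->elapsed_flops < m->sample_flops || now <= m->frame_start)
		return 0;
	m->last_speed = m->elapsed_flops / (now - m->frame_start);
	m->elapsed_flops = 0.0;
	m->frame_start = now;
	return 1;
}
\endcode

Each time a frame completes, the speeds of the different contexts can be
compared, and workers can be moved from the faster contexts to the slower ones.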
The <b>Throughput</b> strategy focuses on maximizing the throughput of the resources
and resizes the contexts such that the machine is running at its maximum efficiency
(maximum instant speed of the workers).
\section DefiningANewHypervisorPolicy Defining A New Hypervisor Policy
While the Scheduling Context Hypervisor Plugin comes with a variety of
resizing policies (see \ref ResizingStrategies), it may sometimes be
desirable to implement custom policies to address specific problems.
The API described below allows users to write their own resizing policy.

Here is an example of how to define a new policy:

\code{.c}
struct sc_hypervisor_policy dummy_policy =
{
	.handle_poped_task = dummy_handle_poped_task,
	.handle_pushed_task = dummy_handle_pushed_task,
	.handle_idle_cycle = dummy_handle_idle_cycle,
	.handle_idle_end = dummy_handle_idle_end,
	.handle_post_exec_hook = dummy_handle_post_exec_hook,
	.custom = 1,
	.name = "dummy"
};
\endcode
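
To give an idea of what the body of such callbacks might look like, here is a
standalone sketch of the callback-table pattern. It deliberately does not
include the hypervisor headers: <c>struct toy_policy</c> is a simplified
stand-in for <c>struct sc_hypervisor_policy</c>, and the callback signatures
and the <c>toy_</c> names are assumptions made for this example, so the real
header should be consulted before writing an actual policy.

\code{.c}
#include <assert.h>

#define TOY_MAX_CTXS 8

// Simplified stand-in for a policy structure (hypothetical type).
struct toy_policy
{
	const char *name;
	void (*handle_idle_cycle)(unsigned sched_ctx, int worker);
	void (*handle_idle_end)(unsigned sched_ctx, int worker);
};

// Number of idle cycles observed per context since the last idle end.
static unsigned toy_idle_cycles[TOY_MAX_CTXS];

// Called while a worker of the context has nothing to execute.
static void toy_handle_idle_cycle(unsigned sched_ctx, int worker)
{
	(void) worker; // this toy policy only counts per context
	assert(sched_ctx < TOY_MAX_CTXS);
	toy_idle_cycles[sched_ctx]++;
}

// Called when a worker of the context gets work again.
static void toy_handle_idle_end(unsigned sched_ctx, int worker)
{
	(void) worker;
	assert(sched_ctx < TOY_MAX_CTXS);
	toy_idle_cycles[sched_ctx] = 0;
}

static struct toy_policy toy_dummy =
{
	.name = "toy",
	.handle_idle_cycle = toy_handle_idle_cycle,
	.handle_idle_end = toy_handle_idle_end
};
\endcode

A policy built this way could, for instance, decide to give up workers of a
context once its idle counter exceeds a chosen limit.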
*/