\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename starpu.info
@settitle StarPU
@c %**end of header
@setchapternewpage odd

@titlepage
@title StarPU
@page
@vskip 0pt plus 1filll
@comment For the @value{version-GCC} Version*
@end titlepage

@summarycontents
@contents
@page

@node Top
@top Preface
@cindex Preface
This manual documents the usage of StarPU.

@comment
@comment When you add a new menu item, please keep the right hand
@comment aligned to the same column. Do not use tabs. This provides
@comment better formatting.
@comment
@menu
* Introduction::       A basic introduction to using StarPU.
* Installing StarPU::  How to configure, build and install StarPU
* StarPU API::         The API to use StarPU
* Basic Examples::     Basic examples of the use of StarPU
@end menu

@c ---------------------------------------------------------------------
@c Introduction to StarPU
@c ---------------------------------------------------------------------
@node Introduction
@chapter Introduction to StarPU
@section Motivation
@c DSM
@c explain the notion of codelet and task (ie. g(A, B)

@c ---------------------------------------------------------------------
@c Installing StarPU
@c ---------------------------------------------------------------------
@node Installing StarPU
@chapter Installing StarPU
StarPU can be built and installed by the standard means of the GNU
autotools. This chapter briefly recalls how these tools can be used to
install StarPU.

@section Configuring StarPU
@subsection Generating Makefiles and configuration scripts
This step is not necessary when using the tarball releases of StarPU. If you
are using the source code from the svn repository, you first need to generate
the configure scripts and the Makefiles.
@example
$ autoreconf -i
@end example

@subsection Configuring StarPU
@example
$ ./configure
@end example
@c TODO enumerate the list of interesting options

@section Building and Installing StarPU
@subsection Building
@example
$ make
@end example

@subsection Sanity Checks
To make sure that StarPU is working properly on the system, it is also
possible to run a test suite.
@example
$ make check
@end example

@subsection Installing
To install StarPU at the location that was specified during
configuration:
@example
# make install
@end example

@subsection pkg-config configuration
Compiling and linking an application against StarPU may require specific
flags or libraries (for instance @code{CUDA} or @code{libspe2}). The
@code{pkg-config} tool can be used to obtain them.

If StarPU was not installed at some standard location, the directory that
holds StarPU's @code{pkg-config} file must be added to the
@code{PKG_CONFIG_PATH} environment variable so that @code{pkg-config} can
find it. So if StarPU was installed in @code{$prefix_dir}:
@example
$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$prefix_dir/lib/pkgconfig
@end example

The flags required to compile or link against StarPU are then
accessible with the following commands:
@example
$ pkg-config --cflags libstarpu # options for the compiler
$ pkg-config --libs libstarpu # options for the linker
@end example

@c ---------------------------------------------------------------------
@c StarPU API
@c ---------------------------------------------------------------------
@node StarPU API
@chapter StarPU API
@menu
* Initialization and Termination::  Initialization and Termination methods
* Data Library::                    Methods to manipulate data
* Codelets and Tasks::              Methods to construct tasks
* Tags::                            Task dependencies
@end menu

@node Initialization and Termination
@section Initialization and Termination
@menu
* starpu_init::         Initialize StarPU
* struct starpu_conf::  StarPU runtime configuration
* starpu_shutdown::     Terminate StarPU
@end menu

@node starpu_init
@subsection @code{starpu_init} -- Initialize StarPU
@table @asis
@item @emph{Description}:
This is the StarPU initialization method, which must be called prior to any
other StarPU call. It is possible to specify StarPU's configuration (e.g.
scheduling policy, number of cores, ...) by passing a non-null argument. The
default configuration is used if the passed argument is @code{NULL}.
@item @emph{Prototype}:
@code{void starpu_init(struct starpu_conf *conf);}
@end table

@node struct starpu_conf
@subsection @code{struct starpu_conf} -- StarPU runtime configuration
@table @asis
@item @emph{Description}:
TODO
@item @emph{Definition}:
TODO
@end table

@node starpu_shutdown
@subsection @code{starpu_shutdown} -- Terminate StarPU
@table @asis
@item @emph{Description}:
This is the StarPU termination method. It must be called at the end of the
application: statistics and other post-mortem debugging information are not
guaranteed to be available until this method has been called.
@item @emph{Prototype}:
@code{void starpu_shutdown(void);}
@end table

@node Data Library
@section Data Library
@c data_handle_t
@c void starpu_delete_data(struct starpu_data_state_t *state);
@c user interaction with the DSM
@c void starpu_sync_data_with_mem(struct starpu_data_state_t *state);
@c void starpu_notify_data_modification(struct starpu_data_state_t *state, uint32_t modifying_node);

@node Codelets and Tasks
@section Codelets and Tasks
@menu
* starpu_task_create::  Allocate and Initialize a Task
@end menu
@c struct starpu_task
@c struct starpu_codelet

@node starpu_task_create
@subsection @code{starpu_task_create} -- Allocate and Initialize a Task
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{struct starpu_task *starpu_task_create(void);}
@end table
@c Callbacks : what can we put in callbacks ?

@node Tags
@section Tags
@menu
* starpu_tag_t::                   Task identifier
* starpu_tag_declare_deps::        Declare the Dependencies of a Tag
* starpu_tag_declare_deps_array::  Declare the Dependencies of a Tag
* starpu_tag_wait::                Block until a Tag is terminated
* starpu_tag_wait_array::          Block until a set of Tags is terminated
* starpu_tag_remove::              Destroy a Tag
@end menu

@node starpu_tag_t
@subsection @code{starpu_tag_t} -- Task identifier
@c mention the tag_id field of the task structure
@table @asis
@item @emph{Definition}:
TODO
@end table

@node starpu_tag_declare_deps
@subsection @code{starpu_tag_declare_deps} -- Declare the Dependencies of a Tag
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{void starpu_tag_declare_deps(starpu_tag_t id, unsigned ndeps, ...);}
@end table

@node starpu_tag_declare_deps_array
@subsection @code{starpu_tag_declare_deps_array} -- Declare the Dependencies of a Tag
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{void starpu_tag_declare_deps_array(starpu_tag_t id, unsigned ndeps, starpu_tag_t *array);}
@end table

@node starpu_tag_wait
@subsection @code{starpu_tag_wait} -- Block until a Tag is terminated
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{void starpu_tag_wait(starpu_tag_t id);}
@end table

@node starpu_tag_wait_array
@subsection @code{starpu_tag_wait_array} -- Block until a set of Tags is terminated
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{void starpu_tag_wait_array(unsigned ntags, starpu_tag_t *id);}
@end table

@node starpu_tag_remove
@subsection @code{starpu_tag_remove} -- Destroy a Tag
@table @asis
@item @emph{Description}:
TODO
@item @emph{Prototype}:
@code{void starpu_tag_remove(starpu_tag_t id);}
@end table

@section Extensions
@subsection CUDA extensions
@c void starpu_malloc_pinned_if_possible(float **A, size_t dim);
@c subsubsection driver API specific calls
@subsection Cell extensions

@c ---------------------------------------------------------------------
@c Basic Examples
@c ---------------------------------------------------------------------
@node Basic Examples
@chapter Basic Examples

@section Compiling and linking options
The Makefile could for instance contain the following lines to define which
options must be given to the compiler and to the linker:
@example
@c @cartouche
CFLAGS+=$$(pkg-config --cflags libstarpu)
LIBS+=$$(pkg-config --libs libstarpu)
@c @end cartouche
@end example

@section Hello World
In this section, we show how to implement a simple program that submits a task
to StarPU.

@subsection Required Headers
The @code{starpu.h} header should be included in any code using StarPU.
@example
@c @cartouche
#include <starpu.h>
@c @end cartouche
@end example

@subsection Defining a Codelet
@example
@c @cartouche
void cpu_func(starpu_data_interface_t *buffers, void *func_arg)
@{
    float *array = func_arg;
    printf("Hello world (array = @{%f, %f@} )\n", array[0], array[1]);
@}

starpu_codelet cl =
@{
    .where = CORE,
    .core_func = cpu_func,
    .nbuffers = 0
@};
@c @end cartouche
@end example

A codelet is a structure that represents a computational kernel. Such a codelet
may contain an implementation of the same kernel on different architectures
(e.g. CUDA, Cell's SPU, x86, ...).

The @code{.nbuffers} field specifies the number of data buffers that are
manipulated by the codelet: here the codelet does not access or modify any data
that is controlled by our data management library. Note that the argument
passed to the codelet (the @code{.cl_arg} field of the @code{starpu_task}
structure) does not count as a buffer since it is not managed by our data
management library.

@c TODO need a crossref to the proper description of "where" see bla for more ...
We create a codelet which may only be executed on the CPUs. The @code{.where}
field is a bitmask that defines where the codelet may be executed. Here, the
@code{CORE} value means that only CPUs can execute this codelet
(@pxref{Codelets and Tasks}, for more details on that field).
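
The idea of a placement bitmask can be illustrated outside of StarPU. The
following is only a sketch: the names @code{SKETCH_CORE}, @code{SKETCH_CUDA}
and @code{sketch_may_execute} are hypothetical, and the bit values are chosen
for illustration and are not StarPU's actual definitions.

```c
#include <assert.h>

/* Hypothetical bit values, for illustration only; these are NOT StarPU's
 * actual definitions. Each kind of worker gets its own bit. */
enum sketch_where {
    SKETCH_CORE = 1 << 0, /* a CPU core */
    SKETCH_CUDA = 1 << 1  /* a CUDA device */
};

/* Return 1 if a worker of kind `worker_kind` may run a codelet whose
 * placement mask is `where_mask`, and 0 otherwise. */
int sketch_may_execute(unsigned where_mask, enum sketch_where worker_kind)
{
    assert(worker_kind == SKETCH_CORE || worker_kind == SKETCH_CUDA);
    return (where_mask & (unsigned)worker_kind) != 0;
}
```

With these assumed values, a mask equal to @code{SKETCH_CORE} accepts CPU
workers only, while @code{SKETCH_CORE | SKETCH_CUDA} would accept both kinds.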

When a CPU core executes a codelet, it calls the @code{.core_func} function,
which @emph{must} have the following prototype:
@example
void (*core_func)(starpu_data_interface_t *, void *)
@end example

In this example, we can ignore the first argument of this function, which gives
a description of the input and output buffers (e.g. the size and the location
of the matrices). The second argument is a pointer to a buffer passed as an
argument to the codelet by means of the @code{.cl_arg} field of the
@code{starpu_task} structure. Be aware that this may be a pointer to a
@emph{copy} of the actual buffer, and not the pointer given by the programmer:
if the codelet modifies this buffer, there is no guarantee that the initial
buffer will be modified as well. In particular, the buffer cannot be used as a
synchronization medium.
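
To make the copy semantics concrete, here is a minimal sketch written without
StarPU. It assumes a runtime that duplicates the argument before calling the
kernel; @code{sketch_run_on_copy}, @code{sketch_kernel_t} and
@code{sketch_negate} are hypothetical names, and StarPU's actual behaviour may
differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical kernel signature: only the argument pointer is modelled. */
typedef void (*sketch_kernel_t)(void *arg);

/* Run `kernel` on a private copy of `arg`, the way a runtime might when it
 * ships the argument to another memory space. Returns 0 on success and -1
 * if the copy could not be allocated. */
int sketch_run_on_copy(sketch_kernel_t kernel, const void *arg, size_t arg_size)
{
    assert(kernel != NULL && arg != NULL && arg_size > 0);
    void *copy = malloc(arg_size);
    if (copy == NULL)
        return -1;
    memcpy(copy, arg, arg_size);
    kernel(copy); /* the kernel only ever sees the copy */
    free(copy);   /* whatever the kernel wrote is discarded here */
    return 0;
}

/* Example kernel that overwrites its argument. */
void sketch_negate(void *arg)
{
    float *value = arg;
    *value = -*value;
}
```

Calling @code{sketch_run_on_copy} with @code{sketch_negate} and the address of
a @code{float} leaves the caller's variable unchanged, which is why a kernel
cannot signal anything back through this buffer.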

@subsection Submitting a Task
@example
@c @cartouche
void callback_func(void *callback_arg)
@{
    printf("Callback function (arg %p)\n", callback_arg);
@}

int main(int argc, char **argv)
@{
    /* initialize StarPU */
    starpu_init(NULL);

    struct starpu_task *task = starpu_task_create();
    task->cl = &cl;

    float array[2] = @{1.0f, -1.0f@};
    task->cl_arg = &array;
    task->cl_arg_size = 2*sizeof(float);

    task->callback_func = callback_func;
    task->callback_arg = (void *)0x42;

    /* starpu_submit_task will be a blocking call */
    task->synchronous = 1;

    /* submit the task to StarPU */
    starpu_submit_task(task);

    /* terminate StarPU */
    starpu_shutdown();
    return 0;
@}
@c @end cartouche
@end example

Before submitting any tasks to StarPU, @code{starpu_init} must be called. The
@code{NULL} argument specifies that we use the default configuration. Tasks
cannot be submitted after the termination of StarPU by a call to
@code{starpu_shutdown}.

In the example above, a task structure is allocated by a call to
@code{starpu_task_create}. This function only allocates the corresponding
structure and fills it with the default settings (@pxref{starpu_task_create}),
but it does not submit the task to StarPU.

@c not really clear ;)
The @code{.cl} field is a pointer to the codelet which the task will
execute: in other words, the codelet structure describes which computational
kernel should be offloaded on the different architectures, and the task
structure is a wrapper containing a codelet and the piece of data on which the
codelet should operate.

The optional @code{.cl_arg} field is a pointer to a buffer (of size
@code{.cl_arg_size}) with some parameters for the kernel
described by the codelet. For instance, if a codelet implements a computational
kernel that multiplies its input vector by a constant, the constant could be
specified by means of this buffer.

Once a task has been executed, an optional callback function can be called.
While the computational kernel could be offloaded on various architectures, the
callback function is always executed on a CPU. The @code{.callback_arg}
pointer is passed as an argument of the callback. The prototype of a callback
function must be:
@example
void (*callback_function)(void *);
@end example

If the @code{.synchronous} field is non-zero, task submission will be
synchronous: the @code{starpu_submit_task} function will not return until the
task has been executed. Note that the @code{starpu_shutdown} method does not
guarantee that asynchronous tasks have been executed before it returns.
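
The ordering described above (kernel first, then the callback, then return to
the caller) can be sketched without StarPU. The structure and function names
below are hypothetical and heavily reduced; they only mirror the field names
used in this section and say nothing about how StarPU implements submission.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced task structure; NOT StarPU's definition. */
struct sketch_task {
    void (*func)(void *cl_arg);    /* computational kernel */
    void *cl_arg;                  /* argument handed to the kernel */
    void (*callback_func)(void *); /* optional, may be NULL */
    void *callback_arg;            /* argument handed to the callback */
};

/* Synchronous submission: run the kernel, then the callback, and only
 * then return to the caller. */
void sketch_submit_sync(struct sketch_task *task)
{
    assert(task != NULL && task->func != NULL);
    task->func(task->cl_arg);
    if (task->callback_func != NULL)
        task->callback_func(task->callback_arg);
}

/* Example kernel and callback that record the order in which they ran:
 * the counter reaches 2 only if the kernel ran before the callback. */
void sketch_kernel(void *arg)
{
    int *step = arg;
    *step = 1;
}

void sketch_callback(void *arg)
{
    int *step = arg;
    if (*step == 1)
        *step = 2;
}
```

In this sketch everything runs in the calling thread, so the result is
available as soon as @code{sketch_submit_sync} returns. An asynchronous
variant would need an explicit way to wait for completion before reading any
result.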

@section Manipulating Data: Scaling a Vector
The previous example has shown how to submit tasks. In this section we show how
StarPU tasks can manipulate data.

Programmers can describe the data layout of their application so that StarPU is
responsible for enforcing data coherency and availability across the machine.
Instead of handling complex (and non-portable) mechanisms to perform data
movements, programmers only declare which pieces of data are accessed and/or
modified by a task, and StarPU makes sure that when a computational kernel
starts somewhere (e.g. on a GPU), its data are available locally.

Before submitting those tasks, the programmer first needs to declare the
different pieces of data to StarPU using the @code{starpu_monitor_*_data}
functions. To ease the development of applications for StarPU, it is possible
to describe multiple types of data layout. A type of data layout is called an
@b{interface}. StarPU provides several interfaces by default:
here we will consider the @b{vector interface}.

The following lines show how to declare an array of @code{n} elements of type
@code{float} using the vector interface:
@example
float tab[n];
starpu_data_handle tab_handle;
starpu_monitor_vector_data(&tab_handle, 0, tab, n, sizeof(float));
@end example

The first argument, called the @b{data handle}, is an opaque pointer which
designates the array in StarPU. This is also the structure which is used to
describe which data is used by a task. It is possible to construct a StarPU
task that multiplies this vector by a constant factor:
@example
float factor;
struct starpu_task *task = starpu_task_create();
task->cl = &cl;
task->cl_arg = &factor;
task->cl_arg_size = sizeof(float);
task->buffers[0].state = &tab_handle;
task->buffers[0].mode = RW;
@end example

The constant factor can be passed as a simple parameter of the task, but the
size and the content of the vector are not known in advance, so we describe
the vector by its handle. Each element of the @code{buffers} array has two
fields: @code{.state} is the handle of the data, and @code{.mode} specifies how
the kernel will access the data (@code{R} for read-only, @code{W} for
write-only and @code{RW} for read and write access).
@example
void scal_func(starpu_data_interface_t *buffers, void *arg)
@{
    unsigned i;
    float *factor = arg;

    /* length of the vector */
    unsigned n = buffers[0].vector.nx;
    /* local copy of the vector */
    float *val = (float *)buffers[0].vector.ptr;

    for (i = 0; i < n; i++)
        val[i] *= *factor;
@}

starpu_codelet cl = @{
    .where = CORE,
    .core_func = scal_func,
    .nbuffers = 1
@};
@end example

The @code{.nbuffers} field of the codelet structure specifies that there is
only one piece of data that is handled by the codelet. The second argument of
the @code{scal_func} function contains a pointer to the parameters of the
codelet (given in @code{task->cl_arg}), so we read the constant factor from
this pointer. The first argument is an array that gives a description of every
buffer passed in the @code{task->buffers} array. In the @b{vector interface},
the location of the vector (resp. its length) is accessible in the
@code{.vector.ptr} (resp. @code{.vector.nx}) field of the corresponding array
element. Since the vector is accessed in a read-write fashion, any modification
will automatically affect future accesses to that vector.

@section Vector Scaling on a Hybrid CPU/GPU Machine
Contrary to the previous examples, the task submitted in this example may be
executed not only by the CPUs, but also by a CUDA device.

TODO

@bye