data_management.doxy

/* StarPU --- Runtime system for heterogeneous multicore architectures.
*
* Copyright (C) 2011,2012,2017 Inria
* Copyright (C) 2010-2019 CNRS
* Copyright (C) 2009-2011,2014-2017,2019 Université de Bordeaux
*
* StarPU is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or (at
* your option) any later version.
*
* StarPU is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* See the GNU Lesser General Public License in COPYING.LGPL for more details.
*/
/*! \defgroup API_Data_Management Data Management

\brief This section describes the data management facilities provided
by StarPU. We show how to use existing data interfaces in
\ref API_Data_Interfaces, but developers can design their own data interfaces if
required.

\typedef starpu_data_handle_t
\ingroup API_Data_Management
StarPU uses ::starpu_data_handle_t as an opaque handle to
manage a piece of data. Once a piece of data has been registered to
StarPU, it is associated to a ::starpu_data_handle_t which keeps track
of the state of the piece of data over the entire machine, so that we
can maintain data consistency and, for instance, locate data replicates.

\typedef starpu_arbiter_t
\ingroup API_Data_Management
An arbiter, which implements an advanced but centralized management of
concurrent data accesses, see \ref ConcurrentDataAccess for details.

\enum starpu_data_access_mode
\ingroup API_Data_Management
Describe a data access mode.

\var starpu_data_access_mode::STARPU_NONE
TODO

\var starpu_data_access_mode::STARPU_R
read-only mode.

\var starpu_data_access_mode::STARPU_W
write-only mode.

\var starpu_data_access_mode::STARPU_RW
read-write mode. This is equivalent to ::STARPU_R|::STARPU_W.

\var starpu_data_access_mode::STARPU_SCRATCH
A temporary buffer is allocated for the task, but StarPU does not
enforce data consistency---i.e. each device has its own buffer,
independently from each other (even for CPUs), and no data
transfer is ever performed. This is useful for temporary variables
to avoid allocating/freeing buffers inside each task. Currently,
no behavior is defined concerning the relation with the ::STARPU_R
and ::STARPU_W modes and the value provided at registration ---
i.e., the value of the scratch buffer is undefined at entry of the
codelet function. It is being considered for future extensions at
least to define the initial value. For now, data to be used in
::STARPU_SCRATCH mode should be registered with node -1 and
a <c>NULL</c> pointer, since the value of the provided buffer is
simply ignored for now.
\var starpu_data_access_mode::STARPU_REDUX
todo

\var starpu_data_access_mode::STARPU_COMMUTE
::STARPU_COMMUTE can be passed along with
::STARPU_W or ::STARPU_RW to express that StarPU can let tasks
commute, which is useful e.g. when bringing a contribution into
some data, which can be done in any order (but still requires
sequential consistency against reads or non-commutative writes).

\var starpu_data_access_mode::STARPU_SSEND
Used in starpu_mpi_insert_task() to specify that the data has to be
sent using a synchronous and non-blocking mode (see
starpu_mpi_issend()).

\var starpu_data_access_mode::STARPU_LOCALITY
Used to tell the scheduler which data is the most important for
the task, and should thus be used to try to group tasks on the
same core or cache, etc. For now only the ws and lws schedulers
take this flag into account, and only when rebuilt with the
USE_LOCALITY flag defined in the
src/sched_policies/work_stealing_policy.c source code.

\var starpu_data_access_mode::STARPU_ACCESS_MODE_MAX
todo
@name Basic Data Management API
\ingroup API_Data_Management

Data management is done at a high level in StarPU: rather than
accessing a mere list of contiguous buffers, the tasks may manipulate
data that are described by a high-level construct which we call data
interface.

An example of data interface is the "vector" interface which describes
a contiguous data array on a specific memory node. This interface is a
simple structure containing the number of elements in the array, the
size of the elements, and the address of the array in the appropriate
address space (this address may be invalid if there is no valid copy
of the array in the memory node). More information on the data
interfaces provided by StarPU is given in \ref API_Data_Interfaces.

When a piece of data managed by StarPU is used by a task, the task
implementation is given a pointer to an interface describing a valid
copy of the data that is accessible from the current processing unit.

Every worker is associated to a memory node which is a logical
abstraction of the address space from which the processing unit gets
its data. For instance, the memory node associated to the different
CPU workers represents main memory (RAM), while the memory node associated
to a GPU is the DRAM embedded on the device. Every memory node is
identified by a logical index which can be retrieved with the
function starpu_worker_get_memory_node(). When registering a piece of
data to StarPU, the specified memory node indicates where the piece of
data initially resides (we also call this memory node the home node of
a piece of data).

In the case of NUMA systems, the functions starpu_memory_nodes_numa_devid_to_id()
and starpu_memory_nodes_numa_id_to_devid() can be used to convert between NUMA node
numbers as seen by the operating system and NUMA node numbers as seen by StarPU.
\fn void starpu_data_register(starpu_data_handle_t *handleptr, int home_node, void *data_interface, struct starpu_data_interface_ops *ops)
\ingroup API_Data_Management
Register a piece of data into the handle located at the
\p handleptr address. The \p data_interface buffer contains the initial
description of the data in the \p home_node. The \p ops argument is a
pointer to a structure describing the different methods used to
manipulate this type of interface. See starpu_data_interface_ops for
more details on this structure.
If \p home_node is -1, StarPU will automatically allocate the memory when
it is used for the first time in write-only mode. Once such a data
handle has been automatically allocated, it is possible to access it
using any access mode.
Note that StarPU supplies a set of predefined types of interface (e.g.
vector or matrix) which can be registered by means of helper
functions (e.g. starpu_vector_data_register() or
starpu_matrix_data_register()).
\fn void starpu_data_ptr_register(starpu_data_handle_t handle, unsigned node)
\ingroup API_Data_Management
Register that a buffer for \p handle on \p node will be set. This is typically
used by starpu_*_ptr_register helpers before setting the interface pointers for
this node, to tell the core that the buffer is now allocated.

\fn void starpu_data_register_same(starpu_data_handle_t *handledst, starpu_data_handle_t handlesrc)
\ingroup API_Data_Management
Register a new piece of data into the handle \p handledst with the
same interface as the handle \p handlesrc.

\fn void starpu_data_unregister(starpu_data_handle_t handle)
\ingroup API_Data_Management
Unregister a data \p handle from StarPU. If the
data was automatically allocated by StarPU because the home node was
-1, all automatically allocated buffers are freed. Otherwise, a valid
copy of the data is put back into the home node in the buffer that was
initially registered. Using a data handle that has been unregistered
from StarPU results in undefined behaviour. If the value of the data
in the home node does not need to be updated, the function
starpu_data_unregister_no_coherency() can be used instead.
\fn void starpu_data_unregister_no_coherency(starpu_data_handle_t handle)
\ingroup API_Data_Management
The same as starpu_data_unregister(), except that
StarPU does not put back a valid copy into the home node, in the
buffer that was initially registered.

\fn void starpu_data_unregister_submit(starpu_data_handle_t handle)
\ingroup API_Data_Management
Destroy the data \p handle once it is no longer needed by any
submitted task. No coherency is assumed.

\fn void starpu_data_invalidate(starpu_data_handle_t handle)
\ingroup API_Data_Management
Destroy all replicates of the data \p handle immediately. After
data invalidation, the first access to \p handle must be performed in
::STARPU_W mode. Accessing an invalidated data in ::STARPU_R mode
results in undefined behaviour.

\fn void starpu_data_invalidate_submit(starpu_data_handle_t handle)
\ingroup API_Data_Management
Submit invalidation of the data \p handle after completion of
previously submitted tasks.

\fn void starpu_data_set_wt_mask(starpu_data_handle_t handle, uint32_t wt_mask)
\ingroup API_Data_Management
Set the write-through mask of the data \p handle (and
its children), i.e. a bitmask of nodes where the data should always
be replicated after modification. It also prevents the data from being
evicted from these nodes when memory gets scarce. When the data is
modified, it is automatically transferred into those memory nodes. For
instance a <c>1<<0</c> write-through mask means that the CUDA workers
will commit their changes in main memory (node 0).
\fn void starpu_data_set_name(starpu_data_handle_t handle, const char *name)
\ingroup API_Data_Management
Set the name of the data, to be shown in various profiling tools.

\fn void starpu_data_set_coordinates_array(starpu_data_handle_t handle, int dimensions, int dims[])
\ingroup API_Data_Management
Set the coordinates of the data, to be shown in various profiling tools.
\p dimensions is the size of the \p dims array.
This can be for instance the tile coordinates within a big matrix.

\fn void starpu_data_set_coordinates(starpu_data_handle_t handle, unsigned dimensions, ...)
\ingroup API_Data_Management
Set the coordinates of the data, to be shown in various profiling tools.
\p dimensions is the number of subsequent \c int parameters.
This can be for instance the tile coordinates within a big matrix.

\fn void starpu_data_set_ooc_flag(starpu_data_handle_t handle, unsigned flag)
\ingroup API_Data_Management
Set whether this data should be eligible to be evicted to disk storage (1) or
not (0). The default is 1.

\fn unsigned starpu_data_get_ooc_flag(starpu_data_handle_t handle)
\ingroup API_Data_Management
Get whether this data was set to be eligible to be evicted to disk storage (1) or
not (0).
\fn int starpu_data_fetch_on_node(starpu_data_handle_t handle, unsigned node, unsigned async)
\ingroup API_Data_Management
Issue a fetch request for the data \p handle to \p node, i.e.
request that the data be replicated to the given node as soon as possible, so that it is
available there for tasks. If \p async is 0, the call will
block until the transfer is complete, else the call will return immediately,
after having just queued the request. In the latter case, the request will
asynchronously wait for the completion of any task writing on the data.

\fn int starpu_data_prefetch_on_node(starpu_data_handle_t handle, unsigned node, unsigned async)
\ingroup API_Data_Management
Issue a prefetch request for the data \p handle to \p node, i.e.
request that the data be replicated to \p node when there is room for it, so that it is
available there for tasks. If \p async is 0, the call will
block until the transfer is complete, else the call will return immediately,
after having just queued the request. In the latter case, the request will
asynchronously wait for the completion of any task writing on the data.

\fn int starpu_data_idle_prefetch_on_node(starpu_data_handle_t handle, unsigned node, unsigned async)
\ingroup API_Data_Management
Issue an idle prefetch request for the data \p handle to \p node, i.e.
request that the data be replicated to \p node, so that it is
available there for tasks, but only when the bus is really idle. If \p async is 0, the call will
block until the transfer is complete, else the call will return immediately,
after having just queued the request. In the latter case, the request will
asynchronously wait for the completion of any task writing on the data.

\fn unsigned starpu_data_is_on_node(starpu_data_handle_t handle, unsigned node)
\ingroup API_Data_Management
Check whether a valid copy of \p handle is currently available on
memory node \p node.

\fn void starpu_data_wont_use(starpu_data_handle_t handle)
\ingroup API_Data_Management
Advise StarPU that \p handle will not be used in the near future, and is
thus a good candidate for eviction from GPUs. StarPU will then write its value
back to its home node when the bus is idle, and select this data in priority
for eviction when memory gets low.
\fn starpu_data_handle_t starpu_data_lookup(const void *ptr)
\ingroup API_Data_Management
Return the handle corresponding to the data pointed to by the \p ptr host pointer.

\fn int starpu_data_request_allocation(starpu_data_handle_t handle, unsigned node)
\ingroup API_Data_Management
Explicitly ask StarPU to allocate room for a piece of data on
the specified memory \p node.

\fn void starpu_data_query_status(starpu_data_handle_t handle, int memory_node, int *is_allocated, int *is_valid, int *is_requested)
\ingroup API_Data_Management
Query the status of \p handle on the specified \p memory_node.

\fn void starpu_data_advise_as_important(starpu_data_handle_t handle, unsigned is_important)
\ingroup API_Data_Management
Specify that the data \p handle can be discarded without impacting the application.

\fn void starpu_data_set_reduction_methods(starpu_data_handle_t handle, struct starpu_codelet *redux_cl, struct starpu_codelet *init_cl)
\ingroup API_Data_Management
Set the codelets to be used for \p handle when it is accessed in
::STARPU_REDUX mode. Per-worker buffers will be initialized with
the codelet \p init_cl, and reduction between per-worker buffers will be
done with the codelet \p redux_cl.
\fn struct starpu_data_interface_ops* starpu_data_get_interface_ops(starpu_data_handle_t handle)
\ingroup API_Data_Management
todo

\fn void starpu_data_set_user_data(starpu_data_handle_t handle, void* user_data)
\ingroup API_Data_Management
Set the field \c user_data for the \p handle to \p user_data. It can
then be retrieved with starpu_data_get_user_data(). \p user_data can be any
application-defined value, for instance a pointer to an object-oriented
container for the data.

\fn void *starpu_data_get_user_data(starpu_data_handle_t handle)
\ingroup API_Data_Management
Retrieve the field \c user_data previously set for the \p handle.
@name Access registered data from the application
\ingroup API_Data_Management

\fn int starpu_data_acquire(starpu_data_handle_t handle, enum starpu_data_access_mode mode)
\ingroup API_Data_Management
The application must call this function prior to accessing
registered data from main memory outside tasks. StarPU ensures that
the application will get an up-to-date copy of \p handle in main memory
located where the data was originally registered, and that all
concurrent accesses (e.g. from tasks) will be consistent with the
access mode specified with \p mode. starpu_data_release() must
be called once the application no longer needs to access the piece of
data. Note that implicit data dependencies are also enforced
by starpu_data_acquire(), i.e. starpu_data_acquire() will wait for all
tasks scheduled to work on the data, unless implicit data dependencies
have been disabled explicitly by calling starpu_data_set_default_sequential_consistency_flag() or
starpu_data_set_sequential_consistency_flag(). starpu_data_acquire() is a
blocking call, so it cannot be called from tasks or from their
callbacks (in that case, starpu_data_acquire() returns <c>-EDEADLK</c>). Upon
successful completion, this function returns 0.
\fn int starpu_data_acquire_cb(starpu_data_handle_t handle, enum starpu_data_access_mode mode, void (*callback)(void *), void *arg)
\ingroup API_Data_Management
Asynchronous equivalent of starpu_data_acquire(). When the data
specified in \p handle is available in the access \p mode, the \p
callback function is executed. The application may access
the requested data during the execution of \p callback. The \p callback
function must call starpu_data_release() once the application no longer
needs to access the piece of data. Note that implicit data
dependencies are also enforced by starpu_data_acquire_cb() in case they
are not disabled. Contrary to starpu_data_acquire(), this function is
non-blocking and may be called from task callbacks. Upon successful
completion, this function returns 0.

\fn int starpu_data_acquire_cb_sequential_consistency(starpu_data_handle_t handle, enum starpu_data_access_mode mode, void (*callback)(void *), void *arg, int sequential_consistency)
\ingroup API_Data_Management
Equivalent of starpu_data_acquire_cb() with the possibility of enabling or disabling data dependencies.
When the data specified in \p handle is available in the access
\p mode, the \p callback function is executed. The application may access
the requested data during the execution of this \p callback. The \p callback
function must call starpu_data_release() once the application no longer
needs to access the piece of data. Note that implicit data
dependencies are also enforced by starpu_data_acquire_cb_sequential_consistency() in case they
are not disabled specifically for the given \p handle or by the parameter \p sequential_consistency.
Similarly to starpu_data_acquire_cb(), this function is
non-blocking and may be called from task callbacks. Upon successful
completion, this function returns 0.

\fn int starpu_data_acquire_try(starpu_data_handle_t handle, enum starpu_data_access_mode mode)
\ingroup API_Data_Management
The application can call this function instead of starpu_data_acquire() so as to
acquire the data like starpu_data_acquire(), but only if all
previously-submitted tasks have completed, in which case starpu_data_acquire_try()
returns 0. StarPU will have ensured that the application will get an up-to-date
copy of \p handle in main memory located where the data was originally
registered. starpu_data_release() must be called once the application no longer
needs to access the piece of data.
If not all previously-submitted tasks have completed, starpu_data_acquire_try()
returns <c>-EAGAIN</c>, and starpu_data_release() must not be called.
\def STARPU_ACQUIRE_NO_NODE
\ingroup API_Data_Management
This macro can be used to acquire data without requiring it to be available on a given node; only R/W dependencies are enforced.
This can for instance be used to wait for tasks which produce the data, but without requesting a fetch to main memory.

\def STARPU_ACQUIRE_NO_NODE_LOCK_ALL
\ingroup API_Data_Management
The same as ::STARPU_ACQUIRE_NO_NODE, but will lock the data on all nodes, preventing them from being evicted for instance.
This is mostly useful inside StarPU itself.

\fn int starpu_data_acquire_on_node(starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode)
\ingroup API_Data_Management
The same as starpu_data_acquire(), except that the data
will be available on the given memory node instead of main
memory.
::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be
used instead of an explicit node number.

\fn int starpu_data_acquire_on_node_cb(starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode, void (*callback)(void *), void *arg)
\ingroup API_Data_Management
The same as starpu_data_acquire_cb(), except that the
data will be available on the given memory node instead of main
memory.
::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be
used instead of an explicit node number.

\fn int starpu_data_acquire_on_node_cb_sequential_consistency(starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode, void (*callback)(void *), void *arg, int sequential_consistency)
\ingroup API_Data_Management
The same as starpu_data_acquire_cb_sequential_consistency(), except that the
data will be available on the given memory node instead of main
memory.
::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an
explicit node number.

\fn int starpu_data_acquire_on_node_cb_sequential_consistency_sync_jobids(starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode, void (*callback)(void *), void *arg, int sequential_consistency, int quick, long *pre_sync_jobid, long *post_sync_jobid)
\ingroup API_Data_Management
The same as starpu_data_acquire_on_node_cb_sequential_consistency(),
except that the \e pre_sync_jobid and \e post_sync_jobid parameters can be used
to retrieve the jobids of the synchronization tasks: \e pre_sync_jobid happens
just before the acquisition, and \e post_sync_jobid just after the
release.

\fn int starpu_data_acquire_on_node_try(starpu_data_handle_t handle, int node, enum starpu_data_access_mode mode)
\ingroup API_Data_Management
The same as starpu_data_acquire_try(), except that the
data will be available on the given memory node instead of main
memory.
::STARPU_ACQUIRE_NO_NODE and ::STARPU_ACQUIRE_NO_NODE_LOCK_ALL can be used instead of an
explicit node number.

\def STARPU_DATA_ACQUIRE_CB(handle, mode, code)
\ingroup API_Data_Management
STARPU_DATA_ACQUIRE_CB() is the same as starpu_data_acquire_cb(),
except that the code to be executed in a callback is directly provided
as a macro parameter, and the data \p handle is automatically released
after it. This makes it easy to execute code which depends on the
value of some registered data. This is non-blocking too and may be
called from task callbacks.
\fn void starpu_data_release(starpu_data_handle_t handle)
\ingroup API_Data_Management
Release the piece of data acquired by the
application either by starpu_data_acquire() or by
starpu_data_acquire_cb().

\fn void starpu_data_release_on_node(starpu_data_handle_t handle, int node)
\ingroup API_Data_Management
The same as starpu_data_release(), except that the data
will be available on the given memory \p node instead of main memory.
The \p node parameter must be exactly the same as in the corresponding \c
starpu_data_acquire_on_node* call.

\fn starpu_arbiter_t starpu_arbiter_create(void)
\ingroup API_Data_Management
Create a data access arbiter, see \ref ConcurrentDataAccess for details.

\fn void starpu_data_assign_arbiter(starpu_data_handle_t handle, starpu_arbiter_t arbiter)
\ingroup API_Data_Management
Make accesses to \p handle managed by \p arbiter.

\fn void starpu_arbiter_destroy(starpu_arbiter_t arbiter)
\ingroup API_Data_Management
Destroy \p arbiter. This must only be called after all data
assigned to it have been unregistered.
*/