@c -*-texinfo-*-
@c This file is part of the StarPU Handbook.
@c Copyright (C) 2009--2011 Universit@'e de Bordeaux 1
@c Copyright (C) 2010, 2011, 2012 Centre National de la Recherche Scientifique
@c Copyright (C) 2011 Institut National de Recherche en Informatique et Automatique
@c See the file starpu.texi for copying conditions.
@menu
* Compilation configuration::
* Execution configuration through environment variables::
@end menu
@node Compilation configuration
@section Compilation configuration
The following arguments can be given to the @code{configure} script.
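For instance, a typical invocation combining some of the options described below might look as follows (the installation prefix is only an example):
@smallexample
% ./configure --prefix=$HOME/starpu --enable-verbose --disable-opencl
% make
% make install
@end smallexample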
@menu
* Common configuration::
* Configuring workers::
* Advanced configuration::
@end menu
@node Common configuration
@subsection Common configuration
@menu
* --enable-debug::
* --enable-fast::
* --enable-verbose::
* --enable-coverage::
@end menu
@node --enable-debug
@subsubsection @code{--enable-debug}
Enable debugging messages.
@node --enable-fast
@subsubsection @code{--enable-fast}
Do not enforce assertions; this saves a lot of the time otherwise spent computing them.
@node --enable-verbose
@subsubsection @code{--enable-verbose}
Increase the verbosity of the debugging messages. This can be disabled
at runtime by setting the @code{STARPU_SILENT} environment variable to
any value.
@smallexample
% STARPU_SILENT=1 ./vector_scal
@end smallexample
@node --enable-coverage
@subsubsection @code{--enable-coverage}
Enable flags for the @code{gcov} coverage tool.
@node Configuring workers
@subsection Configuring workers
@menu
* --enable-maxcpus::
* --disable-cpu::
* --enable-maxcudadev::
* --disable-cuda::
* --with-cuda-dir::
* --with-cuda-include-dir::
* --with-cuda-lib-dir::
* --disable-cuda-memcpy-peer::
* --enable-maxopencldev::
* --disable-opencl::
* --with-opencl-dir::
* --with-opencl-include-dir::
* --with-opencl-lib-dir::
* --enable-gordon::
* --with-gordon-dir::
* --enable-maximplementations::
@end menu
@node --enable-maxcpus
@subsubsection @code{--enable-maxcpus=<number>}
Define the maximum number of CPU cores that StarPU will support, then
available as the @code{STARPU_MAXCPUS} macro.
@node --disable-cpu
@subsubsection @code{--disable-cpu}
Disable the use of the machine's CPUs; only GPUs etc. will be used.
@node --enable-maxcudadev
@subsubsection @code{--enable-maxcudadev=<number>}
Define the maximum number of CUDA devices that StarPU will support, then
available as the @code{STARPU_MAXCUDADEVS} macro.
@node --disable-cuda
@subsubsection @code{--disable-cuda}
Disable the use of CUDA, even if a valid CUDA installation was detected.
@node --with-cuda-dir
@subsubsection @code{--with-cuda-dir=<path>}
Specify the directory where CUDA is installed. This directory should notably contain
@code{include/cuda.h}.
@node --with-cuda-include-dir
@subsubsection @code{--with-cuda-include-dir=<path>}
Specify the directory where CUDA headers are installed. This directory should
notably contain @code{cuda.h}. This defaults to @code{/include} appended to the
value given to @code{--with-cuda-dir}.
@node --with-cuda-lib-dir
@subsubsection @code{--with-cuda-lib-dir=<path>}
Specify the directory where the CUDA library is installed. This directory should
notably contain the CUDA shared libraries (e.g. @code{libcuda.so}). This defaults to
@code{/lib} appended to the value given to @code{--with-cuda-dir}.
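For instance, assuming CUDA is installed under a non-standard prefix such as @code{/opt/cuda} (a hypothetical path), one might run:
@smallexample
% ./configure --with-cuda-dir=/opt/cuda
@end smallexample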
@node --disable-cuda-memcpy-peer
@subsubsection @code{--disable-cuda-memcpy-peer}
Explicitly disable peer transfers when using CUDA 4.0.
@node --enable-maxopencldev
@subsubsection @code{--enable-maxopencldev=<number>}
Define the maximum number of OpenCL devices that StarPU will support, then
available as the @code{STARPU_MAXOPENCLDEVS} macro.
@node --disable-opencl
@subsubsection @code{--disable-opencl}
Disable the use of OpenCL, even if the SDK is detected.
@node --with-opencl-dir
@subsubsection @code{--with-opencl-dir=<path>}
Specify the location of the OpenCL SDK. This directory should notably contain
@code{include/CL/cl.h} (or @code{include/OpenCL/cl.h} on Mac OS).
@node --with-opencl-include-dir
@subsubsection @code{--with-opencl-include-dir=<path>}
Specify the location of OpenCL headers. This directory should notably contain
@code{CL/cl.h} (or @code{OpenCL/cl.h} on Mac OS). This defaults to
@code{/include} appended to the value given to @code{--with-opencl-dir}.
@node --with-opencl-lib-dir
@subsubsection @code{--with-opencl-lib-dir=<path>}
Specify the location of the OpenCL library. This directory should notably
contain the OpenCL shared libraries (e.g. @code{libOpenCL.so}). This defaults to
@code{/lib} appended to the value given to @code{--with-opencl-dir}.
@node --enable-gordon
@subsubsection @code{--enable-gordon}
Enable the use of the Gordon runtime for Cell SPUs.
@c TODO: rather default to enabled when detected
@node --with-gordon-dir
@subsubsection @code{--with-gordon-dir=<path>}
Specify the location of the Gordon SDK.
@node --enable-maximplementations
@subsubsection @code{--enable-maximplementations=<number>}
Define the number of implementations that can be provided for a single kind of
device. It is then available as the @code{STARPU_MAXIMPLEMENTATIONS} macro.
@node Advanced configuration
@subsection Advanced configuration
@menu
* --enable-perf-debug::
* --enable-model-debug::
* --enable-stats::
* --enable-maxbuffers::
* --enable-allocation-cache::
* --enable-opengl-render::
* --enable-blas-lib::
* --disable-starpufft::
* --with-magma::
* --with-fxt::
* --with-perf-model-dir::
* --with-mpicc::
* --with-goto-dir::
* --with-atlas-dir::
* --with-mkl-cflags::
* --with-mkl-ldflags::
* --disable-gcc-extensions::
* --disable-socl::
* --disable-starpu-top::
@end menu
@node --enable-perf-debug
@subsubsection @code{--enable-perf-debug}
Enable performance debugging through gprof.
@node --enable-model-debug
@subsubsection @code{--enable-model-debug}
Enable performance model debugging.
@node --enable-stats
@subsubsection @code{--enable-stats}
Enable statistics.
@node --enable-maxbuffers
@subsubsection @code{--enable-maxbuffers=<nbuffers>}
Define the maximum number of buffers that tasks will be able to take
as parameters, then available as the @code{STARPU_NMAXBUFS} macro.
@node --enable-allocation-cache
@subsubsection @code{--enable-allocation-cache}
Enable the use of a data allocation cache to avoid the cost of repeated
memory allocations with CUDA. This is still experimental.
@node --enable-opengl-render
@subsubsection @code{--enable-opengl-render}
Enable the use of OpenGL for the rendering of some examples.
@c TODO: rather default to enabled when detected
@node --enable-blas-lib
@subsubsection @code{--enable-blas-lib=<name>}
Specify the BLAS library to be used by some of the examples. The
library must be either 'atlas' or 'goto'.
@node --disable-starpufft
@subsubsection @code{--disable-starpufft}
Disable the build of libstarpufft, even if fftw or cuFFT is available.
@node --with-magma
@subsubsection @code{--with-magma=<path>}
Specify where magma is installed. This directory should notably contain
@code{include/magmablas.h}.
@node --with-fxt
@subsubsection @code{--with-fxt=<path>}
Specify the location of FxT (for generating traces and rendering them
using ViTE). This directory should notably contain
@code{include/fxt/fxt.h}.
@c TODO add ref to other section
@node --with-perf-model-dir
@subsubsection @code{--with-perf-model-dir=<dir>}
Specify where performance models should be stored (instead of defaulting to the
current user's home).
@node --with-mpicc
@subsubsection @code{--with-mpicc=<path to mpicc>}
Specify the location of the @code{mpicc} compiler to be used for starpumpi.
@node --with-goto-dir
@subsubsection @code{--with-goto-dir=<dir>}
Specify the location of GotoBLAS.
@node --with-atlas-dir
@subsubsection @code{--with-atlas-dir=<dir>}
Specify the location of ATLAS. This directory should notably contain
@code{include/cblas.h}.
@node --with-mkl-cflags
@subsubsection @code{--with-mkl-cflags=<cflags>}
Specify the compilation flags for the MKL Library.
@node --with-mkl-ldflags
@subsubsection @code{--with-mkl-ldflags=<ldflags>}
Specify the linking flags for the MKL Library. Note that the
@url{http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/}
website provides a script to determine the linking flags.
@node --disable-gcc-extensions
@subsubsection @code{--disable-gcc-extensions}
Disable the GCC plug-in. It is enabled by default if the GCC compiler
provides plug-in support.
@node --disable-socl
@subsubsection @code{--disable-socl}
Disable the SOCL extension. It is enabled by default if a valid OpenCL
installation is found.
@node --disable-starpu-top
@subsubsection @code{--disable-starpu-top}
Disable the StarPU-Top interface. It is enabled by default if the required
dependencies are found.
@node Execution configuration through environment variables
@section Execution configuration through environment variables
@menu
* Workers:: Configuring workers
* Scheduling:: Configuring the Scheduling engine
* Misc:: Miscellaneous and debug
@end menu
Note: the values given in the @code{starpu_conf} structure passed when
calling @code{starpu_init} will override the values of the environment
variables.
@node Workers
@subsection Configuring workers
@menu
* STARPU_NCPUS:: Number of CPU workers
* STARPU_NCUDA:: Number of CUDA workers
* STARPU_NOPENCL:: Number of OpenCL workers
* STARPU_NGORDON:: Number of SPU workers (Cell)
* STARPU_WORKERS_CPUID:: Bind workers to specific CPUs
* STARPU_WORKERS_CUDAID:: Select specific CUDA devices
* STARPU_WORKERS_OPENCLID:: Select specific OpenCL devices
@end menu
@node STARPU_NCPUS
@subsubsection @code{STARPU_NCPUS} -- Number of CPU workers
Specify the number of CPU workers (thus not including the workers dedicated to
controlling accelerators). Note that by default, StarPU will not allocate
more CPU workers than there are physical CPUs, and that some CPUs are used to control
the accelerators.
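For instance, to restrict an application to 4 CPU workers (using the hypothetical @code{vector_scal} binary from the examples above):
@smallexample
% STARPU_NCPUS=4 ./vector_scal
@end smallexample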
@node STARPU_NCUDA
@subsubsection @code{STARPU_NCUDA} -- Number of CUDA workers
Specify the number of CUDA devices that StarPU can use. If
@code{STARPU_NCUDA} is lower than the number of physical devices, it is
possible to select which CUDA devices should be used by means of the
@code{STARPU_WORKERS_CUDAID} environment variable. By default, StarPU will
create as many CUDA workers as there are CUDA devices.
@node STARPU_NOPENCL
@subsubsection @code{STARPU_NOPENCL} -- Number of OpenCL workers
OpenCL equivalent of the @code{STARPU_NCUDA} environment variable.
@node STARPU_NGORDON
@subsubsection @code{STARPU_NGORDON} -- Number of SPU workers (Cell)
Specify the number of SPUs that StarPU can use.
@node STARPU_WORKERS_CPUID
@subsubsection @code{STARPU_WORKERS_CPUID} -- Bind workers to specific CPUs
Passing an array of integers (starting from 0) in @code{STARPU_WORKERS_CPUID}
specifies to which logical CPU the different workers should be
bound. For instance, if @code{STARPU_WORKERS_CPUID = "0 1 4 5"}, the first
worker will be bound to logical CPU #0, the second CPU worker will be bound to
logical CPU #1 and so on. Note that the logical ordering of the CPUs is either
determined by the OS, or provided by the @code{hwloc} library in case it is
available.
Note that the first workers correspond to the CUDA workers, then come the
OpenCL and the SPU, and finally the CPU workers. For example, if
we have @code{STARPU_NCUDA=1}, @code{STARPU_NOPENCL=1}, @code{STARPU_NCPUS=2}
and @code{STARPU_WORKERS_CPUID = "0 2 1 3"}, the CUDA device will be controlled
by logical CPU #0, the OpenCL device will be controlled by logical CPU #2, and
the logical CPUs #1 and #3 will be used by the CPU workers.
If the number of workers is larger than the array given in
@code{STARPU_WORKERS_CPUID}, the workers are bound to the logical CPUs in a
round-robin fashion: if @code{STARPU_WORKERS_CPUID = "0 1"}, the first and the
third (resp. second and fourth) workers will be put on CPU #0 (resp. CPU #1).
This variable is ignored if the @code{use_explicit_workers_bindid} flag of the
@code{starpu_conf} structure passed to @code{starpu_init} is set.
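The binding example above can thus be reproduced from the command line as follows (assuming a @code{vector_scal} binary):
@smallexample
% STARPU_NCUDA=1 STARPU_NOPENCL=1 STARPU_NCPUS=2 \
  STARPU_WORKERS_CPUID="0 2 1 3" ./vector_scal
@end smallexample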
@node STARPU_WORKERS_CUDAID
@subsubsection @code{STARPU_WORKERS_CUDAID} -- Select specific CUDA devices
Similarly to the @code{STARPU_WORKERS_CPUID} environment variable, it is
possible to select which CUDA devices should be used by StarPU. On a machine
equipped with 4 GPUs, setting @code{STARPU_WORKERS_CUDAID = "1 3"} and
@code{STARPU_NCUDA=2} specifies that 2 CUDA workers should be created, and that
they should use CUDA devices #1 and #3 (the logical ordering of the devices is
the one reported by CUDA).
This variable is ignored if the @code{use_explicit_workers_cuda_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.
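For instance, the setting described above could be obtained with (assuming a @code{vector_scal} binary):
@smallexample
% STARPU_NCUDA=2 STARPU_WORKERS_CUDAID="1 3" ./vector_scal
@end smallexample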
@node STARPU_WORKERS_OPENCLID
@subsubsection @code{STARPU_WORKERS_OPENCLID} -- Select specific OpenCL devices
OpenCL equivalent of the @code{STARPU_WORKERS_CUDAID} environment variable.
This variable is ignored if the @code{use_explicit_workers_opencl_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.
@node Scheduling
@subsection Configuring the Scheduling engine
@menu
* STARPU_SCHED:: Scheduling policy
* STARPU_CALIBRATE:: Calibrate performance models
* STARPU_PREFETCH:: Use data prefetch
* STARPU_SCHED_ALPHA:: Computation factor
* STARPU_SCHED_BETA:: Communication factor
@end menu
@node STARPU_SCHED
@subsubsection @code{STARPU_SCHED} -- Scheduling policy
Choose between the different scheduling policies proposed by StarPU:
random, work stealing, greedy, with performance models, etc.
Use @code{STARPU_SCHED=help} to get the list of available schedulers.
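For instance, to list the available policies and then run with one of them (assuming a @code{vector_scal} binary; the policy name must be one of those printed by @code{help}):
@smallexample
% STARPU_SCHED=help ./vector_scal
% STARPU_SCHED=dmda ./vector_scal
@end smallexample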
@node STARPU_CALIBRATE
@subsubsection @code{STARPU_CALIBRATE} -- Calibrate performance models
If this variable is set to 1, the performance models are calibrated during
the execution. If it is set to 2, the previous values are dropped to restart
calibration from scratch. Setting this variable to 0 disables calibration, which
is the default behaviour.
Note: this currently only applies to the @code{dm}, @code{dmda} and @code{heft} scheduling policies.
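For instance, to calibrate the performance models while running with one of the policies mentioned above:
@smallexample
% STARPU_CALIBRATE=1 STARPU_SCHED=dmda ./vector_scal
@end smallexample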
@node STARPU_PREFETCH
@subsubsection @code{STARPU_PREFETCH} -- Use data prefetch
This variable indicates whether data prefetching should be enabled (0 means
that it is disabled). If prefetching is enabled, when a task is scheduled to be
executed e.g. on a GPU, StarPU will request an asynchronous transfer in
advance, so that data is already present on the GPU when the task starts. As a
result, computation and data transfers are overlapped.
Note that prefetching is enabled by default in StarPU.
@node STARPU_SCHED_ALPHA
@subsubsection @code{STARPU_SCHED_ALPHA} -- Computation factor
To estimate the cost of a task, StarPU takes into account the estimated
computation time (obtained thanks to performance models). The alpha factor is
the coefficient to be applied to it before adding it to the communication part.
@node STARPU_SCHED_BETA
@subsubsection @code{STARPU_SCHED_BETA} -- Communication factor
To estimate the cost of a task, StarPU takes into account the estimated
data transfer time (obtained thanks to performance models). The beta factor is
the coefficient to be applied to it before adding it to the computation part.
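Putting the two factors together, the cost that the scheduler weighs can be sketched as follows (this illustrates the trade-off, not necessarily the exact expression used internally):
@smallexample
cost = alpha * computation_time + beta * data_transfer_time
@end smallexample
Increasing beta thus makes the scheduler more reluctant to place a task on a worker that would require expensive data transfers.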
@node Misc
@subsection Miscellaneous and debug
@menu
* STARPU_SILENT:: Disable verbose mode
* STARPU_LOGFILENAME:: Select debug file name
* STARPU_FXT_PREFIX:: FxT trace location
* STARPU_LIMIT_GPU_MEM:: Restrict memory size on the GPUs
* STARPU_GENERATE_TRACE:: Generate a Paje trace when StarPU is shut down
@end menu
@node STARPU_SILENT
@subsubsection @code{STARPU_SILENT} -- Disable verbose mode
This variable can be used to disable verbose mode at runtime when StarPU
has been configured with the @code{--enable-verbose} option.
@node STARPU_LOGFILENAME
@subsubsection @code{STARPU_LOGFILENAME} -- Select debug file name
This variable specifies the file to which the debugging output should be saved.
@node STARPU_FXT_PREFIX
@subsubsection @code{STARPU_FXT_PREFIX} -- FxT trace location
This variable specifies in which directory to save the trace generated if FxT is enabled. It needs to have a trailing '/' character.
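For instance (the directory below is a hypothetical example; note the trailing '/'):
@smallexample
% STARPU_FXT_PREFIX=/tmp/traces/ ./vector_scal
@end smallexample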
@node STARPU_LIMIT_GPU_MEM
@subsubsection @code{STARPU_LIMIT_GPU_MEM} -- Restrict memory size on the GPUs
This variable specifies the maximum number of megabytes that should be
available to the application on each GPU. If this value is smaller than
the size of a GPU's memory, StarPU pre-allocates a buffer to waste the remaining
memory on the device. This variable is intended for experimental purposes,
as it emulates devices that have a limited amount of memory.
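For instance, to emulate GPUs that have only 1 GB of memory available to the application:
@smallexample
% STARPU_LIMIT_GPU_MEM=1024 ./vector_scal
@end smallexample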
@node STARPU_GENERATE_TRACE
@subsubsection @code{STARPU_GENERATE_TRACE} -- Generate a Paje trace when StarPU is shut down
When set to 1, this variable indicates that StarPU should automatically
generate a Paje trace when @code{starpu_shutdown} is called.