@c -*-texinfo-*-
@c This file is part of the StarPU Handbook.
@c Copyright (C) 2009--2011 Universit@'e de Bordeaux 1
@c Copyright (C) 2010, 2011, 2012 Centre National de la Recherche Scientifique
@c Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
@c See the file starpu.texi for copying conditions.

@menu
* Compilation configuration::
* Execution configuration through environment variables::
@end menu

@node Compilation configuration
@section Compilation configuration

The following arguments can be given to the @code{configure} script.

@menu
* Common configuration::
* Configuring workers::
* Extension configuration::
* Advanced configuration::
@end menu

@node Common configuration
@subsection Common configuration

@table @code

@item --enable-debug
Enable debugging messages.

@item --enable-fast
Disable assertion checks, which saves computation time.

@item --enable-verbose
Increase the verbosity of the debugging messages. This can be disabled
at runtime by setting the environment variable @code{STARPU_SILENT} to
any value.

@smallexample
% STARPU_SILENT=1 ./vector_scal
@end smallexample

@item --enable-coverage
Enable flags for the @code{gcov} coverage tool.

@end table
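For instance, a debugging-oriented build might combine these flags as
follows (this is only a sketch; the actual set of flags to use depends on
your needs):

@smallexample
% ./configure --enable-debug --enable-verbose
% make
@end smallexample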
@node Configuring workers
@subsection Configuring workers

@table @code

@item --enable-maxcpus=@var{count}
Use at most @var{count} CPU cores. This information is then
available as the @code{STARPU_MAXCPUS} macro.

@item --disable-cpu
Disable the use of CPUs of the machine. Only GPUs etc. will be used.

@item --enable-maxcudadev=@var{count}
Use at most @var{count} CUDA devices. This information is then
available as the @code{STARPU_MAXCUDADEVS} macro.

@item --disable-cuda
Disable the use of CUDA, even if a valid CUDA installation was detected.

@item --with-cuda-dir=@var{prefix}
Search for CUDA under @var{prefix}, which should notably contain
@file{include/cuda.h}.

@item --with-cuda-include-dir=@var{dir}
Search for CUDA headers under @var{dir}, which should
notably contain @file{cuda.h}. This defaults to @code{/include} appended to the
value given to @code{--with-cuda-dir}.

@item --with-cuda-lib-dir=@var{dir}
Search for CUDA libraries under @var{dir}, which should notably contain
the CUDA shared libraries---e.g., @file{libcuda.so}. This defaults to
@code{/lib} appended to the value given to @code{--with-cuda-dir}.

@item --disable-cuda-memcpy-peer
Explicitly disable peer transfers when using CUDA 4.0.

@item --enable-maxopencldev=@var{count}
Use at most @var{count} OpenCL devices. This information is then
available as the @code{STARPU_MAXOPENCLDEVS} macro.

@item --disable-opencl
Disable the use of OpenCL, even if the SDK is detected.

@item --with-opencl-dir=@var{prefix}
Search for an OpenCL implementation under @var{prefix}, which should
notably contain @file{include/CL/cl.h} (or @file{include/OpenCL/cl.h} on
Mac OS).

@item --with-opencl-include-dir=@var{dir}
Search for OpenCL headers under @var{dir}, which should notably contain
@file{CL/cl.h} (or @file{OpenCL/cl.h} on Mac OS). This defaults to
@code{/include} appended to the value given to @code{--with-opencl-dir}.

@item --with-opencl-lib-dir=@var{dir}
Search for an OpenCL library under @var{dir}, which should notably
contain the OpenCL shared libraries---e.g., @file{libOpenCL.so}. This defaults to
@code{/lib} appended to the value given to @code{--with-opencl-dir}.

@item --enable-gordon
Enable the use of the Gordon runtime for Cell SPUs.
@c TODO: rather default to enabled when detected

@item --with-gordon-dir=@var{prefix}
Search for the Gordon SDK under @var{prefix}.

@item --enable-maximplementations=@var{count}
Allow for at most @var{count} codelet implementations for the same
target device. This information is then available as the
@code{STARPU_MAXIMPLEMENTATIONS} macro.

@item --enable-max-sched-ctxs=@var{count}
Allow for at most @var{count} scheduling contexts. This information is
then available as the @code{STARPU_NMAX_SCHED_CTXS} macro.

@end table
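As an example, assuming CUDA is installed under the hypothetical prefix
@file{/usr/local/cuda}, one could point @code{configure} at it while
disabling OpenCL:

@smallexample
% ./configure --with-cuda-dir=/usr/local/cuda --disable-opencl
@end smallexample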
@node Extension configuration
@subsection Extension configuration

@table @code

@item --disable-socl
Disable the SOCL extension (@pxref{SOCL OpenCL Extensions}). By
default, it is enabled when an OpenCL implementation is found.

@item --disable-starpu-top
Disable the StarPU-Top interface (@pxref{StarPU-Top}). By default, it
is enabled when the required dependencies are found.

@item --disable-gcc-extensions
Disable the GCC plug-in (@pxref{C Extensions}). By default, it is
enabled when the GCC compiler provides plug-in support.

@item --with-mpicc=@var{path}
Use the @command{mpicc} compiler at @var{path}, for starpumpi
(@pxref{StarPU MPI support}).

@item --enable-comm-stats
Enable communication statistics for starpumpi (@pxref{StarPU MPI
support}).

@end table

@node Advanced configuration
@subsection Advanced configuration

@table @code

@item --enable-perf-debug
Enable performance debugging through @code{gprof}.

@item --enable-model-debug
Enable performance model debugging.

@item --enable-stats
@c see ../../src/datawizard/datastats.c
Enable gathering of memory transfer statistics.

@item --enable-maxbuffers
Define the maximum number of buffers that tasks will be able to take
as parameters, then available as the @code{STARPU_NMAXBUFS} macro.

@item --enable-allocation-cache
Enable the use of a data allocation cache to avoid the cost of repeated
allocations with CUDA. Still experimental.

@item --enable-opengl-render
Enable the use of OpenGL for the rendering of some examples.
@c TODO: rather default to enabled when detected

@item --enable-blas-lib
Specify the BLAS library to be used by some of the examples. The
library has to be @code{atlas} or @code{goto}.

@item --disable-starpufft
Disable the build of libstarpufft, even if fftw or cuFFT is available.

@item --with-magma=@var{prefix}
Search for MAGMA under @var{prefix}, which should notably
contain @file{include/magmablas.h}.

@item --with-fxt=@var{prefix}
Search for FxT under @var{prefix}.
@url{http://savannah.nongnu.org/projects/fkt, FxT} is used to generate
traces of scheduling events, which can then be rendered using ViTE
(@pxref{Off-line, off-line performance feedback}). @var{prefix} should
notably contain @file{include/fxt/fxt.h}.

@item --with-perf-model-dir=@var{dir}
Store performance models under @var{dir}, instead of the current user's
home directory.

@item --with-goto-dir=@var{prefix}
Search for GotoBLAS under @var{prefix}.

@item --with-atlas-dir=@var{prefix}
Search for ATLAS under @var{prefix}, which should notably contain
@file{include/cblas.h}.

@item --with-mkl-cflags=@var{cflags}
Use @var{cflags} to compile code that uses the MKL library.

@item --with-mkl-ldflags=@var{ldflags}
Use @var{ldflags} when linking code that uses the MKL library. Note
that the
@url{http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/,
MKL website} provides a script to determine the linking flags.

@item --enable-sched-ctx-hypervisor
Enable the Scheduling Context Hypervisor plugin (@pxref{Scheduling Context Hypervisor}).
By default, it is disabled.

@end table
@node Execution configuration through environment variables
@section Execution configuration through environment variables

@menu
* Workers:: Configuring workers
* Scheduling:: Configuring the Scheduling engine
* Misc:: Miscellaneous and debug
@end menu

@node Workers
@subsection Configuring workers

@menu
* STARPU_NCPU:: Number of CPU workers
* STARPU_NCUDA:: Number of CUDA workers
* STARPU_NOPENCL:: Number of OpenCL workers
* STARPU_NGORDON:: Number of SPU workers (Cell)
* STARPU_WORKERS_NOBIND:: Do not bind workers
* STARPU_WORKERS_CPUID:: Bind workers to specific CPUs
* STARPU_WORKERS_CUDAID:: Select specific CUDA devices
* STARPU_WORKERS_OPENCLID:: Select specific OpenCL devices
* STARPU_SINGLE_COMBINED_WORKER:: Do not use concurrent workers
* STARPU_MIN_WORKERSIZE:: Minimum size of the combined workers
* STARPU_MAX_WORKERSIZE:: Maximum size of the combined workers
@end menu

@node STARPU_NCPU
@subsubsection @code{STARPU_NCPU} -- Number of CPU workers

Specify the number of CPU workers (thus not including workers dedicated to
controlling accelerators). Note that by default, StarPU will not allocate more
CPU workers than there are physical CPUs, and that some CPUs are used to
control the accelerators.

@node STARPU_NCUDA
@subsubsection @code{STARPU_NCUDA} -- Number of CUDA workers

Specify the number of CUDA devices that StarPU can use. If
@code{STARPU_NCUDA} is lower than the number of physical devices, it is
possible to select which CUDA devices should be used by means of the
@code{STARPU_WORKERS_CUDAID} environment variable. By default, StarPU will
create as many CUDA workers as there are CUDA devices.

@node STARPU_NOPENCL
@subsubsection @code{STARPU_NOPENCL} -- Number of OpenCL workers

OpenCL equivalent of the @code{STARPU_NCUDA} environment variable.

@node STARPU_NGORDON
@subsubsection @code{STARPU_NGORDON} -- Number of SPU workers (Cell)

Specify the number of SPUs that StarPU can use.

@node STARPU_WORKERS_NOBIND
@subsubsection @code{STARPU_WORKERS_NOBIND} -- Do not bind workers to specific CPUs

Setting it to non-zero will prevent StarPU from binding its threads to
CPUs. This is for instance useful when running the testsuite in parallel.

@node STARPU_WORKERS_CPUID
@subsubsection @code{STARPU_WORKERS_CPUID} -- Bind workers to specific CPUs

Passing an array of integers (starting from 0) in @code{STARPU_WORKERS_CPUID}
specifies on which logical CPU the different workers should be
bound. For instance, if @code{STARPU_WORKERS_CPUID = "0 1 4 5"}, the first
worker will be bound to logical CPU #0, the second CPU worker will be bound to
logical CPU #1, and so on. Note that the logical ordering of the CPUs is either
determined by the OS, or provided by the @code{hwloc} library in case it is
available.

Note that the first workers correspond to the CUDA workers, then come the
OpenCL and the SPU workers, and finally the CPU workers. For example, if
we have @code{STARPU_NCUDA=1}, @code{STARPU_NOPENCL=1}, @code{STARPU_NCPU=2}
and @code{STARPU_WORKERS_CPUID = "0 2 1 3"}, the CUDA device will be controlled
by logical CPU #0, the OpenCL device will be controlled by logical CPU #2, and
the logical CPUs #1 and #3 will be used by the CPU workers.

If the number of workers is larger than the array given in
@code{STARPU_WORKERS_CPUID}, the workers are bound to the logical CPUs in a
round-robin fashion: if @code{STARPU_WORKERS_CPUID = "0 1"}, the first and the
third (resp. second and fourth) workers will be put on CPU #0 (resp. CPU #1).

This variable is ignored if the @code{use_explicit_workers_bindid} flag of the
@code{starpu_conf} structure passed to @code{starpu_init} is set.
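As a sketch, a run with one CUDA worker and two CPU workers, bound
respectively to logical CPUs #0, #2 and #3, could be requested as follows
(@file{./application} is a placeholder for your own binary):

@smallexample
% STARPU_NCUDA=1 STARPU_NCPU=2 STARPU_WORKERS_CPUID="0 2 3" ./application
@end smallexample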
@node STARPU_WORKERS_CUDAID
@subsubsection @code{STARPU_WORKERS_CUDAID} -- Select specific CUDA devices

Similarly to the @code{STARPU_WORKERS_CPUID} environment variable, it is
possible to select which CUDA devices should be used by StarPU. On a machine
equipped with 4 GPUs, setting @code{STARPU_WORKERS_CUDAID = "1 3"} and
@code{STARPU_NCUDA=2} specifies that 2 CUDA workers should be created, and that
they should use CUDA devices #1 and #3 (the logical ordering of the devices is
the one reported by CUDA).

This variable is ignored if the @code{use_explicit_workers_cuda_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.
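For instance, the situation described above (4 GPUs installed, using only
devices #1 and #3) could be requested like this (@file{./application} is a
placeholder for your own binary):

@smallexample
% STARPU_NCUDA=2 STARPU_WORKERS_CUDAID="1 3" ./application
@end smallexample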
@node STARPU_WORKERS_OPENCLID
@subsubsection @code{STARPU_WORKERS_OPENCLID} -- Select specific OpenCL devices

OpenCL equivalent of the @code{STARPU_WORKERS_CUDAID} environment variable.

This variable is ignored if the @code{use_explicit_workers_opencl_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.

@node STARPU_SINGLE_COMBINED_WORKER
@subsubsection @code{STARPU_SINGLE_COMBINED_WORKER} -- Do not use concurrent workers

If set, StarPU will create several workers which won't be able to work
concurrently. It will create combined workers whose sizes range from 1 to the
total number of CPU workers in the system.

@node STARPU_MIN_WORKERSIZE
@subsubsection @code{STARPU_MIN_WORKERSIZE} -- Minimum size of the combined workers

Let the user give a hint to StarPU about the minimum number of workers that
the combined workers should contain.

@node STARPU_MAX_WORKERSIZE
@subsubsection @code{STARPU_MAX_WORKERSIZE} -- Maximum size of the combined workers

Let the user give a hint to StarPU about the maximum number of workers that
the combined workers should contain.

@node Scheduling
@subsection Configuring the Scheduling engine

@menu
* STARPU_SCHED:: Scheduling policy
* STARPU_CALIBRATE:: Calibrate performance models
* STARPU_BUS_CALIBRATE:: Calibrate bus
* STARPU_PREFETCH:: Use data prefetch
* STARPU_SCHED_ALPHA:: Computation factor
* STARPU_SCHED_BETA:: Communication factor
@end menu

@node STARPU_SCHED
@subsubsection @code{STARPU_SCHED} -- Scheduling policy

Choose between the different scheduling policies proposed by StarPU: random,
work stealing, greedy, with performance models, etc.

Use @code{STARPU_SCHED=help} to get the list of available schedulers.
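For example, one could first list the available policies and then pick one,
such as @code{dmda} (@file{./application} is a placeholder for your own
binary):

@smallexample
% STARPU_SCHED=help ./application
% STARPU_SCHED=dmda ./application
@end smallexample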
@node STARPU_CALIBRATE
@subsubsection @code{STARPU_CALIBRATE} -- Calibrate performance models

If this variable is set to 1, the performance models are calibrated during
the execution. If it is set to 2, the previous values are dropped to restart
calibration from scratch. Setting this variable to 0 disables calibration,
which is the default behaviour.

Note: this currently only applies to the @code{dm}, @code{dmda} and @code{heft} scheduling policies.
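For example, to drop the previous measurements and recalibrate from scratch
while using the @code{dmda} policy (@file{./application} is a placeholder
for your own binary):

@smallexample
% STARPU_CALIBRATE=2 STARPU_SCHED=dmda ./application
@end smallexample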
@node STARPU_BUS_CALIBRATE
@subsubsection @code{STARPU_BUS_CALIBRATE} -- Calibrate bus

If this variable is set to 1, the bus is recalibrated during initialization.

@node STARPU_PREFETCH
@subsubsection @code{STARPU_PREFETCH} -- Use data prefetch

This variable indicates whether data prefetching should be enabled (0 means
that it is disabled). If prefetching is enabled, when a task is scheduled to be
executed, e.g. on a GPU, StarPU will request an asynchronous transfer in
advance, so that data is already present on the GPU when the task starts. As a
result, computation and data transfers are overlapped.
Note that prefetching is enabled by default in StarPU.

@node STARPU_SCHED_ALPHA
@subsubsection @code{STARPU_SCHED_ALPHA} -- Computation factor

To estimate the cost of a task, StarPU takes into account the estimated
computation time (obtained thanks to performance models). The alpha factor is
the coefficient to be applied to it before adding it to the communication part.

@node STARPU_SCHED_BETA
@subsubsection @code{STARPU_SCHED_BETA} -- Communication factor

To estimate the cost of a task, StarPU takes into account the estimated
data transfer time (obtained thanks to performance models). The beta factor is
the coefficient to be applied to it before adding it to the computation part.

@node Misc
@subsection Miscellaneous and debug

@menu
* STARPU_SILENT:: Disable verbose mode
* STARPU_LOGFILENAME:: Select debug file name
* STARPU_FXT_PREFIX:: FxT trace location
* STARPU_LIMIT_GPU_MEM:: Restrict memory size on the GPUs
* STARPU_GENERATE_TRACE:: Generate a Paje trace when StarPU is shut down
@end menu

@node STARPU_SILENT
@subsubsection @code{STARPU_SILENT} -- Disable verbose mode

This variable makes it possible to disable verbose mode at runtime when
StarPU has been configured with the option @code{--enable-verbose}.

@node STARPU_LOGFILENAME
@subsubsection @code{STARPU_LOGFILENAME} -- Select debug file name

This variable specifies in which file the debugging output should be saved.

@node STARPU_FXT_PREFIX
@subsubsection @code{STARPU_FXT_PREFIX} -- FxT trace location

This variable specifies in which directory to save the trace generated if
FxT is enabled. It needs to have a trailing @code{/} character.

@node STARPU_LIMIT_GPU_MEM
@subsubsection @code{STARPU_LIMIT_GPU_MEM} -- Restrict memory size on the GPUs

This variable specifies the maximum number of megabytes that should be
available to the application on each GPU. In case this value is smaller than
the size of the memory of a GPU, StarPU pre-allocates a buffer to waste memory
on the device. This variable is intended to be used for experimental purposes
as it emulates devices that have a limited amount of memory.
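For instance, to emulate GPUs with only 1024@tie{}MB of usable memory
(@file{./application} is a placeholder for your own binary):

@smallexample
% STARPU_LIMIT_GPU_MEM=1024 ./application
@end smallexample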
@node STARPU_GENERATE_TRACE
@subsubsection @code{STARPU_GENERATE_TRACE} -- Generate a Paje trace when StarPU is shut down

When set to 1, this variable indicates that StarPU should automatically
generate a Paje trace when @code{starpu_shutdown} is called.