  1. /*
  2. * This file is part of the StarPU Handbook.
  3. * Copyright (C) 2009--2011 Université de Bordeaux 1
  4. * Copyright (C) 2010, 2011, 2012, 2013 Centre National de la Recherche Scientifique
  5. * Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
  6. * See the file version.doxy for copying conditions.
  7. */
  8. /*! \page ExecutionConfigurationThroughEnvironmentVariables Execution Configuration Through Environment Variables
  9. The behavior of the StarPU library and tools can be tuned with
  10. the following environment variables.
  11. \section ConfiguringWorkers Configuring Workers
  12. <dl>
  13. <dt>STARPU_NCPU</dt>
  14. <dd>
  15. \anchor STARPU_NCPU
  16. \addindex __env__STARPU_NCPU
  17. Specify the number of CPU workers (thus not including workers
  18. dedicated to controlling accelerators). Note that by default, StarPU will
  19. not allocate more CPU workers than there are physical CPUs, and that
  20. some CPUs are used to control the accelerators.
  21. </dd>
  22. <dt>STARPU_NCPUS</dt>
  23. <dd>
  24. \anchor STARPU_NCPUS
  25. \addindex __env__STARPU_NCPUS
  26. This variable is deprecated. You should use \ref STARPU_NCPU.
  27. </dd>
  28. <dt>STARPU_NCUDA</dt>
  29. <dd>
  30. \anchor STARPU_NCUDA
  31. \addindex __env__STARPU_NCUDA
  32. Specify the number of CUDA devices that StarPU can use. If
  33. \ref STARPU_NCUDA is lower than the number of physical devices, it is
  34. possible to select which CUDA devices should be used by means of the
  35. environment variable \ref STARPU_WORKERS_CUDAID. By default, StarPU will
  36. create as many CUDA workers as there are CUDA devices.
  37. </dd>
  38. <dt>STARPU_NOPENCL</dt>
  39. <dd>
  40. \anchor STARPU_NOPENCL
  41. \addindex __env__STARPU_NOPENCL
  42. OpenCL equivalent of the environment variable \ref STARPU_NCUDA.
  43. </dd>
  44. <dt>STARPU_OPENCL_ON_CPUS</dt>
  45. <dd>
  46. \anchor STARPU_OPENCL_ON_CPUS
  47. \addindex __env__STARPU_OPENCL_ON_CPUS
  48. By default, the OpenCL driver only enables GPU and accelerator
  49. devices. By setting the environment variable \ref
  50. STARPU_OPENCL_ON_CPUS to 1, the OpenCL driver will also enable CPU
  51. devices.
  52. </dd>
  53. <dt>STARPU_OPENCL_ONLY_ON_CPUS</dt>
  54. <dd>
  55. \anchor STARPU_OPENCL_ONLY_ON_CPUS
  56. \addindex __env__STARPU_OPENCL_ONLY_ON_CPUS
  57. By default, the OpenCL driver enables GPU and accelerator
  58. devices. By setting the environment variable \ref
  59. STARPU_OPENCL_ONLY_ON_CPUS to 1, the OpenCL driver will ONLY enable
  60. CPU devices.
  61. </dd>
  62. <dt>STARPU_WORKERS_NOBIND</dt>
  63. <dd>
  64. \anchor STARPU_WORKERS_NOBIND
  65. \addindex __env__STARPU_WORKERS_NOBIND
  66. Setting it to a non-zero value will prevent StarPU from binding its threads to
  67. CPUs. This is for instance useful when running the test suite in parallel.
  68. </dd>
  69. <dt>STARPU_WORKERS_CPUID</dt>
  70. <dd>
  71. \anchor STARPU_WORKERS_CPUID
  72. \addindex __env__STARPU_WORKERS_CPUID
  73. Passing an array of integers (starting from 0) in \ref STARPU_WORKERS_CPUID
  74. specifies to which logical CPUs the different workers should be
  75. bound. For instance, if <c>STARPU_WORKERS_CPUID = "0 1 4 5"</c>, the first
  76. worker will be bound to logical CPU #0, the second worker will be bound to
  77. logical CPU #1 and so on. Note that the logical ordering of the CPUs is either
  78. determined by the OS, or provided by the library <c>hwloc</c> in case it is
  79. available.
  80. Note that the first workers correspond to the CUDA workers, then come the
  81. OpenCL workers, and finally the CPU workers. For example if
  82. we have <c>STARPU_NCUDA=1</c>, <c>STARPU_NOPENCL=1</c>, <c>STARPU_NCPU=2</c>
  83. and <c>STARPU_WORKERS_CPUID = "0 2 1 3"</c>, the CUDA device will be controlled
  84. by logical CPU #0, the OpenCL device will be controlled by logical CPU #2, and
  85. the logical CPUs #1 and #3 will be used by the CPU workers.
  86. If the number of workers is larger than the array given in \ref
  87. STARPU_WORKERS_CPUID, the workers are bound to the logical CPUs in a
  88. round-robin fashion: if <c>STARPU_WORKERS_CPUID = "0 1"</c>, the first
  89. and the third (resp. second and fourth) workers will be put on CPU #0
  90. (resp. CPU #1).
  91. This variable is ignored if the field
  92. starpu_conf::use_explicit_workers_bindid passed to starpu_init() is
  93. set.
  94. </dd>
  95. <dt>STARPU_WORKERS_CUDAID</dt>
  96. <dd>
  97. \anchor STARPU_WORKERS_CUDAID
  98. \addindex __env__STARPU_WORKERS_CUDAID
  99. Similarly to the \ref STARPU_WORKERS_CPUID environment variable, it is
  100. possible to select which CUDA devices should be used by StarPU. On a machine
  101. equipped with 4 GPUs, setting <c>STARPU_WORKERS_CUDAID = "1 3"</c> and
  102. <c>STARPU_NCUDA=2</c> specifies that 2 CUDA workers should be created, and that
  103. they should use CUDA devices #1 and #3 (the logical ordering of the devices is
  104. the one reported by CUDA).
  105. This variable is ignored if the field
  106. starpu_conf::use_explicit_workers_cuda_gpuid passed to starpu_init()
  107. is set.
  108. </dd>
  109. <dt>STARPU_WORKERS_OPENCLID</dt>
  110. <dd>
  111. \anchor STARPU_WORKERS_OPENCLID
  112. \addindex __env__STARPU_WORKERS_OPENCLID
  113. OpenCL equivalent of the \ref STARPU_WORKERS_CUDAID environment variable.
  114. This variable is ignored if the field
  115. starpu_conf::use_explicit_workers_opencl_gpuid passed to starpu_init()
  116. is set.
  117. </dd>
  118. <dt>STARPU_SINGLE_COMBINED_WORKER</dt>
  119. <dd>
  120. \anchor STARPU_SINGLE_COMBINED_WORKER
  121. \addindex __env__STARPU_SINGLE_COMBINED_WORKER
  122. If set, StarPU will create several workers which won't be able to work
  123. concurrently. It will by default create combined workers whose size ranges from 1
  124. to the total number of CPU workers in the system. \ref STARPU_MIN_WORKERSIZE
  125. and \ref STARPU_MAX_WORKERSIZE can be used to change this default.
  126. </dd>
  127. <dt>STARPU_MIN_WORKERSIZE</dt>
  128. <dd>
  129. \anchor STARPU_MIN_WORKERSIZE
  130. \addindex __env__STARPU_MIN_WORKERSIZE
  131. \ref STARPU_MIN_WORKERSIZE
  132. specifies the minimum size of the combined workers (instead of the default 2).
  133. </dd>
  134. <dt>STARPU_MAX_WORKERSIZE</dt>
  135. <dd>
  136. \anchor STARPU_MAX_WORKERSIZE
  137. \addindex __env__STARPU_MAX_WORKERSIZE
  138. \ref STARPU_MAX_WORKERSIZE
  139. specifies the maximum size of the combined workers (instead of the
  140. number of CPU workers in the system).
  141. </dd>
  142. <dt>STARPU_SYNTHESIZE_ARITY_COMBINED_WORKER</dt>
  143. <dd>
  144. \anchor STARPU_SYNTHESIZE_ARITY_COMBINED_WORKER
  145. \addindex __env__STARPU_SYNTHESIZE_ARITY_COMBINED_WORKER
  146. Let the user decide how many elements are allowed between combined workers
  147. created from hwloc information. For instance, in the case of sockets with 6
  148. cores without shared L2 caches, if \ref STARPU_SYNTHESIZE_ARITY_COMBINED_WORKER is
  149. set to 6, no combined worker will be synthesized beyond one for the socket
  150. and one per core. If it is set to 3, 3 intermediate combined workers will be
  151. synthesized, to divide the socket cores into 3 chunks of 2 cores. If it is set to
  152. 2, 2 intermediate combined workers will be synthesized, to divide the socket
  153. cores into 2 chunks of 3 cores, and then 3 additional combined workers will be
  154. synthesized, to divide the former synthesized workers into a bunch of 2 cores,
  155. and the remaining core (for which no combined worker is synthesized since there
  156. is already a normal worker for it).
  157. The default, 2, thus makes StarPU tend to build a binary tree of combined
  158. workers.
  159. </dd>
  160. <dt>STARPU_DISABLE_ASYNCHRONOUS_COPY</dt>
  161. <dd>
  162. \anchor STARPU_DISABLE_ASYNCHRONOUS_COPY
  163. \addindex __env__STARPU_DISABLE_ASYNCHRONOUS_COPY
  164. Disable asynchronous copies between CPU and GPU devices.
  165. The AMD implementation of OpenCL is known to
  166. fail when copying data asynchronously. When using this implementation,
  167. it is therefore necessary to disable asynchronous data transfers.
  168. </dd>
  169. <dt>STARPU_DISABLE_ASYNCHRONOUS_CUDA_COPY</dt>
  170. <dd>
  171. \anchor STARPU_DISABLE_ASYNCHRONOUS_CUDA_COPY
  172. \addindex __env__STARPU_DISABLE_ASYNCHRONOUS_CUDA_COPY
  173. Disable asynchronous copies between CPU and CUDA devices.
  174. </dd>
  175. <dt>STARPU_DISABLE_ASYNCHRONOUS_OPENCL_COPY</dt>
  176. <dd>
  177. \anchor STARPU_DISABLE_ASYNCHRONOUS_OPENCL_COPY
  178. \addindex __env__STARPU_DISABLE_ASYNCHRONOUS_OPENCL_COPY
  179. Disable asynchronous copies between CPU and OpenCL devices.
  180. The AMD implementation of OpenCL is known to
  181. fail when copying data asynchronously. When using this implementation,
  182. it is therefore necessary to disable asynchronous data transfers.
  183. </dd>
  184. <dt>STARPU_DISABLE_ASYNCHRONOUS_MIC_COPY</dt>
  185. <dd>
  186. \anchor STARPU_DISABLE_ASYNCHRONOUS_MIC_COPY
  187. \addindex __env__STARPU_DISABLE_ASYNCHRONOUS_MIC_COPY
  188. Disable asynchronous copies between CPU and MIC devices.
  189. </dd>
  190. <dt>STARPU_ENABLE_CUDA_GPU_GPU_DIRECT</dt>
  191. <dd>
  192. \anchor STARPU_ENABLE_CUDA_GPU_GPU_DIRECT
  193. \addindex __env__STARPU_ENABLE_CUDA_GPU_GPU_DIRECT
  194. Enable direct CUDA transfers from GPU to GPU, without copying through RAM.
  195. This makes it possible to test the performance effect of GPU-Direct.
  196. </dd>
  197. </dl>
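
The round-robin rule described above for \ref STARPU_WORKERS_CPUID can be
sketched in C as follows. This only illustrates the documented behaviour and
is not StarPU's actual implementation: the function <c>bound_cpu()</c> and
the 64-entry capacity are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper, not part of the StarPU API: returns the logical CPU
// to which worker number `worker` is bound, given a space-separated list
// such as "0 2 1 3". Returns -1 if the list holds no valid CPU number.
int bound_cpu(const char *cpuid_list, unsigned worker)
{
    int ids[64];              // arbitrary capacity chosen for this sketch
    unsigned n = 0;
    const char *p = cpuid_list;

    assert(cpuid_list != NULL);
    while (n < 64) {
        char *end;
        long v = strtol(p, &end, 10);   // skips leading blanks
        if (end == p)                   // no further integer in the list
            break;
        if (v < 0)
            return -1;
        ids[n++] = (int) v;
        p = end;
    }
    if (n == 0)
        return -1;
    // Workers beyond the end of the list wrap around (round-robin).
    return ids[worker % n];
}
```

With the list <c>"0 2 1 3"</c>, workers 0 to 3 (one CUDA worker, one OpenCL
worker, then two CPU workers) map to logical CPUs 0, 2, 1 and 3. With
<c>"0 1"</c>, workers 0 and 2 map to CPU 0, and workers 1 and 3 to CPU 1.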
  198. \section ConfiguringTheSchedulingEngine Configuring The Scheduling Engine
  199. <dl>
  200. <dt>STARPU_SCHED</dt>
  201. <dd>
  202. \anchor STARPU_SCHED
  203. \addindex __env__STARPU_SCHED
  204. Choose between the different scheduling policies proposed by StarPU:
  205. random, work stealing, greedy, with performance models, etc.
  206. Use <c>STARPU_SCHED=help</c> to get the list of available schedulers.
  207. </dd>
  208. <dt>STARPU_CALIBRATE</dt>
  209. <dd>
  210. \anchor STARPU_CALIBRATE
  211. \addindex __env__STARPU_CALIBRATE
  212. If this variable is set to 1, the performance models are calibrated during
  213. the execution. If it is set to 2, the previous values are dropped to restart
  214. calibration from scratch. Setting this variable to 0 disables calibration; this
  215. is the default behaviour.
  216. Note: this currently only applies to <c>dm</c> and <c>dmda</c> scheduling policies.
  217. </dd>
  218. <dt>STARPU_BUS_CALIBRATE</dt>
  219. <dd>
  220. \anchor STARPU_BUS_CALIBRATE
  221. \addindex __env__STARPU_BUS_CALIBRATE
  222. If this variable is set to 1, the bus is recalibrated during initialization.
  223. </dd>
  224. <dt>STARPU_PREFETCH</dt>
  225. <dd>
  226. \anchor STARPU_PREFETCH
  227. \addindex __env__STARPU_PREFETCH
  228. This variable indicates whether data prefetching should be enabled (0 means
  229. that it is disabled). If prefetching is enabled, when a task is scheduled to be
  230. executed e.g. on a GPU, StarPU will request an asynchronous transfer in
  231. advance, so that data is already present on the GPU when the task starts. As a
  232. result, computation and data transfers are overlapped.
  233. Note that prefetching is enabled by default in StarPU.
  234. </dd>
  235. <dt>STARPU_SCHED_ALPHA</dt>
  236. <dd>
  237. \anchor STARPU_SCHED_ALPHA
  238. \addindex __env__STARPU_SCHED_ALPHA
  239. To estimate the cost of a task, StarPU takes into account the estimated
  240. computation time (obtained thanks to performance models). The alpha factor is
  241. the coefficient to be applied to it before adding it to the communication part.
  242. </dd>
  243. <dt>STARPU_SCHED_BETA</dt>
  244. <dd>
  245. \anchor STARPU_SCHED_BETA
  246. \addindex __env__STARPU_SCHED_BETA
  247. To estimate the cost of a task, StarPU takes into account the estimated
  248. data transfer time (obtained thanks to performance models). The beta factor is
  249. the coefficient to be applied to it before adding it to the computation part.
  250. </dd>
  251. <dt>STARPU_SCHED_GAMMA</dt>
  252. <dd>
  253. \anchor STARPU_SCHED_GAMMA
  254. \addindex __env__STARPU_SCHED_GAMMA
  255. Define the execution time penalty of a joule (\ref Power-basedScheduling).
  256. </dd>
  257. <dt>STARPU_IDLE_POWER</dt>
  258. <dd>
  259. \anchor STARPU_IDLE_POWER
  260. \addindex __env__STARPU_IDLE_POWER
  261. Define the idle power of the machine (\ref Power-basedScheduling).
  262. </dd>
  263. <dt>STARPU_PROFILING</dt>
  264. <dd>
  265. \anchor STARPU_PROFILING
  266. \addindex __env__STARPU_PROFILING
  267. Enable on-line performance monitoring (\ref EnablingOn-linePerformanceMonitoring).
  268. </dd>
  269. </dl>
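
The way \ref STARPU_SCHED_ALPHA, \ref STARPU_SCHED_BETA and
\ref STARPU_SCHED_GAMMA weight the parts of a task cost can be sketched as a
plain weighted sum. This is a reading of the descriptions above, not the
exact formula used by the StarPU schedulers. The function names and the
fallback values used when a variable is unset are assumptions made for the
example.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper: read a factor from the environment, falling back to
// `fallback` when the variable is unset or does not start with a number.
double env_factor(const char *name, double fallback)
{
    const char *s;
    char *end;
    double v;

    assert(name != NULL);
    s = getenv(name);
    if (s == NULL)
        return fallback;
    v = strtod(s, &end);
    return (end == s) ? fallback : v;
}

// Weighted sum suggested by the descriptions above: alpha scales the
// computation time, beta the transfer time, and gamma converts joules
// into an execution time penalty.
double weighted_cost(double alpha, double beta, double gamma,
                     double compute_time, double transfer_time, double joules)
{
    return alpha * compute_time + beta * transfer_time + gamma * joules;
}

// Cost of a task using the factors found in the environment. The fallback
// values below are placeholders for this sketch, not StarPU's defaults.
double task_cost(double compute_time, double transfer_time, double joules)
{
    return weighted_cost(env_factor("STARPU_SCHED_ALPHA", 1.0),
                         env_factor("STARPU_SCHED_BETA", 1.0),
                         env_factor("STARPU_SCHED_GAMMA", 0.0),
                         compute_time, transfer_time, joules);
}
```

Raising beta relative to alpha makes data transfers weigh more in the
estimate, which favours placing a task where its data already resides.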
  270. \section Extensions Extensions
  271. <dl>
  272. <dt>SOCL_OCL_LIB_OPENCL</dt>
  273. <dd>
  274. \anchor SOCL_OCL_LIB_OPENCL
  275. \addindex __env__SOCL_OCL_LIB_OPENCL
  276. The SOCL test suite is only run when the environment variable \ref
  277. SOCL_OCL_LIB_OPENCL is defined. It should contain the location
  278. of the file <c>libOpenCL.so</c> of the OCL ICD implementation.
  279. </dd>
  280. <dt>OCL_ICD_VENDORS</dt>
  281. <dd>
  282. \anchor OCL_ICD_VENDORS
  283. \addindex __env__OCL_ICD_VENDORS
  284. When using SOCL with OpenCL ICD
  285. (https://forge.imag.fr/projects/ocl-icd/), this variable may be used
  286. to point to the directory where ICD files are installed. The default
  287. directory is <c>/etc/OpenCL/vendors</c>. StarPU installs ICD
  288. files in the directory <c>$prefix/share/starpu/opencl/vendors</c>.
  289. </dd>
  290. <dt>STARPU_COMM_STATS</dt>
  291. <dd>
  292. \anchor STARPU_COMM_STATS
  293. \addindex __env__STARPU_COMM_STATS
  294. Communication statistics for starpumpi (\ref MPISupport)
  295. will be enabled when the environment variable \ref STARPU_COMM_STATS
  296. is set to a value other than 0.
  297. </dd>
  298. <dt>STARPU_MPI_CACHE</dt>
  299. <dd>
  300. \anchor STARPU_MPI_CACHE
  301. \addindex __env__STARPU_MPI_CACHE
  302. Communication cache for starpumpi (\ref MPISupport) will be
  303. disabled when the environment variable \ref STARPU_MPI_CACHE is set
  304. to 0. It is enabled by default or for any other value of the variable
  305. \ref STARPU_MPI_CACHE.
  306. </dd>
  307. </dl>
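
\ref STARPU_COMM_STATS and \ref STARPU_MPI_CACHE are both on/off switches,
but with opposite defaults: the former is off unless set to a value other
than 0, the latter is on unless set to 0. A minimal sketch of that rule is
given below. The function names are hypothetical, and treating a
non-numeric value as "enabled" is an assumption of the example, not a
documented StarPU behaviour.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper: interpret the value of an on/off variable.
// `value` is what getenv() returned (NULL when the variable is unset).
int flag_from_string(const char *value, int enabled_by_default)
{
    char *end;
    long v;

    if (value == NULL)
        return enabled_by_default;
    v = strtol(value, &end, 10);
    // Only an explicit numeric 0 disables the feature.
    return !(end != value && v == 0);
}

// Hypothetical helper: look the variable up in the environment.
int env_flag(const char *name, int enabled_by_default)
{
    assert(name != NULL);
    return flag_from_string(getenv(name), enabled_by_default);
}

// Communication statistics: disabled unless the variable says otherwise.
int comm_stats_enabled(void)
{
    return env_flag("STARPU_COMM_STATS", 0);
}

// Communication cache: enabled unless the variable is set to 0.
int mpi_cache_enabled(void)
{
    return env_flag("STARPU_MPI_CACHE", 1);
}
```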
  308. \section MiscellaneousAndDebug Miscellaneous And Debug
  309. <dl>
  310. <dt>STARPU_HOME</dt>
  311. <dd>
  312. \anchor STARPU_HOME
  313. \addindex __env__STARPU_HOME
  314. This specifies the main directory in which StarPU stores its
  315. configuration files. The default is <c>$HOME</c> on Unix environments,
  316. and <c>$USERPROFILE</c> on Windows environments.
  317. </dd>
  318. <dt>STARPU_HOSTNAME</dt>
  319. <dd>
  320. \anchor STARPU_HOSTNAME
  321. \addindex __env__STARPU_HOSTNAME
  322. When set, force the hostname to be used when dealing with performance model
  323. files. Models are indexed by machine name. When running for example on
  324. a homogeneous cluster, it is possible to share the models between
  325. machines by setting <c>export STARPU_HOSTNAME=some_global_name</c>.
  326. </dd>
  327. <dt>STARPU_OPENCL_PROGRAM_DIR</dt>
  328. <dd>
  329. \anchor STARPU_OPENCL_PROGRAM_DIR
  330. \addindex __env__STARPU_OPENCL_PROGRAM_DIR
  331. This specifies the directory where the OpenCL codelet source files are
  332. located. The function starpu_opencl_load_program_source() looks
  333. for the codelet in the current directory, in the directory specified
  334. by the environment variable \ref STARPU_OPENCL_PROGRAM_DIR, in the
  335. directory <c>share/starpu/opencl</c> of the installation directory of
  336. StarPU, and finally in the source directory of StarPU.
  337. </dd>
  338. <dt>STARPU_SILENT</dt>
  339. <dd>
  340. \anchor STARPU_SILENT
  341. \addindex __env__STARPU_SILENT
  342. This variable makes it possible to disable verbose mode at runtime when StarPU
  343. has been configured with the option \ref enable-verbose "--enable-verbose". It also
  344. disables the display of StarPU information and warning messages.
  345. </dd>
  346. <dt>STARPU_LOGFILENAME</dt>
  347. <dd>
  348. \anchor STARPU_LOGFILENAME
  349. \addindex __env__STARPU_LOGFILENAME
  350. This variable specifies the file in which the debugging output should be saved.
  351. </dd>
  352. <dt>STARPU_FXT_PREFIX</dt>
  353. <dd>
  354. \anchor STARPU_FXT_PREFIX
  355. \addindex __env__STARPU_FXT_PREFIX
  356. This variable specifies in which directory to save the trace generated if FxT is enabled. It needs to have a trailing '/' character.
  357. </dd>
  358. <dt>STARPU_LIMIT_CUDA_devid_MEM</dt>
  359. <dd>
  360. \anchor STARPU_LIMIT_CUDA_devid_MEM
  361. \addindex __env__STARPU_LIMIT_CUDA_devid_MEM
  362. This variable specifies the maximum number of megabytes that should be
  363. available to the application on the CUDA device with the identifier
  364. <c>devid</c>. This variable is intended to be used for experimental
  365. purposes as it emulates devices that have a limited amount of memory.
  366. When defined, the variable overrides the value of the variable
  367. \ref STARPU_LIMIT_CUDA_MEM.
  368. </dd>
  369. <dt>STARPU_LIMIT_CUDA_MEM</dt>
  370. <dd>
  371. \anchor STARPU_LIMIT_CUDA_MEM
  372. \addindex __env__STARPU_LIMIT_CUDA_MEM
  373. This variable specifies the maximum number of megabytes that should be
  374. available to the application on each CUDA device. This variable is
  375. intended to be used for experimental purposes as it emulates devices
  376. that have a limited amount of memory.
  377. </dd>
  378. <dt>STARPU_LIMIT_OPENCL_devid_MEM</dt>
  379. <dd>
  380. \anchor STARPU_LIMIT_OPENCL_devid_MEM
  381. \addindex __env__STARPU_LIMIT_OPENCL_devid_MEM
  382. This variable specifies the maximum number of megabytes that should be
  383. available to the application on the OpenCL device with the identifier
  384. <c>devid</c>. This variable is intended to be used for experimental
  385. purposes as it emulates devices that have a limited amount of memory.
  386. When defined, the variable overrides the value of the variable
  387. \ref STARPU_LIMIT_OPENCL_MEM.
  388. </dd>
  389. <dt>STARPU_LIMIT_OPENCL_MEM</dt>
  390. <dd>
  391. \anchor STARPU_LIMIT_OPENCL_MEM
  392. \addindex __env__STARPU_LIMIT_OPENCL_MEM
  393. This variable specifies the maximum number of megabytes that should be
  394. available to the application on each OpenCL device. This variable is
  395. intended to be used for experimental purposes as it emulates devices
  396. that have a limited amount of memory.
  397. </dd>
  398. <dt>STARPU_LIMIT_CPU_MEM</dt>
  399. <dd>
  400. \anchor STARPU_LIMIT_CPU_MEM
  401. \addindex __env__STARPU_LIMIT_CPU_MEM
  402. This variable specifies the maximum number of megabytes that should be
  403. available to the application on each CPU device. This variable is
  404. intended to be used for experimental purposes as it emulates devices
  405. that have a limited amount of memory.
  406. </dd>
  407. <dt>STARPU_GENERATE_TRACE</dt>
  408. <dd>
  409. \anchor STARPU_GENERATE_TRACE
  410. \addindex __env__STARPU_GENERATE_TRACE
  411. When set to <c>1</c>, this variable indicates that StarPU should automatically
  412. generate a Paje trace when starpu_shutdown() is called.
  413. </dd>
  414. <dt>STARPU_MEMORY_STATS</dt>
  415. <dd>
  416. \anchor STARPU_MEMORY_STATS
  417. \addindex __env__STARPU_MEMORY_STATS
  418. When set to 0, disable the display of memory statistics on data which
  419. have not been unregistered at the end of the execution (\ref MemoryFeedback).
  420. </dd>
  421. <dt>STARPU_BUS_STATS</dt>
  422. <dd>
  423. \anchor STARPU_BUS_STATS
  424. \addindex __env__STARPU_BUS_STATS
  425. When defined, statistics about data transfers will be displayed when calling
  426. starpu_shutdown() (\ref Profiling).
  427. </dd>
  428. <dt>STARPU_WORKER_STATS</dt>
  429. <dd>
  430. \anchor STARPU_WORKER_STATS
  431. \addindex __env__STARPU_WORKER_STATS
  432. When defined, statistics about the workers will be displayed when calling
  433. starpu_shutdown() (\ref Profiling). When combined with the
  434. environment variable \ref STARPU_PROFILING, it displays the power
  435. consumption (\ref Power-basedScheduling).
  436. </dd>
  437. <dt>STARPU_STATS</dt>
  438. <dd>
  439. \anchor STARPU_STATS
  440. \addindex __env__STARPU_STATS
  441. When set to 0, data statistics will not be displayed at the
  442. end of the execution of an application (\ref DataStatistics).
  443. </dd>
  444. </dl>
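
The precedence between \ref STARPU_LIMIT_CUDA_devid_MEM and
\ref STARPU_LIMIT_CUDA_MEM can be sketched as a two-step lookup, where
<c>devid</c> in the variable name is replaced by the device number. The
function names and the use of -1 to mean "no limit" are assumptions made
for the example, which does not reproduce StarPU's own code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: write the name of the per-device variable for CUDA
// device `devid` into `buf`, for instance "STARPU_LIMIT_CUDA_1_MEM".
// Returns the value of snprintf(), i.e. the length of the full name.
int cuda_limit_var_name(char *buf, size_t len, unsigned devid)
{
    assert(buf != NULL);
    return snprintf(buf, len, "STARPU_LIMIT_CUDA_%u_MEM", devid);
}

// Returns the limit in megabytes for CUDA device `devid`, or -1 when
// neither variable is set. The per-device variable takes precedence.
long cuda_mem_limit_mb(unsigned devid)
{
    char name[64];
    const char *s;

    cuda_limit_var_name(name, sizeof(name), devid);
    s = getenv(name);
    if (s == NULL)
        s = getenv("STARPU_LIMIT_CUDA_MEM");   // fall back to the global limit
    return (s != NULL) ? strtol(s, NULL, 10) : -1;
}
```

For example, with <c>STARPU_LIMIT_CUDA_MEM=1024</c> and
<c>STARPU_LIMIT_CUDA_1_MEM=512</c>, this lookup yields 512 for device 1 and
1024 for every other CUDA device.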
  445. */