/*
 * This file is part of the StarPU Handbook.
 * Copyright (C) 2009--2011  Université de Bordeaux 1
 * Copyright (C) 2010, 2011, 2012, 2013  Centre National de la Recherche Scientifique
 * Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
 * See the file version.doxy for copying conditions.
 */

/*! \page TipsAndTricksToKnowAbout Tips and Tricks To Know About

\section HowToInitializeAComputationLibraryOnceForEachWorker How To Initialize A Computation Library Once For Each Worker?

Some libraries need to be initialized once for each concurrent instance that
may run on the machine. A typical case is a C++ computation class which is not
thread-safe by itself, but for which several instantiated objects of that class
can be used concurrently. Such a library can be used in StarPU by initializing
one object per worker. For instance, the libstarpufft example does the
following to be able to use FFTW.

A global array stores the instantiated objects:

\code{.c}
fftw_plan plan_cpu[STARPU_NMAXWORKERS];
\endcode

At initialization time of libstarpu, one object is created for each CPU worker:

\code{.c}
int workerid;
for (workerid = 0; workerid < starpu_worker_get_count(); workerid++) {
	switch (starpu_worker_get_type(workerid)) {
	case STARPU_CPU_WORKER:
		/* elided: create the FFTW plan owned by this worker */
		plan_cpu[workerid] = fftw_plan(...);
		break;
	default:
		/* other worker types are not handled in this excerpt */
		break;
	}
}
\endcode

In the codelet body, each worker then uses its own object:

\code{.c}
static void fft(void *descr[], void *_args)
{
	/* Pick the plan that belongs to the worker running this task. */
	int workerid = starpu_worker_get_id();
	fftw_plan plan = plan_cpu[workerid];
	...
	fftw_execute(plan, ...);
}
\endcode

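The three fragments above can be combined into a self-contained sketch. It is
not the libstarpufft code: to stay runnable without StarPU or FFTW, it assumes
a hypothetical non-thread-safe <c>struct scratch</c> object in place of the
FFTW plan, and it receives the worker identifier as a plain argument instead
of calling starpu_worker_get_id(). <c>MAX_WORKERS</c>, <c>SCRATCH_LEN</c>,
<c>scratch_init_all()</c> and <c>scratch_sum()</c> are names invented for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for STARPU_NMAXWORKERS. */
#define MAX_WORKERS 8
#define SCRATCH_LEN 16

/* Hypothetical non-thread-safe object: each instance owns a scratch buffer,
 * so two workers must never share the same instance. */
struct scratch {
	int initialized;
	double buf[SCRATCH_LEN];
};

/* One instance per worker, indexed by worker identifier. */
static struct scratch scratch_per_worker[MAX_WORKERS];

/* Done once at initialization time: set up one object for each worker. */
static void scratch_init_all(int nworkers)
{
	int workerid;
	assert(nworkers <= MAX_WORKERS);
	for (workerid = 0; workerid < nworkers; workerid++)
		scratch_per_worker[workerid].initialized = 1;
}

/* Done in the task body: only touch the object owned by the calling worker.
 * In a real codelet, workerid would come from starpu_worker_get_id(). */
static double scratch_sum(int workerid, const double *in, size_t n)
{
	struct scratch *s = &scratch_per_worker[workerid];
	double sum = 0.0;
	size_t i;
	assert(s->initialized && n <= SCRATCH_LEN);
	for (i = 0; i < n; i++) {
		s->buf[i] = in[i];	/* private scratch space, no locking needed */
		sum += s->buf[i];
	}
	return sum;
}
```

Since every worker only indexes the array with its own identifier, the
objects never need a lock, which is the property the per-worker array is
meant to provide.
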
Another approach, which is sometimes required, is to run the initialization
code from the workers themselves with starpu_execute_on_each_worker(). CUDA
may need this to behave properly because of threading issues. For instance,
StarPU's starpu_cublas_init() looks like the following, so that
<c>cublasInit</c> is called from the workers themselves:

\code{.c}
/* Runs on each CUDA worker, from the worker's own thread. */
static void init_cublas_func(void *args STARPU_ATTRIBUTE_UNUSED)
{
	cublasStatus cublasst = cublasInit();
	cublasSetKernelStream(starpu_cuda_get_local_stream());
}

void starpu_cublas_init(void)
{
	/* STARPU_CUDA restricts the call to the CUDA workers. */
	starpu_execute_on_each_worker(init_cublas_func, NULL, STARPU_CUDA);
}
\endcode

\section HowToLimitMemoryPerNode How to limit memory per node

TODO

Talk about
\ref STARPU_LIMIT_CUDA_devid_MEM, \ref STARPU_LIMIT_CUDA_MEM,
\ref STARPU_LIMIT_OPENCL_devid_MEM, \ref STARPU_LIMIT_OPENCL_MEM
and \ref STARPU_LIMIT_CPU_MEM

starpu_memory_get_available()

\section ThreadBindingOnNetBSD Thread Binding on NetBSD

When using StarPU on a NetBSD machine, thread binding will fail if the
topology discovery library <c>hwloc</c> is used. To prevent the problem, use
version 1.7 or later of <c>hwloc</c>, and also run the following command:

\verbatim
$ sysctl -w security.models.extensions.user_set_cpu_affinity=1
\endverbatim

Or add the following line to the file <c>/etc/sysctl.conf</c>:

\verbatim
security.models.extensions.user_set_cpu_affinity=1
\endverbatim

\section UsingStarPUWithMKL Using StarPU With MKL 11 (Intel Composer XE 2013)

Some users had issues with MKL 11 and StarPU (versions 1.1rc1 and 1.0.5) on
Linux in the following configuration: the threaded MKL library is used with
iomp5, the environment variable MKL_NUM_THREADS is set to 1 so that MKL runs
with a single thread, and all the parallelism is handled by StarPU (no
multithreaded tasks).

With this configuration, StarPU uses only 1 core, whatever the value of
\ref STARPU_NCPU. The problem is actually a thread pinning issue with MKL.

The solution is to set the environment variable KMP_AFFINITY to <c>disabled</c>
(http://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/optaps/common/optaps_openmp_thread_affinity.htm).
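
The variables can also be set from within the application instead of the
shell. The following is a minimal sketch, assuming a POSIX system (for
<c>setenv</c>) and assuming that the helper runs before the first call into
MKL or the OpenMP runtime, since such settings are typically read when the
runtime starts. <c>configure_mkl_env()</c> is a hypothetical helper name, not
part of StarPU or MKL.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical helper: call it first thing in main(), before any MKL or
 * OpenMP call. Returns 0 on success, -1 on failure. */
static int configure_mkl_env(void)
{
	/* Let StarPU provide the parallelism: a single thread for MKL. */
	if (setenv("MKL_NUM_THREADS", "1", 1) != 0)
		return -1;
	/* Ask the Intel OpenMP runtime not to pin threads. */
	if (setenv("KMP_AFFINITY", "disabled", 1) != 0)
		return -1;
	return 0;
}
```

The last argument of <c>setenv</c> is 1 so that a value inherited from the
shell is overwritten; pass 0 instead to let the user's environment take
precedence.
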
*/