@c -*-texinfo-*-

@c This file is part of the StarPU Handbook.
@c Copyright (C) 2009--2011 Universit@'e de Bordeaux 1
@c Copyright (C) 2010, 2011, 2012, 2013 Centre National de la Recherche Scientifique
@c Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
@c See the file starpu.texi for copying conditions.

@menu
* Installing a Binary Package::
* Installing from Source::
* Setting up Your Own Code::
* Benchmarking StarPU::
@end menu

@node Installing a Binary Package
@section Installing a Binary Package
Because one of the StarPU developers is a Debian Developer, the Debian
packages are well integrated and kept up to date. To see which packages
are available, type:

@example
$ apt-cache search starpu
@end example

To install what you need, type:

@example
$ sudo apt-get install libstarpu-1.0 libstarpu-dev
@end example

@node Installing from Source
@section Installing from Source

StarPU can be built and installed by the standard means of the GNU
autotools. This section briefly recalls how these tools are used to
install StarPU.

@menu
* Optional Dependencies::
* Getting Sources::
* Configuring StarPU::
* Building StarPU::
* Installing StarPU::
@end menu

@node Optional Dependencies
@subsection Optional Dependencies

The @url{http://www.open-mpi.org/software/hwloc, @code{hwloc} topology
discovery library} is not mandatory to use StarPU, but it is strongly
recommended. It enables topology-aware scheduling, which improves
performance. @code{hwloc} is available in major free operating system
distributions, and for most operating systems.

If @code{hwloc} is not available on your system, the option
@code{--without-hwloc} must be given explicitly when calling the
@code{configure} script. If @code{hwloc} is installed with a
@code{pkg-config} file, no option is required: it is detected
automatically. Otherwise, @code{--with-hwloc=prefix} should be used to
specify the location of @code{hwloc}.
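
The three cases above can be sketched as a small shell helper that prints
the option to pass to @code{configure}. This is only an illustration:
@code{hwloc_configure_flag} and the @code{PKG_CONFIG} override variable
are hypothetical names, not part of StarPU, and the helper assumes that
the @code{hwloc} package is registered with @code{pkg-config} under the
name @code{hwloc}.

```shell
# Print the configure option to use for hwloc (sketch; hwloc_configure_flag
# is a hypothetical helper, not part of StarPU).
# $1: optional installation prefix of hwloc.
hwloc_configure_flag() {
    prefix=$1
    if ${PKG_CONFIG:-pkg-config} --exists hwloc >/dev/null 2>&1; then
        # hwloc ships a pkg-config file: configure detects it by itself.
        printf '%s\n' ""
    elif [ -n "$prefix" ]; then
        # No pkg-config file, but the caller knows where hwloc lives.
        printf '%s\n' "--with-hwloc=$prefix"
    else
        # hwloc is not available: disable it explicitly.
        printf '%s\n' "--without-hwloc"
    fi
}
# Usage: ./configure $(hwloc_configure_flag /opt/hwloc)
```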

@node Getting Sources
@subsection Getting Sources

StarPU's sources can be obtained from the
@url{http://runtime.bordeaux.inria.fr/StarPU/files/,download page} of
the StarPU website.

All releases and the development tree of StarPU are freely available
on INRIA's gforge under the LGPL license. Some releases are available
under the BSD license.

The latest release can be downloaded from
@url{http://gforge.inria.fr/frs/?group_id=1570,INRIA's gforge} or
directly from the @url{http://runtime.bordeaux.inria.fr/StarPU/files/,StarPU download page}.

The latest nightly snapshot can be downloaded from the @url{http://starpu.gforge.inria.fr/testing/,StarPU gforge website}.

@example
$ wget http://starpu.gforge.inria.fr/testing/starpu-nightly-latest.tar.gz
@end example

Finally, the current development version is also accessible via svn.
It should be used only if you need the very latest changes (i.e. less
than a day old)@footnote{The Subversion client can
be obtained from @url{http://subversion.tigris.org}. If you
are running on Windows, you will probably prefer to use
@url{http://tortoisesvn.tigris.org/, TortoiseSVN}.}.

@example
$ svn checkout svn://scm.gforge.inria.fr/svn/starpu/trunk StarPU
@end example

@node Configuring StarPU
@subsection Configuring StarPU

Running @code{autogen.sh} is not necessary when using the tarball
releases of StarPU. If you are using the source code from the svn
repository, you first need to generate the configure scripts and the
Makefiles. This requires @code{autoconf} >= 2.60, @code{automake},
and @code{makeinfo}.

@example
$ ./autogen.sh
@end example
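
Before running @code{autogen.sh}, it can help to check that the required
tools are installed. The helper below is a minimal sketch:
@code{missing_tools} is a hypothetical name, and the check only tests
that each command is found in @code{PATH}, not that its version is
recent enough.

```shell
# List which of the given tools are absent from PATH (sketch;
# missing_tools is a hypothetical helper, not part of StarPU).
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only when the tool can be found.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
# Usage, before running ./autogen.sh:
#   missing_tools autoconf automake makeinfo
```

An empty output means that all the listed tools were found.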

You then need to configure StarPU. Options that are useful to pass to
@code{./configure} are detailed in @ref{Compilation configuration}.

@example
$ ./configure
@end example

By default, the files produced during the compilation are placed in
the source directory. As the compilation generates a lot of files, it
is advisable to put them all in a separate directory. This makes
cleaning up easier and allows several configurations to be compiled
from the same source tree. To do so, enter the directory
where you want the compilation to produce its files, and invoke the
@code{configure} script located in the StarPU source directory.

@example
$ mkdir build
$ cd build
$ ../configure
@end example
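
Building several configurations from the same source tree can be
sketched as follows. @code{configure_in} and the build directory names
are hypothetical; the only StarPU-specific option used in the usage
lines is @code{--disable-cuda}.

```shell
# Configure one build directory per configuration from a single source
# tree (sketch; configure_in is a hypothetical helper).
# $1: StarPU source directory, $2: build directory, rest: configure options.
configure_in() {
    src=$(cd "$1" && pwd) || return 1   # absolute path to the sources
    build=$2
    shift 2
    mkdir -p "$build" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$build" && "$src/configure" "$@" )
}
# Usage, from the StarPU source directory:
#   configure_in . build-default
#   configure_in . build-nocuda --disable-cuda
```

Each build directory can then be built and cleaned independently by
running @code{make} inside it.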

@node Building StarPU
@subsection Building StarPU

@example
$ make
@end example

Once everything is built, you may want to test the result. An
extensive set of regression tests is provided with StarPU. The
tests are run by calling @code{make check}. These tests are also run
every night, and the results from the main profile are publicly
@url{http://starpu.gforge.inria.fr/testing/,available}.

@example
$ make check
@end example

@node Installing StarPU
@subsection Installing StarPU

To install StarPU at the location that was specified during
configuration:

@example
$ make install
@end example

Libtool interface versioning information is included in
library names (@code{libstarpu-1.0.so}, @code{libstarpumpi-1.0.so} and
@code{libstarpufft-1.0.so}).

@node Setting up Your Own Code
@section Setting up Your Own Code

@menu
* Setting Flags for Compiling::
* Running a Basic StarPU Application::
* Kernel Threads Started by StarPU::
* Enabling OpenCL::
@end menu

@node Setting Flags for Compiling
@subsection Setting Flags for Compiling, Linking and Running Applications

StarPU provides a @code{pkg-config} file to obtain relevant compiler
and linker flags.
Compiling and linking an application against StarPU may require
specific flags or libraries (for instance @code{CUDA} or @code{libspe2}),
which the @code{pkg-config} tool can report.

If StarPU was not installed at a standard location, the directory
containing StarPU's @code{pkg-config} file must be added to the
@code{PKG_CONFIG_PATH} environment variable so
that @code{pkg-config} can find it. For example, if StarPU was installed in
@code{$prefix_dir}:

@example
$ export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:$prefix_dir/lib/pkgconfig
@end example

The flags required to compile or link against StarPU are then
accessible with the following commands@footnote{It is still possible to use the API
provided in version 0.9 of StarPU by calling @code{pkg-config}
with the @code{libstarpu} package. Similar packages are provided for
@code{libstarpumpi} and @code{libstarpufft}.}:

@example
$ pkg-config --cflags starpu-1.1 # options for the compiler
$ pkg-config --libs starpu-1.1 # options for the linker
@end example

Make sure that @code{pkg-config --libs starpu-1.1} actually produces some output
before going further: @code{PKG_CONFIG_PATH} has to point to the place where
@code{starpu-1.1.pc} was installed during @code{make install}.

Also pass the @code{--static} option if the application is to be
linked statically.
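
The check that @code{pkg-config} actually produces some output can be
automated, for instance at the top of a build script. The following is a
sketch under stated assumptions: @code{starpu_libs} and the
@code{PKG_CONFIG} override variable are hypothetical names, and the
package name defaults to @code{starpu-1.1} as used above.

```shell
# Print the StarPU linker flags, or fail with a hint when pkg-config
# reports nothing (sketch; starpu_libs is a hypothetical helper).
# $1: optional package name, starpu-1.1 by default.
starpu_libs() {
    pkg=${1:-starpu-1.1}
    libs=$(${PKG_CONFIG:-pkg-config} --libs "$pkg" 2>/dev/null)
    if [ -z "$libs" ]; then
        echo "no linker flags for $pkg: check PKG_CONFIG_PATH" >&2
        return 1
    fi
    printf '%s\n' "$libs"
}
# Usage: LIBS=$(starpu_libs) || exit 1
```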

It is also necessary to set the @code{LD_LIBRARY_PATH} variable so that
dynamic libraries can be located at run time.

@example
$ export LD_LIBRARY_PATH=$prefix_dir/lib:$LD_LIBRARY_PATH
@end example
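
Both variables can be set in one step, for instance from a shell startup
file. This is a minimal sketch: @code{starpu_env} is a hypothetical
helper, and unlike the commands above it prepends the StarPU directories
and avoids leaving an empty path entry when a variable was previously
unset.

```shell
# Add a StarPU installation prefix to the build-time and run-time search
# paths (sketch; starpu_env is a hypothetical helper).
# $1: installation prefix, called $prefix_dir in the text.
starpu_env() {
    prefix_dir=$1
    # ${VAR:+:$VAR} appends the old value only when it is non-empty,
    # which avoids a stray ':' (an empty path entry).
    PKG_CONFIG_PATH=$prefix_dir/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}
    LD_LIBRARY_PATH=$prefix_dir/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export PKG_CONFIG_PATH LD_LIBRARY_PATH
}
# Usage: starpu_env "$HOME/starpu"
```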

When using a Makefile, the following lines can be added to set the
options for the compiler and the linker:

@cartouche
@example
CFLAGS += $$(pkg-config --cflags starpu-1.1)
LDFLAGS += $$(pkg-config --libs starpu-1.1)
@end example
@end cartouche

@node Running a Basic StarPU Application
@subsection Running a Basic StarPU Application

Basic examples using StarPU are built in the directory
@code{examples/basic_examples/} (and installed in
@code{$prefix_dir/lib/starpu/examples/}). For example, you can run
@code{vector_scal}:

@example
$ ./examples/basic_examples/vector_scal
BEFORE: First element was 1.000000
AFTER: First element is 3.140000
@end example

When StarPU is used for the first time, the directory
@code{$STARPU_HOME/.starpu/} is created; performance models are stored in
that directory (@pxref{STARPU_HOME}).

Please note that buses are benchmarked when StarPU is launched for the
first time. This may take a few minutes, or less if @code{hwloc} is
installed. This step is done only once per user and per machine.
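
A script can guess whether this first-run step is still to come by
looking for the directory. This is a sketch only:
@code{starpu_first_run} is a hypothetical helper, falling back to
@code{$HOME} when @code{STARPU_HOME} is unset is an assumption, and the
presence of the directory is a hint, not a guarantee that the benchmarks
have completed.

```shell
# Succeed when the per-user StarPU directory does not exist yet, i.e. the
# next launch is probably the first one (sketch; starpu_first_run is a
# hypothetical helper, and the $HOME fallback is an assumption).
starpu_first_run() {
    [ ! -d "${STARPU_HOME:-$HOME}/.starpu" ]
}
# Usage:
#   if starpu_first_run; then
#       echo "First launch: bus benchmarking may take a few minutes"
#   fi
```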

@node Kernel Threads Started by StarPU
@subsection Kernel Threads Started by StarPU

StarPU automatically binds one thread per CPU core. It does not use
SMT/hyperthreading because kernels are usually already optimized for using a
full core, and using hyperthreading would make kernel calibration rather random.

Since driving GPUs is a CPU-consuming task, StarPU dedicates one core per GPU.

While StarPU tasks are executing, the application is not supposed to do
computations in the threads it starts itself; tasks should be used instead.

TODO: add a StarPU function to bind an application thread (e.g. the main thread)
to a dedicated core (and thus disable the corresponding StarPU CPU worker).
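
As a back-of-the-envelope reading of the rules above, the number of cores
left for CPU computation is the number of physical cores minus the number
of GPUs being driven. The helper below only illustrates that arithmetic:
@code{cpu_workers} is a hypothetical name, and the actual worker count
chosen by StarPU may differ depending on its configuration.

```shell
# Estimate how many cores remain for CPU computation when each GPU
# is driven by one dedicated core (sketch; cpu_workers is hypothetical).
# $1: number of physical cores, $2: number of GPUs.
cpu_workers() {
    cores=$1
    gpus=$2
    n=$((cores - gpus))
    # Never report a negative count when there are more GPUs than cores.
    if [ "$n" -lt 0 ]; then
        n=0
    fi
    echo "$n"
}
# Example: cpu_workers 12 2   prints 10
```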

@node Enabling OpenCL
@subsection Enabling OpenCL

When both CUDA and OpenCL drivers are enabled, StarPU will launch an
OpenCL worker for NVIDIA GPUs only if CUDA is not already running on them.
This design choice was necessary because OpenCL and CUDA cannot run at the
same time on the same NVIDIA GPU, as there is currently no interoperability
between them.

To enable OpenCL, you need to disable CUDA either when configuring StarPU:

@example
$ ./configure --disable-cuda
@end example

or when running applications:

@example
$ STARPU_NCUDA=0 ./application
@end example

OpenCL will automatically be started on any device not yet used by
CUDA. On a machine with 4 GPUs, it is therefore possible to
enable CUDA on 2 devices and OpenCL on the other 2 devices by running:

@example
$ STARPU_NCUDA=2 ./application
@end example

@node Benchmarking StarPU
@section Benchmarking StarPU

Some interesting benchmarks are installed among the examples in
@code{$prefix_dir/lib/starpu/examples/}. Make sure to try various
schedulers, for instance @code{STARPU_SCHED=dmda}.
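
Trying several schedulers on the same benchmark can be scripted as
sketched below. @code{for_each_sched} is a hypothetical helper, and
any scheduler name other than @code{dmda} in the usage line is an
assumption to be checked against the schedulers available in your StarPU
build.

```shell
# Run one command under several schedulers (sketch; for_each_sched is a
# hypothetical helper, not part of StarPU).
# $1: space-separated scheduler names, rest: command to run.
for_each_sched() {
    scheds=$1
    shift
    # $scheds is left unquoted on purpose so that it splits on spaces.
    for sched in $scheds; do
        echo "== STARPU_SCHED=$sched =="
        STARPU_SCHED=$sched "$@"
    done
}
# Usage (the scheduler name "eager" is an assumption):
#   for_each_sched "eager dmda" "$prefix_dir/lib/starpu/examples/sgemm"
```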

@menu
* Task size overhead::
* Data transfer latency::
* Gemm::
* Cholesky::
* LU::
@end menu

@node Task size overhead
@subsection Task size overhead

This benchmark gives a glimpse into how large tasks should be for the
StarPU overhead to be low enough. Run @code{tasks_size_overhead.sh}; it
generates a plot of the speedup of tasks of various sizes, depending on
the number of CPUs being used.

@node Data transfer latency
@subsection Data transfer latency

@code{local_pingpong} performs a ping-pong between the first two CUDA nodes, and
prints the measured latency.

@node Gemm
@subsection Matrix-matrix multiplication

@code{sgemm} and @code{dgemm} perform a blocked matrix-matrix
multiplication using BLAS and cuBLAS. They output the achieved GFlops.

@node Cholesky
@subsection Cholesky factorization

The @code{cholesky*} examples perform a single-precision Cholesky
factorization. They use different dependency primitives.

@node LU
@subsection LU factorization

The @code{lu*} examples perform an LU factorization. They use different
dependency primitives.