@c -*-texinfo-*-

@c This file is part of the StarPU Handbook.
@c Copyright (C) 2009--2011  Universit@'e de Bordeaux 1
@c Copyright (C) 2010, 2011, 2012  Centre National de la Recherche Scientifique
@c Copyright (C) 2011, 2012 Institut National de Recherche en Informatique et Automatique
@c See the file starpu.texi for copying conditions.

@menu
* Compilation configuration::
* Execution configuration through environment variables::
@end menu

@node Compilation configuration
@section Compilation configuration

The following arguments can be given to the @code{configure} script.

@menu
* Common configuration::
* Configuring workers::
* Extension configuration::
* Advanced configuration::
@end menu

@node Common configuration
@subsection Common configuration

@table @code

@item --enable-debug
Enable debugging messages.

@item --enable-fast
Disable assertion checks, which saves computation time.

@item --enable-verbose
Increase the verbosity of the debugging messages.  This can be disabled
at runtime by setting the environment variable @code{STARPU_SILENT} to
any value.

@smallexample
% STARPU_SILENT=1 ./vector_scal
@end smallexample

@item --enable-coverage
Enable flags for the @code{gcov} coverage tool.

@end table

@node Configuring workers
@subsection Configuring workers

@table @code

@item --enable-maxcpus=@var{count}
Use at most @var{count} CPU cores.  This information is then
available as the @code{STARPU_MAXCPUS} macro.

@item --disable-cpu
Disable the use of CPUs of the machine. Only GPUs etc. will be used.

@item --enable-maxcudadev=@var{count}
Use at most @var{count} CUDA devices.  This information is then
available as the @code{STARPU_MAXCUDADEVS} macro.

@item --disable-cuda
Disable the use of CUDA, even if a valid CUDA installation was detected.

@item --with-cuda-dir=@var{prefix}
Search for CUDA under @var{prefix}, which should notably contain
@file{include/cuda.h}.

@item --with-cuda-include-dir=@var{dir}
Search for CUDA headers under @var{dir}, which should
notably contain @code{cuda.h}. This defaults to @code{/include} appended to the
value given to @code{--with-cuda-dir}.

@item --with-cuda-lib-dir=@var{dir}
Search for CUDA libraries under @var{dir}, which should notably contain
the CUDA shared libraries---e.g., @file{libcuda.so}.  This defaults to
@code{/lib} appended to the value given to @code{--with-cuda-dir}.

@item --disable-cuda-memcpy-peer
Explicitly disable peer transfers when using CUDA 4.0.

@item --enable-maxopencldev=@var{count}
Use at most @var{count} OpenCL devices.
This information is then
available as the @code{STARPU_MAXOPENCLDEVS} macro.

@item --disable-opencl
Disable the use of OpenCL, even if the SDK is detected.

@item --with-opencl-dir=@var{prefix}
Search for an OpenCL implementation under @var{prefix}, which should
notably contain @file{include/CL/cl.h} (or @file{include/OpenCL/cl.h} on
Mac OS).

@item --with-opencl-include-dir=@var{dir}
Search for OpenCL headers under @var{dir}, which should notably contain
@file{CL/cl.h} (or @file{OpenCL/cl.h} on Mac OS).  This defaults to
@code{/include} appended to the value given to @code{--with-opencl-dir}.

@item --with-opencl-lib-dir=@var{dir}
Search for an OpenCL library under @var{dir}, which should notably
contain the OpenCL shared libraries---e.g. @file{libOpenCL.so}. This defaults to
@code{/lib} appended to the value given to @code{--with-opencl-dir}.

@item --enable-gordon
Enable the use of the Gordon runtime for Cell SPUs.
@c TODO: rather default to enabled when detected

@item --with-gordon-dir=@var{prefix}
Search for the Gordon SDK under @var{prefix}.

@item --enable-maximplementations=@var{count}
Allow for at most @var{count} codelet implementations for the same
target device.  This information is then available as the
@code{STARPU_MAXIMPLEMENTATIONS} macro.

@end table

@node Extension configuration
@subsection Extension configuration

@table @code

@item --disable-socl
Disable the SOCL extension (@pxref{SOCL OpenCL Extensions}).  By
default, it is enabled when an OpenCL implementation is found.

@item --disable-starpu-top
Disable the StarPU-Top interface (@pxref{StarPU-Top}).  By default, it
is enabled when the required dependencies are found.

@item --disable-gcc-extensions
Disable the GCC plug-in (@pxref{C Extensions}).
By default, it is
enabled when the GCC compiler provides plug-in support.

@item --with-mpicc=@var{path}
Use the @command{mpicc} compiler at @var{path}, for starpumpi
(@pxref{StarPU MPI support}).

@item --enable-comm-stats
Enable communication statistics for starpumpi
(@pxref{StarPU MPI support}).

@end table

@node Advanced configuration
@subsection Advanced configuration

@table @code

@item --enable-perf-debug
Enable performance debugging through gprof.

@item --enable-model-debug
Enable performance model debugging.

@item --enable-stats
@c see ../../src/datawizard/datastats.c
Enable gathering of memory transfer statistics.

@item --enable-maxbuffers
Define the maximum number of buffers that tasks will be able to take
as parameters, then available as the @code{STARPU_NMAXBUFS} macro.

@item --enable-allocation-cache
Enable the use of a data allocation cache to avoid the cost of
allocation with CUDA. Still experimental.

@item --enable-opengl-render
Enable the use of OpenGL for the rendering of some examples.
@c TODO: rather default to enabled when detected

@item --enable-blas-lib
Specify the BLAS library to be used by some of the examples. The
library has to be 'atlas' or 'goto'.

@item --disable-starpufft
Disable the build of libstarpufft, even if fftw or cuFFT is available.

@item --with-magma=@var{prefix}
Search for MAGMA under @var{prefix}.  @var{prefix} should notably
contain @file{include/magmablas.h}.

@item --with-fxt=@var{prefix}
Search for FxT under @var{prefix}.
@url{http://savannah.nongnu.org/projects/fkt, FxT} is used to generate
traces of scheduling events, which can then be rendered using ViTE
(@pxref{Off-line, off-line performance feedback}).
@var{prefix} should
notably contain @code{include/fxt/fxt.h}.

@item --with-perf-model-dir=@var{dir}
Store performance models under @var{dir}, instead of the current user's
home.

@item --with-goto-dir=@var{prefix}
Search for GotoBLAS under @var{prefix}.

@item --with-atlas-dir=@var{prefix}
Search for ATLAS under @var{prefix}, which should notably contain
@file{include/cblas.h}.

@item --with-mkl-cflags=@var{cflags}
Use @var{cflags} to compile code that uses the MKL library.

@item --with-mkl-ldflags=@var{ldflags}
Use @var{ldflags} when linking code that uses the MKL library.  Note
that the
@url{http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/, MKL website}
provides a script to determine the linking flags.

@end table

@node Execution configuration through environment variables
@section Execution configuration through environment variables

@menu
* Workers::                     Configuring workers
* Scheduling::                  Configuring the Scheduling engine
* Misc::                        Miscellaneous and debug
@end menu

@node Workers
@subsection Configuring workers

@menu
* STARPU_NCPU::                    Number of CPU workers
* STARPU_NCUDA::                   Number of CUDA workers
* STARPU_NOPENCL::                 Number of OpenCL workers
* STARPU_NGORDON::                 Number of SPU workers (Cell)
* STARPU_WORKERS_NOBIND::          Do not bind workers
* STARPU_WORKERS_CPUID::           Bind workers to specific CPUs
* STARPU_WORKERS_CUDAID::          Select specific CUDA devices
* STARPU_WORKERS_OPENCLID::        Select specific OpenCL devices
* STARPU_SINGLE_COMBINED_WORKER::  Do not use concurrent workers
* STARPU_MIN_WORKERSIZE::          Minimum size of the combined workers
* STARPU_MAX_WORKERSIZE::          Maximum size of the combined workers
@end menu

@node STARPU_NCPU
@subsubsection @code{STARPU_NCPU} -- Number of CPU workers

Specify the number of CPU workers (thus not including workers dedicated to
controlling accelerators).
Note that by default, StarPU will not allocate
more CPU workers than there are physical CPUs, and that some CPUs are used to control
the accelerators.

@node STARPU_NCUDA
@subsubsection @code{STARPU_NCUDA} -- Number of CUDA workers

Specify the number of CUDA devices that StarPU can use. If
@code{STARPU_NCUDA} is lower than the number of physical devices, it is
possible to select which CUDA devices should be used by means of the
@code{STARPU_WORKERS_CUDAID} environment variable. By default, StarPU will
create as many CUDA workers as there are CUDA devices.

@node STARPU_NOPENCL
@subsubsection @code{STARPU_NOPENCL} -- Number of OpenCL workers

OpenCL equivalent of the @code{STARPU_NCUDA} environment variable.

@node STARPU_NGORDON
@subsubsection @code{STARPU_NGORDON} -- Number of SPU workers (Cell)

Specify the number of SPUs that StarPU can use.

@node STARPU_WORKERS_NOBIND
@subsubsection @code{STARPU_WORKERS_NOBIND} -- Do not bind workers to specific CPUs

Setting it to non-zero will prevent StarPU from binding its threads to
CPUs. This is for instance useful when running the testsuite in parallel.

@node STARPU_WORKERS_CPUID
@subsubsection @code{STARPU_WORKERS_CPUID} -- Bind workers to specific CPUs

Passing an array of integers (starting from 0) in @code{STARPU_WORKERS_CPUID}
specifies on which logical CPU the different workers should be
bound. For instance, if @code{STARPU_WORKERS_CPUID = "0 1 4 5"}, the first
worker will be bound to logical CPU #0, the second CPU worker will be bound to
logical CPU #1 and so on.  Note that the logical ordering of the CPUs is either
determined by the OS, or provided by the @code{hwloc} library in case it is
available.

Note that the first workers correspond to the CUDA workers, then come the
OpenCL and the SPU, and finally the CPU workers.
For example, if
we have @code{STARPU_NCUDA=1}, @code{STARPU_NOPENCL=1}, @code{STARPU_NCPU=2}
and @code{STARPU_WORKERS_CPUID = "0 2 1 3"}, the CUDA device will be controlled
by logical CPU #0, the OpenCL device will be controlled by logical CPU #2, and
the logical CPUs #1 and #3 will be used by the CPU workers.

If the number of workers is larger than the array given in
@code{STARPU_WORKERS_CPUID}, the workers are bound to the logical CPUs in a
round-robin fashion: if @code{STARPU_WORKERS_CPUID = "0 1"}, the first and the
third (resp. second and fourth) workers will be put on CPU #0 (resp. CPU #1).

This variable is ignored if the @code{use_explicit_workers_bindid} flag of the
@code{starpu_conf} structure passed to @code{starpu_init} is set.

@node STARPU_WORKERS_CUDAID
@subsubsection @code{STARPU_WORKERS_CUDAID} -- Select specific CUDA devices

Similarly to the @code{STARPU_WORKERS_CPUID} environment variable, it is
possible to select which CUDA devices should be used by StarPU. On a machine
equipped with 4 GPUs, setting @code{STARPU_WORKERS_CUDAID = "1 3"} and
@code{STARPU_NCUDA=2} specifies that 2 CUDA workers should be created, and that
they should use CUDA devices #1 and #3 (the logical ordering of the devices is
the one reported by CUDA).

This variable is ignored if the @code{use_explicit_workers_cuda_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.

@node STARPU_WORKERS_OPENCLID
@subsubsection @code{STARPU_WORKERS_OPENCLID} -- Select specific OpenCL devices

OpenCL equivalent of the @code{STARPU_WORKERS_CUDAID} environment variable.

This variable is ignored if the @code{use_explicit_workers_opencl_gpuid} flag of
the @code{starpu_conf} structure passed to @code{starpu_init} is set.

@node STARPU_SINGLE_COMBINED_WORKER
@subsubsection @code{STARPU_SINGLE_COMBINED_WORKER} -- Do not use concurrent workers

If set, StarPU will create several workers which won't be able to work
concurrently.
It will create combined workers whose size ranges from 1 to the
total number of CPU workers in the system.

@node STARPU_MIN_WORKERSIZE
@subsubsection @code{STARPU_MIN_WORKERSIZE} -- Minimum size of the combined workers

Let the user give a hint to StarPU about how many workers
(lower bound) the combined workers should contain.

@node STARPU_MAX_WORKERSIZE
@subsubsection @code{STARPU_MAX_WORKERSIZE} -- Maximum size of the combined workers

Let the user give a hint to StarPU about how many workers
(upper bound) the combined workers should contain.

@node Scheduling
@subsection Configuring the Scheduling engine

@menu
* STARPU_SCHED::                Scheduling policy
* STARPU_CALIBRATE::            Calibrate performance models
* STARPU_BUS_CALIBRATE::        Calibrate bus
* STARPU_PREFETCH::             Use data prefetch
* STARPU_SCHED_ALPHA::          Computation factor
* STARPU_SCHED_BETA::           Communication factor
@end menu

@node STARPU_SCHED
@subsubsection @code{STARPU_SCHED} -- Scheduling policy

Choose between the different scheduling policies proposed by StarPU:
random, work stealing, greedy, with performance models, etc.

Use @code{STARPU_SCHED=help} to get the list of available schedulers.

@node STARPU_CALIBRATE
@subsubsection @code{STARPU_CALIBRATE} -- Calibrate performance models

If this variable is set to 1, the performance models are calibrated during
the execution. If it is set to 2, the previous values are dropped to restart
calibration from scratch. Setting this variable to 0 disables calibration; this
is the default behaviour.

Note: this currently only applies to @code{dm}, @code{dmda} and @code{heft} scheduling policies.

@node STARPU_BUS_CALIBRATE
@subsubsection @code{STARPU_BUS_CALIBRATE} -- Calibrate bus

If this variable is set to 1, the bus is recalibrated during initialization.

@node STARPU_PREFETCH
@subsubsection @code{STARPU_PREFETCH} -- Use data prefetch

This variable indicates whether data prefetching should be enabled (0 means
that it is disabled).
If prefetching is enabled, when a task is scheduled to be
executed e.g. on a GPU, StarPU will request an asynchronous transfer in
advance, so that data is already present on the GPU when the task starts. As a
result, computation and data transfers are overlapped.

Note that prefetching is enabled by default in StarPU.

@node STARPU_SCHED_ALPHA
@subsubsection @code{STARPU_SCHED_ALPHA} -- Computation factor

To estimate the cost of a task, StarPU takes into account the estimated
computation time (obtained thanks to performance models). The alpha factor is
the coefficient to be applied to it before adding it to the communication part.

@node STARPU_SCHED_BETA
@subsubsection @code{STARPU_SCHED_BETA} -- Communication factor

To estimate the cost of a task, StarPU takes into account the estimated
data transfer time (obtained thanks to performance models). The beta factor is
the coefficient to be applied to it before adding it to the computation part.

@node Misc
@subsection Miscellaneous and debug

@menu
* STARPU_SILENT::               Disable verbose mode
* STARPU_LOGFILENAME::          Select debug file name
* STARPU_FXT_PREFIX::           FxT trace location
* STARPU_LIMIT_GPU_MEM::        Restrict memory size on the GPUs
* STARPU_GENERATE_TRACE::       Generate a Paje trace when StarPU is shut down
@end menu

@node STARPU_SILENT
@subsubsection @code{STARPU_SILENT} -- Disable verbose mode

This variable disables verbose mode at runtime when StarPU
has been configured with the option @code{--enable-verbose}.

@node STARPU_LOGFILENAME
@subsubsection @code{STARPU_LOGFILENAME} -- Select debug file name

This variable specifies the file in which the debugging output should be saved.

@node STARPU_FXT_PREFIX
@subsubsection @code{STARPU_FXT_PREFIX} -- FxT trace location

This variable specifies in which directory to save the trace generated if FxT is enabled.
It needs to have a trailing '/' character.

@node STARPU_LIMIT_GPU_MEM
@subsubsection @code{STARPU_LIMIT_GPU_MEM} -- Restrict memory size on the GPUs

This variable specifies the maximum number of megabytes that should be
available to the application on each GPU. In case this value is smaller than
the size of the memory of a GPU, StarPU pre-allocates a buffer to waste memory
on the device. This variable is intended to be used for experimental purposes
as it emulates devices that have a limited amount of memory.

@node STARPU_GENERATE_TRACE
@subsubsection @code{STARPU_GENERATE_TRACE} -- Generate a Paje trace when StarPU is shut down

When set to 1, this variable indicates that StarPU should automatically
generate a Paje trace when @code{starpu_shutdown} is called.
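
As a minimal sketch, assuming that StarPU was built with FxT support
(@code{--with-fxt}) and that @file{/tmp/} is an acceptable trace directory
(a hypothetical choice made only for illustration), this variable can be
combined with @code{STARPU_FXT_PREFIX} on the command line.  The
@code{vector_scal} program is the one used in the @code{--enable-verbose}
example.

@smallexample
% STARPU_FXT_PREFIX=/tmp/ STARPU_GENERATE_TRACE=1 ./vector_scal
@end smallexample

Note the trailing @samp{/} in the value of @code{STARPU_FXT_PREFIX}, which is
required.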