@c -*-texinfo-*-

@c This file is part of the StarPU Handbook.
@c Copyright (C) 2009--2011  Universit@'e de Bordeaux 1
@c Copyright (C) 2010, 2011, 2012, 2013  Centre National de la Recherche Scientifique
@c Copyright (C) 2011 Institut National de Recherche en Informatique et Automatique
@c See the file starpu.texi for copying conditions.

@menu
* Per-worker library initialization::  How to initialize a computation library once for each worker?
* Limit memory::
* Thread Binding on NetBSD::
@end menu

@node Per-worker library initialization
@section How to initialize a computation library once for each worker?

Some libraries need to be initialized once for each concurrent instance that
may run on the machine. A typical case is a C++ computation class which is not
thread-safe by itself, but of which several instantiated objects can be used
concurrently. Such a library can be used in StarPU by initializing one object
per worker. For instance, the libstarpufft example does the following to be
able to use FFTW.

A global array stores the instantiated objects:

@cartouche
@smallexample
fftw_plan plan_cpu[STARPU_NMAXWORKERS];
@end smallexample
@end cartouche

At initialization time of libstarpu, the objects are initialized:

@cartouche
@smallexample
int workerid;
for (workerid = 0; workerid < starpu_worker_get_count(); workerid++) @{
    switch (starpu_worker_get_type(workerid)) @{
        case STARPU_CPU_WORKER:
            /* one plan per CPU worker */
            plan_cpu[workerid] = fftw_plan(...);
            break;
    @}
@}
@end smallexample
@end cartouche

And in the codelet body, they are used:

@cartouche
@smallexample
static void fft(void *descr[], void *_args)
@{
    /* pick the plan that belongs to the worker running this task */
    int workerid = starpu_worker_get_id();
    fftw_plan plan = plan_cpu[workerid];
    ...
    fftw_execute(plan, ...);
@}
@end smallexample
@end cartouche

Another approach, which is sometimes necessary, is to execute the
initialization code from the workers themselves, using
@code{starpu_execute_on_each_worker}. CUDA may require this to behave properly,
because of threading issues. For instance, StarPU's @code{starpu_cublas_init}
looks like the following, so that @code{cublasInit} is called from the workers
themselves:

@cartouche
@smallexample
static void init_cublas_func(void *args STARPU_ATTRIBUTE_UNUSED)
@{
    cublasStatus cublasst = cublasInit();
    cublasSetKernelStream(starpu_cuda_get_local_stream());
@}

void starpu_cublas_init(void)
@{
    starpu_execute_on_each_worker(init_cublas_func, NULL, STARPU_CUDA);
@}
@end smallexample
@end cartouche

@node Limit memory
@section How to limit memory per node

TODO

Talk about
@code{STARPU_LIMIT_CUDA_devid_MEM}, @code{STARPU_LIMIT_CUDA_MEM},
@code{STARPU_LIMIT_OPENCL_devid_MEM}, @code{STARPU_LIMIT_OPENCL_MEM}
and @code{STARPU_LIMIT_CPU_MEM}

@code{starpu_memory_get_available}

@node Thread Binding on NetBSD
@section Thread Binding on NetBSD

When using StarPU on a NetBSD machine, thread binding will fail if the topology
discovery library @code{hwloc} is used. To prevent the problem, use at least
version 1.7 of @code{hwloc}, and also run the following command:

@example
$ sysctl -w security.models.extensions.user_set_cpu_affinity=1
@end example

Or add the following line to the file @code{/etc/sysctl.conf}:

@example
security.models.extensions.user_set_cpu_affinity=1
@end example
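
The @code{hwloc} version requirement above can be checked from a script before
touching the sysctl setting. The following is only a minimal sketch, under two
assumptions: that @code{hwloc} installed a pkg-config file, so that
@code{pkg-config --modversion hwloc} prints its version, and that the version
string is plain numeric @code{major.minor[.patch]}. The @code{version_at_least}
helper is hypothetical; it is not part of StarPU or @code{hwloc}.

```shell
#!/bin/sh
# Hypothetical helper (not part of StarPU or hwloc): succeeds when the
# version string $1 is at least the major.minor version $2.
# Assumes plain numeric versions such as 1.7 or 1.11.13.
version_at_least() {
    have_major=${1%%.*}
    case $1 in
        *.*) have_rest=${1#*.}; have_minor=${have_rest%%.*} ;;
        *)   have_minor=0 ;;
    esac
    want_major=${2%%.*}
    case $2 in
        *.*) want_rest=${2#*.}; want_minor=${want_rest%%.*} ;;
        *)   want_minor=0 ;;
    esac
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "$have_minor" -ge "$want_minor" ]; }
}

# Assumption: hwloc is known to pkg-config on this machine.
if hwloc_version=$(pkg-config --modversion hwloc 2>/dev/null); then
    if version_at_least "$hwloc_version" 1.7; then
        echo "hwloc $hwloc_version is recent enough"
    else
        echo "hwloc $hwloc_version is older than 1.7"
    fi
else
    echo "could not determine the hwloc version with pkg-config"
fi
```

The comparison only uses POSIX shell parameter expansion, so it does not depend
on a version-aware @code{sort}. If pkg-config does not know about
@code{hwloc}, the script only reports that it could not determine the version.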