- /* StarPU
- *
- * Copyright (C) 2010-2013,2015-2019 CNRS
- * Copyright (C) 2011-2013 Inria
- * Copyright (C) 2009-2011,2014,2015,2019 Université de Bordeaux
- *
- * StarPU is free software; you can redistribute it and/or modify
- * it under the terms of the GNU Lesser General Public License as published by
- * the Free Software Foundation; either version 2.1 of the License, or (at
- * your option) any later version.
- *
- * StarPU is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
- *
- * See the GNU Lesser General Public License in COPYING.LGPL for more details.
- */
- /*! \page BasicExamples Basic Examples
- \section HelloWorldUsingStarPUAPI Hello World
- This section shows how to implement a simple program that submits a task
- to StarPU.
- \subsection RequiredHeaders Required Headers
- The header starpu.h should be included in any code using StarPU.
- \code{.c}
- #include <starpu.h>
- \endcode
- \subsection DefiningACodelet Defining A Codelet
- A codelet is a structure that represents a computational kernel. Such a codelet
- may contain an implementation of the same kernel on different architectures
- (e.g. CUDA, x86, ...). For compatibility, make sure that the whole
- structure is properly initialized to zero, either by using the
- function starpu_codelet_init(), or by letting the
- compiler implicitly do it as exemplified below.
- The field starpu_codelet::nbuffers specifies the number of data buffers that are
- manipulated by the codelet: it is 0 for a codelet that does not access or modify
- any data that is controlled by our data management library.
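The zero-initialization requirement above can be illustrated without StarPU itself. The following minimal sketch uses a hypothetical stand-in structure, <c>demo_codelet</c> (not the real <c>struct starpu_codelet</c>, which has many more fields), to show that C designated initializers leave every unnamed field set to zero. The helper <c>demo_codelet_init()</c> is likewise hypothetical; it only mirrors the idea of clearing the structure explicitly before filling it in at run time.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_IMPL 4 /* hypothetical limit, for illustration only */

typedef void (*demo_cpu_func_t)(void *buffers[], void *cl_arg);

/* Hypothetical stand-in for a codelet structure; NOT struct starpu_codelet. */
struct demo_codelet
{
	demo_cpu_func_t cpu_funcs[DEMO_MAX_IMPL];
	const char *cpu_funcs_name[DEMO_MAX_IMPL];
	int nbuffers;
	int flags; /* stands for any field the application never mentions */
};

/* Trivial kernel that ignores both arguments. */
static void demo_hello(void *buffers[], void *cl_arg)
{
	(void)buffers;
	(void)cl_arg;
}

/* Implicit zeroing: fields not named in the initializer are set to zero. */
static struct demo_codelet demo_cl_implicit =
{
	.cpu_funcs = { demo_hello },
	.nbuffers = 0
};

/* Explicit zeroing, for a structure that is filled in at run time. */
static void demo_codelet_init(struct demo_codelet *cl)
{
	assert(cl != NULL);
	memset(cl, 0, sizeof(*cl));
}

/* Return 1 when the fields that were never assigned are all zero. */
static int demo_untouched_fields_are_zero(const struct demo_codelet *cl)
{
	return cl->cpu_funcs[1] == NULL
		&& cl->cpu_funcs_name[0] == NULL
		&& cl->flags == 0;
}
```

Either form gives a structure whose unused fields are zero, so fields added to the structure later keep a well-defined default value.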
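- We create a codelet which may only be executed on CPUs. When a CPU
- core executes a codelet, it calls the CPU implementation of that codelet (here
- <c>scal_cpu_func</c>), which \em must have the prototype
- <c>void (*cpu_func)(void *buffers[], void *cl_arg)</c>:
- \code{.c}
- void scal_cpu_func(void *buffers[], void *cl_arg)
- {
-     unsigned i;
-     // constant factor passed through starpu_task::cl_arg
-     float *factor = cl_arg;
-     // length of the vector
-     unsigned n = STARPU_VECTOR_GET_NX(buffers[0]);
-     // CPU copy of the vector pointer
-     float *val = (float *)STARPU_VECTOR_GET_PTR(buffers[0]);
-     for (i = 0; i < n; i++)
-         val[i] *= *factor;
- }
- struct starpu_codelet cl =
- {
-     .cpu_funcs = { scal_cpu_func },
-     .cpu_funcs_name = { "scal_cpu_func" },
-     .nbuffers = 1,
-     .modes = { STARPU_RW }
- };
- \endcode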
- The first argument is an array that gives
- a description of all the buffers passed in the array starpu_task::handles. The
- size of this array is given by the field starpu_codelet::nbuffers. For
- the sake of genericity, this array contains pointers to the different
- interfaces describing each buffer. In the case of the <b>vector
- interface</b>, the location of the vector and its length are
- accessible in the fields starpu_vector_interface::ptr and
- starpu_vector_interface::nx of this interface. Since the vector is
- accessed in a read-write fashion, any modification will automatically
- affect future accesses to this vector made by other tasks.
- The second argument of the function <c>scal_cpu_func</c> is a
- pointer to the parameters of the codelet (given in
- starpu_task::cl_arg); the constant factor is read through this
- pointer.
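The calling convention described above can be sketched in plain C, without linking against StarPU. The structure <c>demo_vector_interface</c> and the helper <c>demo_run_task()</c> are hypothetical stand-ins: the first models only the two fields named in the text (<c>ptr</c> and <c>nx</c>), and the second imitates, in a very simplified way, what a runtime does when it hands a buffer description and <c>cl_arg</c> to a kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vector interface; only the two fields
 * named in the text (ptr and nx) are modelled. */
struct demo_vector_interface
{
	uintptr_t ptr; /* location of the vector */
	uint32_t nx;   /* number of elements */
};

/* Kernel with the prototype required for CPU implementations. */
static void demo_scal_cpu_func(void *buffers[], void *cl_arg)
{
	/* buffers[0] points to the interface describing the first buffer */
	struct demo_vector_interface *vector = buffers[0];
	float *val = (float *)vector->ptr;
	/* cl_arg points to the codelet parameters, here a single float */
	float *factor = cl_arg;
	uint32_t i;

	for (i = 0; i < vector->nx; i++)
		val[i] *= *factor;
}

/* Simplified picture of running a task with one buffer: describe the data
 * with an interface, then call the kernel with buffers[] and cl_arg. */
static void demo_run_task(float *data, uint32_t n, float factor)
{
	struct demo_vector_interface vector = { .ptr = (uintptr_t)data, .nx = n };
	void *buffers[1] = { &vector };

	assert(data != NULL || n == 0);
	demo_scal_cpu_func(buffers, &factor);
}
```

Calling <c>demo_run_task(v, 5, 3.0f)</c> on a vector holding 0, 1, 2, 3, 4 scales it in place to 0, 3, 6, 9, 12. The kernel never sees the array directly: it only receives the interface and the argument pointer, which is what lets the same prototype serve any kind of data.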
- \subsection ExecutionOfVectorScaling Execution of Vector Scaling
- \verbatim
- $ make vector_scal
- cc $(pkg-config
- $ ./vector_scal
- 0.000000 3.000000 6.000000 9.000000 12.000000
- \endverbatim
- \section VectorScalingOnAnHybridCPUGPUMachine Vector Scaling on a Hybrid CPU/GPU Machine
- Contrary to the previous examples, the task submitted in this example may not
- only be executed by the CPUs, but also by a CUDA device.
- \subsection DefinitionOfTheCUDAKernel Definition of the CUDA Kernel
- The CUDA implementation can be written as follows. It needs to be compiled with
- a CUDA compiler such as nvcc, the NVIDIA CUDA compiler driver. It must be noted
- that the vector pointer returned by ::STARPU_VECTOR_GET_PTR is here a
- pointer in GPU memory, so that it can be passed as such to the
- kernel call <c>vector_mult_cuda</c>.
- \snippet vector_scal_cuda.c To be included. You should update doxygen if you see this text.
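The CUDA source is pulled in by the \snippet directive above and is not reproduced here. As a rough illustration of how a kernel such as <c>vector_mult_cuda</c> typically covers a vector, the plain C sketch below emulates a one-dimensional launch sequentially. The names <c>demo_nblocks</c>, <c>demo_vector_mult_thread</c> and <c>demo_launch</c>, as well as the block size chosen by the caller, are assumptions made for illustration; they are not part of StarPU or CUDA.

```c
#include <assert.h>

/* Number of blocks needed to cover n elements, rounded up. */
static unsigned demo_nblocks(unsigned n, unsigned threads_per_block)
{
	assert(threads_per_block > 0);
	return (n + threads_per_block - 1) / threads_per_block;
}

/* Body of the kernel for one (block, thread) pair, written in plain C. */
static void demo_vector_mult_thread(unsigned block, unsigned thread,
				    unsigned threads_per_block,
				    unsigned n, float *val, float factor)
{
	unsigned i = block * threads_per_block + thread;

	if (i < n) /* threads past the end of the vector do nothing */
		val[i] *= factor;
}

/* Sequential emulation of the whole launch. */
static void demo_launch(unsigned n, float *val, float factor,
			unsigned threads_per_block)
{
	unsigned nblocks = demo_nblocks(n, threads_per_block);
	unsigned block, thread;

	for (block = 0; block < nblocks; block++)
		for (thread = 0; thread < threads_per_block; thread++)
			demo_vector_mult_thread(block, thread, threads_per_block,
						n, val, factor);
}
```

With 5 elements and 4 threads per block, two blocks are launched and the last three threads of the second block fall outside the vector, which is why the bounds check is needed.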
- \subsection DefinitionOfTheOpenCLKernel Definition of the OpenCL Kernel
- The OpenCL implementation can be written as follows. StarPU provides
- tools to compile an OpenCL kernel stored in a file.
- \code{.c}
- __kernel void vector_mult_opencl(int nx, __global float* val, float factor)
- {
-     const int i = get_global_id(0);
-     if (i < nx)
-     {
-         val[i] *= factor;
-     }
- }
- \endcode
- Unlike the CUDA and CPU implementations, the OpenCL implementation has to use
- ::STARPU_VECTOR_GET_DEV_HANDLE, which returns a <c>cl_mem</c>. This is an OpenCL
- handle rather than a device pointer, and it can be passed as such to the OpenCL
- kernel. The difference is important when using partitioning, see \ref PartitioningData.
- \snippet vector_scal_opencl.c To be included. You should update doxygen if you see this text.
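One way to picture why the difference between a handle and a device pointer matters for partitioning: a pointer to a sub-vector can be obtained by pointer arithmetic, whereas an opaque handle cannot be offset, so the position of the sub-vector has to be carried separately (StarPU exposes it through ::STARPU_VECTOR_GET_OFFSET). The plain C sketch below models this with hypothetical stand-in types; <c>demo_dev_handle</c>, <c>demo_handle_view</c> and the byte-based offset are assumptions for illustration, not StarPU or OpenCL definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device allocation; callers only hold a handle to it. */
struct demo_dev_buffer
{
	float storage[8];
};
typedef struct demo_dev_buffer *demo_dev_handle;

/* Hypothetical description of a (sub-)vector as seen through a handle. */
struct demo_handle_view
{
	demo_dev_handle handle; /* identifies the whole allocation */
	size_t offset;          /* start of this piece, in bytes (assumption) */
	unsigned nx;            /* number of elements in this piece */
};

/* Split a view into two halves: the handle is shared, only offset/nx change. */
static void demo_split(const struct demo_handle_view *whole,
		       struct demo_handle_view *first,
		       struct demo_handle_view *second)
{
	unsigned half = whole->nx / 2;

	*first = *whole;
	first->nx = half;
	*second = *whole;
	second->offset = whole->offset + half * sizeof(float);
	second->nx = whole->nx - half;
}

/* "Kernel" side: resolve handle + offset into an address, then scale. */
static void demo_scale_view(const struct demo_handle_view *view, float factor)
{
	float *val = (float *)((char *)view->handle->storage + view->offset);
	unsigned i;

	assert(view->offset % sizeof(float) == 0);
	for (i = 0; i < view->nx; i++)
		val[i] *= factor;
}
```

Both halves produced by <c>demo_split()</c> carry the same handle, so a kernel that ignored the offset would always work on the beginning of the allocation.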
- \subsection DefinitionOfTheMainCode Definition of the Main Code
- The CPU implementation is the same as in the previous section.
- Here is the source of the main application. Note that the fields
- starpu_codelet::cuda_funcs and starpu_codelet::opencl_funcs are set to
- define the pointers to the CUDA and OpenCL implementations of the
- task.
- \snippet vector_scal_c.c To be included. You should update doxygen if you see this text.
- \subsection ExecutionOfHybridVectorScaling Execution of Hybrid Vector Scaling
- The Makefile given at the beginning of the section must be extended to
- give the rules to compile the CUDA source code. Note that the source
- file of the OpenCL kernel does not need to be compiled now: it will
- be compiled at run-time when calling the function
- starpu_opencl_load_opencl_from_file().
- \verbatim
- CFLAGS += $(shell pkg-config
- LDLIBS += $(shell pkg-config
- CC = gcc
- vector_scal: vector_scal.o vector_scal_cpu.o vector_scal_cuda.o vector_scal_opencl.o
- %.o: %.cu
- 	nvcc $(CFLAGS) -c $< -o $@
- clean:
- 	rm -f vector_scal *.o
- \endverbatim
- \verbatim
- $ make
- \endverbatim
- To execute it with the default configuration:
- \verbatim
- $ ./vector_scal
- 0.000000 3.000000 6.000000 9.000000 12.000000
- \endverbatim
- or for example, by disabling CPU devices:
- \verbatim
- $ STARPU_NCPU=0 ./vector_scal
- 0.000000 3.000000 6.000000 9.000000 12.000000
- \endverbatim
- or by disabling CUDA devices (which may allow OpenCL to be used,
- see \ref EnablingOpenCL):
- \verbatim
- $ STARPU_NCUDA=0 ./vector_scal
- 0.000000 3.000000 6.000000 9.000000 12.000000
- \endverbatim
- */