|
@@ -2,7 +2,7 @@
|
|
|
* This file is part of the StarPU Handbook.
|
|
|
* Copyright (C) 2009--2011 Universit@'e de Bordeaux
|
|
|
* Copyright (C) 2010, 2011, 2012, 2013, 2014, 2015, 2016 CNRS
|
|
|
- * Copyright (C) 2011, 2012 INRIA
|
|
|
+ * Copyright (C) 2011, 2012, 2017 INRIA
|
|
|
* See the file version.doxy for copying conditions.
|
|
|
*/
|
|
|
|
|
@@ -681,8 +681,6 @@ for(x = 0; x < nblocks ; x++) {
|
|
|
starpu_mpi_gather_detached(data_handles, nblocks, 0, MPI_COMM_WORLD);
|
|
|
\endcode
|
|
|
|
|
|
-*/
|
|
|
-
|
|
|
Other collective operations would be easy to define, just ask starpu-devel for
|
|
|
them!
|
|
|
|
|
@@ -726,3 +724,30 @@ uses implicit MPI data transfers, <c>plu_outofcore_example</c> uses implicit MPI
|
|
|
data transfers and supports data matrices which do not fit in memory (out-of-core).
|
|
|
</li>
|
|
|
</ul>
|
|
|
+
|
|
|
+\section MPIMasterSlave MPI Master Slave Support
|
|
|
+
|
|
|
+StarPU includes another way to execute an application across many nodes. The Master
|
|
|
+Slave support makes it possible to use remote cores without having to think about data distribution. This
|
|
|
+support can be activated with the <c>--enable-mpi-master-slave</c> configure option. However, you should not activate
|
|
|
+both MPI support and MPI Master-Slave support.
|
|
|
+
|
|
|
+If a codelet contains a kernel for CPU devices, it is automatically eligible to be executed
|
|
|
+on an MPI Slave device. However, you can decide to execute the codelet on an MPI Slave by filling
|
|
|
+the <c>mpi_ms_funcs</c> variable. The functions have to be globally visible (i.e. not static) for
|
|
|
+StarPU to be able to look them up, and <c>-rdynamic</c> must be passed to gcc (or <c>-export-dynamic</c> to ld)
|
|
|
+so that symbols of the main program are visible.
|
|
|
+
|
|
|
+By default, one core on the master is dedicated to managing the entire set of slaves. If MPI
|
|
|
+has good multi-threading support, you can use <c>--with-mpi-master-slave-multiple-thread</c> to
|
|
|
+dedicate one core per slave.
|
|
|
+
|
|
|
+If you want to choose the number of cores used on the slave device, set the environment variable <c>STARPU_NMPIMSTHREADS=<number></c>
|
|
|
+where <c><number></c> is the number of cores wanted. The default is to use all of the slave's cores. To select
|
|
|
+the number of slave nodes, change the <c>-n</c> parameter when executing the application with mpirun
|
|
|
+or mpiexec.
|
|
|
+
|
|
|
+The master node chosen by default is the one with MPI rank 0. To modify this, use the environment variable
|
|
|
+<c>STARPU_MPI_MASTER_NODE=<number></c>, where <c><number></c> is the MPI rank wanted.
|
|
|
+
|
|
|
+*/
|