
install and document benchmarks

Samuel Thibault 12 years ago
parent
commit
e3bce3ba01
3 changed files with 65 additions and 0 deletions
  1. doc/chapters/benchmarks.texi (+47, -0)
  2. doc/starpu.texi (+9, -0)
  3. tests/Makefile.am (+9, -0)

+ 47 - 0
doc/chapters/benchmarks.texi

@@ -0,0 +1,47 @@
+@c -*-texinfo-*-
+
+@c This file is part of the StarPU Handbook.
+@c Copyright (C) 2012  University of Bordeaux
+@c See the file starpu.texi for copying conditions.
+
+@menu
+* Task size overhead::           Overhead of tasks depending on their size
+* Data transfer latency::        Latency of data transfers 
+* Gemm::                         Matrix-matrix multiplication
+* Cholesky::                     Cholesky factorization
+* LU::                           LU factorization
+@end menu
+
+Some interesting benchmarks are installed along with the examples in
+@file{/usr/lib/starpu/examples}. Make sure to try various schedulers, for
+instance by setting @code{STARPU_SCHED=heft}.
+
+@node Task size overhead
+@section Task size overhead
+
+This benchmark gives a glimpse into how large a task should be for StarPU
+overhead to be low enough.  Run @code{tasks_size_overhead.sh}; it generates a
+plot of the speedup of tasks of various sizes, depending on the number of CPUs
+being used.
+
+@node Data transfer latency
+@section Data transfer latency
+
+@code{local_pingpong} performs a ping-pong between the first two CUDA nodes, and
+prints the measured latency.
+
+@node Gemm
+@section Matrix-matrix multiplication
+
+@code{sgemm} and @code{dgemm} perform a blocked matrix-matrix
+multiplication using BLAS and cuBLAS. They print the achieved GFlop/s.
+
+@node Cholesky
+@section Cholesky factorization
+
+The @code{cholesky*} programs each perform a single-precision Cholesky factorization; they differ in the dependency primitives they use.
+
+@node LU
+@section LU factorization
+
+The @code{lu*} programs each perform an LU factorization; they differ in the dependency primitives they use.
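
Reviewer note: the new chapter advises trying several schedulers but does not show how. Below is a minimal sketch of one way to do that, not part of this commit. The helper name `run_with_scheds` and the `EXAMPLES_DIR` variable are hypothetical; the default path is the one quoted in the documentation and may differ from your `$(libdir)`. `heft` comes from the text above, while `eager` is assumed to be available in your StarPU build.

```shell
#!/bin/sh
# Hypothetical helper (not part of StarPU): run one installed benchmark
# under each scheduler given on the command line.
# EXAMPLES_DIR defaults to the path quoted in the documentation; override
# it if your installation uses a different $(libdir).
EXAMPLES_DIR="${EXAMPLES_DIR:-/usr/lib/starpu/examples}"

run_with_scheds() {
    bench="$1"; shift
    for sched in "$@"; do
        if [ -x "$EXAMPLES_DIR/$bench" ]; then
            echo "== $bench with STARPU_SCHED=$sched =="
            # STARPU_SCHED is set only for this one invocation.
            STARPU_SCHED="$sched" "$EXAMPLES_DIR/$bench"
        else
            # Do not fail hard: the benchmark may simply not be installed.
            echo "skip: $EXAMPLES_DIR/$bench not found" >&2
        fi
    done
}

# 'heft' is named in the documentation; 'eager' is an assumption.
run_with_scheds local_pingpong eager heft
```

Setting `STARPU_SCHED` as a per-command prefix keeps the choice local to each run, so the surrounding shell environment is left unchanged between schedulers.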

+ 9 - 0
doc/starpu.texi

@@ -75,6 +75,7 @@ was last updated on @value{UPDATED}.
 * StarPU FFT support::          How to perform FFT computations with StarPU
 * C Extensions::                Easier StarPU programming with GCC
 * SOCL OpenCL Extensions::      How to use OpenCL on top of StarPU
+* Benchmarks::                  Benchmarks worth running
 * StarPU Basic API::            The Basic API to use StarPU
 * StarPU Advanced API::         Advanced use of StarPU
 * Configuring StarPU::          How to configure StarPU
@@ -127,6 +128,14 @@ was last updated on @value{UPDATED}.
 @include chapters/advanced-examples.texi
 
 @c ---------------------------------------------------------------------
+@c Benchmarks
+@c ---------------------------------------------------------------------
+
+@node Benchmarks
+@chapter Benchmarks
+@include chapters/benchmarks.texi
+
+@c ---------------------------------------------------------------------
 @c Performance options
 @c ---------------------------------------------------------------------
 

+ 9 - 0
tests/Makefile.am

@@ -51,6 +51,8 @@ CLEANFILES = 					\
 BUILT_SOURCES =
 SUBDIRS =
 
+examplebindir = $(libdir)/starpu/examples
+
 if STARPU_USE_OPENCL
 nobase_STARPU_OPENCL_DATA_DATA =
 endif
@@ -234,6 +236,13 @@ noinst_PROGRAMS =				\
 	sched_policies/simple_deps              \
 	sched_policies/simple_cpu_gpu_sched
 
+examplebin_PROGRAMS = \
+	microbenchs/tasks_size_overhead		\
+	microbenchs/local_pingpong
+examplebin_SCRIPTS = \
+	microbenchs/tasks_size_overhead.gp \
+	microbenchs/tasks_size_overhead.sh
+
 if STARPU_HAVE_WINDOWS
 check_PROGRAMS = $(noinst_PROGRAMS)
 else