Samuel Thibault
|
b613d58104
Assume that we have at least cuda 3.1, so that we can make our BLAS examples always use streams
|
13 years ago |
Samuel Thibault
|
51a7e8c979
Use cudaMemcpyAsync instead of cudaMemcpy
|
13 years ago |
Samuel Thibault
|
78ac4a07ca
follow-up r7018: add missing declarations
|
13 years ago |
Samuel Thibault
|
d2cd1868e2
provide good examples by always using cudaMemsetAsync, not cudaMemset
|
13 years ago |
Samuel Thibault
|
57e59bf2d9
provide good examples by always using cudaMemcpyAsync, not cudaMemcpy
|
13 years ago |
Samuel Thibault
|
f18180c477
synchronize only the transfer stream
|
13 years ago |
Samuel Thibault
|
3e58d92492
Add 2d shadow filters
|
13 years ago |
Samuel Thibault
|
cb492ad84b
Keep documentation coherent with C convention
|
13 years ago |
Samuel Thibault
|
972caa3d96
Fix build without OpenGL headers
|
13 years ago |
Samuel Thibault
|
70864cd9ea
there are some optimizations with eager, just not wise scheduling
|
13 years ago |
Samuel Thibault
|
1d7dcdc5b6
fix compilation warning
|
13 years ago |
Samuel Thibault
|
c3610e1434
drop unused factor
|
13 years ago |
Samuel Thibault
|
9803389d55
fix number
|
13 years ago |
Samuel Thibault
|
fd06e119aa
more comments
|
13 years ago |
Samuel Thibault
|
cf5341ae88
more details on the shadow at work
|
13 years ago |
Samuel Thibault
|
6bc0d82b7c
fix non-cuda/opencl build
|
13 years ago |
Samuel Thibault
|
a88b26f1a7
update changelog
|
13 years ago |
Samuel Thibault
|
ee1e7926a3
Put single-combined-worker in nice order for starpu_machine_display
|
13 years ago |
Samuel Thibault
|
3b540f359f
document that starpu_machine_display shows combined workers
|
13 years ago |
Samuel Thibault
|
65688ef4d8
print topology before bandwidth
|
13 years ago |
Samuel Thibault
|
1aeeab7392
mergeinfo for backport
|
13 years ago |
Samuel Thibault
|
844951370b
Show topology in starpu_machine_display
|
13 years ago |
Samuel Thibault
|
bb9da9185b
Fix output alignment
|
13 years ago |
Samuel Thibault
|
8f8194dfa6
forwardport r6996 from 1.0: fix bus calibration for more than 32 cpus
|
13 years ago |
Samuel Thibault
|
ab9b3fcfbd
forwardport r6997 from 1.0: fix bus calibration for more than 32 cpus
|
13 years ago |
Samuel Thibault
|
f33693b5d4
Add actual bus measurements in starpu_machine_display output
|
13 years ago |
Samuel Thibault
|
f30a1170aa
Replace starpu_force_bus_sampling with bus_calibrate configuration option, as calibration should be done carefully during starpu initialization, and not after initialization
|
13 years ago |
Samuel Thibault
|
09dab6fd3e
Drop spurious nbsp
|
13 years ago |
Samuel Thibault
|
c39f87d9cd
clearly separate CUDA and OpenCL
|
13 years ago |
Samuel Thibault
|
473617f4c0
Do not print CPU%d, it is misleading; explain instead that it is CPUs in preference order
|
13 years ago |