Samuel Thibault
|
f1e95134c6
heft should now be fixed for multiple implementations
|
14 years ago |
Samuel Thibault
|
8360d50c2b
Fix expected start of tasks: after task execution, the CPU time is already elapsed.
|
14 years ago |
Cyril Roelandt
|
0a185ee4d8
Talking about the scheduler in the "multiple implementations" paragraph.
|
14 years ago |
Cyril Roelandt
|
2a921d3c5e
Bugfix: do not assign a particular codelet implementation to a job before selecting an appropriate worker.
|
14 years ago |
Samuel Thibault
|
af9be1babd
fix comment
|
14 years ago |
Nathalie Furmento
|
288af06552
remove un-needed copyright
|
14 years ago |
Nathalie Furmento
|
2177148f7e
add missing licence information
|
14 years ago |
Cyril Roelandt
|
16f070b7cd
Updated the documentation with data related to multiple implementations.
|
14 years ago |
Samuel Thibault
|
b58ea76005
Spurious variable
|
14 years ago |
Samuel Thibault
|
eb5822de45
Assert rather than do bogus things for non-cuda/opencl case.
|
14 years ago |
Samuel Thibault
|
6af576989c
Add missing prototypes
|
14 years ago |
Samuel Thibault
|
8e7b92c6e8
silence warning
|
14 years ago |
Samuel Thibault
|
6b70d00b21
silent warning
|
14 years ago |
Samuel Thibault
|
846df70675
Fix build without fxt
|
14 years ago |
Samuel Thibault
|
e9e66b6c78
Mention DriverCopyAsync in the trace.
|
14 years ago |
Samuel Thibault
|
3926b7c7be
Fix hack which was wrong with peers: simply record both source and destination, it actually makes the statistics code simpler
|
14 years ago |
Samuel Thibault
|
209ebef2ac
Trace async transfers differently to watch for cuda spurious blocking
|
14 years ago |
Cyril Roelandt
|
2787e1fa46
Added an SSE codelet to the vector scaling example.
|
14 years ago |
Ludovic Courtès
|
886913ffe3
gcc: Support interleaved declarations & definitions of task implementations.
|
14 years ago |
Ludovic Courtès
|
68b2f37506
gcc: Simplify `task' attribute handling.
|
14 years ago |
Samuel Thibault
|
ed747c4790
typo
|
14 years ago |
Samuel Thibault
|
77007ead6a
revert r4190, it's completely bogus, we need to add -L for the AC_HAVE_LIBRARY. AC_HAVE_LIBRARY actually does not add -lcudart to LDFLAGS because it has a non-empty action. Let's thus add CUDA_LDFLAGS before the cublas test
|
14 years ago |
Samuel Thibault
|
89f3df6586
keep -lcudart when checking for libcublas. This is needed for linking when that library path is not in LD_LIBRARY_PATH
|
14 years ago |
Olivier Aumage
|
6c4d62d0b0
- code disabled for years
|
14 years ago |
Olivier Aumage
|
4a747573b4
- add missing ifdefs
|
14 years ago |
Samuel Thibault
|
3d15b9bbb1
drop debugging
|
14 years ago |
Nathalie Furmento
|
fd06148167
configure.ac: remove trailing whitespaces
|
14 years ago |
Nathalie Furmento
|
6db2c372ca
merge branch gpumem_prefetch
|
14 years ago |
Samuel Thibault
|
61eda0dcef
Fix asynchronicity of the wt mechanism by keeping a read reference on the data
|
14 years ago |
Samuel Thibault
|
4a0ed0dc25
Fix asynchronous prefetch: we also need to notify data dependencies in that case. Allocate the wrapper structure dynamically to permit asynchronous termination. Permit prefetch in callbacks and codelets
|
14 years ago |