Samuel Thibault
|
2d2acb47ab
Fix request list push: we need to push and pop in the same order to keep consistent with task order
|
13 years ago |
Samuel Thibault
|
7205b70968
Fix computation/communication overlap: transfers always have to use the local_transfer stream, not the local stream
|
13 years ago |
Samuel Thibault
|
408e6de3ce
take transfer time into account in heft when the task is scheduled to be executed almost immediately: the transfer will never be overlappable in that case.
|
13 years ago |
Cyril Roelandt
|
7332d96c1d
Fix typo.
|
13 years ago |
Samuel Thibault
|
f1e95134c6
heft should now be fixed for multiple implementations
|
13 years ago |
Samuel Thibault
|
8360d50c2b
Fix expected start of tasks: after task execution, the CPU time is already elapsed.
|
13 years ago |
Cyril Roelandt
|
0a185ee4d8
Talking about the scheduler in the "multiple implementations" paragraph.
|
13 years ago |
Cyril Roelandt
|
2a921d3c5e
Bugfix : do not assign a particular codelet implementation to a job before selecting an appropriate worker.
|
13 years ago |
Samuel Thibault
|
af9be1babd
fix comment
|
13 years ago |
Nathalie Furmento
|
288af06552
remove un-needed copyright
|
13 years ago |
Nathalie Furmento
|
2177148f7e
add missing licence information
|
13 years ago |
Cyril Roelandt
|
16f070b7cd
Updated the documentation with data related to multiple implementations.
|
13 years ago |
Samuel Thibault
|
b58ea76005
Spurious variable
|
13 years ago |
Samuel Thibault
|
eb5822de45
Assert rather than do bogus things for non-cuda/opencl case.
|
13 years ago |
Samuel Thibault
|
6af576989c
Add missing prototypes
|
13 years ago |
Samuel Thibault
|
8e7b92c6e8
silence warning
|
13 years ago |
Samuel Thibault
|
6b70d00b21
silence warning
|
13 years ago |
Samuel Thibault
|
846df70675
Fix build without fxt
|
13 years ago |
Samuel Thibault
|
e9e66b6c78
Mention DriverCopyAsync in the trace.
|
13 years ago |
Samuel Thibault
|
3926b7c7be
Fix hack which was wrong with peers: simply record both source and destination, it actually makes the statistics code simpler
|
13 years ago |
Samuel Thibault
|
209ebef2ac
Trace async transfers differently to watch for cuda spurious blocking
|
13 years ago |
Cyril Roelandt
|
2787e1fa46
Added an SSE codelet to the vector scaling example.
|
13 years ago |
Ludovic Courtès
|
886913ffe3
gcc: Support interleaved declarations & definitions of task implementations.
|
13 years ago |
Ludovic Courtès
|
68b2f37506
gcc: Simplify `task' attribute handling.
|
13 years ago |
Samuel Thibault
|
ed747c4790
typo
|
13 years ago |
Samuel Thibault
|
77007ead6a
revert r4190, it's completely bogus, we need to add -L for the AC_HAVE_LIBRARY. AC_HAVE_LIBRARY actually does not add -lcudart to LDFLAGS because it has a non-empty action. Let's thus add CUDA_LDFLAGS before the cublas test
|
13 years ago |
Samuel Thibault
|
89f3df6586
keep -lcudart when checking for libcublas. This is needed for linking when that library path is not in LD_LIBRARY_PATH
|
13 years ago |
Olivier Aumage
|
6c4d62d0b0
- code disabled for years
|
13 years ago |
Olivier Aumage
|
4a747573b4
- add missing ifdefs
|
13 years ago |
Samuel Thibault
|
3d15b9bbb1
drop debugging
|
13 years ago |