StarPU 0.5 (svn revision ????)
==============================================
The yet-more-stuff release

* Provide the STARPU_REDUX data access mode
* Externalize the scheduler API
* Add theoretical bound computation
* Add the void interface
* Add power consumption optimization
* Add parallel task support
* Add starpu_mpi_insert_task
* Add profiling information interface
* Add STARPU_LIMIT_GPU_MEM environment variable
* OpenCL fixes
* MPI fixes
* Improve optimization documentation
* Upgrade to hwloc 1.1 interface
* Add Fortran example
* Add mandelbrot OpenCL example
* Add cg example
* Add stencil MPI example

StarPU 0.4 (svn revision 2535)
==============================================
The API strengthening release

* Major API improvements
  - Provide the STARPU_SCRATCH data access mode
  - Rework data filter interface
  - Rework data interface structure
  - A script that automatically renames old functions to match the new API
    is available from https://scm.gforge.inria.fr/svn/starpu/scripts/renaming
    (login: anonsvn, password: anonsvn)
* Implement dependencies between tasks directly (e.g. without tags)
* Implicit data-driven task dependencies simplify the design of
  data-parallel algorithms
* Add dynamic profiling capabilities
  - Provide per-task feedback
  - Provide per-worker feedback
  - Provide feedback about memory transfers
* Provide a library to help accelerate MPI applications
* Improve data transfer overhead prediction
  - Transparently benchmark buses to generate performance models
  - Bind accelerator-controlling threads with respect to NUMA locality
* Improve StarPU's portability
  - Add OpenCL support
  - Add support for Windows

StarPU 0.2.901 aka 0.3-rc1 (svn revision 1236)
==============================================
The asynchronous heterogeneous multi-accelerator release

* Many API changes and code cleanups
  - Implement starpu_worker_get_id
  - Implement starpu_worker_get_name
  - Implement starpu_worker_get_type
  - Implement starpu_worker_get_count
  - Implement starpu_display_codelet_stats
  - Implement starpu_data_prefetch_on_node
  - Expose the starpu_data_set_wb_mask function
* Support NVIDIA (heterogeneous) multi-GPU
* Add the data request mechanism
  - All data transfers now use data requests
  - Implement asynchronous data transfers
  - Implement a prefetch mechanism
  - Chain data requests to support GPU->RAM->GPU transfers
* Make it possible to bypass the scheduler and assign a task to a
  specific worker
* Support restartable tasks to reinstantiate dependency task graphs
* Improve performance prediction
  - Model data transfer overhead
  - One model is created for each accelerator
* Support for CUDA's driver API is deprecated
* The STARPU_WORKERS_CUDAID and STARPU_WORKERS_CPUID environment variables
  make it possible to specify where to bind the workers
* Use the hwloc library to detect the actual number of cores

StarPU 0.2.0 (svn revision 1013)
==============================================
The Stabilizing-the-Basics release

* Various API cleanups
* Mac OS X is now supported
* Add dynamic code loading facilities onto Cell's SPUs
* Improve performance analysis/feedback tools
* The application can interact with StarPU tasks
  - The application may access/modify data managed by the DSM
  - The application may wait for the termination of a (set of) task(s)
* Initial documentation is added
* More examples are supplied

StarPU 0.1.0 (svn revision 794)
==============================================
First release.
Status:
* Only Linux platforms are supported so far
* Supported architectures
  - multicore CPUs
  - NVIDIA GPUs (with CUDA 2.x)
  - experimental Cell/BE support

Changes:
* Scheduling facilities
  - run-time selection of the scheduling policy
  - basic auto-tuning facilities
* Software-based DSM (see the usage sketch below)
  - transparent data coherency management
  - high-level expressive interface
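
For reference, here is a minimal sketch of the codelet/task/DSM model referred
to throughout these notes: a buffer is registered with the software DSM, a task
is submitted on it, and the application waits for termination before taking the
data back. It is written against a later StarPU C API (fields such as cpu_funcs
and handles, and the STARPU_MAIN_RAM constant, postdate the releases above), and
the vector-scaling codelet is purely illustrative.

    #include <stdint.h>
    #include <starpu.h>

    /* Codelet body: scale a vector in place on a CPU worker. */
    static void scal_cpu_func(void *buffers[], void *cl_arg)
    {
        struct starpu_vector_interface *vec = buffers[0];
        float *val = (float *) STARPU_VECTOR_GET_PTR(vec);
        unsigned n = STARPU_VECTOR_GET_NX(vec);
        float factor = *(float *) cl_arg;
        for (unsigned i = 0; i < n; i++)
            val[i] *= factor;
    }

    static struct starpu_codelet scal_cl =
    {
        .cpu_funcs = { scal_cpu_func },
        .nbuffers  = 1,
        .modes     = { STARPU_RW },
    };

    int main(void)
    {
        float vector[1024];
        float factor = 3.14f;
        starpu_data_handle_t handle;

        for (unsigned i = 0; i < 1024; i++)
            vector[i] = 1.0f;

        if (starpu_init(NULL) != 0)
            return 1;

        /* Hand the buffer over to the software DSM. */
        starpu_vector_data_register(&handle, STARPU_MAIN_RAM,
                                    (uintptr_t) vector, 1024, sizeof(vector[0]));

        struct starpu_task *task = starpu_task_create();
        task->cl = &scal_cl;
        task->handles[0] = handle;
        task->cl_arg = &factor;
        task->cl_arg_size = sizeof(factor);
        starpu_task_submit(task);

        /* The application may wait for the termination of submitted tasks. */
        starpu_task_wait_for_all();

        /* Take the (coherent) data back before accessing it directly. */
        starpu_data_unregister(handle);
        starpu_shutdown();
        return 0;
    }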