# StarPU --- Runtime system for heterogeneous multicore architectures.
#
# Copyright (C) 2009, 2010, 2011 Université de Bordeaux 1
# Copyright (C) 2010, 2011 Centre National de la Recherche Scientifique
#
# StarPU is free software; you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or (at
# your option) any later version.
#
# StarPU is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See the GNU Lesser General Public License in COPYING.LGPL for more details.

StarPU 1.0 (svn revision xxxx)
==============================================
The extensions-again release

* Applications can provide several implementations of a codelet for the
  same architecture (illustrated in the sketch after this list).
* Add a gcc plugin to extend the C interface with pragmas which make it
  easy to define codelets and issue tasks.
* Add a codelet execution time statistics plot.
* Add bus speed in starpu_machine_display.
* Add a StarPU-Top feedback and steering interface.
* Documentation improvements.
* Add STARPU_DATA_ACQUIRE_CB, which makes it possible to write the code to
  be executed inline.
* Permit specifying MPI tags for more efficient starpu_mpi_insert_task
  communications.
* Add SOCL, an OpenCL interface on top of StarPU.
* Add gdb functions.
* Add complex support to the LU example.
* Add an OpenMP fork-join example.
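
A minimal sketch of the multi-implementation codelet support above, assuming
the 1.0-era C API (cpu_funcs array, handles[] field); the kernel names
scal_naive/scal_alt, the vector size N and the scaling factor are purely
illustrative:

    /* Sketch: one codelet with two CPU implementations of the same kernel. */
    #include <stdint.h>
    #include <starpu.h>

    #define N 1024

    static void scal_naive(void *buffers[], void *cl_arg)
    {
        float *x = (float *)STARPU_VECTOR_GET_PTR(buffers[0]);
        unsigned n = STARPU_VECTOR_GET_NX(buffers[0]);
        float factor = *(float *)cl_arg;
        for (unsigned i = 0; i < n; i++)
            x[i] *= factor;
    }

    /* A second implementation for the same (CPU) architecture; StarPU may
     * pick either one, e.g. according to calibrated performance models. */
    static void scal_alt(void *buffers[], void *cl_arg)
    {
        scal_naive(buffers, cl_arg);
    }

    static struct starpu_codelet scal_cl =
    {
        .cpu_funcs = { scal_naive, scal_alt, NULL },
        .nbuffers = 1,
        .modes = { STARPU_RW },
    };

    int main(void)
    {
        float vec[N], factor = 3.14f;
        starpu_data_handle_t handle;

        for (unsigned i = 0; i < N; i++)
            vec[i] = 1.0f;

        starpu_init(NULL);
        starpu_vector_data_register(&handle, 0, (uintptr_t)vec, N, sizeof(vec[0]));

        struct starpu_task *task = starpu_task_create();
        task->cl = &scal_cl;
        task->handles[0] = handle;
        task->cl_arg = &factor;
        task->cl_arg_size = sizeof(factor);
        starpu_task_submit(task);

        starpu_task_wait_for_all();
        starpu_data_unregister(handle);
        starpu_shutdown();
        return 0;
    }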

StarPU 0.9 (svn revision 3721)
==============================================
The extensions release

* Provide the STARPU_REDUX data access mode (illustrated in the sketch after
  this list).
* Externalize the scheduler API.
* Add theoretical bound computation.
* Add the void interface.
* Add power consumption optimization.
* Add parallel task support.
* Add starpu_mpi_insert_task.
* Add a profiling information interface.
* Add the STARPU_LIMIT_GPU_MEM environment variable.
* OpenCL fixes.
* MPI fixes.
* Improve the optimization documentation.
* Upgrade to the hwloc 1.1 interface.
* Add a Fortran example.
* Add a Mandelbrot OpenCL example.
* Add a cg example.
* Add a stencil MPI example.
* Initial support for CUDA 4.
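
A minimal sketch of the STARPU_REDUX access mode mentioned above, written
against the later 1.0-style API for readability (cpu_funcs arrays,
starpu_insert_task); the accumulator variable, the codelet names and the
task count are illustrative assumptions, not code from this release:

    /* Sketch: several tasks accumulate into one variable through STARPU_REDUX. */
    #include <stdint.h>
    #include <starpu.h>

    /* Initialization codelet: set a per-worker copy to the neutral element. */
    static void init_zero(void *buffers[], void *cl_arg)
    {
        (void)cl_arg;
        double *v = (double *)STARPU_VARIABLE_GET_PTR(buffers[0]);
        *v = 0.0;
    }

    /* Reduction codelet: fold one contribution into the destination. */
    static void redux_add(void *buffers[], void *cl_arg)
    {
        (void)cl_arg;
        double *dst = (double *)STARPU_VARIABLE_GET_PTR(buffers[0]);
        double *src = (double *)STARPU_VARIABLE_GET_PTR(buffers[1]);
        *dst += *src;
    }

    /* Work codelet: each task adds 1.0 to its reduction view of the variable. */
    static void contrib(void *buffers[], void *cl_arg)
    {
        (void)cl_arg;
        double *acc = (double *)STARPU_VARIABLE_GET_PTR(buffers[0]);
        *acc += 1.0;
    }

    static struct starpu_codelet init_cl =
    { .cpu_funcs = { init_zero, NULL }, .nbuffers = 1, .modes = { STARPU_W } };
    static struct starpu_codelet redux_cl =
    { .cpu_funcs = { redux_add, NULL }, .nbuffers = 2, .modes = { STARPU_RW, STARPU_R } };
    static struct starpu_codelet contrib_cl =
    { .cpu_funcs = { contrib, NULL }, .nbuffers = 1, .modes = { STARPU_REDUX } };

    int main(void)
    {
        double sum = 0.0;
        starpu_data_handle_t sum_handle;

        starpu_init(NULL);
        starpu_variable_data_register(&sum_handle, 0, (uintptr_t)&sum, sizeof(sum));
        starpu_data_set_reduction_methods(sum_handle, &redux_cl, &init_cl);

        for (int i = 0; i < 42; i++)
            starpu_insert_task(&contrib_cl, STARPU_REDUX, sum_handle, 0);

        starpu_task_wait_for_all();
        starpu_data_unregister(sum_handle); /* sum holds 42.0 once unregistered */
        starpu_shutdown();
        return 0;
    }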

StarPU 0.4 (svn revision 2535)
==============================================
The API strengthening release

* Major API improvements
  - Provide the STARPU_SCRATCH data access mode
  - Rework the data filter interface
  - Rework the data interface structure
  - A script that automatically renames old functions to accommodate the new
    API is available from https://scm.gforge.inria.fr/svn/starpu/scripts/renaming
    (login: anonsvn, password: anonsvn)
* Implement dependencies between tasks directly, e.g. without tags
  (illustrated in the sketch after this list)
* Implicit data-driven task dependencies simplify the design of
  data-parallel algorithms
* Add dynamic profiling capabilities
  - Provide per-task feedback
  - Provide per-worker feedback
  - Provide feedback about memory transfers
* Provide a library to help accelerate MPI applications
* Improve data transfer overhead prediction
  - Transparently benchmark buses to generate performance models
  - Bind accelerator-controlling threads with respect to NUMA locality
* Improve StarPU's portability
  - Add OpenCL support
  - Add support for Windows
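
A minimal sketch of direct (tag-less) task dependencies using
starpu_task_declare_deps_array; the empty codelet and the two-task chain are
illustrative assumptions, and the field names follow the current API rather
than the 0.4 one:

    /* Sketch: task b waits for task a without going through tags. */
    #include <starpu.h>

    static void noop(void *buffers[], void *cl_arg)
    {
        (void)buffers; (void)cl_arg;
    }

    static struct starpu_codelet noop_cl =
    { .cpu_funcs = { noop, NULL }, .nbuffers = 0 };

    int main(void)
    {
        starpu_init(NULL);

        struct starpu_task *a = starpu_task_create();
        struct starpu_task *b = starpu_task_create();
        a->cl = &noop_cl;
        b->cl = &noop_cl;

        /* Declare the dependency before submitting b. */
        starpu_task_declare_deps_array(b, 1, &a);

        starpu_task_submit(a);
        starpu_task_submit(b);

        starpu_task_wait_for_all();
        starpu_shutdown();
        return 0;
    }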

StarPU 0.2.901 aka 0.3-rc1 (svn revision 1236)
==============================================
The asynchronous heterogeneous multi-accelerator release

* Many API changes and code cleanups
  - Implement starpu_worker_get_id
  - Implement starpu_worker_get_name
  - Implement starpu_worker_get_type
  - Implement starpu_worker_get_count
  - Implement starpu_display_codelet_stats
  - Implement starpu_data_prefetch_on_node
  - Expose the starpu_data_set_wb_mask function
* Support NVIDIA (heterogeneous) multi-GPU
* Add the data request mechanism
  - All data transfers now go through data requests
  - Implement asynchronous data transfers
  - Implement a prefetch mechanism
  - Chain data requests to support GPU->RAM->GPU transfers
* Make it possible to bypass the scheduler and assign a task to a specific
  worker (illustrated in the sketch after this list)
* Support restartable tasks to re-instantiate dependency task graphs
* Improve performance prediction
  - Model data transfer overhead
  - One model is created for each accelerator
* Support for CUDA's driver API is deprecated
* The STARPU_WORKERS_CUDAID and STARPU_WORKERS_CPUID environment variables
  make it possible to specify where to bind the workers
* Use the hwloc library to detect the actual number of cores
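
A minimal sketch of bypassing the scheduler and of the worker-query helpers
listed above; the hello codelet, the choice of worker 0 and the printed
messages are illustrative assumptions, and the syntax follows the current API:

    /* Sketch: pin one task to worker 0 and query worker information. */
    #include <stdio.h>
    #include <starpu.h>

    static void hello(void *buffers[], void *cl_arg)
    {
        (void)buffers; (void)cl_arg;
        char name[64];
        int id = starpu_worker_get_id();   /* id of the worker running this task */
        starpu_worker_get_name(id, name, sizeof(name));
        printf("running on worker %d (%s)\n", id, name);
    }

    static struct starpu_codelet hello_cl =
    { .cpu_funcs = { hello, NULL }, .nbuffers = 0 };

    int main(void)
    {
        starpu_init(NULL);
        printf("%u workers detected\n", starpu_worker_get_count());

        struct starpu_task *task = starpu_task_create();
        task->cl = &hello_cl;
        task->execute_on_a_specific_worker = 1; /* bypass the scheduler */
        task->workerid = 0;                     /* ... and force worker 0 */
        starpu_task_submit(task);

        starpu_task_wait_for_all();
        starpu_shutdown();
        return 0;
    }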

StarPU 0.2.0 (svn revision 1013)
==============================================
The Stabilizing-the-Basics release

* Various API cleanups
* Mac OS X is now supported
* Add dynamic code loading facilities onto the Cell's SPUs
* Improve performance analysis/feedback tools
* Applications can interact with StarPU tasks
  - The application may access/modify data managed by the DSM
  - The application may wait for the termination of a (set of) task(s)
* Initial documentation is added
* More examples are supplied

StarPU 0.1.0 (svn revision 794)
==============================================
First release.

Status:
* Only Linux platforms are supported for now
* Supported architectures
  - multicore CPUs
  - NVIDIA GPUs (with CUDA 2.x)
  - experimental Cell/BE support

Changes:
* Scheduling facilities
  - run-time selection of the scheduling policy
  - basic auto-tuning facilities
* Software-based DSM
  - transparent data coherency management
  - high-level expressive interface