/* StarPU --- Runtime system for heterogeneous multicore architectures.
 *
 * Copyright (C) 2020 Université de Bordeaux, CNRS (LaBRI UMR 5800), Inria
 *
 * StarPU is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or (at
 * your option) any later version.
 *
 * StarPU is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * See the GNU Lesser General Public License in COPYING.LGPL for more details.
 */
/*! \page PythonInterface Python Interface
A Python interface is also provided to allow Python users to use StarPU. This interface caters to users accustomed to the Python language who want a more concise and more easily operated StarPU interface.
The need to exploit the computing power of the available CPUs and GPUs, while being relieved from the need to specially adapt programs to the target machine and processing units, is present in most programs regardless of the programming language. Providing a Python interface, in addition to the existing C interface, extends the use of StarPU to Python users and supports them in this perpetual quest for optimization.
In this Python interface, some of the principal functions of StarPU are wrapped to simplify their use. Not all functions of the original C interface are available, but some new functions especially adapted to Python have been added.
You can simply import the StarPU module and use the provided functions in your own Python code.
\section ImplementingStarPUInPython Implementing StarPU in Python
The StarPU module should be imported in any Python code using the StarPU Python interface.
\code{.py}
>>> import starpu
\endcode
\subsection SubmittingTasks Submitting Tasks
One of the most important functions in StarPU is task submission. Unlike the original C interface, the Python interface simplifies the use of this function: it is more convenient for Python users to call the function directly without requiring further preparation. This simplification does not affect the final implementation.
The function task_submit(func, *args, **kwargs) is used to submit tasks to StarPU in the Python interface. When you want to let StarPU optimize your program, you should submit all tasks and let StarPU do the smart scheduling to manage them. Submitted tasks are not executed immediately, and you can only get the return value once the task has been executed.
Here is an example showing how to use this function to submit a task in the most basic way.
Suppose that there is a function:
\code{.py}
>>> def add(a, b):
...     return a+b
\endcode
To submit this function as a task to StarPU, call task_submit with the function and its arguments as parameters:
\code{.py}
>>> starpu.task_submit(add, 1, 2)
3
\endcode
You can also use the decorator starpu.delayed to wrap your own function. The effect is the same as in the previous example, but you can call your function directly, and it will be submitted to StarPU as a task automatically.
\code{.py}
>>> @starpu.delayed
... def add_deco(a, b):
...     return a+b
...
>>> add_deco(1, 2)
3
\endcode
\subsection ReturningFutureObject Returning Future Object
In order to support asynchronous frameworks, the task_submit function returns a Future object. This is an extension specific to the Python interface. A Future represents the eventual result of an asynchronous operation. It is an awaitable object: coroutines can await Future objects until they either have a result or an exception set, or until they are cancelled.
The asyncio module should be imported in this case.
\code{.py}
>>> import asyncio
\endcode
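As background, independent of StarPU, awaiting a plain asyncio Future suspends the current coroutine until some other piece of code sets the Future's result. The following minimal sketch uses only the standard asyncio module; the names in it are illustrative and not part of the StarPU API:

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    # a plain Future: initially pending, and awaitable
    fut = loop.create_future()
    # schedule a callback that sets the result a bit later,
    # standing in for a task finishing in the background
    loop.call_later(0.01, fut.set_result, 3)
    # awaiting suspends main() until set_result is called
    res = await fut
    print(res)

asyncio.run(main())
```

The Future returned by task_submit can be awaited in the same way, its result becoming available once the task has been executed.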
Since a task submitted to StarPU is not executed immediately, the Future object allows you to perform other operations during task execution instead of waiting for the eventual result. When the return value is ready, awaiting the Future object yields it.
In the following example, after calling the task_submit function to create a Future object "fut", we await it until receiving a signal that the result is ready. Then we get the eventual result.
\code{.py}
>>> def add(a, b):
...     print("The result is ready!")
...     return a+b
...
>>> fut = starpu.task_submit(add, 1, 2)
The result is ready!
>>> res = await fut
>>> res
3
\endcode
Special attention is needed in the above example: we use the argument "-m asyncio" (available since Python 3.8) when launching the interpreter, so that we can use "await" directly instead of "asyncio.run()". This argument only applies to programs executed on the command line. Therefore, if you want to write your program in a Python script file, or if you only have an older version of Python, you need to await the Future in a coroutine function and use "asyncio.run()" to execute it, like this:
\code{.py}
import starpu
import asyncio

def add(a, b):
    return a+b

async def main():
    fut = starpu.task_submit(add, 1, 2)
    res = await fut
    print("The result of function is", res)

asyncio.run(main())
\endcode
Execution:
\verbatim
The result of function is 3
\endverbatim
The Future object can also be used for further computation even before you get the task result. The eventual result will be awaited until the Future has a result.
In the following example, after submitting the first task, a Future object "fut1" is created, and it is used in the second task as one of the arguments. While the first task is being executed, the second task is submitted even though we do not have the first return value yet. Then we receive the signal that the second result is ready right after the signal that the first result is ready. We can then await fut2 to get the eventual result.
\code{.py}
>>> import asyncio
>>> import starpu
>>> import time
>>> def add(a, b):
...     time.sleep(10)
...     print("The first result is ready!")
...     return a+b
...
>>> def sub(x, a):
...     print("The second result is ready!")
...     return x-a
...
>>> fut1 = starpu.task_submit(add, 1, 2)
>>> fut2 = starpu.task_submit(sub, fut1, 1)
>>> The first result is ready!
The second result is ready!
>>> res = await fut2
>>> res
2
\endcode
\section ImitatingJoblibLibrary Imitating Joblib Library
The StarPU Python interface also provides parallel computing for loops using multiprocessing. The main idea is to imitate the <a href="https://joblib.readthedocs.io/en/latest/index.html">Joblib Library</a>: write the code to be executed as a generator expression, and submit it to StarPU to be run in parallel.
\code{.py}
>>> from math import log10
>>> [log10(10 ** i) for i in range(10)]
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
\endcode
In order to spread it over several CPUs, you can use the starpu.joblib module and call its parallel function:
\code{.py}
>>> from math import log10
>>> starpu.joblib.parallel(mode="normal", n_jobs=2)(starpu.joblib.delayed(log10)(10**i) for i in range(10))
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
\endcode
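To give an intuition of what such a call does, the generator of delayed calls can be thought of as a stream of captured function calls evaluated by a pool of n_jobs workers. The following is only an illustrative sketch using the standard library, not the actual StarPU implementation; the delayed and parallel helpers defined here are stand-ins for the starpu.joblib ones:

```python
from math import log10
from concurrent.futures import ThreadPoolExecutor

def delayed(func):
    # mimic starpu.joblib.delayed: capture a call without executing it
    def wrapper(*args, **kwargs):
        return lambda: func(*args, **kwargs)
    return wrapper

def parallel(n_jobs):
    # evaluate all captured calls using a pool of n_jobs workers,
    # preserving the order of the input generator in the result list
    def run(tasks):
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            return list(pool.map(lambda t: t(), tasks))
    return run

print(parallel(n_jobs=2)(delayed(log10)(10**i) for i in range(10)))
# [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```

The real starpu.joblib.parallel additionally lets the StarPU runtime schedule the work over the available processing units.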
You can also generate a list of functions instead of a generator expression, and submit it to StarPU to be run in parallel.
\code{.py}
# generate a list to store functions
g_func = []

# function with no input and no output, printing hello world
def hello():
    print("Example 1: Hello, world!")
g_func.append(starpu.joblib.delayed(hello)())

# function with 2 int inputs and 1 int output
def multi(a, b):
    res_multi = a*b
    print("Example 2: The result of", a, "*", b, "is", res_multi)
    return res_multi
g_func.append(starpu.joblib.delayed(multi)(2, 3))

# function with 4 float inputs and 1 float output
def add(a, b, c, d):
    res_add = a+b+c+d
    print("Example 3: The result of", a, "+", b, "+", c, "+", d, "is", res_add)
    return res_add
g_func.append(starpu.joblib.delayed(add)(1.2, 2.5, 3.6, 4.9))

# function with 2 int inputs, 1 float input, 1 float output and 1 int output
def sub(a, b, c):
    res_sub1 = a-b-c
    res_sub2 = a-b
    print("Example 4: The result of", a, "-", b, "-", c, "is", res_sub1, "and the result of", a, "-", b, "is", res_sub2)
    return res_sub1, res_sub2
g_func.append(starpu.joblib.delayed(sub)(6, 2, 5.9))

# the input is an iterable list of functions
starpu.joblib.parallel(mode="normal", n_jobs=2)(g_func)
\endcode
Execution:
\verbatim
Example 1: Hello, world!
Example 2: The result of 2 * 3 is 6
Example 3: The result of 1.2 + 2.5 + 3.6 + 4.9 is 12.200000000000001
Example 4: The result of 6 - 2 - 5.9 is -1.9000000000000004 and the result of 6 - 2 is 4
\endverbatim
\subsection ParallelParameters Parallel Parameters
\subsubsection mode mode
(string, default: "normal")
You need to choose the mode between "normal" and "future". As in the previous example, with the "normal" mode, you can call starpu.joblib.parallel directly without using the asyncio module, and you will get the result once the task has been executed. With the "future" mode, when you call starpu.joblib.parallel, you will get a Future object as return value. If you also set the parameter "end_msg", you will receive a signal with this message when the result is ready, and you can then await the Future to get the eventual result. The asyncio module should be imported in this case.
\code{.py}
>>> import starpu
>>> import asyncio
>>> from math import log10
>>> fut = starpu.joblib.parallel(mode="future", n_jobs=3, end_msg="The result is ready!")(starpu.joblib.delayed(log10)(10**i) for i in range(10))
>>> The result is ready! <_GatheringFuture finished result=[[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]>
>>> await fut
[[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
\endcode
\subsubsection end_msg end_msg
(string, default: None)
As introduced in the previous section, this parameter can be set to a prompt message reminding you that the task has been executed and the result is ready, so that you can then await the Future and get the eventual result. If you do not set this parameter, the default value is None, and you will not receive any prompt message, but you can still await the Future and get the eventual result.
\subsubsection n_jobs n_jobs
(int, default: 1)
You need to set the number of CPUs used for parallel computing. For n_jobs=2, 2 CPUs are used. If 1 is given, no parallel computing is performed. For n_jobs below 0, (n_cpus+1+n_jobs) CPUs are used: for n_jobs=-2, all CPUs but one are used.
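The negative n_jobs convention described above can be sketched as a small helper (illustrative only; the actual computation happens inside starpu.joblib, and effective_n_jobs is not part of its API):

```python
import os

def effective_n_jobs(n_jobs, n_cpus=None):
    # resolve n_jobs following the convention described above:
    # positive values are taken as-is, negative values count down
    # from the total number of CPUs (n_jobs=-1 -> all CPUs)
    if n_cpus is None:
        n_cpus = os.cpu_count()
    if n_jobs < 0:
        return n_cpus + 1 + n_jobs
    return n_jobs

print(effective_n_jobs(2, n_cpus=8))    # 2
print(effective_n_jobs(-1, n_cpus=8))   # 8 (all CPUs)
print(effective_n_jobs(-2, n_cpus=8))   # 7 (all CPUs but one)
```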
\subsubsection perfmodel perfmodel
(string, default: None)
You can set a symbol for your function, and its performance model will be stored under this symbol. Ideally, the same function should use the same symbol.
In the following example, for the function log10(i+1) for i in range(N), we set the perfmodel symbol to "log", and we submit the task in turn for N=10, 20, ..., 90, 100, 200, ..., 900, and so on up to 1000000, 2000000, ..., 9000000.
\code{.py}
>>> from math import log10
>>> for x in [10, 100, 1000, 10000, 100000, 1000000]:
...     for N in range(x, x*10, x):
...         starpu.joblib.parallel(mode="normal", n_jobs=2, perfmodel="log")(starpu.joblib.delayed(log10)(i+1) for i in range(N))
\endcode
The performance model will be saved in a file named after the symbol. You can also call the function perfmodel_plot with the symbol of the perfmodel to view the performance curve.
\code{.py}
starpu.joblib.perfmodel_plot(perfmodel="log")
\endcode
The performance curve of this example is shown as:
\image html starpu_log.png
\image latex starpu_log.eps "" width=\textwidth
*/