/* StarPU --- Runtime system for heterogeneous multicore architectures.
*
* Copyright (C) 2013-2020 Université de Bordeaux, CNRS (LaBRI UMR 5800), Inria
* Copyright (C) 2013 Simon Archipoff
*
* StarPU is free software; you can redistribute it and/or modify
* it under the terms of the GNU Lesser General Public License as published by
* the Free Software Foundation; either version 2.1 of the License, or (at
* your option) any later version.
*
* StarPU is distributed in the hope that it will be useful, but
* WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
*
* See the GNU Lesser General Public License in COPYING.LGPL for more details.
*/
/*! \page HowToDefineANewSchedulingPolicy How To Define A New Scheduling Policy
\section NewSchedulingPolicy_Introduction Introduction
StarPU provides two ways of defining a scheduling policy, a basic monolithic
way, and a modular way.
The basic monolithic way is directly connected with the core of StarPU, which
means that the policy then has to handle all performance details, such as data
prefetching, task performance model calibration, worker locking, etc.
|
starpu_push_task |
|
v
Fifo_Component
| ^
Push | | Can_Push
v |
Eager_Component
| ^
| |
v |
--------><-------------------><---------
| ^ | ^
Push | | Can_Push Push | | Can_Push
v | v |
Fifo_Component Fifo_Component
| ^ | ^
Pull | | Can_Pull Pull | | Can_Pull
v | v |
Worker_Component Worker_Component
| |
starpu_pop_task | |
v v
When a task is pushed by StarPU in a Modularized Scheduler, the task moves from
a Scheduling Component to an other, following the hierarchy of the
Scheduling Tree, and is stored in one of the Scheduling Components of the
strategy.
When a worker wants to pop a task from the Modularized Scheduler, the
corresponding Worker Component of the Scheduling Tree tries to pull a task from
its parents, following the hierarchy, and gives it to the worker if it succeded
to get one.
\subsection Interface
Each Scheduling Component must follow the following pre-defined Interface
to be able to interact with other Scheduling Components.
- push_task (child_component, Task) \n
The calling Scheduling Component transfers a task to its
Child Component. When the Push function returns, the task no longer
belongs to the calling Component. The Modularized Schedulers'
model relies on this function to perform prefetching.
See starpu_sched_component::push_task for more details
- pull_task (parent_component, caller_component) -> Task \n
The calling Scheduling Component requests a task from
its Parent Component. When the Pull function ends, the returned
task belongs to the calling Component.
See starpu_sched_component::pull_task for more details
- can_push (caller_component, parent_component) \n
The calling Scheduling Component notifies its Parent Component that
it is ready to accept new tasks.
See starpu_sched_component::can_push for more details
- can_pull (caller_component, child_component) \n
The calling Scheduling Component notifies its Child Component
that it is ready to give new tasks.
See starpu_sched_component::can_pull for more details
The components also provide the following useful methods:
- starpu_sched_component::estimated_load provides an estimated load of
the component
- starpu_sched_component::estimated_end provides an estimated date of
availability of workers behind the component, after processing tasks in
the component and below.
This is computed only if the estimated field of the tasks have been set
before passing it to the component.
\subsection BuildAModularizedScheduler Building a Modularized Scheduler
\subsubsection PreImplementedComponents Pre-implemented Components
StarPU is currently shipped with the following four Scheduling Components :
- Storage Components : Fifo, Prio \n
Components which store tasks. They can also prioritize them if
they have a defined priority. It is possible to define a threshold
for those Components following two criterias : the number of tasks
stored in the Component, or the sum of the expected length of all
tasks stored in the Component. When a push operation tries to queue a
task beyond the threshold, the push fails. When some task leaves the
queue (and thus possibly more tasks can fit), this component calls
can_push from ancestors.
- Resource-Mapping Components : Mct, Heft, Eager, Random, Work-Stealing \n
"Core" of the Scheduling Strategy, those Components are the
ones who make scheduling choices between their children components.
- Worker Components : Worker \n
Each Worker Component modelizes a concrete worker, and copes with the
technical tricks of interacting with the StarPU core. Modular schedulers
thus usually have them at the bottom of their component tree.
- Special-Purpose Components : Perfmodel_Select, Best_Implementation \n
Components dedicated to original purposes. The Perfmodel_Select
Component decides which Resource-Mapping Component should be used to
schedule a task: a component that assumes tasks with a calibrated
performance model; a component for non-yet-calibrated tasks, that will
distribute them to get measurements done as quickly as possible; and a
component that takes the tasks without performance models.\n
The Best_Implementation Component chooses which
implementation of a task should be used on the chosen resource.
\subsubsection ProgressionAndValidationRules Progression And Validation Rules
Some rules must be followed to ensure the correctness of a Modularized
Scheduler :
- At least one Storage Component without threshold is needed in a
Modularized Scheduler, to store incoming tasks from StarPU. It can for
instance be a global component at the top of the tree, or one component
per worker at the bottom of the tree, or intermediate assemblies. The
important point is that the starpu_sched_component::push_task call at the top can not
fail, so there has to be a storage component without threshold between
the top of the tree and the first storage component with threshold, or
the workers themselves.
- At least one Resource-Mapping Component is needed in a Modularized
Scheduler. Resource-Mapping Components are the only ones which can make
scheduling choices, and so the only ones which can have several child.
\subsubsection ModularizedSchedulerLocking Locking in modularized schedulers
Most often, components do not need to take locks. This allows e.g. the push
operation to be called in parallel when tasks get released in parallel from
different workers which have completed different ancestor tasks.
When a component has internal information which needs to be kept coherent, the
component can define its own lock at take it as it sees fit, e.g. to protect a
task queue. This may however limit scalability of the scheduler. Conversely,
since push and pull operations will be called concurrently from different
workers, the component might prefer to use a central mutex to serialize all
scheduling decisions to avoid pathological cases (all push calls decide to put
their task on the same target)
\subsubsection ImplementAModularizedScheduler Implementing a Modularized Scheduler
The following code shows how to implement a Tree-Eager-Prefetching Scheduler.
\code{.c}
static void initialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
/* The eager component will decide for each task which worker will run it,
* and we want fifos both above and below the component */
starpu_sched_component_initialize_simple_scheduler(
starpu_sched_component_eager_create, NULL,
STARPU_SCHED_SIMPLE_DECIDE_WORKERS |
STARPU_SCHED_SIMPLE_FIFO_ABOVE |
STARPU_SCHED_SIMPLE_FIFOS_BELOW,
sched_ctx_id);
}
/* Properly destroy the Scheduling Tree and all its Components */
static void deinitialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
struct starpu_sched_tree * tree = (struct starpu_sched_tree*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
starpu_sched_tree_destroy(tree);
}
/* Initializing the starpu_sched_policy struct associated to the Modularized
* Scheduler : only the init_sched and deinit_sched needs to be defined to
* implement a Modularized Scheduler */
struct starpu_sched_policy _starpu_sched_tree_eager_prefetching_policy =
{
.init_sched = initialize_eager_prefetching_center_policy,
.deinit_sched = deinitialize_eager_prefetching_center_policy,
.add_workers = starpu_sched_tree_add_workers,
.remove_workers = starpu_sched_tree_remove_workers,
.push_task = starpu_sched_tree_push_task,
.pop_task = starpu_sched_tree_pop_task,
.pre_exec_hook = starpu_sched_component_worker_pre_exec_hook,
.post_exec_hook = starpu_sched_component_worker_post_exec_hook,
.pop_every_task = NULL,
.policy_name = "tree-eager-prefetching",
.policy_description = "eager with prefetching tree policy"
};
\endcode
starpu_sched_component_initialize_simple_scheduler() is a helper function which
makes it very trivial to assemble a modular scheduler around a scheduling
decision component as seen above (here, a dumb eager decision component). Most
often a modular scheduler can be implemented that way.
A modular scheduler can also be constructed hierarchically with
starpu_sched_component_composed_recipe_create().
That modular scheduler can also be built by hand in the following way:
\code{.c}
#define _STARPU_SCHED_NTASKS_THRESHOLD_DEFAULT 2
#define _STARPU_SCHED_EXP_LEN_THRESHOLD_DEFAULT 1000000000.0
static void initialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
unsigned ntasks_threshold = _STARPU_SCHED_NTASKS_THRESHOLD_DEFAULT;
double exp_len_threshold = _STARPU_SCHED_EXP_LEN_THRESHOLD_DEFAULT;
[...]
starpu_sched_ctx_create_worker_collection
(sched_ctx_id, STARPU_WORKER_LIST);
/* Create the Scheduling Tree */
struct starpu_sched_tree * t = starpu_sched_tree_create(sched_ctx_id);
/* The Root Component is a Flow-control Fifo Component */
t->root = starpu_sched_component_fifo_create(NULL);
/* The Resource-mapping Component of the strategy is an Eager Component
*/
struct starpu_sched_component *eager_component = starpu_sched_component_eager_create(NULL);
/* Create links between Components : the Eager Component is the child
* of the Root Component */
starpu_sched_component_connect(t->root, eager_component);
/* A task threshold is set for the Flow-control Components which will
* be connected to Worker Components. By doing so, this Modularized
* Scheduler will be able to perform some prefetching on the resources
*/
struct starpu_sched_component_fifo_data fifo_data =
{
.ntasks_threshold = ntasks_threshold,
.exp_len_threshold = exp_len_threshold,
};
unsigned i;
for(i = 0; i < starpu_worker_get_count() + starpu_combined_worker_get_count(); i++)
{
/* Each Worker Component has a Flow-control Fifo Component as
* father */
struct starpu_sched_component * worker_component = starpu_sched_component_worker_new(i);
struct starpu_sched_component * fifo_component = starpu_sched_component_fifo_create(&fifo_data);
starpu_sched_component_connect(fifo_component, worker_component);
/* Each Flow-control Fifo Component associated to a Worker
* Component is linked to the Eager Component as one of its
* children */
starpu_sched_component_connect(eager_component, fifo_component);
}
starpu_sched_tree_update_workers(t);
starpu_sched_ctx_set_policy_data(sched_ctx_id, (void*)t);
}
/* Properly destroy the Scheduling Tree and all its Components */
static void deinitialize_eager_prefetching_center_policy(unsigned sched_ctx_id)
{
struct starpu_sched_tree * tree = (struct starpu_sched_tree*)starpu_sched_ctx_get_policy_data(sched_ctx_id);
starpu_sched_tree_destroy(tree);
starpu_sched_ctx_delete_worker_collection(sched_ctx_id);
}
/* Initializing the starpu_sched_policy struct associated to the Modularized
* Scheduler : only the init_sched and deinit_sched needs to be defined to
* implement a Modularized Scheduler */
struct starpu_sched_policy _starpu_sched_tree_eager_prefetching_policy =
{
.init_sched = initialize_eager_prefetching_center_policy,
.deinit_sched = deinitialize_eager_prefetching_center_policy,
.add_workers = starpu_sched_tree_add_workers,
.remove_workers = starpu_sched_tree_remove_workers,
.push_task = starpu_sched_tree_push_task,
.pop_task = starpu_sched_tree_pop_task,
.pre_exec_hook = starpu_sched_component_worker_pre_exec_hook,
.post_exec_hook = starpu_sched_component_worker_post_exec_hook,
.pop_every_task = NULL,
.policy_name = "tree-eager-prefetching",
.policy_description = "eager with prefetching tree policy"
};
\endcode
Other modular scheduler examples can be seen in
starpu_push_task
Pump
|
Area 1 |
|
v
-----------------------Fifo_Component-----------------------------
Pump
| ^
Push | | Can_Push
v |
Area 2 Eager_Component
| ^
| |
v |
--------><-------------------><---------
| ^ | ^
Push | | Can_Push Push | | Can_Push
v | v |
-----Fifo_Component-----------------------Fifo_Component----------
| ^ | ^
Pull | | Can_Pull Pull | | Can_Pull
Area 3 v | v |
Pump Pump
Worker_Component Worker_Component
\section GraphScheduling Graph-based Scheduling
For performance reasons, most of the schedulers shipped with StarPU use simple
list-scheduling heuristics, assuming that the application has already set
priorities. This is why they do their scheduling between when tasks become
available for execution and when a worker becomes idle, without looking at the
task graph.
Other heuristics can however look at the task graph. Recording the task graph
is expensive, so it is not available by default, the scheduling heuristic has
to set \c _starpu_graph_record to \c 1 from the initialization function, to make it
available. Then the