# StarPU --- Runtime system for heterogeneous multicore architectures.
#
# Copyright (C) 2013 Simon Archipoff
#
# StarPU is free software; you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or (at
# your option) any later version.
#
# StarPU is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See the GNU Lesser General Public License in COPYING.LGPL for more details.
Mutex policy
The scheduler has to be protected when the hypervisor is modifying it.
There is a mutex in struct starpu_sched_tree which should be taken by
the application to push a task, and one mutex per worker which should
be taken by workers when they pop or push a task.
The hypervisor must take all of them to modify the scheduler.
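For illustration, here is a minimal sketch of this locking protocol. The
struct and field names (sched_tree, worker_component, lock, mutex) are
simplified stand-ins, not the exact StarPU identifiers:

  #include <pthread.h>

  /* Simplified stand-ins for struct starpu_sched_tree and the
   * per-worker component (field names are assumptions). */
  struct sched_tree { pthread_mutex_t lock; };
  struct worker_component { pthread_mutex_t mutex; };

  /* The hypervisor takes the tree mutex and every worker mutex
   * before modifying the scheduler. */
  static void hypervisor_lock_all(struct sched_tree *tree,
                                  struct worker_component *workers,
                                  int nworkers)
  {
          int i;
          pthread_mutex_lock(&tree->lock);
          for (i = 0; i < nworkers; i++)
                  pthread_mutex_lock(&workers[i].mutex);
          /* ... the scheduler may now be modified safely ... */
  }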
Creation/Destruction
All the struct starpu_sched_component * starpu_sched_component_foo_create()
functions return an initialized struct starpu_sched_component.
The void starpu_sched_component_destroy(struct starpu_sched_component * component)
function calls component->deinit_data(component) to free data allocated during
creation.
Worker components are particular: there is no creation function, only an
accessor, to guarantee the uniqueness of worker components. worker_component->workers
and worker_component->workers_in_ctx should not be modified.
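A hedged sketch of this lifecycle; the create function's arguments and
the exact signature of the worker accessor are assumptions here:

  #include <starpu_sched_component.h> /* assumed header location */

  static void lifecycle_example(int workerid)
  {
          /* Create arguments are elided/assumed. */
          struct starpu_sched_component *fifo =
                  starpu_sched_component_fifo_create(NULL);
          /* ... plug fifo into a tree and schedule with it ... */
          starpu_sched_component_destroy(fifo); /* calls fifo->deinit_data(fifo) */

          /* Worker components are only looked up, never created: */
          struct starpu_sched_component *worker =
                  starpu_sched_component_worker_get(workerid);
          (void)worker; /* its workers/workers_in_ctx must not be modified */
  }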
Add/Remove workers
I see two ways of adding/removing workers in the scheduler.
In the first one, the hypervisor blocks all scheduling and modifies the
scheduler in the way it wants, then updates all component->workers_in_ctx
bitmaps, which all component->push_task implementations should respect.
The second one may be done in an atomic way: the struct
starpu_sched_tree holds a struct starpu_bitmap * that represents the
workers available in the context. All components can call struct starpu_bitmap
* starpu_sched_component_get_worker_mask(unsigned sched_ctx_id) to see
where they can push a task according to the available workers (a sketch
follows below).
But this way raises a problem for component->estimated_end: in the case
of a fifo, we have to know how many workers are available to the fifo
component. We also have a problem for shared objects. The first way seems
to be better.
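A sketch of how a component could consult the mask in the second
approach; the surrounding push_task context (how child and workerid are
chosen) is assumed, only the mask query itself follows this file's
description:

  #include <starpu_sched_component.h> /* assumed header location */

  /* Inside some component->push_task: check whether a candidate
   * worker is available in the task's context before pushing to the
   * child that leads to it. */
  static int push_checking_mask(struct starpu_sched_component *child,
                                struct starpu_task *task, int workerid)
  {
          struct starpu_bitmap *mask =
                  starpu_sched_component_get_worker_mask(task->sched_ctx);
          if (starpu_bitmap_get(mask, workerid))
                  return child->push_task(child, task);
          return 1; /* worker not available in this context */
  }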
Hierarchical construction
Bugs everywhere; this works only in simple and particular cases.
It is difficult to guess where we should plug accelerators because we cannot
rely on the hwloc topology. Hierarchical heft seems to work on simple machines
with NUMA components and GPUs.
This fails if hwloc_socket_composed_sched_component or
hwloc_cache_composed_sched_component is not NULL.
Various things
In several places realloc is used (in prio_deque and for
starpu_sched_component_add_child), because we should not have many
different priority levels nor add too many children.
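A sketch of that growth pattern, with simplified names standing in for
struct starpu_sched_component (error checking omitted):

  #include <stdlib.h>

  struct component { struct component **children; unsigned nchildren; };

  /* One realloc per added child: fine as long as components have
   * few children, as argued above. */
  static void add_child(struct component *c, struct component *child)
  {
          c->children = realloc(c->children,
                                (c->nchildren + 1) * sizeof(*c->children));
          c->children[c->nchildren++] = child;
  }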