# StarPU --- Runtime system for heterogeneous multicore architectures.
#
# Copyright (C) 2013 Simon Archipoff
#
# StarPU is free software; you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or (at
# your option) any later version.
#
# StarPU is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See the GNU Lesser General Public License in COPYING.LGPL for more details.
Mutex policy
The scheduler has to be protected when the hypervisor is modifying it.
There is a mutex in struct starpu_sched_tree which should be taken by
the application to push a task, and one mutex per worker which should
be taken by workers when they pop or push a task.
The hypervisor must take all of them before modifying the scheduler.
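
The following is a minimal sketch of that locking discipline, not the
actual StarPU code: the field names (lock, worker_locks, nworkers) are
assumptions made for illustration.

    #include <pthread.h>

    /* Reduced stand-in for struct starpu_sched_tree: one tree-level
     * mutex plus one mutex per worker, as described above. */
    struct sched_tree_sketch
    {
        pthread_mutex_t lock;          /* taken by the application to push a task */
        pthread_mutex_t *worker_locks; /* taken by a worker when it pushes or pops */
        unsigned nworkers;
    };

    /* The hypervisor takes every mutex before modifying the scheduler,
     * so no concurrent push or pop can observe a half-modified tree. */
    static void hypervisor_modify_tree(struct sched_tree_sketch *t)
    {
        unsigned i;
        pthread_mutex_lock(&t->lock);
        for (i = 0; i < t->nworkers; i++)
            pthread_mutex_lock(&t->worker_locks[i]);

        /* ... restructure the scheduler here ... */

        for (i = 0; i < t->nworkers; i++)
            pthread_mutex_unlock(&t->worker_locks[i]);
        pthread_mutex_unlock(&t->lock);
    }
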
Creation/Destruction
All the struct starpu_sched_node * starpu_sched_node_foo_create()
functions return an initialized struct starpu_sched_node.
The void starpu_sched_node_destroy(struct starpu_sched_node * node)
function calls node->deinit_data(node) to free the data allocated
during creation.
Worker nodes are special: there is no creation function, only an
accessor, which guarantees the uniqueness of worker nodes.
worker_node->workers and worker_node->workers_in_ctx should not be
modified.
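
As a sketch of that pattern (the struct is reduced to the two fields
mentioned above, "foo" stands for any concrete node type, and the
allocation is a placeholder):

    #include <stdlib.h>

    struct node_sketch
    {
        void *data;                                /* node-private data */
        void (*deinit_data)(struct node_sketch *); /* called on destroy */
    };

    static void foo_deinit_data(struct node_sketch *node)
    {
        free(node->data); /* release whatever foo_create() allocated */
    }

    /* Every foo_create() returns a fully initialized node. */
    struct node_sketch *node_foo_create_sketch(void)
    {
        struct node_sketch *node = calloc(1, sizeof(*node));
        node->data = malloc(64); /* placeholder for foo-specific state */
        node->deinit_data = foo_deinit_data;
        return node;
    }

    /* destroy() delegates to deinit_data() to free creation-time data. */
    void node_destroy_sketch(struct node_sketch *node)
    {
        node->deinit_data(node);
        free(node);
    }
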
Add/Remove workers
I see two ways of adding/removing workers to/from the scheduler.
In the first one, the hypervisor blocks all scheduling, modifies the
scheduler the way it wants, and then updates all the
node->workers_in_ctx bitmaps, which all node->push_task
implementations should respect.
The second one may be done in an atomic way: the struct
starpu_sched_tree holds a struct starpu_bitmap * that represents the
workers available in the context. All nodes can call struct starpu_bitmap
* starpu_sched_node_get_worker_mask(unsigned sched_ctx_id) to see
where they can push a task according to the available workers (see the
sketch below).
But this way raises a problem for node->estimated_end: in the case of
a fifo, we have to know how many workers are available to the fifo
node. We also have a problem with shared objects. The first way seems
to be better.
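
A sketch of the second approach: starpu_sched_node_get_worker_mask()
is quoted from the text above, while starpu_bitmap_get() (test whether
bit e is set) is an accessor assumed here for illustration.

    /* Forward declarations so the sketch stands alone. */
    struct starpu_bitmap;
    struct starpu_bitmap *starpu_sched_node_get_worker_mask(unsigned sched_ctx_id);
    int starpu_bitmap_get(struct starpu_bitmap *b, int e); /* assumed accessor */

    /* A node checks the context's worker mask before pushing a task
     * toward a given worker. */
    static int can_push_to_worker(unsigned sched_ctx_id, int workerid)
    {
        struct starpu_bitmap *mask = starpu_sched_node_get_worker_mask(sched_ctx_id);
        return starpu_bitmap_get(mask, workerid);
    }
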
Hierarchical construction
Bugs everywhere; works only in simple and particular cases.
It is difficult to guess where we should plug accelerators because we
cannot rely on the hwloc topology. Hierarchical heft seems to work on
simple machines with NUMA nodes and GPUs.
This fails if hwloc_socket_composed_sched_node or
hwloc_cache_composed_sched_node is not NULL.
Various things
In several places realloc is used (in prio_deque and for
starpu_sched_node_add_child), because we should not have many
different priority levels nor add too many children.
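
A sketch of that realloc pattern for adding a child (the struct and
its fields are assumptions for illustration): since a node rarely has
many children, growing the array by one element per insertion is
cheap enough.

    #include <stdlib.h>

    struct child_array_sketch
    {
        struct child_array_sketch **childs;
        int nchilds;
    };

    static void add_child_sketch(struct child_array_sketch *node,
                                 struct child_array_sketch *child)
    {
        node->childs = realloc(node->childs,
                               sizeof(*node->childs) * (node->nchilds + 1));
        node->childs[node->nchilds++] = child;
    }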