[GSoC'19, libgomp work-stealing] GSoC 2nd Evaluation Status

김규래
Hi,
I just submitted a WIP patch reflecting my current status.
I've finished unifying the three queues and reducing the execution paths.
From now on, I will shrink the locked region so that, in the end, only the queue accesses are locked.
Once this is done, splitting the queues and implementing work-stealing will follow.
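
To illustrate what "only the queue accesses are locked" means, here is a minimal sketch in C. It is not code from the patch and does not use libgomp's actual data structures: the types `struct task` and `struct task_queue` and the functions `queue_init`, `queue_push`, `queue_pop`, `run_all`, and `bump` are hypothetical names made up for this example. The mutex is held only while the list pointers are modified, and each task body runs after the lock has been released.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task type for illustration; not libgomp's struct gomp_task. */
struct task {
    struct task *next;
    void (*fn)(void *);
    void *data;
};

/* Hypothetical single unified FIFO queue guarded by one mutex. */
struct task_queue {
    pthread_mutex_t lock;
    struct task *head;
    struct task *tail;
};

static void queue_init(struct task_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = NULL;
    q->tail = NULL;
}

/* The critical section covers only the pointer updates. */
static void queue_push(struct task_queue *q, struct task *t)
{
    assert(t != NULL);
    t->next = NULL;                 /* prepared outside the lock */
    pthread_mutex_lock(&q->lock);
    if (q->tail != NULL)
        q->tail->next = t;
    else
        q->head = t;
    q->tail = t;
    pthread_mutex_unlock(&q->lock);
}

/* Returns the oldest task, or NULL if the queue is empty. */
static struct task *queue_pop(struct task_queue *q)
{
    pthread_mutex_lock(&q->lock);
    struct task *t = q->head;
    if (t != NULL) {
        q->head = t->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return t;
}

/* Drains the queue; each task body executes with the lock released. */
static int run_all(struct task_queue *q)
{
    int executed = 0;
    struct task *t;
    while ((t = queue_pop(q)) != NULL) {
        t->fn(t->data);
        executed++;
    }
    return executed;
}

/* Example task body: increments the int that data points to. */
static void bump(void *data)
{
    ++*(int *)data;
}
```

With a single shared queue like this, every thread still contends on one mutex. Splitting it into per-thread queues, with idle threads taking tasks from other threads' queues, is the work-stealing step planned afterwards.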
 
The link below is a simple benchmark result from the version submitted to gcc-patches.
https://imgur.com/IvaBDwT
The benchmark problem is computing the LU decomposition of an NxN matrix.
PLASMA [1] is a task-parallel linear algebra library.
The upstream version of PLASMA uses OpenMP's task scheduling system.
 
Looking at the results, the '2nd eval' version (the currently submitted patch)
surpasses the upstream version's performance past N=4096.
Apparently, unifying the queues improved performance despite the
more frequent mutex lock/unlock operations.
 
Ray Kim.
 
[1] https://bitbucket.org/icl/plasma/src