A
aminer
Hello,
I have tested parallelhashlist(a parallel hashtable that i have implemented)
with four threads on a quad core and it's giving a perfect scaling on both
reads and writes.
Also when i have done those benchmarks on parallelhashlist,
there was not enough/much items organized as a self-balancing
tree in the individual chains of the hashtable, so , almost all the items
was found and inserted in O(1) , so the parallel part in the Amdahl
equation was not much bigger compared to to the serial part. But
you will notice in pratice that as soon as you will have more items
on the chains of the Hash table , organized as self-balancing tree,
with a worst case log(n) , the parallel part will become bigger in the
Amdahl equation and you will have better performance and scalability
, This is Gustafson's Law.
Also i have done some scalability tests on my parallelsort library and i
have come
to the conclusion that parallel heapsort is better on scalability than
parallel quicksort
cause the P part (of the Amdahl equation) is bigger in parallel heapsort
than in parallel
quicksort, the parallel heapsort is doing more on the parallel part, it's
why it scales better than parallel quicksort, but parallel quicksort is
still
faster than parallel heapsort and parallel merge sort on my tests on a
quad core processor.
And about lockfree_mpmc( a lockfree fifo queue), i have done some tests
and it's not scaling cause when you are using a single thread some variable
are updated on the L1 cache but using multiple threads those variables are
loaded from the L2 cache and it's more expensive to load them from the L2
cache.
But even though lockfree_mpmc is not scalable, you can increase
the P (parallel) part by doing more of the same: Increase the volume of
data processed by the P part (and therefore the percentage p of time spent
in computing). This is Gustafson's Law and you will get more scalability.
For example i have used the IntToStr() function on each of the four threads
(on
a quad core) on my lockfree_mpmc test programs to convert from and integer
to a string, so i have increased the P (parallel) part and i have got more
scalability,
this is Gustafson's Law, and you have to remember Gustafson's Law ,
this is very important.
You can download my parallel libraries from
http://pages.videotron.com/aminer/
Sincerely,
Amine Moulay Ramdane.
I have tested parallelhashlist(a parallel hashtable that i have implemented)
with four threads on a quad core and it's giving a perfect scaling on both
reads and writes.
Also when i have done those benchmarks on parallelhashlist,
there was not enough/much items organized as a self-balancing
tree in the individual chains of the hashtable, so , almost all the items
was found and inserted in O(1) , so the parallel part in the Amdahl
equation was not much bigger compared to to the serial part. But
you will notice in pratice that as soon as you will have more items
on the chains of the Hash table , organized as self-balancing tree,
with a worst case log(n) , the parallel part will become bigger in the
Amdahl equation and you will have better performance and scalability
, This is Gustafson's Law.
Also i have done some scalability tests on my parallelsort library and i
have come
to the conclusion that parallel heapsort is better on scalability than
parallel quicksort
cause the P part (of the Amdahl equation) is bigger in parallel heapsort
than in parallel
quicksort, the parallel heapsort is doing more on the parallel part, it's
why it scales better than parallel quicksort, but parallel quicksort is
still
faster than parallel heapsort and parallel merge sort on my tests on a
quad core processor.
And about lockfree_mpmc( a lockfree fifo queue), i have done some tests
and it's not scaling cause when you are using a single thread some variable
are updated on the L1 cache but using multiple threads those variables are
loaded from the L2 cache and it's more expensive to load them from the L2
cache.
But even though lockfree_mpmc is not scalable, you can increase
the P (parallel) part by doing more of the same: Increase the volume of
data processed by the P part (and therefore the percentage p of time spent
in computing). This is Gustafson's Law and you will get more scalability.
For example i have used the IntToStr() function on each of the four threads
(on
a quad core) on my lockfree_mpmc test programs to convert from and integer
to a string, so i have increased the P (parallel) part and i have got more
scalability,
this is Gustafson's Law, and you have to remember Gustafson's Law ,
this is very important.
You can download my parallel libraries from
http://pages.videotron.com/aminer/
Sincerely,
Amine Moulay Ramdane.