Question about the example in perlthrtut

J

jl_post

Hi,

I recently read the "perldoc perlthrtut" document to learn how to
do threading with Perl, and I understand most of what it's teaching.
However, I have a question regarding the complete example given (the
one about generating prime numbers using a pipeline model):


#!/usr/bin/perl
# prime-pthread, courtesy of Tom Christiansen

use strict;
use warnings;

use threads;
use Thread::Queue;

my $stream = Thread::Queue->new();
for my $i ( 3 .. 1000 ) {
$stream->enqueue($i);
}
$stream->enqueue(undef);

threads->create(\&check_num, $stream, 2)->join();

sub check_num {
my ($upstream, $cur_prime) = @_;
my $kid;
my $downstream = Thread::Queue->new();
while (my $num = $upstream->dequeue()) {
next unless ($num % $cur_prime);
if ($kid) {
$downstream->enqueue($num);
} else {
print("Found prime $num\n");
$kid = threads->create(\&check_num, $downstream, $num);
}
}
if ($kid) {
$downstream->enqueue(undef);
$kid->join();
}
}


(For the record, I deleted the first "$kid->join();" line (the one
that was right after the first "thread->create" call) and appended "-
join()" to the previous line to make the program compile.)

If I understand correctly, each thread takes a queue of numbers,
reports that the first is prime (otherwise it wouldn't be first in the
queue), and then starts a new thread. The old thread then gives the
new thread all the numbers in its queue that are not multiples of the
prime number it reported.

I ran the script, and it works great. However, it seems that a thread
won't end until its child thread (that is, the thread it created/
started) ends. And since it seems like it creates another thread for
every prime number, that would mean that around 167 different threads
can be running (or at least exist) at once. (I say 167 because there
are 167 primes from 3 to 1000.)

So am I correct in saying that there are over 100 threads running at
once? If so, isn't that a bit taxing on the operating system? I
mean, I've noticed that in some cases over 100 levels of recursion can
error out (or at least give a warning message), so wouldn't
recursively creating a thread over 100 times also be something we'd
want to avoid? Or are threads different in this respect?

Thanks in advance for any advice.

-- Jean-Luc
 
S

sln

Hi,

I recently read the "perldoc perlthrtut" document to learn how to
do threading with Perl, and I understand most of what it's teaching.
However, I have a question regarding the complete example given (the
one about generating prime numbers using a pipeline model):


#!/usr/bin/perl
# prime-pthread, courtesy of Tom Christiansen

use strict;
use warnings;

use threads;
use Thread::Queue;

my $stream = Thread::Queue->new();
for my $i ( 3 .. 1000 ) {
$stream->enqueue($i);
}
$stream->enqueue(undef);
^^^^^
This last scalar in the queue will break out of the
while function in the first thread. But the $kid->join()
below that block will wait for the 2nd thread to complete,
which is waiting for the 3rd, and so on..
So the first thread will not complete untill the last thread
does.
threads->create(\&check_num, $stream, 2)->join();
^^^^^^
This first thread has a full queue and the join() blocks
until the thread ends.
sub check_num {
my ($upstream, $cur_prime) = @_;
my $kid;
my $downstream = Thread::Queue->new();
^^^^^^^^^
Starting a new queue. The queue is thread-safe.
Items will be added by the parent thread and taken off
by the direct child thread.
while (my $num = $upstream->dequeue()) {
^^^^^^^
This will take an item off the head of the queue looking
for undef value to break out of the while block.
If there is nothing in the queue it blocks if the thread is
still alive (it will be).
next unless ($num % $cur_prime);
^^^^
Since the first threads $cur_prime was 2 and based on ($num % $cur_prime),
it found a new prime of 3. A new thread is created with prime of 3.
if ($kid) {
$downstream->enqueue($num);
^^^^^^^^^^^^^
The first thread, since its prime was 2, only adds odd numbers to the child
queue. So after 3, the next odd number it adds is 5.
} else {
print("Found prime $num\n");
$kid = threads->create(\&check_num, $downstream, $num);
^^^^^^
First thread creates a new second child thread and passes the prime 3 and an empty queue
which blocks immediatly waiting for its parent thread to add some items.

End of the while, all the threads have been created and are waiting for cascading information
to flow into thier queue's from parent threads. Each thread is started on a new prime.
Its basically a bubble sort with queue size decreasing with every new thread.
if ($kid) {
$downstream->enqueue(undef);
^^^^^
Finally, trigger all the threads to break out of the while loop, last to first.
$kid->join();
^^^^^^
Cleanup, wait for all the threads to finish, last to the first.
}
}


(For the record, I deleted the first "$kid->join();" line (the one
that was right after the first "thread->create" call) and appended "-

If I understand correctly, each thread takes a queue of numbers,
reports that the first is prime (otherwise it wouldn't be first in the
queue), and then starts a new thread. The old thread then gives the
new thread all the numbers in its queue that are not multiples of the
prime number it reported.

I ran the script, and it works great. However, it seems that a thread
won't end until its child thread (that is, the thread it created/
started) ends. And since it seems like it creates another thread for
every prime number, that would mean that around 167 different threads
can be running (or at least exist) at once. (I say 167 because there
are 167 primes from 3 to 1000.)

So am I correct in saying that there are over 100 threads running at
once?

Yes, there were 115 runnon on my machine.
If so, isn't that a bit taxing on the operating system? I
mean, I've noticed that in some cases over 100 levels of recursion can
error out (or at least give a warning message), so wouldn't
recursively creating a thread over 100 times also be something we'd
want to avoid? Or are threads different in this respect?

It appeared relatively slow on my windows box. So I think you have a point there.
But, I don't know how threads are implemented in Perl too well. I will have to
read the limited information.

OTOH, the data transformation/cascading effect, this alogrithym does, may have merits
in its implementation.
I am not a guru on calculating primes so I don't know.

I haven't found the "threads" documentation (I just read the Threads in the help docs)
where you got this code, so I will have to look for that.

I did find that $kid = threads->create(\&check_num, $downstream, $num); failed towards the
end, seemingly due to too many threads per process. How this affects cascading I don't know.
There is one thing for sure, more needs to be discovered here. I wouldn't put this code into
production.

-sln
 
X

Xho Jingleheimerschmidt

Hi,

I recently read the "perldoc perlthrtut" document to learn how to
do threading with Perl, and I understand most of what it's teaching.
However, I have a question regarding the complete example given (the
one about generating prime numbers using a pipeline model):
....

If I understand correctly, each thread takes a queue of numbers,
reports that the first is prime (otherwise it wouldn't be first in the
queue), and then starts a new thread. The old thread then gives the
new thread all the numbers in its queue that are not multiples of the
prime number it reported.

I ran the script, and it works great. However, it seems that a thread
won't end until its child thread (that is, the thread it created/
started) ends. And since it seems like it creates another thread for
every prime number, that would mean that around 167 different threads
can be running (or at least exist) at once. (I say 167 because there
are 167 primes from 3 to 1000.)

So am I correct in saying that there are over 100 threads running at
once?

Yes. Well, not necessarily actively "running", but existing, anyway.
Notice that this example is just a pedagogical example. It is like
using Fibonacci's sequence as an example of recursion. If my actual
goal was to obtain prime numbers, I certainly wouldn't use this threaded
implementation to do it, just like I wouldn't use naive recursion to get
Fibonacci numbers.
If so, isn't that a bit taxing on the operating system?

That depends on your OS and your hardware. My rather sad netbook,
running a rather sad hack of Linux, can handle it OK up to 1000. Up to
3000, it has trouble. 10,000 and it swaps to do death. Of course on a
single CPU machine, there is no point in using threading on a CPU
limited problem.

I
mean, I've noticed that in some cases over 100 levels of recursion can
error out (or at least give a warning message),

It gives a warning, which can be turned off. I don't even know why it
is on by default. It says that it "probably indicates infinite
recursion" but there all kinds of ways to make infinite loops that don't
get their own special warnings. The only other infinite loops that do
get warnings are ones that happen in the c-level code, not the user
level. Anyway, I was recent using a program that recursed to a depth of
over 40,000. No problem at all, as long as that is what you want to do.
so wouldn't
recursively creating a thread over 100 times also be something we'd
want to avoid?

You would generally want to avoid that. But this code is for playing,
not for practical applications.

Xho
 
S

sln

Yes. Well, not necessarily actively "running", but existing, anyway.
Notice that this example is just a pedagogical example. It is like
using Fibonacci's sequence as an example of recursion. If my actual
goal was to obtain prime numbers, I certainly wouldn't use this threaded
implementation to do it, just like I wouldn't use naive recursion to get
Fibonacci numbers.


That depends on your OS and your hardware. My rather sad netbook,
running a rather sad hack of Linux, can handle it OK up to 1000. Up to
3000, it has trouble. 10,000 and it swaps to do death. Of course on a
single CPU machine, there is no point in using threading on a CPU
limited problem.



It gives a warning, which can be turned off. I don't even know why it
is on by default. It says that it "probably indicates infinite
recursion" but there all kinds of ways to make infinite loops that don't
get their own special warnings. The only other infinite loops that do
get warnings are ones that happen in the c-level code, not the user
level. Anyway, I was recent using a program that recursed to a depth of
over 40,000. No problem at all, as long as that is what you want to do.


You would generally want to avoid that. But this code is for playing,
not for practical applications.

Xho
You haven't a clue about this algo or threading. Disgracefull explanations.

-sln
 
J

jl_post

You haven't a clue about this algo or threading. Disgracefull explanations.

-sln


I don't agree. Personally, I thought Xho's explanation was great.

And yours was good, too.

Thank you both for your responses. I had a feeling the code was
just for demonstrating (and not for production), but I found no
explanation in the perldoc to support that. So I wasn't sure how
"kosher" the code was.

Thanks again.

-- Jean-Luc
 
S

sln

I don't agree. Personally, I thought Xho's explanation was great.

And yours was good, too.

Thank you both for your responses. I had a feeling the code was
just for demonstrating (and not for production), but I found no
explanation in the perldoc to support that. So I wasn't sure how
"kosher" the code was.

Thanks again.

-- Jean-Luc

It was an absolutely facinating piece of code though.
I will delve into it in more depth to learn more.

I want to thank you for posting this piece.

-sln
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,995
Messages
2,570,225
Members
46,815
Latest member
treekmostly22

Latest Threads

Top