Threading NOT working as expected

Ted

When I first tried creating Perl threads, the main process ended after
the threads were created but before any of them really started. On
reading further, I saw that I had to join the threads so that the main
process would sit idle waiting for the threads to finish. So I added
statements to join each thread. But now, it looks like the
consequence of this is that the code in each thread is executed one
after the other, as if it were a single process rather than a set of
independently executing threads. I had thought of joining only the
last created thread, but there is no guarantee that the last thread
will take the longest time to complete. So how do I create these
threads and guarantee that they will execute in parallel, and that the
main process will wait idle until all have finished? I am trying to
use a script to manage this analysis since there may be, in any given
batch, several dozen SQL scripts that need to be executed (each is
independent, of course, with no possibility of interacting with the
others), and I want to run these scripts by invoking a single Perl
script that lets them run in parallel, making full use of all the
available computing resources.

Thanks

Ted
 
Joost Diepenmaat

Ted said:
When I first tried creating Perl threads, the main process ended after
the threads were created but before any of them really started. On
reading further, I saw that I had to join the threads so that the main
process would sit idle waiting for the threads to finish. So I added
statements to join each thread. But now, it looks like the
consequence of this is that the code in each thread is executed one
after the other, as if it were a single process rather than a set of
independently executing threads.

It shouldn't.

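#!/usr/local/bin/perl -w
use strict;
use threads;

# Start all eleven threads first; each one announces itself, sleeps, and exits.
my @thrds = map {
    my $i = $_;    # private copy of the index for this thread's closure
    threads->new(sub {
        print "started $i\n";
        sleep 2;
        print "stopped $i\n";
    });
} 0 .. 10;

# Only now wait for them, so the sleeps overlap.
$_->join for @thrds;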

output:
started 0
started 1
started 2
started 3
started 4
started 5
started 6
started 7
started 8
started 9
started 10
stopped 0
stopped 1
stopped 2
stopped 3
stopped 4
stopped 5
stopped 6
stopped 7
stopped 8
stopped 9
stopped 10
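
For contrast, here is a minimal sketch of the ordering that matters. Creating every thread before joining any lets the jobs overlap, while joining each thread right after creating it runs them one after another. Whether the latter is what happened in the original script is only a guess, since that code was not posted. The names `job`, `run_overlapped`, and `run_serialized` are hypothetical, and the half-second sleep merely stands in for real work.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for a long-running job.
sub job { sleep 0.5; return 1 }

# Create all threads first, then join them: the jobs overlap.
sub run_overlapped {
    my ($n) = @_;
    my $t0 = time;
    my @thrds = map { threads->create(\&job) } 1 .. $n;
    $_->join for @thrds;
    return time - $t0;
}

# Join each thread right after creating it: the jobs run one at a time.
sub run_serialized {
    my ($n) = @_;
    my $t0 = time;
    threads->create(\&job)->join for 1 .. $n;
    return time - $t0;
}

my $overlapped = run_overlapped(4);
my $serialized = run_serialized(4);
printf "overlapped: %.1fs, serialized: %.1fs\n", $overlapped, $serialized;
```

With four jobs, the serialized run should take roughly four times as long as the overlapped one; the exact figures depend on the machine.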
 
xhoster

Ted said:
When I first tried creating Perl threads, the main process ended after
the threads were created but before any of them really started. On
reading further, I saw that I had to join the threads so that the main
process would sit idle waiting for the threads to finish. So I added
statements to join each thread. But now, it looks like the
consequence of this is that the code in each thread is executed one
after the other, as if it were a single process rather than a set of
independently executing threads.

The problem is in line 42.

Xho

--
-------------------- http://NewsReader.Com/ --------------------
The costs of publication of this article were defrayed in part by the
payment of page charges. This article must therefore be hereby marked
advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
this fact.
 
Ted

The problem is in line 42.

Will calling 'system' within a thread block all other threads in the
process until it has returned? If so, then that may be where my
problem lies.

If not, then I am puzzled.
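
One way to find out is to time it rather than assume. The sketch below starts four threads that each call `system` on a command lasting about one second, then joins them all. If `system` in one thread blocked the others, the total would be close to four seconds; if not, close to one. The helper name `run_external` is hypothetical, and `$^X` (the running perl binary) is used only as a stand-in for whatever command actually runs a SQL script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time);

# Hypothetical stand-in for one SQL script run: an external command that
# takes about one second. $^X is the path of the running perl binary.
sub run_external {
    my ($id) = @_;
    my $status = system($^X, '-e', 'sleep 1');
    return $status == 0 ? 1 : 0;    # 1 on success, 0 on failure
}

my $t0      = time;
my @thrds   = map { threads->create(\&run_external, $_) } 1 .. 4;
my @ok      = map { $_->join } @thrds;    # join every thread, collect results
my $elapsed = time - $t0;

printf "%d of %d commands succeeded in %.1fs\n",
    scalar(grep { $_ } @ok), scalar(@ok), $elapsed;
```

The result may differ between platforms and perl builds, so it is worth running on both the development machine and the server.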

Thanks

Ted
 
Willem

Ted wrote:
) When I first tried creating Perl threads, the main process ended after
) the threads were created but before any of them really started. On
) reading further, I saw that I had to join the threads so that the main
) process would sit idle waiting for the threads to finish. So I added
) statements to join each thread. But now, it looks like the
) consequence of this is that the code in each thread is executed one
) after the other, as if it were a single process rather than a set of
) independently executing threads.

How do you know this? Have you tested this thoroughly?

Note that if you run multiple threads, one thread will be running
at a time, and the OS will switch to the next thread every so often.

So if you do a trivial task in one thread, then, yes, it's very likely that
the operating system won't have a chance to switch to other tasks before it
completes, effectively completing one task after another.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Ted

Ted wrote:

) When I first tried creating Perl threads, the main process ended after
) the threads were created but before any of them really started. On
) reading further, I saw that I had to join the threads so that the main
) process would sit idle waiting for the threads to finish. So I added
) statements to join each thread. But now, it looks like the
) consequence of this is that the code in each thread is executed one
) after the other, as if it were a single process rather than a set of
) independently executing threads.

How do you know this? Have you tested this thoroughly?

Note that if you run multiple threads, one thread will be running
at a time, and the OS will switch to the next thread every so often.

So if you do a trivial task in one thread, then, yes, it's very likely that
the operating system won't have a chance to switch to other tasks before it
completes, effectively completing one task after another.

SaSW, Willem

Willem,

These scripts that are being launched by the Perl script threads
typically take several hours to complete.

Yes, I have done enough multithreaded programming (in C++ and Java) to
know that on a single-core, single-processor machine, only one thread
runs at a time. There is, then, little advantage in multithreaded
development for most of my programming. However, my present
development machine has a dual-core processor, and the server I'm
working with has a quad-core processor. So, on my own machine, two
threads ought to be running concurrently and on the server, that would
be four concurrent threads.

Thanks

Ted
 
Willem

Ted wrote:
) Willem,
)
) These scripts that are being launched by the perl script threads
) typically take several hours to complete.
)
) Yes, I have done enough multithreaded programming (in C++ and Java) to
) know that on a single core single processor machine, only one thread
) runs at a time. There is, then, little advantage, for most of my
) programming to do multithreaded development. However, my present
) development machine has a dual core processor, and the server I'm
) working with has a quad core processor. So, on my own machine, two
) threads ought to be running concurrently and on the server, that would
) be four concurrent threads.

Oh, I see. Sorry for jumping to conclusions.

I'll read through the rest of the thread to see if I have some
more useful insights.


SaSW, Willem
 
Willem

Willem wrote:
) I'll read through the rest of the thread to see if I have some
) more useful insights.

I see you got your answer already in the other thread. ^_^


SaSW, Willem
 
Ted Zlatanov

T> Yes, I have done enough multithreaded programming (in C++ and Java) to
T> know that on a single core single processor machine, only one thread
T> runs at a time. There is, then, little advantage, for most of my
T> programming to do multithreaded development. However, my present
T> development machine has a dual core processor, and the server I'm
T> working with has a quad core processor. So, on my own machine, two
T> threads ought to be running concurrently and on the server, that would
T> be four concurrent threads.

This is incorrect. A modern single processor will perform well in a
multithreaded application. Just because only one thread runs at a time
doesn't mean that thread is doing work the whole time.

Most of the time in a modern system is spent waiting for I/O and memory
access. There are some very special cases where the CPU is actually
tied up while the application runs, but memory and disk speeds have
fallen far behind CPU speeds so the CPU will usually be waiting for
something to happen. This is why modern CPUs have ridiculously large L1
and L2 caches and many prefetching optimizations.

In a multithreaded setup, the CPU has a chance to run several threads
while memory and I/O fetches are happening.

Equating threads with number of processors is both inefficient and
misguided. Let the OS worry about scheduling resources, processes, and
threads. Just write your code to use as many threads as it absolutely
needs.

Ted
 
Peter J. Holzer

[number of threads should equal number of CPU cores]
This is incorrect. A modern single processor will perform well in a
multithreaded application. Just because a single thread will run
doesn't mean a single thread is doing work at any time.

Most of the time in a modern system is spent waiting for I/O and memory
access. There are some very special cases where the CPU is actually
tied up while the application runs, but memory and disk speeds have
fallen far behind CPU speeds so the CPU will usually be waiting for
something to happen. This is why modern CPUs have ridiculously large L1
and L2 caches and many prefetching optimizations.

In a multithreaded setup, the CPU has a chance to run several threads
while memory and I/O fetches are happening.

Multithreading won't help for memory fetches. Firstly because CPUs don't
have a way to inform the OS of a slow memory access, and secondly
because the overhead of switching to a different thread would be much
too high for such a (relatively) short wait. There is one exception:
So-called multi-threading CPUs can keep the state of a fixed (and
usually low) number of threads on the CPU and switch between them. But
these are really just multi-core CPUs which share some of their units.

You are completely right about I/O of course.

hp
 
Martijn Lievaart

[number of threads should equal number of CPU cores]
This is incorrect. A modern single processor will perform well in a
multithreaded application. Just because a single thread will run
doesn't mean a single thread is doing work at any time.

Most of the time in a modern system is spent waiting for I/O and memory
access. There are some very special cases where the CPU is actually
tied up while the application runs, but memory and disk speeds have
fallen far behind CPU speeds so the CPU will usually be waiting for
something to happen. This is why modern CPUs have ridiculously large
L1 and L2 caches and many prefetching optimizations.

In a multithreaded setup, the CPU has a chance to run several threads
while memory and I/O fetches are happening.

Multithreading won't help for memory fetches. Firstly because CPUs don't
have a way to inform the OS of a slow memory access, and secondly
because the overhead of switching to a different thread would be much
too high for such a (relatively) short wait. There is one exception:
So-called multi-threading CPUs can keep the state of a fixed (and
usually low) number of threads on the CPU and switch between them. But
these are really just multi-core CPUs which share some of their units.

You are completely right about I/O of course.

Not even. If all those I/Os are going to the same disk, you run the risk
of thrashing, and overall performance goes down instead of up.

Exactly the same behaviour can be seen with processes. Suppose you have a
bunch of files, together much larger than available memory. These files
are input to a program that handles one file and writes another output
file. You can do either:

for f in *; do program "$f" "$f.out" & done; wait

or

for f in *; do program "$f" "$f.out"; done;

If the program is I/O bound, I expect the second version to be faster
than the first, although it depends on a lot of things.

So think, design, and profile, profile, profile.

M4
 
Peter J. Holzer

[number of threads should equal number of CPU cores]
This is incorrect. A modern single processor will perform well in a
multithreaded application. Just because a single thread will run
doesn't mean a single thread is doing work at any time. [...]
In a multithreaded setup, the CPU has a chance to run several threads
while memory and I/O fetches are happening.

Multithreading won't help for memory fetches. [...]
You are completely right about I/O of course.

Not even. If all those I/Os are going to the same disk, you run the risk
of thrashing, and overall performance goes down instead of up.

Of course. There are few problems which can be parallelized infinitely.
At some point further parallelization degrades performance instead of
improving it.

Exactly the same behaviour can be seen with processes. Suppose you have a
bunch of files, together much larger than available memory. These files
are input to a program that handles one file and writes another output
file. You can do either:

for f in *; do program "$f" "$f.out" & done; wait

or

for f in *; do program "$f" "$f.out"; done;

If the program is I/O bound, I expect the second version to be faster
than the first, although it depends on a lot of things.

One of the things it depends on is size and placement of the files on
disk. If the files are large and stored (mostly) contiguously, the
second version is almost certainly faster. But if they are small and
scattered all over the disk, the first version may be faster because it
allows the kernel (or even the disk) to decide on the order in which it
reads these files.

So think, design, and profile, profile, profile.

Full ack.
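
Since the right degree of parallelism has to be found by measurement, one option is a worker pool whose thread count is a single parameter that can be varied between profiling runs. The following is only a sketch under assumptions: `run_pool` is a hypothetical helper, the squaring job is a placeholder for real work such as launching a script, and an `undef` per worker is used as a stop marker on a `Thread::Queue`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Run $work on each job using a fixed number of worker threads.
# $workers is the knob to vary while profiling.
sub run_pool {
    my ($workers, $work, @jobs) = @_;
    my $queue = Thread::Queue->new;
    $queue->enqueue(@jobs);
    $queue->enqueue(undef) for 1 .. $workers;    # one stop marker per worker

    my @thrds = map {
        threads->create(sub {
            my @done;
            # Pull jobs until this worker receives a stop marker.
            while (defined(my $job = $queue->dequeue)) {
                push @done, $work->($job);
            }
            return @done;
        });
    } 1 .. $workers;

    # Wait for every worker and gather everything they returned.
    return map { $_->join } @thrds;
}

# Hypothetical job: square a number. A real job might call system().
my @results = sort { $a <=> $b } run_pool(3, sub { $_[0] ** 2 }, 1 .. 10);
print "@results\n";
```

Results come back in whatever order the workers finish, which is why the usage line sorts them. Timing the same batch with different worker counts is one way to see where adding threads stops helping.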

hp
 
Ted Zlatanov

ML> Not even. If all those I/Os are going to the same disk, you run the risk
ML> of thrashing, and overall performance goes down instead of up.

That's very application-dependent. Note my original statement: the CPU
has a chance to run several threads while fetches are happening. It
doesn't mean that I/O will work better that way, but that's not a
multithreading problem. The elevator algorithm introduced fairly
recently in Linux kernels (IIRC) addresses this kind of I/O contention
in the right place, outside the application.

Ted
 
