Java threads for Core 2 Duo processors

Motaz K. Saad

Hello,

I am working on file processing (a large number of files), which is a
totally independent task for each file. I want to speed up processing
using Java threads, but I do not want to overwhelm my processor (a Core 2
Duo) with all the threads at the same time.

I want to run 2 threads in parallel, wait until they finish, then run
the next 2 threads. Furthermore, each thread allocates a large amount of
memory, so I need to de-allocate the thread after collecting the result
from it.

I would appreciate it if anyone could direct me to a similar example.
I would appreciate any guidelines and help.


Thanks,
 
Tom Anderson

Motaz said:
I am working on file processing (a large number of files), which is a
totally independent task for each file. I want to speed up processing
using Java threads, but I do not want to overwhelm my processor (a Core 2
Duo) with all the threads at the same time.

You'd have to have an awful lot of threads to 'overwhelm' the processor -
hundreds, or perhaps even thousands. There is a chance that it might
overwhelm the disk, though, depending on things like how big your files
are and how the threads access them.

However, running your jobs in 1000 threads won't be any faster than
running them in 10. Running them in 10 might be faster than running them
in 2, because it lets you saturate the disk: two threads on two CPUs will
leave either the disk or the CPU underutilised at some point, unless
readahead and GC keep them busy. So your conclusion is right - you want
to use fewer threads than you have tasks.

Motaz said:
I want to run 2 threads in parallel, wait until they finish, then run
the next 2 threads.

No, you want to run two parallel tasks simultaneously and wait until they
finish then run the next two tasks. You don't need to have one thread per
task.

Motaz said:
Furthermore, each thread allocates a large amount of memory, so I need
to de-allocate the thread after collecting the result from it.

As long as there are no references to the allocated objects after the
file is processed, the garbage collector will reclaim them.

Motaz said:
I would appreciate it if anyone could direct me to a similar example.
I would appreciate any guidelines and help.

You want:

http://java.sun.com/javase/6/docs/api/java/util/concurrent/ExecutorService.html

Your code looks like:

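// FileProcessingTask.java
import java.io.File;

public class FileProcessingTask implements Runnable {
    private final File file;

    public FileProcessingTask(File file) {
        this.file = file;
    }

    public void run() {
        try {
            // process file
        }
        catch (Exception e) {
            // log the exception
            // close any open files
        }
    }
}

// FileProcessingApp.java
import java.io.File;
import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FileProcessingApp {
    public static void main(String... args) throws InterruptedException {
        // fill this collection however you like
        Collection<File> filesToProcess = new ArrayList<File>();
        int numThreads = Runtime.getRuntime().availableProcessors() * 2;
        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        for (File file: filesToProcess) {
            executor.execute(new FileProcessingTask(file));
        }
        // stop accepting new tasks; awaitTermination only returns early
        // once a shutdown has been requested and all tasks have finished
        executor.shutdown();
        executor.awaitTermination(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
    }
}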
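
If each task also has to hand a result back, one possible variation is to submit Callable tasks and read the results from Futures. The following is only a minimal sketch under stated assumptions: the class names LineCountTask and ResultCollector, and the choice of "count the newlines in the file" as the work, are hypothetical stand-ins and not part of the original question. The large buffer is local to call(), so it becomes unreachable once the task returns and only the small Integer result is kept.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task: counts newline bytes in one file and returns the count.
class LineCountTask implements Callable<Integer> {
    private final File file;

    LineCountTask(File file) {
        this.file = file;
    }

    @Override
    public Integer call() throws IOException {
        // Potentially large allocation, but local to this call: nothing
        // refers to it after call() returns, so the GC may reclaim it.
        byte[] contents = Files.readAllBytes(file.toPath());
        int newlines = 0;
        for (byte b : contents) {
            if (b == '\n') {
                newlines++;
            }
        }
        return newlines;
    }
}

// Hypothetical driver: runs at most numThreads tasks at a time.
class ResultCollector {
    static List<Integer> countLines(List<File> files, int numThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (File file : files) {
                futures.add(executor.submit(new LineCountTask(file)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                // get() blocks until that task is done and rethrows its
                // failure wrapped in an ExecutionException
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<File> files = new ArrayList<>();
        for (String name : args) {
            files.add(new File(name));
        }
        System.out.println(countLines(files, 2));
    }
}
```

Results come back in the same order as the input list, because the Futures are read in submission order rather than completion order.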

tom
 
Joshua Cranmer

Motaz said:
I am working on file processing (a large number of files), which is a
totally independent task for each file. I want to speed up processing
using Java threads, but I do not want to overwhelm my processor (a Core 2
Duo) with all the threads at the same time.

Depending on what exactly you are doing, it seems to me that running
only two threads at a time will not maximize throughput. They could be
spending most of their time waiting around for disk I/O. On modern
processors, one thread per core is generally not optimal; I seem to
recall hearing somewhere that the right number is about 1.5-2 per core.

Motaz said:
I want to run 2 threads in parallel, wait until they finish, then run
the next 2 threads. Furthermore, each thread allocates a large amount of
memory, so I need to de-allocate the thread after collecting the result
from it.

As long as you watch who holds references to what, the GC will clear
memory up itself. I think it would generally be sufficient to make sure
that you don't keep references to the threads unless necessary--good
programming practices will likely confine the leaks.

Motaz said:
I would appreciate it if anyone could direct me to a similar example.
I would appreciate any guidelines and help.

The first thing to recommend is that you get a grasp of how to do
concurrent programming. If your tasks do not communicate with each other
or with the main program, you probably don't have any thread-safety issues.

The best way to actually implement this is probably with the
ExecutorService API, new in Java 5, as others have stated.
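
To make the threads-per-core rule of thumb concrete, here is a small sketch of sizing a fixed thread pool from the core count. The helper name poolSize and the factor 1.5 are assumptions for illustration; the 1.5-2 figure is only the half-remembered estimate mentioned above, not a measured value, so treat the factor as something to tune against your own workload.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {
    // Hypothetical helper: scales the core count by a tuning factor,
    // rounds to the nearest whole thread, and never returns less than 1.
    static int poolSize(int cores, double threadsPerCore) {
        return Math.max(1, (int) Math.round(cores * threadsPerCore));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // 1.5 is an assumed starting point for I/O-heavy tasks
        int size = poolSize(cores, 1.5);
        ExecutorService executor = Executors.newFixedThreadPool(size);
        System.out.println(cores + " cores -> " + size + " threads");
        // submit tasks here, then shut the pool down when done
        executor.shutdown();
    }
}
```

On a two-core machine, a factor of 1.5 gives a pool of 3 threads and a factor of 2 gives 4.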
 
Roedy Green

Tom Anderson said:
However, running your jobs in 1000 threads won't be any faster than
running them in 10 (although that might be faster than running them in 2,

The main case where a large number of threads buys you something is web
scraping, where your threads are mostly sitting waiting for socket I/O
to complete.
 
Roedy Green

Isn't that what NIO is for, so you need only one thread to read, and it
hands off the processing to an ExecutorService?

Yes. I've been told that is a less flexible, more efficient way to do
it.
 
Tom Anderson

So I assume that if you really have no interaction with the OS and
nothing that could block you, then number of threads = CPU cores should
be fine/perfect.

Close.

If you're making garbage, you actually want slightly fewer threads than
cores, because the collector will need some CPU time every so often. If
you have 4 cores, you want to run about 3.8 threads. Of course, this is
not possible in practice.

Also, bear in mind that 'no interaction with the OS' includes 'not using
any virtual memory' (in the sense of 'memory which is paged out') - a
thread which accesses a memory location which is paged out will trigger
disk IO, which will cause it to block. If your Java process fits
entirely in physical memory, this is not an issue.

tom
 
