memory leak

Rob Doug

Btw, what version of Ruby is this? IIRC there was a bug with
ruby 1.8.6 (2008-08-11 patchlevel 287) - it's strange, because I
updated it a week ago :)

Wow, when I used ruby 1.8.6, the maximum memory usage of the program was
500-600 MB... with ruby 1.8.7 it can easily exceed 1 GB.
 
Rob Doug

Well, it seems the error is not in my code. I wrote a simplified version
of the crawler in which there simply cannot be a mistake of mine, but the
program still leaks memory :( ... You can check for yourself; here is the
code (after 2000 random URLs, memory usage is >200 MB):

require 'rubygems'
require 'mechanize'
require 'thread'

mutex = Mutex.new

$n = 0

THREADS = 50
q = SizedQueue.new(THREADS * 2)

threads = (1..THREADS).map do
  Thread.new q do |qq|
    # The queue object itself serves as the stop marker (see below).
    until qq.equal?(myLink = qq.deq)
      mutex.synchronize do
        puts(($n += 1).to_s) # + " : " + print_class_counts.to_s
      end
      begin
        agent = WWW::Mechanize.new { |agent|
          agent.history.max_size = 1
          agent.open_timeout = 20
          agent.read_timeout = 40
          agent.user_agent_alias = 'Windows IE 7'
          agent.keep_alive = false
        }
        page = agent.get(myLink)
        puts myLink
        puts page.forms.length

        page.forms.each do |form|
        end
      rescue
      end
    end
  end
end


File.foreach("bases/base.txt") do |line|
  line.chomp!
  q.enq(line)
end

# One stop marker per thread.
threads.size.times { q.enq q }
sleep(120)

threads.each { |t| t.join }

It's very sad, because I like Ruby very much, but it seems it does not
fit my projects :(
 
Rob Doug

Well, it seems I found a solution...
I ran some tests in Python as well. The simple script posted previously
eats memory in Python too... and the only way around it was to use
forks. I tried forkoff, but it produced some strange bugs. This is the
working code:

threads = (1..THREADS).map do
  Thread.new q do |qq|
    until qq.equal?(myLink = qq.deq)
      mutex.synchronize do
        puts(($n += 1).to_s) # + " : " + print_class_counts.to_s
      end
      fork do # <----- You need to fork here; when the child exits, its memory is released
        begin
          agent = WWW::Mechanize.new { |agent|
            agent.history.max_size = 1
            agent.open_timeout = 20
            agent.read_timeout = 40
            agent.user_agent_alias = 'Windows IE 7'
            agent.keep_alive = false
          }
          page = agent.get(myLink)
          puts myLink
          puts page.forms.length

          page.forms.each do |form|
          end
        rescue
        end
      end
    end
  end
end
 
Robert Klemme

You create threads and fork a process for every single item to process.
This has some consequences:

- your threads will eat all the entries in the queue very quickly
- you will get a large number of processes immediately

In this setup you need neither threads nor a queue. Basically you
just need to iterate the input list and fork off a process for every
item you meet. However, then you do not have any control over
concurrency and your CPU will suffer. With the setup you presented you
should at least have threads wait for their processes to return so a
single thread does not fork off more than one process at a time.
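
As a rough illustration of iterating the input and forking per item while still keeping concurrency under control, here is a minimal sketch. The helper name each_in_fork, the limit of 2 and the example URLs are assumptions made up for this sketch and do not come from the code in this thread; the block body stands in for the real page fetch.

```ruby
# Hypothetical helper (not from the thread): run the block once per item,
# each time in a forked child, with at most `limit` children alive at once.
def each_in_fork(items, limit)
  running  = 0
  statuses = []
  items.each do |item|
    if running >= limit
      _pid, status = Process.wait2   # block until any one child exits
      statuses << status.exitstatus
      running -= 1
    end
    fork do
      yield item                     # per-item work happens in the child
      exit!(0)                       # leave without running at_exit hooks
    end
    running += 1
  end
  running.times do                   # reap the children still running
    _pid, status = Process.wait2
    statuses << status.exitstatus
  end
  statuses
end

# Placeholder work; in the crawler this would be the agent.get(url) call.
urls = %w[http://a.example http://b.example http://c.example
          http://d.example http://e.example]
statuses = each_in_fork(urls, 2) { |url| url.length }
```

Because the parent only reaps and forks, whatever memory a child allocates goes back to the operating system when that child exits, and no more than `limit` children run at the same time.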

Kind regards

robert
 
Rob Doug

Sure, you're right. I forgot to write it here, but in my own code there
is a Process.wait after each fork :)
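
For readers following along, the thread-plus-fork variant with a wait after each fork might look roughly like the sketch below. It is not the actual code referred to above: the worker count of 4, the :stop marker, the results queue and the example URLs are assumptions for illustration, and the child body is a placeholder for the Mechanize call.

```ruby
require 'thread'

worker_count = 4        # assumed value; the script in this thread used 50
stop         = :stop    # hypothetical end-of-work marker

queue   = Queue.new
results = Queue.new

workers = (1..worker_count).map do
  Thread.new do
    until (url = queue.deq) == stop
      pid = fork do
        # Placeholder for the real work, e.g. agent.get(url).
        url.length
        exit!(0)
      end
      # Wait for this thread's own child before taking the next URL,
      # so each thread has at most one child process at a time.
      _pid, status = Process.wait2(pid)
      results << [url, status.success?]
    end
  end
end

%w[http://a.example http://b.example http://c.example
   http://d.example http://e.example].each { |url| queue.enq(url) }
worker_count.times { queue.enq(stop) }   # one stop marker per thread
workers.each { |t| t.join }

finished = []
finished << results.deq until results.empty?
```

With this shape the number of live child processes never exceeds the number of worker threads.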
 
Eliza Sahoo

In a .NET Windows application, a form can have resource leaks even
though it runs as managed code. The procedure below can be used to check
whether a form leaks resources.

1. Open Windows Task Manager.
2. Click the Processes tab.
3. Select "View" in the menu, then select the "Select Columns" menu item.
4. Check the USER Objects and GDI Objects check boxes so that they appear
   as columns in the process list.
5. In the code in your project that launches the form, make sure the
   Dispose method is called. Forms implement the IDisposable interface,
   so their Dispose method must be called the moment they are no longer
   needed (Test 1). Dispose can be called explicitly or, even better,
   implicitly by instantiating the form in a using statement (Test 2).
6. For troubleshooting purposes, GC.Collect() can be called after the
   using statement or after the call to Dispose.
7. Launch the form and note the USER Objects and GDI Objects values in
   Task Manager. Close the form after some time and note the values
   again.
8. The values should have decreased; if they have increased instead,
   there is a leak in the form.
9. Fix the resource leak and remove the call to GC.Collect(). It is
   generally unnecessary to make an explicit call to GC.Collect().

http://www.mindfiresolutions.com/Checking-resource-leaks-in-a-form-272.php
 
