multithreaded file access

Matias Surdi

Hi...

I've a class which is run by many threads at the same time.... this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")


Is this already thread safe??? how can I make it so?????


Thanks...
 
Jellen

Well, I think it's OK to do that.
Seeing is believing:
# first one
a = Thread.new do
  5.times do
    # puts returns nil, so keep the File object itself in order to close it
    f = File.new("qq.txt", "a")
    f.puts "I am a..."
    f.close
  end
end
b = Thread.new do
  5.times do
    f = File.new("qq.txt", "a")
    f.puts "I am b..."
    f.close
  end
end
a.join
b.join

# and this one
f = File.new("test.txt", "a")
a = Thread.new do
  5.times do
    f.puts "I am a..."
    sleep 1
  end
end
b = Thread.new do
  5.times do
    f.puts "I am b..."
    sleep 1
  end
end
a.join
b.join
f.close


Both programs run OK. (But I am not sure myself :)
 
J. Ryan Sobol

Interesting examples, Jellen, but I don't think they answer Matias'
question, which was
File.new('filename','a').puts("this is the string")


Is this already thread safe??? how can I make it so?????

Correct me if I'm wrong, but your examples only prove that whichever
thread is currently running can append to the file. I *think* Matias
wants to know if the statement ( File.new('filename','a').puts("this is
the string") ) is atomic. Or in other words, do you need to enforce
mutually exclusive access to the file with a mutex? Unfortunately, I
don't have an answer to that question.
 
J. Ryan Sobol

On Dec 23, 2005, at 4:13 PM, Ilmari Heikkinen wrote:

[kig@jugend:~] cat fw_test.rb
def writer(i, fn, ok)
  Thread.new{
    t_str = "#{i}" * 65536
    while ok.first
      File.open(fn, 'a'){|f|
        f.puts t_str
      }
    end
  }
end

ok = [true]
fn = 'fw_test.dat'
ts = (1..3).map{|i| writer i, fn, ok}

sleep 10

ok[0] = false

if File.readlines(fn).uniq.size == ts.size
  puts "puts in different threads seems to be atomic"
else
  puts "puts in different threads isn't atomic"
end

File.unlink fn

[kig@jugend:~] ruby fw_test.rb
puts in different threads seems to be atomic

Crafty test program. Coincidentally, my results differ from yours.

$ ruby fw_test.rb
puts in different threads isn't atomic
$ ruby -v
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.3.0]

I'm getting 8 unique lines in File.readlines(fn).uniq, as opposed to
the 3 thread objects in ts. Assuming I understand the program, that
means threads are not waiting their turn like they should. FYI, I've
compiled my own ruby executable from DarwinPorts instead of using the
'broken' one shipped in OS X Tiger.

However, if I change the multiplication factor in line 3 to 655, then

$ ruby fw_test.rb
puts in different threads seems to be atomic

~ ryan ~

PS - I love the 205+ MB of text this app generates! :-D
 
J. Ryan Sobol

On Dec 23, 2005, at 5:23 PM, Ilmari Heikkinen wrote:

Good! I thought it was odd that they seemed to be atomic.

And now we know to guard disk writes with a mutex (or use flock).

As for the test passing once the factor is changed to 655: I don't know
the reason (I can guess, but I'm too unsure). Explanation, anyone?

Ilmari
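
Ilmari mentions flock as an alternative to a mutex. The following is only a rough sketch of that idea, not something posted in the thread: append_locked and the scratch file are made-up names, and flock is an advisory lock whose behaviour depends on the platform and file system. It is mostly useful when several processes write to the same file; inside a single process a Mutex is usually the simpler choice.

```ruby
require 'tempfile'

# append_locked is a hypothetical helper name used only for this sketch.
def append_locked(path, lines)
  File.open(path, 'a') do |f|
    f.flock(File::LOCK_EX)          # wait until we hold the exclusive advisory lock
    lines.each { |l| f.puts(l) }
    f.flush                         # write the data out while we still hold the lock
  end                               # closing the file releases the lock
end

log = Tempfile.new('flock_demo')    # scratch file for the demo
log.close

threads = %w[A B C].map do |name|
  Thread.new { 20.times { append_locked(log.path, ["#{name}-1", "#{name}-2"]) } }
end
threads.each(&:join)

lines = File.readlines(log.path, chomp: true)
puts "#{lines.size} lines written"
```

Every writer has to take the lock for this to help; a thread or process that writes without calling flock is not held back by it.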

Regardless, the important thing is to know that it seems file IO via
puts is not thread safe, which I kind of assumed from the beginning.
This begs the question: which methods, especially concerning file IO,
are thread safe? (if any)

~ ryan ~
 
Matias Surdi

J. Ryan Sobol wrote:
On Dec 23, 2005, at 5:23 PM, Ilmari Heikkinen wrote:
On Dec 23, 2005, at 4:13 PM, Ilmari Heikkinen wrote:
[snip]
Regardless, the important thing is to know that it seems file IO via
puts is not thread safe, which I kind of assumed from the beginning.
This begs the question: which methods, especially concerning file IO,
are thread safe? (if any)

~ ryan ~

Hey!!.... I'm now more confused than before posting :-D
 
Eero Saynatkari

Matias said:
Hey!!.... I'm now more confused than before posting :-D

You should guard your critical section :) The simplest way
is using Mutex from the core. Consult your ri or make a visit
to http://www.ruby-doc.org.
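
A minimal sketch of what guarding the critical section with a Mutex could look like. The helper name append_line, the scratch file, and the thread counts are assumptions made for the example; they are not from the original question.

```ruby
require 'tempfile'

# append_line is a hypothetical helper. Only one thread at a time can be
# inside the synchronize block, so lines from different threads cannot mix.
def append_line(lock, path, line)
  lock.synchronize do
    File.open(path, 'a') { |f| f.puts(line) }
  end
end

log = Tempfile.new('mutex_demo')   # scratch file so the demo leaves nothing behind
log.close

lock = Mutex.new                   # one mutex shared by every thread

threads = (1..4).map do |i|
  Thread.new { 50.times { |n| append_line(lock, log.path, "thread #{i} line #{n}") } }
end
threads.each(&:join)

lines = File.readlines(log.path, chomp: true)
puts "#{lines.size} lines written"
```

The important part is that all threads share the same Mutex object. A mutex created inside each thread would protect nothing.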


E
 
tony summerfelt

J. Ryan Sobol wrote on 12/23/2005 4:59 PM:
Crafty test program. Coincidentally, my results differ from yours.

i was in a similar situation.

i was under the impression from page 118 of my well thumbed copy of
'programming ruby' that any shared resource (eg. a file that will be
accessed by more than one thread) should be under the control of a mutex...

the program i was working on is long running (and has been in use for
almost a year) and seems to be holding up ok...
 
gwtmp01

Matias Surdi wrote:
I've a class which is run by many threads at the same time.... this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

On POSIX file systems, writes to a file in append mode are
atomic, but that is only true when you make a direct system
call. In Ruby that would be a call to IO#syswrite, which
bypasses all the standard buffering of puts and company.
You can't mix and match these types of IO calls on the same
file handle. It is one or the other.
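
A small sketch of the IO#syswrite approach, under the assumption that each thread opens its own append-mode handle and writes one short line per call. The scratch file and the line format are made up for the example, and the atomicity itself is the operating system's behaviour, not something Ruby adds.

```ruby
require 'tempfile'

log = Tempfile.new('syswrite_demo')   # scratch file for the demo
log.close

threads = (1..3).map do |i|
  Thread.new do
    # Each thread has its own handle and never uses puts/print/write on it.
    File.open(log.path, File::WRONLY | File::APPEND) do |f|
      20.times { |n| f.syswrite("thread #{i} line #{n}\n") }
    end
  end
end
threads.each(&:join)

lines = File.readlines(log.path, chomp: true)
puts "#{lines.size} lines, #{lines.uniq.size} distinct"
```

Note that syswrite returns the number of bytes written, and a careful program would check that value, because a write can in principle be cut short.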

Caveats: I think there is a limit on the size of a write
that is guaranteed to be atomic. The Unix system calls
pathconf and fpathconf give you access to the PIPE_BUF limit,
which specifies this limit for *pipes*. I'm not sure if there
is a similar limit for files. I just couldn't locate anything
specific in my quick research.
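
For what it's worth, newer Ruby versions expose fpathconf through IO#pathconf in the etc library, so the pipe limit can be looked up directly. This is only a sketch: it assumes a POSIX platform where Etc::PC_PIPE_BUF is defined, and it says nothing about regular files.

```ruby
require 'etc'

# Ask the OS for the PIPE_BUF limit of a freshly created pipe.
# pathconf returns an Integer, or nil if the limit is indefinite.
pipe_buf = IO.pipe do |reader, writer|
  writer.pathconf(Etc::PC_PIPE_BUF)
end

puts "PIPE_BUF for this pipe: #{pipe_buf.inspect} bytes"
```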

Related Question: Does Ruby have its own buffering methodology
or does it use the C stdio library buffering for file I/O?
I just don't know enough about the Ruby internals to answer
this question.

Gary Wright
 
Yohanes Santoso

Matias Surdi said:
Hi...

I've a class which is run by many threads at the same time.... this
class has to append a line to a text file eventually.

I do this with:

File.new('filename','a').puts("this is the string")

Don't do this. It's not an atomic operation, with either buffered or
unbuffered IO.

Is this already thread safe???

No.

how can I make it so?????

If thread A outputs two lines, and thread B outputs one line, would
this be acceptable?

A-1
B-1
A-2

Or should it be:

A-1
A-2
B-1

How granular do you want it? Per line? Per thread output?

Please investigate the mutex library.

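file_mutex = Mutex.new              # create once and share it between all threads
file = File.open('filename', 'a')   # opened once; every thread writes through it

file_mutex.synchronize {
  file.puts(".....")   # the two lines stay together as long as
  file.puts(".....")   # every writer goes through file_mutex
}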

YS.
 
Robert Klemme

Apart from the multithreading issues, this code has a different, serious
problem: you do not close the file (and you cannot, because puts does not
return the IO object). You rather want

File.open('filename','a') {|io| io.puts("this is the string")}

Otherwise you risk that the text is not written to the file in the proper
order, or you may even have problems opening the file multiple times.

There are two possible solutions: synchronize every access to the file, or
use a Queue and a separate writing thread (which might perform better,
because you don't have to reopen the file all the time).
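
The Queue variant could look roughly like this. It is a sketch only: the scratch file, the number of producer threads, and the use of nil as a stop marker are assumptions made for the example.

```ruby
require 'thread'     # needed for Queue on older Rubies, harmless on current ones
require 'tempfile'

log = Tempfile.new('queue_demo')   # scratch file for the demo
log.close

queue = Queue.new                  # thread-safe FIFO

# The single writer thread is the only code that touches the file,
# and it opens the file just once.
writer = Thread.new do
  File.open(log.path, 'a') do |f|
    while (line = queue.pop)       # pop blocks until a line arrives; nil means stop
      f.puts(line)
    end
  end
end

producers = (1..4).map do |i|
  Thread.new { 25.times { |n| queue << "thread #{i} line #{n}" } }
end
producers.each(&:join)

queue << nil                       # stop marker: no more lines are coming
writer.join

lines = File.readlines(log.path, chomp: true)
puts "#{lines.size} lines written"
```

The worker threads never wait on the disk here; they only push strings onto the queue. The price is that a line is not on disk yet when the push returns.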

Kind regards

robert
 
