Conducting Regular Expressions on a text file

Peter Marks

I'm trying to find and replace the string "placeholder" in a text
file. Any idea how I could get the following code to work?

file = File.open('/folder/template.txt', 'r+')
file.gsub!(/placeholder/, "word")

Thanks,

Peter
 
Logan Capaldo

The most "natural" way would be to use ruby-mmap.
http://raa.ruby-lang.org/project/mmap/

The other method would be of course:

File.open("source.txt", "r") do |src|
  File.open("sink.txt", "w") do |sink|
    src.each_line { |line| sink.write(line.gsub(/placeholder/, "word")) }
  end
end

require 'fileutils'
FileUtils.mv("sink.txt", "source.txt")

This requires twice the space of the file while working so it may not
be viable for large files.
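
When the file is small enough to hold in memory, a simpler variant is to read it whole, substitute, and write it back. This is only a minimal sketch under that assumption; replace_in_file is a hypothetical helper name, not something Ruby provides.

```ruby
# Hypothetical helper: replaces every match of `pattern` in the file at `path`.
# Assumes the whole file fits comfortably in memory.
def replace_in_file(path, pattern, replacement)
  text = File.read(path)                             # slurp the entire file
  File.write(path, text.gsub(pattern, replacement))  # overwrite with the result
end
```

It could be called as replace_in_file("/folder/template.txt", /placeholder/, "word"). The file is overwritten, so keep a copy if the template must be preserved.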

Incidentally, Ruby has a shortcut for doing this:

ruby -pi -e '$_.gsub!(/placeholder/, "word")' /folder/template.txt

To handle those really big files where you simply cannot afford to
have two copies simultaneously, or don't have the address space to map
the whole file into memory with mmap, you can read in fixed-size
chunks, modify them in memory and then write them back out. It's a bit
more complicated than "fixed-size" chunks since the transformed text
may be smaller or larger than the original chunk.
 
Peter Marks

Thanks for your thoughtful responses, Logan and Giles. I'm going to use
the unnatural method for the time being since I'm working with
relatively small files (< 200 KB) and I actually need another copy of the
file anyway to preserve the template. However, I'll look into ruby-mmap
when I get more comfortable with all of this (I'm still beginning with
Ruby).

Thanks again :),

Peter
 
Peter Marks

Follow up question:

How might I replace multiple placeholders within a text file? I
thought this would work:

File.open("source.txt", "r") do |src|
  File.open("sink.txt", "w") do |sink|
    src.each_line { |line|
      sink.write(line.gsub(/placeholder/, "word"))
      sink.write(line.gsub(/placeholder2/, "word2"))
    }
  end
end

, but my newbie assumptions were mistaken. Any ideas?

Thanks,

Peter
 
J-H Johansen

Nearly there though ;-)
Your version writes every line twice, once for each substitution.
I think you can chain all the gsub calls together to get that effect.

File.open("source.txt", "r") do |src|
  File.open("sink.txt", "w") do |sink|
    src.each_line { |line|
      sink.write(line.gsub(/placeholder/, "word").gsub(/placeholder2/, "word2"))
    }
  end
end

You could also create a Hash of placeholder => word in case you have
a lot of things that need to be changed.
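
One way the Hash idea might look is sketched below. The helper name fill_placeholders and the sample mapping in the usage note are assumptions made for illustration. Passing a Hash as the second argument to gsub makes it look up each match as a key. The keys are sorted longest first because "placeholder" is a prefix of "placeholder2" and would otherwise match first.

```ruby
# Hypothetical helper: replaces every key of `replacements` found in `line`
# with the corresponding value.
def fill_placeholders(line, replacements)
  # Longer keys first, so "placeholder2" is not read as "placeholder" + "2".
  keys = replacements.keys.sort_by { |key| -key.length }
  pattern = Regexp.union(keys)
  # With a Hash as the second argument, gsub uses each match as a lookup key.
  line.gsub(pattern, replacements)
end
```

For example, fill_placeholders(line, "placeholder" => "word", "placeholder2" => "other") could replace the chained gsub calls inside the each_line block.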
 
Peter Marks

J-H Johansen said:
You could also create a Hash for placeholder => word in case you have
a lot of things that needs to be changed.

Thanks J-H, that works fine. I do have a lot of these, however, so a
hash is probably the way to go. This is what I'll be doing:

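# Placeholders appear in the file as [#name#], e.g. [#placeholder#].
rehash = {
  "placeholder" => "word",
  "placeholder2" => "other",
}

File.open('test.txt') do |file|
  File.open("sink.rtf", "w") do |sink|
    while line = file.gets
      # $1 is the name captured between [# and #]
      sink.write line.gsub(/\[#([^#]*)#\]/) { rehash[$1] }
    end
  end
end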
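
One thing to be aware of with the code above: a marker whose name is not in the hash is replaced with an empty string, because the lookup returns nil. If leaving unknown markers untouched is preferable, a sketch along these lines might help. The helper name fill_markers is hypothetical.

```ruby
# Hypothetical helper: fills [#name#] markers in `line` from `table`.
def fill_markers(line, table)
  line.gsub(/\[#([^#]*)#\]/) do |marker|
    # fetch with a block falls back to the original marker text
    # when the captured name is not a key in the table.
    table.fetch($1) { marker }
  end
end
```

Leaving unknown markers in place makes typos in the template easier to spot in the output file.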
 
Mark Wilden

To handle those really big files where you simply cannot afford to
have two copies simultaneously, or don't have the address space to map
the whole file into memory with mmap, you can read in fixed-size
chunks, modify them in memory and then write them back out. It's a bit
more complicated than "fixed-size" chunks since the transformed text
may be smaller or larger than the original chunk.

The other thing to watch out for is if the search text spans chunks.

///ark
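
A rough sketch of how the chunk-boundary problem could be handled is below. It is not an in-place edit: it streams from one IO object to another and only supports a literal search string, not a regular expression. The name stream_replace and the default chunk size are assumptions. The idea is to hold back the last (search length - 1) bytes of each buffer, since a match could begin there and finish in the next chunk.

```ruby
# Hypothetical helper: copies `input` to `output`, replacing each occurrence
# of the literal string `search` with `replacement`. Works on raw bytes.
def stream_replace(input, output, search, replacement, chunk_size = 64 * 1024)
  raise ArgumentError, "search must not be empty" if search.empty?

  search = search.b            # compare as bytes, matching what read returns
  replacement = replacement.b
  keep = search.length - 1     # longest possible partial match at a buffer end
  buffer = String.new          # binary buffer carried between chunks

  while (chunk = input.read(chunk_size))
    buffer << chunk
    pos = 0
    # Replace every complete match currently in the buffer.
    while (i = buffer.index(search, pos))
      output.write(buffer[pos...i])
      output.write(replacement)
      pos = i + search.length
    end
    # Emit what cannot be part of a match; carry the tail into the next round.
    rest = buffer[pos..-1]
    cut = [rest.length - keep, 0].max
    output.write(rest[0, cut])
    buffer = rest[cut..-1]
  end

  output.write(buffer)         # whatever is left at end of input has no match
end
```

It could be used with two File objects opened in binary mode, for example File.open("source.txt", "rb") and File.open("sink.txt", "wb").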
 
William James

Incidentally, Ruby has a shortcut for doing this:

ruby -pi -e '$_.gsub!(/placeholder/, "word")' /folder/template.txt

To handle those really big files where you simply cannot afford to
have two copies simultaneously,

I think that Ruby makes a temporary copy of the file.
 
