reading file after a particular line in file

Vandana · May 12, 2010

Hello All,

I would like to read a file in Ruby. It is a 2 GB file, but the
beginning portion of the file contains useless data.

There is a particular pattern towards the middle of the file, after
which the useful data begins. Is there a way to grep for this pattern and
then read every line from there on, ignoring all lines before the line
on which the pattern is found?

Thanks,
Vandana

Thomas Volkmar Worm · May 13, 2010

File.open("myfile", "r") do |f|
  # Skip the garbage before the pattern (stops at end of file if it never matches):
  while (l = f.gets) && l !~ /pattern/ do; end

  # Read your data (the lines after the matching one):
  while l = f.gets
    puts l
  end
end
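
The same skip-then-read idea can be wrapped in a method. This is only a sketch: the helper name `each_line_after` and the sample file contents are assumptions for illustration, not something from the thread. Like the loop above, it does not yield the matching line itself, and it yields nothing if the pattern never occurs.

```ruby
require "tempfile"

# Hypothetical helper: yields every line that follows the first line
# matching `pattern`. The matching line itself is not yielded.
def each_line_after(path, pattern)
  File.open(path, "r") do |f|
    # Skip lines until one matches; the loop also ends at end of file.
    while (line = f.gets)
      break if line =~ pattern
    end

    # Stream the rest one line at a time, so the file is never
    # loaded into memory as a whole.
    while (line = f.gets)
      yield line
    end
  end
end

# A small temporary file standing in for the 2 GB one.
SAMPLE = Tempfile.new("sample")
SAMPLE.write("junk 1\njunk 2\n== DATA ==\nrow 1\nrow 2\n")
SAMPLE.close

each_line_after(SAMPLE.path, /== DATA ==/) { |line| puts line }
```

Because both loops use `gets`, only one line is held at a time, which matters for a file of this size.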

Vandana · May 13, 2010

Thank you very much.

Robert Klemme · May 13, 2010

There's also the flip flop operator:

File.foreach "myfile" do |line|
  if /pattern/ =~ line .. false
    puts line
  end
end

The trick I am using is that the flip-flop operator starts returning
true once the first expression is true and stays true until the second
expression is true. In this case that never happens, since you want to
read until the end of the file.
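
A small in-memory sketch of that behaviour, assuming a hypothetical helper `lines_from` and made-up sample lines. Note that, unlike the `gets` loop earlier in the thread, the flip-flop version includes the matching line itself.

```ruby
# Hypothetical helper: returns the lines from the first match of
# `start` through the end of the input, matching line included.
def lines_from(lines, start)
  kept = []
  lines.each do |line|
    # The flip-flop turns on at the first line where the left-hand
    # test succeeds and would turn off only once the right-hand side
    # is true, which the literal `false` never is.
    kept << line if (line =~ start)..false
  end
  kept
end

SAMPLE_LINES = ["junk 1\n", "== DATA ==\n", "row 1\n", "row 2\n"]
puts lines_from(SAMPLE_LINES, /== DATA ==/)
```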

Kind regards

robert

Une Bévue · May 13, 2010

Could that trick be used for start and stop tags? Like:

File.foreach "myfile" do |line|
  if /<body/ =~ line .. /<\/body/ =~ line
    puts line
  end
end

If so, that's clever!

Robert Klemme · May 13, 2010

Yes, that could be done. However, I would not use this for languages
from the SGML family (XML, HTML) because there are no guarantees as to
how many tags you'll find on a single line of text. There are better
tools to deal with that (REXML, Nokogiri...).
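
To illustrate the caveat, here is a sketch with a made-up one-line HTML document (the helper name `body_lines` is hypothetical). Both tags sit on the same line, so the line-based flip-flop selects the whole line, including everything outside the body.

```ruby
# Hypothetical helper: line-based selection between <body and </body.
def body_lines(text)
  selected = []
  text.each_line do |line|
    selected << line if (line =~ /<body/)..(line =~ /<\/body/)
  end
  selected
end

# Made-up input: the entire document is a single line of text.
ONE_LINE_HTML =
  "<html><head><title>t</title></head><body><p>hi</p></body></html>\n"

# The <head> section is printed too, because the selection can only
# work on whole lines.
puts body_lines(ONE_LINE_HTML)
```

A real parser works on elements rather than lines, so it does not depend on where the line breaks fall.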

Kind regards

robert

Une Bévue · May 13, 2010

Right, however REXML doesn't work with badly balanced tags.
I did some tests of Nokogiri today, and it works even better than tidy
for the first step of cleaning up unbalanced tags.

The only question I have about Nokogiri is how to avoid the DOCTYPE,
because it outputs:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">

even if I'm using #to_xhtml:

Then the DOCTYPE is wrong...
