newbie question: how to seek in a file

Ak 756

Hi

I want to find a string in a text file, then read some bytes of data after
the position of this string. Could anybody kindly tell me how to do
this?

Thanks in advance.
 
Lutz Horn

Hi,

Ak said:
I want to find a string in a text file, then read some bytes data after
the position of this string. Would anybody kindly help to tell me how to
do?

You could do something like this:

File.open("my-file.txt") do |f|
  i = f.read.index(/search/)
  f.seek(i)
  puts f.read(10)
end

Lutz
 
Konrad Meyer

Ak said:
I want to find a string in a text file, then read some bytes data after
the position of this string. Would anybody kindly help to tell me how to
do?

If this is what you really want, you can do something like:

READ_BYTES = 16
my_file_contents = File.open('foo'){|f| f.read }
my_file_contents =~ /find this string(.{#{READ_BYTES}})/
my_data = $1

Of course, this is no good for large files. Generally if you're trying to
extract data from a large text file though, you should already know where it
is (constant width records) or you have to parse through everything before
the record you want to get at the one you do (csv, xml?).
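
For the constant-width case, a minimal sketch of jumping straight to a record with IO#seek might look like the following. The 16-byte record size and the read_record helper are assumptions made up for illustration; they are not from the posts above.

```ruby
# Hypothetical layout: every record is exactly RECORD_SIZE bytes long.
RECORD_SIZE = 16

# Return record number n (zero-based) without reading the records before it.
def read_record(path, n)
  File.open(path, "rb") do |f|
    f.seek(n * RECORD_SIZE)  # jump to the first byte of record n
    f.read(RECORD_SIZE)      # nil if n is past the end of the file
  end
end
```

Because the offset is computed rather than searched for, the cost does not depend on how large the file is.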

If this is for a configuration file or some sort of semi-static storage, I'd
recommend using YAML or Marshal instead of making your own parser.
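
As a rough illustration of the YAML route, the sketch below saves a small settings hash and loads it back. The helper names and the idea of a "settings" hash are hypothetical; only the yaml standard library calls are real.

```ruby
require "yaml"

# Write a plain Hash (strings, numbers, arrays, nested hashes) as YAML.
def save_settings(path, settings)
  File.write(path, settings.to_yaml)
end

# Read the YAML file back into a Ruby Hash.
def load_settings(path)
  YAML.load_file(path)
end
```

With this in place there is nothing to search or seek for: you look values up by key, for example load_settings("settings.yml")["read_bytes"].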

Cheers!
 
Konrad Meyer

Lutz said:
You could do something like this:

File.open("my-file.txt") do |f|
  i = f.read.index(/search/)
  f.seek(i)
  puts f.read(10)
end

At that point though, you've already read the entire contents of the file
into memory (f.read()) so there's no point going back and seeking in the
file when:
* Seeking from memory should always be faster than from disk
* You're already ignoring the memory issues for large files
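
If memory does matter, one way to make the seek worthwhile is to scan the file in chunks instead of calling f.read on the whole thing. This is only a sketch under assumptions: the read_after name, the 64 KB default chunk size and the byte-wise comparison are my own choices, not something from the earlier posts.

```ruby
# Find needle in the file at path and return up to count bytes that follow it.
# Returns nil if needle is not found. Only about one chunk is held in memory.
def read_after(path, needle, count, chunk_size = 64 * 1024)
  needle = needle.b                      # compare raw bytes
  File.open(path, "rb") do |f|
    tail = "".b                          # bytes kept from the previous chunk
    base = 0                             # file offset where the window begins
    while (chunk = f.read(chunk_size))
      window = tail + chunk
      if (i = window.index(needle))
        f.seek(base + i + needle.bytesize)  # first byte after the match
        return f.read(count)                # shorter, or nil, at end of file
      end
      # Keep the last needle.bytesize - 1 bytes so a match that straddles
      # two chunks is still found on the next pass.
      keep = [needle.bytesize - 1, window.bytesize].min
      tail = window.byteslice(window.bytesize - keep, keep)
      base += window.bytesize - keep
    end
    nil
  end
end
```

Unlike the snippet quoted above, this seeks to the end of the match rather than its start, so the bytes returned are the ones after the search string.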
 
Konrad Meyer

Konrad said:
If this is what you really want, you can do something like:

READ_BYTES = 16
my_file_contents = File.open('foo'){|f| f.read }
my_file_contents =~ /find this string(.{#{READ_BYTES}})/
my_data = $1

Better yet:
READ_BYTES = 16
content = File.open('foo'){|f| f.read }
my_data = content[content.index("find this string"), READ_BYTES]
# I tried to combine Lutz's and my earlier ideas

Though this of course assumes the string is found.
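
A small variation that does not assume the string is found, and that starts reading after the search string rather than at it, could look like this. The data_after helper is a hypothetical name, and it counts characters of an in-memory String, which equal bytes only for single-byte text.

```ruby
READ_BYTES = 16

# Return up to count characters that follow needle in content,
# or nil when needle does not occur at all.
def data_after(content, needle, count = READ_BYTES)
  start = content.index(needle)
  return nil if start.nil?               # search string not present
  content[start + needle.length, count]  # skip past the search string itself
end
```

Usage would be something like data_after(File.read('foo'), "find this string").
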
Of course, this is no good for large files. Generally if you're trying to
extract data from a large text file though, you should already know where it
is (constant width records) or you have to parse through everything before
the record you want to get at the one you do (csv, xml?).

If this is for a configuration file or some sort of semi-static storage, I'd
recommend using YAML or Marshal instead of making your own parser.

 
Ak 756

Konrad said:
my_data = $1
Better yet:
READ_BYTES = 16
content = File.open('foo'){|f| f.read }
my_data = content[content.index("find this string"), READ_BYTES]
# I tried to combine Lutz's and my earlier ideas

Though this of course assumes the string is found.

I don't care about huge files at present, and this method works for me.
Thanks Konrad and Lutz.
 
Alex Young

Ak said:
Konrad said:
my_data = $1
Better yet:
READ_BYTES = 16
content = File.open('foo'){|f| f.read }
my_data = content[content.index("find this string"), READ_BYTES]
# I tried to combine Lutz's and my earlier ideas

Though this of course assumes the string is found.

I don't care about huge file at present and this method works for me.
Thanks Konrad and Lutz.

I believe you can also do this with RExpect, but I'm on Windows right
now and can't test it out.
 
Robert Klemme

On 2007/8/10, Ak said:
Konrad said:
my_data = $1
Better yet:
READ_BYTES = 16
content = File.open('foo'){|f| f.read }
my_data = content[content.index("find this string"), READ_BYTES]

But be careful, because #index returns the starting position of the
string searched for, so this slice starts with the search string itself
rather than with the data after it.

Ak said:
I don't care about huge file at present and this method works for me.
Thanks Konrad and Lutz.

You can even do it in one line:

bytes = File.read("foo")[/your_string(.{10})/, 1]

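This reads the file into one String, does one regexp match and returns
the contents of the capturing group, which in this case contains the
10 characters that follow your_string. Note that . does not match a
newline unless you add the /m flag, and that characters are the same as
bytes only in a single-byte encoding.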
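
If the data after the string may contain newlines or non-text bytes, a hedged variant of the same idea is to read the file in binary mode and use the /m flag. The bytes_after helper is a made-up name, and the sketch assumes the search string is plain ASCII.

```ruby
# Return exactly count bytes following needle, or nil if needle is missing
# or fewer than count bytes follow it. Assumes needle is plain ASCII.
def bytes_after(path, needle, count)
  # Regexp.escape keeps characters such as "." or "(" in needle literal;
  # /m lets "." match newline bytes as well.
  pattern = /#{Regexp.escape(needle)}(.{#{count}})/m
  File.binread(path)[pattern, 1]
end
```

Like the one-liner, this still loads the whole file into memory, so it suits small files.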

Kind regards

robert
 
