Idiomatic file snarf

Tim Bray

I want to open a file, suck the contents into a variable and close it
again. I suppose I could IO#each and glue the lines together.

Currently I have

f = File.new(fname)
text = f.read
f.close

Surely there's an idiomatic Ruby one-liner? I must be looking in the
wrong place @ruby-doc.org -Tim
 
James Edward Gray II

Sure:

text = File.read(fname)

James Edward Gray II
 
Thomas Adam

some_variable = File.new(fname).readlines

-- Thomas Adam
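
Worth noting when comparing this with File.read: readlines gives back an Array of lines rather than one String, and File.new(fname).readlines never closes the handle explicitly (it stays open until the File object is garbage collected). A minimal sketch of the difference, using the class methods, which open and close the file themselves. slurp_lines and slurp_text are made-up names for illustration only:

```ruby
require "tempfile"

# Hypothetical helper: returns the file as an Array of lines, line
# endings kept. File.readlines opens and closes the file itself.
def slurp_lines(path)
  File.readlines(path)
end

# Hypothetical helper: returns the whole file as a single String.
def slurp_text(path)
  File.read(path)
end

# Usage with a throwaway file.
Tempfile.create("snarf") do |tmp|
  tmp.write("one\ntwo\n")
  tmp.flush
  p slurp_lines(tmp.path)  # an Array with one element per line
  p slurp_text(tmp.path)   # one String holding the whole file
end
```

Joining the Array (slurp_lines(path).join) gives the same String as File.read, so the choice mostly depends on whether the lines are wanted separately.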
 
Eero Saynatkari

I want to open a file, suck the contents into a variable and close it
again. I suppose I could IO#each and glue the lines together.

The 'correct' way is:

text = File.read file_name

Currently I have

f = File.new(fname)
text = f.read
f.close

A better way for the above is this:

text = File.open(file_name) {|f| f.read}

 
John Carter

Actually, the idiom I most use is

File.read( fname).scan( %r{ juicy stuff}x) do |match|
  # do something with juicy stuff
end



John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

"We have more to fear from
The Bungling of the Incompetent
Than from the Machinations of the Wicked." (source unknown)
 
Devin Mullins

Eero said:

A better way for the above is this:

text = File.open(file_name) {|f| f.read}

To expand on that: the form that takes a block automatically calls
f.close when the block is done. What's more, it does it in an *ensure
clause*, you scr1pt k1ddie.

Also, Kernel#open exists. So:

text = open(file_name) {|f| f.read}

What's more, if you're dealing with files on a linely basis,
File.include? Enumerable, so you can do fun things like:

matches = open(file_name) {|f| f.grep(/juicy stuff/)}

to return an Array of matching lines without having to bring the whole
file into memory. (Though, if you don't care about memory,
File.readlines(file_name).grep /juicy stuff/ is prettier.)
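
To make the ensure point concrete, here is a rough sketch of what the block form saves you from writing by hand, plus a check that the handle really is closed when the block raises. The method names (read_with_ensure, closed_after_error?, matching_lines) are invented for this example:

```ruby
require "tempfile"

# Roughly what File.open(path) { |f| f.read } does on your behalf:
# the close sits in an ensure clause, so it runs even on an exception.
def read_with_ensure(path)
  f = File.open(path)
  begin
    f.read
  ensure
    f.close
  end
end

# Raises inside the block, then reports whether the handle was closed.
def closed_after_error?(path)
  handle = nil
  begin
    File.open(path) do |f|
      handle = f
      raise "boom"
    end
  rescue RuntimeError
    # the error still propagates out of File.open; we only swallow it here
  end
  handle.closed?
end

# Line-at-a-time grep via Enumerable: only matching lines are kept.
def matching_lines(path, pattern)
  File.open(path) { |f| f.grep(pattern) }
end

# Usage with a throwaway file.
Tempfile.create("ensure_demo") do |tmp|
  tmp.write("juicy stuff here\nnothing to see\n")
  tmp.flush
  p closed_after_error?(tmp.path)
  p matching_lines(tmp.path, /juicy stuff/)
end
```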

Devin
 
John Carter

Just remembered, I have an old RCR lying around on this.
Don't forget to vote for...
http://rcrchive.net/rcr/show/332


Currently there exist two very useful functions in ruby.

IO.read( file_name) reads in the entire file into a string.

string.scan( regexp){|match| } scans the entire string for regexp yielding matches.

The limit on doing...

IO.read(file_name).scan( regexp)

is the size of your machine's unused physical memory.

Unix has the very handy facility called mmap that allows one to memory
map an entire file and the contents of that file appears mapped into
your virtual address space.

The operating system handles all the fuss and bother of reading (and
forgetting) pages of that file into memory.

Thus it would be very easy to create a mmap'd version, semantically the
same as the following function...

def IO.scan( file_name, regexp, &block)
  IO.read(file_name).scan( regexp, &block)
end

But being mmap'd, it could handle files (almost) up to 4GB in size.

Problem

IO.read(file_name).scan(regexp) is limited to the available physical
memory on your system.

Proposal
Reimplement...

def IO.scan( file_name, regexp, &block)
  IO.read(file_name).scan( regexp, &block)
end

to use unix mmap.
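
Until something mmap-based exists, a streaming scan gets part of the way there. The sketch below is not the mmap implementation proposed here: it swaps in File.foreach and scans one line at a time, so memory use stays roughly proportional to the longest line, but a match that spans a line break will be missed. scan_file_by_line is a hypothetical name, not an existing IO method.

```ruby
require "tempfile"

# Hypothetical helper (not part of IO, and not mmap-based): scans a file
# line by line instead of reading it all into one String first.
# Assumption: no match needs to span a line break.
def scan_file_by_line(file_name, regexp, &block)
  # Without a block, hand back an Enumerator over the matches.
  return enum_for(:scan_file_by_line, file_name, regexp) unless block

  File.foreach(file_name) { |line| line.scan(regexp, &block) }
  nil
end

# Usage with a throwaway file.
Tempfile.create("scan_demo") do |tmp|
  tmp.write("alpha 1\nbeta 22\ngamma 333\n")
  tmp.flush
  p scan_file_by_line(tmp.path, /\d+/).to_a  # => ["1", "22", "333"]
end
```

As with String#scan, a regexp with capture groups yields arrays of groups rather than whole-match strings.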

Analysis

No language-level change, merely an extension to the existing io.c.

Implementation

Here is some example code.

http://www.cs.purdue.edu/homes/fahmy/cs503/mmap.txt

Where they do the second mmap and the memcpy, we would do the regexp scan.

So that would have to be mashed together with io_read in io.c and
rb_str_scan in string.c

Hmm. Just thinking. Before STL existed I did my own template library in
C++. One of the most useful features was I could mmap a string to a file
and thereafter the entire file behaved as an ordinary string.

The alternate to this RCR would be something that hacked the internal
representation of a ruby string so that the data pointed to was mmap'd.

Now I can think of _many_ uses for that.

However, that would be a far harsher change on the string class and GC
system. Thinking on that a bit more.

One of the Grand Unifying Principles of Unix is...

"Everything (graphics card, directories, sockets, network cards, ....)
is a file, and a File is just a stream of Bytes."

Repeat that until it's firmly stuck in your head.

Now take one small step further.

A stream of bytes is just a (possibly mmap'd) String.

Doesn't that make life really really simple?

Existing implementations!

Similar idea discussed here:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/7673

Implementation for Unix here... http://moulon.inra.fr/ruby/mmap.html

Implementation for Win32 here... http://rubyforge.org/projects/win32utils/


 
