Write to specified line number of file

Aditya Rajgarhia · Jul 8, 2006

Hi,

I want to read a file and write its lines to another file in random order.
I saw the method File.lineno= but it doesn't seem to work. I just get
the same file written as the original, without the lines having been
randomized.

I suppose it should be simple, but am new to Ruby. Any help would be
great.

Thanks,

Aditya

M. Edward (Ed) Borasky · Jul 9, 2006

1. How big is the file?

2. Do you want sampling without replacement (shuffle the original file
keeping the lines intact) or sampling with replacement (n lines randomly
chosen from the file)?

I'm going to assume that 1 is "too big for a considerate programmer to
read all into memory on a shared machine" and 2 is "without replacement
(shuffling)". I'm also going to assume that you're on some form of UNIX
machine that has the "sort" verb. For Windows, that could mean Cygwin.

So what you want to do is make a copy of the file with random numbers
tacked on to the front of each line. Then sort the tagged copy
numerically using the external "sort" verb, and remove the tags from the
sorted copy. You'll be doing everything in Ruby except the sort.
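
The tag / external sort / untag steps above can be sketched roughly as follows. This is only a sketch under assumptions: the helper names (tag_lines, untag_lines, shuffle_with_external_sort) are made up for illustration, the tag is a random integer followed by a tab, and a POSIX-style "sort" that accepts -n is assumed to be on the PATH.

```ruby
require 'tempfile'

# Prefix each line with a random integer key and a tab (the "tag").
def tag_lines(input, output)
  input.each_line { |line| output.print(rand(2**32), "\t", line) }
end

# Strip the key added by tag_lines, leaving the original line.
def untag_lines(input, output)
  input.each_line { |line| output.print(line.sub(/\A\d+\t/, '')) }
end

# Hypothetical helper: write a shuffled copy of src_path to dst_path,
# letting the external sort do the heavy lifting on disk.
def shuffle_with_external_sort(src_path, dst_path)
  Tempfile.create('tagged') do |tagged|
    File.open(src_path) { |src| tag_lines(src, tagged) }
    tagged.flush
    # -n compares the leading numeric key; lines that happen to share a
    # key fall back to sort's own tie-breaking, a small bias accepted here.
    IO.popen(['sort', '-n', tagged.path]) do |sorted|
      File.open(dst_path, 'w') { |dst| untag_lines(sorted, dst) }
    end
  end
end
```

Only one line is held in memory at a time on the Ruby side; the temporary tagged copy does need about as much disk space as the original file.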

If the answer to 1 is "small enough to fit into memory", just read the
file into memory, tag the lines with random numbers, and use a Ruby
"sort" to do the sorting, then untag the lines and write out the file.
You'll be doing everything in Ruby.
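
For the in-memory case, the same tag / sort / untag idea might look like the sketch below. The names shuffle_lines and shuffle_file are hypothetical, and the whole file is assumed to fit comfortably in memory.

```ruby
# Shuffle an array of lines by tagging, sorting, and untagging.
def shuffle_lines(lines)
  tagged = lines.map { |line| [rand, line] }   # tag each line with a random float
  sorted = tagged.sort_by { |key, _| key }     # sort on the tag only
  sorted.map { |_, line| line }                # drop the tags again
end

# Hypothetical file-level wrapper: read everything, shuffle, write it out.
def shuffle_file(src_path, dst_path)
  File.write(dst_path, shuffle_lines(File.readlines(src_path)).join)
end
```

Newer Ruby versions also ship Array#shuffle, which makes the explicit tagging unnecessary; the long form is shown only because it mirrors the steps described above.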

By the way, I do this sort of thing rather often. The files in question
are data files that drive performance test scripts. They're small (under
65536 lines), so I just read them into Excel, tack on a random column,
sort on the random column, delete the random column, and write the file
back out.

If you want sampling *with* replacement, the easiest way to do it is
using R. I don't know how to do it in Ruby or Excel, since I have R.
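
For reference, sampling with replacement can be sketched in Ruby as well, assuming the lines are already in an array. The helper name sample_with_replacement is made up for illustration.

```ruby
# Draw n lines at random; the same line may be picked more than once.
def sample_with_replacement(lines, n)
  return [] if lines.empty?
  Array.new(n) { lines[rand(lines.size)] }
end
```

Something like sample_with_replacement(File.readlines("data.txt"), 100) would then give 100 random draws, where "data.txt" is just a placeholder file name.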

I think the "too big/shuffled" case would make an interesting Ruby quiz,
if you rule out the external sort verb as "cheating".

Aditya Rajgarhia · Jul 9, 2006

Thanks for the reply. That'll work. Of course, using Excel is the
straightforward way, but I need to do it in Ruby.

I still think there must be a simple way to move to a specified line in
a file. I could write a function which does that by checking for "\n" at
the end of each line, and will do that unless someone can suggest an
easier method soon.

bbiker · Jul 9, 2006

Aditya said:
Thanks for the reply. That'll work. Of course, using Excel is the
straightforward way, but I need to do it in Ruby.

I still think there must be a simple way to move to a specified line in
a file. I could write a function which does that by checking for "\n" at
the end of each line, and will do that unless someone can suggest an
easier method soon.

From one newbie to another.

Assuming that the file can be read into memory and you want to shuffle
the lines around.

Read in the file, place the lines in an array, and use a shuffle method
such as:

def shuffle(ar)
  # Fisher-Yates shuffle: walk from the last index down to 1 and
  # exchange each line with a randomly chosen line at or before it.
  (ar.size - 1).downto(1) do |line_b|
    line_a = rand(line_b + 1) # random index in 0..line_b
    ar[line_a], ar[line_b] = ar[line_b], ar[line_a]
  end
  ar
end

Then write the lines out to a new file, or simply overwrite the old
file.

guillaume.marcais · Jul 11, 2006

I think you misunderstand what IO#lineno and IO#lineno= do. The first
one returns the number of lines read from the IO stream so far
(probably only works if doing line-oriented IO, i.e. with gets or
each), and IO#lineno= sets the base number to count from. Setting
lineno does NOT change the position in the stream; it just changes what
the current counter is:

echo "toto" | ruby -e 'p $stdin.lineno; $stdin.lineno = 100; p $stdin.lineno; p $stdin.gets; p $stdin.lineno'
0
100
"toto\n"
101
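
Since lineno= only touches the counter, actually getting to line n means reading past the lines before it. A minimal sketch, where line_at is a hypothetical helper (not part of Ruby's IO API) and StringIO stands in for a real file:

```ruby
require 'stringio'

# Hypothetical helper: return line n (1-based) by rewinding and reading
# forward, or nil if the stream has fewer than n lines.
def line_at(io, n)
  io.rewind
  io.each_line.with_index(1) { |line, i| return line if i == n }
  nil
end

io = StringIO.new("one\ntwo\nthree\n")
io.lineno = 2        # changes the counter only
p io.gets            # still reads the first line: "one\n"
p line_at(io, 3)     # reads forward from the start: "three\n"
```

This costs a pass over everything before line n on each call, which is fine for occasional lookups but not for shuffling a big file.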

The easiest is to swallow the file with IO#readlines, sort them at
random with Array#sort_by and write them out. In a one liner:

ruby -e '$stdout.print($stdin.readlines.sort_by { rand }.join)' < file_to_scramble

If the file is really too big to swallow, it can become more
complicated. But there is no need to get there in most cases.

Hope this helps,
Guillaume.

Mike Fletcher · Jul 11, 2006

Aditya Rajgarhia wrote:
[...]

I still think there must be a simple way to move to a specified line in
a file. I could write a function which does that by checking for "\n" at
the end of each line, and will do that unless someone can suggest an
easier method soon.

Files don't consist of lines, files are ordered sequences of octets.
Lines are just an abstraction imposed on files by either the IO
libraries and the language you're using, or your editor.

Unless the lines in your file are all of the same length it's going to
be non-trivial to move them around in place in the file. You'll either
need to move the existing contents "up" (if the new line is shorter)
or move them "back" (if it's longer).

Having said that, you could accomplish your randomizing the file by
something along these lines:

* read the file line by line and store the offset of the beginning
character of each line (using IO#tell and subtracting the length of the
line)

* open a new temporary output file

* randomly pick a line number and use IO#seek to jump to the appropriate
offset

* read the line and print it to the temp file; remove that line's info
from your list of offsets

* when there are no more offsets left, close the temp file and use
File.rename to move it over the original
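
The steps above could be sketched like this. It is a rough sketch, not a tested tool: the helper names (line_offsets, shuffle_by_offsets) and the ".tmp" suffix are assumptions, and the offset is recorded with IO#tell before each read, which gives the same number as calling tell afterwards and subtracting the line's length.

```ruby
# Record the byte offset at which each line starts.
def line_offsets(io)
  offsets = []
  io.rewind
  until io.eof?
    offsets << io.tell   # position before reading = start of the next line
    io.gets
  end
  offsets
end

# Hypothetical helper: rewrite src_path with its lines in random order,
# keeping only the list of offsets in memory.
def shuffle_by_offsets(src_path, tmp_path = "#{src_path}.tmp")
  File.open(src_path, 'rb') do |src|
    offsets = line_offsets(src)
    File.open(tmp_path, 'wb') do |tmp|
      until offsets.empty?
        # Pick a random remaining line and drop it from the list.
        offset = offsets.delete_at(rand(offsets.size))
        src.seek(offset)
        tmp.write(src.gets)
      end
    end
  end
  File.rename(tmp_path, src_path)
end
```

A final line without a trailing newline would end up glued to whatever line gets written after it, so that case needs extra handling. For very many lines, shuffling the offsets array once and walking it in order avoids the repeated delete_at calls.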
