Deleting first N lines from a text file

Kaz Kylheku

A command like:

head -n 99 x.txt > x.txt

(which, as it happens, does not delete the first N lines but keeps
only the first N) has a tendency to produce a zero-length file, because
the shell opens the output file (truncating it, as in fopen(..., "w"))
before the "head" program starts running. "head" then reads an empty
x.txt and outputs its first 99 lines, of which there are none.
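
The ordering can be reproduced in C without a shell. This is only a rough sketch: bytes_seen_after_redirect is a name made up for the illustration, and it simply performs the two opens in the order the shell and the command would.

```c
#include <stdio.h>

/* Mimic "cmd x.txt > x.txt": the output is opened first, and only
   then does the "command" read the same path.
   Returns the number of bytes the reader sees, or -1 on error.
   Note: this destroys the contents of the file at 'path'. */
long bytes_seen_after_redirect(const char *path)
{
    FILE *out, *in;
    long count = 0;

    out = fopen(path, "w");      /* the redirection: truncates at once */
    if (out == NULL) return -1;
    in = fopen(path, "r");       /* the command starts only now */
    if (in == NULL) { fclose(out); return -1; }
    while (fgetc(in) != EOF) count++;
    fclose(in);
    fclose(out);
    return count;
}
```

Whatever the file held beforehand, the reader finds nothing left to read.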

This could be fixed with a fictitious utility:

head -n 99 x.txt | wne x.txt # 'write if not empty'

wne reads standard input and copies to the named file, but does
not create the file if the input is empty.

You need to sequence things so the file isn't truncated until *after*
you have retrieved any data that might be needed from it.

"wne" could also have the behavior of delayed truncation
in addition to delayed creation. ("dtc"?)
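
A minimal sketch of what such a utility might do, assuming it is acceptable to hold the whole input in memory. The function name write_if_not_empty is invented here; the utility itself is fictitious, so this is one possible reading of it, not a definition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read all of 'in' into memory, then write it to 'path'.
   The file is opened (and therefore truncated) only after the input
   has been read to the end, and not at all if the input is empty.
   Returns 0 on success, -1 on error. */
int write_if_not_empty(FILE *in, const char *path)
{
    char *data = NULL;
    size_t len = 0, cap = 0, got;
    FILE *out;
    int rc = 0;

    for (;;) {
        if (len == cap) {                    /* grow the buffer */
            size_t ncap = cap ? cap * 2 : 4096;
            char *p = realloc(data, ncap);
            if (p == NULL) { free(data); return -1; }
            data = p;
            cap = ncap;
        }
        got = fread(data + len, 1, cap - len, in);
        if (got == 0) break;                 /* end of input or error */
        len += got;
    }
    if (ferror(in)) { free(data); return -1; }
    if (len == 0) { free(data); return 0; }  /* empty: leave file alone */

    out = fopen(path, "wb");                 /* truncation happens here */
    if (out == NULL) { free(data); return -1; }
    if (fwrite(data, 1, len, out) != len) rc = -1;
    if (fclose(out) != 0) rc = -1;
    free(data);
    return rc;
}
```

A main() that calls write_if_not_empty(stdin, argv[1]) would give the command-line form used above. In the pipeline, end of input arrives only once "head" has finished reading x.txt, so the truncation cannot get in ahead of it. The "sponge" tool from moreutils works along similar lines, soaking up all of its input before it opens the output file.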

This could be used in any situation where a file is being
filtered in place such that characters written
to an earlier position are all derived from the same or
later positions.
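
Deleting the first N lines is one such filter: every byte that is kept moves to an earlier offset. A rough sketch of doing it in place follows, assuming a POSIX system (pread, pwrite and ftruncate are not standard C). The name drop_first_lines is made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Remove the first n lines of the file at 'path', in place.
   Returns 0 on success, -1 on error. */
int drop_first_lines(const char *path, long n)
{
    char buf[4096];
    off_t rd = 0, wr = 0;   /* read offset, write offset */
    ssize_t got;
    int fd = open(path, O_RDWR);

    if (fd < 0) return -1;

    /* Phase 1: advance rd past the first n newlines. */
    while (n > 0 && (got = pread(fd, buf, sizeof buf, rd)) > 0) {
        ssize_t i = 0;
        while (i < got && n > 0) {
            if (buf[i++] == '\n') n--;
        }
        rd += i;
    }
    /* Phase 2: slide the rest down. wr never overtakes rd, so data
       is always read before the spot it came from is overwritten. */
    while ((got = pread(fd, buf, sizeof buf, rd)) > 0) {
        if (pwrite(fd, buf, (size_t)got, wr) != got) {
            close(fd);
            return -1;
        }
        rd += got;
        wr += got;
    }
    /* Phase 3: only now cut off the stale tail. */
    if (got < 0 || ftruncate(fd, wr) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The truncation comes last, which is the delayed truncation described above. If the program is interrupted partway through, the file is left half shifted, so this is not a safe choice for data that matters.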
 
Eric Sosman

pozz said:
[...]
Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!

(That's a fast paper tape reader. The last one I used would have taken
nearly 3 hours.)

http://www.springerlink.com/content/x42n45gk4811lpq1/ from 1963
describes a "high speed paper tape reader with a maximum speed of
1000 characters per second." 50KB / (1000 B/s) = 50s < one minute.

However, I realize I've erred, and in two ways. First, I've
neglected the time to punch the paper tape, an operation a good
deal slower than reading it. Second, I've overlooked an O(1)
solution: Take the original file, on paper tape, and apply a pair
of scissors to remove the first N lines -- no copying involved,
although there might be some difficulty finding enough "leader"
to feed into the read mechanism next time the data is wanted ...
 
Ian Collins

pozz said:
[...]
Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!

(That's a fast paper tape reader. The last one I used would have taken
nearly 3 hours.)

http://www.springerlink.com/content/x42n45gk4811lpq1/ from 1963
describes a "high speed paper tape reader with a maximum speed of
1000 characters per second." 50KB / (1000 B/s) = 50s< one minute.

That modern? The machine that preceded Colossus at Bletchley Park in
the early 40s could read paper tape at 1000 characters per second!
 
BartC

Ian Collins said:
[...]
Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!

(That's a fast paper tape reader. The last one I used would have taken
nearly 3 hours.)

http://www.springerlink.com/content/x42n45gk4811lpq1/ from 1963
describes a "high speed paper tape reader with a maximum speed of
1000 characters per second." 50KB / (1000 B/s) = 50s< one minute.

That modern? The machine that preceded Colossus at Bletchley Park in the
early 40s could read paper tape at 1000 characters per second!

I've seen Colossus itself in action (the replica, not the original!) and
apparently it could read paper tape at 5000 characters per second.

I think that might have been the program itself, on an endless loop.

But the one I was thinking of belonged to a teletype, which, even if the
data read was not printed but read into the host, was limited to 10 cps.
However, if it could read and write at the same time, then it would only
have taken 80 minutes or so for 50KB.
 
tom st denis

On 16/11/11 14:01, -.- wrote:



That is why you hide behind a pseudo, because you have the courage of
your opinions...

While rude the guy has a point. You can accomplish this goal easily
with something like

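#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    long skip = 0;
    char buf[2048];

    if (argc == 2) {
        skip = strtol(argv[1], NULL, 10);
    }
    /* Discard the first 'skip' lines. A line longer than the buffer
       arrives in several fgets() chunks, so count it only when the
       chunk ends in '\n'. A negative count skips nothing. */
    while (skip > 0 && fgets(buf, sizeof buf, stdin) != NULL) {
        if (strchr(buf, '\n') != NULL) {
            skip--;
        }
    }
    /* Copy the rest of the input through unchanged. */
    while (fgets(buf, sizeof buf, stdin) != NULL) {
        fputs(buf, stdout);
    }
    return 0;
}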

This doesn't require buffering the entire file, it's way more
portable, it's about the same length as your original "solution," etc.

IOW you're just posting to advertise your wares... in a C group where
your offerings are not C....

Stop spamming USENET and maybe people will treat you better.

Tom
 
Jens Thoms Toerring

tom st denis said:
While rude the guy has a point. You can accomplish this goal easily
with something like
[...]
This doesn't require buffering the entire file, it's way more
portable, it's about the same length as your original "solution," etc.

IOW you're just posting to advertise your wares... in a C group where
your offerings are not C....

Stop spamming USENET and maybe people will treat you better.

I don't understand your complaints. As far as I can see Mr. Navia's
container library is written in standard-compliant C - I just tried
to compile his program after minor modifications (added an include
for <stdio.h> and replaced the angle brackets around "containers.h"
by double quotes since the header file isn't in the system's include
directory on my system) with

gcc -std=c99 -pedantic -Wextra -Wall -Wwrite-strings

and there were no complaints at all. For C89 the main complaint
was the missing support for the 'long long' type used in the
library, plus a few minor niggles about two '//' comments in the
"containers.h" header file, about mixing of declarations and
code and, finally, about the missing return statement at the end -
nothing of any seriousness. (And if you try to compile the library
itself in strict C89 mode, all the compiler's complaints seem to be
on the same level of seriousness, i.e. very low and not difficult
to address.)

Further, the library is under a very permissive license, BSD,
so there are no strings attached.

So all I can see is that Mr. Navia proposed to use a library,
written by him and made available to the public, as a possible
solution for the problem the OP has. He didn't even mention
where his container library can be found - if the OP or others
are interested it's at

http://code.google.com/p/ccl/

If that is spamming then a lot of other posters in this group
are guilty of the same when they dare to mention that they have
written some functions or library others might use freely for a
problem they want to solve. Just because Mr. Navia is also the
author of a compiler you can pay him money for IMHO shouldn't
bar him from mentioning code he wrote and made available for
free and taking part in trying to help others, or should it?

The only valid point I can see in your post is the question if
it's an optimal solution having to read in all of the file into
memory. But then this was already explicitly pointed out by Mr.
Navia himself as a requirement (and thus a possible shortcoming)
of using his solution - and for a lot of cases I don't think it's
a complete showstopper.

On the other hand your solution doesn't address the second
requirement of the OP, i.e. that the input file itself is to be
changed on exit of the program - your program reads from stdin
and writes to stdout, so it's a simple filter. In contrast, Mr.
Navia's program does also handle this other requirement of the
OP in changing the input file itself with a very similar number
of lines of code the user has to type...
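
For what it's worth, a streaming filter of that kind can be made to change the file itself by writing the kept lines to a temporary file and then renaming it over the original. A rough sketch only: skip_lines_in_file and the ".tmp" suffix are made up for this example and come neither from the thread nor from Mr. Navia's library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: drop the first 'skip' lines of 'path' by
   copying the rest to "<path>.tmp" and renaming that over the
   original. Returns 0 on success, -1 on error. */
int skip_lines_in_file(const char *path, long skip)
{
    char tmp[4096], buf[2048];
    FILE *in, *out;
    int n, ok;

    n = snprintf(tmp, sizeof tmp, "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;

    in = fopen(path, "r");
    if (in == NULL) return -1;
    out = fopen(tmp, "w");
    if (out == NULL) { fclose(in); return -1; }

    /* A long line arrives in several chunks; count it at its '\n'. */
    while (skip > 0 && fgets(buf, sizeof buf, in) != NULL) {
        if (strchr(buf, '\n') != NULL) skip--;
    }
    /* Copy everything after the skipped lines. */
    while (fgets(buf, sizeof buf, in) != NULL) {
        fputs(buf, out);
    }

    ok = !ferror(in) && !ferror(out);
    fclose(in);
    if (fclose(out) != 0) ok = 0;

    /* On POSIX systems rename() replaces an existing file; elsewhere
       the original may have to be removed first. */
    if (!ok || rename(tmp, path) != 0) {
        remove(tmp);
        return -1;
    }
    return 0;
}
```

This never holds more than one buffer of the file in memory, at the cost of needing room on disk for a second copy while it runs.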

Finally, I've got to say that I find Mr. Navia's solution rather
easy to comprehend without even having read the documentation for
his library (which I will do now;-). It seems to be a rather nice
example of how using it could make writing as well as understanding
certain types of programs in C quite a bit easier.

Regards, Jens
 
Dr Nick

BartC said:
Ian Collins said:
On 11/16/2011 9:26 AM, BartC wrote:
[...]
Fifty K shouldn't take long. Even on a system from forty years
ago it didn't take long. Even on paper tape, for goodness' sake, it
took less than a minute!

(That's a fast paper tape reader. The last one I used would have taken
nearly 3 hours.)

http://www.springerlink.com/content/x42n45gk4811lpq1/ from 1963
describes a "high speed paper tape reader with a maximum speed of
1000 characters per second." 50KB / (1000 B/s) = 50s< one minute.

That modern? The machine that preceded Colossus at Bletchley Park
in the early 40s could read paper tape at 1000 characters per
second!

I've seen Colossus itself in action (the replica, not the original!)
and apparently it could read paper tape at 5000 characters per second.

I think that might have been the program itself, on an endless loop.

That loop is the message that is being attacked. The machine runs the
message through, does some operations on it and carries out a set of
counts on the results. If the answer is "interesting" it outputs the
fact, otherwise it advances its internal state and repeats. The
internal state is on uniselectors - electromechanical - and so you get
a "clunk" once per tape loop as it advances. The counters are valves,
for speed.
 
