Read dynamic string

S

Sillaba atona

I use this code to read dynamic string:

char *s1;
.......
puts("Inserire una stringa: ");
while((*s1++=getchar())!='\n');
*s1='\0';

The compilation (ANSI C) is OK but I receive an error during the execution.

The problem is not present if I use a static array.

I cannot find the error.
 
M

Malcolm

Sillaba atona said:
char *s1;
......
puts("Inserire una stringa: ");
while((*s1++=getchar())!='\n');
*s1='\0';

The compilation (ANSI C) is OK but I receive an error during the
execution.

The problem is not present if I use a static array.

I cannot find the error.
s1 points to nowhere in particular. You then start overwriting this random
memory location with characters. Usually the result will be a crash
(probably a message saying "segmentation fault").
If you say s1 = malloc(1000) then s1 will point to 1000 chars reserved for
you, and the program won't crash until you exceed that figure. (You can keep
a count and call realloc(), if you want to be able to read an arbitrary
string as long as the computer's meory allows).
 
W

Walter Roberson

I use this code to read dynamic string:
char *s1;
......
puts("Inserire una stringa: ");
while((*s1++=getchar())!='\n');

You have not allocated any storage. s1 is an uninitialized pointer;
you have to point it to a block of memory before you can use that code.

Except that that code doesn't take into account the possibility of
a really long string, and if you pre-allocate the memory then no matter
how much you allocate, the user might enter something larger.
Therefore you either need to limit the length that you will pay
attention to, or else you need to use a scheme in which the allocated
memory is grown as needed.
*s1='\0';
The compilation (ANSI C) is OK but I receive an error during the execution.

A good compiler would warn that s1 was potentially uninitialized, but
such a warning is not -required- by the standard.
The problem is not present if I use a static array.

Change back to a static, and then have the user paste in (say)
32K of text, and see whether you still say the problem is "not present"
when you use a static array.
 
T

Toni Uusitalo

Malcolm said:
If you say s1 = malloc(1000) then s1 will point to 1000 chars reserved for
you, and the program won't crash until you exceed that figure. (You can keep
a count and call realloc(), if you want to be able to read an arbitrary
string as long as the computer's meory allows).

Or you can keep reading into unallocated buffer until your computer meows.

;-)

with respect,
Toni Uusitalo
 
D

Daniel Fischer

Hi,
(You can keep a count and call realloc(), if you want to be able to read
an arbitrary string as long as the computer's meory allows).

there's also a neat way that involves using recursion to allocate
additional memory on the stack and putting the parts together on return,
thereby avoiding heap fragmentation.


Daniel
 
C

Christopher Benson-Manica

Sillaba atona said:
char *s1;
......
puts("Inserire una stringa: ");
while((*s1++=getchar())!='\n');
*s1='\0';
The compilation (ANSI C) is OK but I receive an error during the execution.

It's wrong, as others have indicated.

You might find Chuck Falconer's ggets() routine, available at
cbfalconer.home.att.net/download/ggets.zip, to be helpful. Chuck is
(or at least was, he seems to have been missing for some time) a
regular contributor here, so you can use the code with confidence
(per the license, which I believe is GPL).
 
M

Malcolm

Daniel Fischer said:
there's also a neat way that involves using recursion to allocate
additional memory on the stack and putting the parts together on return,
thereby avoiding heap fragmentation.
Could you post a getline() function (get a line from a file, of arbitrary
length) that uses this?
 
T

Toni Uusitalo

Daniel said:
Hi,




there's also a neat way that involves using recursion to allocate
additional memory on the stack and putting the parts together on return,
thereby avoiding heap fragmentation.

This is possible but I doubt it's worth the hassle (I admit it sounds
like neat trick).

I think something that reuses existing buffer would be enough for
avoiding constant buffer fiddling/reallocating.

Something like this:

#include <stdio.h>
#include <stdlib.h>

struct GetLine {
FILE *f;
char *buf;
size_t bufsize;
};

int GetLine_Init(struct GetLine *gl, FILE *f)
{
gl->f = f;
gl->bufsize = 256;
gl->buf = malloc(gl->bufsize);
return (gl->buf) ? 1 : 0;
}

void GetLine_Destroy(struct GetLine *gl)
{
free(gl->buf);
}

char *GetLine_Read(struct GetLine *gl)
{
size_t i=0;
int ch;

while(gl->buf) {
for(; i<gl->bufsize; gl->buf[i++]=ch) {
ch = fgetc(gl->f);
if (ch == EOF || ch == '\n') {
if (!i && ch == EOF) return NULL;
gl->buf = '\0';
return gl->buf;
}
}
gl->bufsize += gl->bufsize;
gl->buf = realloc(gl->buf, gl->bufsize);
}
return NULL;
}

int main(int argc, char* argv[])
{
struct GetLine gl;
char *line;
FILE *f = fopen("test.txt", "r");

if (!f) return (EXIT_FAILURE);
if (!GetLine_Init(&gl, f)) return (EXIT_FAILURE);

while((line=GetLine_Read(&gl)))
puts(line);

GetLine_Destroy(&gl);
fclose(f);
return 0;
}

with respect,
Toni Uusitalo
 
J

Joe Wright

Daniel said:
Hi,




there's also a neat way that involves using recursion to allocate
additional memory on the stack and putting the parts together on return,
thereby avoiding heap fragmentation.


Daniel
I'd love to see that. Can you post the code?
 
D

Daniel Fischer

I'd love to see that. Can you post the code?

Sure:

------------------------->8---------------------------

/*
* rfgets.c
* dynamically allocating fgets
* daniel.fischer at iitb.fraunhofer.de
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *rfg(FILE * f, int l) {
char b[32], *p;
int x;

if(!fgets(b, sizeof(b), f)) return 0;
x = strlen(b);
if(b[x - 1] != '\n' && (p = rfg(f, l + x))) {
p -= x;
memcpy(p, b, x);
return p;
}
p = (char *) malloc(l + x + 1) + l;
strcpy(p, b);
return p;
}

char *rfgets(FILE * f) {
return rfg(f, 0);
}

------------------------->8---------------------------

Usage:

char *line = rfgets(stdin);
...
free(line); // don't forget


Daniel
 
C

Christopher Benson-Manica

Bonus points for ingenuity. But...
char *rfg(FILE * f, int l) {
char b[32], *p;
int x;
if(!fgets(b, sizeof(b), f)) return 0;
x = strlen(b);
if(b[x - 1] != '\n' && (p = rfg(f, l + x))) {
p -= x; /* NB */
memcpy(p, b, x);
return p;
}
p = (char *) malloc(l + x + 1) + l;

This cast is neither necessary nor desirable; search this group's
archives for any of several lengthy discussions as to why.
strcpy(p, b);
return p;

I believe the value that should be returned is p+l. The code is
certainly broken as is, as the line indicated generates a pointer x
bytes before the beginning of the malloc()'ed space.
 
B

Barry Schwarz

I'd love to see that. Can you post the code?

Sure:

------------------------->8---------------------------

/*
* rfgets.c
* dynamically allocating fgets
* daniel.fischer at iitb.fraunhofer.de
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *rfg(FILE * f, int l) {
char b[32], *p;
int x;

if(!fgets(b, sizeof(b), f)) return 0;
x = strlen(b);
if(b[x - 1] != '\n' && (p = rfg(f, l + x))) {
p -= x;
memcpy(p, b, x);
return p;
}
p = (char *) malloc(l + x + 1) + l;

I give up. Which ones are ones and which are ells?
strcpy(p, b);
return p;
}

char *rfgets(FILE * f) {
return rfg(f, 0);
}

------------------------->8---------------------------

Usage:

char *line = rfgets(stdin);
...
free(line); // don't forget


Daniel


<<Remove the del for email>>
 
S

S.Tobias

Christopher Benson-Manica said:
Bonus points for ingenuity. But...
char *rfg(FILE * f, int l) {
char b[32], *p;
int x;
if(!fgets(b, sizeof(b), f)) return 0;
x = strlen(b);
if(b[x - 1] != '\n' && (p = rfg(f, l + x))) {
p -= x; /* NB */
memcpy(p, b, x);
return p;
}
p = (char *) malloc(l + x + 1) + l;

This cast is neither necessary nor desirable; search this group's
archives for any of several lengthy discussions as to why.
It is necessary here, you've missed `+ l' after malloc, which is
also an answer to your second query below.
 
C

Christopher Benson-Manica

S.Tobias said:
It is necessary here, you've missed `+ l' after malloc, which is
also an answer to your second query below.

Darn. I thought I was being observant, but turns out I need glasses
after all. (I blame it all on the use of lowercase 'l'.)
 
R

Randy Howard

Christopher Benson-Manica wrote
(in article said:
Darn. I thought I was being observant, but turns out I need glasses
after all. (I blame it all on the use of lowercase 'l'.)

Pick a better font (or terminal). :)
 
J

Jordan Abel

Darn. I thought I was being observant, but turns out I need glasses
after all. (I blame it all on the use of lowercase 'l'.)

You couldn't do it if it were a 1 (one) or an I (eye), either, though.
 
C

Christopher Benson-Manica

Toni Uusitalo said:
I think something that reuses existing buffer would be enough for
avoiding constant buffer fiddling/reallocating.

It is enough, but the recursive version uses repeated recursive calls
and memcpy() where your version uses realloc(). realloc() is
generally expensive performance-wise, so I think the recursive
version's avoidance of it is probably a good thing.
 
T

Toni Uusitalo

Christopher said:
It is enough, but the recursive version uses repeated recursive calls
and memcpy() where your version uses realloc(). realloc() is
generally expensive performance-wise, so I think the recursive
version's avoidance of it is probably a good thing.

Yes. Recursive version seems very clever! Recursion starts
to affect performance at some point I recall from some code I wrote,
I think that particular recursion was something like 30 levels deep on
my win platform that started to affect performance crucially.

Ideally my version uses realloc sparingly. If you need to store
line(s) returned from Readline_Read you should use strdup() etc.
and continue using already allocated buffer for reading lines.

some enchancements:

+ #define GetLine_Linelen(gl) ((gl)->lineLen)

struct GetLine {
FILE *f;
char *buf;
size_t bufsize;
+ size_t lineLen;
};


and new GetLine_Read (which simply uses gl->lineLen instead of i):


char *GetLine_Read(struct GetLine *gl)
{
int ch;
gl->lineLen=0;

while(gl->buf) {
for(; gl->lineLen<gl->bufsize; gl->buf[gl->lineLen++]=ch) {
ch = fgetc(gl->f);
if (ch == EOF || ch == '\n') {
if (!gl->lineLen && ch == EOF) return NULL;
gl->buf[gl->lineLen] = '\0';
return gl->buf;
}
}
gl->bufsize += gl->bufsize;
gl->buf = realloc(gl->buf, gl->bufsize);
}
return NULL;
}

Now it's possible to use memcpy after GetLine_Read instead of strdup()

with respect,
Toni Uusitalo
 
C

Christopher Benson-Manica

Jordan Abel said:
You couldn't do it if it were a 1 (one) or an I (eye), either, though.

Of course, but the problem is that I completely overlooked the 'l',
which would have been less likely if the variable had been given a
more descriptive name. Hopefully real code would have had a helpful
comment for the unobservant...
 
T

those who know me have no need of my name

in comp.lang.c i read:

i elided most of the code -- sometimes this is a mistake, though i hope not
-- only the bits with nits remain.
char *rfg(FILE * f, int l) {

int makes some sense, but i happen to prefer size_t as there is no meaning
in a negative length or offset in this code.

char *rfg(FILE * f, size_t l) {

also, this is now a very short name to have with external linkage. i would
probably change to internal linkage, or rename it rfgets_r -- heh, perhaps
both.

nothing wrong with it, but again as you cannot have a negative string
length i tend to prefer a size_t.

size_t x;
if(b[x - 1] != '\n' && (p = rfg(f, l + x))) {

what if the first byte fgets returns is a null byte? better check for zero
length just in case.

also, what if rfg returns a null pointer (malloc failed)? do you really
want to just chuck the last part of the line away and make believe it ends
at the current block?

sometimes losing as little data as possible is important, in which case
your code with a length check added would be fine:

if(0 < x && b[x - 1] != '\n' && (p = rfg(f, l + x))) {

but i think it's usually better to back propagate the malloc failure:

if(0 < x && b[x - 1] != '\n') {
if (0 == (p = rfg(f, l + x))) return 0;
p = (char *) malloc(l + x + 1) + l;
strcpy(p, b);

what if the malloc fails? (the cast just to do the arithmetic is ugly
enough to want a change, even though it is valid -- as long as malloc
doesn't fail.)

if (0 == (p = malloc(l + x + 1))) return 0;
p += l;
strcpy(p, b);
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,175
Messages
2,570,942
Members
47,476
Latest member
blackwatermelon

Latest Threads

Top