Global Variables

whisper

This is interesting, but I don't understand your response (sorry, I am
just learning).
What do you mean by "meaningless global identifiers" ?

I had the same question regarding returning the entire struct in Eric's
example
He wrote:
struct matrix create_set(int m, int n) {
    struct matrix set;
    set.rows = m;
    /* code snipped */
    return set;
}


Situation A:
Now when this function is called, storage is allocated for the new
structure
and a copy of it is passed to the calling function, and the original
storage is deallocated.
Is my understanding of the process correct ?

Situation B:
So why not allocate as he has done and return &set (a pointer to the
memory allocated)?
How does this compare to situation A?


In particular, what is automatic storage ?

In situation A and B, isn't the memory allocated on the program stack
the same way -
the only difference being that one has a shorter lifespan than the
other ?
Does just returning a pointer to the malloc-ed area ensure that the
allocated storage remains
in play for the rest of the life of the program?
I appreciate everybody's responses in helping this student learn.
 
Eric Sosman

whisper said:
[...]
I had the same question regarding returning the entire struct in Eric's
example
He wrote:
struct matrix create_set(int m, int n) {
    struct matrix set;
    set.rows = m;
    /* code snipped */
    return set;
}

Situation A:
Now when this function is called, storage is allocated for the new
structure
and a copy of it is passed to the calling function, and the original
storage is deallocated.
Is my understanding of the process correct ?

Yes. The value of the `set' struct inside the function
is passed back to the caller, where it is (presumably) copied
by assignment into a similar struct the caller owns. It's no
different, really, from returning the value of an `int' variable.

In the example code I chose to return the entire struct
because it was simple and allowed me to avoid obfuscating
the central point ("Use a descriptor") with the issues of
memory management. It is usually less efficient to pass
entire struct values to and from functions than to pass
"primitive" values like integers and pointers, but for a
small struct like `struct matrix' and for a function that's
probably called infrequently (how often do you create a new
matrix?), the penalty is probably insignificant compared to
the gain in clarity.

That said, another approach would have been to let the
caller allocate a "blank" `struct matrix', pass a pointer
to create_set(), and let the function initialize the fields
of the caller's struct:

void create_set(struct matrix *ptr, int m, int n) {
    ptr->rows = m;
    ptr->cols = n;
    ptr->data = malloc(m * sizeof *ptr->data);
    ...
}

void find_largest() {
    struct matrix set;
    create_set(&set, 3, 5);
    ...
}

Still another method would be to have create_set()
allocate a `struct matrix' in dynamic memory and return
a pointer to it:

struct matrix *create_set(int m, int n) {
    struct matrix *ptr;
    ptr = malloc(sizeof *ptr);
    if (ptr == NULL) ...
    ptr->rows = m;
    ptr->cols = n;
    ptr->data = malloc(m * sizeof *ptr->data);
    ...
    return ptr;
}

void find_largest() {
    struct matrix *set;
    set = create_set(3, 5);
    ...
}

Other variations are possible. "There are nine and
sixty ways of constructing tribal lays, / And every single
one of them is right!"

Situation B:
So why not allocate as he has done and return &set (a pointer to the
memory allocated)?
How does this compare to situation A?

It would be a gross error, because you'd be returning
a pointer to memory that no longer exists. See Question 7.5
in the comp.lang.c Frequently Asked Questions (FAQ) list

http://www.eskimo.com/~scs/C-faq/top.html

In particular, what is automatic storage?

Your C textbook should explain this. Roughly speaking,
it's the storage that comes into existence when a function
(or other code block) is entered, and which disappears when
the function (block) exits. You don't need to manage it
explicitly, which is the "automatic" part.

In situation A and B, isn't the memory allocated on the program stack
the same way -
the only difference being that one has a shorter lifespan than the
other?

Yes, except that "on the program stack" is not something
guaranteed by the language: it's a mere implementation detail.
True, nearly every implementation uses this technique, but it's
indelicate to say so on this newsgroup.

Does just returning a pointer to the malloc-ed area ensure that the
allocated storage remains
in play for the rest of the life of the program?
I appreciate everybody's responses in helping this student learn.

Crack the textbook again. You'll find there that a data
object has one of three "storage durations": static, automatic,
or dynamic (the Standard's own word is "allocated"). All static
objects already exist when the program
starts and continue to exist until it exits. Automatic objects
wink in and out of existence as their containing blocks are
entered and exited. Dynamic objects come into existence when
you create them with malloc() or calloc() or realloc() and
remain in existence until you destroy them with free() or
realloc(), regardless of what else the program may do.
 
Mark A. Odell

pete said:
Except you still have the biggest poison of all - your functions
are no longer re-entrant.

If your program uses standard library functions,
then it can't be considered reentrant either.

N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Fine, but why exacerbate the problem by making user code needlessly
non-reentrant?
 
jacob navia

pete said:
N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.
 
CBFalconer

jacob said:
pete said:
N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

--
"I support the Red Sox and any team that beats the Yankees"
"Any baby snookums can be a Yankee fan, it takes real moral
fiber to be a Red Sox fan"
"I listened to Toronto come back from 3:0 in '42, I plan to
watch Boston come back from 3:0 in 04"
 
Herbert Rosenau

Method Man wrote: *** And failed to attribute ***
struct matrix create_set(int m, int n) {
    struct matrix set;
    set.rows = m;
    set.cols = n;
    set.data = malloc(m * sizeof *set.data);
    if (set.data == NULL) ...
    while (--m >= 0) {
        set.data[m] = malloc(n * sizeof *set.data[m]);
        if (set.data[m] == NULL) ...
    }
    return set;
}

As a design question, may I ask why you decided to allocate 'set'
on the automatic storage instead of the free storage? It seems a
waste of space to pass the whole struct back to the caller.

So he could later write such things as:

this_set = create_set(....);
that_set = create_set(....);

What do you think set.rows contains after create_set returns and
another function using 4 parameters gets called? Any information about
the struct gets lost as it is completely inside the temporary memory
defined only while create_set is active. In best case:

this_set = create_set(4,5);
that_set = create_set(1000, 200);
other_set = create_set(1,1);

How big is set.rows of this_set now? What is in that_set? Why is
this_set now of dimension [1][1] instead of [4][5]? Why does the
dimension information of all sets get destroyed when you simply call
printf()?
and not have to worry about copying to and from a herd of
meaningless global identifiers which have no function except to
confuse and induce errors.

Failing to use dynamic memory instead of automatic makes anything
uncalculable.

Restart learning about lifespan of automatic variables and you would
never come again with such crap as above.
 
bd

CBFalconer said:
jacob said:
pete said:
N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

Why is fopen inherently non-reentrant? Surely the implementation is
permitted to use some magic to make certain operations atomic if it really
needs to.
 
Dave Vandervies

Method Man wrote: *** And failed to attribute ***
struct matrix create_set(int m, int n) {
    struct matrix set;
    set.rows = m;
    set.cols = n;
    set.data = malloc(m * sizeof *set.data);
    if (set.data == NULL) ...
    while (--m >= 0) {
        set.data[m] = malloc(n * sizeof *set.data[m]);
        if (set.data[m] == NULL) ...
    }
    return set;
}

As a design question, may I ask why you decided to allocate 'set'
on the automatic storage instead of the free storage? It seems a
waste of space to pass the whole struct back to the caller.

So he could later write such things as:

this_set = create_set(....);
that_set = create_set(....);

What do you think set.rows contains after create_set returns and
another function using 4 parameters gets called? Any information about
the struct gets lost as it is completely inside the temporary memory
defined only while create_set is active.

Any information except the copy that create_set returned, that is. But
since all the relevant information is in the copy that create_set
returned, I fail to see how this is a problem.
In best case:

this_set = create_set(4,5);
that_set = create_set(1000, 200);
other_set = create_set(1,1);

How big is set.rows of this_set now?
4

What is in that_set?

Two values describing the size (1000 rows and 200 columns), and a
pointer to a suitably malloc'd array of pointers to suitably malloc'd
arrays of values.

Why is
this_set now of dimension [1][1] instead of [4][5]? Why does the
dimension information of all sets get destroyed when you simply call
printf()?

Compiler bug?

Failing to use dynamic memory instead of automatic makes anything
uncalculable.

I don't see any automatic memory misused where dynamic memory would have
been appropriate.

Restart learning about lifespan of automatic variables and you would
never come again with such crap as above.

Perhaps some awareness about the difference between pointers and variables
would have prevented you from coming up with such crap as above.


dave
 
CBFalconer

bd said:
CBFalconer said:
jacob said:
pete wrote:

N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

Why is fopen inherently non-reentrant? Surely the implementation is
permitted to use some magic to make certain operations atomic if it
really needs to.

fopen has to connect to some named external entity. It then has to
set up various data to control the access, such as a current offset
within a buffer, or a current block within a disk file, or
whatever. These require the equivalent of global data items.

Re-entrancy allows reuse in concurrent operations. Semaphores etc.
do not guarantee re-entrancy by themselves.

 
Jonathan Adams

CBFalconer said:
jacob said:
pete said:
N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

<OT>
Well, it depends on whether you're talking about reentrancy with regard
to signals, or reentrancy with regards to threads. It's fairly easy to
have both strtok() and fopen() be thread-safe (for strtok(), use a
thread-local buffer; for fopen(), use appropriate locking)

Making things reentrant w.r.t signals is typically much more difficult,
which is why the SUSv3 list of "async-signal-safe" functions is so short.
</OT>

Cheers,
- jonathan
 
Richard Bos

CBFalconer said:
fopen has to connect to some named external entity. It then has to
set up various data to control the access, such as a current offset
within a buffer, or a current block within a disk file, or
whatever. These require the equivalent of global data items.

This does not make it inherently non-reentrant. Sure, call fopen() on a
file which another thread has already fopen()ed, and it may fail - but
it may fail at any other time, as well. strtok(), OTOH, _is_
non-reentrant, because its description demands that it is.

Richard
 
siddhartha

Jonathan Adams said:
CBFalconer said:
jacob said:
pete wrote:

N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

<OT>
Well, it depends on whether you're talking about reentrancy with regard
to signals, or reentrancy with regards to threads. It's fairly easy to
have both strtok() and fopen() be thread-safe (for strtok(), use a
thread-local buffer; for fopen(), use appropriate locking)

Making things reentrant w.r.t signals is typically much more difficult,
which is why the SUSv3 list of "async-signal-safe" functions is so short.
</OT>

Cheers,
- jonathan

Hi Jonathan,

I'm a relative newbie to signals - could you clarify your last
statement with an example?
 
Jonathan Adams

Jonathan Adams said:
CBFalconer said:
jacob navia wrote:
pete wrote:

N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.

Note the "guaranteed". Many compilers offer reentrant libraries.

For some functions. Others are inherently non-reentrant, and cause
problems in concurrent systems. E.g. strtok, fopen.

<OT>
Well, it depends on whether you're talking about reentrancy with regard
to signals, or reentrancy with regards to threads. It's fairly easy to
have both strtok() and fopen() be thread-safe (for strtok(), use a
thread-local buffer; for fopen(), use appropriate locking)

Making things reentrant w.r.t signals is typically much more difficult,
which is why the SUSv3 list of "async-signal-safe" functions is so short.
</OT>

Cheers,
- jonathan

Hi Jonathan,

I'm a relative newbie to signals - could you clarify your last
statement with an example?

(added comp.unix.programmer, and set Followup-To:, since this is all
outside of the C standard, and POSIX has the most to say about it)

The problem with signals is that they interrupt the program at an
uncertain point in the text -- you could be in the middle of (say)
updating the malloc() bookkeeping information. If you are, then
calling any function which can invoke malloc() from the signal handler
would almost certainly hopelessly mess up the malloc() heap, causing
random data corruption.

In a multithreaded process, you're likely to just deadlock, since there
will be a lock protecting the malloc state which you are already holding.

Practically, this severely restricts what you can do in a signal handler.
Without further precautions[1], you can only safely:

1. assign a value to a (possibly volatile) sig_atomic_t variable.
2. mess with local state on the stack
3. call async-signal-safe functions. These are functions which
are "reentrant" (basically, those which do not manipulate any
kind of global state -- strcmp() and strlen() are good
examples, as are most system calls, since the kernel can
guarantee safety)

Typical "bad ideas" one sees in signal handlers are:

1. calling printf(), fprintf(), or any other stdio function
which acts on a FILE *, directly or indirectly.
(fputs(), puts(), etc.)

2. calling anything which might call malloc().
(sprintf()/snprintf() are insidious examples, because
many implementations of them can call malloc() unexpectedly.)

You also see people calling fork()+exec() in signal handlers, which is
not *precisely* unsafe, but a bit ill-advised.[2] Many gnome
applications do this, in order to pop up "I got a bad signal, what
should I do?" windows.

The simple fact is that signal handlers are very hard to write correctly
once you get beyond simple cases. (and even simple cases introduce
issues like EINTR and SA_RESTART, which can have program-wide effects)

Throw multi-threaded programming into the mix, and now you have *two*
subtly incompatible asynchronous mechanisms, which is why most robust MT
programs block all asynchronous signals in all threads, and have a
dedicated thread in sigwaitinfo(3thr) or sigtimedwait(3thr) handling all
signals.

To get back to your original question, strtok() cannot be made
async-signal-safe because its interface relies on a piece of global
data. Calling it from a signal handler will smash that data, with no
way to recover the previous state.

fopen() uses malloc(), which is not safe. It also has to maintain the
global list of open files, for fflush(NULL) to find.

Cheers,
- jonathan

[1] For example, you could block the signal whenever you call into any
code which uses the unsafe functions. This can quickly lead to
out-of-hand code complexity, though.

[2] The worst case of this I've ever seen was a daemon which tried to
"save" itself from SIGSEGVs by fork()ing and exec()ing a new copy of
itself. Of course, the signal handler was buggy, and could deadlock...
 
CBFalconer

Richard said:
This does not make it inherently non-reentrant. Sure, call fopen() on
a file which another thread has already fopen()ed, and it may fail -
but it may fail at any other time, as well. strtok(), OTOH, _is_
non-reentrant, because its description demands that it is.

Yes it does, when the items are writable. Interrupt it in the
middle of its process, when it has possibly just selected a new
index for the system table to retain data in, and call it again
from the interrupting code. That is the sort of thing you can do
with re-entrant code.

 
Richard Bos

CBFalconer said:
Yes it does, when the items are writable. Interrupt it in the
middle of its process, when it has possibly just selected a new
index for the system table to retain data in, and call it again
from the interrupting code. That is the sort of thing you can do
with re-entrant code.

Ah, but, that problem belongs to the implementation. A compiler writer
certainly _can_, at least as far as the Standard is concerned, implement
fopen() so that it operates atomically with regard to its data
structures. Ok, it needn't do so; but then again, neither does any other
function in the library.
In particular, all functions in <stdio.h> which use an fopen()ed file
must use the same structures that fopen() itself uses. fprintf() must
write to the stream's buffer; it must read its error indicator. If
fopen() is inherently non-reentrant, so is fprintf(). Luckily, the
Standard does allow both fopen() and fprintf() to be implemented
thread-safe, if necessary.

Richard
 
CBFalconer

Richard said:
.... snip ...

In particular, all functions in <stdio.h> which use an fopen()ed
file must use the same structures that fopen() itself uses.
fprintf() must write to the stream's buffer; it must read its
error indicator. If fopen() is inherently non-reentrant, so is
fprintf(). Luckily, the Standard does allow both fopen() and
fprintf() to be implemented thread-safe, if necessary.

The standard knows nothing about threads, interrupts, etc. The
problem is for the implementor to protect use of his library
against the simultaneous presence of these things, which should not
affect the operation of the single-threaded C (or other) program.

The simplest way is to make the library routines re-entrant. As I
have pointed out, many routines are inherently non-reentrant, and
so must be protected against interruption (or possibly being called
from an interrupt). This involves OS specific games, which may
range from dis/enabling interrupts to the use of semaphores or
monitors. The C language doesn't care how it is done.

The presence of, or reference to, any static storage in a routine
is a glaring signal that that routine is not naturally reentrant,
and needs attention. For example the actual process may arrange
that all static storage is process specific, more or less avoiding
the problem until someone dreams up threads. Threads specifically
avoid making static storage thread specific, to facilitate common
memory usage, and thus bring back all the problems.

This is all OT for c.l.c, but may indicate some of the hidden costs
of having any so-called 'global' variables.
 
