global/static variables & loops


Shao Miller

Can you give an example of a non-trivial program, that has no global
state in any way?

If, by "global state," you mean static-duration objects, I think I'm
working on such a library right now. It doesn't link with the Standard
Library, but requires [its own library] users to pass a pointer to a
structure with function pointer members for any functions that the
library can't implement without depending on other libraries or
operating system. The library-using caller might have global state, but
the library doesn't.

However even if all of this is squeezed through a single exported
function, that exported function could be considered "a global," of
sorts, though not a static-duration object.
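
A minimal sketch of the arrangement described above, not the actual library: every name here (`struct lib_env`, `lib_greet`, `sink_emit`, `example_caller`) is hypothetical. The library side keeps no static-duration objects and reaches the outside world only through function pointers the caller supplies.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of services the caller must supply. */
struct lib_env {
    void *(*alloc)(size_t size);                 /* caller's allocator */
    void  (*release)(void *ptr);                 /* caller's deallocator */
    void  (*emit)(void *ctx, const char *text);  /* caller's output sink */
    void  *ctx;                                  /* opaque caller state */
};

/* Library side: no static-duration objects, no direct libc calls.
   Returns the length of msg, or -1 if allocation fails. */
int lib_greet(const struct lib_env *env, const char *msg)
{
    size_t len = 0, i;
    char *copy;

    while (msg[len] != '\0')
        len++;
    copy = env->alloc(len + 1);
    if (copy == NULL)
        return -1;
    for (i = 0; i <= len; i++)      /* copies the terminator too */
        copy[i] = msg[i];
    env->emit(env->ctx, copy);
    env->release(copy);
    return (int)len;
}

/* Caller side: free to use the Standard Library and its own state. */
struct sink { char text[64]; };

void sink_emit(void *ctx, const char *text)
{
    struct sink *s = ctx;
    strncpy(s->text, text, sizeof s->text - 1);
    s->text[sizeof s->text - 1] = '\0';
}

int example_caller(struct sink *s)
{
    struct lib_env env = { malloc, free, sink_emit, NULL };
    env.ctx = s;
    return lib_greet(&env, "hello");
}
```

Whatever state exists lives in the caller's `struct sink`; the library function itself only sees what it is handed.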

Speaking of which, would 'const'-qualified global objects be considered
"global state?" And string literals, that ought to be treated as
'const'? I wouldn't think so, but perhaps opinions differ.
 

Malcolm McLean

Speaking of which, would 'const'-qualified global objects be considered
"global state?"  And string literals, that ought to be treated as
'const'?  I wouldn't think so, but perhaps opinions differ.
No. State is modified during the duration of the program. So compile
time constants are not state.
 

BartC

Eric Sosman said:
On 2/15/2012 7:20 PM, Kaz Kylheku wrote:
True story: A former employer's flagship product did things with
documents, and the earliest versions could handle only one document
at a time. THE document was, fairly naturally, described by a host
of global variables: ....
By the time I left that employer, the size of the automatically-
generated struct and of the save/restore functions had grown to the
point where some platforms' compilers could no longer handle them,
and the tools had been modified to break the struct into three, with
three sets of functions. Also, just moving your mouse from Window A
to Window B could drive your paging disk berserk as it tried to handle
all these references to globals scattered hither and yon all over the
address space; you could force your workstation to its knees just by
flicking your mouse back and forth on the screen.

That's the power of globals.

Wasn't it possible just to run multiple instances of the program?

Anyway I can't quite believe the situation was that bad; even copying a
million variables (today) or perhaps a thousand (years ago), would hardly
have that impact.
I take two lessons from this experience: First, sufficient ingenuity
and a willingness to hack can come up with a short-term solution to most
any problem. Second, short-term solutions have little staying power.

I think I prefer the solution you describe (call SelectDocument(n), and
continue to be able to use all these settings in a simple manner) than have
to acknowledge the 'multi-documenticity' of the application in a million
places in the source code. (And when there's a further enhancement, applying
multiplicity to something you hadn't thought of before, have to do it all
again.)
 

Nick Keighley

 Can you give an example of a non-trivial program, that has no global
state in any way?

I was encouraging him to consider whether he needed a global variable.
Excluding stuff from the standard library, plenty of small non-trivial
programs can be written without global state. I'll have to hunt around
some of my software. If there are no extern variables, does that count?
 

Nick Keighley

ok!

agreed

it's a jolly funny definition. Yes, it's global state, but it isn't a
"global". The only people that see it are those that need to. It makes
reasoning about the program easier and avoids certain classes of
error. Testing is simplified. Extension of the program is eased (see
the story of The Document by another poster).
If that counts as global state, then storing global state in global
variables is clearly not necessary.
quite


Your use of "just" and "baaad" implies that you think that the idea that
globals are bad is an unjustified prejudice, rather than the
well-justified judgment that it actually is. Are you actually unaware of
the problems that they can cause?

logically the function looked like this

void f (T p1, T p2);

but someone had decided multiple parameters were expensive (it even
said so in the programming standards).

So it was implemented like this

extern T fp1;
extern T fp2;
void f (void);

Then code like this was written

fp1 = x;
fp2 = y;
f();
g();
fp1 = z;
f();

which was fine until g() needed f() and updated fp2. I'm happy to say
it was my boss who wrote the "optimised" code above.
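
A small sketch that reproduces the failure, assuming `T` is `int` and inventing trivial bodies for `f()` and `g()` (the originals are not shown in the post). The `_params` variants are hypothetical rewrites with ordinary parameters, for comparison.

```c
/* "Parameters" passed through globals, as in the story. */
int fp1;
int fp2;
int last_result;              /* records what f() last computed */

void f(void) { last_result = fp1 * 10 + fp2; }

/* g() later came to need f() itself, so it overwrites fp2. */
void g(void) { fp2 = 9; f(); }

int caller_with_globals(int x, int y, int z)
{
    fp1 = x;
    fp2 = y;
    f();
    g();
    fp1 = z;
    f();                      /* meant f(z, y), but fp2 is now 9 */
    return last_result;
}

/* The same logic with real parameters: g's call cannot leak out. */
int f_params(int p1, int p2) { return p1 * 10 + p2; }

void g_params(void) { (void)f_params(0, 9); }

int caller_with_params(int x, int y, int z)
{
    (void)f_params(x, y);
    g_params();
    return f_params(z, y);
}
```

With x, y, z = 1, 2, 3 the global version ends up computing from (3, 9) where the caller meant (3, 2); the parameter version is unaffected by whatever `g_params()` does.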
The key problem with global variables is visibility; by being visible
everywhere, they can be affected by anything, which can make it
difficult to determine what part of the program caused those variables
to have their current value, or what part had its behavior affected by
the value.

which is one of the attractions of ADTs and OOP
Variables which are local to main() and are passed by a
pointer to subroutines need not be passed to all subroutines, but only
the ones that actually need to have access to those variables, and that
could be a different set of subroutines for different variables. Making
information available to a subroutine only on a "need to know" basis is
good software design. It can be overdone, but that's seldom a real-world
problem.
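
A minimal sketch of that "need to know" style, with hypothetical names throughout. The state is created by the caller (standing in for `main()`), and each subroutine receives only the part it needs, `const`-qualified where it should not modify it.

```c
struct counters { long lines; long words; };
struct options  { int verbose; };

/* Sees only the counters; it cannot touch the options at all. */
void count_line(struct counters *c, const char *line)
{
    int in_word = 0;

    for (; *line != '\0'; line++) {
        if (*line == ' ' || *line == '\t' || *line == '\n') {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            c->words++;
        }
    }
    c->lines++;
}

/* May read both structures but can modify neither. */
long summary(const struct options *opt, const struct counters *c)
{
    return opt->verbose ? c->words : c->lines;
}

long run_example(void)
{
    struct counters c = { 0, 0 };   /* lives in the caller, not at file scope */
    struct options opt = { 1 };

    count_line(&c, "no globals here");
    count_line(&c, "need to know only");
    return summary(&opt, &c);
}
```

If the word count ever comes out wrong, the only code that could have changed it is code that was handed a non-const `struct counters *`.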

I've worked on programs that were nearing MLOC. I'd hate it if all
"global state" were globally visible.
 

Nick Keighley

Malcolm McLean <[email protected]> writes:
 I don't know if I've made myself clear previously, but I'm in this
thread, asking about programs having *NO GLOBAL STATE AT ALL*.

I think you're trying to extend your "global state" beyond what is
reasonable. Unless we're writing in some sort of functional
programming language then we're going to have GS, and even with an FP I
suspect you'd claim GS was present.

I'm talking about global variables. Data that's accessible throughout
the program. Stuff that's hard to debug because *anyone* could have
fiddled with it.

I accept that most real world programs have *some* global data, but I
submit it should be kept to a minimum.
 No debug levels.

honourable exception.

Though large programs with global debug state tend to drown in
diagnostics. It needs to be much more fine grained to be useful.
Turning on TRACE2(the noisiest level) globally on one system simply
brought it to its knees.
 No log destinations.

honourable exception.

It's handy to have more than one log though
 No file handles.

I don't accept that these are "global" in my sense. Except for stdin,
stdout and stderr. And the large systems I've worked on didn't really
use 'em.
 No nothing.

oh you can have as many global nothings as you like! :)
 I have no trouble seeing why *DATA* should not be globally visible,
unless there are some very good reasons for them being so,

we're in violent agreement then.
 

Eric Sosman

Wasn't it possible just to run multiple instances of the program?

The whole point was to avoid running multiple copies. That
way, you could (for example) grab a whole bunch of documents at
once and change their page sizes from Letter to A4 in a single
operation, instead of making the same change in N invocations of
the same program.
Anyway I can't quite believe the situation was that bad; even copying a
million variables (today) or perhaps a thousand (years ago), would hardly
have that impact.

I was there, you were not. I saw the page disk thrashing, you
did not. I saw the system freeze for as much as four seconds when
the mouse went from one window to another (although I admit most
freezes were less than three seconds).

But, hey: Who am I to challenge your unbelief?
I think I prefer the solution you describe (call SelectDocument(n), and
continue to be able to use all these settings in a simple manner) than have
to acknowledge the 'multi-documenticity' of the application in a million
places in the source code. (And when there's a further enhancement,
applying
multiplicity to something you hadn't thought of before, have to do it all
again.)

Been there, done that, found it bad.
 

Anders Wegge Keller

Nick Keighley said:
I think you're trying to extend your "global state" beyond what is
reasonable. Unless we're writing in some sort of functional
programming language then we're going to have GS, and even with an FP I
suspect you'd claim GS was present.

I'm talking about global variables. Data that's accessible throughout
the program. Stuff that's hard to debug because *anyone* could have
fiddled with it.

I was asking (and extending global state), to point out why strict
adherence to "no globals" without thought isn't clever.
 

Anders Wegge Keller

James Kuyper said:
On 02/15/2012 05:10 PM, Anders Wegge Keller wrote:
I had intended some judgment to be used; need-to-access must always
be traded off against other issues.

I agree. But if - as I guess is the case for ls - about three quarters
of the options have to be propagated to three quarters of the code, if
not to provide a nightmarish complex call graph, we might as well
consider those options globals in all but name.
In many cases it will be convenient to tell a subroutine that a
given option is not turned on, by passing a null pointer as the
argument which would otherwise point to the data associated with
that option.

Up to a limit. I've seen code evolving over time, until you end up
with a call like this:

maniPrepareForHost (parcel_data, TRUE, FALSE, FALSE, TRUE, TRUE);

I guess that it would be equally painful to guess what the function
did, if there were two NULLs instead.
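
One way such a call can be made readable at the call site, sketched here under loud assumptions: the real meaning of the five flags is not given in the post, so every field name below is invented, and `prepare_for_host` is a hypothetical stand-in, not the function from the story. It needs C99 for `bool` and designated initializers.

```c
#include <stdbool.h>

/* Hypothetical option names; the original flags' meanings are unknown. */
struct prep_options {
    bool compress;
    bool encrypt;
    bool dry_run;
    bool notify;
    bool keep_log;
};

/* Stand-in body: returns how many options are enabled, or -1 for a
   negative parcel id. */
int prepare_for_host(int parcel_id, const struct prep_options *opt)
{
    int enabled = 0;

    if (parcel_id < 0)
        return -1;
    if (opt->compress) enabled++;
    if (opt->encrypt)  enabled++;
    if (opt->dry_run)  enabled++;
    if (opt->notify)   enabled++;
    if (opt->keep_log) enabled++;
    return enabled;
}

int example_call(void)
{
    /* Members not named here are initialized to false. */
    const struct prep_options opt = {
        .compress = true,
        .notify   = true,
        .keep_log = true
    };
    return prepare_for_host(42, &opt);
}
```

The call now says which options are on; whether that is worth an extra struct is the same judgment call as the rest of this thread.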
 

Malcolm McLean

 I was asking (and extending global state), to point out why strict
adherence to "no globals" without thought isn't clever.
Let's say we do this


#include <stdbool.h>

double ehks(double x, bool write)
{
    static double X;

    if (write)
        X = x;
    return X;
}

we've avoided making X global, but we haven't achieved much. However
we've done something psychologically, the function ehks() is much more
likely to be documented than just X. And we can intercept reads or
writes to X.
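
A sketch of what "intercepting" might look like, assuming a made-up invariant that X must never be negative. `ehks_checked` is a hypothetical variant of the function above, not something from the original post.

```c
#include <stdbool.h>

/* Same accessor shape as ehks(), but writes of negative values are
   silently rejected (the invariant is invented for illustration). */
double ehks_checked(double x, bool write)
{
    static double X;

    if (write && x >= 0.0)
        X = x;
    return X;
}
```

With a bare global there is no single place to put such a check; with the accessor it is one line.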
 

Nick Keighley

 I was asking (and extending global state), to point out why strict
adherence to "no globals" without thought isn't clever.

lots of things in programming done without thought are bad.
 

Kaz Kylheku

On my machine, stdout is an address constant,
which is not something that I would refer to as a "variable".

I can compile this:

FILE *array[] = {stdout};

That will work at file scope or global scope on some platforms - such as
Solaris and probably HP-UX and AIX. It will not work on Linux or BSD
(including Mac OS X), so portable code doesn't try to initialize global
or file-scope (static) variables like that, nuisance though it be.
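
A sketch of the usual portable workaround, with hypothetical names (`streams`, `get_stream`): leave the file-scope array zero-initialized and fill it in at run time, where `stdout` is only required to be an expression of type `FILE *` and not a constant.

```c
#include <stdio.h>

/* FILE *streams[] = { stdout, stderr };
   at file scope would need constant initializers, which is the
   non-portable part. Static storage starts out as null pointers. */
static FILE *streams[2];

/* Returns stream 0 (stdout) or 1 (stderr), or NULL for any other index. */
FILE *get_stream(int i)
{
    if (streams[0] == NULL) {       /* filled in on first use */
        streams[0] = stdout;
        streams[1] = stderr;
    }
    return (i == 0 || i == 1) ? streams[i] : NULL;
}
```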

I think the point was to make exactly that point. And that was pointless
to begin with. stdout is morally a global variable. It's a good example of
a use for global variables: pervasive symbols that refer to something.

It is useful to be able to assign a value to stdout to redirect output elsewhere,
and then restore the value.

So stdout is actually functionally crippled by /not/ being defined as a
variable.

Common Lisp:

;; write to standard output
(defun produce-output ()
  (format t "Hello, world"))

;; temporarily redirect standard output to a string
;; works because *standard-output* is a variable
;; (and not something else like a symbol macro, or constant).
(with-output-to-string (*standard-output*)
  (produce-output))
-> "Hello, world" ;; string pops out

Global variables need the discipline of dynamic scoping: the ability to
locally "re-bind" the global so that it appears to be saved and restored over
a dynamic extent.

Emacs Lisp has only dynamically scoped variables, yet it "does OK".

Saving and restoring is the key, and it has to be automated. Saving and
restoring is what allows the entire environment of an entire machine to share
a handful of global registers, making it look like every thread (or other
context) has its own.
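
C has no dynamic binding, but the save-and-restore discipline can be written by hand. A rough sketch, assuming a program-defined `current_out` variable that plays the role of `*standard-output*` (it is not `stdout` itself, which cannot portably be assigned); `capture_output` and `produce_output` are hypothetical names.

```c
#include <stdio.h>

FILE *current_out;   /* NULL means "use stdout" */

static FILE *out(void) { return current_out ? current_out : stdout; }

void produce_output(void) { fputs("Hello, world", out()); }

/* Runs fn with current_out rebound to a temporary file, restores the
   old binding, then copies what was written into buf.
   Returns the number of characters captured, or -1 on failure. */
int capture_output(void (*fn)(void), char *buf, size_t size)
{
    FILE *saved = current_out;      /* save the old binding */
    FILE *tmp;
    size_t n;

    if (size == 0)
        return -1;
    tmp = tmpfile();
    if (tmp == NULL)
        return -1;
    current_out = tmp;              /* rebind for the duration of fn */
    fn();
    current_out = saved;            /* restore */
    rewind(tmp);
    n = fread(buf, 1, size - 1, tmp);
    buf[n] = '\0';
    fclose(tmp);
    return (int)n;
}
```

Unlike the Lisp form, nothing here restores the binding if `fn` leaves through `longjmp()`, which is the "has to be automated" part of the argument.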
 

Ben Bacarisse

Shao Miller said:
Inside a function, you can do such initializations (under C99 - not
under C89 with GCC and -pedantic), though an array of dimension 1 is
modestly pointless.

Actually, I think that some folks use an '[1]' for 'struct's and
union's in their code just for the sake of using only the '->'
operator throughout their code.
<snip example>

I've also seen it used in some APIs. You get a sort of automatic call
by reference and copying (by assignment) is prohibited:

typedef struct { /* stuff */ } MyType[1];

MyType a, b;
/* ... */
a = b; /* illegal */
mytype_copy(a, b); /* no need for &a on the target */

I don't know if I like it or hate it! To decide, I think I'd have to
maintain a large body of code that uses such an API. That usually
sorts out the good ideas from the bad ones.
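
A compilable sketch of the idiom, with made-up members and a hypothetical `mytype_init` added so there is something to copy.

```c
#include <string.h>

typedef struct { int id; char name[16]; } MyType[1];

/* An array parameter is adjusted to a pointer, so the callee works on
   the caller's object without the caller writing '&'. */
void mytype_init(MyType t, int id, const char *name)
{
    t->id = id;
    strncpy(t->name, name, sizeof t->name - 1);
    t->name[sizeof t->name - 1] = '\0';
}

void mytype_copy(MyType dst, const MyType src)
{
    *dst = *src;            /* struct assignment on the single element */
}

int example(void)
{
    MyType a, b;

    mytype_init(b, 7, "seven");
    /* a = b;  would not compile: arrays are not assignable */
    mytype_copy(a, b);      /* no &a needed on the target */
    return a->id;
}
```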
 

Markus Wichmann

C'mon, Kaz, you know better. Usually, an instance inhabits a
`static' somewhere (so its name is not available outside the scope
of its keeper), or it inhabits dynamically allocated memory (which
has no name at all).

True story: A former employer's flagship product did things with
documents, and the earliest versions could handle only one document
at a time. THE document was, fairly naturally, described by a host
of global variables: What are THE page margins, what is THE set of
paragraph styles, what is THE associated file name, and so on. At
some point (before I joined), the product was extended to handle
multiple documents simultaneously -- but by then, all those globals
had infiltrated themselves into too many places to extricate; there
just Was Not Going To Be an attempt to track down every reference to
every global and route it through a pointer instead, nor to inflate
all the functions and function calls to pass the pointer around.

Where's the problem? Write a tool that notices when a function uses one
of the globals (you surely had at least a list of them, hadn't you?).
Then rewrite the function prototype in the header, if any, and the
function declaration in the code file so that it takes a pointer to a
struct as first argument and references the globals through that
pointer. Then remove any _definition_ of the globals and bundle them in
said struct. For starters, define one static instance of the struct
somewhere, but remove it later after inclusion of multiple-instance code.

By now, every old usage of the functions should result in a compiler
error or warning, and every usage of the globals should result in a
linker error. Correct those and you are nearly done. Now you only need
to find leftover locally defined static-duration variables that only
work for one document and eradicate those. Done!
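
What that rewrite produces for a single function, sketched with invented names (`struct document`, `text_width`, `set_margins`); the real globals and functions are not shown in the thread.

```c
/* Former globals, bundled into one struct per document. */
struct document {
    int margin_left;
    int margin_right;
    int page_width;
};

/* Before: int text_width(void)
           { return page_width - margin_left - margin_right; } */
int text_width(const struct document *doc)
{
    return doc->page_width - doc->margin_left - doc->margin_right;
}

void set_margins(struct document *doc, int left, int right)
{
    doc->margin_left = left;
    doc->margin_right = right;
}
```

Two documents are then just two structs, and changing one cannot disturb the other. The catch raised in the replies is the number of functions and call sites that need this treatment.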
Solution? The overarching framework noticed when the user moved
from Document A to Document B, and swapped things around to make it
work. It saved all the globals for A into one big struct, and then
restored all their values from B's struct (build-time tools created
the struct definition and the save/restore code, with a little help
from source-code markers to identify the globals). When the user then
moved to Document C, B's globals were squirrelled away and C's values
overwrote them.

So you had to create the struct anyway already!
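
For contrast, a toy version of the swap hack being described, assuming three invented globals and a fixed number of documents. `SelectDocument` is the name used earlier in the thread; everything else is hypothetical, and the real save/restore code was machine-generated.

```c
/* The legacy globals every function already uses. */
int margin_left, margin_right, page_width;

/* Backing store: one copy of all the globals per document. */
struct doc_state { int margin_left, margin_right, page_width; };

#define MAX_DOCS 4
static struct doc_state saved[MAX_DOCS];
static int current_doc;

void SelectDocument(int n)
{
    if (n < 0 || n >= MAX_DOCS || n == current_doc)
        return;
    /* save the outgoing document's globals */
    saved[current_doc].margin_left  = margin_left;
    saved[current_doc].margin_right = margin_right;
    saved[current_doc].page_width   = page_width;
    /* restore the incoming document's globals */
    margin_left  = saved[n].margin_left;
    margin_right = saved[n].margin_right;
    page_width   = saved[n].page_width;
    current_doc = n;
}

/* Existing code is untouched: it still reads the globals directly. */
int text_width(void) { return page_width - margin_left - margin_right; }
```

Every switch copies every global twice, so the cost of `SelectDocument()` grows with the number of globals, which fits the thrashing described in the story.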
By the time I left that employer, the size of the automatically-
generated struct and of the save/restore functions had grown to the
point where some platforms' compilers could no longer handle them,

How is that possible? You would need to break some hard limit in the
compiler, like 64KB on a 16-bit machine. You had 64KB worth of globals?
and the tools had been modified to break the struct into three, with
three sets of functions. Also, just moving your mouse from Window A
to Window B could drive your paging disk berserk as it tried to handle
all these references to globals scattered hither and yon all over the
address space; you could force your workstation to its knees just by
flicking your mouse back and forth on the screen.

At that point the "solution" has become unviable and another solution
must be sought. But, well, that's gonna cost money, right?

One solution would have been to leave the program in its state with all
the globals and just put a wrapper there to handle multiple documents
with multiple processes. Then the OS can worry about the swap drive,
with copy-on-write pages and such.
That's the power of globals.

I take two lessons from this experience: First, sufficient ingenuity
and a willingness to hack can come up with a short-term solution to most
any problem. Second, short-term solutions have little staying power.

Good design helps, too: If the company had worked object-orientedly from
the start (which would have resulted in a program that used those
structs from the start), much money could have been saved. But of
course, there would have been more money needed to be spent before the
first release.

What do we learn? Global variables are usually a hack that can be
removed by good design and a good compiler. Seriously, a compiler with
link time optimization should not yield worse results with a structured
global than with individual globals.

Ciao,
Markus
 

Eric Sosman

Where's the problem?

Scale. It is easy to kill one rhinovirus, but rather more
difficult to cure the common cold.
Write a tool that notices when a function uses one
of the globals (you surely had at least a list of them, hadn't you?).

Yes, implicitly by markers around their definitions. The build
tools used the markers to generate the "backing store" structs and
the save/restore functions that handled them, and it would have been
easy to make them emit the information in other forms, as well.
Then rewrite the function prototype in the header, if any, and the
function declaration in the code file so that it takes a pointer to a
struct as first argument and references the globals through that
pointer. [...]

You've put "if any" in the wrong place: Headers yes, prototypes
no, because the code's origins antedated the ANSI Standard by about
a decade. (In later years I supported a move to "ANSIfy" the code
base, but engineers with greater clout and seniority argued against
it and the project never got started.)
By now, every old usage of the functions should result in a compiler
error or warning, and every usage of the globals should result in a
linker error. Correct those and you are nearly done.

Kill a billion rhinovirii and you're well on your way to being
rid of your cold.
Now you only need
to find leftover locally defined static-duration variables that only
work for one document and eradicate those.

Obviously any such leftovers had already been eliminated, or the
swap-and-restore hack would not have worked.
How is that possible?

Scale. (Is there an echo in here?)
You would need to break some hard limit in the
compiler, like 64KB on a 16-bit machine. You had 64KB worth of globals?

Easily. How many did we eventually have? Sorry, I don't recall;
it was more than a quarter-century ago. I can't even recall whether
the limits we hit pertained to struct size or to function size, nor
which platforms' compilers croaked on them. (All were 32-bit, by the
way. A serious effort to get our stuff running on a 16-bit compiler
with a 32-bit address extension proved fruitless, and our only 64-bit
port came along rather late in the code base's life.)
 

Shao Miller

Shao Miller said:
Inside a function, you can do such initializations (under C99 - not
under C89 with GCC and -pedantic), though an array of dimension 1 is
modestly pointless.

Actually, I think that some folks use an '[1]' for 'struct's and
union's in their code just for the sake of using only the '->'
operator throughout their code.
<snip example>

I've also seen it used in some APIs. You get a sort of automatic call
by reference and copying (by assignment) is prohibited:

typedef struct { /* stuff */ } MyType[1];

MyType a, b;
/* ... */
a = b; /* illegal */
*a = *b; /* legal */
mytype_copy(a, b); /* no need for &a on the target */

I don't know if I like it or hate it! To decide, I think I'd have to
maintain a large body of code that uses such an API. That usually
sorts out the good ideas from the bad ones.

Yeah.

On another note, 'clang -ansi -pedantic' appears to be ok with the
following code, but 'gcc -ansi -pedantic' doesn't (4.3.3):

#include <stdio.h>

struct s_foo { unsigned char bytes[2]; };
typedef struct s_foo foo_t[1];
struct s_foo_wrapper { foo_t foo; };

int main(void) {
    register const struct s_foo_wrapper reg_foo =
        { { { { 42, 42 } } } };
    struct s_foo_wrapper normal_foo;

    normal_foo = reg_foo;
    printf(
        "%d and %d\n",
        normal_foo.foo->bytes[0],
        normal_foo.foo->bytes[1]
      );
    return 0;
}

I am possibly overlooking something...

register.c: In function 'main':
register.c:8: error: register name not specified for 'reg_foo'
 

Keith Thompson

Shao Miller said:
On another note, 'clang -ansi -pedantic' appears to be ok with the
following code, but 'gcc -ansi -pedantic' doesn't (4.3.3):

#include <stdio.h>

struct s_foo { unsigned char bytes[2]; };
typedef struct s_foo foo_t[1];
struct s_foo_wrapper { foo_t foo; };

int main(void) {
register const struct s_foo_wrapper reg_foo =
{ { { { 42, 42 } } } };
struct s_foo_wrapper normal_foo;

normal_foo = reg_foo;
printf(
"%d and %d\n",
normal_foo.foo->bytes[0],
normal_foo.foo->bytes[1]
);
return 0;
}

I am possibly overlooking something...

register.c: In function 'main':
register.c:8: error: register name not specified for 'reg_foo'

That's a bug in gcc. Nothing in standard C requires a "register name".

Why did you bother with the "register" keyword there anyway?
 

Shao Miller

Shao Miller said:
On another note, 'clang -ansi -pedantic' appears to be ok with the
following code, but 'gcc -ansi -pedantic' doesn't (4.3.3):

#include <stdio.h>

struct s_foo { unsigned char bytes[2]; };
typedef struct s_foo foo_t[1];
struct s_foo_wrapper { foo_t foo; };

int main(void) {
register const struct s_foo_wrapper reg_foo =
{ { { { 42, 42 } } } };
struct s_foo_wrapper normal_foo;

normal_foo = reg_foo;
printf(
"%d and %d\n",
normal_foo.foo->bytes[0],
normal_foo.foo->bytes[1]
);
return 0;
}

I am possibly overlooking something...

register.c: In function 'main':
register.c:8: error: register name not specified for 'reg_foo'

That's a bug in gcc. Nothing in standard C requires a "register name".

Why did you bother with the "register" keyword there anyway?

Well, 'register' suggests that an access be "as fast as possible" and I
was pondering 'reg_foo' having the potential to be "an immediate
operand" for some given architecture. In that case, not only would
'reg_foo' not be addressable, but it mightn't even be stored in "a
register," but still could contain a C array value and still be useful
for assignment.

I'd assume that a decent compiler could do the same thing if 'register'
was replaced with 'static', there, or even with no explicit storage
class specifier.
 

Phil Carmody

Eric Sosman said:
printf ("Hello, world!\n");

(If you don't see the global variable, try using fprintf() instead.)

stdout's not a global variable, it's an expression. But more than that,
I do not see the C&V for that expression having to be a constant expression.
(perhaps my failing, enlightenment would be much appreciated).

If indeed it may be otherwise, it might not be an address constant (+/-
constant offset).

I guess the logical conclusion from that is that this might be valid:

FILE* __get_stdout_pointer(int i) { static FILE files[3]; /* ... */ }
...
extern FILE* __get_stdout_pointer(int);
#define stdout (__get_stdout_pointer(1))

Which would be surprising, as I'm sure a lot of code has assumed that
stdout's a constant expression.

Phil
--
I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
 

Phil Carmody

Kaz Kylheku said:
On my machine, stdout is an address constant,
which is not something that I would refer to as a "variable".

I can compile this:

FILE *array[] = {stdout};

That will work at file scope or global scope on some platforms - such as
Solaris and probably HP-UX and AIX. It will not work on Linux or BSD
(including Mac OS X), so portable code doesn't try to initialize global
or file-scope (static) variables like that, nuisance though it be.

I think the point was to make exactly that point. And that was pointless
to begin with. stdout is morally a global variable. It's a good example of
a use for global variables: pervasive symbols that refer to something.

It is useful to be able assign a value to stdout to redirect output elsewhere,
and then restore the value.

So stdout is actually functionally crippled by /not/ being defined as a
variable.

Why do you want to leak both its value and its address, when all that
is needed by the clients is the value? (c.f. errno.)

Phil
--
I'd argue that there is much evidence for the existence of a God.
Pics or it didn't happen.
-- Tom (/. uid 822)
 
