Making Fatal Hidden Assumptions

Michael Wojcik

I should have quit when you first resorted to insulting those who
disagree with you. Bye, now.

Pompous insults are Alf's usual recourse when he's confronted with
someone willing to point out that his arguments are vacuous.

I haven't killfiled him because he occasionally has something
worthwhile to say in technical discussions, but in an argument of
this sort it's best just to ignore him. If past behavior is any
indication, he'll just try to shout everyone else down until we all
get tired of him and give up.
 
Alf P. Steinbach

* Michael Wojcik:
Pompous insults are Alf's usual recourse when he's confronted with
someone willing to point out that his arguments are vacuous.

I haven't killfiled him because he occasionally has something
worthwhile to say in technical discussions, but in an argument of
this sort it's best just to ignore him. If past behavior is any
indication, he'll just try to shout everyone else down until we all
get tired of him and give up.

Al chose to just insult me, in his first message -- I don't know why,
and generally I won't speculate what his reasons could be.

There is no technical content in your posting, but there is a pack of lies
and personal attacks, the usual hare-brained ad hominem attack.

Just as I won't speculate about Al's reasons for going 100% personal, I
won't speculate about your reasons.
 
Dave Thompson

Actually I still think in PDP assembler at times (my first
assembler programming).
So y=x++; really does map to a single instruction which both moves the
value to y and increments x (which had to be held in a register IIRC).

PDP was (almost) the entire product line of DEC for much of its life,
containing some similar architectures and some quite different ones
with (thus) different assembly languages.

You are probably thinking of PDP-11 autoincrement. This uses the
location _addressed_ by a register (only) which it increments (by
stride) at the same time:
register char * a = ?, b = *a++; // compiles to MOVB (R1)+, R2
register int * a = ?, b = *a++; // compiles to MOV (R1)+, R2
// note PDP-11 pointers are byte addresses & this adds _2_
// I write register explicitly for clarity, although a compiler could
// place variables not declared register in a register, and could
// choose not to use a register even though you specify it.
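
A minimal C sketch of the `*a++` idiom discussed above. The names `copy_ints` and `int_stride` are made up for illustration, and nothing here guarantees that a compiler emits an autoincrement addressing mode; it only shows that stepping a pointer advances it by one element, not one byte.

```c
#include <assert.h>
#include <stddef.h>

/* Copy n ints using the *p++ idiom. Each ++ moves the pointer forward by
   one element (sizeof(int) bytes), which is the "stride" a PDP-11 word
   autoincrement would add to the register. */
void copy_ints(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    while (n-- > 0)
        *dst++ = *src++;
}

/* Byte distance covered by incrementing an int pointer once. */
size_t int_stride(void)
{
    int pair[2] = { 0, 0 };
    int *p = pair;
    int *q = p;
    q++; /* now points at pair[1] */
    return (size_t)((char *)q - (char *)p);
}
```

On a machine with 2-byte ints `int_stride()` would report 2, matching the "adds _2_" remark; on most current machines it reports 4.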

<snip other points about assy, macro assy, and C I mostly agree with>

- David.Thompson1 at worldnet.att.net
 
Dave Thompson

Forgive my memory, but is it PL/1 or ADA that lets the programmer define
what integer type he wants? Syntax was something like
INTEGER*12 X
which defined X as a 12 bit integer. (Note that such syntax is portable in
that on two different processors, you still know that the range of X is
+2048 to -2047.)

You mean -2048 to +2047 for two's complement, and -2047 to +2047 for
the extremely rare case of non-2sC or 2sC-with-trap-representation.

PL/1 is like DECLARE X FIXED BINARY (11); /* not counting sign */
Pascal uses the actual range like X: -2047 .. +2047 and Ada similarly
except, as arguably usual, more verbosely in most cases. In both cases
you are only guaranteed _at least_ 12 bits; in Ada you can
additionally specify a 'representation clause' that requires exactly
12 bits (or a diagnostic if that can't be implemented).
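
The ranges quoted above can be checked with a small C sketch. The function names are hypothetical, and `n` here counts the sign bit (so 12 corresponds to PL/1's `FIXED BINARY (11)`).

```c
#include <assert.h>

/* Smallest value of an n-bit two's complement integer (n includes the sign). */
long twos_complement_min(int n)
{
    assert(n >= 1 && n <= 31); /* long is guaranteed at least 32 bits */
    return -(1L << (n - 1));
}

/* Largest value of an n-bit two's complement integer. */
long twos_complement_max(int n)
{
    assert(n >= 1 && n <= 31);
    return (1L << (n - 1)) - 1;
}

/* Smallest value for a symmetric representation (sign-magnitude or ones'
   complement), or two's complement with the most negative pattern reserved
   as a trap representation. */
long symmetric_min(int n)
{
    return -twos_complement_max(n);
}
```

For `n == 12` these give -2048, +2047 and -2047 respectively.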

The syntax INTEGER*n is a common extension in Fortran (though not
standard) for an integer of n _bytes_, not bits.

The point is a 16bit integer in ADA is always a 16bit integer and
writing
x=32768 +10
will always overflow in ADA, but it is dependent on the compiler and
processor in C. It can overflow, or it can succeed.

But my point on this was, you need to know your target processor in C
more than in a language like ADA. This puts a burden on the C
programmer closer to an assembler programmer on the same machine than
to an ADA programmer.

I don't think this is true. In both languages it is fairly easy to do
portable but perhaps overconservative code. In C it is easy to get
fairly well 'down to the metal' if you want; in Ada it is fairly easy
to get even further down if the compiler supports it. And at the
extreme, Ada standardly requires an interface to assembler; C does not
do this standardly, but practically all implementations have some way.
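
As one illustration of "portable but perhaps overconservative" C, here is a sketch of an addition that tests against the limits in `<limits.h>` before adding, so it never depends on how wide `int` happens to be. The names `checked_add` and `fits_int16` are invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Store a + b in *sum and return true if the result fits in an int;
   return false, leaving *sum untouched, if the addition would overflow.
   The test is done before the addition, so no overflow ever occurs. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* True if v is representable in a 16-bit two's complement integer,
   whatever the native int width is. */
bool fits_int16(long v)
{
    return v >= -32768L && v <= 32767L;
}
```

With these, a value such as 32768 + 10 is rejected by `fits_int16` on every implementation, rather than overflowing on some and succeeding on others.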

No I was talking about the original motivation for the design of the
language. It was designed to exploit the register increment on DEC
processors. In the right context (e.g. y=x++;), the increment doesn't
even become a separate instruction, as I mentioned in another post.

Neither motivation nor correct effect; see other posts.

Let's put it this way. There is a gradient scale, from pure digits of
machine language (e.g., programming opcodes in binary is closer to the
hardware than using octal or hex)
at the lowest end and moving up past assembler to higher and higher
levels of abstraction away from the hardware. On that scale, I put C
much closer to assembler than any other HLL I know. Here are some samples:

s/much/noticeably/ and I agree. (Except see below.)

PERL, BASH, SQL

Also awk. (Although you could consider it subsumed by perl.) And all
Unix shells more or less, not just bash. Also the LISP tribe (Scheme,
etc.) and Prolog here or perhaps a smidgen higher.

C++, JAVA
PASCAL, FORTRAN, COBOL

I'd insert Ada here.

I'd insert FORTH here. And maybe pull some of the more powerful macro
assemblers above 'basic' assembler.

assembler
HEX opcodes
binary opcodes

I wouldn't distinguish hex from binary; that's trivial.

I'd insert microcode and then registers/gates here.

digital voltages in the real hardware.

And below that turtles. <G>

- David.Thompson1 at worldnet.att.net
 
Ed Prochak

Chris said:
Alf said:
C was designed as a portable assembly language, the most successful so
far, so if the term has any practical meaning, then C is that meaning.

I would have said it was designed as a portable language to /replace/
[the use of] assembly language(s) [for a variety of "systems programming"
uses].

C looks nothing like any assembly language I've ever seen, which I
suppose may just indicate my limited knowledge of assemblers. Calling
it "portable assembly language" is an instructive metaphor, but it
is just that - a metaphor.

yes, he gets it!
 
Stephen Sprunk

Ed Prochak said:
Some other languages do so much more for you that you might be scared
to look at the disassembly. e.g. languages that do array bounds
checking for you will generate much more code for a[y]=x; than does C.
You can picture the assembly code for C in your head without much
difficulty. The same doesn't hold true for some other languages.

There are C compilers that will add bounds-checking code for a[y]=x, and
it's quite ugly if disassembled. I think what you mean is that C doesn't
require implementations to do it (and most don't), but other languages do.
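
To make the difference concrete, here is a hand-written C sketch of what a bounds-checked store has to do compared with the plain one. It is not the output of any particular compiler, and the function names and the explicit `len` parameter are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What C requires for a[y] = x: compute the address and store. Staying in
   range is the caller's responsibility; the assert only documents that a
   must point at an array. */
void store_unchecked(int *a, size_t y, int x)
{
    assert(a != NULL);
    a[y] = x;
}

/* Roughly the extra work a bounds-checking implementation adds: it must
   carry the array length along and compare the index before storing.
   Returns false, and stores nothing, when y is out of range. */
bool store_checked(int *a, size_t len, size_t y, int x)
{
    if (a == NULL || y >= len)
        return false;
    a[y] = x;
    return true;
}
```

The checked version needs an extra operand, a comparison and a branch per access, which is the additional code one would expect to see in the disassembly.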

I shudder to think of what the asm would look like for a[y]=x if a were a
C++ object with virtual methods for the [] and = operators, with the latter
having to call a copy constructor on x. For fun, compile as PIC too.

However, I agree with the general statement that when you write C, you have
a good shot at imagining what the corresponding asm would be. I'm not sure
that makes it a glorified assembler, however.

C's greatest feature, and its worst, is that you can do all sorts of ugly
unportable things that virtually no other HLL allows but also have portable
constructs available: it's your choice which to use. This means you can
write non-portable implementation code in the same language as portable user
code, and IMHO is the reason for C's enduring popularity.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

 
Ed Prochak

Andrew said:
I posted a page-long description of what I conceive a universal assembler
to be in a previous message in the thread. Perhaps it didn't get to your
news server? Google has it here:
http://groups.google.com/group/comp.lang.c/msg/a91a898c08457481?hl=en&

The main properties that it would have, compared to a C (some other HLLs
do have some of these properties) are:

a) Rigidly defined functionality, without "optimization", except for
instruction scheduling, in support of VLIW or (some) superscalar cores.
(Different ways of expressing a particular algorithm, which perform more
or less efficiently on different architectures should be coded as such,
and selected at compile/configuration time, or built using
meta-programming techniques.) This is opposed to the HLL view which is
something like: express the algorithm in a sufficiently abstract way and
the compiler will figure out an efficient way to code it, perhaps. Yes,
compilers are really quite good at that, now, but that's not really the
point. This aspect is a bit like my suggestion in the linked post as
being something a bit like the Java spec, but without objects. Tao's
"Intent" VM is perhaps even closer. Not stack based. I would probably
still be happy if limited common-subexpression-elimination (factoring) was
allowed, to paper-over the array index vs pointer/cursor coding style vs
architecture differences.

If I can summarize this as:
-- the source code changes when the underlying processor architecture
changes
then I agree this is a key reason why I consider C a glorified
assembler.

b) Very little or no "language" level support for control structures or
calling conventions, but made-up-for with powerful compile-time
meta-programming facilities, and a standard "macro" library that provides
most of the expected facilities found in languages like C or Pascal. Much
of what are now thought of as compiler implementation features would wind
up in macro libraries. The advantage of this would be that code could be
written to *rely* on specific transformation performance and existence,
instead of just saying "hope that your compiler is clever enough to
recognize this idiom", in the documentation. It would also make possible
the sorts of small code factorizations that happen all the time in
assembly language, but which single-value-return, unnested function call
conventions in C make close to impossible. Or different coding styles,
like threaded interpreters, reasonable without language extensions.

Interesting features. I'm not sure how much different multiple value
returns would be from values returned via reference parameters
(pointers). It sounds like a good idea.

I imagine something like LLVM (http://llvm.cs.uiuc.edu/), but with a
powerful symbolic compile-time macro language on top (eg scheme...), an
algebraic (infix) operator syntax, and an expression parser.

In the mean time, "C", not as defined by the standard, but as implemented
in the half dozen or so compilers that I regularly use, is not so far from
what I want, to make me put in the effort to build my universal assembler
myself.

Cheers,

Thanks for the contribution to the discussion.
Ed
 
Stephen Sprunk

Al Balmer said:
Decades of use? This isn't a new rule.

An implementation might choose, for valid reasons, to prefetch the
data that pointer is pointing to. If it's in a segment not allocated
...

If a system traps on a prefetch, it's fundamentally broken. However, a
system that traps when an invalid pointer is loaded is not broken, and the
AS/400 is the usual example. Annoying, but not broken.

Why IBM did it that way, I'm not sure, but my guess is they found it was
cheaper to do validity/permission checks when the address was loaded than
when it was used since the latter has a latency impact.

S

 
Andrew Reilly

If I can summarize this as:
-- the source code changes when the underlying processor architecture
changes
then I agree this is a key reason why I consider C a glorified
assembler.

That's pretty close. I think that the link to Dan Bernstein's page on the
topic said it better than me: you can use different code and different
approaches where it matters to both the program and to the target
processor, but you can also use a simpler, generic approach that will just
work anywhere, when absolute maximum performance isn't necessary.

Interesting features. I'm not sure how much different multiple value
returns would be from values returned via reference parameters
(pointers). It sounds like a good idea.

The significant difference is that reference parameters (pointers) can't
be in registers. (Not to mention the inefficiency of repeatedly pushing
the reference onto the call stack...)

Say you have a few to half a dozen pieces of state in some algorithm, and
the algorithm operates through a pattern of "mutations" of that state,
such that some or all of the state changes as a result of each operation.
The only way to code that in C is either to write out the code that
comprises the element operations of each pattern long-hand, or use
preprocessor macros.

The most obvious concrete example of this sort of thing is the pattern
where you have one or more "cursors" into a data structure, and code that
walks through it, producing results at the same time. You want your
"codelets" to return both their result *and* change the cursor to point to
the next element in the list to be processed. In C, you can't have both
the result and the pointer in registers, but that's how you would code it
in assembly.
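
The cursor pattern described above can be written in C in at least two ways, sketched here with hypothetical names. Whether the value and the cursor end up in registers is not something the language controls; it depends on the calling convention, on inlining, and on the optimizer, so this only shows the two source-level shapes.

```c
#include <assert.h>
#include <stddef.h>

/* Style 1: the result is returned and the cursor is updated through a
   pointer to it, so the caller must make the cursor addressable. */
int next_via_pointer(const int **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    int value = **cursor;
    (*cursor)++;
    return value;
}

/* Style 2: return the result and the advanced cursor together, by value,
   in a small struct. Nothing in the caller has its address taken. */
struct step {
    int value;
    const int *next;
};

struct step next_via_struct(const int *cursor)
{
    assert(cursor != NULL);
    struct step s = { *cursor, cursor + 1 };
    return s;
}

/* Walk [begin, end) with the struct-returning codelet and sum the values. */
int sum_via_struct(const int *begin, const int *end)
{
    int total = 0;
    while (begin != end) {
        struct step s = next_via_struct(begin);
        total += s.value;
        begin = s.next;
    }
    return total;
}
```

Some ABIs return a two-member struct like `struct step` in a register pair, and a compiler that inlines either codelet is free to keep everything in registers, but neither is guaranteed by the C standard.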

Cheers,
 
Stephen Sprunk

Rod Pemberton said:
True. Mostly black people.

In my experience (which is more limited to recent years than many others'
here), it is typically do-gooder whites that are offended by words like
"black" or "oriental" or "Indian" (referring to the US domestic variety).

I recall an interview of Nelson Mandela by (I think) Dan Rather shortly
after the former's first election, and he was asked "How does it feel to be
the first African-American president of South Africa?" Mandela was
understandably confused, but the interviewer simply couldn't bring himself
to say the word "black". Mandela finally figured it out and answered, but
he had to come away from that thinking all Americans are complete dolts.

S

 
Andrew Reilly

If a system traps on a prefetch, it's fundamentally broken. However, a
system that traps when an invalid pointer is loaded is not broken, and the
AS/400 is the usual example. Annoying, but not broken.

And I still say that constraining C for everyone so that it could fit the
AS/400, rather than making C-on-AS/400 jump through a few more hoops to
match traditional C behaviour, was the wrong trade-off. I accept that
this may well be a minority view.
 
CBFalconer

Stephen said:
.... snip ...

Why IBM did it that way, I'm not sure, but my guess is they found
it was cheaper to do validity/permission checks when the address
was loaded than when it was used since the latter has a latency
impact.

A single pointer check can validate the pointer for multiple
dereferences. This is much cheaper than checking it at each
dereference.
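
A software analogy of the same trade-off, as a sketch only: the hardware check on the AS/400 is not something C code performs, and `struct buffer`, `buffer_valid` and `buffer_sum` are invented for illustration. The point is that one validation up front covers every dereference in the loop that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A pointer bundled with the number of bytes it may be used to read. */
struct buffer {
    const unsigned char *data;
    size_t len;
};

/* The single "load-time" check: a null data pointer is acceptable only
   when there is nothing to read. */
bool buffer_valid(const struct buffer *b)
{
    return b != NULL && (b->data != NULL || b->len == 0);
}

/* Validate once, then dereference b->data len times with no further
   per-access checks. Returns false if the buffer is not valid. */
bool buffer_sum(const struct buffer *b, unsigned long *out)
{
    assert(out != NULL);
    if (!buffer_valid(b))
        return false;

    unsigned long total = 0;
    for (size_t i = 0; i < b->len; i++)
        total += b->data[i];
    *out = total;
    return true;
}
```

Checking inside the loop instead would repeat the same test `len` times for no extra safety.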
 
Dik T. Winter

>
> The significant difference is that reference parameters (pointers) can't
> be in registers.

Why not? I have worked with a lot of implementations where the first few
parameters were passed through registers. (Depending on the processor,
from four to eight.) And in many cases there is no need at all to put those
pointers on the stack.
 
Keith Thompson

Andrew Reilly said:
And I still say that constraining C for everyone so that it could fit the
AS/400, rather than making C-on-AS/400 jump through a few more hoops to
match traditional C behaviour, was the wrong trade-off. I accept that
this may well be a minority view.

It is. The C standard wouldn't just have to forbid an implementation
from trapping when it loads an invalid address; it would have to
define the behavior of any program that uses such an address. A
number of examples have been posted here where that could cause
serious problems for some implementations other than the AS/400.
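
Whatever one thinks of the rule, it is easy to stay inside it. The sketch below (the function name `find_last` is made up) scans an array backwards using an index, so no pointer to before the first element is ever formed, not even as a loop terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the last element equal to key, or n if there is none.
 *
 * The tempting pointer version,
 *     for (p = a + n - 1; p >= a; p--) ...
 * computes a - 1 on its final decrement, which is undefined behavior in C
 * even though the pointer is never dereferenced. Counting an index down
 * from n avoids forming that address at all.
 */
size_t find_last(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    for (size_t i = n; i-- > 0; ) {
        if (a[i] == key)
            return i;
    }
    return n;
}
```

The loop condition tests `i > 0` before decrementing, so the body only ever sees indices from `n - 1` down to 0.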
 
Arthur J. O'Dwyer

Why not? I have worked with a lot of implementations where the first few
parameters were passed through registers. (Depending on the processor,
from four to eight.) And in many cases there is no need at all to put those
pointers on the stack.

I believe Andrew means

void foo(int & x)
{
    use(x);
}

void bar()
{
    register int a;
    foo(a); /* Will C++ accept this? */
}

I don't know whether standard C++ would accept the above code, or whether
it would, like standard C, insist that the programmer can't take the
address of a 'register' variable, even implicitly. But in any case, it
would be hard for the compiler to put the variable 'a' into a machine
register when it compiles 'bar', because it needs to pass its address
to 'foo' later on.
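
For comparison, here is the plain C analogue using a pointer parameter instead of a reference. In C, applying `&` to an object declared `register` is a constraint violation, so the offending call is left commented out and an addressable copy is used instead. The names `bump` and `bar` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Callee that modifies its argument through a pointer. */
void bump(int *x)
{
    assert(x != NULL);
    *x += 1;
}

int bar(void)
{
    register int a = 41;

    /* bump(&a); */ /* not allowed in C: & cannot be applied to a
                       register-declared object */

    int tmp = a;  /* addressable copy that can be passed by pointer */
    bump(&tmp);
    a = tmp;      /* copy the updated value back */
    return a;
}
```

The copy in and out is the price of keeping `a` eligible for a register, unless the compiler inlines `bump` and removes the indirection itself.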

HTH,
-Arthur
 
Jordan Abel

It is. The C standard wouldn't just have to forbid an implementation
from trapping when it loads an invalid address; it would have to
define the behavior of any program that uses such an address.

Why? It's not that difficult to define the behavior of a program that
"uses" such an address other than by dereferencing, and no problem to
leave the behavior undefined for dereferencing.
 
Dik T. Winter

>
> Why? It's not that difficult to define the behavior of a program that
> "uses" such an address other than by dereferencing, and no problem to
> leave the behavior undefined for dereferencing

But that would have locked out machines that strictly separate pointers
and non-pointers, in the sense that you can not load a pointer in a
non-pointer register and the other way around. Note also that on the
AS/400 a pointer is longer than any integer, so doing arithmetic on them
in integer registers would require quite a lot.
 
Jordan Abel

But that would have locked out machines that strictly separate pointers
and non-pointers, in the sense that you can not load a pointer in a
non-pointer register and the other way around. Note also that on the
AS/400 a pointer is longer than any integer, so doing arithmetic on them
in integer registers would require quite a lot.

Surely there's some way to catch and ignore the trap from loading an
invalid pointer, though. I mean, it stops _somewhere_ even as it is now,
unless the register melts the silicon and drips through the floor, then
accelerates to the speed of light.
 
