The way to read STL source code

Nick Keighley

"Nick Keighley" <[email protected]> wrote in message
learning how that language is converted to assembly [is important]
Each foo(), *foo, foo, ++foo, foo++, and for that matter, '{' and '}' has
in my mind very specific implications and costs.

[...]
As I type or read each
'if', 'switch', 'for', 'try', 'catch', and 'return', I hold in mind on a
subconscious level the underlying assembler, albeit on a broad generalized
model.
[...]

To me C is the machine at the bottom.

#this is io_x
#C can not be the one machine language.


probably not ideal. Trying to write an interpreter for another
language has taught me that C is quite awkward for some things. I want
computed gotos!
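For context, a minimal sketch of what standard C gives an interpreter writer instead of computed gotos: a `switch` inside a loop. The opcodes and the `run` function are made up for illustration and do not come from the thread. GCC and Clang provide "labels as values" (`&&label` with `goto *ptr`) as a non-standard extension, which is the kind of computed goto being wished for here; the portable version below has to return to the top of the loop after every instruction.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine (not from the thread). */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs the program and returns the value on top of the stack at OP_HALT.
   A sketch only: there are no stack overflow or underflow checks. */
long run(const int *code)
{
    long stack[64];
    size_t sp = 0;      /* number of values currently on the stack */
    size_t pc = 0;      /* index of the next instruction */

    for (;;) {
        switch (code[pc++]) {   /* every opcode comes back through this switch */
        case OP_PUSH:
            stack[sp++] = code[pc++];
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
        default:
            return sp ? stack[sp - 1] : 0;
        }
    }
}
```

A program such as `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }` evaluates (2 + 3) * 4.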
#it will be the one machine language if it has no Undefined Behaviour,

not really relevant. Some assembly languages and even machine codes
have "undefined behaviour". Provided you stick to well defined
operations you have well defined semantics.
#and types of known size, and operations on these types all the same for all PCs

not relevant
#and I think a way to reach the stack it [C] uses for calling functions
#and for saving variables in the function preamble...

can't see the need. For an interpreter I hand built the list of
environments ("the stack")
 
Nick Keighley

Nick Keighley said:
On Feb 19, 8:54 am, "Rod Pemberton" <[email protected]> wrote:
But I use an imaginary machine [...]

[if I do insist on compiling to machine code]
How do you "confirm correctness" by translating it into I86 assembler?
to me "correctness" is a mathematical property.

#io_x in his ot language
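#balle; non puoi dimostrare la correttezza matematica di una espressione
#che ha diversi risultati dipendendo dalla macchina effettiva in cui gira....
#[in English: nonsense; you cannot prove the mathematical correctness of an expression
#that has different results depending on the actual machine it runs on....]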

english please
And if its good enough for Knuth...


code inspection, DbC, unit test

.....


no. I know plenty of professional programmers and vanishingly few of
them look at the assembler. Most would laugh at the idea.

#so they program what?

they program in C (or C++ or whatever)
#not a machine; a machine needs a precise language
#perhaps they program their imaginary machine != every machine on this planet

quarks. it's quarks all the way down
 
BartC

Nick Keighley said:
english please

As I understood it: "Nonsense; you can't demonstrate the mathematical
correctness of an expression that has different results depending on the
machine in which it runs".

Although since that almost makes sense, it might not be quite right..
 
Nick Keighley

"Nick Keighley" <[email protected]> wrote in message
learning how that language is converted to assembly [is important]
Each foo(), *foo, foo, ++foo, foo++, and for that matter, '{' and '}' has
in my mind very specific implications and costs.

[...]
To me C is the machine at the bottom. [the implementation machine]


Note I'm only talking about C/Assembler as the language to define high
level semantics. I think assembler is a poor way to describe
semantics. It's too verbose and too far away from HLLs. It's also
hellish to understand.
probably not ideal. Trying to write an interpreter for another
language has taught me that C is quite awkward for some things. I want
computed gotos!


not really relevant. Some assembly languages and even machine codes
have "undefined behaviour". Provided you stick to well defined
operations you have well defined semantics.

#I don't agree; a machine has to execute its statements with no UBs
#if an assembly language has UBs, for me it is not an assembly language and cannot
#drive a CPU

sadly you are trumped by reality. The Z80 (and I believe 8080) had
unused op-codes. By tracing the logic of how instructions were built
you could deduce what these op-codes should do. And on many CPUs they
did exactly what you expected. But I suspect the manufacturers had
damn good reason for not including these instructions so you'd have to
be mad to use them.

I'm sure plenty of things give UB. Divide by zero, large shifts,
contention between coprocessors.
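As an illustration of the "stick to well defined operations" advice applied to C itself, here is a hedged sketch of wrappers that refuse the first two cases just mentioned. The names `safe_div` and `safe_shl` are hypothetical, and `safe_shl` assumes `unsigned` has no padding bits.

```c
#include <limits.h>

/* Stores a / b in *q and returns 1, or returns 0 when C leaves the
   division undefined (division by zero, or INT_MIN / -1 overflowing). */
int safe_div(int a, int b, int *q)
{
    if (b == 0 || (a == INT_MIN && b == -1))
        return 0;
    *q = a / b;
    return 1;
}

/* Left shift that is defined for every count: x << n is undefined in C
   when n is greater than or equal to the width of the type. */
unsigned safe_shl(unsigned x, unsigned n)
{
    return n < sizeof x * CHAR_BIT ? x << n : 0u;
}
```

The callers pay one comparison per operation; in exchange the result no longer depends on what a particular CPU happens to do with an oversized shift count.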
not relevant

#it is relevant, each basic operation for 8,16,32,64,...
#size has to be fully portable

(a) not for general programming
(b) if you *really* need them then hide them behind typedefs
(c) not relevant when discussing HLL semantics
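Point (b) might look like the sketch below, assuming a C99 `<stdint.h>`. The typedef name `word32` and the helper `add32` are invented for the example; the idea is that only this one spot changes when the platform does.

```c
#include <stdint.h>

/* Project-local name (hypothetical). uint_least32_t exists on every C99
   implementation, even ones with no exact 32-bit type. */
typedef uint_least32_t word32;

/* Adds two 32-bit quantities with wraparound modulo 2^32, even on a
   platform where word32 is wider than 32 bits. */
word32 add32(word32 a, word32 b)
{
    return (word32)((a + b) & 0xFFFFFFFFu);
}
```

The rest of the program uses `word32` and `add32` and never mentions a concrete width.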
#and I think a way to reach the stack it [C] uses for calling functions
#and for saving variables in the function preamble...

don't need this for HLL semantics
can't see the need. For an interpreter I hand built the list of
environments ("the stack")

#I see the need because I think C has worked well for so long precisely because of the way
#it uses the stack

the C Standard doesn't insist on a stack. You can implement a stack
without access to SP or whatever an I86 calls it.
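One way to read "hand built the list of environments": each scope is a record that points at its enclosing scope, so variable lookup never touches the hardware stack pointer. This is a guess at the general shape, not the interpreter described above; `struct env` and `env_lookup` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* One binding; `parent` links to the enclosing environment. */
struct env {
    const char *name;
    long value;
    const struct env *parent;
};

/* Walks from the innermost environment outwards.
   Returns a pointer to the bound value, or NULL if the name is unbound. */
const long *env_lookup(const struct env *e, const char *name)
{
    for (; e != NULL; e = e->parent)
        if (strcmp(e->name, name) == 0)
            return &e->value;
    return NULL;
}
```

Because the chain is ordinary data, it can live on the heap and outlive the C function that created it, which is what closures and continuations need.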
#as someone else has said, the stack is fundamental in programming

but it doesn't have to be implemented in hardware. And some languages
don't follow simple semantics in procedure calls

<snip>
 
Peter Remmers

On 20.02.2012 10:14, Rod Pemberton wrote:
Have you ever had access to a non-emulated EBCDIC platform in your
entire lifetime? (No.) Have you ever had access to a 16-bit byte platform
in your entire lifetime? (No.) Have you ever had access to a 9-bit
character platform in your entire lifetime? (No.) So, like it or not, it's
entirely up to the users of obscure platforms to fix the "non-portable"
nature of C code for their platforms.

This is second-hand knowledge, but a co-worker of mine told me that
Analog Devices' "SHARC" DSP has 32-bit chars (i.e. CHAR_BIT==32). He had
lots of fun porting lwIP to it as you can't just overlay a struct on
some chunk of memory to parse the fields (like lwIP does)...
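A sketch of the portable alternative to overlaying a struct: pull each field out of the buffer with shifts and masks. The header layout (two big-endian 16-bit ports) and the names are invented for illustration, and it assumes each received octet sits in its own `unsigned char` element, which may or may not be how that port of lwIP handled it. Because it never depends on `CHAR_BIT` being 8, the same code works whether a char has 8 bits or 32.

```c
/* Hypothetical parsed header: source and destination port. */
struct ports {
    unsigned src;
    unsigned dst;
};

/* Reads a big-endian 16-bit value from two consecutive octets.
   The masks discard any bits above the low 8 of each element. */
static unsigned get_be16(const unsigned char *p)
{
    return ((unsigned)(p[0] & 0xFFu) << 8) | (unsigned)(p[1] & 0xFFu);
}

/* Extracts both fields without assuming anything about struct layout. */
struct ports parse_ports(const unsigned char *buf)
{
    struct ports h;
    h.src = get_be16(buf);
    h.dst = get_be16(buf + 2);
    return h;
}
```

The price is one small accessor per field instead of a single cast, which is presumably where the porting effort went.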


Peter
 
Rod Pemberton

Nick Keighley said:
On Feb 20, 9:14 am, "Rod Pemberton" <[email protected]>
wrote: ....
[a C-like language] most likely inherited the basic
functionality that a programmer needs to know.

exceptions, closures, continuations, monads,

lots of stuff C doesn't provide.

Don't you mean: lots of stuff C doesn't need? ...

How many of the languages which have a language feature that is
"missing" from C are compiled to C or once were?

(The great majority of them do or did. That includes C++ at one point in
time via Cfront. Nearly all modern dynamically typed languages do.)
in my experience compiler bugs are very rare

Well, you must not have much experience. Or, you've just never noticed the
errors because 1) you've never studied or verified the assembly output and
2) the error was never "triggered". I've found either compilation errors or
library coding errors with most of the C compilers I've used. One compiler
optimized away code that shouldn't have been, but only for one intermediate
optimization level. All optimization levels above or below the intermediate
level worked correctly. Another had type conversion errors in the assembly
when a specific type was used with any other type. In one library, an
erroneously coded function was supposed to return three states, but returned
only two. Issues like these were not found in compilers where you'd expect
them, say SmallC, but in quality compilers, like GNU C, OpenWatcom, etc.
The problem with not looking at the assembly is that errors can exist for
exceptionally long time periods before being noticed.


Rod Pemberton
 
Mike McCarty

Don't you mean: lots of stuff C doesn't need? ...

That's arguable but I think most C++ programmers would say "With C++ I no longer need C." - and they are correct and happy to be able to say it.
(The great majority of them do or did. That includes C++ at one point in
time via Cfront. Nearly all modern dynamically typed languages do.)

Irrelevant. You could've written them all in assembler if you so desired. It means nothing regarding the utility of the language.
Well, you must not have much experience. Or, you've just never noticed the
errors because 1) you've never studied or verified the assembly output and
2) the error was never "triggered". I've found either compilation errors or
library coding errors with most of the C compilers I've used. One compiler
optimized away code that shouldn't have been, but only for one intermediate
optimization level. All optimization levels above or below the intermediate
level worked correctly. Another had type conversion errors in the assembly
when a specific type was used with any other type. In one library, an
erroneously coded function was supposed to return three states, but returned
only two. Issues like these were not found in compilers where you'd expect
them, say SmallC, but in quality compilers, like GNU C, OpenWatcom, etc.
The problem with not looking at the assembly is that errors can exist for
exceptionally long time periods before being noticed.

This is all very likely true but... what's your point? That software is buggy? That proving your program is bug-free is hard/impossible?

To hell with the buggy compiler, the OS is buggy.
To hell with the OS, the device drivers are buggy.
To hell with the device drivers, the hardware is buggy.

At what level of abstraction are you going to stop in your futile effort to prove that your program will execute correctly?

Earlier you stated:
What if you missed something you could've caught had you
simply just read the assembly?

What if you missed something because you wasted all of your time analyzing assembler? Which of these two scenarios do you suppose is more likely to happen?

Every once in a while, a colleague of mine will make some comment about adding some needless syntactic sugar to some code "just in case the compiler screws it up". 99 times out of 100, I respond "Well, if the compiler screws this up, we have much bigger problems than just this little bit of code." You can't predict bugs. If you could, the bugs probably wouldn't exist in the first place.
 
Dombo

On 20-Feb-12 23:36, Rod Pemberton wrote:
Nick Keighley said:
On Feb 20, 9:14 am, "Rod Pemberton"<[email protected]>
wrote: ...
[a C-like language] most likely inherited the basic
functionality that a programmer needs to know.

exceptions, closures, continuations, monads,

lots of stuff C doesn't provide.

Don't you mean: lots of stuff C doesn't need? ...

The language itself doesn't need anything, the programmer however might
appreciate some language features which makes him/her more productive.

But if you consider features not strictly necessary to make a
programming language Turing complete as unnecessary, maybe BF is the
language for you.
How many of the languages which have a language feature that is
"missing" from C are compiled to C or once were?

Even though it might be possible to compile those languages to C, the
language constructs may still be very alien to C programmers. And the
code these language constructs generate would never be written by a sane
C programmer, simply because it is not practical if you have to write it
yourself.
(The great majority of them do or did. That includes C++ at one point in
time via Cfront. Nearly all modern dynamically typed languages do.)

Well, you must not have much experience. Or, you've just never noticed the
errors because 1) you've never studied or verified the assembly output and
2) the error was never "triggered". I've found either compilation errors or
library coding errors with most of the C compilers I've used. One compiler
optimized away code that shouldn't have been, but only for one intermediate
optimization level. All optimization levels above or below the intermediate
level worked correctly. Another had type conversion errors in the assembly
when a specific type was used with any other type. In one library, an
erroneously coded function was supposed to return three states, but returned
only two.

If things don't work as expected I suspect the code being fed into the
compiler first, only as a last resort I inspect the assembly output of
the compiler. In the past twenty years I have run into maybe 3 or 4
compiler bugs, and about the same number in the libraries that came with
them (on mainstream compilers, the statistics are a lot worse for some
oddball compilers). I consider that rare, especially compared to most
other software products. Now I'm sure that there are more bugs in the
compiler than I have been bitten by, but inspecting every assembly
instruction generated by the compiler just isn't practical for software
products where the HLL code consists of several million lines.
Issues like these were not found in compilers where you'd expect
them, say SmallC, but in quality compilers, like GNU C, OpenWatcom, etc.
The problem with not looking at the assembly is that errors can exist for
exceptionally long time periods before being noticed.

Even if you analyze the assembly output, you still don't know if the
code executes as intended. Many processors have lots of errata and
execute some instructions incorrectly in some corner cases. You might
want to consider analyzing the processor design itself, just to be sure.
 
Juha Nieminen

In comp.lang.c++ Rod Pemberton said:
How many of the languages which have a language feature that is
"missing" from C are compiled to C or once were?

Obviously you could write an interpreter in C that interprets the
language you are trying to implement, and hey, you got the feature "in C".
The question is: Would it be *efficient*?

Even if instead of an interpreter the original language is butchered
into an actual C program by an automaton, and even if the resulting
compiled program somewhat approaches the efficiency of the binary
produced by an actual compiler for that other language, the resulting
C code is usually completely unreadable and unusable by a human being.
In many cases you can forget about trying to write such constructs
directly in C by hand.

Even then it would probably be hard to match the speed of an actual
compiler of the original language. Just take C++ exceptions, for instance.
I don't see any easy way to get the same effect in C without slowing
down the code.
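For comparison, the closest thing standard C offers is `setjmp`/`longjmp`. The sketch below (hypothetical names, one global handler, no cleanup of intermediate frames) shows where the slowdown comes from: `setjmp` has to save state on every call, whether or not anything is ever "thrown".

```c
#include <setjmp.h>

static jmp_buf handler;     /* hypothetical single, global "catch" point */

/* "Throws" by jumping back to the most recent setjmp(handler). */
static int checked_div(int a, int b)
{
    if (b == 0)
        longjmp(handler, 1);
    return a / b;
}

/* Returns a / b, or `fallback` if the division "threw". */
int div_or(int a, int b, int fallback)
{
    if (setjmp(handler) != 0)   /* executed on every call, thrown or not */
        return fallback;
    return checked_div(a, b);
}
```

A real emulation would also need a stack of handlers and some way to run cleanup code in the frames that are skipped, which adds further work to the normal path.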
 
Juha Nieminen

In comp.lang.c++ Juha Nieminen said:
Even if instead of an interpreter the original language is butchered
into an actual C program by an automaton, and even if the resulting
compiled program somewhat approaches the efficiency of the binary
produced by an actual compiler for that other language, the resulting
C code is usually completely unreadable and unusable by a human being.

Oh, and I forgot to mention: Any safety of the original construct will
be completely lost when translated to C, which further makes it unusable
when writing C directly.
 
Nick Keighley

[a C-like language] most likely inherited the basic
functionality that a programmer needs to know.
exceptions, closures, continuations, monads,
lots of stuff C doesn't provide.

Don't you mean: lots of stuff C doesn't need? ...

no. I'm saying there's lots of functionality a programmer might need
that is /not/ provided by C. Object oriented programming for a start.
How many of the languages which have a language feature that is
"missing" from C are compiled to C or once were?

many of them
(The great majority of them do or did.  That includes C++ at one point in
time via Cfront. Nearly all modern dynamically typed languages do.)

so what? All HLLs are Turing equivalent. You can write Lisp
interpreters in Cobol if you wish to but I suspect it's painful.

Learning other paradigms gives you more tools in your toolbox.
Well, you must not have much experience. Or, you've just never noticed the
errors because 1) you've never studied or verified the assembly output and
2) the error was never "triggered".  I've found either compilation errors or
library coding errors with most of the C compilers I've used.  One compiler
optimized away code that shouldn't have been, but only for one intermediate
optimization level.  All optimization levels above or below the intermediate
level worked correctly.  Another had type conversion errors in the assembly
when a specific type was used with any other type.  In one library, an
erroneously coded function was supposed to return three states, but returned
only two.  Issues like these were not found in compilers where you'd expect
them, say SmallC, but in quality compilers, like GNU C, OpenWatcom, etc.
The problem with not looking at the assembly is that errors can exist for
exceptionally long time periods before being noticed.

fair enough
 
Rod Pemberton

Juha Nieminen said:
Obviously you could write an interpreter in C that interprets the
language you are trying to implement, and hey, you got the feature
"in C".

Yes, you could do that: write an interpreter. However, it's probably better
to just emit C code for it directly ...
The question is: Would it be *efficient*?

No, it wouldn't. There is no such thing as an "*efficient*" interpreter as
compared to compiled native code. There are two fundamental problems with
interpreters that can't be overcome: 1) they reimplement the mechanisms of
assembly language which slows things down a lot 2) they can't optimize and
compile code sequences which dramatically slows down loops and other
control-flow.
Even if instead of an interpreter the original language is butchered
into an actual C program by an automaton, [...]

Yes, I agree, programmatic code conversion is rarely human readable.
[...] and even if the resulting compiled program somewhat
approaches the efficiency of the binary produced by an
actual compiler for that other language, [...]

They usually do produce efficient code. Sometimes it's actually better than
human coded code because the other language uses low level constructs
which can be optimized more readily.
[...] the resulting C code is usually completely unreadable
and unusable by a human being.

True, but if you coded that feature yourself, it doesn't have to be
coded as "butchered" code.
Even then it would probably be hard to match the speed of an actual
compiler of the original language.

I don't believe that. C has had some of the most intensive research into
language optimizations. Other languages just haven't had the time and
research put into them.
Just take C++ exceptions, for instance. I don't see any easy way
to get the same effect in C without slowing down the code.

Sorry, I'm not familiar with C++. However, if exceptions interrupt the
normal program flow like signals do, then it'll slow down the code purely
due to whatever save-restore state mechanism is used to handle the
exception. That requires substantial overhead in assembly.


Rod Pemberton
 
Nick Keighley

As I understood it: "Nonsense; you can't demonstrate the mathematical
correctness of an expression that has different results depending on the
machine in which it runs".

and so? If you have a clear model of each machine then you can perhaps
prove it would work on all of them. I'm really not sure what your point
is. The C abstract machine seems a reasonable thing to try and prove things
for. (OK, it might be a bit messy)
 
Rod Pemberton

....
The [C] language itself doesn't need anything, the
programmer however might appreciate some language
features which makes him/her more productive.

C is very feature rich. You can do 97% of what needs to be done with C with
just the C language and stdio.h, if you exclude a handful of functions that
were placed in the 'wrong' include files. If you don't like extra coding
you can use string.h. There is no need for ctype.h whatsoever. The
remainder of the C library adds more "productive enhancing" code than your
heart could ever desire ... If you took a language completely devoid of
features, like BF which you've cited, then perhaps you have an argument.
But if you consider features not strictly necessary to make a
programming language Turing complete as unnecessary, maybe
BF is the language for you.

1) My understanding is that no programming language can be implemented as
Turing complete due to architecture constraints such as fixed size integers
and limited memory.

2) BF is novel, but it is currently too simple to program with
effectiveness. One of the serious issues it has is its minimalism.
Various attempts have been made to enhance BF, such as PBrain and Toadskin.
Procedures are one of the most common things added to the language. PBrain
adds that functionality but uses integers to represent the procedure. This
complicates the use of the procedures. It also complicates the conversion
of BF with PBrain procedures to other languages like C, since a cell may
have a changing value, therefore call different procedures at different
times. Toadskin adds the ability to group operators into named sequences,
which is procedure like and Forth-like. IMO, BF's serious problem is that
it is lacking direct, basic support for integers and characters as ASCII.
Values must be constructed from increment or decrement loops. I've not seen
variants of BF add the capability to express integers or characters
directly. That ability combined with the ability to do procedures would
make it roughly as powerful as a simple C compiler of yesteryear. You can
check out Esolangs for more info on such languages:

http://esolangs.org/wiki/Main_Page
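To make the point about building values from loops concrete, here is a small BF interpreter sketch in C. It is not taken from any of the projects mentioned; the function name, the 256-cell wrapping tape, and the omission of the `,` input command are simplifications chosen for this example.

```c
#include <stddef.h>

/* Runs BF `prog`, writing output to `out` (capacity `cap`, NUL-terminated).
   The ',' input command is not implemented. Returns the characters written. */
size_t bf_run(const char *prog, char *out, size_t cap)
{
    unsigned char tape[256] = {0};
    size_t ptr = 0, n = 0;
    const char *pc;

    for (pc = prog; *pc != '\0'; pc++) {
        switch (*pc) {
        case '>': ptr = (ptr + 1) % sizeof tape; break;
        case '<': ptr = (ptr + sizeof tape - 1) % sizeof tape; break;
        case '+': tape[ptr]++; break;
        case '-': tape[ptr]--; break;
        case '.':
            if (n + 1 < cap)
                out[n++] = (char)tape[ptr];
            break;
        case '[':
            if (tape[ptr] == 0) {       /* skip forward to the matching ']' */
                int depth = 1;
                while (depth > 0 && pc[1] != '\0') {
                    pc++;
                    if (*pc == '[') depth++;
                    else if (*pc == ']') depth--;
                }
            }
            break;
        case ']':
            if (tape[ptr] != 0) {       /* jump back to the matching '[' */
                int depth = 1;
                while (depth > 0 && pc > prog) {
                    pc--;
                    if (*pc == ']') depth++;
                    else if (*pc == '[') depth--;
                }
            }
            break;
        default:
            break;                      /* any other character is a comment */
        }
    }
    if (cap > 0)
        out[n] = '\0';
    return n;
}
```

For instance, `bf_run("++++++++[>++++++++<-]>+.", out, sizeof out)` has to compute 8 * 8 + 1 = 65 in the second cell with a loop before it can print that one character ('A' in ASCII), since the language has no way to write the number directly.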
Even though it might be possible to compile those languages to C, the
language constructs may still be very alien to C programmers. And the
code these language constructs generate would never be written by a sane
C programmer, simply because it is not practical if you have to write it
yourself.

I would disagree with both claims. One could claim that being able to do so
depends on skill level and work ethic. No skill? Lazy?
If things don't work as expected I suspect the code being fed into the
compiler first, only as a last resort I inspect the assembly output of
the compiler.

Ah, but you do check assembly, in contrast to NK's claim that that is
laughable.
In the past twenty years I have run into maybe 3 or 4
compiler bugs, and about the same number in the libraries that came with
them (on mainstream compilers, the statistics are a lot worse for some
oddball compilers).

Again, you've confirmed my experiences in contrast to NK's claims.
Even if you analyze the assembly output, you still don't know if the
code executes as intended.

Exactly. You won't know the code executes as intended until it is executed,
which was why I asked this of NK:

RP> How do you confirm correctness of both high-level and
RP> low-level code with an imaginary machine? Is your
RP> confirmation of correctness imaginary also? ...

To which, I got more than one absurd response that imaginary
confirmation of correctness was entirely valid ...


Rod Pemberton
 
Rod Pemberton

Juha Nieminen said:
Oh, and I forgot to mention: Any safety of the original construct will
be completely lost when translated to C, which further makes it unusable
when writing C directly.

The exact same can be said about compiling to assembly or machine code.
The type checking, i.e., "safety," is done by the compiler, not the compiler
output. The compiler checks the syntax and parsing rules for errors. Once
you have assembly or machine code, the "safety of the original construct" is
"completely lost". The syntax and parsing rules are not encoded into the
compiled program.


Rod Pemberton
 
Juha Nieminen

In comp.lang.c++ Rod Pemberton said:
I don't believe that.

Why not? It's not very hard to believe. If the language has no support
for a certain higher-level concept, then the compiler cannot use that
concept to perform optimizations.

You cannot express everything in C that can be expressed in asm.
(You can probably achieve the same functionality, but you cannot achieve
the same efficiency in every single case.)
C has had some of the most intensive research into
language optimizations.

Yes, into *C* language optimizations, not other languages. What C does
not support as a concept the C compiler cannot optimize.
Sorry, I'm not familiar with C++. However, if exceptions interrupt the
normal program flow like signals do, then it'll slow down the code purely
due to whatever save-restore state mechanism is used to handle the
exception. That requires substantial overhead in assembly.

Nope. That's the genius in C++ exceptions: Support for exceptions does
not slow down the program in any way (compared to compiling the program
without support for exceptions). (It was, AFAIK, in fact a requirement
by the standardization committee, that exceptions would be added to the
standard only if it's possible to support them without compromising the
speed of the program. It turns out that it's possible.)

(Of course *throwing* an exception has overhead, but that's to be expected.
Exceptions are designed to be used to handle fatal errors, not for normal
operation. The main point is that when no exceptions are thrown, the code
is in no way slower than eg. the equivalent C program would be, even though
exceptions are supported and could be thrown at any moment, at any point
in the code.)
 
Juha Nieminen

In comp.lang.c++ Rod Pemberton said:
The exact same can be said about compiling to assembly or machine code.

Which is one of the reasons why asm is seldom used directly. Exactly
my point.
The type checking, i.e., "safety," is done by the compiler, not the compiler
output. The compiler checks the syntax and parsing rules for errors. Once
you have assembly or machine code, the "safety of the original construct" is
"completely lost". The syntax and parsing rules are not encoded into the
compiled program.

The original claim was that C supports everything that any other language
supports. Not true. C does not support most of the safety mechanisms that
other languages have.
 
BartC

Rod Pemberton said:
No, it wouldn't. There is no such thing as an "*efficient*" interpreter as
compared to compiled native code.

In the case of a scripting language, this is largely irrelevant.

If all the language does is invoke other programs, then there would be no
measurable difference between an interpreter, and a native code version.

It also depends on how the language is used; it might be 100 times slower
than native code, but if runtimes are small anyway, then there may not be
much advantage in taking only 1msec to do something instead of 100msec.
There are two fundamental problems with
interpreters that can't be overcome: 1) they reimplement the mechanisms of
assembly language which slows things down a lot

An interpreter may well be *implemented* in assembly language. Every
instruction that the interpreter deals with could be implemented by a
carefully hand-crafted and optimised block of assembly code! Sometimes
better than automatically compiled code.

2) they can't optimize and
compile code sequences which dramatically slows down loops and other
control-flow.

Tell that to the guy who implemented the LuaJIT interpreter. Numeric
benchmarks are often faster than optimised C!
[...] the resulting C code is usually completely unreadable
and unusable by a human being.

I used to find that anyway, even when not generated by a machine..
True, but if you coded that feature yourself, it doesn't have to be
coded as "butchered" code.


I don't believe that. C has had some of the most intensive research into
language optimizations. Other languages just haven't had the time and
research put into them.

C is pretty good, but it has to spend a lot of time recognising constructs
in the source code that could be expressed directly in a higher level language
and which can be executed directly (vector operations for example). It
doesn't always manage to do that.

And if implementing something unusual, you can often write almost directly
from pseudo-code in a higher level language, but a quick, throwaway
implementation in C may well be slower; you need to spend time with C in
achieving what's already been done inside the interpreter.
 
Dmitry A. Kazakov

Exceptions are designed to be used to handle fatal errors, not for normal
operation.

No, they are for *exceptional* states. Such states are by no means fatal.
E.g. neither reading beyond file end, nor getting a numeric overflow are
fatal.

It is expected that exceptions are used to indicate states which are not so
frequent, so that the overhead caused by the exception propagation could be
minor.
The main point is that when no exceptions are thrown, the code
is in no way slower than eg. the equivalent C program would be, even though
exceptions are supported and could be thrown at any moment, at any point
in the code.)

True, but even with exceptions thrown, the code may still be faster. Let us
consider an exception used to indicate the file end. Without an exception
the end-of-file condition is tested twice for each file item to read. First
this is done in the subprogram reading the item. Then the check is repeated
by the caller when it tests the return code. A design that throws an
exception would check it only once, and thus become more efficient the
larger the file is.
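The double test can be made visible with a counter. In this sketch the "file" is an in-memory array and all names are hypothetical; `checks` is incremented once inside the reader and once where the caller inspects the return code, so reading n items costs 2(n + 1) end-of-data tests.

```c
#include <stddef.h>

/* Hypothetical in-memory "file" so the sketch needs no real I/O. */
struct source {
    const int *items;
    size_t len;
    size_t pos;
    unsigned long checks;   /* how many end-of-data tests have been made */
};

/* Reader: test #1 happens here, inside the subprogram. */
int read_item(struct source *s, int *out)
{
    s->checks++;
    if (s->pos == s->len)
        return 0;           /* end of data is reported via the return code */
    *out = s->items[s->pos++];
    return 1;
}

/* Caller: test #2 is the inspection of that return code. */
long sum_all(struct source *s)
{
    long sum = 0;
    int v;

    for (;;) {
        int ok = read_item(s, &v);
        s->checks++;
        if (!ok)
            break;
        sum += v;
    }
    return sum;
}
```

With an exception-style design the caller's loop would contain no test at all; only the reader would check, and the non-local exit would end the loop.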

Surely, it is theoretically possible to beat the performance of this
design, but at the cost of global optimization or else dropping read
subprogram and manually inserting equivalent I/O all over the code...
 
Nick Keighley

"Nick Keighley" <[email protected]> wrote in message


and so? If you have a clear model of each machine then you can perhaps
prove it would work on all of them. I'm really not sure what your point
is. The C abstract machine seems a reasonable thing to try and prove things
for. (OK, it might be a bit messy)

A better answer from me would have been "so don't do that. Use a
stable abstract machine to do your verification of correctness on."

And I'm not sure an I86 qualifies as such a stable machine.
#you have to say just what a machine is...

the C abstract machine
#one has to program only one machine
#or the one machine is a subMachine of the machine
#and not a set of machines or cpus

well as long as all implementations obey the semantics of the C
abstract machine we don't have a problem. In a sense this is what Rod
Pemberton is doing; he's inspecting each implementation to ensure it
complies with the C abstract machine semantics.
#but
#if all CPUs have 32-bit registers with the usual 32-bit
#jump operations on code
#it would be possible to program all these CPUs as the 32-bit
#sub-CPU, as if it were one portable CPU, just one machine

seems way too low level to me. Who cares if it's a 32 bit machine or a
19 bit machine? You do know there are 64 bit machines around now?

and again "32 bit" gives way too much leeway for your sort of
verification. What if it's a 32 bit ARM?

<snip>
 
