function casts

BartC

Ben Bacarisse said:
Why are there structs here at all? I think that's the point that's been
made elsewhere ("be dumb, not smart") and what I was getting at by
saying that it looks more like a translator than a compiler.

That's been considered! And for both arrays and structs. But there were some
downsides:

Where arrays are well-behaved and there is obviously an indexing operation,
then I do use C's object-sized offsets. The alternative would be to
incorporate a multiply operation within the address calculation. In
assembler, this often just means a zero-cost scale factor on a register. In
C the operation would need to be explicit. Probably this isn't important,
the compiler will optimise it out, but it looks poor.
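To make the two styles concrete, here is a minimal sketch; the function names are invented for illustration and are not taken from the generator. The first form lets C scale the index by the element size, the second spells out the multiply on a byte address, which is roughly what explicit scaling in generated C would look like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Element-sized offset: C multiplies i by sizeof(int32_t) behind the scenes. */
int32_t load_typed(const int32_t *base, size_t i)
{
    return *(base + i);
}

/* Byte-sized offset: the scale factor has to be written out by hand,
   the way a generator working in raw byte addresses would emit it. */
int32_t load_bytewise(const int32_t *base, size_t i)
{
    const unsigned char *addr = (const unsigned char *)base + i * sizeof(int32_t);
    return *(const int32_t *)addr;
}

/* Both forms read the same element. */
void demo_indexing(void)
{
    const int32_t table[4] = { 10, 20, 30, 40 };
    assert(load_typed(table, 2) == load_bytewise(table, 2));
}
```

An optimising compiler would be expected to produce much the same machine code for both; the difference is mainly in how the C source reads.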

But the main problem was that initialising a complex struct or array (or
some combination) meant serialising the contents into a sequence of byte
values. That would include floating point constants, and would mean knowing
the byte order of integers etc.

Even in assembler, you can define byte, word, double word and floating point
constants! C would become *too* low-level then.
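A small sketch of why the byte order matters when constants are serialised by hand. The helper names are hypothetical, and the code assumes the host is either plain little-endian or plain big-endian. A floating point constant would need the same treatment applied to its bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise (assumed big-endian). */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Writes v into out[0..3] in the byte order the host uses in memory. */
void serialise_u32(unsigned char out[4], uint32_t v)
{
    int i;
    for (i = 0; i < 4; ++i) {
        int shift = host_is_little_endian() ? 8 * i : 8 * (3 - i);
        out[i] = (unsigned char)(v >> shift);
    }
}

/* The hand-serialised bytes must match what the compiler lays out itself. */
void demo_serialise(void)
{
    const uint32_t value = 0x11223344u;
    unsigned char bytes[4];
    serialise_u32(bytes, value);
    assert(memcmp(bytes, &value, sizeof value) == 0);
}
```

With an ordinary typed initialiser such as `uint32_t x = 0x11223344;` none of this is needed, because the C compiler for each target picks the layout.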

Besides, doing this for addresses of objects would be impossible, or for
initialisation of auto structs or arrays which involve runtime expressions
(although the latter is probably an extension of gcc and I can take care of
it myself if necessary).

In fact I'd also considered encapsulating all arrays within a struct, then
names of arrays would behave like other variables (by value).
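As a rough illustration of that idea (the wrapper type and function names are made up for this sketch), an array wrapped in a struct can be assigned and passed by value like any other variable:

```c
#include <assert.h>

/* Hypothetical wrapper type: a fixed-size array inside a struct. */
typedef struct { int items[4]; } IntArray4;

/* The parameter is a copy, so the caller's array is left untouched. */
int sum_with_first_cleared(IntArray4 a)
{
    int i, total = 0;
    a.items[0] = 0;
    for (i = 0; i < 4; ++i)
        total += a.items[i];
    return total;
}

void demo_by_value(void)
{
    IntArray4 x = { { 1, 2, 3, 4 } };
    IntArray4 y = x;              /* copies the whole array */
    y.items[1] = 99;
    assert(x.items[1] == 2);      /* x is unaffected by changes to y */
    assert(sum_with_first_cleared(x) == 9);
    assert(x.items[0] == 1);      /* and by the callee's changes */
}
```

A bare array name in the same positions would decay to a pointer to its first element, so neither the assignment nor the by-value call would be possible.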

But another reason, is that the resulting code already looks like a travesty
of the C language; eliminating half the data types would make it worse. If
you say it's acceptable however, then I will think again, but it needs to be
workable (because of the serialising issues).
I suspect that you want the benefit of some C typing (so you don't have

Using C's type system for primitives is still necessary, otherwise it would
be impossible to write expressions. Given a "+" operator, C needs to know
what the types of the operands are. In assembler, this information is
provided in other ways.
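A minimal sketch of the point about "+": the same two bit patterns give different results depending on the type C is told the operands have. It assumes `float` is a 32-bit IEEE 754 type, and the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* "+" on unsigned integers: plain binary addition of the bit patterns. */
uint32_t add_as_u32(uint32_t a, uint32_t b)
{
    return a + b;
}

/* "+" on floats: the same bits are reinterpreted before adding.
   Assumes float is 32-bit IEEE 754. */
uint32_t add_as_float(uint32_t a, uint32_t b)
{
    float x, y, sum;
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&x, &a, sizeof x);
    memcpy(&y, &b, sizeof y);
    sum = x + y;
    memcpy(&bits, &sum, sizeof bits);
    return bits;
}

void demo_typed_add(void)
{
    const uint32_t one = 0x3F800000u;                  /* bit pattern of 1.0f */
    assert(add_as_u32(one, one) == 0x7F000000u);       /* integer sum */
    assert(add_as_float(one, one) == 0x40000000u);     /* bit pattern of 2.0f */
}
```

In assembler the choice is made by picking an integer or a floating point add instruction; in C it can only come from the declared types.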
 
BartC

BartC said:
"Ben Bacarisse" <[email protected]> wrote in message

(My mail system has a habit of sending stuff when it's not yet finished!)

[Serialising constant data for structs and arrays into byte-sized chunks]
Besides, doing this for addresses of objects would be impossible,

With gcc at least, chopping up an address into bytes is not impossible.
However, it still seems a bit much!
but that's causing
problems elsewhere because the compiler's model is still based round the
raw machine picture that assembler gives you.

That's about it. I've written quite a few compilers over the decades, and
usually there was no doubt who was in charge! I've never targeted a
high-level language before. Now it's one language fighting it out with
another.

(BTW if anyone knows of a good (also, free) ARM11 emulator for a PC, then
maybe I can just target it directly. The C solution would have been
temporary anyway.)
 
Ben Bacarisse

BartC said:
(My mail system has a habit of sending stuff when it's not yet
finished!)

And I'd already written a reply so I'll copy it here. It makes more
sense in this context...
[Serialising constant data for structs and arrays into byte-sized chunks]
Besides, doing this for addresses of objects would be impossible,

With gcc at least, chopping up an address into bytes is not impossible.
However, it still seems a bit much!
but that's causing
problems elsewhere because the compiler's model is still based round the
raw machine picture that assembler gives you.

That's about it. I've written quite a few compilers over the decades, and
usually there was no doubt who was in charge! I've never targeted a
high-level language before. Now it's one language fighting it out with
another.

I had written:

I think we are just agreeing but I am not entirely sure. You want the
benefits you get from C (portability with respect to formats, for
example) but you complain about the down-sides compared to assembler. I
think you just have to decide if you have chosen the correct level at
which you use C with respect to these benefits and irritations.

<snip>
 
Keith Thompson

BartC said:
Where arrays are well-behaved and there is obviously an indexing operation,
then I do use C's object-sized offsets. The alternative would be to
incorporate a multiply operation within the address calculation. In
assembler, this often just means a zero-cost scale factor on a register. In
C the operation would need to be explicit. Probably this isn't important,
the compiler will optimise it out, but it looks poor.

Does it matter how it looks?

[...]
But another reason, is that the resulting code already looks like a travesty
of the C language; eliminating half the data types would make it worse. If
you say it's acceptable however, then I will think again, but it needs to be
workable (because of the serialising issues).

Do you care *at all* how readable the generated C is? Are you trying to
generate legible C code, or C code that does what you need it to do?

If anybody is going to be maintaining the generated C code (other than
by re-generating it), then yes, it should look good. If not, it's just
an intermediate language used by your compiler, and style should be
irrelevant.

Well, mostly. You'll probably want to be able to examine the generated
C for debugging purposes. Including lines of your original source in
the generated C as comments would help that.
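One way a generator could do that, sketched with a hypothetical helper; the source syntax shown (`count := 0`) is invented, since the thread does not show the source language. Each generated statement is preceded by a comment carrying the original line number and text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emitter: writes "source line as comment" followed by the
   generated C statement into out. Returns what snprintf returns.
   Caveat: source text containing the comment terminator would end the
   comment early, so a real generator would have to escape it. */
int emit_with_source(char *out, size_t cap, int line,
                     const char *src, const char *c_stmt)
{
    return snprintf(out, cap, "/* %d: %s */\n%s\n", line, src, c_stmt);
}

void demo_emit(void)
{
    char buf[128];
    emit_with_source(buf, sizeof buf, 12, "count := 0", "count = 0;");
    assert(strcmp(buf, "/* 12: count := 0 */\ncount = 0;\n") == 0);
}
```

C's `#line` directive is a related mechanism: it makes the C compiler report diagnostics against the original source file and line instead of the generated file.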

[...]
 
BartC

Keith Thompson said:
Does it matter how it looks?
[...]
But another reason, is that the resulting code already looks like a
travesty
of the C language; eliminating half the data types would make it worse.
If
you say it's acceptable however, then I will think again, but it needs to
be
workable (because of the serialising issues).

Do you care *at all* how readable the generated C is? Are you trying to
generate legible C code, or C code that does what you need it to do?

Up to a point, I do care about appearance. And as you say, I need to be able
to find my way around. Also sometimes to be able to tweak the code by hand
to experiment. Sample C code for the familiar old 'sieve' benchmark is given
below, and you can see for yourself.

This code will not win prizes for beauty or style (or for anything else),
but compiled with gcc -O3, timing is not far off that of a conventional
version written properly in C, and about the same speed as a couple of other
compilers _compiling that conventional version_.

(I know the $ symbols for identifiers are not standard ...)

/* Module sieve */
#include ... see below
/* Type defs: */

/* Function prototypes: */
global function void start(void);

/* File scope variables: */
static byte data[8190];

/* Function defs: */
function void start(void) {
i32 count;
i32 i;
i32 k;
i32 prime;
i32 av$1;
byte $1;
i32 $2;
i32 $3;

count = 0;
av$1 = 100000;
L2:
L5:
i = 1;
L6:
*(data+i-1) = 1;
L7:
++i;
if (i <= 8190) goto L6;
L8:
L9:
i = 1;
L10:
$1 = *(data+i-1);
if (!$1) goto L13;
$2 = i+i;
prime = $2+3;
$3 = i+i;
prime = $3+3;
k = prime+i;
goto L15;
L14:
*(data+k-1) = 0;
k += prime;
L15:
if (k <= 8190) goto L14;
L16:
++count;
L13:
L11:
++i;
if (i <= 8190) goto L10;
L12:
L3:
--av$1;
if (av$1) goto L2;
L4:
$startprintcon();
$prints("Count =");
$printi(count);
$println();
$endprint();
L1:

;
} /*start*/
end

/* End sieve */

Header file contents to make the above compile (although I've just noticed
that prototypes for the $-functions are missing; the compiler doesn't seem to
mind though). The main() function is in the runtime; it just calls start():

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <memory.h>
#include <math.h>

#define i32 signed int
#define byte unsigned char

#define function
#define end
#define global
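For comparison with the generated code above, this is a guess at what a conventional hand-written sieve could look like. The version that was actually timed is not shown in the thread, so this is the classic structured form of the benchmark, not the author's code. It uses 0-based indexing and 8191 flags, which differs slightly from the generated code.

```c
#include <assert.h>

#define SIZE 8190

static unsigned char flags[SIZE + 1];

/* One pass of the sieve. flags[i] stands for the odd number 2*i + 3.
   Returns the number of primes found. */
int sieve_count(void)
{
    int i, k, prime, count = 0;

    for (i = 0; i <= SIZE; ++i)
        flags[i] = 1;

    for (i = 0; i <= SIZE; ++i) {
        if (flags[i]) {
            prime = i + i + 3;
            /* Strike out the odd multiples of this prime. */
            for (k = i + prime; k <= SIZE; k += prime)
                flags[k] = 0;
            ++count;
        }
    }
    return count;
}
```

A driver would call `sieve_count()` repeatedly, as the generated `start()` does with its 100000 iterations, and print the result once. Each pass returns 1899, the number of odd primes up to 16383.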
 
BartC

I think we are just agreeing but I am not entirely sure. You want the
benefits you get from C (portability with respect to formats, for
example) but you complain about the down-sides compared to assembler. I
think you just have to decide if you have chosen the correct level at
which you use C with respect to these benefits and irritations.

I want to port my tools to a board computer which not only doesn't use x86
(it's an arm11), but also runs Linux for good measure! So currently it's an
unfriendly, almost hostile environment for me.

The only language in common between that and my PC, that I can use
practically, is C. The main (compiled) software I want to port is an
interpreter for another language, currently written in yet another language
(with a lot of assembler).

The first option was to just recode that in 100% C. OK, I started off doing
that. But it didn't really work out (lots of tedious transcribing between
languages, and there were some 25K lines of this stuff.)

Next option was to dig up an abandoned compiler for a new language, and make
it generate C. That also means rewriting the interpreter in a new, untested,
buggy language, and compiling it with a new, untested, buggy compiler,
generating tens of thousands of lines of unreadable C code, and hoping it
will compile and run without problems on a different platform. This
interpreter then needs to interpret the compiler I started with. So all very
straightforward..

So, yes, the benefits of C are that it's there, I'm reasonably familiar with
it, and it's portable insofar as it will hopefully run the same program on
both platforms. Some tests written directly in C worked well. It's also
going to be considerably faster than if I was to directly generate assembler
or machine code from my own efforts (but not quite as fast as when I later
inject hand-written assembly).
 
BartC

BartC said:
"Ben Bacarisse" <[email protected]> wrote in message

[Using C as a target language for a compiler 'back-end']
The first option was to just recode that in 100% C...
Next option was to dig up an abandoned compiler for a new language, and
make
it generate C. That also means rewriting the interpreter in a new,
untested,
buggy language, and compiling it with a new, untested, buggy compiler,
generating tens of thousands of lines of unreadable C code, and hoping it
will compile and run without problems on a different platform. This
interpreter then needs to interpret the compiler I started with. So all
very
straightforward..

(Just an update to this project.

Although my C-generating compiler mostly worked, I had bad vibes about it;
it just didn't seem right. (A bit like putting your trousers on .. over
your trousers.) So I will put it aside.

Instead the main software that needs to be ported (an interpreter) will be
created directly in C (but with a big chunk of it off-loaded to another
language, which doesn't need to run on the target, to reduce the amount of C
needed; perhaps only 15 Kloc to write).

(There was another reason too: there would have been circular dependency
chains that I had trouble getting my head around. Using C for one important
step (at least until everything becomes stable) breaks the chain; any bugs
in the code can't go further back than the C source code.)

Meanwhile the compiler involved will go back to generating native code; it
will be terrible code**, and needs to be done for two targets, but if the
performance *is* in any way reasonable, it will be thanks to my own efforts
and not the optimising C compiler!)

(** On x86, about 50% slower than gcc -O1 averaged over 20 or so benchmarks.
For real programs, it'll be adequate)
 
Öö Tiib

Next option was to dig up an abandoned compiler for a new language, and make
it generate C. That also means rewriting the interpreter in a new, untested,
buggy language, and compiling it with a new, untested, buggy compiler,
generating tens of thousands of lines of unreadable C code, and hoping it
will compile and run without problems on a different platform. This
interpreter then needs to interpret the compiler I started with. So all very
straightforward..

First you doom yourself with "tens of thousands of lines of unreadable
C code". The reasoning behind it is that "the result will only be seen by a
C compiler". Then you see the generated code, have "bad vibes about it" and
so you "put it aside". That sounds like a natural outcome. ;)

Generated code has to be readable. Then it is simpler to check that it is
well generated, and simpler to understand the defects of the generator (which
are always there). Readable output simplifies maintenance of the generator.
 
