Bug report - about the way of linking in C

BartC

I am talking here about the convenience of not linking
every module with every other, just the few that
should be linked together (it will not break
builds, it just avoids collisions on duplicate symbol names etc.)

Some of this stuff is already done with dynamic libraries, although there it
is mainly functions that are imported and exported.

Within a single dynamic library L, you can have modules a and b that are
statically linked, and share functions and data, using names that are not
accessible from outside the library, unless exported.
reaches windows; //modules to reach from here
reaches opengl;

Certainly on Windows, these libraries are imported as shared libraries, and
not statically linked. So all the names shared between the modules that
comprise opengl, for example, stay private within that library.

But this isn't down to the linker; the dynamic loader, on Windows anyway, is
separate.
 
Stephen Sprunk

On Friday, 28 September 2012 15:08:50 UTC+2, Ben Bacarisse wrote:

Yes, I think you understand me correctly (as far as I understand you).
Let me explain more about what I mean.
...
In my opinion, what is wrong is that the C linker at link time, AFAIK, puts all
the module symbols into one common 'bag' and tries to link any module's
imports with any other module's exports. It is a bug of the C linkage system.

According to the Standard, every function and every file-scope object has
either "internal linkage" or "external linkage". External linkage means that the
identifier can be referenced in _any_ translation unit. Internal
linkage means that the identifier can be referenced _only_ in the
translation unit in which it is defined.

C does not have any concept in between, i.e. an identifier that is
available in more than one translation unit but not all of them.

Generally, a "bug" is defined as something that does not work as
designed; since C was designed to work this way, this does not qualify.
My thinking begins with the observation that in C there are really no global
functions and no global static data (most people talk about globals,
but they really do not exist)

at the module graph level there are no global symbols, but the linker
combines them as if they were global - it is stupid and it even leads
to unnecessary symbol conflicts

If you don't intend an identifier to have external ("global") linkage,
then declare it to have internal linkage, and there are no worries about
symbol conflicts. Problem solved.
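
As a minimal sketch of that advice (the file name log.c and the identifiers format_level and log_message are made up for illustration, not taken from anyone's code in this thread): the helper gets internal linkage through static, so log_message is the only name this file contributes to the rest of the program.

```c
/* log.c (hypothetical file name) */
#include <stdio.h>

/* Internal linkage: format_level is private to this translation unit,
   so another .c file may define its own format_level without a clash. */
static const char *format_level(int level)
{
    return level > 0 ? "ERROR" : "INFO";
}

/* External linkage: visible to every other translation unit in the program.
   Formats "[LEVEL] text" into buf and returns what snprintf returns. */
int log_message(char *buf, size_t size, int level, const char *text)
{
    return snprintf(buf, size, "[%s] %s", format_level(level), text);
}
```

Any other file that needs a private helper called format_level can have one, as long as it is also declared static.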

For identifiers you declare to have external linkage, it is your
responsibility to make sure there are no conflicts. There are several
conventions for doing so, all of which are valid.

My psychic powers tell me that learning how to properly use header files
will likely resolve the issues you are having.

As to whether C's designers were "stupid" for C being this way, you are
free to think so, but considering all the other things they got right, I
choose to respectfully disagree.

S
 
Kaz Kylheku

[Hello, I am a C fan from Poland, deeply involved
in the spirit of C and structured coding.
For a few years I have been working on and thinking about
some ways of improving the C language. Sorry for
my weak English]

Improving C is rather a waste of time, because the big improvements that really
move the productivity lever give rise to a different kind of language.
I want to say a few words about something
I think is wrong in the C linking system
(as far as I know that bug is present in C
linking systems in general)

Substantial changes to the linking system require moving "mountains", because
they are entrenched deeply into the toolchains that build big systems.

If you change the C linking system, you end up with a different language,
and the existing linking system does not go away.

There are already languages with better linking systems; so you can just use
them.
On a 'compiler' (source) level, there is no
such thing as a global function or variable -
the scope of a symbol is limited to the scope of
visibility of its declaration -

C has the notion of external linkage. This is part of the language. Certain
declarations introduce names in such a way that they are known among
translation units, not just within a translation unit.

(Of course, they still have to be declared in other translation units in order
to be visible there, so you have a valid point about scope. But lexical scope
is not all there is to visibility.)
but on the linker level (as far as I have heard
about it) linkers always try to link any
obj module with any other given module
- so it even leads to conflicts of symbols
not present on the source level - it is obviously wrong

C++, an improved dialect of C that's been in development for some three
decades now, addresses this with namespaces.

In C, a translation unit can have private global names thanks to internal
linkage, but two modules cannot share private names that are not also visible
to all other modules.

In C++, two (or more) translation units can share private names via
namespaces, which are named containers of symbols.

So there is an improved C dialect in which the problem has been solved.
That dialect is used by a large number of programmers and has been through
several rounds of ISO standardization.

The problem with most people who want to improve C is that they tend to reject
other people's work such as C++.

"I don't want any of those improvements from other people! That is not
C any more, but some bastardized language which sucks because <insert
reasons here>. I want only C, plus a couple of my own ideas,
and nobody else's, that's all!"

Problem is that C plus a couple of your ideas also creates a forked dialect,
just like C++. The only difference is that C++ is much farther along
in development.
Linkers should not try to link everything with
everything, but they should be able to accept
some info about which module to link with
which other module (it would be obvious

If you're going to discuss linkers rather than the language, then
survey what is out there.

Linkers exist which have this capability. For instance on GNU systems with ELF
libraries, you can link a shared library with -Bsymbolic. This will resolve
all internal references *within* the library, so that when the library is
loaded, its references can no longer be hijacked to definitions outside of the
library.

Moreover, you can use linker scripts to precisely control which external
symbols are actually exported from the library, made visible to the attaching
clients. This symbol visibility can even be organized into named "version
nodes". So for instance programs compiled against an old version of the
library can see a different definition of a symbol, compared to programs
compiled against the new version, allowing for precise backward compatibility
support.

The technology exists, but is outside of the C language.
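
For what it is worth, GCC and Clang also offer a source-level handle on the same idea through a visibility attribute. This is a compiler extension, not ISO C, and the sketch below assumes an ELF target; LIB_PRIVATE, blit_clip and blit_row_width are hypothetical names used only for illustration.

```c
/* Assumption: GCC or Clang. The attribute is a compiler extension, not ISO C;
   on other compilers the macro expands to nothing. */
#if defined(__GNUC__)
#  define LIB_PRIVATE __attribute__((visibility("hidden")))
#else
#  define LIB_PRIVATE
#endif

/* External linkage, so every translation unit inside the library can call it,
   but hidden visibility keeps it out of the library's exported symbols. */
LIB_PRIVATE int blit_clip(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

/* Default visibility: part of the library's public interface. */
int blit_row_width(int requested)
{
    return blit_clip(requested, 0, 1024);
}
```

Built into a shared library, blit_clip stays callable from every translation unit of that library but is not visible to programs that link against it.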
 
fir

As to whether C's designers were "stupid" for C being this way, you are
free to think so, but considering all the other things they got right, I
choose to respectfully disagree.
As to the above -

I would not say that, oh no. C is a work of genius,
it is brilliant; I am astonished by the spirit of
C and I have much respect for the spirit of C.

But C has some slight bugs and this is one of them. *
(I will answer more, but a little later.)

* To see it one must just compare the two
approaches - there is probably no other way to judge
these things. If you carefully compare the
two, you will see that the approach I am talking
about is better, as I said.
 
Andrew Smallshaw

I want to say:

external linkage symbols should not be
treated as global to the whole set of modules
- this is a bug

No, it is not a bug. A bug is where something does not work as
intended. That is not the case here: it is working as intended,
you simply want it to do something other than was originally intended.
Other languages give different mechanisms to achieve what you attempt
whether it be namespaces or protected scope or something else that
doesn't immediately come to mind - but the fact C doesn't offer
them is at worst a limitation rather than a bug.

C's scoping rules are relatively simple which is a good fit for
the rest of the language. Adding this kind of capability would
inevitably involve the addition of extra red tape, the comparative
lack of which is one of the nicer things about C in the first place.
 
Keith Thompson

Eric Sosman said:
C gives you some control over how the linker works, by
giving each identifier a "linkage." There are three kinds
of linkage:

- An identifier with "external linkage" is visible to the
linker. This is the default for variables and functions
declared at file scope in a module. Some identifiers
in other scopes can be given external linkage by using
the `extern' keyword.

- An identifier with "internal linkage" is invisible to
the linker. This is the linkage you get by using the
`static' keyword on a variable or function at file
scope. Different modules can use the same internal-
linkage identifier to refer to different things, and
the uses will not clash because the linker does not
see them.

- There are also identifiers with "no linkage," which are
things like function parameters, `auto' variables, typedef
names, macro names, and so on. The linker does not see
these, so different modules can use the same no-linkage
identifier for different things.
[...]
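
To make the three kinds concrete, here is a small sketch in a single translation unit. The names call_count, scale and scaled_sum are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

int call_count = 0;       /* external linkage: file scope, no `static` */

static int scale = 3;     /* internal linkage: file scope with `static` */

/* A function declared without `static` also has external linkage. */
int scaled_sum(const int *values, size_t n)
{
    extern int call_count;  /* block-scope `extern`: refers to the object above */
    int total = 0;          /* no linkage: automatic variable */

    /* `values`, `n` and `i` have no linkage either. */
    for (size_t i = 0; i < n; i++)
        total += values[i];

    call_count++;
    return total * scale;
}
```

Another translation unit could define its own static int scale without any clash, whereas a second non-static call_count or scaled_sum elsewhere in the program would be a duplicate definition.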

And if I were designing a new C-like language (without much regard
for backward compatibility), one thing I'd probably change is
to give file-scope definitions internal linkage by default -- or
perhaps to require an explicit linkage ("extern" or "static" --
or maybe "external" or "internal") on each file-scope definition.
 
fir

On Friday, 28 September 2012 20:09:36 UTC+2, Bart wrote:
Some of this stuff is already done with dynamic libraries, although there it
is mainly functions that are imported and exported.

Within a single dynamic library L, you can have modules a and b that are
statically linked, and share functions and data, using names that are not
accessible from outside the library, unless exported.

Certainly on Windows, these libraries are imported as shared libraries, and
not statically linked. So all the names shared between the modules that
comprise opengl, for example, stay private within that library.

But this isn't down to the linker; the dynamic loader, on Windows anyway, is
separate.

Sure, I have used the names windows and opengl
for illustration instead of just a or b;
I meant just static linking here

It was a part of my remark on how modular
programming could be simplified in an improved C ->

headers could be thrown out; any C source file
of a given module could look like

module blitter;

reaches log;
reaches timers;

// code here

module blitter (blitter.c compiled to
blitter.obj) links to log.obj (which was
obtained from the log.c module source) and
timers.obj (obtained from timers.c)

no headers, no symbol declarations; here the linker
should also link only between referenced modules,
not between all possible ones in a crosswise manner

linking between all possible pairs just has no
reason, and that is the main reason I say it is
wrong

throwing all symbols (and their referenced data)
into one global bag
1) has no reason
2) exposes symbols to name conflicts
3) makes modular systems somewhat flat

that is what I want to say.
 
fir

No, it is not a bug. A bug is where something does not work as
intended. That is not the case here: it is working as intended,

I mean a 'design bug'; I mean that a slight change of
it leads to a better language.
I wouldn't call it adding red tape; it is more
like removing an assumption.
 
Stephen Sprunk

sure, I have used the names windows and opengl for illustration instead
of just a or b; I meant just static linking here

It was a part of my remark on how modular programming could be
simplified in an improved C ->

headers could be thrown out; any C source file of a given module
could look like

module blitter;

reaches log; reaches timers;

// code here

module blitter (blitter.c compiled to blitter.obj) links to log.obj
(which was obtained from the log.c module source) and timers.obj
(obtained from timers.c)

no headers, no symbol declarations; here the linker should also link only
between referenced modules, not between all possible ones in a crosswise manner

What you are describing is not C, and changing C in this way would
instantly break billions of lines of code that work just fine today.

There are other languages that work that way, e.g. Java, and you are
welcome to use them if that's what you prefer. However, if you want to
use C, then you need to learn how C works.
linking between all possible pairs just has no reason, and that is the
main reason I say it is wrong

You are welcome to that opinion, but your opinion does not change the
fact that C does not work that way.

S
 
Stephen Sprunk

As to the above -

I would not say that, oh no. C is a work of genius,
it is brilliant; I am astonished by the spirit of
C and I have much respect for the spirit of C.

But C has some slight bugs and this is one of them.

Generally, a "bug" is defined as something that does not work as
designed; since C was designed to work this way, this does not qualify
as a bug.

S
 
Andrew Smallshaw

I mean a 'design bug'; I mean that a slight change of
it leads to a better language.
I wouldn't call it adding red tape; it is more
like removing an assumption.

It isn't even a design bug. A design bug would be something like
gets() which introduces a bug into any program using it simply
because of the way it is defined. This is simply the lack of a
feature you want.

As for the red tape issue, you acknowledge elsewhere in this thread
that you haven't fully determined how it should work. If you attempt
to fully codify it you'll see that there are any number of issues
that need to be addressed. It isn't as simple as saying "use this
module".

How do you define what each module is known as? From the filename?
That breaks if the same filename exists in multiple directories
and there are occasions when you would want to do precisely that
- think automatic code generation or maintenance. Can you include
multiple scopes? If so how do you resolve conflicts where the same
identifier is defined multiple times? Since something like this is
invariably going to involve some form of name mangling you need to
define and accommodate that too.

It would beggar belief that you could get this through committee
without someone tacking on some form of protection (public, private
etc) which would need to be addressed too.

In short, it's looking a lot less like C and a lot more like C++.
If you want to use C++, use C++, don't complain that C is defective
because the design choices made are not what you personally would
have chosen.
 
Eric Sosman

I mean a 'design bug'; I mean that a slight change of
it leads to a better language.
I wouldn't call it adding red tape; it is more
like removing an assumption.

Speaking for myself, I *like* the assumption that every
module in a program agrees that there is only one `printf'.
Why would I want Modules A,B,C to use one `printf' while X,Y,Z
use a different one?

Hmmm: If Module A uses the malloc() provided by Module B,
while Module X uses Module Y's malloc(), which free() should
Module Q use, if Q can be called from both A and X?
 
fir

On Saturday, 29 September 2012 04:36:56 UTC+2, Eric Sosman wrote:
Speaking for myself, I *like* the assumption that every
module in a program agrees that there is only one `printf'.
Why would I want Modules A,B,C to use one `printf' while X,Y,Z
use a different one?

In a big or very big program (as big as Europe, I would
say) you could want to use one instance of printf
in France and another printf in Belarus
(it is somewhat like namespaces:
the printf version is identified by its region;
of course you do not have to, and this may
very often be a bad way of writing code)

All this lies within the way and philosophy
of present C. I would not even call it a
change; I call it a 'slight bug removal'.

As to namespaces, it can work similarly to
namespaces but it is much simpler (it is
natural) and we can obtain the goodness of
namespaces without any syntactic add-on
- this is the better way. *It brings just no
disadvantages; the one and only thing one could
count as a disadvantage is that the linker
must be given a graph of module
connections, not just a list of all modules to
link (if you do not write it at the source level,
and in present C there is no way to do that,
so it must be given on the linker command line),
but that can be understood as natural
and as an advantage, because one can see the
application's module graph explicitly and see
whether it is well designed at the module graph level.
Hmmm: If Module A uses the malloc() provided by Module B,
while Module X uses Module Y's malloc(), which free() should
Module Q use, if Q can be called from both A and X?

Why should you use free in Q to free data in A
that was allocated in A? It is not bad, imo;

you should use the appropriate free for the appropriate malloc.

If you insist, you would have to link both B and Y to Q and get a linker error on the free symbol, so it will not build.
 
fir

On Saturday, 29 September 2012 04:24:24 UTC+2, Andrew Smallshaw wrote:
It isn't even a design bug. A design bug would be something like
gets() which introduces a bug into any program using it simply
because of the way it is defined. This is simply the lack of a
feature you want.

As for the red tape issue, you acknowledge elsewhere in this thread
that you haven't fully determined how it should work. If you attempt
to fully codify it you'll see that there are any number of issues
that need to be addressed. It isn't as simple as saying "use this
module".

No, there is no issue at all. Please, anyone
may try to find a real issue; there is
none.


How do you define what each module is known as? From the filename?
That breaks if the same filename exists in multiple directories
and there are occasions when you would want to do precisely that
- think automatic code generation or maintenance. Can you include
multiple scopes? If so how do you resolve conflicts where the same
identifier is defined multiple times? Since something like this is
invariably going to involve some form of name mangling you need to
define and accommodate that too.

I do not understand the troubles you
mention; modules are identified by their
names a.obj b.obj c.obj and so on

instead of

link main.obj windows.lib stdlib.lib asm_lib.obj

as in the present form (where all symbols are
combined and may conflict) you should explicitly give the real graph of references
to the linker, for example (the most handy form of
the linker command line is yet to be
chosen)

link main.obj -> windows.lib ,
main.obj -> stdlib.lib ,
main.obj -> asm_lib.obj ,
asm_lib -> stdlib.lib
 
Andrew Smallshaw

On Saturday, 29 September 2012 04:24:24 UTC+2, Andrew Smallshaw wrote:


I do not understand the troubles you
mention; modules are identified by their
names a.obj b.obj c.obj and so on

instead of

link main.obj windows.lib stdlib.lib asm_lib.obj

Right, so you criticise C for not having a feature that you don't
even want to add to C. Once you have an object file C is over and
done with. What you propose is a modification to the _linker_.
How is this a limitation of C? By avoiding doing the job
properly at the source level you relegate vital information to
_another_ language, namely the description file you'll need to
control the linker operation. Work through all the possible cases
and you will soon see that it is beyond what is reasonable for
command line switches.

This is simply confirming what I suggested previously, this is an
idle hankering for something that hasn't been thought through. We
all get them from time to time: only a few days ago I found myself
wishing I could push a sequence of tokens back into the input in
a yacc action. You can't and after a little thought I realised
why not. You need to go through a similar process here and address
_all_ the problem areas before you have any proposal at all.
 
fir

You're reinventing namespaces, but poorly.

This is not a matter of namespaces.

The real point is that combining
(trying to link) all modules with all other modules in the C linking process has no reason.
- It is illogical and it brings symbol
pollution as a consequence

(One should not cure this by adding
namespaces; that is not in the spirit of C
in such things, it would be bad)




In your design, can more than one translation unit be part of the same
"module"? IOW, I might well want to split the "windows" module into
several pieces of source. Can a translation unit contain bits from
more than one module? How would that be scoped? How is this better
than C++ namespaces (which could be grafted onto C with little impact
on the rest of the language)?

By module I mean a.obj (and its corresponding
source a.c) (a.obj is the module binary,
a.c is the module source). So by module I mean
the thing you would probably call a translation unit;
I just use the word module.

The rest of the question I only partially
understand, probably because of my weak English.

As I said, this change is a matter of linking
only, so sources will be the same; you could
use the windows.h header - and link with some
corresponding .lib it brings to compiler
 
fir

On Saturday, 29 September 2012 08:42:21 UTC+2, Andrew Smallshaw wrote:
Right, so you criticise C for not having a feature that you don't
even want to add to C. Once you have an object file C is over and
done with. What you propose is a modification to the _linker_.
How is this a limitation of C? By avoiding doing the job
properly at the source level you relegate vital information to
_another_ language, namely the description file you'll need to
control the linker operation. Work through all the possible cases
and you will soon see that it is beyond what is reasonable for
command line switches.

This is simply confirming what I suggested previously, this is an
idle hankering for something that hasn't been thought through. We
all get them from time to time: only a few days ago I found myself
wishing I could push a sequence of tokens back into the input in
a yacc action. You can't and after a little thought I realised
why not. You need to go through a similar process here and address
_all_ the problem areas before you have any proposal at all.

I see you do not understand, and then you say
'that is all'. What 'problem areas', what
'all possible cases'? You suspect them
but they are nonexistent. Please find
one. You will not find one.
This is a matter of a simple question, as
I said:

There is no reason to link all modules
with all other modules (even between
pairs of modules which are not intended
to be linked together)

It is illogical

It brings symbol namespace pollution as
a consequence

Such illogical design choices should not be cured
by adding namespaces on top as a patch - they should be cured by taking this
slightly illogical behaviour back out of the linker

If one understands that, one will see
it, and then will say, yes, you're right (and that is all). I wanted to show that.
 
Stephen Sprunk

This is not a matter of namespaces.

The real point is that combining (trying to link) all
modules with all other modules in the C linking process has no reason. -
It is illogical and it brings symbol pollution as a consequence

(One should not cure this by adding namespaces; that is not in the
spirit of C in such things, it would be bad)

The correct solution to symbol pollution is namespaces, for a variety of
reasons.

C does not have namespaces. C++ does; if you want namespaces, then use
C++. The newsgroup for that is down the hall and to the left.

(It is a common convention in C to use prefixes as a rudimentary form of
namespaces, though. It's just not an explicit feature of the language.)
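
A sketch of that prefix convention, assuming two hypothetical modules log and timers that both want an init, an add and a count. Both halves are shown in one listing for brevity; in practice each would live in its own .c file.

```c
/* All names here are hypothetical; the point is only the prefix. */

/* --- would live in log.c --- */
static int log_entries;                 /* private state, internal linkage */
void log_init(void)  { log_entries = 0; }
void log_add(void)   { log_entries++; }
int  log_count(void) { return log_entries; }

/* --- would live in timers.c --- */
static int timers_active;               /* private state, internal linkage */
void timers_init(void)  { timers_active = 0; }
void timers_add(void)   { timers_active++; }
int  timers_count(void) { return timers_active; }
```

Because every external name carries its module's prefix, log_count and timers_count can coexist in one program even though both modules think of the operation as "count".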
As I said, this change is a matter of linking only, so sources will
be the same; you could use the windows.h header - and link with some
corresponding .lib it brings to compiler

C does not define how linking works, nor even require that such a step
exist at all. Linking is a mere artifact of how some (but not all)
implementations work.

S
 
fir

On Saturday, 29 September 2012 09:38:03 UTC+2, Stephen Sprunk wrote:
The correct solution to symbol pollution is namespaces, for a variety of
reasons.

You're wrong; there is no need to 'correct
symbol pollution' because symbol pollution
would NOT EVEN EXIST if the linker did not
stupidly try to link unrelated modules

pollution is a consequence of something
that has no reason to exist; just remove it,
do not build a workaround on top of it






C does not have namespaces. C++ does; if you want namespaces, then use
C++. The newsgroup for that is down the hall and to the left.

(It is a common convention in C to use prefixes as a rudimentary form of
namespaces, though. It's just not an explicit feature of the language.)

C does not define how linking works, nor even require that such a step
exist at all. Linking is a mere artifact of how some (but not all)
implementations work.

If this is not in the C language, it is an error
in the linking system, but AFAIK it is common
in C
 
BartC

fir said:
On Friday, 28 September 2012 20:09:36 UTC+2, Bart wrote:
It was a part of my remark on how modular
programming could be simplified in an improved C ->

headers could be thrown out; any C source file
of a given module could look like

module blitter;

reaches log;
reaches timers;

What happens when log exports XYZ, and so does timers?

Which one should be linked to when 'XYZ' is encountered in this module? You
will probably want to make use of both.

See, it is a namespace issue (although you say it isn't).

There are also issues even when considering only a single module: two
functions FA() and FB() want to share some common data, perhaps a static XYZ
declared inside one of them. How to do this, without making it visible to
all functions? How do functions FC() and FD() similarly share a name XYZ
without clashes?

Again, this is about language and namespaces, and not linking.
throwing all symbols (and their referenced data)
into one global bag
1) has no reason
2) exposes symbols to name conflicts
3) makes modular systems somewhat flat

I have a project that uses a single, global 'bag' of names that are all
unique. Yet they represent a structured, hierarchical symbol table where the
same identifier can be reused at any of the levels of the table.

The language can superimpose this stuff on the linker, without having to
write a new linker. (Actually the language could dispense with the linker
completely if it was so minded!)
 
