Boost process and C

Ed Jensen

CBFalconer said:
If you publish your source under GPL, there is very little chance
of conflicts. In the case of things I have originated, all you
have to do is contact me to negotiate other licenses. I can be
fairly reasonable on months with a 'R' in them.

Many years ago, I helped write a moderately large project in C. More
recently, I helped rewrite the project in Java. Due to the standard
Java library, here are some of the advantages I encountered:

1. Much less time spent evaluating dozens of third party libraries.

2. Much less time spent ensuring legal compliance with dozens of third
party libraries.

3. Much less time spent keeping third party libraries updated (when
new features became available that we wanted to leverage or exploits
in older versions were discovered).

4. Many fewer cross platform issues.

Alas, I can already tell this post will make me look like a wild eyed
Java zealot. All I can do is state that my only real intention with
this post is to demonstrate the value of a comprehensive standard
library.
No, there is nothing wrong with expanding the standard library.
Nothing forces anyone to use such components anyhow. There is
provision in the standard for "future library expansion". This is
a far cry from bastardizing the language with overloaded operators
and peculiar non-standard syntax, as recommended by some of the
unwashed.

I mostly agree. C++ already fills that role, for those who care to
use it. (I say mostly because I would like to see some minor changes
to C syntax, but nothing too wild.)
Go ahead and advocate. I would certainly like to see at least
strlcpy/cat in the next standard, with gets removed, and possibly
my own hashlib and ggets added. What all of those things are is
completely described in terms of the existing C standards, so the
decisions can be fairly black and white.

I think those changes would be an excellent start. I'm not sure C
will be able to continue to grow without starting to break from the
past (at least a little bit).
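
For anyone who has not met strlcpy, here is a minimal sketch of the behaviour being asked for, written only in terms of standard C. The name bounded_copy is hypothetical (chosen so it cannot clash with a system-provided strlcpy); this is an illustration, not the BSD source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with strlcpy-style semantics: copies at most
 * size - 1 characters from src into dst, always terminates dst when
 * size > 0, and returns strlen(src) so the caller can detect truncation
 * (a return value >= size means the copy was cut short). */
size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t src_len;

    assert(src != NULL);
    assert(dst != NULL || size == 0);

    src_len = strlen(src);
    if (size > 0) {
        size_t n = (src_len < size - 1) ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

A caller would write something like bounded_copy(buf, input, sizeof buf) and treat a result of sizeof buf or more as truncation. Unlike gets, the destination size is always part of the call.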
 
Keith Thompson

Ian Collins said:
That's one of the things that hurts C, think how much more productive
those programmers would be if they didn't have to write their own string
library.

Nobody *has* to write his own string library. There are a number of
them floating around, as a quick Google search will show.

In some cases, if you only need a few operations, writing your own
might turn out to be easier than tracking down an existing library and
learning to use it.
 
Ian Collins

Keith said:
Nobody *has* to write his own string library. There are a number of
them floating around, as a quick Google search will show.

In some cases, if you only need a few operations, writing your own
might turn out to be easier than tracking down an existing library and
learning to use it.
But if there was a standard one...

I fear that C is in danger of shrinking into that ever diminishing niche
where other languages can't go. Give the language library some standard
containers, string, regular expressions and let it compete on a level
playing field with more recent languages.
 
Walter Roberson

CBFalconer said:
For example, a future standard could restrict 'precedence' to three
levels (e.g. logical, additive, and multiplicative) only, requiring
parentheses for any further control, yet allowing the actions of
the present silly system.

People might possibly grudgingly accept needing parens for
~ and !, but there is a long history of unary minus in indicating the
sign of constants and I'm not sure how happy people would be with
needing parens around every negative number.

I think people might also object to needing to put parens around
the elements of the triple in a for loop:

for ((i=10);(i>(-1));(i--)) ({ ((A[(i*2)])=(A[(i+1)])) });

since you also eliminated the precedence associated with
array indexing, assignment, and compound blocks.

Hmmm, how should that assignment be written? As
parens would be needed to demark lvalues (since they
are not logical, additive, or multiplicative)... but
the parens would imply taking the value of the array
element at that point, rather than the address...
 
websnarf

Ian said:
That's one of the things that hurts C, think how much more productive
those programmers would be if they didn't have to write their own string
library.

I don't know if you are being sarcastic or not. Certainly C
programmers would be more productive if they were not wasting time
debugging buffer overflows. I think C is the only language or
mechanism which has hundreds of CERT advisories for the exact same bug. C
programmers might be more productive if they didn't need to reroll a
hash table, or vector or myriad of other data structures that come
prepackaged in other languages (and debug them) as well.
 
Ian Collins

I don't know if you are being sarcastic or not. Certainly C
programmers would be more productive if they were not wasting time
debugging buffer overflows. I think C is the only language or
mechanism which has hundreds of CERT advisories for the exact same bug. C
programmers might be more productive if they didn't need to reroll a
hash table, or vector or myriad of other data structures that come
prepackaged in other languages (and debug them) as well.
No sarcasm intended. I agree with all you say.
 
CBFalconer

Ed said:
Many years ago, I helped write a moderately large project in C. More
recently, I helped rewrite the project in Java. Due to the standard
Java library, here are some of the advantages I encountered:

1. Much less time spent evaluating dozens of third party libraries.

What evaluation? You do it once. You can also evaluate the
source.
2. Much less time spent ensuring legal compliance with dozens of third
party libraries.

If you operate under GPL what compliance problems? If you want
something for nothing and also want to absorb it, that's another
matter.
3. Much less time spent keeping third party libraries updated (when
new features became available that we wanted to leverage or exploits
in older versions were discovered).

What updates? If things are written in standard C they won't need
updating, apart from insects. The older the source files dates the
better (up to a point).
4. Many fewer cross platform issues.

Once again, use standard C. Virtually eliminates platform issues.

Please don't strip attributions for material you quote.

If you want a specialized library for the DeathStation when fitted
with three Mark XVII missiles, then you should probably resign
yourself to writing it for yourself.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
 
CBFalconer

Walter said:
People might possibly grudgingly accept needing parens for
~ and !, but there is a long history of unary minus in indicating the
sign of constants and I'm not sure how happy people would be with
needing parens around every negative number.

The unary minus is a chimera. C does not parse these things in
that form. The action of "-32768" is to apply a negation to the
positive integer 32768. If that creates an overflow, tough.

Examining the definition of INT_MIN in limits.h may be informative.
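
A small sketch of what that examination tends to show. Many implementations define INT_MIN as something like (-INT_MAX - 1) rather than as a minus sign in front of a single constant; the exact spelling in any given limits.h is an assumption here, and both function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: reports whether INT_MIN is one below -INT_MAX,
 * which is why it cannot be spelled as a minus sign in front of a single
 * int constant. True on two's complement implementations. */
int int_min_is_one_below_negated_max(void)
{
    return INT_MIN == -INT_MAX - 1;
}

/* Hypothetical helper: "-32768" is the constant 32768 with unary minus
 * applied, so the expression has whatever type the constant 32768 has
 * (int where int is wider than 16 bits, a wider type otherwise). */
int negation_keeps_constant_type(void)
{
    assert(INT_MAX >= 32767);   /* guaranteed by the C standard */
    return sizeof(-32768) == sizeof(32768) && -32768 == -32767 - 1;
}
```

Where int is 16 bits, the constant 32768 does not fit in int and takes a wider type, so the negation is carried out in that wider type.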

 
CBFalconer

Ian Collins wrote:
.... snip ...

I don't know if you are being sarcastic or not. Certainly C
programmers would be more productive if they were not wasting
time debugging buffer overflows. I think C is the only language
or mechanism which has hundreds of CERT advisories for the exact
same bug. C programmers might be more productive if they didn't
need to reroll a hash table, or vector or myriad of other data
structures that come prepackaged in other languages (and debug
them) as well.

Well, I can't remember debugging a buffer overflow in my own code.
I do regularly use the standard string package.

 
Joe Wright

CBFalconer said:
Well, I can't remember debugging a buffer overflow in my own code.
I do regularly use the standard string package.
No you don't. Not all of it. I'll bet you haven't used gets() in years. :)
 
Ed Jensen

CBFalconer said:
What evaluation? You do it once. You can also evaluate the
source.

Even once can involve a lot of overhead if you need a few dozen
libraries and there are 3 or 4 competitors in each area. Take a look
at the sometimes obscene dependencies list on non-trivial projects.

Having the source available is always an advantage.
If you operate under GPL what compliance problems? If you want
something for nothing and also want to absorb it, that's another
matter.

Operating under the GPL is often not an option. In addition,
sometimes the solution you've chosen is not available under the GPL.
What updates? If things are written in standard C they won't need
updating, apart from insects. The older the source files dates the
better (up to a point).

If you're linking in a few dozen libraries, I should hope you're
keeping abreast of updates to see if newer versions fix exploits.
Once again, use standard C. Virtually eliminates platform issues.

It's possible to write extremely portable C code. It can also be an
extremely non-trivial exercise.
 
jacob navia

Ed Jensen wrote:
[snip]
It's possible to write extremely portable C code. It can also be an
extremely non-trivial exercise.

Portable C will never have the speed of fine tuned assembler/optimized
libraries that have been written by the compiler system writers and use
each advantage for the specific platform they are writing to.

If you want to remain portable, you can't use any speedups, shortcuts or
system specific stuff by definition.

The advantage of language wide libraries like the C library is that the
compiler vendors can optimize it for each platform, underneath a common
interface visible by the library user.

jacob
 
websnarf

CBFalconer said:
Well, I can't remember debugging a buffer overflow in my own code.
I do regularly use the standard string package.

And therefore there are no CERT advisories related to buffer overflows
when using C string functions. Right? Wow -- pure genius, you just
solved far and away the biggest software security problem in existence
(by a ridiculous margin) with mere force of your eloquent disposition.
 
websnarf

CBFalconer said:
What evaluation? You do it once. You can also evaluate the
source.

What exactly do you think he's talking about? There is a lot of
unusable crap out there, and you have to examine things seriously
before you use them. You end up evaluating everything you use as well
as everything you consider to be a candidate but do not use.
If you operate under GPL what compliance problems? If you want
something for nothing and also want to absorb it, that's another
matter.

Tell that to Microsoft, SCO or other proprietary vendors. BSD and MIT
tend to be far more compatible licenses. You only use the GPL if you
are supporting the FSF agenda -- i.e., if you intentionally want to
keep proprietary vendors from using your code (even if they make a
widely deployed C compiler).
What updates? If things are written in standard C they won't need
updating, apart from insects. The older the source files dates the
better (up to a point).

You write bug free, scalable code that never needs to be updated? Oh
wait, I *know* you don't do that -- perhaps it's some code you don't
publish on your website that we never get to see that is written to
such perfection.

The truth is that for all serious non-trivial libraries, you have to
make provisions for periodic updates. And C libraries are worse
because there is a much higher probability of being bitten by a flaw of
catastrophic proportions. The standard JPEG library flaw, that allowed
people to arbitrarily take over client machines remotely by hosting
specially constructed exploit images is merely the latest in classic
examples of this.
Once again, use standard C. Virtually eliminates platform issues.

Use any other modern language and it does a better job at eliminating
platform issues.
If you want a specialized library for the DeathStation when fitted
with three Mark XVII missiles, then you should probably resign
yourself to writing it for yourself.

This problem of dealing with control of WMDs is something completely
different -- it affects more people than the people creating them. In
that event you throw money at the problem, by paying for review, not
operating on deadlines set by marketing and you program them in serious
and verifiable languages like Ada.
 
Malcolm

Ian Collins said:
I must be missing something, you have the same problem with pointers,
don't you?

If you see someFn( &x ), how do you know if someFn's prototype is

void someFn( int* ); or
void someFn( const int* );

without looking it up?
If the function takes a pointer to a single integer it must either change
that integer, sometimes change the integer, or sometimes go through versions
which change the integer. Otherwise the author would have written it to take
an int.
(The unfortunate exception is functions which are written in Fortran but
called from C. Fortran takes all arguments as pointer.)

When x is an array, there is a problem. With hindsight K and R should have
included some sort of syntactical marker, such as putting writable arguments
in <> brackets, to mark arrays as writeable. It is too late now.
 
Ian Collins

Malcolm said:
If the function takes a pointer to a single integer it must either change
that integer, sometimes change the integer, or sometimes go through versions
which change the integer. Otherwise the author would have written it to take
an int.

OK, int was a bad choice, replace it with someStruct and the argument is
more compelling.
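
To make the struct case concrete, here is a rough sketch. The type and function names are hypothetical stand-ins for someStruct and someFn. Both functions take a pointer, one only to avoid a copy, and the call sites look the same.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type standing in for the someStruct mentioned above. */
struct some_struct {
    int values[4];
};

/* Takes a pointer only to avoid copying; const promises not to modify. */
int sum_values(const struct some_struct *s)
{
    int total = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof s->values / sizeof s->values[0]; i++)
        total += s->values[i];
    return total;
}

/* Takes a non-const pointer because it writes through it. */
void clear_values(struct some_struct *s)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof s->values / sizeof s->values[0]; i++)
        s->values[i] = 0;
}
```

At the call site, sum_values(&x) and clear_values(&x) are indistinguishable without looking up the prototypes.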
 
John F

Richard Tobin said:
That's OK, I understand mathematics, tell me the "mathematical sense"
in which it's meaningless. Perhaps you could start with a mathematical
definition of "meaningless", since I've never come across one.

It's no longer a date. So you are no longer within the same set. You
leave the domain, which is not acceptable and thus not meaningful for
an inner binary operator as the usual addition is.
What binary operator?
"+"


It's something which when multiplied by 2 gives you March 13 2003,
and which when added to (March 11 2003) / 2 gives you March 12
2003.
It is not, of course, a date. There isn't any conventional way to
write it, nor is there a conventional name for its type. But we can
perfectly well define various operations on it.

Sure we can. But here you agree that the result is meaningless in the
sense of being a date (thus not staying within the domain). You may
define an outer binary operator.
 
John F

Keith Thompson said:
John F said:
There is a difference between

(date1+date2)/2 (adds dates and divides by two)

and

date1+(date2-date1)/2 (proportionally adds the difference between two
dates to another date)

The former is meaningless (in a mathematical sense), the latter is
perfectly legal.

[...]

The former includes a subexpression that yields a result that's not
meaningful, but the expression as a whole is meaningful.

Thanks for clarification. I was referring to the math behind it where
it yields an error in the domain of the binary operator +...
Dates, temperatures, and pointers are all similar in the sense that
each can be thought of as a pair: base+offset, where the base is
respectively an arbitrary epoch, an arbitrary zero temperature, or a
zero address (or the base address of a containing object). In each
case, the "base" is some fixed point, and the "offset" is a scalar
quantity. Offsets by themselves can be added and subtracted freely;
the base is meaningful only if you have exactly zero or one of it.
(With no base, you have an interval or distance; with a base, you have
a position.)

(Assume Celsius or Fahrenheit temperatures; the zero base of Kelvin
scale is uniquely meaningful, so it's not restricted to this model.)
Subtracting two dates
date1 - date2
is equivalent to:
(base+offset1) - (base+offset2)
which reduces to:
offset1 - offset2
which is meaningful; it's an interval, measurable in seconds, with no
defined base.

Exactly. At least if you assume the same base for the representation,
and the same scale for the offset too.
Adding two dates:
date1 + date2
is equivalent to:
(base+offset1) + (base+offset2)
which reduces to:
2*base + offset1 + offset2
which is not directly meaningful because of the 2*base term.

Exactly. You can't tell what it should be. Same applies as above. Same
scale for the offset and a common base can be found to set the base_x
values. Which should be quite hard for a time-scale. One can always go
back and back... With temperature a natural limit showed up with
absolute zero.

Have you ever tried to translate a Chinese date into the Gregorian
calendar? It is really nice.
However, if it's used as an intermediate expression, as in:
(date1 + date2) / 2
we have
((base+offset1) + (base+offset2)) / 2
or
(2*base + offset1 + offset2) / 2
or
base + (offset1 + offset2)/2
a meaningful expression that denotes the midpoint between the two
dates.

Again assuming a common base and the same scale for the offset. (just
to remind :)
Ideally, we'd like to make expressions with meaningful results legal,
and expressions without meaningful results illegal. Theoretically, we
could do this by allowing meaningless intermediate results, as long as
the result of the full expression is meaningful.
Agreed.

There are (at least) two problems with this approach. First, it makes
the language rules more complex, which affects both compiler writers
(who can deal with it) and users (who likely can't). Try explaining
to a newbie that (date1 + date2), or (pointer1 + pointer2), is usually
illegal, except that it's allowed as a subexpression if and only if
the expression as a whole obeys certain constraints.

I don't want to do that. Really. Avoiding objects where intermediate
results can't be illegal as a whole would fix 80% of all bugs, I
guess.
Second, it makes it difficult to define the circumstances in which
overflow can occur.
Yes.

Pointer arithmetic in C is defined only within a single object (or
just past its end); allowing pointer+pointer gives you intermediate
results that not only don't have an obvious meaning, but may not be
representable, depending on where in the address space the object
happens to be allocated.

Exactly. It would be hard to implement too.
Ignoring the overflow problem, if you're using raw numbers to
represent dates, something like (date1+date2)/2 is ok, as long as you
have the discipline to avoid writing a full expression that's not
meaningful:
x = date1 + date2; /* meaningless, but the compiler won't complain */

At least with weak checking. One could argue that we could define a
type "SOD" (Sum Of Dates), and you would need to make this type illegal
for an lvalue.

The "2" in the expression
x = (date1 + date2)/2 should get type "DD" (DateDivider)

and [SOD]/[DD] = [D], such that x would get type D again. It is a lot
of brainy stuff, but it should work. (I have never seen a language with
such a feature, except VB with its "Variant", where almost every
operation yields a meaningful result regarding the type.)

I'd rather have a function x = mediandate( date1, date2 ); for that
job.
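
A minimal sketch of such a function-based approach in standard C, assuming both dates share one epoch and one scale (seconds). The type and function names are hypothetical, and overflow is ignored, as in the discussion above.

```c
#include <assert.h>

/* Hypothetical types: a date is a position (offset in seconds from one
 * shared epoch); an interval is a plain signed distance in seconds. */
typedef struct { long long seconds_since_epoch; } cal_date;
typedef struct { long long seconds; } cal_interval;

/* date - date yields an interval: the shared base cancels out. */
cal_interval date_diff(cal_date later, cal_date earlier)
{
    cal_interval d;
    d.seconds = later.seconds_since_epoch - earlier.seconds_since_epoch;
    return d;
}

/* date + interval yields a date again. */
cal_date date_add(cal_date start, cal_interval delta)
{
    cal_date r;
    r.seconds_since_epoch = start.seconds_since_epoch + delta.seconds;
    return r;
}

/* Midpoint computed as a + (b - a) / 2, so the meaningless sum a + b is
 * never formed. */
cal_date date_midpoint(cal_date a, cal_date b)
{
    cal_interval half = date_diff(b, a);
    cal_date mid;

    half.seconds /= 2;
    mid = date_add(a, half);

    /* Sanity check: the midpoint lies between the two inputs. */
    assert((a.seconds_since_epoch <= mid.seconds_since_epoch &&
            mid.seconds_since_epoch <= b.seconds_since_epoch) ||
           (b.seconds_since_epoch <= mid.seconds_since_epoch &&
            mid.seconds_since_epoch <= a.seconds_since_epoch));
    return mid;
}
```

Because dates and intervals are distinct struct types and no function accepts two dates to add, passing a date where an interval is expected is rejected at compile time. In this sketch that takes the place of a separate "sum of dates" type.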
If you're using a hypothetical language that does this kind of smart
type checking, allowing date1+date2 as a subexpression but not as a
full expression, you can add dates to your heart's content and depend
on the implementation to detect any type matching errors. It would be
interesting to design this kind of thing, either as a language feature
(off-topic here), or as a library with run-time checking (doable in
standard C, with functions rather than overloaded operators). Again,
overflow is a concern unless you can either use extended precision for
intermediate results, or rely on the implementation to rearrange the
expressions for you.

Using the natural division: char for day, char for month, long long
for year and double for the seconds within a day it shouldn't be any
problem. It is all a matter of promotion :)
Finally, if you're using standard C and operating on pointers, you'll
just have to rearrange the expression yourself. If you need to know
the address halfway between two pointers, you can't write:
mid = (ptr1 + ptr2) / 2;
You'll just have to bite the bullet and write:
mid = ptr1 + (ptr2 - ptr1) / 2;

One should add: for ptr2 >= ptr1.
(never seen a negative pointer in C)

(sorry for the long post, thanks for reading up till here)
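
For completeness, the rearranged pointer expression as a small function. The name mid_pointer is hypothetical, and both arguments are assumed to point into (or one past the end of) the same array, since pointer subtraction is only defined in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the element halfway between lo and hi.
 * Both pointers must refer to the same array (or one past its end). */
const int *mid_pointer(const int *lo, const int *hi)
{
    ptrdiff_t distance;

    assert(lo != NULL && hi != NULL);

    distance = hi - lo;          /* pointer - pointer: a signed offset */
    return lo + distance / 2;    /* pointer + offset: a pointer again */
}
```

The difference has the signed type ptrdiff_t, so in this sketch a negative distance (hi before lo) still produces a pointer between the two arguments.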
 
Keith Thompson

Malcolm said:
If the function takes a pointer to a single integer it must either change
that integer, sometimes change the integer, or sometimes go through versions
which change the integer. Otherwise the author would have written it to take
an int.
(The unfortunate exception is functions which are written in Fortran but
called from C. Fortran takes all arguments as pointer.)

When x is an array, there is a problem. With hindsight K and R should have
included some sort of syntactical marker, such as putting writable arguments
in <> brackets, to mark arrays as writeable. It is too late now.

You can't have a parameter of array type in C. This:
void foo(int x[]);
is precisely equivalent to this:
void foo(int *x);

In my opinion it would have been better to drop the [] syntax for
parameters, and require them to be written as the pointers they really
are.

Either that, or somehow make arrays first-class objects and actually
support passing them as parameters.

C99 adds some new syntax for array parameters, which complicates the
discussion -- but it is, as you say, too late now.
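
A short sketch of the equivalence, using a hypothetical function named fill. The same function is declared once with array syntax and once with pointer syntax; a compiler accepts both declarations because they declare the same type.

```c
#include <assert.h>
#include <stddef.h>

/* Declared with array syntax; the parameter is adjusted to int *. */
void fill(int x[], size_t n, int value);

/* The same declaration written with the pointer it really is. The two
 * declarations are compatible, so repeating it is legal. */
void fill(int *x, size_t n, int value);

void fill(int *x, size_t n, int value)
{
    size_t i;

    assert(x != NULL || n == 0);
    for (i = 0; i < n; i++)
        x[i] = value;   /* writes through to the caller's array */
}
```

Since only a pointer arrives, the element count has to be passed separately, and nothing in the declaration tells the caller whether the elements will be written unless const is used.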
 
Richard Bos

CBFalconer said:
The unary minus is a chimera. C does not parse these things in
that form. The action of "-32768" is to apply a negation to the
positive integer 32768.

That is precisely Walter's point. Under your restricted precedence
rules, -32768 would have to be written as -(32768) (and in most contexts
as (-(32768)) ), since unary minus is not a logical, additive or
multiplicative operator. (In particular, despite appearances it is not
the additive minus operator, since it must have a different precedence
in expressions such as -3*4, which is (-3)*4, not -(3*4).)

Richard
 
