C/C++ question about dynamic "static struct"

gwowen

Fair enough; I'm not sure what you were driving at then. It's fully
possible for 'C' programs to leave no widowed
resources, memory or otherwise.

No-one has suggested otherwise. But that's not what RAII is. RAII is
mapping resource acquisition/release to object lifetime, and thus
using the *automatic* object construction/destruction semantics of the
language to ensure correct resource management.

C doesn't have meaningful automatic, deterministic construction/
destruction of non-trivial objects, so you can't do RAII (as it is
understood) in pure C.
It's not even that difficult.

Well, there we'll have to differ.
 
Les Cargill

gwowen said:
Les was too busy correcting University Dons about category mistakes...

LOLz!

One a' them Frenchies, anyway! :) Trotting out "category error" on
usenet is problematic - it's hard enough just getting people to agree
to what words mean.
 
Les Cargill

gwowen said:
No-one has suggested otherwise. But that's not what RAII is. RAII is
mapping resource acquisition/release to object lifetime, and thus
using the *automatic* object construction/destruction semantics of the
language to ensure correct resource management.

Ach! Burry McDonald is nae true Scotsman :)

I still suspect that some of that contains distinctions without
differences. But I like the working definition you gave - "mapping
construction/deconstruction to object lifetimes." That's
clearer than what I'd seen before.

maybe I'll get the privilege of actually *learning* RAII ( through
doing a project with it ) rather than reading about it. Otherwise,
it's like pronouncing a word you've only read...
C doesn't have meaningful automatic, deterministic construction/
destruction of non-trivial objects, so you can't do RAII (as it is
understood) in pure C.


Well, there we'll have to differ.

Well, you have to *cheat* :) ( Ever use "atexit()"?).

The thing I was talking about really isn't GC, and it really
isn't ( apparently ) RAII. Got the thing(s) shipped, though.

It's probably closer to mark()/release().
 
James Kuyper

....
I still suspect that some of that contains distinctions without
differences. But I like the working definition you gave - "mapping
construction/deconstruction to object lifetimes." That's
clearer than what I'd seen before. ....

Well, you have to *cheat* :) ( Ever use "atexit()"?).

Using an exit handler to deallocate a resource corresponds to RAII only
for objects whose lifetime ends precisely at the time the exit handler
is executed. That could only be objects with allocated storage duration
that are free()d inside the exit handler. All other objects have a
lifetime that ends too soon or too late to qualify as RAII. RAII is
often used for objects with lifetimes that could end long before the end
of the entire program.
 
ImpalerCore

Ach! Burry McDonald is nae true Scotsman :)

I still suspect that some of that contains distinctions  without
differences. But I like the working definition you gave - "mapping
construction/deconstruction to object lifetimes." That's
clearer than what I'd seen before.

maybe I'll get the privilege of actually *learning* RAII ( through
doing a project with it ) rather than  reading about it. Otherwise,
it's like pronouncing a word you've only read...



Well, you have to *cheat* :) ( Ever use "atexit()"?).

The thing I was talking about really isn't GC, and it really
isn't ( apparently ) RAII. Got the thing(s) shipped, though.

It's probably closer to mark()/release().

One option to do RAII in C in a limited sense is to reserve a buffer
of automatic storage within a function, and then direct allocations to
use that buffer. Typically these region allocators are simple linear
bump-the-pointer schemes with no individual object deallocation, so
one has to tune the buffer size to the expected input. There's
commonly a reset that "frees" all the objects at once. You have to
take care of the alignment yourself, using either 'alignof (type)' or
'ALIGN_MAX', but at the end of the function, the automatic
storage goes away and you can skip any deallocation that you would
have had to do.

It can be useful in functions that build some kind of precomputed
state that you want to optimize for smaller problems, but default back
to the standard allocator for bigger ones. An example could be
building a partial match table for string searching using the Knuth-
Morris-Pratt algorithm, or the matrix used to compute the Levenshtein
distance. Since automatic allocation is typically fast (bumping the
stack pointer) when compared to what happens internally using 'malloc'
and friends, it's possible to leverage automatic storage to squeeze
a little more performance out of your allocations in these kinds
of algorithms.
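
A minimal sketch of that pattern for the Knuth-Morris-Pratt case, assuming a hypothetical 64-entry threshold: the partial match table lives in a fixed automatic buffer for short patterns and falls back to malloc for longer ones. The names `kmp_build_table`, `kmp_find` and `KMP_STACK_ENTRIES` are made up for illustration and are not from any library mentioned in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: patterns up to this length use the stack buffer. */
#define KMP_STACK_ENTRIES 64

/* Fill table[0..m-1] with the KMP partial match table of pattern:
   table[i] is the length of the longest proper prefix of pattern[0..i]
   that is also a suffix of it. */
static void kmp_build_table( const char* pattern, size_t m, size_t* table )
{
    size_t i, k = 0;

    table[0] = 0;
    for ( i = 1; i < m; ++i ) {
        while ( k > 0 && pattern[i] != pattern[k] ) {
            k = table[k - 1];
        }
        if ( pattern[i] == pattern[k] ) {
            ++k;
        }
        table[i] = k;
    }
}

/* Return the index of the first occurrence of pattern in text,
   or -1 if there is none or if memory could not be obtained. */
long kmp_find( const char* text, const char* pattern )
{
    size_t stack_table[KMP_STACK_ENTRIES];  /* fixed automatic storage */
    size_t* table = stack_table;
    size_t n, m, i, k = 0;
    long result = -1;

    assert( text != NULL && pattern != NULL );
    n = strlen( text );
    m = strlen( pattern );

    if ( m == 0 ) {
        return 0;
    }

    /* Small patterns use the stack buffer; big ones fall back to malloc. */
    if ( m > KMP_STACK_ENTRIES ) {
        table = malloc( m * sizeof *table );
        if ( table == NULL ) {
            return -1;
        }
    }

    kmp_build_table( pattern, m, table );

    for ( i = 0; i < n; ++i ) {
        while ( k > 0 && text[i] != pattern[k] ) {
            k = table[k - 1];
        }
        if ( text[i] == pattern[k] ) {
            ++k;
        }
        if ( k == m ) {
            result = (long)( i + 1 - m );
            break;
        }
    }

    /* Only the heap path needs explicit cleanup. */
    if ( table != stack_table ) {
        free( table );
    }
    return result;
}
```

The stack cost is the same for every call, whatever the pattern length; only the threshold needs tuning.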

Best regards,
John D.
 
ImpalerCore

This again assumes RAII is only about releasing memory. If this was the
case, one could just add a Boehm garbage collector to the program and
forget about all those issues. In reality, RAII in C++ is equally
important for mutex unlocking, file handle closing, ending the hourglass
mode of the mouse cursor, etc.

Originally RAII was designed to handle memory cleanup in the presence
of exceptions, as it's pretty much a necessity to write exception safe
code in C++. I would say that once people saw how good RAII was at
memory cleanup, it was applied to other resource cleanup as well.
Hence the phrase "limited sense", as C does not provide language
support of anything resembling destructors.
For stack allocations there is the alloca() function, and also C99 VLA-s,
no need to reinvent the wheel. Or is this something different you are
talking about?

A general problem with using variable amounts of stack memory is that the
total amount of the stack space is quite small and implementation-
dependent and a stack overflow yields UB without any standard detection
or interception means. So writing a robust program with variable-size
stack allocations is pretty hard.

The difference is that this mechanism is a "fixed" stack allocation.
One issue is that alloca is not standard, and C99 VLAs are not available
in C90 environments, which limits the environments where they can be
applied. While the 'region' technique has the same recursion issues
that alloca and C99 VLAs have, the stack size reserved to an algorithm
that needs to allocate memory is fixed. That means that passing big
input doesn't result in a big alloca or VLA resize that blows up the
stack.

Here's an excerpt from my region allocator.

\code
struct c_region
{
  /*! \brief The start of the memory buffer. */
  unsigned char* start;

  /*! \brief The end of the memory buffer. */
  unsigned char* end;

  /*! \brief The next available address to allocate from in the
      region. */
  unsigned char* current;
};

void c_region_initialize( struct c_region* region, void* buffer,
                          size_t size )
{
  c_return_if_fail( region != NULL );
  c_return_if_fail( buffer != NULL );

  region->start = buffer;
  region->end = (unsigned char*)buffer + size;
  region->current = buffer;

  C_REGION_INVARIANT( region );
}

void* c_region_allocate( struct c_region* region,
                         size_t size,
                         size_t alignment )
{
  unsigned char* aligned_p;

  c_return_value_if_fail( region != NULL, NULL );
  C_REGION_INVARIANT( region );

  size = size == 0 ? 1 : size;
  alignment = alignment == 0 ?
              ALIGN_MAX :
              c_round_up_to_power_of_two( alignment );
  c_assert( c_is_power_of_two( alignment ) );

  aligned_p = c_align_up( region->current, alignment );
  c_assert( c_is_aligned_ptr( aligned_p, alignment ) );

  /* To determine whether the region has enough space, one must
     factor in the alignment padding as well as the object size.
     Compare sizes rather than forming 'aligned_p + size', which
     could overflow the pointer for a very large request. */
  if ( aligned_p > region->end ||
       size > (size_t)( region->end - aligned_p ) ) {
    return NULL;
  }

  region->current = aligned_p + size;

  C_REGION_INVARIANT( region );

  return aligned_p;
}
\endcode

\code
int c_levenshtein( const char* s1, const char* s2 )
{
  int distance;

  int m, n;
  int* proximity_matrix;
  struct c_region pm_region;
  unsigned char pm_workspace[256];
  bool c_malloc_used = false;

  c_return_value_if_fail( s1 != NULL, -1 );
  c_return_value_if_fail( s2 != NULL, -1 );

  m = strlen( s1 );
  n = strlen( s2 );

  /* If one of the strings is empty "", the edit distance is equal
     to the length of the non-empty string. */
  if ( m == 0 || n == 0 ) {
    return m + n;
  }

  c_region_initialize( &pm_region, pm_workspace, sizeof pm_workspace );
  proximity_matrix = c_region_allocate( &pm_region,
                                        sizeof (int) * (m+1) * (n+1),
                                        alignof (int) );
  if ( proximity_matrix == NULL )
  {
    proximity_matrix = c_malloc( sizeof (int) * (m+1) * (n+1) );
    c_malloc_used = true;
  }

  if ( proximity_matrix )
  {
    ++m;
    ++n;

    gc_compute_levenshtein_matrix( s1, s2, &m, &n, proximity_matrix );
    distance = proximity_matrix[m*n-1];

    if ( c_malloc_used ) {
      c_free( proximity_matrix );
    }
  }
  else {
    distance = -1;
  }

  return distance;
}
\endcode

The key point to take away from this example is that the size of the
stack used is independent of the size of s1 and s2. While alloca and
VLAs are convenient, they lend themselves to a programming style where
it's easy to blow the stack in the presence of a big string. In my
case, I structure the stack allocation so that if it exceeds the
maximum amount I reserved for it (the size of pm_workspace), I return
*NULL*. Adjusting the amount of stack space reserved for
'c_levenshtein' is done by changing the size of 'pm_workspace',
eliminating any manual code to delineate allocations between an alloca
or VLA and malloc; the logic is built into the c_region interface.

One can also leverage a static buffer in the same manner, although one
needs to manually reset the c_region after one has finished with the
workspace memory.
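
A minimal sketch of the static-buffer variant, assuming a simplified stand-in for c_region: `mini_region`, `mini_region_alloc`, `mini_region_reset` and the 1024-byte `workspace` are hypothetical names and sizes chosen for illustration, not the actual c_region interface. Because the workspace has static storage duration, nothing rewinds it automatically, so the reset has to be called by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the c_region shown earlier (names are made up). */
struct mini_region
{
    unsigned char* start;
    unsigned char* end;
    unsigned char* current;
};

/* Static workspace: it outlives every call, so nothing releases it for us. */
static unsigned char workspace[1024];
static struct mini_region static_region = {
    workspace, workspace + sizeof workspace, workspace
};

/* Bump-allocate 'size' bytes aligned to 'alignment' (a power of two).
   Returns NULL when the workspace is too full. Assumes the usual flat
   address model where uintptr_t arithmetic reflects pointer alignment. */
void* mini_region_alloc( struct mini_region* r, size_t size, size_t alignment )
{
    uintptr_t addr, aligned;
    size_t padding, remaining;

    assert( r != NULL );
    assert( alignment != 0 && ( alignment & ( alignment - 1 ) ) == 0 );

    addr = (uintptr_t)r->current;
    aligned = ( addr + ( alignment - 1 ) ) & ~(uintptr_t)( alignment - 1 );
    padding = (size_t)( aligned - addr );
    remaining = (size_t)( r->end - r->current );

    if ( padding > remaining || size > remaining - padding ) {
        return NULL;
    }

    r->current += padding + size;
    return r->current - size;
}

/* "Free" everything at once by rewinding the bump pointer. */
void mini_region_reset( struct mini_region* r )
{
    assert( r != NULL );
    r->current = r->start;
}
```

A function using `static_region` would call `mini_region_reset( &static_region )` before returning. Note that a shared static workspace makes such a function non-reentrant.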

I hope that adequately explains the benefit of going to the trouble of
using a region allocator.

Best regards,
John D.
 
Öö Tiib

Originally RAII was designed to handle memory cleanup in the presence
of exceptions, as it's pretty much a necessity to write exception safe
code in C++. I would say that when people saw how good RAII was in
memory cleanup that it was applied to other resource cleanup as well.
Hence the phrase "limited sense", as C does not provide language
support of anything resembling destructors.

The RAII idiom was invented by Bjarne Stroustrup. It means "Resource
Acquisition Is Initialisation", and he named it that himself.
The name is hard to grasp. You should not think, however, that he
originally meant "Memory Acquisition Is Initialization". Looking at his
FAQ, it seems that he knows very well what a "resource" is:
http://www.stroustrup.com/glossary.html#Gresource
 
Nick Keighley

You are entitled to. However, do you have an actual *argument* against
what I said, other than "I simply disagree" and "for instance?"

well I thought you were the one making the large claim; that a
language that is "simple" because it's small and reasonably easy to
learn isn't "simple" in its application. Obviously you can write
horrible programs in any language. C (or reasonably well written C) is
usually quite transparent.

I have to admit these days I write C++. Even for noddy little programs
(which I probably should be using a script for). std::vector,
std::string just being too good to pass over.
In things like Lisp, Haskell and Scheme, the simplicity of the language
usually translates to the code itself becoming simple and elegant.

I probably never became good enough with any of these for it to kick in. I
quite like scheme but it never seemed to end up being really "simple".
But I accept that's my fault.
Often
you can express in a couple of lines what requires dozens of lines in C++
and hundreds of lines in C. And those two lines are not obfuscated beyond
comprehension, but (when you know the language) are clear, simple and
elegant.

There are several reasons for this, but one of the most important ones is
that the language doesn't burden the programmer with things like memory
management.

and garbage collection has its own problems
 
Nick Keighley

Having to know the exact layout of the data used in the program is a
really niche use case. Hardly something relevant to the average program.

besides it isn't true anyway. You have to know quite a bit about your
compiler to know how data is laid out. In principle it can change at
the drop of an optimisation flag
Regardless, since we are comparing C to C++, and the argument is that
C is "simpler" (and therefore many people prefer to use it), what makes
you think that it's not possible (or even harder) to know the exact
memory layout of the data in a C++ program?

If, for example, you need for some reason to define a struct in some
precise manner, there's nothing in C++ stopping you from doing that.
There's nothing that C can offer in this regard that's not possible
in C++.

(And before you argue "well, if you are restricting your C++ program
to what C already does, why use C++ at all?" the answer is: To make
the *rest* of the program, ie. the parts that do *not* need that kind
of low-level tinkering, much easier.)

yes I've seen programs with C down in the bowels doing lots of bit
banging (near signal processing) but this was all embedded in classes
that hid the nasty stuff from the rest of the program.
 
Nick Keighley

Almost all arguments from "category error" are themselves category
errors. So "mu". I've seen Oxford dons make massive piles of steaming
nonsense out of "category error".

"category error" appears to cover what I'd been calling "type error".
"you're comparing oranges with orchards!" "C++ is a superset of C
because it's implemented in C"
No, the thing about lisp is that it is *interpretive*.

no. Not since 1962. Common Lisp is required to be compiled. Stalin, an
implementation of Scheme (which is a Lisp), is one of the most
aggressive whole-program optimising compilers known.
C++ is much *worse* for memory management than is 'C' - in that
the possibility of memory leaks goes up.

use RAII.
Absent writing GUI code,
it's entirely possible to write entire deployable systems that use no
dynamic memory at all in 'C'. indeed, that's been the shop
standard most places I have worked.

you can do this in C++ as well
Not... really.

really. RAII isn't hard. My problem is large C++ programs written by
people who've never heard of it...
Let's not confuse our preferences for facts, shall we?
The subject is sufficiently complex that you have to figure out how to
measure the thing being discussed - you can't constructively assert
superiority with out doing a lot of work, and that's pretty boring
work. It's actually both boring and terrifying.

I find writing the exception handling stuff in C hard work. And so do
other people. Which is why they often don't write it. If I'm using
Win32 (yes, it's old, I know, sue me) I check the return of every
function call. Tedious.
To my eye, the solutions  there are uglier, and a choice like Tcl
or Python makes C++ less interesting. Because interpreters provide
a much better suite of "furniture".

not to this one. Or maybe I'm not experienced enough.
Because experienced programmer is experienced. The very basis for
determining whether the language features are improvements is
really complex and usually such discussions are simply people
exposing biases.


Just... wow. It's a *huge* problem. I'm not "into languages"
to be into languages, I am into them to be able to operate on teams
that produce deployable systems that work without causing anybody
any problems.

A novice can really get 95% of everything they need to know
about 'C' in a matter of months, while being productive
under supervision. I am not sure there is a collection
of ten people on the planet who , between them , know everything
there is to know about C++.

just learning C's syntax and a few library calls still leaves you with
plenty of pitfalls to fall into. People do actually manage to write
large complex systems in C++. And they work.
 
Nick Keighley

Not in the specific manner Stroustrup used it, but it's
perfectly easy to achieve the same goal. But RAII is
much less a concern when you don't have exceptions... although
I have seen people do things with setjmp()/longjmp() that
were very very clever - including hooking hardware exceptions and
hitting longjmp() with it.











Sorry, this really doesn't apply to any cases I've seen in a long
time that were not GUI programs - and I believe I stipulated to
"use something besides C++ for GUIs, please" upthread.

'Course, I'm the sort who would use a state machine to manage an
overly onerous memory allocation problem - and in C++ -  so...

in the systems I've used there are all sorts of things that are
dynamically allocated: there are "events" (internal messages), TCP/IP
packets, timers and all sorts of application-specific thingies (calls,
targets, alarms, equipment allocations, maps etc. etc.). It's hard to see
how you can avoid dynamically allocating things except in the most
hard real time systems.
 
Nick Keighley

Not in the specific manner Stroustrup used it, but it's
perfectly easy to achieve the same goal. But RAII is
much less a concern when you don't have exceptions... although
I have seen people do things with setjmp()/longjmp() that
were very very clever - including hooking hardware exceptions and
hitting longjmp() with it.











Sorry, this really doesn't apply to any cases I've seen in a long
time that were not GUI programs - and I believe I stipulated to
"use something besides C++ for GUIs, please" upthread.

'Course, I'm the sort who would use a state machine to manage an
overly onerous memory allocation problem - and in C++ -  so...

oh yes- and all the non-memory stuff. Files, database handles, pens
yadda yadda
 
Nick Keighley

do you assume memory is allocated in a stack-like manner? Because that
simply wouldn't work on the systems I'm thinking about. Mobile radio
calls don't appear and disappear in a stack-like manner.
Ach! Burry McDonald is nae true Scotsman :)

I still suspect that some of that contains distinctions  without
differences. But I like the working definition you gave - "mapping
construction/deconstruction to object lifetimes." That's
clearer than what I'd seen before.

maybe I'll get the privilege of actually *learning* RAII ( through
doing a project with it ) rather than  reading about it. Otherwise,
it's like pronouncing a word you've only read...



Well, you have to *cheat* :) ( Ever use "atexit()"?).

way too late in most of the programs I've encountered. These systems
are supposed to run "forever". Cleaning up all the crap on exit is no
good in a program that isn't supposed to exit!
 
Nick Keighley

That last question is really odd at the end of the paragraph that's
otherwise talking about something else entirely. Care to explain?

I wondered how good a C++ programmer he was. I see a fair amount of
C++ code that news multiple items and deletes them in the destructor.
That defines non-trivial destructors but not assignment or copy
construction. Code that sprinkles try-catch about seemingly at random.
Writing *good* C++ isn't hard but it requires some basic knowledge.
 
ImpalerCore

Just curious, do you have any links to support this? Exception safe code
needs to clean up all resources anyway, not only memory. My google-fu
only produced wordings containing the general term "resources", including
Stroustrup himself
(http://www.velocityreviews.com/forums/t688168-who-invented-deterministic-construction-destruction.html).

No, just my personal exposure to it in the context of my old C++
college courses, which was explained for the purpose of deallocating
memory. I'm sure Bjarne designed it for files and other more exotic
constructs.
BTW, I just noticed a gcc extension doing something almost exactly like
this (a "cleanup" attribute):http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization#G...
xtensions_for_C

Sure, if you restrict your code to a compiler toolchain that supports
that extension. I have a habit of avoiding 'gcc-isms' or other
compiler-isms for library development when I can.
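
For readers who have not seen that extension, here is a minimal sketch of the "cleanup" variable attribute, assuming GCC or Clang (it is not ISO C). The names `free_buffer`, `copy_length` and `buffers_released` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts how often the cleanup handler ran; for demonstration only. */
static int buffers_released = 0;

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void free_buffer( char** p )
{
    free( *p );
    *p = NULL;
    ++buffers_released;
}

/* Copy s into a temporary heap buffer and return the copy's length,
   or -1 if the allocation fails. */
int copy_length( const char* s )
{
    /* GCC/Clang extension: free_buffer( &copy ) runs automatically when
       'copy' goes out of scope, on every return path. */
    __attribute__((cleanup(free_buffer))) char* copy = NULL;

    assert( s != NULL );

    copy = malloc( strlen( s ) + 1 );
    if ( copy == NULL ) {
        return -1;               /* handler still runs; free(NULL) is a no-op */
    }

    strcpy( copy, s );
    return (int)strlen( copy );  /* value is computed before the handler runs */
}
```

This ties the release to scope exit much like a destructor would, at the price of portability to compilers without the attribute.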
The difference is that this mechanism is a "fixed" stack allocation.
One issue is that alloca is not standard, and C99 VLAs are not available
in C90 environments, which limits the environments where they can be
applied.  While the 'region' technique has the same recursion issues
that alloca and C99 VLAs have, the stack size reserved to an algorithm
that needs to allocate memory is fixed.  That means that passing big
input doesn't result in a big alloca or VLA resize that blows up the
stack.

[snipped lengthy example code of fixed-pool-size stack allocator scheme]
The key point to take away from this example is that the size of the
stack used is independent of the size of s1 and s2.  While alloca and
VLAs are convenient, they lend themselves to a programming style where
it's easy to blow the stack in the presence of a big string.

OK, I can see now how this scheme can be indeed useful for more reliable
stack-based allocations. In this sense it reminds me of the short string
optimization techniques used by some C++ implementations. There, if the
string is shorter than a given threshold, it is stored directly inside
the string object (which is often a local variable on stack). Only larger
strings require a dynamic memory allocation. This mechanism is fully
encapsulated of course, as is customary in C++, so that the class users
do not have to know or care about such implementation details.

Agreed, but I also think the "RAII for memory" is also encapsulated in
'c_levenshtein', unless I misunderstand what you're saying by
"encapsulation". By that I mean that the c_levenshtein just takes two
strings; there's no c_region being passed as a parameter and the user
can call it whether 'c_levenshtein' used the region or just plain
malloc.

Best regards,
John D.
 
Dombo

Op 27-Oct-12 13:21, ImpalerCore schreef:
Originally RAII was designed to handle memory cleanup in the presence
of exceptions, as it's pretty much a necessity to write exception safe
code in C++.

RAII was already supported (and useful) before C++ supported exceptions.
Exceptions made RAII in C++ more or less a necessity. There is nothing
about RAII that makes it specific for releasing memory.
 
Les Cargill

Nick said:
do you assume memory is allocated in a stack-like manner? Because that
simply wouldn't work on the systems I'm thinking about. Mobile radio
calls don't appear and disappear in a stack-like manner.

No, I don't assume that. Dynamic circuit creation/deletion usually
means you have state per circuit. It's possible to have a "global"
list of this state with instrumentation against that list
to track whether or not widowed circuits exist.

RAII would also solve this in a less "global" manner.
way too late in most of the programs I've encountered. These systems
are supposed to run "forever". Cleaning up all the crap on exit is no
good in a program that isn't supposed to exit!

I am using "atexit()" metaphorically. There exist frames of context,
and at the bottom of each frame ( when the frame goes away ), it
cleans up all the things it allocated.

This is still not "stack" stuff, because it's not LIFO.
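
One possible reading of that "frame of context" idea, sketched under assumptions: each frame keeps a list of everything it handed out, individual blocks may be released in any order, and ending the frame frees whatever is left. The names `frame`, `frame_begin`, `frame_alloc`, `frame_release`, `frame_end` and `frame_demo` are hypothetical, and `max_align_t` assumes C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every block a frame hands out. */
union frame_header
{
    struct { union frame_header* prev; union frame_header* next; } link;
    max_align_t align;  /* keeps the payload suitably aligned (C11) */
};

struct frame
{
    union frame_header head;  /* sentinel of a circular doubly linked list */
    size_t live;              /* blocks currently owned by the frame */
};

void frame_begin( struct frame* f )
{
    f->head.link.prev = &f->head;
    f->head.link.next = &f->head;
    f->live = 0;
}

void* frame_alloc( struct frame* f, size_t size )
{
    union frame_header* h;

    if ( size > SIZE_MAX - sizeof *h ) {
        return NULL;
    }
    h = malloc( sizeof *h + size );
    if ( h == NULL ) {
        return NULL;
    }

    /* Link the new block in right after the sentinel. */
    h->link.prev = &f->head;
    h->link.next = f->head.link.next;
    f->head.link.next->link.prev = h;
    f->head.link.next = h;
    ++f->live;
    return h + 1;
}

/* Release one block early, in any order (not LIFO). */
void frame_release( struct frame* f, void* p )
{
    union frame_header* h;

    if ( p == NULL ) {
        return;
    }
    h = (union frame_header*)p - 1;
    h->link.prev->link.next = h->link.next;
    h->link.next->link.prev = h->link.prev;
    --f->live;
    free( h );
}

/* The "bottom of the frame": free everything still owned. */
void frame_end( struct frame* f )
{
    union frame_header* h = f->head.link.next;

    while ( h != &f->head ) {
        union frame_header* next = h->link.next;
        free( h );
        h = next;
    }
    frame_begin( f );
}

/* Returns how many blocks were still live just before the frame ended. */
size_t frame_demo( void )
{
    struct frame f;
    void *a, *b, *c;
    size_t live_before_end;

    frame_begin( &f );
    a = frame_alloc( &f, 16 );
    b = frame_alloc( &f, 32 );
    c = frame_alloc( &f, 64 );
    assert( a != NULL && b != NULL && c != NULL );

    frame_release( &f, a );    /* oldest first: not stack order */
    live_before_end = f.live;  /* b and c are still owned */

    frame_end( &f );           /* cleans up b and c */
    assert( f.live == 0 );
    return live_before_end;
}
```

Unlike RAII, nothing forces the call to `frame_end`; the programmer still has to place it on every exit path.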
 
ImpalerCore

That's true, but the usability of this encapsulation is on a different
level.

Okay, can you enumerate what "levels of encapsulation" you associate
with std::string and c_levenshtein? Are you saying 'class' is a higher
level of encapsulation than 'function'?
To illustrate this, I have never needed to call c_levenshtein() (or
something equivalent) in my code

Until you have the need to implement some kind of fuzzy string
matching, which when used to match user input against some kind of
dictionary, implies the potential for a lot of comparisons.
, but I am making use of std::string every
day (as well as my home-grown variant class which is also using small-
string optimization).

Are you trying to point out that 'std::string (high) > char* (low)'?
Encapsulation means I can build my own abstractions, and then build other
stuff on top of them, ad infinitum. Using abstractions means the upper
level code is not concerned with lower-level details. In contrast, your
c_levenshtein() function contains lower-level code (like setting up
pm_workspace) which has absolutely nothing to do with the actual purpose of
this function. I understand this is kind of inevitable in C.

I'm a bit confused. Evaluating the levenshtein distance in the
classical method requires a matrix that you have to get memory for
from somewhere. Even if you have 'int c_levenshtein( const string&
s1, const string& s2 )' or some levenshtein member function, you still
have to provide memory for the matrix; it's not innate to 's1' and
's2'. I agree that 'pm_workspace' is a kind of small string
optimization for malloc that is not directly related to the algorithm,
but you still need to get the memory from somewhere. Can you
pseudocode your own version of levenshtein using the std::string
framework, so I can better understand where you're getting the memory
for the matrix from, and classify what parts of the function are
"high" and "low" level?

If you're basically saying C++ > C for encapsulation, I agree with
you. However, one can still build a C interface to a resizing string
to have a kind of std::string equivalent in functionality, but the
code won't look as pretty, especially to someone accustomed to class
based design. But that doesn't mean that it can't be done, and that
it wouldn't be useful for someone using C.

Best regards,
John D.
 
Les Cargill

Nick said:
in the systems I've used there are all sorts of things that are
dynamically allocated: there are "events" (internal messages), TCP/IP
packets, timers and all sorts of application-specific thingies (calls,
targets, alarms, equipment allocations, maps etc. etc.). It's hard to see
how you can avoid dynamically allocating things except in the most
hard real time systems.

TCP/IP is inherently dynamic, but it's a well trod service.

The rest? You simply have an "auction" with yourself - "I will
allow no more than 50 simultaneous events, 20 targets, 100
mappings", then you get the worst-case numbers and test it.
This includes developing ... generators that prove
all the constraints and will make you a script which you
can use to convince yourself you've exhaustively tested it.
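
A minimal sketch of such a fixed budget, assuming a hypothetical limit of 50 events: all storage is a static array, acquiring beyond the budget fails cleanly, and a high-water mark records the worst case seen. The names `event_pool`, `event_acquire`, `event_release` and `event_pool_worst_case_ok` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 50  /* hypothetical worst-case budget */

struct event
{
    int id;
    int payload;
    struct event* next_free;  /* used only while the slot is on the free list */
};

/* All storage is static: no malloc anywhere. */
static struct event event_pool[MAX_EVENTS];
static struct event* free_list = NULL;
static size_t events_in_use = 0;
static size_t events_high_water = 0;

void event_pool_init( void )
{
    size_t i;

    free_list = NULL;
    for ( i = MAX_EVENTS; i > 0; --i ) {
        event_pool[i - 1].next_free = free_list;
        free_list = &event_pool[i - 1];
    }
    events_in_use = 0;
    events_high_water = 0;
}

/* Returns NULL when the budget is exhausted. */
struct event* event_acquire( void )
{
    struct event* e = free_list;

    if ( e == NULL ) {
        return NULL;
    }
    free_list = e->next_free;
    e->next_free = NULL;
    if ( ++events_in_use > events_high_water ) {
        events_high_water = events_in_use;
    }
    return e;
}

void event_release( struct event* e )
{
    assert( e != NULL );
    assert( events_in_use > 0 );
    e->next_free = free_list;
    free_list = e;
    --events_in_use;
}

/* Worst-case check: fill the pool, confirm one more request fails,
   then release in a non-LIFO order. Returns 1 on success, 0 otherwise. */
int event_pool_worst_case_ok( void )
{
    struct event* held[MAX_EVENTS];
    size_t i;

    event_pool_init();
    for ( i = 0; i < MAX_EVENTS; ++i ) {
        held[i] = event_acquire();
        if ( held[i] == NULL ) {
            return 0;
        }
    }
    if ( event_acquire() != NULL ) {
        return 0;  /* request number 51 must be refused */
    }
    for ( i = 0; i < MAX_EVENTS; i += 2 ) {
        event_release( held[i] );
    }
    for ( i = 1; i < MAX_EVENTS; i += 2 ) {
        event_release( held[i] );
    }
    return events_in_use == 0 && events_high_water == MAX_EVENTS;
}
```

The high-water mark is the kind of instrumentation that lets a test run show the budget was actually reached and survived.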

State machines seem to be relegated to FPGA development
and the hardest of hard real-time but IMO, they're especially
useful in applications development.
 
