Adapting software to multiple usage patterns

G

gwowen

What is the exact equivalent using your syntax?

Well, I first introduced this example, and I did so merely to note
that Jacob's container forced a relatively ugly syntax.

The explicit loop C++ version would be something like this,
for a vector:

double mean = 0.0;
const size_t sz = vec.size();
for(size_t idx = 0; idx < sz; ++idz){
mean += vec[idx];
}
mean = mean / vec.size();

For an arbitrary container (substitute your container for list)

double mean = 0.0;
list<double>::iterator cntr, end = cntr.end();
for(it = cntr.begin(); it != end; ++it){
mean += *it;
}
mean = mean / vec.size();
 
F

Flash Gordon

Nick said:
But you shouldn't have exposed your choice of data structure to the
application
level code anyway. It all should have been hidden away behind an
abstraction
anyway. The application wants to add things, remove things and find
things
it shouldn't care if it's an array, a linked list or a relational
database.

It depends. In some cases I'm thinking of we really do hit 100% CPU
utilisation on fairly meaty servers, adding an extra layer of
indirection for frequently accessed objects would probably impact on
performance of the code quite significantly (an extra indirection every
few lines of C code). Because of the frequency of access, there is no
chance of them every being stored in a database (there would be no
chance of getting required performance), by nature of design they are
loaded once at application start up, and then frequently accessed in a
random pattern (some lines of code access specific elements, others loop
over blocks of elements, others select elements based on user entered
data meaning we want the nth element. To me this screams out for using
malloc/realloc on loading the data in to contiguous blocks and then
accessing the required elements using array/pointer access, never tree,
list or database access.
Changeing the container type is not expensive becasue it involves
rewriting
one thin library.

Well, yes, where you are doing certain types of things you do wrap it in
an appropriate library, and then only need to change that one small library.
You may not know how big the system is going to get.

If your access pattern is that you regularly expect to access the nth
element, where n is a number which can be determined without iterating
over the elements, then I would say that a list is definitely the wrong
structure however small or large your data set is.
Or you may
be concentrating on getting other things right first. If things
are being addded and removed semi-randomly than a balanced tree
may be the right thing but an array ot list might be ok for a first
hack.

If you know what you are doing is a temporary hack then you will
implement it in a way that you can strip it out and replace it.

I've seen an application scale from a handful of thingies (an
important
measure of the system's load) to thousands of thingies. Admittedly
this
was known from the start.

In which case the expected usage was a known range, and I'm sure it was
designed with that in mind.
I think Jacob wants the flexibility in his library. Note even with C++
which has a set of generic containers I would still say hide the
choice
of container from the application code.

As I say, Ive never needed to do such drastic change in data structure.
Whenever I've used a list (single linked, double linked, or something
more strange and complex) that has always need what it has needed to
stay as. I've occasionally had to move something from being an array to
being an allocated block, but with the magic of C that just changes
things at the points of creation and destruction, but the code accessing
it is unchanged.

flexible arrays can be nice. I'd hide the malloc() in a library.

Well, I do actually have my malloc calls wrapped, but that is wrapped in
something that provides the application code with the same interface.
o Software specialization and subclassing (This was the original theme I
wanted to discuss with "Adapting software to multiple usage patterns")
I can't say I've a need for that, not in the was you are doing it. I do
have stuff with indirection through function pointers, but that is not
working in the same [way] as your interface.

I've done similar-ish things but not with containers. It doesn't feel
right.
The C++ invented a whole major language construct (templates) to avoid
using inheritance (which is what your function pointers look like) to
implement containers.

<snip>

Actually I was thinking of a library where the application code calls
explicit functions, and a fair way down the call graph in the library it
goes through function pointers to database specific functions so we can
select the database type at run time. However, that is deeply embedded
in the library and no where near the level of application code.
 
S

Seebs

But you shouldn't have exposed your choice of data structure to the
application
level code anyway. It all should have been hidden away behind an
abstraction
anyway. The application wants to add things, remove things and find
things
it shouldn't care if it's an array, a linked list or a relational
database.
Changeing the container type is not expensive becasue it involves
rewriting
one thin library.

I don't agree with this necessarily. Data structures and algorithms are
closely tied. In some cases, knowing about the data structure's performance
characteristics is important in deciding how to use a data structure.

I have a bunch of things to add. Does it matter whether I try to add them
to the head or the tail of the list? Depending on the data structure, the
answer is a definite "maybe".
You may not know how big the system is going to get. Or you may
be concentrating on getting other things right first. If things
are being addded and removed semi-randomly than a balanced tree
may be the right thing but an array ot list might be ok for a first
hack.

Sure. But there is a real chance that, once you know what you want to do,
your code will want to be adapted to the performance characteristics of
the suitable data structure, not just using it blindly.
Which a well-designed program confines to a small part of the code.
You should only be caught by this once (at worst) because after that
you
will have Refactored Mercilessly.

Unless you've profiled, and this data library is consuming 15% of your CPU
time, and tweaking for a specific implementation can reduce that to 5%...

-s
 
S

Seebs

really? I thought C++ demonstrated that you don't need inheritance for
a decent set of containers.

Well, the minimal case, as I see it:

Container <-- general operations
Array <-- array-friendly stuff, plus it's a container
List <-- list-friendly stuff, plus it's a container

Sounds inheritancy to me.
...which is poorly modelled by inheritance
Perhaps.

which is not how C++ did it...

C++ isn't really an OO language.
perhaps you could do a template thingy without inheritance. It's a
common
interface you need rather than inheritance (just because C++ does
interfaces
via inheritance doesn't mean you have to do it that way)

True.

But you need some kind of way to specify the "one of these is also one
of those" type relations, so that code can take a container whether it's
a list or an array, and that starts getting suspiciously similar to the
territory of inheritance and such.

-s
 
J

jacob navia

Seebs a écrit :
... you need some kind of way to specify the "one of these is also one
of those" type relations, so that code can take a container whether it's
a list or an array, and that starts getting suspiciously similar to the
territory of inheritance and such.


If you use

container->lpVtbl->GetElement(container,index);

you can use it with lists, arraylists (flexible arrays) and all
*sequential* containers.

I would make the distinction at the top level between sequential
containers (lists, arrays, arraylists, and maybe others) and
non sequential ones like hash tables for instance, where the idea of
"the item 42" doesn't make any sense.

I would call this just grouping, not inheritance since there
are shared characteristics but not a class hierarchy.

Trees and graphs would be another type of containers, of course
completely different (and with a complete different access methods)
than sequential containers.
 
J

jacob navia

Flash Gordon a écrit :
Actually I was thinking of a library where the application code calls
explicit functions, and a fair way down the call graph in the library it
goes through function pointers to database specific functions so we can
select the database type at run time. However, that is deeply embedded
in the library and no where near the level of application code.

I thought about that by using a generic creation function for each container
that would receive a set of flags specifying the size of the container
(huge, big, normal, small tiny) and the creation function would choose
according to the flags which vtable to set into the new list.

The interface of all those function would still be exactly the same, the
algorithms would be quite different of course.

For big lists we would use a heap manager, for normal lists malloc at
each element and for tiny lists we would allocate all the elements at once

That is what I proposed with the theme of this thread:

"Adapting software to multiple usage patterns"
 
F

Flash Gordon

jacob said:
Seebs a écrit :

If you use

container->lpVtbl->GetElement(container,index);

you can use it with lists, arraylists (flexible arrays) and all
*sequential* containers.

I would make the distinction at the top level between sequential
containers (lists, arrays, arraylists, and maybe others) and
non sequential ones like hash tables for instance, where the idea of
"the item 42" doesn't make any sense.

Ive wanted to iterate over all the elements of a hash in a language
which has hashes natively, whilst at other points using it as a hash. If
you can iterate over it, then there is a concept of "the item 42" (even
if the order is not obvious, as long as it is consistent on any instance
of a hash whilst it exists between modifications). The same can apply to
trees (where there is a more obvious ordering).
I would call this just grouping, not inheritance since there
are shared characteristics but not a class hierarchy.

The container concept can be thought of as the base class with the lists
implementation being derived from it, even if that is not how you have
implemented it.
Trees and graphs would be another type of containers, of course
completely different (and with a complete different access methods)
than sequential containers.

It's not unusual to want to want to do a tree traversal where you are
iterating over all the elements of the tree.
 
F

Flash Gordon

jacob said:
Seebs a écrit :

If you use

container->lpVtbl->GetElement(container,index);

you can use it with lists, arraylists (flexible arrays) and all
*sequential* containers.

I would make the distinction at the top level between sequential
containers (lists, arrays, arraylists, and maybe others) and
non sequential ones like hash tables for instance, where the idea of
"the item 42" doesn't make any sense.

Ive wanted to iterate over all the elements of a hash in a language
which has hashes natively, whilst at other points using it as a hash. If
you can iterate over it, then there is a concept of "the item 42" (even
if the order is not obvious, as long as it is consistent on any instance
of a hash whilst it exists between modifications). The same can apply to
trees (where there is a more obvious ordering).
I would call this just grouping, not inheritance since there
are shared characteristics but not a class hierarchy.

The container concept can be thought of as the base class with the lists
implementation being derived from it, even if that is not how you have
implemented it.
Trees and graphs would be another type of containers, of course
completely different (and with a complete different access methods)
than sequential containers.

It's not unusual to want to want to do a tree traversal where you are
iterating over all the elements of the tree.
 
S

Seebs

If you use
container->lpVtbl->GetElement(container,index);

Which I wouldn't. If I were implementing this, I'd make it something
like:

static inline ctr_getelement(container *x, int index) {
x->vtable->getelement(x, index);
}

and write
ctr_getelement(x, index);

.... Because C does not use camel case.
you can use it with lists, arraylists (flexible arrays) and all
*sequential* containers.

And what iteration model does it use? Is it more appropriate to
write
ctr_getlement(x, index);
or
ctr_next(x, ... what goes here?);
or..

You see the issue. To write reasonable code, I actually usually want to
know how to iterate.

Which is why my list library had an "iterate" operation.

-s
 
S

Seebs

Ive wanted to iterate over all the elements of a hash in a language
which has hashes natively, whilst at other points using it as a hash. If
you can iterate over it, then there is a concept of "the item 42" (even
if the order is not obvious, as long as it is consistent on any instance
of a hash whilst it exists between modifications). The same can apply to
trees (where there is a more obvious ordering).

In ruby/perl, it's definitely possible for order to vary with modifications,
but I'm not even sure it's impossible for order to vary without
externally-visible modifications.

Imagine a hypothetical implementation where the hash may, under some
non-obvious circumstances, choose to restructure itself internally. (Say,
some kind of "please clean things up if you can" message coming from a
garbage collector causing it to rebuild itself in some way.)
The container concept can be thought of as the base class with the lists
implementation being derived from it, even if that is not how you have
implemented it.

This, I agree with.
It's not unusual to want to want to do a tree traversal where you are
iterating over all the elements of the tree.

Yes.

Basically, any container-like thing can reasonably have an iterator,
but the behavior might vary widely.

-s
 
F

Flash Gordon

jacob said:
Flash Gordon a écrit :

I thought about that by using a generic creation function for each
container
that would receive a set of flags specifying the size of the container
(huge, big, normal, small tiny) and the creation function would choose
according to the flags which vtable to set into the new list.

That's one way to do it.
The interface of all those function would still be exactly the same, the
algorithms would be quite different of course.

For big lists we would use a heap manager, for normal lists malloc at
each element and for tiny lists we would allocate all the elements at once

That is what I proposed with the theme of this thread:

"Adapting software to multiple usage patterns"

However, it does not map to the usage pattern of, I need this to run as
fast as possible because it is used a vast amount and my heavy duty
server will hit 100% CPU usage and take a while even if it is fast. That
is the usage pattern I'm hitting these days.
 
F

Flash Gordon

Seebs said:
In ruby/perl, it's definitely possible for order to vary with modifications,
but I'm not even sure it's impossible for order to vary without
externally-visible modifications.

I don't know ruby, but it is possible to iterate over all elements of a
hash in Perl. As you say, the order can certainly change when the hash
table is modified.
Imagine a hypothetical implementation where the hash may, under some
non-obvious circumstances, choose to restructure itself internally. (Say,
some kind of "please clean things up if you can" message coming from a
garbage collector causing it to rebuild itself in some way.)

Possible, but I don't think it happens in Perl.

Yes.

Basically, any container-like thing can reasonably have an iterator,
but the behavior might vary widely.

Yes indeed. One common(ish) reason being to serialise it, although that
is not the only reason by any means.
 
T

Tim Rentsch

Richard Heathfield said:
jacob navia wrote: said:
For instance this proposal about a container library
is an evolution that doesn't add any new complexity to the language.

That depends on whether you consider the library to be a part of the
language. Some do, others don't. [... snip rest...]

Jacob's usage is consistent with how the Standard identifies
these different areas:

...

6. Language

...

7. Library
 
T

Tim Rentsch

Richard Heathfield said:
I guess I'm a lot less fussy about what I am prepared to accept as a
formal system.

If you mean you're using a term in a way that almost no
one else uses it, I agree with you.
 
T

Tim Rentsch

Seebs said:
Of course it does.

The library is part of "the language".

I feel obliged to point out that the Standard itself makes the same
distinction that Jacob does here. Section 6 is Language. Section 7
is Library.
 
P

Phil Carmody

Seebs said:
Which I wouldn't. If I were implementing this, I'd make it something
like:

static inline ctr_getelement(container *x, int index) {
x->vtable->getelement(x, index);
}

and write
ctr_getelement(x, index);

... Because C does not use camel case.

C doesn't /use/ any case. It copes with camel case precisely
as well as it copes with lower case.

Phil
 
F

Flash Gordon

Phil said:
C doesn't /use/ any case. It copes with camel case precisely
as well as it copes with lower case.

Ah, but the C library, as defined by the language standard, does not use
camel case, and this would almost certainly be what Seebs was referring to.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,981
Messages
2,570,188
Members
46,732
Latest member
ArronPalin

Latest Threads

Top