maximum cases in a switch ?

M

Marcel Müller

Juha said:
Since he's clearly aiming for maximum speed, the switch block is
probably the most efficient solution because most compilers will create
a jump table, which means executing the proper case block will be done
in O(1) (rather than in O(log n) as with the binary search).

(Now, of course it's not guaranteed that all compilers will be so smart
as to generate a jump table from a large switch block, but AFAIK most
modern compilers do.)

Well, if and only if the values are in a sufficiently small range.
If not, the compiler will use binary or linear search.
Of course an alternative solution would be to create the jump table
explicitly yourself (basically, an array of function pointers which you
index with the value). This would allow a better rearrangment of the
code into their own functions, but I don't know if it would result in
a more, less or equally efficient program. (This solution, however, has
the disadvantage of making it more complicated for the "case blocks", in
this case the separate functions, to handle shared local variables. Passing
them as parameters to the functions might decrease efficiency.)

As long as the case blocks do nothing but returning a constant (i.e. the
address of a global object) the functions are entirely superfluous as
well as the case code too. Storing the addresses in the lookup table is
sufficient. And if you do not need the symbol names of the
DataDescriptors elsewhere the entire Objects can be stored in the lookup
table, saving another indirection.

If he has really need for speed, he should entirely eliminate the the
integer IDs of the DataDescriptors and use the address of the
DataDescriptors instead. If they are multitons this is equivalent. Of
course, if there is a transposed data model in the backend with the IDs
as keys this is not an option. But transposed data models are not fast.


Marcel
 
J

Juha Nieminen

Victor Bazarov said:
Not if you *assume* those functions are inlined, like "most modern
compilers do"...

There's only one explicit "function call", and several functions. The
call is not direct, it's an indirect call using a (variable) function
pointer. You can't inline several functions to one function call.
 
J

Joe Greer

On 8/9/2011 11:27 AM, LR wrote:
Lynn McGuire wrote:
Is there a maximum number of cases in a switch ?
I have a class with a method that now has 525
cases in a switch and will probably double this
over the next couple of years.

Thanks,
Lynn

According to n3242,

"Case labels for a switch statement (excluding those for any nested
switch statements) [16 384]."

A short int, cool ! Gives me plenty of room
for additions.

Thanks,
Lynn

Have you considered just creating an array of your return values and
using your symbol as it's index? It would seem to be quicker and
more straight forward than trying to have a huge switch statement.
For example:

static DataDescriptor * myDataDescriptors[] =
{
&aDD_GEN_CALVAPPRE,
&aDD_GEN_CALSONVEL,
&aDD_GEN_ONLYWRITESTREAM,
.
.
.
}
;

And this would only work if the values on which you index the array
('GEN_CALVAPPRE', 'GEN_CALSONVEL', etc) are actually ordinal numbers,
and their order does not change (not that I'm suggesting that it will
do so often, but still). Otherwise, it's a lookup table, which,
granted, needs to only be initialized once, and then used all the
time, and it will only be faster if it's a dense array. As soon as
it's a sparse array, you either waste memory, or you need to use
std::map (or 'unordered_map') and the time for looking values up is
not necessarily better...
DataDescriptor * GenGroup::descriptor(int aSymbol, int version)
{
// do some validating of aSymbol here, can possibly add or
subtract
an offset etc.
return myDataDescriptors[aSymbol];
}

joe

V

Totally agreed, but it was never mentioned one way or the other how these
ids were formed. If they are just an enum, then lookup is the way to go
IMO and I think you should always at least consider it before coding a
switch statement. Not that I think there is anything wrong with a switch
statement, but it is fundamentally a flow of control sort of statement,
not a syntactically heavy lookup mechanism.

joe
 
L

Lynn McGuire

Is there a maximum number of cases in a switch ?
I have a class with a method that now has 525
cases in a switch and will probably double this
over the next couple of years.

Thanks,
Lynn

According to n3242,

"Case labels for a switch statement (excluding those for any nested
switch statements) [16 384]."

A short int, cool ! Gives me plenty of room
for additions.

Thanks,
Lynn



Have you considered just creating an array of your return values and
using your symbol as it's index? It would seem to be quicker and
more straight forward than trying to have a huge switch statement.
For example:

static DataDescriptor * myDataDescriptors[] =
{
&aDD_GEN_CALVAPPRE,
&aDD_GEN_CALSONVEL,
&aDD_GEN_ONLYWRITESTREAM,
.
.
.
}
;

And this would only work if the values on which you index the array
('GEN_CALVAPPRE', 'GEN_CALSONVEL', etc) are actually ordinal numbers,
and their order does not change (not that I'm suggesting that it will
do so often, but still). Otherwise, it's a lookup table, which,
granted, needs to only be initialized once, and then used all the
time, and it will only be faster if it's a dense array. As soon as
it's a sparse array, you either waste memory, or you need to use
std::map (or 'unordered_map') and the time for looking values up is
not necessarily better...
DataDescriptor * GenGroup::descriptor(int aSymbol, int version)
{
// do some validating of aSymbol here, can possibly add or
subtract
an offset etc.
return myDataDescriptors[aSymbol];
}

joe

V

Totally agreed, but it was never mentioned one way or the other how these
ids were formed. If they are just an enum, then lookup is the way to go
IMO and I think you should always at least consider it before coding a
switch statement. Not that I think there is anything wrong with a switch
statement, but it is fundamentally a flow of control sort of statement,
not a syntactically heavy lookup mechanism.

joe

/* GenGroup - base group */
#define GEN_COM 0
#define GEN_COMNAM 1
#define GEN_COMFOR 2
....
#define GEN_CALVAPPRE 205

Lynn
 
A

Alain Ketterlin

Lynn McGuire said:
static DataDescriptor * myDataDescriptors[] =
{
&aDD_GEN_CALVAPPRE,
&aDD_GEN_CALSONVEL,
&aDD_GEN_ONLYWRITESTREAM,
DataDescriptor * GenGroup::descriptor(int aSymbol, int version)
{
// do some validating of aSymbol here, can possibly add or
subtract
an offset etc.
return myDataDescriptors[aSymbol];
}
/* GenGroup - base group */
#define GEN_COM 0
#define GEN_COMNAM 1
#define GEN_COMFOR 2
...
#define GEN_CALVAPPRE 205

Sorry, but why don't you just use the pointers to the descriptors insted
of your made up numbers? And maybe put a (unique) id inside each object
if your really need one (handle uniqueness inside DataDescriptor's
ctor). Your whole switch would go away. IIRC there is nothing wrong in
comparing pointers, as long as they are from compatible types.

-- Alain.
 
L

Lynn McGuire

Lynn McGuire said:
static DataDescriptor * myDataDescriptors[] =
{
&aDD_GEN_CALVAPPRE,
&aDD_GEN_CALSONVEL,
&aDD_GEN_ONLYWRITESTREAM,
DataDescriptor * GenGroup::descriptor(int aSymbol, int version)
{
// do some validating of aSymbol here, can possibly add or
subtract
an offset etc.
return myDataDescriptors[aSymbol];
}
/* GenGroup - base group */
#define GEN_COM 0
#define GEN_COMNAM 1
#define GEN_COMFOR 2
...
#define GEN_CALVAPPRE 205

Sorry, but why don't you just use the pointers to the descriptors insted
of your made up numbers? And maybe put a (unique) id inside each object
if your really need one (handle uniqueness inside DataDescriptor's
ctor). Your whole switch would go away. IIRC there is nothing wrong in
comparing pointers, as long as they are from compatible types.

-- Alain.

Please dont assume that these indexes are used only
in this one place.

Lynn McGuire
 
A

Alain Ketterlin

Lynn McGuire said:
Sorry, but why don't you just use the pointers to the descriptors insted
of your made up numbers? [...]
Please dont assume that these indexes are used only
in this one place.

So what? Do the same at every place they're used. At least, adding a new
descriptor will happen at a single location in your code.

-- Alain.
 
L

Lynn McGuire

Sorry, but why don't you just use the pointers to the descriptors insted
of your made up numbers? [...]
Please dont assume that these indexes are used only
in this one place.

So what? Do the same at every place they're used. At least, adding a new
descriptor will happen at a single location in your code.

-- Alain.

And destroy the object hierarchical model that I carefully
built for data diversity, obscurity and speed ? I think
not.

Remember what Einstein said: "“Everything should be made
as simple as possible, but no simpler.” "

You have no idea of the amount of data that I am storing
and manipulating.

Lynn
 
A

Alain Ketterlin

Lynn McGuire said:
Sorry, but why don't you just use the pointers to the descriptors insted
of your made up numbers? [...]
Please dont assume that these indexes are used only
in this one place.

So what? Do the same at every place they're used. At least, adding a new
descriptor will happen at a single location in your code.
And destroy the object hierarchical model that I carefully
built for data diversity, obscurity and speed ? I think
not.

Oh sorry, that was impossible to guess. All I've seen is a incredibly
long list of #define (obscurity at its best) and a giant switch (an
almost perfect example of what "object hierarchical models" want to
avoid).

Here is my last cheap piece of advice: make it an enum, at least the
compiler will check that you don't miss one. You can even keep the
numbers if you like them.
Remember what Einstein said: "“Everything should be made
as simple as possible, but no simpler.†"

That's your problem. And please, keep your lessons. You've been asking
for help, not me.
You have no idea of the amount of data that I am storing
and manipulating.

Honestly, I don't care.

-- Alain.
 
J

Jorgen Grahn

With such an unusually large switch, I'd rather not assume anything.
The price for being wrong could be high, like Victor said.

I'd be interested to hear what the compiler generates in this case.

/Jorgen

DataDescriptor * GenGroup::descriptor (int aSymbol, int version)
{
0075D5B0 push ebp
0075D5B1 mov ebp,esp
0075D5B3 sub esp,8
0075D5B6 mov dword ptr [ebp-4],ecx
switch (aSymbol)
0075D5B9 mov eax,dword ptr [aSymbol]
0075D5BC mov dword ptr [ebp-8],eax
0075D5BF cmp dword ptr [ebp-8],20Ch
0075D5C6 ja $LN1+7 (75EA22h)
0075D5CC mov ecx,dword ptr [ebp-8]
0075D5CF jmp dword ptr (75EA2Ch)[ecx*4]
{
case GEN_CALVAPPRE:
return & aDD_GEN_CALVAPPRE;
0075D5D6 mov eax,offset aDD_GEN_CALVAPPRE (0DA0080h)
0075D5DB jmp $LN1+9 (75EA24h)
case GEN_CALSONVEL:
return & aDD_GEN_CALSONVEL;
0075D5E0 mov eax,offset aDD_GEN_CALSONVEL (0D88908h)
0075D5E5 jmp $LN1+9 (75EA24h)
case GEN_ONLYWRITESTREAMBOXONFIRSTSHEET:
return & aDD_GEN_ONLYWRITESTREAMBOXONFIRSTSHEET;
0075D5EA mov eax,offset aDD_GEN_ONLYWRITESTREAMBOXONFIRSTSHEET (0D9C648h)
0075D5EF jmp $LN1+9 (75EA24h)
...

And I guess here is where I'll have to admit that I don't know
assembly unless its MC68k ;-) Is this a jump table implementation?
It's not a plain if ... elseif ... elseif chain -- that much I can
see.

/Jorgen
 
N

Nick Keighley

You have your answer, I'm not going to repeat it.  Remember that it's
not normative, though.

isn't it normative in the C Standard? Does the C++ Standard take any
of (a particular version of) the C Standard unchanged (eg. the C
library)?
 
J

Joe Greer

Lynn McGuire said:
/* GenGroup - base group */
#define GEN_COM 0
#define GEN_COMNAM 1
#define GEN_COMFOR 2
...
#define GEN_CALVAPPRE 205

Lynn

Since these do seem to be mostly sequential, then an array lookup seems appropriate. The next
question then seems to be, does it make more sense to do what I first suggested (array of pointers)
or does it make more sense to just have an array of DataDescriptors and do away with the variables
aDD_GEN_CALVAPPRE etc entirely. e.g.


static DataDescriptor descriptors[] =
{
{ /* whatever is in a DataDescriptor */ },
{ /* whetever is in a DataDescriptor */ },
..
..
}

and just return the address of the element. i.e. return &descriptors[aSymbol];

Of course, that only works if DataDescriptor is POD, otherwise you need to construct objects in there
and do other object lifetime management.

joe
 
L

Lynn McGuire

You'll be amazed how many people assume they know better despite
that...

Yes and the thread drift made him/her assume that I was asking
for help in data storage. My app is working with several
thousand paid users with demands for more features. I would
be foolish to re-architect my app's data storage at this time
unless I had a very good reason (such as running out of switch
cases).

Lynn
 
L

Lynn McGuire

Lynn McGuire said:
/* GenGroup - base group */
#define GEN_COM 0
#define GEN_COMNAM 1
#define GEN_COMFOR 2
...
#define GEN_CALVAPPRE 205

Lynn

Since these do seem to be mostly sequential, then an array lookup seems appropriate. The next
question then seems to be, does it make more sense to do what I first suggested (array of pointers)
or does it make more sense to just have an array of DataDescriptors and do away with the variables
aDD_GEN_CALVAPPRE etc entirely. e.g.


static DataDescriptor descriptors[] =
{
{ /* whatever is in a DataDescriptor */ },
{ /* whetever is in a DataDescriptor */ },
.
.
}

and just return the address of the element. i.e. return&descriptors[aSymbol];

Of course, that only works if DataDescriptor is POD, otherwise you need to construct objects in there
and do other object lifetime management.

joe

What is POD ? The data descriptors live the lifetime of
the app and are built at compile time to save runtime.
"Performance is a Feature" applies to all software:
http://www.codinghorror.com/blog/2011/06/performance-is-a-feature.html

I gained 100X in speed when I converted my app from
smalltalk to C++ and do not intend to give it back over
some language semantics. Besides that, the data descriptors
are contained in almost 100 classes (this particular class
is the biggest at 525) and changing this code would be
un-needful.

Please note that I got my original question answered and
I really appreciate that. It means that my current data
storage descriptor design is extensible and that meets my
needs for now.

Thanks,
Lynn
 
J

Joe Greer

What is POD ? The data descriptors live the lifetime of
the app and are built at compile time to save runtime.
"Performance is a Feature" applies to all software:
http://www.codinghorror.com/blog/2011/06/performance-is-a-feature.h
tml

I gained 100X in speed when I converted my app from
smalltalk to C++ and do not intend to give it back over
some language semantics. Besides that, the data descriptors
are contained in almost 100 classes (this particular class
is the biggest at 525) and changing this code would be
un-needful.

Please note that I got my original question answered and
I really appreciate that. It means that my current data
storage descriptor design is extensible and that meets my
needs for now.

Thanks,
Lynn

POD = Plain Old Data i.e. a struct or class with no special behavior in
the case.

Only you know what your app needs to do. I am just exploring
possibilities that you can either use or reject as suits your needs.

joe
 
L

Lynn McGuire

POD = Plain Old Data i.e. a struct or class with no special behavior in
the case.

Only you know what your app needs to do. I am just exploring
possibilities that you can either use or reject as suits your needs.

joe

Thanks !

Lynn
 
V

Victor Bazarov

isn't it normative in the C Standard?

I doubt it. And I don't have the C Standard handy to check.

It shouldn't matter, should it? C++ language proper (not library) is
completely governed by the C++ Standard. C Standard has nothing to do
with C++ language and its limitations.
Does the C++ Standard take any
of (a particular version of) the C Standard unchanged (eg. the C
library)?

Yes, most of the library IIRC.

V
 
L

Leo Equinox Gaspard

Le 12/08/2011 22:22, Joe Greer a écrit :
POD = Plain Old Data i.e. a struct or class with no special behavior in
the case.

Only you know what your app needs to do. I am just exploring
possibilities that you can either use or reject as suits your needs.

joe

AFAIK a POD may have special behavior. The only restriction is to have only
POD types as member variables, with the only base-non-POD types being
pointers
(and maybe arrays, but I'm not sure).

<<< Errors in, see after >>>

Roughly, it is a type whose member variables are PODs too.

But it may contain member functions, or not.

Some examples of POD and non-POD types :

struct POD { int i; }; // Only POD members
struct NonPOD { int * p; }; // Pointer isn't POD

struct DerivedPOD : POD { char c; }; // Only POD members
struct DerivedNonPOD : NonPOD { char c; }; // Inherits pointer, so not POD
struct DerivedFromBothNonPOD : POD, NonPOD { char c; }; // Idem

struct ComposedPOD { POD p; long l; }; // Only POD members
struct ComposedNonPOD { NonPOD p; long l; }; // At least one non-POD member
struct ComposedFromBothNonPOD { NonPOD p1; POD p2; long l; }; // Idem

struct MemberFunctionPOD { int i; int plus_one() { return ++i; } };
struct MemberFunctionNonPOD { int * p; int value() { return *p; } };

struct NonTrivialDestructorPOD {
int i;
~NonTrivialDestructorPOD() { i = 0; }
};
struct NonTrivialDestructorNonPOD {
int * i;
~NonTrivialDestructorNonPOD() { delete i; }
};
struct NonTrivialDestructorNonPOD2 {
int * i;
~NonTrivialDestructorNonPOD2 { *i = 0; }
};

Am I wrong anywhere ? <<< Yes, see below >>>

BTW, I just checked the latest free paper about C++0x, and I saw this.

(9.10)
A POD struct is a class that is both a trivial class and a standard-layout
class, and has no non-static data members of type non-POD struct, non-POD
union (or arrays of such types). Similarly, a POD union is a union that is
both a trivial class and a standard layout class, and has no non-static
data members of type non-POD struct, non-POD union (or arrays of such
types). A POD class is a class that is either a POD struct or a POD union.

So, as we are speaking of POD classes, we need :
- trivial
- standard-layout
- no non-static data members that are non-POD

Now, trivial classes.

(9.6)
A trivial class is a class that has a trivial default constructor (12.1)
and is trivially copyable.
[Note: In particular, a trivially copyable class does not have virtual
functions or virtual base classes. - end note]

So, we need :
- trivial default constructor
- trivially copyable
- no virtual functions
- no virtual base classes
- standard-layout
- no non-static data members that are non-POD

Now, trivially copyable.

(9.6)
A trivially copyable class is a class that:
- has no non-trivial copy constructors (12.8),
- has no non-trivial move constructors (12.8),
- has no non-trivial copy assignment operators (13.5.3, 12.8),
- has no non-trivial move assignment operators (13.5.3, 12.8),
- has a trivial destructor (12.4).

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- standard-layout
- no non-static data members that are non-POD

Now, standard-layout.

(9.7)
A standard-layout class is a class that:
- has no non-static data members of type non-standard-layout class
(or arrays of such types) or reference,
- has no virtual functions (10.3) and no virtual base classes (10.1),
- has the same access control (Clause 11) for all non-static data
members,
- has no non-standard-layout base classes,
- either has no non-static data members in the most derived class and
at most one base class with non-static data members, or has no base
class with non-static data members, and
- has no base classes of the same type as the first non-static data
member.

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- no non-static data members that are not standard-layout
- no virtual functions, base classes
- only one access control for all non-static data members
- no non-standard-layout base classes
- only one class in the hierarchy that has non-static data members
- no base classes of the same type as the first non-static data member
- no non-static data members that are non-POD

Restrictions 3 and 5 are the same.
Restriction 4 is a subset of restriction 10.

So, we need :
- trivial default, copy and move constructors
- trivial copy and move assignment operators
- trivial destructor
- no virtual functions, base classes
- only one access control for all non-static data members
- no non-standard-layout base classes
- only one class in the hierarchy that has non-static data members
- no base classes of the same type as the first non-static data member
- no non-static data members that are non-POD

So, we can do anything we want with static members and member functions.

So I've mistaken on examples NonPOD (pointer is POD), DerivedPOD
(restriction 7), ComposedNonPOD (NonPOD is POD, funny isn't it ?),
ComposedFromBothNonPOD (same reason), MemberFunctionNonPOD (pointer is
POD), NonTrivialDestructorPOD (restriction 3).
And NonTrivialDestructorNonPOD and NonTrivialDestructorNonPOD2 weren't
non-POD for the reason I thought.

Now, am I mistaking anywhere ?

Cheers & hope this article won't be too long to read,
Leo
 
V

Victor Bazarov

[..]
Some examples of POD and non-POD types :

struct POD { int i; }; // Only POD members
struct NonPOD { int * p; }; // Pointer isn't POD

Really? No, REALLY?!! Any proof (quote from the Standard, a book or
some such)?
[..]

Now, am I mistaking anywhere ?

I don't know. I stopped reading after you called a pointer non-POD...

V
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,141
Messages
2,570,817
Members
47,364
Latest member
Stevanida

Latest Threads

Top