Object of Combination, an example of C++ templates programming

B

Bo Xu

Object of Combination
By Bo Xu

Introduction
A combination of n things, taken s at a time, often referred as an
s-combination out of n, is a way to select a subset of size s from a
given set of size n. There are n!/(s!(n-s)!) ways to do this. Donald
E. Knuth gives several methods (algorithms) to generate all the
s-combinations in [1]. In such procedure-oriented way, each
s-combination is processed while it's being generated. In some
situations, this "generate-all-at-one-time" procedure is not properly
applicable. Instead, an s-combination on demanded is needed. In this
article, s-combination is handled as object so it bears the merits of
object-oriented programming, namely, usability and reusability.
In this article, we are going to:
• Describe and implement three algorithms: increasing a given
s-combination to its next s-combination in their lexicographic order;
decreasing a given s-combination to its previous s-combination in
their lexicographic order; and getting the s-combination out of its
lexicographic number.
• Define and implement the class Comb. An s-combination is represented
by an object of this class. Comb is a finite number system with its
objects ordered by the lexicographic order of the under-lying
s-combinations. Operator ++, --, and == are defined and the capability
of getting an object of random s-combination is provided.
• Extend the class Comb to class template Comb<int S, int N> so that
the s-combination objects are typed with different s and n.
• Further generalize Comb<int S, int N> to Comb<int S, int N, typename
T>. T is any class of super set of built-in type "unsigned int". With
T being a huge integer type, we are able to handle combinations of
large sets.

We represent the set of n things as a set of integers ranging from 0
to n-1, or {0, 1, … , n-1}. An integer array v with size s holds an
s-combination of n integers, and satisfies:
0 <= v[0] < v[1] < … < v[s-1] < n
An equivalent binary string of size n can also be used for an
s-combination of n:
b0 b1 b2 b3 … bn-1
If bi is 1, integer i is selected otherwise it is 0, i is unselected.
There are s 1s in such a binary string.
All the s-combinations out of n can be arranged in lexicographic
order. Each combination has its lexicographic number from 0 to
n!/(s!(n-s)!) – 1. In the integer array representation, v[0] is the
least significant position and v[s-1] is the most significant one. The
combination with lowest lexicographic number 0 has its v[0] = 0, …,
v = i, …, v[s-1] = s – 1. The highest lexicographic number
n!/(s!(n-s)!) – 1 has v[0] = n – s , …, v = n – s + i , …, v[s-1] =
n – 1.
Here is the list of all combinations of s = 3, n = 6 in lexicographic
order:
Example 1
v[0] v[1] v[2] b0 b1 b2 b3 b4 b5 Lexicographic
0 1 2 1 1 1 0 0 0 0
0 1 3 1 1 0 1 0 0 1
0 2 3 1 0 1 1 0 0 2
1 2 3 0 1 1 1 0 0 3
0 1 4 1 1 0 0 1 0 4
0 2 4 1 0 1 0 1 0 5
1 2 4 0 1 1 0 1 0 6
0 3 4 1 0 0 1 1 0 7
1 3 4 0 1 0 1 1 0 8
2 3 4 0 0 1 1 1 0 9
0 1 5 1 1 0 0 0 1 10
0 2 5 1 0 1 0 0 1 11
1 2 5 0 1 1 0 0 1 12
0 3 5 1 0 0 1 0 1 13
1 3 5 0 1 0 1 0 1 14
2 3 5 0 0 1 1 0 1 15
0 4 5 1 0 0 0 1 1 16
1 4 5 0 1 0 0 1 1 17
2 4 5 0 0 1 0 1 1 18
3 4 5 0 0 0 1 1 1 19

Combination Generating Algorithms
Algorithm 1: Increase a given combination to its next combination in
the lexicographic order.
All s-combinations forms a finite set of size n!/(s!(n-s)!). The
combination with the highest lexicographic number, n!/(s!(n-s)!) – 1,
has no normal successor that is lexicographically higher. We define
the successor of the highest combination to be the one of
lexicographic number 0. In Example 1, the combination 3 – 4 – 5 has
the highest lexicographic 19, increasing it will result to the
combination 0 – 1 – 2 of lexicographic 0.
In Listing 1, the function increaseComb(UINT_VECTOR &v, unsigned int
n) increases the combination v. The argument "n" is the set size, and
the size of vector v, namely v.size( ), is s for s-combination out of
n. This algorithm is based on "Algorithm L, lexicographic
combinations" of D. E. Knuth's work [1].
Algorithm 2: Decrease a given combination to its previous combination
in the lexicographic order.
As for the increase, we define that the previous of the lowest
combination (lexicographic number 0) to be the highest combination of
lexicographic number n!/(s!(n-s)!) - 1. So, decreasing combination
0-1-2 results combination 3-4-5 in Example 1.
Listing 2 is the code.
Algorithm 3: Getting a combination out of its lexicographic number.
Given an integer lex, where 0 <= lex < n!(s!(n-s)!), function
lex2Comb( ) in Listing 3 generates an s-combination with lexicographic
number lex . First the combination in its binary format is determined.
The vector variable selector[] is used to hold the binary
representation in reverse order. For example, selector[0] =1 indicates
n – 1 is selected, selector[0] = 0 indicates n-1 is not selected. In
the same way, selector[n-1] determines whether the number 0 is
selected or not. The function combCount(s, n) calculates the value of
n!/(s!(n-s)!).
This algorithm basically determines whether the current highest
number (initially, n – 1) is selected or not:
case 1: lex is 0, the result combination is the lowest of "s out of
n". It is 0, 1, … , s – 1. The problem is solved.
case 2: lex < combCount(s, n – 1), this means n – 1 is not included
in the resultant combination, so the problem scale is reduced to
finding the combination of "s out of n – 1" with the lexicographic
number lex.
case 3: lex >= combCount(s, n – 1), this means n -1 is chosen. The
problem scale is reduced to finding the combination of "s – 1 out of n
– 1" with the lexicographic number lex – combCount(s, n – 1).
For understanding this algorithm, let us take a look at the "3 out of
6" combinations in Example 1. Given lex = 8. The following table shows
the process of finding the combination.
n s lex Highest number combCount(s, n – 1) Comparison Result
6 3 8 5 10 lex < combCount(3, 5) 5 not selected
5 3 8 4 4 lex > combCount(3, 4) 4 selected
4 2 4 3 3 lex > combCount(2, 3) 3 selected
3 1 1 2 2 lex < combCount(1, 2) 2 not selected
2 1 1 1 1 lex == combCount(1, 1) 1 selected
Terminate here, since we already got three selections. The result
combination is 1, 3, 4.
The Comb Class
With these three algorithms implemented, we define the class Comb to
represent combinations in objects. Listing 4 is the declaration of
class Comb.
• The constant S and constant N specify that the class is for
combinations "S out of N".
• The actual numbers made up the combination are held in the data
member m_v which is a vector of size S.
• The method getLex( ) returns the lexicographic number of the
combination. The way to calculate the lexicographic number out of its
combination is described as "Theorem L" in [1]. The algorithm is
pretty simple.
• If the least significant element, m_v[0] gets its maximal value of N
– S, the combination is the highest in lexicographic order. On the
other hand, if the most significant element, m_v[S – 1], gets its
minimal vlaue of S – 1, the combination is the lowest in lexicographic
order. The methods isMin( ) and isMax( ) reflect this fact.
• The default copy constructor and the default assignment operator are
good enough for this class, so we do not need to define our own.
• As shown in Listing 5, constructor, Comb( ), constructs an object of
lexicographic number 0, while Comb(unsigned int lex) constructs an
object of lexicographic number lex.
• With the function increaseComb( ) and decreaseComb( ), the
implementation of operator++ and operator-- is straightforward as in
Listing 6.
• The boolean operator "==" is implemented by comparing the member
vector m_v's elements of same significance. It seems that ">" and "<"
operators should also be defined in the same way. However, since our
definition of operator ++ rounds the highest combination to the lowest
one, given two objects c1 and c2, if c1 < c2 is true, ++c1 < ++ c2 is
not necessarily true when c2 is of max lexicographic number. For this
reason, there are no operator ">" nor "<" defined.

Use the operator ++, all the combinations can be listed with the
following code:
Comb c;
for ( ; !c.isMax( ); ++c)
c.print( );
c.print();

Given a random unsigned integer r, Comb c(r) makes a random
combination of "s out of n".
The Comb<S, N> Class
For the class Comb, there are two constants, S and N, which define the
combinations to be "S out of N". These two constants are specified
with preprocessor directive "#define". Obviously, this is not a
flexible way. First, each time when a different kind of combination is
required, the source code has to be modified to re-define these two
constants. Second, it is not possible to have objects of different
combinations in the same program. Thanks to C++ templates, we can make
class Comb a class template with S and N being the template
parameters.
Instead of declaring
#define S some-number
#define N some-number
class Comb {...};
We declare
template <unsigned int S, unsigned int N>
class Comb {...};

The class Comb becomes class template Comb<S, N>. All the method
definitions should be modified accordingly, for example, the prefix ++
operator, now, looks as:
template <unsigned int S, unsigned int N>
const Comb<S, N> Comb<S, N>::eek:perator++(int) //postfix
{ Comb<S, N> oldValue = *this;
++(*this);
return oldValue;
}

To get an object of 3 out 6 combination, we define a variable:
Comb<3, 6> c1;
While to get a combination of 5 out of 10, we define:
Comb<5, 10> c2;
The following assignment will cause compile time error.
c1 = c2; //error, they are two different types.
One important fact is that the assertion (N >= S) && (S > 0) should be
held, otherwise, an instantiation of class template Comb<S, N> will
have undefined behavior. Ideally, the assertion failure could be
determined at compile time. Unfortunately there is no way to do so.
The current solution is let the constructors throw exceptions at run
time if the assertion failed. In doing so, The ANSI assert macro,
assert(N >= S && S > 0) is put at the beginning of the two
constructors.
The Comb<unsigned int, unsigned int, typename T> Class
The number of combinations grows fast as N increases and S gets close
to N/2. For example, N = 50, S = 6, the total number of combinations
is 15,890,700, while with N = 50, S = 10, it becomes 10,272,278,170. A
common 32-bit computer has unsigned integer up to 4,294,967,296. So,
the class Comb<10, 50> can not be handled with 32-bit unsigned
integer.
The function combCount(s, n) computing n!/(s!(n-s)!) is the one that
gets over-flowed. We can get rid of this function and reduce some
features of class template Comb<S, N>. Without combCount( ), the
function lex2Comb( ) in Listing 3 can not be implemented. In class
template Comb<S, N>, the following, one constructor and two methods,
can not be implemented and have to be removed:
explicit Comb(unsigned int lex);
const unsigned int getLex() const;
void setValue(const unsigned int lex);
With this feature-reduced class template, we can instantiate Comb<10,
50> and generate all the objects sequentially (that is a giant list).
A constructor, like Comb(int seed) may be added to construct an object
of random combination, but this can only get non-deterministic result.
If we want the full features of the combination class and the ability
to handle huge combinations, a bigger integer for the function
combCount( ) and related methods have to be used. Originally, the
return type of combCount( ) is "hard-coded" to be "unsigned int" as:
unsigned int combCount(unsigned int s, unsigned int n) { ...... };
To make it possible for returning some other integral type, this code
can be changed as:
#typedef [some integral type, default is unsigned int] T;
T combCount(unsigned int s, unsigned int n){ ...... };
However, using preprocessor directive #typedef has the same
inflexibility as mentioned before with #define: source code has to be
modified to use a different integral type and it is impossible to
handle different integral types in the same program. Again, thanks to
C++ template, we make T as template parameter. Listing 7 is the
definition of function template combCount<T>(unsigned int s, unsigned
int n). With T be 64-bit or even greater integral type, we are able to
handle bigger combination classes.
Now, we declare
template <unsigned int S, unsigned int N, typename T = unsigned int>
class Comb {...};
The default type for T is "unsigned int". The three related functions
are declared as
explicit Comb(T lex);
const T getLex() const;
void setValue(const T lex);
The helper function lex2Comb( ) is defined as
template <typename T>
void lex2Comb(T lex, unsigned int s, unsigned int n, UINT_VECTOR &v){
.... };
Of cause, all the definitions of constructors and methods should be
re-written a little bit. The constructor Comb<T lex> looks as:
template <unsigned int S, unsigned int N, typename T>
Comb<S, N, T>::Comb(T lex)
{ assert((N >= S) && (S > 0));
m_v.resize(S);
lex2Comb<T>(lex, S, N, m_v);
}

On MS Windows, there is 64-bit integral type, "__int64". With this, we
can instantiate our class template to bigger combination class that
can not be handled with 32-bit integer. With the following definition,
Comb<10, 50, __int64> c(10000000000);
c is the 10,000,000,001th combination of 10 out of 50. The output of
c.print( ) is
0 7 11 15 26 33 36 41 48 49
The rule of California Super Lotto Plus is "Pick five numbers from 1
to 47 and a MEGA number from 1 to 27". The total of combinations of 5
out of 47 is 1,533,939. This can be handled with T being the default
unsigned int. The following gets the 1,000,001th combination object of
5 out of 47:
Comb<5, 47> c(1000000);
c.print( );
The output is: 7 15 21 32 43. Our representation with range "0 to
46" is slightly different from the lotto's range of "1 to 47", so this
output should be interpreted as 8 16 22 33 44. You can try your
luck with this combination, but don't forget to pick a MEGA number
from 1 to 27.
Conclusions
C++ templates programming appears to be a difficult task. The
impressive complication of Standard Template Library (STL) makes an
ordinary programmer feel such a task is formidable. However, if we
just take C++ templates as a tool or mechanism to scale up or expand
already well-designed classes or algorithms, it can be very helpful.
In this article, hopefully, we not only provide a useful class but
also demonstrate the rational design path of using templates through
the evolution of class Comb to class template Comb<S, N, T>. The moral
is: in the initial design stage, forget about templates, focus on
making useful classes or algorithms, then take a close look at the
"#define" and "#typedef" items. If some of these items fluctuate as
required, use templates to expand the initial design by moving
fluctuating preprocessor directive items to be template parameters.

Notes and References
[1] Donald E. Knuth, The Art of Computer Programming , Volume 4,
Combinatorial Algorithms. Pre-Fascicle 2c (Generating all
combinations) http://www-cs-faculty.stanford.edu/~knuth/
[2] The companion source code has been built with Visual C++ 7 and
tested on Windows XP. You can get it for free by email request.

Listing 1
typedef std::vector<unsigned int> UINT_VECTOR;

void increaseComb(UINT_VECTOR &v, unsigned int n)
{ unsigned int i;
unsigned int j = 0;
for (i = 0; i < v.size() - 1; i++)
{ if (v + 1 == v[i+1])
v = j++;
else break;
}
if (i >= v.size() - 1)//no j found
{ j = v.size() - 1;
if (v[j] >= n - 1) //if highest lex reached?
v[j] = j; //round to lowest lex
else v[j]++;
}else v[j]++;
}

Listing 2
void decreaseComb(UINT_VECTOR &v, unsigned int n)
{ unsigned int i;
for (i = 0; i < v.size(); i++)
{ if (v > i)
{ v--;
break;
}
}
if (i == v.size()) //the lowest reached
{ i = v.size() - 1;
v = n - 1; //round to highest
}
int j, k;
for (j = i - 1, k = 1 ; j >= 0; j--, k++)
v[j] = v - k;
}

Listing 3
void lex2Comb(unsigned int lex, unsigned int s, unsigned int n,
UINT_VECTOR &result)
{ std::vector<char> selector(n);
unsigned int i;
for (i = 0; i < n; i++)
selector = 0;
unsigned int curLex = lex;
unsigned int curCombLen = s;
unsigned int nSelected = 0;
for (i = 0; (i < n) && (curLex > 0); i++){
unsigned int combCnt = CombCount(curCombLen, n - i - 1);
if (curLex < combCnt)
selector = 0;
else {
selector = 1;
nSelected++;
curLex -= combCnt;
curCombLen--;
}
}
if (nSelected < s){
// curLex == 0, set the rest to the lowest
for (i = n -1; nSelected < s; i--){
selector = 1;
nSelected++;
}
}
//convert binary format to vector, selector is in reverse
result.resize(s); //make sure we have enough size
unsigned int j = s - 1;
for (i = 0; i < n; i++){
if (selector == 1){
result[j] = n - i - 1;
if (j == 0) break;
j--;
}
}
}

Listing 4
#define S ? //some positive integer constant
#define N ? //some positive integer constant that > S
class Comb {
public:
Comb();
explicit Comb(unsigned int lex);

Comb& operator++(); //prefix
const Comb operator++(int); //postfix
Comb& operator--(); //prefix
const Comb operator--(int); //postfix

const unsigned int getLex() const;
const bool operator==(const Comb& rhs) const;
const bool isMax() const {return m_v[0] == N - S;}
const bool isMin() const {return m_v[S - 1] == S - 1;}
void setValue(const unsigned int lex);
const UINT_VECTOR& getValue()const {return m_v;}
void print()const; //print elements in vector m_v
private:
UINT_VECTOR m_v;
};

Listing 5
Comb::Comb()
{ m_v.resize(S);
for (unsigned int i = 0; i < S; i++)
m_v = i;
}

Comb::Comb(unsigned int lex)
{ m_v.resize(S);
lex2Comb(lex, S, N, m_v);
}

Listing 6
Comb& Comb::eek:perator++() //prefix
{ increaseComb(m_v, N);
return *this;
}

//postfix has to return by value
const Comb Comb::eek:perator++(int) //postfix
{ Comb oldValue = *this;
++(*this);
return oldValue;
}

Comb& Comb::eek:perator--() //prefix
{ decreaseComb(m_v, N);
return *this;
}

//postfix has to return by value
const Comb Comb::eek:perator--(int) //postfix
{ Comb<S, N> oldValue = *this;
--(*this);
return oldValue;
}

Listing 7
template <typename T>
T combCount(unsigned int s, unsigned int n)
{ if (s == 0) return 0;
if (n < s) return 0;
unsigned int k = n - s;
if (k == 0) return 1;
k = std::min(s, k);
T count = n;
for (unsigned int i = 1; i < k; i++)
count = count * (n - i) / (i + 1);
return count;
}
 
V

Vyacheslav Kononenko

Bo said:
[skip...]
One important fact is that the assertion (N >= S) && (S > 0) should be
held, otherwise, an instantiation of class template Comb<S, N> will
have undefined behavior. Ideally, the assertion failure could be
determined at compile time. Unfortunately there is no way to do so.
[skip..]

This is not true. For example:
Code:
template <bool> struct Assert;

template<> struct Assert<true> {
};

template <int S, int N > class Comb {
// ...
private:
Assert< (N >= S) > N_Must_Be_Not_Less_Than_S;
Assert< (S > 0)  > S_Must_Be_Greater_Than_Zero;
};

int main()
{
// OK
Comb < 3, 3 > comb1;

// error: `Comb<S, N>::N_Must_Be_Not_Less_Than_S' has incomplete type
Comb < 4, 3 > comb2;

// error: `Comb<S, N>::Comb<S, N>::S_Must_Be_Greater_Than_Zero' has
incomplete type
Comb < 0, 0 > comb3;
return 0;
}

see Alexandrescu's http://www.moderncppdesign.com/ for more details on
generic programming
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,954
Messages
2,570,116
Members
46,704
Latest member
BernadineF

Latest Threads

Top