Serialization Framework

molw5.iwg

I'm looking for some feedback on an early version of a serialization
framework I've been working on, see:

http://www.github.com/molw5/framework

The idea was to use some of the new C++11 features to provide a natural
syntax for the definition of serializable objects down to their
representation on the wire. The motivating use-case here was protocol
analysis - I wanted the ability to specify protocols I was working with
naturally with zero duplication of information. The following example
illustrates the syntax used:

struct Object : serializable <Object,
    value <NAME("Field 1"), little_endian <uint32_t>>,
    value <NAME("Field 2"), big_endian <float>>,
    optional_field <uint16_t,
        optional_value <0x0001, NAME("Field 3"), stl_null_string>,
        optional_value <0x0002, NAME("Field 4"), stl_null_wstring>,
        optional_value <0x0004, NAME("Field 5"), little_endian <uint32_t>>,
        optional_value <0x0008, NAME("Field 6"),
            stl_vector <
                little_endian <uint32_t>,
                inline_object <
                    value <NAME("Field 1"), little_endian <uint32_t>>,
                    value <NAME("Field 2"), little_endian <uint32_t>>>>>>,
    value <NAME("Field 7"),
        stl_map <
            little_endian <uint32_t>,
            stl_string,
            stl_wstring>>>
{
    void foo ()
    {
        using x1 = get_base <Object, NAME("Field 2")>;
        using x2 = get_base <Object, NAME("Field 4")>;

        x2::set("Hello World!");
        x1::set(x1::get() + x2::get().size());
    }
};

Note that Visual C++ 2011 is not supported - the syntax above relies
heavily on the ability to translate a string literal to a type
through the NAME macro. This macro requires the ability to perform
compile-time character extraction from string literals; in C++11
this was made possible through constexpr. Visual C++ lacks support
for this feature, even in the CTP; recent versions of clang and GCC are
supported.
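
For readers unfamiliar with the technique, here is a minimal sketch of how a string literal can be turned into a type with C++11 constexpr. It is not the framework's NAME macro: type_string, char_at, and strip are hypothetical names, and this version silently truncates literals at 8 characters to keep the macro short.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical names throughout; this is not the framework's NAME macro.

// A type whose identity is the character sequence it carries.
template <char... Cs>
struct type_string
{
    static std::string str ()
    {
        const char buffer[] = {Cs..., '\0'};
        return std::string(buffer);
    }
};

// constexpr character extraction: yields '\0' once i runs past the literal.
template <std::size_t N>
constexpr char char_at (const char (&s)[N], std::size_t i)
{
    return i < N ? s[i] : '\0';
}

// Drops the '\0' padding by collecting characters up to the first null.
template <typename Acc, char... Rest>
struct strip;

template <char... Acc>
struct strip <type_string <Acc...>>
{
    using type = type_string <Acc...>;
};

template <char... Acc, char... Rest>
struct strip <type_string <Acc...>, '\0', Rest...>
{
    using type = type_string <Acc...>;
};

template <char... Acc, char C, char... Rest>
struct strip <type_string <Acc...>, C, Rest...>
{
    using type = typename strip <type_string <Acc..., C>, Rest...>::type;
};

// Extracts a fixed number of characters (8 in this sketch) at compile time.
#define NAME(s) strip <type_string <>,                          \
    char_at(s, 0), char_at(s, 1), char_at(s, 2), char_at(s, 3), \
    char_at(s, 4), char_at(s, 5), char_at(s, 6), char_at(s, 7)>::type

static_assert(std::is_same <NAME("id"), type_string <'i', 'd'>>::value,
              "the literal maps to a type");
static_assert(!std::is_same <NAME("id"), NAME("key")>::value,
              "different literals map to different types");
```

Because each literal produces a distinct type, it can be passed as a template argument and looked up later, which is the property the definition syntax above depends on.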
 
molw5.iwg

> How does this compare to boost serialization with portable_binary archive?
>
> Jeff

I've never used that module. Boost's Serialization was of course
examined as an alternative initially, but it did not appear to meet
my needs at the time. I took a look through some of the portable_binary
examples and, correct me if I'm wrong, it looks like the portable_binary
archive specifies the underlying format in Archive and uses that when
serializing data, i.e.:

struct Object
{
    ....
    int x;
    int y;
    int z;
    template <typename Archive>
    void serialize (Archive& ar, const unsigned int)
    {
        ar & x & y & z;
    }
};

Specifically, ar above would serialize x, y, and z using the format dictated
by Archive - be that big endian, little endian, or some other scheme. In
my use case x and y could use dramatically different packing schemes - for
example, x may need to be serialized as little endian while y and z may
need to be packed into a short using some bit packing scheme. A scheme
similar to the above could, in principle, still be used by wrapping x, y,
and z in structures that alter how they're serialized (I don't know if boost's
Serialization supports this, but it's clearly possible):

struct Object
{
    ...
    as_little_endian <int> x;
    ...
    // not clear how definitions extending
    // across multiple variables should be handled

    template <typename Archive>
    void serialize (Archive& ar, const unsigned int)
    {
        ar & x & ...;
    }
};
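
To make the wrapper idea concrete, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: byte_archive, as_big_endian, and the operator& overload are hypothetical, a concrete archive type stands in for the Archive template parameter, and none of this is Boost's or the framework's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical names throughout; a sketch, not Boost's or the framework's API.

// Minimal archive that only collects raw bytes.
struct byte_archive
{
    std::vector <std::uint8_t> bytes;
};

// Wrapper that always writes its value least significant byte first.
template <typename T>
struct as_little_endian
{
    T value;

    void serialize (byte_archive& ar) const
    {
        typedef typename std::make_unsigned <T>::type U;
        const U u = static_cast <U>(value);
        for (std::size_t i = 0; i < sizeof(U); ++i)
            ar.bytes.push_back(static_cast <std::uint8_t>((u >> (8 * i)) & 0xFF));
    }
};

// Wrapper that always writes its value most significant byte first.
template <typename T>
struct as_big_endian
{
    T value;

    void serialize (byte_archive& ar) const
    {
        typedef typename std::make_unsigned <T>::type U;
        const U u = static_cast <U>(value);
        for (std::size_t i = sizeof(U); i-- > 0;)
            ar.bytes.push_back(static_cast <std::uint8_t>((u >> (8 * i)) & 0xFF));
    }
};

// Boost-style chaining: each member serializes itself in its own format.
template <typename Field>
byte_archive& operator& (byte_archive& ar, const Field& field)
{
    field.serialize(ar);
    return ar;
}

struct Object
{
    as_little_endian <std::uint32_t> x;
    as_big_endian <std::uint16_t> y;

    // The member list above and the order here must still be kept in sync.
    void serialize (byte_archive& ar) const
    {
        ar & x & y;
    }
};
```

Even in this form the members are declared once in the structure and listed again in serialize, so the wire layout is still described in two places.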

The reason I did not end up pursuing that design is the fact that
definitions similar to the above fundamentally either split or duplicate
information in the protocol specification. In both examples above a
variable's type is first defined as part of the structure definition, then
the order in which the variables are serialized is defined as part of the
serialize function. That probably seems trivial :) - it was actually the
original motivation behind the design. The serialization of certain
complex structures was defined in methods similar to serialize, and those
methods contained non-trivial control paths. This turned debugging and updating
the protocol into an absolute nightmare as the protocol definition was
effectively split into three parts: the structure definition, the read
function, and the write function. Clearly moving to a syntax similar to
serialize would have mitigated that to a degree, however fundamentally it
wasn't possible to eliminate the division entirely under that design.

In addition to the above, there are serious advantages to defining the
layout of structures inside an object's type information. Specifically, a
kind of compile-time introspection is possible using this approach - if you
examine comparable.hpp you'll note that it defines methods for member-wise
comparisons between an arbitrary pair of serializable objects. Similar
wrappers may be written naturally for generic data member visitors, such
as print methods or similar for the purpose of data visualization. Other
very neat possible uses exist - for example, as the base class order is
mutable, one could solve a bin-packing problem at compile-time to
optimally construct an arbitrary object to minimize space. Containers
could be written that mutate the serialization order of a set of variables
based on the value of a compile-time constant, allowing build-specific
protocols to be trivially written. The list is endless :)
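
As a rough illustration of that kind of introspection, the sketch below keeps the field list in the object's type and generates a member-wise comparison from it. It is not taken from comparable.hpp: field, record, field_list, value_of, and fields_equal are hypothetical names, and plain tag structs stand in for NAME("...").

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins; these are not the framework's actual types.

// Tag types play the role that NAME("...") plays in the thread.
struct id_tag {};
struct label_tag {};

// A field is a base class holding one value, identified by its tag.
template <typename Tag, typename T>
struct field
{
    T value;
};

template <typename... Fields>
struct field_list {};

// The layout lives in the type: a record derives from every listed field.
template <typename... Fields>
struct record : Fields...
{
    using fields = field_list <Fields...>;
};

// Look a field up by tag; T is deduced from the matching base class.
template <typename Tag, typename T>
T& value_of (field <Tag, T>& f)
{
    return f.value;
}

// Member-wise comparison driven by the field list, not by hand-written code.
template <typename L, typename R>
bool fields_equal (const L&, const R&, field_list <>)
{
    return true;
}

template <typename L, typename R, typename Head, typename... Tail>
bool fields_equal (const L& lhs, const R& rhs, field_list <Head, Tail...>)
{
    return static_cast <const Head&>(lhs).value == static_cast <const Head&>(rhs).value
        && fields_equal(lhs, rhs, field_list <Tail...>());
}

template <typename L, typename R>
bool fields_equal (const L& lhs, const R& rhs)
{
    return fields_equal(lhs, rhs, typename L::fields());
}

using id_field = field <id_tag, std::uint32_t>;
using label_field = field <label_tag, std::string>;

// Same fields, different base class order.
struct A : record <id_field, label_field> {};
struct B : record <label_field, id_field> {};
```

With these definitions, fields_equal(a, b) compares an A against a B field by field even though the two types list their bases in a different order. A print or visitor function could walk the same field_list in the same way.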
 
woodbrian77

> How does this compare to boost serialization with portable_binary archive?


Just a reminder of another "boost serialization" library.
The C++ Middleware Writer isn't part of Boost, but it has
serialization support for some parts of Boost that the
serialization library in Boost doesn't have --
http://webEbenezer.net/comparison.html
G-d willing, we'll be adding support for other
libraries in Boost in 2013.


Brian
Ebenezer Enterprises -- making programming fun again.
www.duckduckgo.com
 
