StateFull vs Stateless Singleton

James Kanze

If there is some user command to change them later, then
they aren't command line options. Long running programs
don't normally use command line options (other than to
specify the configuration files); they use configuration
files, and if the user can issue commands to modify the
configuration on the fly, they probably rewrite the
configuration file as well. Command line options are for
short running programs, which do one thing and exit (like
a compiler). My options are variables at namespace scope.
Preferably local to the module in which they are used. The
constructor calls a function in CommandLine which enrols
them.
While convenient at some levels, this solution isn't
perfect. You've still got to find all of those options when
it comes to writing the man page or help text. But
globally, I've found it very useful to be able to define an
option in the module which uses it, without having to modify
any central code which "knows" all of the options.
OK, I have used a slightly different approach. Each option is mapped to a
command (function or functor). As options are parsed, the commands are
executed. The result is that the command line is parsed, passed around
and forgotten immediately. The same commands can be executed later by
other means if for some reason an option has to be changed. Next to the
commands there may be short help texts and hints if needed. In the
simplest case it is a static constant array in some unnamed
namespace.

That's more or less what I do as well: each option is mapped to
a command. But practically all commands have state, since
options control the later execution of the program. And there
are some more or less standard cases: BooleanOption,
NumericOption, StringOption, etc.

The state is necessary because exploiting the command line is
really a two phase operation: first you parse out the options,
then you treat what is left as an array of filenames. (Most of
the time. My CommandLine class doesn't enforce this---once the
options have been parsed out, it looks very much like an
std::vector<std::string>.)
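
As a rough illustration of the scheme described above (options as namespace-scope objects that enrol themselves, then a parse that leaves the non-option arguments behind), here is a minimal sketch. The names Option, BooleanOption and CommandLine::parse come from the post; CommandLine::enrol, every signature and every function body are assumptions made for illustration, not the actual class.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical base class: every option enrols itself on construction.
class Option {
public:
    explicit Option(std::string name);
    virtual ~Option() = default;
    virtual void set() = 0;  // called when the option is seen on the command line
};

// Hypothetical registry with a static-only interface (no instance() call).
class CommandLine {
public:
    static void enrol(const std::string& name, Option* option) {
        assert(option != nullptr);
        registry()[name] = option;
    }

    // Phase 1: consume the enrolled options.
    // Phase 2: return what is left (typically filenames) as plain strings.
    // This sketch does not diagnose unknown options; they end up in the rest.
    static std::vector<std::string> parse(int argc, const char* const* argv) {
        std::vector<std::string> rest;
        for (int i = 1; i < argc; ++i) {
            const std::string arg = argv[i];
            auto found = registry().find(arg);
            if (found != registry().end()) {
                found->second->set();
            } else {
                rest.push_back(arg);
            }
        }
        return rest;
    }

private:
    // Function-local static: constructed on first use, so it already exists
    // when a namespace-scope Option in another module enrols itself.
    static std::map<std::string, Option*>& registry() {
        static std::map<std::string, Option*> options;
        return options;
    }
};

Option::Option(std::string name) { CommandLine::enrol(name, this); }

// One of the "standard cases": the option carries state read later on.
class BooleanOption : public Option {
public:
    using Option::Option;
    void set() override { value_ = true; }
    bool value() const { return value_; }
private:
    bool value_ = false;
};

// Defined in the module that uses it; no central list has to be edited.
BooleanOption verbose("-v");
```

After CommandLine::parse(argc, argv) has run, the module simply tests verbose.value(), and the returned vector holds the remaining arguments.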
Probably it is because I like a somewhat extended interface for command
line applications ...
If some mandatory option is missing from the command line, then the program
may ask for it (instead of reporting an error).

That's fairly easy to implement on top of my interface. In my
code, if some mandatory option is missing, I'll abort with an
error message, but the same code which does this could ask for
it instead. (Since the command line programs I write are used
more often in scripts than directly, going interactive isn't an
option.)
When no options are given at all, it may enter
interactive mode (instead of printing a few lines about
typical usage). Such twists are not too uncommon and make
testing simpler.

Again, going interactive sort of defeats the purpose of using
a command line driven program: you don't want to go interactive
in the middle of a script. But it shouldn't be that difficult
to implement: just check if argc is 1 before calling
CommandLine::parse, and if it is, do something else.
It is difficult to achieve if there are components written by third
parties. It is a bit easier to equip them with an interface for adding
command line commands.

If the components are obtained already written, they won't
conform to any system you define. If the third party is writing
them on contract for you: my "interface" as to how to add
a command line option is to derive from Option, and declare
a static instance of the derived class (or just use one of the
pre-defined classes, for the frequent cases like just setting
a boolean variable, or getting the name of a file as a string).
If the handling of configuration files must be kept
centralized, then I prefer to provide the simplest interface:
namespace configuration
{
void read( ModuleID id, vector<char>& bytes );
void write( ModuleID id, vector<char> const& bytes );
}

This begs the question. A configuration file needs an internal
representation to contain its data. You have to define this
internal representation somewhere. It needs some sort of means
of accessing the data: the most widespread format uses a two
level access, so you end up with a more or less complicated
interface where you have to specify both a section and a name to
access a value. (Why two levels, I don't know: for a lot of
simple applications, one level is largely sufficient, and for
larger applications, you may want more than two levels. Which
can be easily simulated by a naming convention, but you might
want to structure the file itself more.)
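
To make the two-level access concrete, here is a minimal sketch of an internal representation: a map of sections, each a map of names to values, filled from ini-style text. The names Config, parseIni and lookup are hypothetical, and the parsing rules (comment characters, silently skipping malformed lines) are assumptions, not any particular library's behavior.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Two levels: section name -> (value name -> value).
using Section = std::map<std::string, std::string>;
using Config = std::map<std::string, Section>;

// Strips leading and trailing whitespace.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parses "[section]" headers and "name = value" lines.
// Values that appear before any header land in the section named "".
Config parseIni(const std::string& text) {
    Config config;
    std::string section;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        line = trim(line);
        if (line.empty() || line[0] == ';' || line[0] == '#') continue;
        if (line.front() == '[' && line.back() == ']') {
            section = trim(line.substr(1, line.size() - 2));
            continue;
        }
        const std::string::size_type eq = line.find('=');
        if (eq == std::string::npos) continue;  // sketch: skip malformed lines
        config[section][trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return config;
}

// The caller has to name both the section and the value.
std::string lookup(const Config& config, const std::string& section,
                   const std::string& name, const std::string& fallback) {
    auto s = config.find(section);
    if (s == config.end()) return fallback;
    auto v = s->second.find(name);
    return v == s->second.end() ? fallback : v->second;
}
```

More than two levels can be simulated in this representation by a naming convention, for example a section called "net.tls"; a single level is just the section named "".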
The simplest interfaces are the simplest to extend. Whether it is an
ini or XML file, for me it is a set of bytes that I preserve in exact
form.

It can't stay that way forever. At some point, you need to
extract the actual information.
When all modules are mine, then of course it is cheaper
to have some extended interface so that each configurable value can
be retrieved from a common tree (or one of its branches).
There are so many possible ways to handle and manage
concurrency. I think there are at least 4 major ways plus
endless subtypes. Some (like OpenMP) make it the business of the
implementation; others deal with threads very closely and
explicitly. I do not think that any of them starts threads
from constructors of static objects, but who knows.

Who said anything about starting a thread from the constructor
of a static object? You call a function (typically a member of
a Thread object) to start a thread. A thread itself has two
distinct software components, the code to be executed, and
a component which represents the thread itself: its
meta-information, like state. (Many early thread designs
confounded the two.) But if you want to shut down cleanly, you
need to be able to find all of those threads, in order to notify
them, and to wait for them to terminate. (Calling exit() in one
thread, while other threads are still running, will generally
result in undefined behavior.) So each thread must register
itself somewhere; that registry must be unique, and so
a singleton.
Why not in some personal vector of taskmanager?

Because namespaces can't have "personal" members? You can't
protect access to a namespace. (I'd probably make TaskManager
a singleton class, with the actual implementation a forward
declared private member class, for a maximum of encapsulation.)
Everyone has to give their tasks to the taskmanager without
knowing how many threads there are for the job. Is it some
other model of concurrency?

It depends on why you are using threads. In a server, for
example, there will be one or two threads per connection. In
a GUI, a thread may be explicitly created in order to handle
a specific request. About the only time you don't really know
how many threads you want to create is when you're using threads
for parallelization. (Of course, the thread you create when
a client logs onto your server may create other threads, to do
specific tasks for it.)

None of this is really relevant to what I was saying, however.
It doesn't matter who creates the threads, or how many they may
create. Every thread must register with some central, unique
thread manager, and provide some means for this thread manager
to shut it down. This is really private to the threading
subsystem (except that you'd generally provide an externally
accessible function to trigger the shutdown---and usually
functions to iterate through the threads, display their status,
etc., for debugging purposes).
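
A minimal sketch of such a central registry, assuming C++11 std::thread and a cooperative stop flag. ThreadRegistry and its member names are hypothetical, and a real thread manager would also need a way to unblock threads that are waiting on something. Depending on the toolchain, building this may require a threading flag such as -pthread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical registry: every thread is created through it, so that a
// shutdown can find, notify and wait for all of them.
class ThreadRegistry {
public:
    static ThreadRegistry& instance() {
        static ThreadRegistry registry;  // unique, constructed on first use
        return registry;
    }

    ~ThreadRegistry() { shutdown(); }

    // Starts a thread running body(stop). The body is expected to return
    // soon after stop becomes true. Returns false once shutdown has begun.
    bool start(std::function<void(const std::atomic<bool>&)> body) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (stopRequested_) return false;  // block the creation of new threads
        threads_.emplace_back([this, body] { body(stopRequested_); });
        return true;
    }

    // Notifies every registered thread, then waits for all of them.
    // In this sketch the registry cannot be restarted afterwards.
    void shutdown() {
        std::vector<std::thread> threads;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopRequested_ = true;
            threads.swap(threads_);
        }
        for (std::thread& t : threads) t.join();
    }

    // For debugging: how many threads are currently registered.
    std::size_t count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return threads_.size();
    }

private:
    ThreadRegistry() = default;

    std::mutex mutex_;
    std::vector<std::thread> threads_;
    std::atomic<bool> stopRequested_{false};
};
```

The only externally visible entry points are the ones the post mentions: a function to trigger the shutdown and a function for inspecting the registered threads.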
Yes, some things may need a specific thread. For example, OpenGL
is AFAIK implemented so that you can draw to one display from
only one thread and no other. However, the presence of such
threads may also be queried from the taskmanager (with some
special function).

That's a different issue. Some components may use threads, and
may require a specific registry of the threads they're using.
(In the case of a GUI, there will normally be one thread per
window, and all updates to the window occur in that thread. The
system will, however, provide some means of posting a request
to the window.)
A thread is not like an object for me. It is more like a running function
(activity). Threads usually communicate with messages or signals.
The basic functionality of receiving messages, accepting jobs
to do and reporting back about problems may be the same for all threads.
In that case the place from which they get their tasks may also be
centralized and located where that function is implemented.

A thread has behavior (in addition to just executing its code)
and state. It's necessary to maintain the meta-information
concerning the thread's state somewhere.
The tasks that a thread does may have some other mechanism for
canceling them halfway through; for example, they may get some
signal interface to attach to and observe. That really needs
a whole separate newsgroup about how to implement concurrency in
C++; there probably are some on Usenet.

If you need clean shutdown (not all applications do, and I've
written programs where the only way of stopping them was a "kill
-9"), then you need to be able to stop threads in
a deterministic amount of time. If every thread except the root
thread runs in a deterministic amount of time, this is trivial:
just block the creation of new threads, and wait. Otherwise,
there must be a means of signaling each thread (and unblocking
it), so that it can shut down cleanly. And a means of signaling
each thread presupposes a means of finding each thread.
I believe it is a bad twist and does not fit too well into C++,
where you may have free functions.

It's neither bad nor good. It becomes bad or good depending on
what you do with it.

You can often use free functions as well. But if you have state
(immutable or otherwise), it usually makes more sense to use
a class, in order to control access to the state.
Every time I see one it feels like
"std::sorter::instance().sort(from, to)".

I've never seen anything that silly (at least not in C++).
A "sorter" doesn't have any state that is maintained between
instances, so wouldn't be a singleton. For this reason,
std::sort is not a class, but a function. (But the standard
isn't always very clean about this distinction.
std::list<>::sort is a member function, because it needs access
to the internal details of list<> in order to have an efficient
implementation. A much better solution would be to have
a specialization of the std::sort free function, which is
a friend. But that would probably require some sort of partial
specialization of functions; I'm not sure if it could be done
with just overloading and SFINAE.)
When it is surely the same object whose member functions I call,
then I prefer not to have that object on the table at all.

Whether the object is visible to the user or not is really not
an important issue. I tend to use static member functions, so
the user doesn't have to consider the instance. But when
obtaining the instance is non trivial (e.g. it may require
a lock), that can have serious performance implications. There
is no universal solution.
With that object I can do nothing anyway but call its member
functions.

If obtaining the object is not free, then it may be an advantage
to obtain it once, rather than to obtain it each time you need
to access one of its functions (directly or indirectly).
I switched to a static interface (only static member functions,
so you never call instance()) in CommandLine, because its
functions are definitely called seldom enough, and because
getting the instance is very fast. In some earlier projects,
however, I've used a standard singleton, and avoided calling the
instance() function in tight loops for performance reasons.
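
The trade-off can be sketched as follows. Counter is a hypothetical singleton, and the lock taken in instance() only stands in for whatever makes obtaining the instance expensive; it is an assumption for illustration, since a function-local static does not need it.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical singleton whose instance() is deliberately not free.
class Counter {
public:
    static Counter& instance() {
        std::lock_guard<std::mutex> lock(guard());  // stand-in for a costly lookup
        static Counter theOne;
        return theOne;
    }

    void add(int n) { total_ += n; }
    int total() const { return total_; }

    // Static interface: the caller never sees the instance, but pays for
    // obtaining it on every call.
    static void addOne() { instance().add(1); }

private:
    Counter() = default;
    static std::mutex& guard() {
        static std::mutex m;
        return m;
    }
    int total_ = 0;
};

// Convenient, and fine when calls are rare.
void countThroughStaticInterface(int n) {
    for (int i = 0; i < n; ++i) Counter::addOne();
}

// Obtains the instance once, outside the tight loop.
void countWithCachedInstance(int n) {
    Counter& counter = Counter::instance();
    for (int i = 0; i < n; ++i) counter.add(1);
}
```

Both functions have the same effect; they differ only in how often the cost of instance() is paid.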
 
Öö Tiib

Ok, we wrote and wrote:
The state is necessary because exploiting the command line is
really a two phase operation: first you parse out the options,
then you treat what is left as an array of filenames.  (Most of
the time.  My CommandLine class doesn't enforce this---once the
options have been parsed out, it looks very much like an
std::vector<std::string>.)

Maybe more phases:
1) parse options (among them may be an option that names a file with
more options)
2) parse the optional (default? pre-set? provided by another option?)
options file
3) parse the other mandatory fields (like paths/filenames)
4) execute commands based on the combined options

As I said somewhere not long ago, it does not bother me if an intermediate
product of the parsing (some vector of strings compiled together) is
preserved for some reason for later inspection in some singleton.
After all, it is constant, and dependencies on such a singleton are
unlikely.
The whole operation (all phases) can, however, be the job of a single
function. Imagining a convenient interface for adding options and
commands to such a list is simple. Boost.Program_options has almost
such an interface:

// Declare the supported options.
po::options_description desc("Allowed options");
desc.add_options()
    ("help", "produce help message")
    ("compression", po::value<int>(), "set compression level")
;
po::options_description cmdline_options;
cmdline_options.add(desc);

What is missing for me are the functions/functors for the commands.
They should be given right away through the same interface, together with
the strings (and best of all is when possible conflicts can be detected at
compile time). I am not convinced that the whole config has to be provided
with options (as Boost seems to think). Beyond a certain size, everything
simply cannot fit on the command line (at least on Windows, where the
maximum is a few thousand characters or so).
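
A minimal sketch of what such a table could look like without Boost: each entry carries the option name, its help text and the command to run, all in one place. Settings, OptionCommand and execute are hypothetical names, and the --name=value syntax is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical state changed by the commands.
struct Settings {
    bool showHelp = false;
    int compression = 0;
};

// One entry per option: name, short help text, and the command itself.
struct OptionCommand {
    const char* name;
    const char* help;
    void (*run)(Settings&, const std::string& value);
};

namespace {
// Static constant array in an unnamed namespace; captureless lambdas
// convert to plain function pointers.
const OptionCommand commands[] = {
    {"--help", "produce help message",
     [](Settings& s, const std::string&) { s.showHelp = true; }},
    {"--compression", "set compression level",
     [](Settings& s, const std::string& v) { s.compression = std::stoi(v); }},
};
}  // namespace

// Executes the command for each argument of the form --name or --name=value.
// Returns false at the first unknown option. std::stoi throws on a value
// that is not a number; a real program would catch and report that.
bool execute(const std::vector<std::string>& args, Settings& settings) {
    for (const std::string& arg : args) {
        const std::string::size_type eq = arg.find('=');
        const std::string name = arg.substr(0, eq);
        const std::string value =
            eq == std::string::npos ? std::string() : arg.substr(eq + 1);
        bool known = false;
        for (const OptionCommand& c : commands) {
            if (name == c.name) {
                c.run(settings, value);
                known = true;
                break;
            }
        }
        if (!known) return false;
    }
    return true;
}
```

Because execute takes a plain vector of strings, the same commands can be run later from a configuration file or an interactive prompt, not only from the command line.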

[...]
Again, going interactive sort of defeats the purpose of using
a command line driven program: you don't want to go interactive
in the middle of a script.

That is clearly a bug in the script. A script that turns the application it
executes into interactive mode is the easiest to notice. An application
asking for the values of missing parameters, or going fully interactive in
the middle of a script, is like a debug break in the script because of a bug.
Alternative symptoms such as an error state ("option missing" or "no
arguments" returned by the application) may be ignored by the script, and
what follows is undefined behavior that is sometimes slightly harder to
notice. I can imagine that a buggy script in the hands of an end user is
similarly embarrassing in both cases.
 But it shouldn't be that difficult
to implement: just check if argc is 1 before calling
CommandLine::parse, and if it is, do something else.

Yes. That probably presents no difficulties.

[...]
This begs the question.  A configuration file needs an internal
representation to contain its data.  You have to define this
internal representation somewhere.  It needs some sort of means
of accessing the data: the most widespread format uses a two
level access, so you end up with a more or less complicated
interface where you have to specify both a section and a name to
access a value.  (Why two levels, I don't know: for a lot of
simple applications, one level is largely sufficient, and for
larger applications, you may want more than two levels.  Which
can be easily simulated by a naming convention, but you might
want to structure the file itself more.)

Yes. The most widespread format is the one used in ini files. That is the
two levels. That usually fits the command line too. Beyond that come
configurations that may contain binaries and are likely best
structured into several levels.
It can't stay that way forever.  At some point, you need to
extract the actual information.

When an ini file does not pull the weight, then the information is likely
complex. There can be parts in XML, TLV or whatever formats. Zipping
them all together (like the open document format does) and then providing
files (or, like I did above, vectors of char) is the least intrusive and
a good way out of too far-reaching responsibilities (and therefore
dependencies). Where I come from, the whole configuration may be required
to be digitally signed. That means major difficulties with an ini file, but
with the open document format it takes a minor extension. I have also
thought about writing a generic configuration handling framework that
scales, but have not found time for that. Microsoft did try (the registry)
and, as everybody can see, they are sort of sorry.
Who said anything about starting a thread from the constructor
of a static object?

You did ... as an example of the difficulties of initialization order and
why it needs to be a class? Does not matter, really. We both agree it is
pointless.
 You call a function (typically a member of
a Thread object) to start a thread.  A thread itself has two
distinct software components, the code to be executed, and
a component which represents the thread itself: its
meta-information, like state.  (Many early thread designs
confounded the two.)  But if you want to shut down cleanly, you
need to be able to find all of those threads, in order to notify
them, and to wait for them to terminate.  (Calling exit() in one
thread, while other threads are still running, will generally
result in undefined behavior.)  So each thread must register
itself somewhere; that registry must be unique, and so
a singleton.

Yes, OK. Just that it must not be publicly accessible, and so as a class
it may need "friends".
Because namespaces can't have "personal" members?  You can't
protect access to a namespace.  (I'd probably make TaskManager
a singleton class, with the actual implementation a forward
declared private member class, for a maximum of encapsulation.)

Yes, that is what I am pushing. The PIMPL idiom is not needed with a
singleton. The whole private implementation may be local to the .cpp, with
no traces of its existence in the .h. There it really is single, sole and
the only one of its kind.
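
A minimal sketch of that layout, shown in one listing for brevity. The file names in the comments, the taskmanager namespace contents and the function names are hypothetical, and the sketch ignores synchronization entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// --- taskmanager.h (hypothetical): only free functions are declared. ---
namespace taskmanager {
void submit(const std::string& task);
std::size_t pending();
}  // namespace taskmanager

// --- taskmanager.cpp (hypothetical): the state leaves no trace in the header. ---
namespace {
// Local to this translation unit, so nothing outside it can name the data.
// Constructed on first use, which sidesteps static initialization order.
std::vector<std::string>& taskQueue() {
    static std::vector<std::string> tasks;
    return tasks;
}
}  // namespace

namespace taskmanager {
void submit(const std::string& task) { taskQueue().push_back(task); }
std::size_t pending() { return taskQueue().size(); }
}  // namespace taskmanager
```

Callers see only taskmanager::submit and taskmanager::pending; there is no class, no instance() and no forward declared implementation type in the header.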

[...]
None of this is really relevant to what I was saying, however.
It doesn't matter who creates the threads, or how many they may
create.  Every thread must register with some central, unique
thread manager, and provide some means for this thread manager
to shut it down.  This is really private to the threading
subsystem (except that you'd generally provide an externally
accessible function to trigger the shutdown---and usually
functions to iterate through the threads, display their status,
etc., for debugging purposes).

OK. Yes, there has to be a single piece of data somewhere. So I did provide
such a function. The only difference is that it is free and there is no one
on the table who represents such single data (besides that namespace, but it
does not imply that the data is in that namespace). How that function
gets access to the imagined central data (and how it detects whether such
central data has been constructed, or not yet destructed, or even phoenixed
back) is its own business.
It's neither bad nor good.  It becomes bad or good depending on
what you do with it.

You can often use free functions as well.  But if you have state
(immutable or otherwise), it usually makes more sense to use
a class, in order to control access to the state.

I prefer to control access to that state by locating it in a single
translation unit. Being local to the translation unit makes sure that there
are no questions of access to it. Like PIMPL. Only that not even a pointer
to a forward declared type is needed in the header file. The same goes for
members that are simultaneously private and static. If there are no
friends, then why hint at their existence at all?
I've never seen anything that silly (at least not in C++).

You are lucky. Notice our topic name: "Stateless singleton" is exactly
that terribly puerile thing AFAIK. I have unfortunately seen them in
piles. Collected into a component! The component was "Made in USA". You are
correct that this silliness does not extend to all singletons.

[...]
Whether the object is visible to the user or not is really not
an important issue.  I tend to use static member functions, so
the user doesn't have to consider the instance.  But when
obtaining the instance is non trivial (e.g. it may require
a lock), that can have serious performance implications.  There
is no universal solution.

That again speaks somewhat against classical singletons. Often
providing copies of (or parts of) mutable state is cheaper than
locking it. A singleton is non-copyable. So functions for obtaining an
interface (or const interface) are more flexible. Sometimes it is good
to lock the whole, sometimes a part. With singletons you have to expose a
whole interrelated family of them so that they can be obtained with
"instance()". I prefer it when such interfaces spread as parameters to
constructors and not through public getters.

Better is no "instance()" whatsoever, since why carve the answers into
rock:
Q: Is there a singleton? Is it the whole singleton? Is it part of a
singleton? Is it just a copy of the part of a singleton that implements the
interface?
A: Got an interface? Use it. See the contract for how. Report a defect if
you see one. What is the problem?
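
A minimal sketch of spreading an interface through constructor parameters instead of a public getter. Logger, MemoryLogger and Importer are hypothetical names chosen only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical narrow interface; callers depend on this and nothing else.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void log(const std::string& message) = 0;
};

// One possible implementation. Whether the real one is a singleton, a part
// of one, or a copy is invisible to the code that uses the interface.
class MemoryLogger : public Logger {
public:
    void log(const std::string& message) override { lines_.push_back(message); }
    const std::vector<std::string>& lines() const { return lines_; }
private:
    std::vector<std::string> lines_;
};

// The dependency arrives as a constructor parameter, not through a call
// to some instance() function.
class Importer {
public:
    explicit Importer(Logger& logger) : logger_(logger) {}
    void run(const std::string& file) { logger_.log("importing " + file); }
private:
    Logger& logger_;
};
```

Importer never learns where its Logger came from, which also makes it easy to hand it a different implementation in a test.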
If obtaining the object is not free, then it may be an advantage
to obtain it once, rather than to obtain it each time you need
to access one of its functions (directly or indirectly).
I switched to a static interface (only static member functions,
so you never call instance()) in CommandLine, because its
functions are definitely called seldom enough, and because
getting the instance is very fast.  In some earlier projects,
however, I've used a standard singleton, and avoided calling the
instance() function in tight loops for performance reasons.

Yes, I switched a bit further: only functions. Plain functions are best
in simple cases. When there are performance/synchronization reasons,
then interfaces or proxies to it (or parts of it) can be obtained or
spread. There is no way to obtain the singleton object itself (if such
exists). Synchronization should be dealt with immediately if the need for
it is visible from the architecture. Performance, however, should be dealt
with only when there are issues with it.
 
