Those are true (I know this because I have done a lot of C++ compiler
work), but frankly, people writing C++ parsers / compilers / etc, are
a minuscule fraction of the people using C++ for other tasks.
So what you have, basically, is harder work for a few, to make things
easier/better/safer for many.
Again, a pretty decent tradeoff.
fair enough.
the problem is that I don't really have enough time/energy/... to
personally justify going through the work (all at once), though I may
eventually have it in a "one piece at a time" sense.
for interfacing between a script language and C++ at the C++ level, one
would probably need (in a basic sense):
the ability to share classes (at minimum importing them, so C++ APIs can
be used directly from the HLL; much harder is exporting classes to C++,
and harder still would be allowing mutual inheritance/overloading);
....
or, say for example, transparently interfacing C++ and an ECMAScript
variant (similar to JavaScript or ActionScript), ideally without making
awkward/arbitrary restrictions for one or the other (including semantic
mismatches).
as-is, C structs can be shared, but my scripting VM has several notable
restrictions WRT shared structs (ones shared between C code and the
scripting VM):
minor "construction" issues ("new structname();" will create a struct,
but not necessarily in the "correct" way for a given use-case);
currently, the struct needs to be a type known to the GC/... (it needs to
be allocated in "managed" memory, or the memory region needs to be
registered with the GC with the appropriate handlers set up, and the
declaration needs to be visible to / processed by the metadata tool);
currently, structs may not be directly/physically nested (except in
certain special cases, but may be linked freely via pointers);
arrays-of-structs don't currently work (one needs arrays of pointers to
structs instead; a rough sketch of a struct declared within these limits
follows after this list);
shared function pointers are very much a "use with caution" feature
(function pointers work both ways, but have not been tested extensively,
I don't much "trust" the feature, and it introduces additional
restrictions);
....
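
to make the above a bit more concrete, here is a rough sketch of a struct
declared within these limits (purely illustrative; the field names are made
up and not from any actual code):

typedef struct FooNode_s FooNode;
struct FooNode_s              //declaration needs to be visible to the metadata tool
{
    int value;
    FooNode *next;            //linked via a pointer; direct physical nesting not allowed
    FooNode **kids;           //array of pointers to structs, not an array of structs
    int n_kids;
};
//instances would also need to live in GC-managed memory (or in a region
//registered with the GC), rather than in plain malloc'ed memory.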
a lot of the reason for these issues is that the VM/language is
internally "soft typed" (sort of a mix of static and dynamic typing).
WRT handling complex data types, it currently uses dynamic
type-checking, and if the VM can't determine the type of the object a
pointer refers to, it will simply treat it as a no-op opaque value
(this often happens with, say, malloc'ed memory, ...).
in a few cases, the VM uses "boxed pointer values" (basically where the
pointer is boxed in a heap-based object which also tracks its type,
usually done if the static type is known but a dynamic type is not
visible from the pointer), but this is expensive (when loading a
pointer, the VM may examine it to determine whether it needs to box it,
avoiding doing so where possible).
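
as a purely illustrative sketch (the names and layout here are made up,
not the VM's actual structures), a boxed pointer amounts to something like
this:

#include <stdlib.h>

//heap-based box pairing a raw pointer with its statically-known type.
typedef struct BoxedPtr_s
{
    const char *type_name;    //e.g. "myapp_foo_t"
    void *ptr;                //the raw C pointer being wrapped
} BoxedPtr;

static BoxedPtr *box_pointer(void *p, const char *static_type)
{
    BoxedPtr *box = malloc(sizeof(BoxedPtr));
    box->type_name = static_type;
    box->ptr = p;
    return box;
}

//in the load path, the VM would first check whether the dynamic type is
//already visible (e.g. the pointer is into GC-managed memory), and only
//fall back to boxing when it isn't.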
as can be noted, "new structname();" will generally create the struct as
a "pass-by-value" boxed type (rather than currently as its assigned
dynamic type name, note example below). however, semantics/... get hairy
here (and, typically, explicit constructor functions are used for most
cases where this would matter).
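
what such an explicit constructor function might look like (reusing the
FooNode sketch from earlier; "gc_alloc_type()" is just a placeholder name
for the managed-allocation call, not the real API):

#include <stddef.h>

//placeholder declaration; the real call would come from the VM/GC headers
extern void *gc_alloc_type(const char *tyname);

//explicit constructor, callable from both C and the script side, which
//avoids depending on "new FooNode();" doing the right thing.
FooNode *FooNode_new(int value)
{
    FooNode *tmp = (FooNode *)gc_alloc_type("FooNode");
    tmp->value = value;
    tmp->next = NULL;
    tmp->kids = NULL;
    tmp->n_kids = 0;
    return tmp;
}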
so, as-is, not even C is perfectly handled at its own level (but, the
subset is "generally good enough" for most tasks).
so, in a basic sense, C++ struct-based objects could be shared in a
limited sense:
typedef struct Foo_s Foo;
struct Foo_s dytname("myapp_foo_t")   //dytname==dynamic type name
{
    fields...
#ifdef __cplusplus
    .... non-virtual methods ...
#endif
};
the trivial extension would be to have the metadata tool be able to
accept the class keyword, and be able to parse non-virtual methods.
the biggie issue though:
about as soon as one defines "__cplusplus" in the preprocessor, one
usually also gets a mountain of template definitions and other things
(even from otherwise theoretically plain C headers), meaning the
parser/tools/... have to be prepared to deal with all of this.
there is no real way for a tool to tell the system headers "only give me
those pieces of code I can deal with". actually... one often even has to
emulate certain compilers just to make the headers happy, as many headers
will refuse to work correctly if one doesn't define, say:
__GNUC__
__GCC_VERSION__
....
which means that my tools have ended up supporting many GCC and MSVC
extensions (and partial compiler emulation) simply trying to get the
headers parsed.
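
roughly the kind of pattern that forces this (a generic illustration of
what many headers do, not any particular vendor's header):

/* pick an implementation based on which compiler the header thinks
   it is talking to; without __GNUC__ or _MSC_VER defined, parsing
   may just dead-end at the #error. */
#if defined(__GNUC__)
#define MYLIB_INLINE  static __inline__ __attribute__((always_inline))
#elif defined(_MSC_VER)
#define MYLIB_INLINE  static __forceinline
#else
#error "unsupported compiler"
#endif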
as-is, I am partly left scared of the C++ case.
otherwise, one would need a lot of special case stuff, essentially
making the "tools" case separate from either the C or C++ cases.
#if defined(__cplusplus) || defined(_BGBMETA)
....
#endif
or, creating a new PP define to reflect a common sub-case (a C++ subset
understandable to the tools).
my preference thus far has been for types/magic-modifiers/... which
can be made special for the tool, but are otherwise invisible to plain C
or C++ compilers (or which expand to something more generic).
"dytname()" is such an example, where in the tool it will expand to a
special attribute, and in plain C or C++ (native compiler) it will
expand to nothing.
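
so, in rough terms (the attribute spelling on the tool side is made up
here; the real expansion is whatever the metadata tool actually expects):

#ifdef _BGBMETA
//seen only by the metadata tool: expand to an attribute it understands
#define dytname(name)  __dytname_attr(name)
#else
//plain C or C++ compiler: expand to nothing
#define dytname(name)
#endif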
technically, the main metadata tool should be able to at least parse
basic C++ style syntax (class definitions, ...), but it lacks support for
many of the less trivial matters.
namespaces may pose additional difficulties (they are partially
supported, but very hackish, and concepts like scoped-typedefs, ... are
not currently handled).
IMO, templates also look like a big ugly issue.
currently, there is no support for the C++ ABI (and my own object system
works a bit differently, meaning a 1:1 mapping with C++ ABI classes
could be itself difficult).
also, classes would either need to overload "new" to use the GC for
allocation, or the VM/GC would have to be aware of and able to deal with
RTTI (which has its own set of worries); a rough sketch of the first
option is shown below.
....
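
a minimal sketch of that first option (C++; "gc_malloc()"/"gc_free()" are
placeholder names for the GC's allocation calls, not the actual API):

#include <cstddef>

//placeholder declarations; the real calls would come from the VM/GC headers
extern "C" void *gc_malloc(std::size_t sz);
extern "C" void  gc_free(void *p);

class ScriptVisible
{
public:
    //route allocation through the GC so the VM can see and manage the object
    void *operator new(std::size_t sz)  { return gc_malloc(sz); }
    void  operator delete(void *p)      { gc_free(p); }

    int value;
};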
I guess one can ask though how easy it can really be to interface
several not-exactly-similar languages (such as C and ECMAScript), and
avoid creating any ugly seams in the process.
it is also a question of how much can readily be done in a single-person
project.
[and given that I'm part of "the few", as well as part of the "many"
(I use C++ for many tasks), I think I have a pretty reasonable view of
this tradeoff. Indeed, for compiler writers, the additional
complexity isn't some sort of pure annoyance -- it's also in many ways
more interesting to work on (and for those writing their compiler in
C++, of course, a benefit)... :]
fair enough...
actually, my compiler and VM stuff is itself mostly written in C, largely
because I figure it is useful for the VM to be able to access its own
functionality.
similarly, most VM functionality is also exposed to C and C++ code,
albeit with the existing restrictions (typically, a minimally-adorned
C-like interface is provided, as going and writing a mountain of
cosmetic wrappers would be much added effort).
a few parts of my 3D engine, however, are C++, but admittedly it is
typically a fairly minimalist form (C-like with a few classes, and
generally no real use of templates or namespaces).
a partial reason for the lack of namespaces:
I am lazy and generate many of my headers with tools, and this particular
tool is fairly naive/simplistic (also under 1 kloc): it understands
neither namespaces nor preprocessor directives, and has its own special
command-language embedded in comments (or occasionally in special
preprocessor directives, which it treats as literal tokens).
this tool doesn't care about "using", namespace qualifiers, or "::", but
it does trip over code declared inside namespaces, which it will screw up
on and would have to be told to ignore as a whole region (see the example
below).
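
for example (illustrative only), the difference amounts to something like
this:

#include <string>

//fine: the tool just sees "::" as tokens in a declaration
std::string GetName(void);

//not fine: code declared inside a namespace; the tool would mangle this
//unless told (via its comment-embedded commands) to skip the whole region
namespace myapp {
void DoStuff(void);
}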