Joshua Maurice
I tend to blame the company developing the tool in cases like these. I
know Rational Rose (not Rose Realtime, which I hear is different and
better), and I have always loathed it for its lack of support for sane
version control, parallel development etc.
It never occurred to me, because I have never used it for code
generation, but I suppose it lacks support for sane building in the
same way.
Well, our problem is that we develop the GUI and some newer components
in Java, but our old engine is in C++. Our engine solves a domain
specific problem with a domain specific language. This language is
represented by an object graph, whose representation is logically
coupled with the GUI of the end users. Our engine takes this graph,
picks it apart into separate tasks assignable to separate threads, and
begins processing. Implicit in this is that we want to be able to take
an object graph from Java, serialize it to some XML or binary format,
and then deserialize it in C++ to hand to the engine, and vice versa
for debugging and so on. This potentially requires arbitrary code
generation. At one point, we used several Rose model files to describe
the object graph of the domain specific language. We had a custom
in-house tool which converted these to C++ classes and Java classes
with the serialization code in place. I don't really know any
other sane way to handle this use case, the serialization of object
graphs between different languages such as C++ and Java.
Perhaps when you generate a lot of source files at random,
you should at the same time generate a Makefile fragment describing
the dependencies between them. Perhaps the code generator should
follow the same policy as a Makefile build and not touch a generated
header file unless it is actually changed.
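As a rough sketch of what that could look like (the tool name
rose2cpp, the gen/ directory, and the fragment name are all
hypothetical), the generator would write a .mk fragment next to its
output as part of its normal run, and the makefile would pull it in:

  # Hypothetical: the code generator writes gen/generated.deps.mk,
  # listing the dependencies among the files it just produced, e.g.
  #   gen/Foo.o: gen/Foo.cpp gen/Foo.h gen/Bar.h
  #   gen/Bar.o: gen/Bar.cpp gen/Bar.h
  # The leading '-' keeps make quiet if the fragment doesn't exist
  # yet (e.g. before the first generation).
  -include gen/generated.deps.mk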
So I'm defining tools that break Make as bad tools ... which of course
doesn't help people who are stuck with them :-/
Maybe it would help somewhat to wrap the code generator in a shell
script which only touches the regenerated .cpp/.h files which have
actually changed (and removes all of them if the generation fails).
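A sketch of what such a wrapper might look like, written here as a
make rule whose recipe does the comparison (the generator name
rose2cpp, the model file, and the directory layout are hypothetical;
recipe lines must be indented with real tabs):

  # Run the generator into a scratch directory, then overwrite only
  # those real outputs whose contents actually changed, so unchanged
  # files keep their old timestamps. If generation fails, the scratch
  # directory is discarded and the real outputs are left alone.
  gen-stamp: model.mdl
          rm -rf gen.tmp && mkdir -p gen.tmp gen
          rose2cpp --model $< --out-dir gen.tmp || \
            { rm -rf gen.tmp; exit 1; }
          for f in gen.tmp/*; do \
            out=gen/$$(basename $$f); \
            cmp -s $$f $$out || mv $$f $$out; \
          done
          rm -rf gen.tmp
          touch $@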
The "conditional touching of files in a command of a rule" won't work,
at least not with GNU Make. The GNU Make mailing list has confirmed
that any file creation, deletion, or modification during phase 2 may
not be picked up. This has been my experience playing with it as well.
GNU Make effectively determines which portions of the graph are out of
date before running any command of any rule.
This is one facet of my major beef with Make: from a command, you
cannot conditionally choose to mark downstream nodes as up to date or
out of date. Depending on the kind of build step, this affects its
incremental "goodness" (the ability to skip unnecessary build steps)
to varying degrees.
With the Rose generation, with GNU Make, when the code generation task
is out of date, you can mark all output files out of date. It'll
result in some additional C++ compilation - a better system could skip
more unnecessary work, but at least it's incrementally correct.
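A minimal sketch of that arrangement (names are hypothetical): hang
every generated object off a single stamp target, so that rerunning
the generation marks them all out of date:

  # One stamp file stands for the whole Rose generation step. If the
  # model changes, the stamp is remade and every generated object is
  # recompiled - more work than strictly necessary, but incrementally
  # correct.
  rose.stamp: model.mdl
          rose2cpp --model $< --out-dir gen
          touch $@

  # The object list is assumed to be known here; how to discover it
  # is a separate problem (see 2b further down).
  GENERATED_OBJS := gen/Foo.o gen/Bar.o

  $(GENERATED_OBJS): rose.stamp
          $(CXX) $(CXXFLAGS) -c $(@:.o=.cpp) -o $@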
With Java compilation, you could make an incrementally correct build
system, but it would be a cascading rebuild with no early cutoff,
vastly inferior to the aforementioned system, which can terminate the
rebuild early.
My assertion early in this thread was that such things (the "make
loopholes") can be avoided (don't use many include search paths)
and/or detected manually when they happen (rebuild from scratch when
files disappear from version control). I don't think I saw you
explaining why that isn't good enough, or did I miss that?
Well, a couple of things.
First, some build steps, like javah, javac, the aforementioned Rose
compilation, unzipping, and others, produce output which is not
predictable without doing the actual build step. With file creation
and deletion, you need to check for:
1- Stale files - this might require some cleaning, and rerunning of
build steps downstream.
2- New files
2a- New files which hide old files on some search path. Relatively
unlikely for C++ depending on naming conventions (a lot more likely in
my company's product due to bad naming conventions and lots of include
path entries), but much more likely for Java.
2b- New files which require new build steps, or new nodes in the
dependency graph. For my Rose to C++ code generation, I do not know
what C++ files will come out of it until I actually do the code
generation. It will produce lots of .cpp files (which will not be
modified by hand). Each of these .cpp files needs to be compiled to
a .o file. I would like for this to be done in parallel, but to do
that I need to define new make rules, i.e., add nodes to the
dependency graph, which one really cannot do in GNU Make.
I know a little about GNU Make and how it can have makefiles as
targets: if it detects an out-of-date makefile, it will rebuild that
makefile (and all of its prerequisites) and restart GNU Make from the
beginning with the new makefile. Has anyone ever used this? I admit
that I
haven't played around with this fully, but my initial impression is
that it's basically unworkable for my problems. Restarting the whole
shebang after every such Rose code generation would result in a lot of
makefile parsing, easily adding minutes (or likely much more) to a
build time. Though, I admit I could be wrong here.
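For reference, this is roughly the pattern being described (again
with hypothetical names); whether the restart cost is acceptable is
the open question:

  # gen/rules.mk both lists the generated sources and defines rules
  # for them. Because it appears in an include, GNU Make will rebuild
  # it when it is out of date and then re-exec itself with the fresh
  # rules - that re-exec is the restart cost discussed above.
  -include gen/rules.mk

  gen/rules.mk: model.mdl
          rose2cpp --model $< --out-dir gen --emit-rules $@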
Finally, why punt? We're requiring that the developer be fully aware,
but I think that a lot of these rules, such as "do a full clean build
whenever a file is deleted", are easy to forget or accidentally miss.
I think this is a little different than "don't dereference a
null pointer" or similar arguments you can make. When we're writing
code, we're aware of the pointer and that it could be null. When we're
doing a build, we're busy thinking about code, not about whether the
entire codebase breaks some "build style" rule, or whether some file
has been deleted. It's not practical to check your email for
"incremental build breaking messages". It's inefficient, and error
prone. Moreover, it's fixable. It's quite doable to handle all of
this, and more, and to do it faster than GNU Make. There is no reason
to punt. The investment of time now to make a build system which can
handle it all will save lots of developer time later - for those
developers who:
- work on "the build from hell" (like me),
- forgot to check for a file deletion when they did a sync,
- or work on a mixed code base with Java, C++, etc. (like me).
Put another way, yes I recognize that the perfectly correct, academic
way is not the way to do things. For example, see my post here:
http://groups.google.com/group/comp.lang.c++.moderated/msg/dacba7e87ded4dd7
However, it seems clear to me that this is a win worth the
investment. The investment needs to be made once, by one guy, and
everyone in the C++ and Java (and beyond) programming world can use
it to save time. Any time savings *at all* is easily worth it when we
can amortize the cost to one guy but claim savings from every
developer everywhere.
Now, it's hard to make such an argument to management, mostly because
it's wrong. For management, you correctly need to show that it helps
the company, which is a bit harder to show, but I still think that
this is the case. (My management and peers disagree though.)
[About being able to specify other actions than compiling and linking]
But no build tool actually does this. At best, they just provide a
framework for the new compilation step. GNU Make just provides a
framework. The build tool which I'm writing just provides a framework.
Yes, but it seemed to me you considered *not* providing that. That's
why I pointed out that it's important to many of us.
It seems to me that you have a pretty narrow focus and don't want to
listen to objections a lot. Actually, I think that's fine. That's what
*I* do when I have an idea and want to summon the energy to do
something about it.
I'm not quite following. One second. I think I need to clarify. Make
does not do C++ compilation out of the box, nor any other random
possible build kind. You need to write some logic in a makefile to
handle each new kind of build. My new tool will be effectively the
same: it won't handle any arbitrary build kind out of the box - it
won't be magic - but it will be simple and quick to extend to handle a
new build kind X, just like make. The difference is that I'm strictly
enforcing the separation of build-kind logic from the average
developer, who just instantiates an already defined macro. If need be,
the average developer can add a new macro, but it will not be in an
interpreted language a la make, so it will be much faster; the macro
definition cannot live in an arbitrary build script file a la make,
which will allow much easier auditing of incremental correctness; and
it will be harder for an unknowledgeable developer to break the build,
because the build system makes it exceptionally hard to do so. I think
an analogy applies here: "private, protected, public, const" are
technically unnecessary for perfect developers, but we recognize their
utility in protecting us from ourselves. (No, this is not a proof by
analogy. I'm just trying to explain my case.)
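For readers who haven't seen the pattern, this is roughly what
"instantiating an already defined macro" looks like in GNU Make terms
(the macro name and file names below are made up); the tool described
above would keep the same shape of interface, just with the macro
bodies defined outside the interpreted build scripts:

  # A build-kind macro, defined once by whoever owns the build system:
  # $(1) is the library name, $(2) the list of sources.
  define cxx_static_lib
  $(1).a: $(patsubst %.cpp,%.o,$(2))
          $(AR) rcs $$@ $$^
  endef

  # The average developer only instantiates it:
  $(eval $(call cxx_static_lib,libengine,engine.cpp solver.cpp))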