(snip, I wrote)
Yes, as someone else indicated, you should organize into groups.
That adds one more level of abstraction, but is it enough?
The grouping can be hierarchical, which gives you as many additional levels
of abstraction as you want. I've recently been assigned to work on a new
project, and the program I'm currently responsible for in that project
has functions grouped into libraries, and the libraries can be divided
into several different groups, depending upon who's responsible for each
group of libraries: the C standard, the C++ standard, the POSIX
standard, three different third party vendors, our project, and another
project that we have borrowed a lot of code from. There's an average of
a couple of dozen libraries in each group, and an average of a dozen or
so identifiers with external linkage (most of them, function names) in
each library. That's three different levels to the hierarchy, which is
almost enough to make everything comprehensible (or at least, it might
be, if our code and the other project's code were better organized and
documented ;-( ). I imagine that larger projects need more levels; but
this one currently marks the upper limits of my experience.
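
To put rough numbers on that three-level hierarchy, here is a small Python sketch. The figures are only illustrative roundings of the description above: 8 groups (the six sources listed, counting the three vendors separately), "a couple of dozen" taken as 24, and "a dozen or so" taken as 12. The function names are made up for this example.

```python
# Illustrative roundings of the figures described above, not exact counts.
GROUPS = 8                # C, C++, POSIX, three vendors, our project, borrowed project
LIBS_PER_GROUP = 24       # "a couple of dozen"
IDENTIFIERS_PER_LIB = 12  # "a dozen or so"

def flat_count(groups, libs_per_group, ids_per_lib):
    """Total identifiers a reader would face with no grouping at all."""
    return groups * libs_per_group * ids_per_lib

def widest_level(groups, libs_per_group, ids_per_lib):
    """Largest number of items visible at any single level of the hierarchy."""
    return max(groups, libs_per_group, ids_per_lib)

total = flat_count(GROUPS, LIBS_PER_GROUP, IDENTIFIERS_PER_LIB)
widest = widest_level(GROUPS, LIBS_PER_GROUP, IDENTIFIERS_PER_LIB)
print(total, widest)
```

Under those assumptions there are a couple of thousand identifiers in total, but nobody ever has to choose among more than a couple of dozen items at any one level.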
If you have X functions with mean Y lines of code each, for a
total of X*Y, should Y not increase at all as X increases?
I don't think so. Y should be determined by the average developer's
ability to understand the code; I don't see any reason why it should
change with the size of the project.
For an example, say 1M loc, and you have a choice between
(mean) 50k functions of 20 lines each and 20k functions of 50
lines each, which will be more readable?
Unless this is C++ code, 20 lines seems a bit short for a function, but
any code that does something which can be implemented in 20 lines is
certainly going to be more readable than one that requires 50 lines. The
key point is that the average programmer won't have to (and won't have
time to) read all of the code - whether it has 50K functions or only
20K. Any given programmer will read only the functions he or she is
actually working on, and short descriptions of the calling and called
functions. I assume you're not describing a one person project!
Now, going from 2D to 3D doesn't make the function 50% harder
to read, as there will be some similarity between those lines.
That depends upon what the third dimension means. If you have homogeneous
data, and a third dimension just means larger chunks of data, then it
won't matter much. However, usually an additional dimension implies
corresponding additional features that need to be implemented, which
could easily make the code 50% harder to understand. In my own code, two
dimensions that occur almost everywhere are called "sample" and
"detector", and together they lay out a two-dimensional image. A
possible third dimension (though the code is not actually organized that
way) would be by scan; each scan of the rotating mirror creates a new
image, and that's just more of the same thing, which would be an example
of what you're talking about. However, the third dimension that actually
occurs in many contexts in my code is usually the band number; each
frequency band needs different processing, something which goes way
beyond being simply an extension of the two-dimensional image processing
part of the code.
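
The difference between the two kinds of third dimension can be sketched in Python. Everything here is hypothetical: the function names, the band names, and the processing steps are invented for illustration and are not taken from the real code. Each image is a list of rows, standing in for the detector and sample dimensions.

```python
# All names and processing steps below are hypothetical illustrations.

def subtract_mean(image):
    """Per-scan step: identical for every scan, so scans are 'more of the same'."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    return [[v - mean for v in row] for row in image]

def process_scans(scans):
    """A scan dimension only adds an outer loop around the 2-D code."""
    return [subtract_mean(image) for image in scans]

def scale_by_gain(image, gain=2.0):
    """Made-up processing for one kind of band."""
    return [[v * gain for v in row] for row in image]

def clip_negative(image):
    """Made-up, unrelated processing for another kind of band."""
    return [[max(v, 0.0) for v in row] for row in image]

# A band dimension needs a separate code path per band.
BAND_STEPS = {"band_a": scale_by_gain, "band_b": clip_negative}

def process_bands(bands):
    """Dispatch each band's image to its own processing step."""
    return {name: BAND_STEPS[name](image) for name, image in bands.items()}
```

Adding scans costs one loop; adding a band costs a new function that someone has to write, understand, and maintain, which is where the extra difficulty comes from.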
For a completely unrelated subject, consider how the mean size
of cities changes as population grows. It might be better to have
more cities of the same, smaller, size but that isn't usually what
happens.
Larger cities require larger administrations to understand them well
enough to administer them properly. Correspondingly, large software
projects need to have more people working on them, but in my experience,
the best way to subdivide responsibilities for a program is to assign
entire functions or even entire libraries to one programmer, not by
trying to have one programmer understand one part of a function, and
have another programmer understand a different part of it. Therefore,
the maximum size of a function is determined by a single programmer's
ability to understand it, not by the size of the program that it's a
part of.