lines of code

Pavan

Hi, I want to know if there is any software for measuring the lines of
code in my C++ application.
I found one tool, sloccount, but it gives only physical lines of
code.

I found one more tool, cccc, but I am getting many parse errors
with it.


If you know of any other tool (to be used on Linux), please let me know.
 
Gianni Mariani

Pavan said:
Hi, I want to know if there is any software for measuring the lines of
code in my C++ application.
I found one tool, sloccount, but it gives only physical lines of
code.

I found one more tool, cccc, but I am getting many parse errors
with it.


If you know of any other tool (to be used on Linux), please let me know.

Exactly how do you measure LOC? Does that include comments? Defines?
Expressions over multiple lines?
 
Pavan

Exactly how do you measure LOC? Does that include comments? Defines?
Expressions over multiple lines?

Yes, it should count comments separately and should also give logical
lines of code.
 
Fred Kleinschmidt

Pavan said:
Yes, it should count comments separately and should also give logical
lines of code.

int nItems = 0; // Number of items in my list
Is the above a line of code? Is it a comment line?

Do you mean statements instead of lines?
a=0; b=1; c=2;
Is the above one line? three lines?

How do you interpret a line with no characters other than whitespace?
What about macros? How about multi-line macros that expand to
multiple statements?

y=f(x) , z=f(w);
How many lines in the above?
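
To make the "lines versus statements" distinction concrete, here is a rough sketch that counts statement terminators instead of physical lines. The function name `countStatementTerminators` is made up for illustration, and the approach is an assumption rather than how sloccount or cccc work: it counts each `;` that lies outside comments, string and character literals, and parentheses. It does not understand raw string literals, digit separators, macros, or the comma operator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts ';' characters that terminate statements,
// skipping those inside comments, literals, and parentheses (e.g. a for header).
int countStatementTerminators(const std::string& src) {
    enum class State { Code, LineComment, BlockComment, String, Char };
    State state = State::Code;
    int parenDepth = 0;
    int count = 0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        const char c = src[i];
        const char next = (i + 1 < src.size()) ? src[i + 1] : '\0';
        switch (state) {
        case State::Code:
            if (c == '/' && next == '/') { state = State::LineComment; ++i; }
            else if (c == '/' && next == '*') { state = State::BlockComment; ++i; }
            else if (c == '"') state = State::String;
            else if (c == '\'') state = State::Char;
            else if (c == '(') ++parenDepth;
            else if (c == ')' && parenDepth > 0) --parenDepth;
            else if (c == ';' && parenDepth == 0) ++count;
            break;
        case State::LineComment:
            if (c == '\n') state = State::Code;
            break;
        case State::BlockComment:
            if (c == '*' && next == '/') { state = State::Code; ++i; }
            break;
        case State::String:
            if (c == '\\') ++i;                 // skip the escaped character
            else if (c == '"') state = State::Code;
            break;
        case State::Char:
            if (c == '\\') ++i;                 // skip the escaped character
            else if (c == '\'') state = State::Code;
            break;
        }
    }
    return count;
}
```

With this rule, `a=0; b=1; c=2;` counts as three while `y=f(x) , z=f(w);` counts as one, which shows that even a statement-based count still has to take a position on the comma case.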
 
Ian Collins

Pavan said:
Hi, I want to know if there is any software for measuring the lines of
code in my C++ application.

The obvious question is: why do you want to?

If you are using LOC as a metric, don't; it's nonsensical.
 
James Kanze

The obvious question is: why do you want to?
If you are using LOC as a metric, don't; it's nonsensical.

What kind of nonsense is that? Used correctly, it's a very
useful metric. The more lines of code, the larger the
application, and the more effort needed to develop (and
maintain) it.

Obviously, like every metric, it can be abused, but that doesn't
mean that it's useless. (And what do you propose in its place?)
 
Ian Collins

James said:
What kind of nonsense is that? Used correctly, it's a very
useful metric. The more lines of code, the larger the
application, and the more effort needed to develop (and
maintain) it.

Obviously, like every metric, it can be abused, but that doesn't
mean that it's useless. (And what do you propose in its place?)

I've seen it abused (as a performance metric) far more than used.

Even for the use you quote, it can be terribly misleading; code
complexity has more impact on support cost than lines of code. A huge
monolithic function may have fewer lines than a well-factored equivalent,
but it would be way more expensive to maintain.

I don't propose anything; there isn't a simple, accurate way of
measuring the complexity of a C++ application (or the productivity of a
programmer).
 
Jerry Coffin

[ lines of code ... ]
Obviously, like every metric, it can be abused, but that doesn't
mean that it's useless. (And what do you propose in its place?)

I'd suggest function points. Even when you attempt to use them as well
as possible, lines of code tend to be difficult to apply in many
situations -- just for an obvious example, the number of lines of code
to implement specific functionality often varies quite widely depending
on the implementation language. Function points help factor that out of
the equation. In fairness, that can be misleading as well -- under some
circumstances, language choice really has a substantial effect on the
effort required. In far more cases, however, such differences in length
reflect little more than syntactic verbosity.
 
Greg Herlihy

int nItems = 0; // Number of items in my list
Is the above a line of code? Is it a comment line?

I would count it as one line of code and as a one-line comment.

Do you mean statements instead of lines?
a=0; b=1; c=2;
Is the above one line? three lines?

Three lines. I would expect that any line-counting program would first
convert the source code into a "canonical" representation before
counting its "lines" of code.

How do you interpret a line with no characters other than whitespace?
What about macros? How about multi-line macros that expand to
multiple statements?

A line with only whitespace would not be counted as a line of code.
The source files will have already been preprocessed, so the line
counter program examines the source code that is actually fed to the
C++ compiler.

y=f(x) , z=f(w);
How many lines in the above?

Two. After all, the following program is nearly identical to the one
above:

y=f(x);
z=f(w);

But clearly my version does not have twice the amount of code as the
original program - so I would expect the lines of code counted in each
program would be the same. More formally, I would count every
"expression", "declarator" and "statement" (as described in the C++
language grammar found in Appendix A of the C++ language Standard) to
be equivalent to one "line" of code.


Greg
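
As a minimal sketch of the simpler, physical-line part of these answers (a line can be both a code line and a comment line, and whitespace-only lines are not code), something like the following could work. The names `LineCounts` and `classifyLines` are hypothetical, and the sketch assumes un-preprocessed source where `//` and `/* */` never appear inside string literals. It does not attempt the "canonical representation" or grammar-based counting described above.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical tally of physical lines; one line may count as both
// code and comment, as with "int nItems = 0; // ...".
struct LineCounts {
    int physical = 0;
    int blank = 0;
    int code = 0;
    int comment = 0;
};

LineCounts classifyLines(const std::string& source) {
    LineCounts counts;
    std::istringstream in(source);
    std::string line;
    bool inBlockComment = false;  // carries /* ... */ state across lines
    while (std::getline(in, line)) {
        ++counts.physical;
        bool hasCode = false;
        bool hasComment = false;
        for (std::size_t i = 0; i < line.size(); ++i) {
            const bool space = std::isspace(static_cast<unsigned char>(line[i])) != 0;
            if (inBlockComment) {
                if (line.compare(i, 2, "*/") == 0) {
                    inBlockComment = false;
                    hasComment = true;
                    ++i;  // skip the '/'
                } else if (!space) {
                    hasComment = true;
                }
            } else if (line.compare(i, 2, "//") == 0) {
                hasComment = true;
                break;  // the rest of the line is comment text
            } else if (line.compare(i, 2, "/*") == 0) {
                inBlockComment = true;
                hasComment = true;
                ++i;  // skip the '*'
            } else if (!space) {
                hasCode = true;
            }
        }
        if (hasCode) ++counts.code;
        if (hasComment) ++counts.comment;
        if (!hasCode && !hasComment) ++counts.blank;
    }
    return counts;
}
```

Feeding it the single line `int nItems = 0; // Number of items in my list` yields one code line and one comment line from one physical line, matching the first answer above.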
 
Gianni Mariani

Greg said:
I would count it as one line of code and as a one line comment.


Three lines.

(a = 0), (b = 1), (c = 2);

What about that ?

a = f( x=1, b=3 );

.... and that.

Construct::Construct()
    : a(1),
      b(2),
      c(a+b)
{
}

.... oooh that too.

enum { a = 1, b = 2 };

and that !

int func( int a = 1, int b = 2, int * c = new int[3] );


.... and this

if ((flags & FNM_PERIOD) && *n == '.' &&
    (n == string || ((flags & FNM_FILE_NAME) && n[-1] == '/')))
    return FNM_NOMATCH;

.... or this

FormatMessage(
    FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM,
    NULL,
    dw,
    MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
    (LPTSTR) &lpMsgBuf,
    0, NULL
);

+ a bazillion other constructs.
 
Alf P. Steinbach

* Gianni Mariani:
[examples of LOC ungood]

One of the worst problems with LOC in modern development is that the
ideal refactoring of old messy code yields much lower LOC than the
original (I've reduced hundreds of lines of spaghetti to just two or
three well-crafted lines); the better a job, the higher the quality, the
more /negative/ the LOC becomes, and so dim-witted productivity measures
based on LOC will assign the most productive person a negative
productivity, and the worst person, the one producing zillions of
low-abstraction spaghetti lines e.g. by copying, pasting and modifying,
is inferred to have the "best" productivity.

As I recall function points had some similar problems, but it's long
since I studied such things.

I think, in the end any really useful measure of work that's
intelligence based, requires intelligence: it can't be done mechanically
following simple rules (with the current state of art of AI).
 
Gianni Mariani

Alf said:
* Gianni Mariani:
[examples of LOC ungood]

One of the worst problems with LOC in modern development is that the
ideal refactoring of old messy code yields much lower LOC than the
original (I've reduced hundreds of lines of spaghetti to just two or
three well-crafted lines); the better a job, the higher the quality, the
more /negative/ the LOC becomes, and so dim-witted productivity measures
based on LOC will assign the most productive person a negative
productivity, and the worst person, the one producing zillions of
low-abstraction spaghetti lines e.g. by copying, pasting and modifying,
is inferred to have the "best" productivity.

As I recall function points had some similar problems, but it's long
since I studied such things.

I think, in the end any really useful measure of work that's
intelligence based, requires intelligence: it can't be done mechanically
following simple rules (with the current state of art of AI).

The farce gets worse. I worked with one guy who had a previous job with
a large computer company. They apparently had a policy that the number
of bugs per LOC was constant and so if you did not have a count of bugs
fixed which came close to the assumed number you would be told your code
was still buggy and go find and fix some more.

Yah. Needless to say there were *many* bugs that were filed simply to
make the bug count correspond.
 
Bo Persson

Alf P. Steinbach wrote:
:: * Gianni Mariani:
::: [examples of LOC ungood]
::
:: One of the worst problems with LOC in modern development is that
:: the ideal refactoring of old messy code yields much lower LOC than
:: the original (I've reduced hundreds of lines of spaghetti to just
:: two or three well-crafted lines); the better a job, the higher the
:: quality, the more /negative/ the LOC becomes, and so dim-witted
:: productivity measures based on LOC will assign the most productive
:: person a negative productivity, and the worst person, the one
:: producing zillions of low-abstraction spaghetti lines e.g. by
:: copying, pasting and modifying, is inferred to have the "best"
:: productivity.

Of course you could have improved your productivity by keeping the
old spaghetti code as well. Or by making an extra copy of it!

Do the rules say that only code that is executed counts? :)


I know guys who wrote a "comment generator" to fulfill their manager's
policy of enough comments in the source code.


Bo Persson
 
Greg Herlihy

(a = 0), (b = 1), (c = 2);

What about that ?

Three lines. My rule would be to break down each statement into its
grammatical components, and if a (comma-separated) expression or
expression-list is found, then count each of its components
individually as one line of code (and repeat this process
recursively). Otherwise, if the statement can be broken down without
finding either an expression or expression-list, then the statement
counts as one line of code. (Or something like that; USENET doesn't pay
well enough for me to work all of these details out :))

For example, the expression statement above matches this grammatical
production in the C++ grammar:

expression-statement => expression ';'

since we have to break down the expression before we can count the
lines, we do so:

expression => expression ',' primary-expression

and so forth. Essentially we wind up with three of these

'(' assignment-expression ')'

which, when broken all the way down, contain neither an expression nor
an expression-list, so the final tally is three lines of code: one for
each non-expression component of the expression production.

a = f( x=1, b=3 );

... and that.

Three lines. Again we start with an expression-statement:

expression-statement => assignment-expression ';'

becomes

assignment-expression => identifier '=' postfix-expression

becomes

postfix-expression => identifier '(' expression-list ')'

Since we found an expression-list we can no longer count this
statement as one line of code. Instead we have to break down the
expression-list and count its components:

expression-list => assignment-expression ',' assignment-expression

So these two assignment-expressions here and the expression-statement
we started with give us three lines of code altogether.

Construct::Construct()
    : a(1),
      b(2),
      c(a+b)
{
}

Three (two literals and one additive-expression).

... oooh that too.

enum { a = 1, b = 2 };

and that !

No lines of code assessed for an enum definition.

int func( int a = 1, int b = 2, int * c = new int[3] );

No lines of code awarded to a function declaration. There is after all
no reason to penalize a program for merely declaring a routine. On the
other hand, implementing a function and calling a function do
contribute to a program's line count. The idea here is to encourage
calls to external, library routines (whose lines are not included in
the program's line count) and otherwise to discourage redundant
implementations and to promote code reuse.

Note also that the use of default arguments is encouraged here, since
default arguments often reduce the number of arguments that need to be
provided at the call site (and with each argument counting as one line
of code, the savings could be significant).

... and this

if ((flags & FNM_PERIOD) && *n == '.' &&
    (n == string || ((flags & FNM_FILE_NAME) && n[-1] == '/')))
    return FNM_NOMATCH;

Three lines. One line of code for the if-statement, one line for the
logical-and expression in its condition, and one line of code for the
return statement. Note that the complexity of the expression does not
necessarily mean that we have undercounted the "lines" of code here.

Presumably the variables appearing in the condition-expression had to
be declared somewhere - and each of those declarations would count as
one line of code (by my measure). The strategy here is to assess a
cost for declaring a variable - but not to penalize the program for
using the variable. Therefore removing unused variables is an easy and
effective way to reduce the number of lines of code assessed to a
program.

After all, the entire motivation for counting lines of code in a
program is to build awareness that code added to a program also adds
to its cost. Code, once written, is not "free": its mere presence in a
program's sources represents a cost. Therefore any line of code in a
program that does not have to be there is just money being wasted.

... or this

FormatMessage(
    FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM,
    NULL,
    dw,
    MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
    (LPTSTR) &lpMsgBuf,
    0, NULL
);

I count 10: one for the postfix-expression FormatMessage and one for each
of its seven arguments, and the postfix-expression MAKELANGID (which is
probably a macro, but I'll pretend it is a function) is assessed two
additional lines (one for each of its arguments).

+ a bazillion other constructs.

Yes. And in real life I would not expect the C++ programmer to count
the lines of code in a program by hand, as I did above (it's a little
tedious actually). Instead I would imagine that a C++ compiler would
be better suited to do the line counting - leaving the programmer free
to focus on constructive ways to bring the number of lines counted -
down.

Greg
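
A rough sketch of how part of this counting rule might be mechanised, under loud assumptions: it handles a single expression statement given as text, treats any `(` that directly follows an identifier character as a function call, and ignores string literals, declarations, constructors, and control statements (so `if (` would be misread as a call). The names `Frame` and `countLogicalLines` are invented for illustration. It charges one line for the statement, one per call argument at any nesting depth, and one per comma-operator component beyond the first, which reproduces the tallies given above for the comma expression (3), `a = f( x=1, b=3 );` (3), and the `FormatMessage` call (10), but not the other examples.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical bookkeeping for one level of parentheses.
struct Frame {
    bool isCall;    // opened by '(' directly after an identifier character
    bool nonEmpty;  // something other than whitespace was seen inside
    int commas;     // commas seen directly at this level
};

// Hypothetical helper: approximate "logical lines" of one expression statement.
int countLogicalLines(const std::string& stmt) {
    std::vector<Frame> frames;
    frames.push_back(Frame{false, false, 0});  // frame 0 is the statement itself
    int total = 1;                             // the statement counts once
    char prev = '\0';                          // last non-whitespace character
    for (char c : stmt) {
        if (std::isspace(static_cast<unsigned char>(c))) continue;
        assert(!frames.empty());
        if (c == '(') {
            frames.back().nonEmpty = true;
            const bool afterName =
                std::isalnum(static_cast<unsigned char>(prev)) || prev == '_';
            frames.push_back(Frame{afterName, false, 0});
        } else if (c == ')') {
            if (frames.size() > 1) {
                const Frame f = frames.back();
                frames.pop_back();
                if (f.isCall) {
                    if (f.nonEmpty) total += f.commas + 1;  // one per argument
                } else {
                    total += f.commas;  // comma operator inside plain parentheses
                }
            }
        } else if (c == ',') {
            ++frames.back().commas;
        } else {
            frames.back().nonEmpty = true;
        }
        prev = c;
    }
    total += frames.front().commas;  // comma operator at the top level
    return total;
}
```

A real implementation would work on the compiler's parse tree, as suggested above, rather than on raw text; the sketch only shows that the rule can be applied mechanically once the grammar questions are settled.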
 
James Kanze

I've seen it abused (as a performance metric) far more than used.

I've seen it abused, but not that often. Generally, either the
shop is into metrics, has studied the issues, and uses it
correctly, or they aren't into metrics, and don't use it, any
more than they use any other metric.

I think, however, that if you find that it is the only metric
being considered, you probably have a problem. It should be one
of a set of metrics; its value depends on the fact that "all
other things are equal", and unless mechanisms are in place to
ensure this (and such mechanisms depend on other metrics, as
well as code review, and a number of other factors), then it is
worse than useless: we all know that quality code will usually
require fewer lines than haphazardly written code.

Even for the use you quote, it can be terribly misleading; code
complexity has more impact on support cost than lines of code.

Within a given application domain, at least, complexity tends to
be constant. I suspect that this is true across a large number
of application domains as well, but I'm pretty sure that OS
kernel code will require more effort for the same number of
lines than will most application code.

A huge monolithic function may have fewer lines than a well-factored
equivalent, but it would be way more expensive to
maintain.

Obviously, if your organization allows huge monolithic
functions, then it's not mature enough to make good use of lines
of code as a metric. You do have to ensure a common coding
style, and a consistent level of quality, for it to be
meaningful.

I don't propose anything; there isn't a simple, accurate way
of measuring the complexity of a C++ application (or the
productivity of a programmer).

I don't know that it's necessarily simple, but productivity can
be measured and numerically evaluated. Otherwise, how do you
know whether you're improving your process or not?
 
James Kanze

[ lines of code ... ]
Obviously, like every metric, it can be abused, but that doesn't
mean that it's useless. (And what do you propose in its place?)

I'd suggest function points.

I'd say that it depends on what you're trying to measure. Lines
of code per function point is a very good measure of programmer
productivity, for example. (In this case, of course, less is
generally better, although you also have to insist on
readability requirements being met.)

Even when you attempt to use them as well
as possible, lines of code tend to be difficult to apply in many
situations -- just for an obvious example, the number of lines of code
to implement specific functionality often varies quite widely depending
on the implementation language.

Agreed. They're not an absolute measure, in any sense of the
word.
 
Ian Collins

James said:
I don't know that it's necessarily simple, but productivity can
be measured and numerically evaluated. Otherwise, how do you
know whether you're improving your process or not?

I tend to track productivity the XP way, by tracking the number of story
points completed in an iteration. This gives a good indication of a
team's performance, so it probably indirectly tracks code complexity as
well.
 
