Counting program lines except comments

Joona I Palaste

Alex Vinokur said:
Is there any tool to count C-program lines except comments?

Try that comment stripping program posted here, and use wc on the
result.
 
Mike Wahler

Ed Morton said:
The FAQ doesn't adequately answer this specific question

Your objection has apparently already been anticipated:
From the bottom of 18.1:

<quote>
(This list of tools is by no means complete; if you know of tools not
mentioned, you're welcome to contact this list's maintainer.)

Other lists of tools, and discussion about them, can be found in the Usenet
newsgroups comp.compilers and comp.software-eng.
</quote>

Ed Morton said:
(as the FAQ says, using
"wc" or "grep -c ';'" are both crude approximations). Try "ncsl" available for
free at http://www.lucentssg.com/displayProduct.cfm?prodid=33

Perhaps you'd consider submitting that link to the FAQ maintainer.

-Mike
 
Ed Morton

"Ed Morton" <[email protected]> wrote in message


Perhaps you'd consider submitting that link to the FAQ maintainer.

I'd like to see some other people try it first. For the past 15 years or so,
it's always worked fine for me on various UNIX platforms, but I've never tried
compiling or running it on any other platform, nor have I tried it on C99 code.
I did find that if I added a //-style comment to my code and then ran the tool
using "-lc", it didn't recognize that comment style - I had to run the
tool using "-lc++" for it to strip out //-style comments. I'm not sure whether
there are other noteworthy differences.

So, if anyone out there feels like trying it out, I'd be interested in hearing
the result, and then I can write a note with appropriate caveats (if any) to the
FAQ maintainer.

NOTE: I didn't write "ncsl" and I don't maintain it, I just use it.

Ed.

-Mike
 
Ed Morton

Alex said:
The preprocessed program includes header files.

It also includes white-space and an added indication of the source file
name. If you're on *NIX, though, and you don't want to download the
"ncsl" tool I mentioned in a previous posting, then this seems like it
should work:

sed '/#include/s/#//' foo.c |
gcc -E - |
sed -e 's/[ ]//g' -e '/^$/d' -e '1d' |
wc -l

The first sed turns "#include" into just "include" so it's ignored by
the preprocessor. Then gcc -E strips out comments and adds # 1 "<stdin>" as
the first line. The first "-e" of the second sed has a single space and a
tab within its RE bracket expression ([...]), so it deletes all spaces and
tabs, leaving no lines that contain only white-space. The second "-e"
deletes all empty lines. The third "-e" gets rid of the first line added
by gcc. Finally, wc -l counts the number of lines.
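
For anyone without sed handy, the last two stages of that pipeline (drop
white-space-only lines, then count what is left) can be sketched in C. This
is only an illustration: count_nonblank_lines is a made-up name, not part of
any tool mentioned in this thread, and it does not strip comments or remove
the line that gcc adds.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count the lines in `text` that contain at least one
   non-whitespace character. A final line without '\n' is still counted. */
size_t count_nonblank_lines(const char *text)
{
    size_t count = 0;
    int seen_nonspace = 0;   /* 1 once the current line has visible content */

    assert(text != NULL);
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '\n') {
            count += (size_t)seen_nonspace;   /* close the current line */
            seen_nonspace = 0;
        } else if (!isspace((unsigned char)*p)) {
            seen_nonspace = 1;
        }
    }
    /* The last line may lack a trailing newline. */
    return count + (size_t)seen_nonspace;
}
```

Feeding it the de-commented text gives the same kind of number as the
"sed ... | wc -l" tail of the pipeline, provided the first gcc line has
already been dropped.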

Ed.
 
Ravi Uday

Alex Vinokur said:
Is there any tool to count C-program lines except comments?

Thanks,

=====================================
Alex Vinokur
mailto:[email protected]
http://mathforum.org/library/view/10978.html
news://news.gmane.org/gmane.comp.lang.c++.perfometer
=====================================

Alex:

Check this :

/* File uncmntc.c - demo of a text filter
Strips C comments and counts number of non-commented codes.
Tested to strip itself
by Ravi Uday. 2003-08-15
Public Domain. Attribution appreciated
report bugs to <mailto:[email protected]>
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Maximum chars in any line. If this is crossed then the remaining
bytes are ignored */
#define BYTES 512

/* line is valid if it contains any of the following chars. Otherwise
treated as commented */
#define VALID_CHARS "\\abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ;,'/#{}()*+-0123456789"

int comment_handler ( FILE *fptr, char *c, char *str, int *len, int
                      *line_count)
{
    int ind = 0;
    char ch;
    if ( ( ch = fgetc ( fptr )) =='*')/* checks for beginning of comment
    */
    {
        while ( !ind )
        {
            if (fgetc ( fptr ) == '*' )
            {
                if ( fgetc ( fptr ) == '/')/* checks for end of comment */
                {
                    ind = 1;
                    *c = fgetc ( fptr);
                }
            }
        }
    }
    else if ( ch == '/')/* checks cpp comment */
    {
        while ( (!feof (fptr) ) && ( ch != '\n'))/* end of line check for \n
        */
            ch = fgetc ( fptr );

        *c = ch;
    }
    else if ( *c == '"' )/* checks for comment in a string. */
    {
        str[(*len)++] = *c;/* Store the '"' char. */
        str[(*len)++] = ch;/* Store the next char. */
        while ( !ind )
        {
            str[*len] = fgetc ( fptr );
            if ( str[*len] == '"')
                ind = 1;
            else if (str[*len] == '\n')
                (*line_count)++;

            (*len)++;
        }
        *len = *len-1;
        *c = str[*len];
        return 1;
    }
    else/* Special case: No comments found */
    {
        *c = ch;/* Storing next character */
        return 0;/* The char is single '/' */
    }
    return 1;
}

int main (int argc, char *argv[])
{
    int i = 0, j = 0, flag = 0;
    char ch = 0;
    char buffer[BYTES];/* variable holds a max of BYTES (defined) chars in
    any line */
    FILE *fp = NULL, *fout = NULL;

    if ( argc < 3 ) return EXIT_FAILURE;

    fp = fopen ( argv[1], "rb");/* open the source file */

    if ( fp == NULL ) return EXIT_FAILURE;

    fflush ( stdout);
    fout = fopen ( argv[2], "wb");/* open the output file */

    if ( fout == NULL ) return EXIT_FAILURE;

    while ( (!feof (fp )) && (!ferror (fp)) )
    {
        memset ( buffer, 0x00, BYTES);
        while ( (j != BYTES-1) && (!flag) )/* Check for max BYTES-1 chars */
        {
            ch = fgetc ( fp );

            if ( ( ch == '/') || ( ch == '"'))
                if (comment_handler ( fp, &ch, buffer, &j, &i ) == 0)
                    buffer[j++] = '/';

            if ( (ch == '\n') || (feof (fp)))
                flag = 1;

            /* dont add '\n' or '\r' to the running buffer cause its appended in
            the fprintf */
            if ( ( ch !='\n' ) && ( ch != '\r'))
                buffer[j++] = ch;
        }
        if (j == BYTES-1)/* line has more than BYTES chars, so ignore them !
        */
        {
            j++;
            while ( (!feof (fp )) && (!ferror (fp) ))
            {
                ch = fgetc(fp);
                if ( ch == '\n')
                    break;
            }
        }
        if (strpbrk (buffer, VALID_CHARS))
        {
            fprintf ( fout, "%s\n", buffer);
            i++;
        }
        j = flag = ch = 0;/* reset the loop variables. */
    }
    printf ("\n*** Number of non commented lines is : %d ***\n\n", i);

    fclose ( fp );
    fclose ( fout );

    return EXIT_SUCCESS;
}

Good luck
- Ravi
 
John Bode

Alex Vinokur said:
Is there any tool to count C-program lines except comments?

Lines or statements? A single statement may take multiple source
lines. Then again, multiple statements may occupy a single line.
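
To make the lines-versus-statements distinction concrete, here is a small C
sketch. The helper count_char and the sample text are invented for
illustration, and counting ';' is only the crude approximation the FAQ
mentions (it miscounts for-loops and string literals), not a real statement
counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count how often `ch` occurs in `text`. */
size_t count_char(const char *text, char ch)
{
    size_t n = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text)
        if (*text == ch)
            ++n;
    return n;
}

/* One statement spread over three lines, then two statements on one line. */
static const char sample[] =
    "total = first +\n"
    "        second +\n"
    "        third;\n"
    "a = 1; b = 2;\n";
```

For this sample, counting newlines gives 4 lines while counting semicolons
gives 3 statements, so the two measures already disagree on four lines of
code.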
 
Ed Morton

Alex:

Check this :

<snip>

I tried it and it miscounted the number of lines in its source program. It
reported 112 lines, whereas there are only 103. Sometimes it deletes the comments
from its output, sometimes it leaves them in (e.g. the line " else if ( *c ==
'"' )/* checks for comment in a string. */"). Sometimes it deletes blank lines,
sometimes it leaves them in (in this case it left in 9, which I assume is why
its count is off by 9). To try to get some more information, I tried deleting
the "comment_handler" function and running the tool again on that modified
source file, but this time it core dumped.

Playing with it DID uncover a problem with using "gcc -E" to strip out comments
- if there are macros defined within the source file, then of course they get
expanded so the NCSL (Non-Commentary Source Line) count doesn't actually match
the NCSL written.

Ed.
 
Ravi Uday

Ed Morton said:
<snip>

I tried it and it miscounted the number of lines in its source program. It
reported 112 lines, whereas there are only 103. Sometimes it deletes the comments
from its output, sometimes it leaves them in (e.g. the line " else if ( *c ==
'"' )/* checks for comment in a string. */"). Sometimes it deletes blank lines,
sometimes it leaves them in (in this case it left in 9, which I assume is why
its count is off by 9). To try to get some more information, I tried deleting
the "comment_handler" function and running the tool again on that modified
source file, but this time it core dumped.

Hey, comment_handler and main are interconnected, because the input
pointer gets filled in on returning from comment_handler. Don't comment it
out!!

Ed Morton said:
Playing with it DID uncover a problem with using "gcc -E" to strip out comments
- if there are macros defined within the source file, then of course they get
expanded so the NCSL (Non-Commentary Source Line) count doesn't actually match
the NCSL written.

Ed.

Ed,
Those were some interesting observations you made. Can you send me the
input file that you used, so that I can check on that?
E.g., if the file had ...

#define he "hello"/*is a
macro*/"int i;
then the number of lines in this snippet is 1 and not 2!!

Any suggestions welcome.
- Ravi
 
Mark McIntyre

The preprocessed program includes header files.

He didn't say he wanted to exclude headers.
Anyway, some snippage in an editor would solve that easily enough.
 
Ed Morton

Ravi Uday wrote:

Hey, comment_handler and main are interconnected, because the input
pointer gets filled in on returning from comment_handler. Don't comment it
out!!

I don't mean that I commented it out then tried to recompile and run it
- I compiled your code, then ran the tool on its own source, then
edited the source to try to get to a minimal set that could reproduce
the problem (but didn't recompile!) and reran the previously compiled
(i.e. original) version on the newly edited source.

Those were some interesting observations you made. Can you send me the
input file that you used, so that I can check on that?

I just did a copy-and-paste of your posted code.

By the way, I wouldn't have posted this if I weren't replying anyway, but in
case anyone wants it (e.g. Ravi - you could use it to check your code's
output), here's a shell script that'll count NCSL, getting round the
file inclusion and macro expansion problems:

_file="$1"

_sed=/whatever/bin/sed  # path to a sed that supports
                        # the "[[:space:]]" RE (e.g. GNU)

_hash="__HASH__$$__"
$_sed -e "s/#/${_hash}/g" ${_file} |
gcc -E - |
$_sed -e "s/${_hash}/#/g" \
      -e '/^[[:space:]][[:space:]]*$/d' \
      -e '/^$/d' \
      -e '1d'

Regards,

Ed.
 
Ed Morton

Ed Morton wrote:

output), here's a shell script that'll count NCSL, getting round the
file inclusion and macro expansion problems:

I meant to say it'll produce the NCSL. To count it, just pipe the output
to "wc -l".

Ed.
 
Ravi Uday

Ed Morton said:
I meant to say it'll produce the NCSL. To count it, just pipe the output
to "wc -l".

Ed,
What I did was copy the contents of your shell script into a file
and use that file as input to my parser code.

It gave me correct results (NCSL = 10, blank lines
ignored).

I have posted my corrected code below; please have a look at
it. It prints to stdout instead of to a file. You
can also print it to a file.


/* File uncmntc.c - demo of a text filter
Strips C comments and counts number of non-commented lines.
Tested to strip itself
by Ravi Uday. 2003-08-15
Public Domain. Attribution appreciated
report bugs to <mailto:[email protected]>
*/

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Maximum chars in any line. If this is crossed then
the remaining bytes are ignored */
#define BYTES 512

/* line is valid if it contains any of the following
chars. Otherwise treated as commented */
#define VALID_CHARS "\\abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ;,'/#{}()*+-0123456789"

int comment_handler ( FILE *fptr, char *c, char *str,
                      int *len, int *line_count)
{
    int ind = 0;
    char ch;

    if ( ( ch = fgetc ( fptr )) =='*')/* checks for
    beginning of comment */
    {
        while ( !ind )
        {
            if (fgetc ( fptr ) == '*' )
            {
                if ( fgetc ( fptr ) == '/')/* checks for end of
                comment */
                {
                    ind = 1;
                    *c = fgetc ( fptr);
                }
            }
        }
    }
    else if ( ch == '/')/* checks cpp comment */
    {
        while ( (!feof (fptr) ) && ( ch != '\n'))/* end of
        line check for \n */
            ch = fgetc ( fptr );

        *c = ch;
    }
    else if ( *c == '"' )/* checks for comment in a
    string. */
    {
        str[(*len)++] = *c;/* Store the '"' char. */
        str[(*len)++] = ch;/* Store the next char. */

        if ( ch == 0x27 )/* Char is a single quote "'"*/
        {
            *len = *len-1;
            *c = str[*len];
            return 1;/* It is not a string just a char. */
        }

        while ( !ind )
        {
            str[*len] = fgetc ( fptr );
            if ( str[*len] == '"')
                ind = 1;
            else if (str[*len] == '\n')
                (*line_count)++;

            (*len)++;
        }
        *len = *len-1;
        *c = str[*len];
        return 1;
    }

    else/* Special case: No comments found */
    {
        *c = ch;/* Storing next character */
        return 0;/* The char is single '/' */
    }

    return 1;
}

int main (int argc, char *argv[])
{
    int i = 0, j = 0, flag = 0;
    char ch = 0;
    char buffer[BYTES];/* variable holds a max of BYTES
    (defined) chars in any line */
    FILE *fp = NULL, *fout = NULL;

    if ( argc < 3 ) return EXIT_FAILURE;

    fp = fopen ( argv[1], "rb");/* open the source file */

    if ( fp == NULL ) return EXIT_FAILURE;

    fflush ( stdout);
    fout = fopen ( argv[2], "wb");/* open the output file
    */

    if ( fout == NULL ) return EXIT_FAILURE;

    while ( (!feof (fp )) && (!ferror (fp)) )
    {
        memset ( buffer, 0x00, BYTES);
        while ( (j != BYTES-1) && (!flag) )/* Check for max
        BYTES-1 chars */
        {
            ch = fgetc ( fp );

            if ( ( ch == '/') || ( ch == '"'))
                if (comment_handler ( fp, &ch, buffer, &j, &i ) == 0)
                    buffer[j++] = '/';

            if ( (ch == '\n') || (feof (fp)))
                flag = 1;

            /* dont add '\n' or '\r' to the running buffer cause
            its appended in the fprintf */
            if ( ( ch !='\n' ) && ( ch != '\r'))
                buffer[j++] = ch;
        }
        if (j == BYTES-1)/* line has more than BYTES chars,
        so ignore them ! */
        {
            j++;
            while ( (!feof (fp )) && (!ferror (fp) ))
            {
                ch = fgetc(fp);
                if ( ch == '\n')
                    break;
            }
        }

        if (strpbrk (buffer, VALID_CHARS))
        {
            fprintf ( stdout, "%s\n", buffer);
            i++;
        }

        j = flag = ch = 0;/* reset the loop variables. */
    }
    printf ("\n*** Number of non commented lines is : %d ***\n\n", i);

    fclose ( fp );
    //fclose ( fout );

    return EXIT_SUCCESS;
}


Thanks,
Ravi Uday
 
Ed Morton

Ravi said:
Ed,
What I did was copy the contents of your shell script into a file
and use that file as input to my parser code.

It gave me correct results (NCSL = 10, blank lines
ignored).

I take it you mean "108" rather than "10" ;-). Looks good, but there are
a couple of issues:

1) It still requires a second file name argument, even though it doesn't
write to it,
2) It should print the analysis result to stderr instead of stdout so
the user can redirect stdout to a file to capture the de-commented code
without appending the NCSL count,
3) It deletes comments instead of replacing them with a single blank (as
"gcc -E" does).

Item "3" is the important one. If I have this code:

#include "stdio.h"
#define X/* whatever */7
int main(void)
{
printf("%d\n",X);
return 0;
}

in a file, then when I run my shell script on it I end up with:

#include "stdio.h"
#define X 7
int main(void)
{
printf("%d\n",X);
return 0;
}

which compiles and runs exactly like the original, but when I run your
tool on it I get:

#include "stdio.h"
#define X7
int main(void)
{
printf("%d\n",X);
return 0;
}

which won't compile since it now defines "X7" on the second line instead
of "X".
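
One way to get the "replace each comment with a single blank" behaviour from
item 3 is a small state machine, sketched below. This is not Ravi's code or
what gcc does internally; strip_comments is a hypothetical name, the caller
is assumed to supply an output buffer of at least strlen(src) + 1 bytes, and
trigraphs and backslash-newline splicing are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>   /* strcmp, used when checking results */

/* Hypothetical helper: copy `src` to `dst`, turning every comment into one
   space. Text inside string and character literals is copied untouched. */
void strip_comments(const char *src, char *dst)
{
    enum { CODE, BLOCK_COMMENT, LINE_COMMENT, STRING, CHAR_LIT } state = CODE;
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    while (src[i] != '\0') {
        char c = src[i];
        char next = src[i + 1];          /* '\0' at the end of the input */

        switch (state) {
        case CODE:
            if (c == '/' && (next == '*' || next == '/')) {
                state = (next == '*') ? BLOCK_COMMENT : LINE_COMMENT;
                dst[o++] = ' ';          /* the whole comment becomes one space */
                i += 2;
                continue;
            }
            if (c == '"')
                state = STRING;
            else if (c == '\'')
                state = CHAR_LIT;
            dst[o++] = c;
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {
                state = CODE;
                i += 2;
                continue;
            }
            break;                       /* drop the comment text */
        case LINE_COMMENT:
            if (c == '\n') {
                state = CODE;
                dst[o++] = c;            /* keep the newline ending the comment */
            }
            break;
        case STRING:
        case CHAR_LIT:
            dst[o++] = c;
            if (c == '\\' && next != '\0') {
                dst[o++] = next;         /* copy an escaped character verbatim */
                i += 2;
                continue;
            }
            if ((state == STRING && c == '"') ||
                (state == CHAR_LIT && c == '\''))
                state = CODE;
            break;
        }
        ++i;
    }
    dst[o] = '\0';
}
```

Run on the second line of the example above, this yields "#define X 7"
rather than "#define X7", so the result still compiles.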

Ed.
 
Alex Vinokur

Alex Vinokur said:
Is there any tool to count C-program lines except comments?
[snip]

Counter of C/C++ source lines and bytes can be downloaded at :
http://alexvn.freeservers.com/s1/download.html ("Counter of C/C++ source lines and bytes").


NAME
nlcs - count C/C++ source lines and bytes

SYNOPSIS
nlcs [OPTIONS]... [FILE]...

DESCRIPTION
Count code-lines, empty-lines, comment-lines,
code-fields, empty-fields, comment-fields of C/C++-sources
which have been successfully compiled.


Summary and detailed reports are generated.

Here is a sample summary report.

--- Summary Report ---
==================================
| : Lines : Bytes |
|--------------------------------|
| Code Only : 47 : 873 |
| Code+Comment : 32 : - |
| Comment Only : 20 : 984 |
| Empty : 87 : 441 |
|................................|
| * Total : 186 : 2298 |
==================================
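
The four line categories in that report can be illustrated with a small C
classifier. This is a guess at one reasonable set of rules, not the actual
nlcs algorithm: struct line_counts and classify_lines are hypothetical names,
byte counts are not computed, and a white-space-only line inside a block
comment is (by assumption) counted as empty.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

struct line_counts {
    size_t code_only, code_and_comment, comment_only, empty;
};

/* Hypothetical helper: classify every line of `src` into one of four kinds. */
struct line_counts classify_lines(const char *src)
{
    struct line_counts n = {0, 0, 0, 0};
    int in_block = 0, in_line = 0;       /* inside a block or // comment */
    char quote = 0;                      /* '"' or '\'' inside a literal */
    int has_code = 0, has_comment = 0;   /* what the current line contains */
    size_t line_start = 0;

    assert(src != NULL);
    for (size_t i = 0; ; ++i) {
        char c = src[i];

        if (c == '\n' || c == '\0') {
            if (c == '\0' && i == line_start)
                break;                   /* nothing after the last newline */
            if (has_code && has_comment) n.code_and_comment++;
            else if (has_code)           n.code_only++;
            else if (has_comment)        n.comment_only++;
            else                         n.empty++;
            if (c == '\0')
                break;
            has_code = has_comment = in_line = 0;
            quote = 0;                   /* literals do not span lines here */
            line_start = i + 1;
        } else if (in_block) {
            if (c == '*' && src[i + 1] == '/') {
                in_block = 0;
                has_comment = 1;
                ++i;                     /* skip the '/' of the terminator */
            } else if (!isspace((unsigned char)c)) {
                has_comment = 1;
            }
        } else if (in_line) {
            has_comment = 1;
        } else if (quote) {
            has_code = 1;
            if (c == '\\' && src[i + 1] != '\0' && src[i + 1] != '\n')
                ++i;                     /* skip the escaped character */
            else if (c == quote)
                quote = 0;
        } else if (c == '/' && src[i + 1] == '*') {
            in_block = has_comment = 1;
            ++i;
        } else if (c == '/' && src[i + 1] == '/') {
            in_line = has_comment = 1;
            ++i;
        } else if (!isspace((unsigned char)c)) {
            has_code = 1;
            if (c == '"' || c == '\'')
                quote = c;
        }
    }
    return n;
}
```

The four counters always add up to the total number of lines, which matches
the way the Lines column of the sample report sums to its total.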


=====================================
Alex Vinokur
mailto:[email protected]
http://mathforum.org/library/view/10978.html
=====================================
 
