word count

jwvai316

I'm having a problem with this case. I need to count how many times each
word appears in one sentence, and I need it to be case-insensitive.

For example, when the input is

The quick brown fox jumps over the lazy dog.

the output should be

the 2
quick 1
brown 1
fox 1
jumps 1
over 1
lazy 1
dog 1

Can anyone help me solve this problem?
An example would help me a lot.

Thanks.
 
Emmanuel Delahaye

jwvai316 wrote on 15/08/05:
I'm having a problem with this case. I need to count how many times each
word appears in one sentence, and I need it to be case-insensitive.

What exactly is your question about the C language? We don't do
homework here.

Do your best and post your code. If you know nothing about C, start
from the beginning with a good C book. 'Kernighan and Ritchie' is the
standard reference.

http://cm.bell-labs.com/cm/cs/cbook/


--
Emmanuel
The C-FAQ: http://www.eskimo.com/~scs/C-faq/faq.html
The C-library: http://www.dinkumware.com/refxc.html

I once asked an expert COBOL programmer, how to
declare local variables in COBOL, the reply was:
"what is a local variable?"
 
Kenny McCormack

#!gawk
{ $0 = tolower($0); gsub(/[^a-z]+/, " "); for (i=1; i<=NF; i++) x[$i]++ }
END { for (i in x) print i, x[i] }

(followups set)
 
osmium

jwvai316 said:
I'm having a problem with this case. I need to count how many times each
word appears in one sentence, and I need it to be case-insensitive.

The question cries out for a tree. Trees are discussed in, AFAIK, all
books on data structures.
 
David Resnick

osmium said:
The question cries out for a tree. Trees are discussed in, AFAIK, all
books on data structures.

I'd have said it calls out for a hashtable mapping strings to counts.
But hey, There's More Than One Way To Do It, to borrow from another
language.

That said, since the homework question said "one sentence", any
algorithm, even a dynamic array of structs each holding a string
and a count, would be just fine...

To the OP: look at "tolower", "ispunct", and perhaps "strtok".
Figure out some way to store your lower-cased, punctuation-stripped
words with an associated count, and to search through that store
when adding a word to see whether it is a duplicate. Post what
you come up with, and you will no doubt get some help if you
have made an effort.

-David
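
A rough sketch of the approach David outlines (not his code) might look like the following. It uses a fixed-size array of structs with a linear search. The names struct entry, add_word, count_words and print_counts are made up for illustration, and MAX_WORDS and MAX_LEN are arbitrary assumed limits. Punctuation is dropped with ispunct, so a word like "don't" would be counted as "dont".

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 64   /* assumed limit: plenty for one sentence */
#define MAX_LEN   32   /* assumed limit: longer words are truncated */

struct entry {
    char word[MAX_LEN];
    int count;
};

/* Lower-case `word`, strip its punctuation, then either bump the count
   of a matching entry or append a new one. Returns the entry count. */
size_t add_word(struct entry *table, size_t n, const char *word)
{
    char key[MAX_LEN];
    size_t i, j = 0;

    for (i = 0; word[i] != '\0' && j < MAX_LEN - 1; i++) {
        unsigned char c = (unsigned char)word[i];
        if (!ispunct(c))
            key[j++] = (char)tolower(c);
    }
    key[j] = '\0';
    if (j == 0)                 /* token was nothing but punctuation */
        return n;

    for (i = 0; i < n; i++) {   /* linear search for a duplicate */
        if (strcmp(table[i].word, key) == 0) {
            table[i].count++;
            return n;
        }
    }
    if (n < MAX_WORDS) {        /* silently ignore words beyond the limit */
        strcpy(table[n].word, key);
        table[n].count = 1;
        n++;
    }
    return n;
}

/* Split `sentence` on whitespace (strtok modifies it in place) and
   count every word. Returns the number of distinct words found. */
size_t count_words(char *sentence, struct entry *table)
{
    size_t n = 0;
    char *tok;

    for (tok = strtok(sentence, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n"))
        n = add_word(table, n, tok);
    return n;
}

/* Print "word count" lines in first-seen order. */
void print_counts(const struct entry *table, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        printf("%s %d\n", table[i].word, table[i].count);
}
```

Because strtok writes into its argument, the sentence has to live in a modifiable char array, not a string literal. Calling count_words on such an array and passing the result to print_counts lists the words in the order they first appear, as in the OP's example.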
 
akarl

David said:
I'd have said it calls out for a hashtable mapping strings to counts.
But hey, There's More Than One Way To Do It, to borrow from another
language.

OK, this is already off topic, but...how do you traverse the hashtable
to display the result? (I guess the OP would actually want a sorted output.)

August
 
Randy Howard

akarl wrote:
OK, this is already off topic, but...how do you traverse the hashtable
to display the result? (I guess the OP would actually want a sorted output.)

His example didn't show the output in sorted form, so why would
you guess that?
 
CBFalconer

akarl said:
OK, this is already off topic, but...how do you traverse the
hashtable to display the result? (I guess the OP would actually
want a sorted output.)

You download my portable hashlib and compile and run the demo
wdfreq program.

<http://cbfalconer.home.att.net/download/hashlib.zip>

[1] c:\c\hashlib>wdfreq
Usage: wdfreq < inputfile > outputfile
collects all words in inputfile and outputs a
sorted (by frequency) list of words and the
frequency of their occurences, ignores case.

Signal EOF to terminate (^D or ^Z usually)
Now is the time for all good men to come to the aid of the party.
The quick brown fox jumped over the lazy hound dogs.
^Z
26 words, 21 entries, 59 probes, 18 misses
5 the
2 to
1 aid
1 all
1 brown
1 come
1 dogs
1 for
1 fox
1 good
1 hound
1 is
1 jumped
1 lazy
1 men
1 now
1 of
1 over
1 party
1 quick
1 time
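
For readers without hashlib at hand, here is a minimal sketch of the general idea: count into a hash table, then walk every bucket, collect pointers to the entries, and sort them in a separate pass. This is not hashlib's API. The chained table and the names hash_add, hash_sorted, by_frequency and NBUCKETS are assumptions made up for this example. Ties are broken alphabetically, which is chosen to match the ordering visible in the sample output above.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 101          /* assumed fixed table size, for brevity */

struct node {
    char *word;
    int count;
    struct node *next;        /* next entry in the same bucket */
};

static struct node *buckets[NBUCKETS];
static size_t nentries;

/* Simple multiplicative string hash; any reasonable hash would do. */
static unsigned hash(const char *s)
{
    unsigned h = 0;

    while (*s != '\0')
        h = h * 31u + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Count one occurrence of `word`, assumed to be lower case already.
   Returns 0 on success, -1 if memory runs out. */
int hash_add(const char *word)
{
    unsigned h = hash(word);
    struct node *p;

    for (p = buckets[h]; p != NULL; p = p->next) {
        if (strcmp(p->word, word) == 0) {
            p->count++;
            return 0;
        }
    }
    p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->word = malloc(strlen(word) + 1);
    if (p->word == NULL) {
        free(p);
        return -1;
    }
    strcpy(p->word, word);
    p->count = 1;
    p->next = buckets[h];
    buckets[h] = p;
    nentries++;
    return 0;
}

/* qsort comparator: higher count first, ties in alphabetical order. */
static int by_frequency(const void *a, const void *b)
{
    const struct node *x = *(const struct node *const *)a;
    const struct node *y = *(const struct node *const *)b;

    if (x->count != y->count)
        return (x->count < y->count) - (x->count > y->count);
    return strcmp(x->word, y->word);
}

/* Traverse the table bucket by bucket, gather the entries into a
   malloc'd array of pointers, and sort it. The caller frees the array.
   Returns NULL if the table is empty or memory runs out. */
struct node **hash_sorted(size_t *n)
{
    struct node **v;
    struct node *p;
    size_t i, k = 0;

    *n = nentries;
    if (nentries == 0)
        return NULL;
    v = malloc(nentries * sizeof *v);
    if (v == NULL)
        return NULL;
    for (i = 0; i < NBUCKETS; i++)
        for (p = buckets[i]; p != NULL; p = p->next)
            v[k++] = p;
    qsort(v, k, sizeof *v, by_frequency);
    return v;
}
```

The traversal order of the table itself is arbitrary, which is why the sort is a separate phase. Swapping in a different comparator gives alphabetical output from the same array.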
 
akarl

CBFalconer said:
akarl said:
OK, this is already off topic, but...how do you traverse the
hashtable to display the result? (I guess the OP would actually
want a sorted output.)

You download my portable hashlib and compile and run the demo
wdfreq program.

<http://cbfalconer.home.att.net/download/hashlib.zip>

OK, so you sort the items in a separate phase. With a binary search tree
(BST) you get the output sorted by word for free with an in-order
traversal. If the items don't need to be sorted, or are to be sorted
by frequency (and you must implement the sorting yourself), the BST
approach is slower, however (though simpler).

August
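
To illustrate the BST idea, here is a minimal sketch of an unbalanced tree keyed by word. The names struct tnode, tree_add, tree_inorder and tree_print are invented for this example, words are assumed to be lower-cased before insertion, and freeing the tree is left out for brevity.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct tnode {
    char *word;
    int count;
    struct tnode *left, *right;
};

/* Add one occurrence of `word` to the subtree rooted at `t` and return
   the subtree's root. If malloc fails, the word is dropped and the
   existing tree is left unchanged. */
struct tnode *tree_add(struct tnode *t, const char *word)
{
    int cmp;

    if (t == NULL) {
        struct tnode *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->word = malloc(strlen(word) + 1);
        if (n->word == NULL) {
            free(n);
            return NULL;
        }
        strcpy(n->word, word);
        n->count = 1;
        n->left = n->right = NULL;
        return n;
    }
    cmp = strcmp(word, t->word);
    if (cmp == 0)
        t->count++;
    else if (cmp < 0)
        t->left = tree_add(t->left, word);
    else
        t->right = tree_add(t->right, word);
    return t;
}

/* In-order traversal: left subtree, this node, right subtree. Appends
   node pointers to out[] starting at index n and returns the new
   count. The caller must make out[] large enough. */
size_t tree_inorder(const struct tnode *t, const struct tnode **out, size_t n)
{
    if (t == NULL)
        return n;
    n = tree_inorder(t->left, out, n);
    out[n++] = t;
    return tree_inorder(t->right, out, n);
}

/* The same traversal, printing "word count" lines in strcmp order. */
void tree_print(const struct tnode *t)
{
    if (t == NULL)
        return;
    tree_print(t->left);
    printf("%s %d\n", t->word, t->count);
    tree_print(t->right);
}
```

No sorting pass is needed: visiting the left subtree before the node and the right subtree after it yields the words in alphabetical order.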
 
CBFalconer

akarl said:
CBFalconer said:
akarl wrote:
OK, this is already off topic, but...how do you traverse the
hashtable to display the result? (I guess the OP would actually
want a sorted output.)

You download my portable hashlib and compile and run the demo
wdfreq program.

<http://cbfalconer.home.att.net/download/hashlib.zip>

[1] c:\c\hashlib>wdfreq
Usage: wdfreq < inputfile > outputfile
collects all words in inputfile and outputs a
sorted (by frequency) list of words and the
frequency of their occurences, ignores case.
.... snip usage example ...

OK, so you sort the items in a separate phase. With a binary search
tree (BST) you get the output sorted by word for free with an
in-order traversal. If the items don't need to be sorted, or are
to be sorted by frequency (and you must implement the sorting
yourself), the BST approach is slower, however (though simpler).

Try your simple binary tree with sorted input. Every insertion then
has to walk past all the earlier words, so building the tree is O(n*n).
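
To put a number on the sorted-input case, here is a small sketch that counts the strcmp calls needed to build a plain unbalanced BST. The names struct bnode, bst_insert, bst_free and bst_build_cost are invented for this example. With n distinct keys arriving in sorted order, each insertion walks the whole right-hand chain, for 0 + 1 + ... + (n-1) = n(n-1)/2 comparisons in total.

```c
#include <stdlib.h>
#include <string.h>

struct bnode {
    const char *key;          /* not copied: caller keeps the string alive */
    struct bnode *left, *right;
};

/* Insert `key` into an unbalanced BST without recursion. Returns the
   number of strcmp calls it took, or -1 if malloc fails. A key that is
   already present is found but not inserted again. */
long bst_insert(struct bnode **root, const char *key)
{
    long ncmp = 0;
    struct bnode **link = root;

    while (*link != NULL) {
        int c = strcmp(key, (*link)->key);
        ncmp++;
        if (c == 0)
            return ncmp;
        link = (c < 0) ? &(*link)->left : &(*link)->right;
    }
    *link = malloc(sizeof **link);
    if (*link == NULL)
        return -1;
    (*link)->key = key;
    (*link)->left = (*link)->right = NULL;
    return ncmp;
}

/* Free the tree. Recursive, so only suitable for small trees here:
   a degenerate tree is as deep as it has nodes. */
void bst_free(struct bnode *t)
{
    if (t == NULL)
        return;
    bst_free(t->left);
    bst_free(t->right);
    free(t);
}

/* Build a tree from keys[0..n-1] in the given order and return the
   total number of comparisons, or -1 on allocation failure. */
long bst_build_cost(const char *const *keys, size_t n)
{
    struct bnode *root = NULL;
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        long c = bst_insert(&root, keys[i]);
        if (c < 0) {
            total = -1;
            break;
        }
        total += c;
    }
    bst_free(root);
    return total;
}
```

For the four keys aid, all, brown, come inserted in that order, bst_build_cost returns 6, which is 4*3/2. Inserted as brown, all, come, aid, it returns 4, because the tree stays shallow.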
 
