Making a counter for each word's occurrences in a string

Ben Ben

Looking at the method below, there are a few things I couldn't understand.
Let me list the method first!

1) def count_frequency(word_list)
2)   counts = Hash.new(0)
3)   for word in word_list
4)     counts[word] += 1
5)   end
6)   counts
7) end

p count_frequency(["sparky", "the", "cat", "sat", "on", "the", "mat"])

----
The code above produces {"sparky"=>1, "the"=>2, "cat"=>1, "sat"=>1,
"on"=>1, "mat"=>1}

What I don't understand is why line 6 is there. Also, in line 4, it
seems counts[word] pinpoints a word and adds 1 to the counter for that
word, right? I'm confused about how Ruby works, because it's so compact
and nice that I have no idea what happens under the hood. So line 2
creates an empty hash with 0 items and stores it in counts (an object)?
The for loop then goes through each word in word_list, adding 1 to
counts[word], but I thought counts[word] produces the position of a word
and not a counter. OK, you can say I'm completely confused about the
whole code above.

Help?

Thanks in advance.
 
Ben Ben

I also forgot to ask: how on earth does Ruby know, in the end, to produce
"sparky"=>1, "the"=>2, "cat"=>1, etc.? The code above doesn't seem to
relate the counter to the word anywhere, and yet with the p command it
knows to pair each word with its number of occurrences.
 
Stefano Crocco

Line 2 creates an empty hash which uses 0 as its default value. This means that
if you ask for the value of a key which doesn't exist, it'll return 0. If this
default value hadn't been specified (that is, if that line had been

counts = Hash.new

or, equivalently,

counts = {}

), the default value would have been nil instead.
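
To see the difference, here is a small sketch. The helper name lookup_missing and the key "no such word" are made up for illustration; they are not part of the original method.

```ruby
# Hypothetical helper: look up a key that was never stored in the hash.
def lookup_missing(hash)
  hash["no such word"]
end

p lookup_missing(Hash.new(0))  # prints 0 (the default value we asked for)
p lookup_missing(Hash.new)     # prints nil
p lookup_missing({})           # prints nil

# Merely reading a missing key does not store anything in the hash.
counts = Hash.new(0)
counts["cat"]
p counts.empty?                # prints true
```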

Inside the for loop (that is, for each word in the list) the value associated
with the word in the hash is increased by one. Here you see the usefulness
of setting the default value of the hash to 0: if we hadn't done this, we would
have had to check each time whether the entry was a number or nil. This has
nothing to do with the position of the word. Each value in the hash is simply
the number of times the corresponding word has been seen in the word list.
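
For comparison, this is roughly what the method would have to look like with a plain {} hash. It is only a sketch, and the name count_frequency_without_default is made up for illustration.

```ruby
# Same counting logic, but with a plain hash whose default value is nil,
# so every word has to be checked before it can be incremented.
def count_frequency_without_default(word_list)
  counts = {}
  for word in word_list
    if counts[word].nil?
      counts[word] = 1    # first time we see this word
    else
      counts[word] += 1   # the word already has a count
    end
  end
  counts
end

p count_frequency_without_default(["the", "cat", "the"])
```

With Hash.new(0) the whole if/else collapses into the single line counts[word] += 1.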

For example, suppose the word list is:

['a', 'b', 'a', 'c', 'b', 'a']

Here's how the hash (which is initially empty) changes over the
iterations:

First iteration (word: 'a'): {'a' => 1}
Second iteration (word: 'b'): {'a' => 1, 'b' => 1}
Third iteration (word: 'a'): {'a' => 2, 'b' => 1}
Fourth iteration (word: 'c'): {'a' => 2, 'b' => 1, 'c' => 1}
Fifth iteration (word: 'b'): {'a' => 2, 'b' => 2, 'c' => 1}
Sixth iteration (word: 'a'): {'a' => 3, 'b' => 2, 'c' => 1}
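
If you want to watch this happen yourself, a sketch like the following records a copy of the hash after every iteration. The method name trace_counts is made up for illustration.

```ruby
# Returns an array with one snapshot of the counts hash per iteration.
def trace_counts(word_list)
  counts = Hash.new(0)
  snapshots = []
  for word in word_list
    counts[word] += 1
    snapshots << counts.dup   # dup, so later changes don't alter the snapshot
  end
  snapshots
end

trace_counts(['a', 'b', 'a', 'c', 'b', 'a']).each_with_index do |snapshot, i|
  puts "Iteration #{i + 1}: #{snapshot.inspect}"
end
```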

The last line of the method, counts, is there because in Ruby a method returns
the value of the last expression it evaluates (unless the "return" keyword is
used). Without that last line, the last expression would be the "for" loop,
which returns the object we're iterating over (in this case, the word list).
We're not interested in returning the word list, however: we need the word
counts, which are stored in the variable counts. By putting a line containing
only the name of the variable at the end of the method, we make sure that the
value contained in the variable is returned. If it's clearer to you, you can
think of the last line as if it were:

return counts

In this case, the return keyword has no effect, since we're already at the end
of the method. However, it may make the meaning of the expression clearer.
(The return keyword more or less means: stop executing the method immediately
and return to the caller the value given to return, or nil if return is called
without arguments.)
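
Here is a sketch of the difference. Both method names are made up for illustration; the only change between them is the final line.

```ruby
# Without the final line, the for loop is the last expression,
# so the method hands back the word list it iterated over.
def count_without_last_line(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
end

# With the final line, the method hands back the hash of counts.
def count_with_last_line(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
  counts
end

words = ["the", "cat", "the"]
p count_without_last_line(words)  # prints the word list
p count_with_last_line(words)     # prints the hash of counts
```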

I hope this helps

Stefano
 
Phillip Gawlowski

Ben said:
I also forgot to ask: how on earth does Ruby know, in the end, to produce
"sparky"=>1, "the"=>2, "cat"=>1, etc.? The code above doesn't seem to
relate the counter to the word anywhere, and yet with the p command it
knows to pair each word with its number of occurrences.

I modified your script a little:

def count_frequency(word_list)
  counts = Hash.new(0)
  for word in word_list
    counts[word] += 1
  end
  puts "counts' class: #{counts.class}"
  puts "inspect counts: #{counts.inspect}"
  puts "counts' Hash keys: #{counts.keys.join("; ")}"
  counts
end

p count_frequency(["sparky", "the", "cat", "sat", "on", "the", "mat"])

Output:
c:\Scripts>ruby word_freq.rb
counts' class: Hash
inspect counts: {"mat"=>1, "cat"=>1, "sat"=>1, "the"=>2, "on"=>1,
"sparky"=>1}
counts' Hash keys: mat; cat; sat; the; on; sparky
{"mat"=>1, "cat"=>1, "sat"=>1, "the"=>2, "on"=>1, "sparky"=>1}


The mystery is solved in line 4:
counts[word] += 1
which tells Ruby to use the current word (the value of the variable "word")
as the hash key. If the key doesn't exist yet, the default value 0 is used,
so the key is created with a count of 1. Further occurrences increment the
count (obviously enough).
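
It may help to know that counts[word] += 1 is shorthand for counts[word] = counts[word] + 1. A minimal sketch of the long form follows; the helper name add_occurrence is made up for illustration.

```ruby
# Long form of `counts[word] += 1`: read the current value (or the
# default 0 if the key is missing), add 1, and store it under the key.
def add_occurrence(counts, word)
  counts[word] = counts[word] + 1
  counts
end

counts = Hash.new(0)
add_occurrence(counts, "the")   # key "the" is created with the value 1
add_occurrence(counts, "the")   # existing value 1 becomes 2
add_occurrence(counts, "cat")

p counts["the"]   # prints 2
p counts["cat"]   # prints 1
p counts.keys     # prints ["the", "cat"]
```

The assignment is the step that ties each word to its count: the word is the key and the count is the value stored under it.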
 
Ben Ben

Stefano said:
In this case, the return keyword has no effect, since we're already at the end
of the method. However, it may make the meaning of the expression clearer.
(The return keyword more or less means: stop executing the method immediately
and return to the caller the value given to return, or nil if return is called
without arguments.)

I hope this helps

Stefano

Thanks Stefano, I got the idea! Your explanation of line 6 especially made
it clearer to me. So whatever the for loop did before line 6 is all stored
in memory, and then line 6 returns, in one go, just the values that were
added to the hash a moment ago, right? I hope that is what you mean.
 
Phillip Gawlowski

Ben said:
Thanks Stefano, I got the idea! Your explanation of line 6 especially made
it clearer to me. So whatever the for loop did before line 6 is all stored
in memory, and then line 6 returns, in one go, just the values that were
added to the hash a moment ago, right? I hope that is what you mean.

Close. "counts", or, by extension, "return counts", hands the result of
the method back to the caller. It includes all the results the method
has produced (in your example, the word count).

To illustrate with a metaphor:
A method is like a production line at a factory.

It takes raw materials (input), and produces a good (a result). This
production line can range from something simple, like making nails out of
metal bits (a word count), to a whole car (a webapp, or a desktop application).
 
Ben Ben

Phillip said:
Close. "counts", or, by extension, "return counts", hands the result of
the method back to the caller. It includes all the results the method
has produced (in your example, the word count).

To illustrate with a metaphor:
A method is like a production line at a factory.

It takes raw materials (input), and produces a good (a result). This
production line can range from something simple, like making nails out of
metal bits (a word count), to a whole car (a webapp, or a desktop application).

Thanks Phillip, I got it now. Awesome! Ya, it is going to take me a
while before I can truly know Ruby and programming in general.
 
Phillip Gawlowski

Ben said:
Thanks Phillip, I got it now. Awesome! Ya, it is going to take me a
while before I can truly know Ruby and programming in general.

You are welcome. :)
 
Benoit Daloze

Hi,

the "inject" way to do what you want is:

["sparky", "the", "cat", "sat", "on", "the", "mat"].inject(Hash.new(0)) { |c, w| c[w] += 1; c }
=> {"sparky"=>1, "the"=>2, "cat"=>1, "sat"=>1, "on"=>1, "mat"=>1}

That's very short to write, isn't it?

Ruby is awesome :D
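
As a side note, a similar one-liner can be sketched with each_with_object, which passes the hash along for you so the block doesn't need the trailing "; c". The method name below is made up for illustration. Newer Ruby versions (2.7 and later, as far as I know) also offer Array#tally, so that call is guarded here in case it isn't available.

```ruby
# each_with_object yields (element, memo) and returns the memo object,
# so the block only has to do the increment.
def count_with_each_with_object(words)
  words.each_with_object(Hash.new(0)) { |word, counts| counts[word] += 1 }
end

words = ["sparky", "the", "cat", "sat", "on", "the", "mat"]
p count_with_each_with_object(words)

# Array#tally does the counting in a single call where it exists.
p words.tally if words.respond_to?(:tally)
```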
 
