RubyDictionary - First Try

M

Max Schmidt

Hello folks,

Last week I started learning Ruby by the book The Ruby Programming
Language if it's relevant. Today I decided to practise a little bit with
my new knowledge by trying to develop my first "serious" application in
Ruby. The idea was to create a small virtual dictionary based on CLI
with a few self-explaining functions like inserting/removing/search for
a data set or save/load it in a *.txt file.

Since this is my first attempt to a script language I wanted to know
what you think of my program. Are there any passages which could be
written more efficient in performance? In general, is the coding
style/commenting OK? What can I improve regarding OOP? Furthermore,
there are some commented in-code questions on line 26,93 and 99 - can
you answer them maybe?

Thanks for any help,

A Ruby novice

Attachments:
http://www.ruby-forum.com/attachment/4634/rdict.rb
 
J

Jesús Gabriel y Galán

Hello folks,

Last week I started learning Ruby by the book The Ruby Programming
Language if it's relevant. Today I decided to practise a little bit with
my new knowledge by trying to develop my first "serious" application in
Ruby.

Welcome to Ruby and very good job. You achieved very nice things.
The idea was to create a small virtual dictionary based on CLI
with a few self-explaining functions like inserting/removing/search for
a data set or save/load it in a *.txt file.

One question: when I think dictionary, I think this:
http://en.wikipedia.org/wiki/Associative_array, in which what is
unique is the keys, not the pairs. Your implementation allows to
insert two pairs with the same key:

irb(main):159:0* d = Dictionary.new
=> #<Dictionary:0xb7be8db0 @entries=[]>
irb(main):161:0> d.insert_words "aa", "bb"
=> [#<DictionaryEntry:0xb7bdd8fc @word2="bb", @word1="aa">]
irb(main):162:0> d.insert_words "aa", "cc"
=> [#<DictionaryEntry:0xb7bdd8fc @word2="bb", @word1="aa">,
#<DictionaryEntry:0xb7bd6494 @word2="cc", @word1="aa">]

and this looks a bit strange. But, anyway, I'll continue with the
assumption that this is what you really want.
If this is not the case, and you want unique keys, you will have to
change many things.
Since this is my first attempt to a script language I wanted to know
what you think of my program. Are there any passages which could be
written more efficient in performance? In general, is the coding
style/commenting OK? What can I improve regarding OOP? Furthermore,
there are some commented in-code questions on line 26,93 and 99 - can
you answer them maybe?

Some comments, mostly idiomatic:

1.-
def initialize(entries=nil)
if entries==nil
@entries = Array.new
else
@entries = entries
end
end

this could be done like this:

def initialize(entries=nil)
@entries = entries || [] # or Array.new if you prefer
end

2.- def insert_words: I would call the method:

def []=(word1, word2) which allows you to say d["word"] = "second word"
Nice syntactic sugar, although this goes more in line of what I said
about the dictionary being a key-value set.

3.-
def self.is_word? word
if (word.match(/[a-zA-Z]{2,15}/)[0].length == word.length) then true
else false
end
# (word.match(/[a-zA-Z]{2,15}/)[0].length == word.length)
# easier way?
end

Why don't you allow one letter words? There's a problem with this
implementation though: if the string doesn't match you get a

irb(main):160:0> d.insert_words "a", "b"
NoMethodError: undefined method `[]' for nil:NilClass
from (irb):94:in `is_word?'

because match will return nil, and you are calling [0] on that.

def self.is_word? word
word.match(/\A[:alpha:][:alpha:]+\z/) #you can remove the first
[:alpha:] to allow 1-letter words
end

4.- def insert_dict_entry(entry)
insert_words(entry.word1, entry.word2) # any better solution?
end

Another way, not necesarily better:

def insert_dict_entry(entry)
@entries << entry.dup
end

5.- def insert_array(words1, words2)
# arrays have to have same length
raise ArgumentError, "Invalid array arguments (not same length)"
unless words1.length==words2.length

# iterate on words1 and try to insert every pair
words1.each_index {|x| insert_words(words1[x], words2[x])}
end

I would do:

def insert_array(words1, words2)
words1.zip(words2).each {|first,second| insert_words(first,second)
if first && second}
end

this way, even if the arrays have different size, you store as much as
you can. If you still want the check you can add it.

6.- def remove(r_word1, r_word2)

def remove(word1, word2)
@entries.delete_if {|entry| entry == [word1,word2]} # if you keep
the enumerable equality check, see below
end

7.- def remove_at(index)

This one doesn't make sense for the public interface of a dictionary.
If you implement remove as above, you don't need it anymore.

8.- def search s_word

def search word
@entries.select {|entry| entry.word1.include?(word) ||
entry.word2.include?(word)}
end

9.- def has_word_pair?(word1, word2)

Why do you need the index as a return value?

def has_word_pair?(word1,word2)
@entries.find {|entry| entry == [word1,word2]}
end

10.- to_s

def to_s
s = "Current state of Dictionary (#{self.object_id})\n"
@entries.each_with_index {|entry, i| s << sprintf("%3d %15s |
%15s\n", i, entry.word1 , entry.word2)
s
end

DictionaryEntry

When you create a dictionary entry, you might want to dup the words to
avoid aliasing:

def initialize (word1, word2)
@word1 = word1.dup unless word1.frozen?
@word2 = word2.dup unless word2.frozen?
end

11.-
def ==(other)
if (other.is_a?(Enumerable) && @word1 == other[0] && @word2 == other[1])
true
elsif (other.is_a?(DictionaryEntry) && @word1 == other.word1 &&
@word2 == other.word2) # not tested
puts "dict entry"
true
else
false
end
end

Enumerable doesn't provide the method [], so if you are allowing a
equality test against an Enumerable you should only use Enumerable
methods. For example:

def ==(other)
case other
when Enumerable
@word1 == other.to_a[0] && @word2 == other.to_a[1]
when DictionaryEntry
@word1 == other.word1 && @word2 == other.word2
else
false
end
end

I'm not sure you want to keep the equality comparison with an
Enumerable, because I think you added it more as a convenience than as
a real requirement. If you remove it, some of my suggestions above
should be reviewed.

I don't have time now to comment on the CLI interface, but I think
that could be refactored a little bit too.


Jesus.
 
M

Max Schmidt

Hello and thank you for this fast widespread answer!
Your implementation allows to insert two pairs with the same key
This totally was my intention, because one word can be translated in
many different ways. Associative Arrays are called "Hashs" in Ruby,
aren't they?

1 -
def initialize(entries=nil)
@entries = entries || [] # or Array.new if you prefer
end
Nice idea, I never would consider a boolean operator as a expression,
which returns anything else than "true" or "false"

2 -
def []=(word1, word2)
Don't you think that this definition is rather confusing, if Dictionary
is NOT an associative array? I have to admit that I am still not
familiarized with these method-like operators...

3 -
Why don't you allow one letter words?
Because strings with one letter aren't any words, they are characters.
My aim was to develop a dictionary where you can organize "real-life"
words
if the string doesn't match you get a
irb(main):160:0> d.insert_words "a", "b"
NoMethodError: undefined method `[]' for nil:NilClass
from (irb):94:in `is_word?'

Ok, fixed it like this:
match = word.match(/[a-zA-Z]{2,15}/);
if (match!=nil && match[0].length == word.length) then true
def self.is_word? word
word.match(/\A[:alpha:][:alpha:]+\z/) #you can remove the first
[:alpha:] to allow 1-letter words
end

I wanted to limit the word's length to 15 as you can see in my regular
expression. Are the "\A" (start position) and "\z" (end position)
necessary?

4 -
Another way, not necesarily better:
def insert_dict_entry(entry)
@entries << entry.dup
end

I should have written the question more accurate. The problem was to
check the entry for compability by invoking insert_words() before
appending to @entries

I considered making a special function for that, like "is_insertable?",
but I was still unsure about that

5 -
this way, even if the arrays have different size, you store as much as
you can. If you still want the check you can add it.

Hmm, I want the length-check remain.
def insert_array(words1, words2)
words1.zip(words2).each {|first,second| insert_words(first,second)
if first && second}
end

Why do you prefer to a implementation which produces a temporaly array?

6 -

7 -
This one doesn't make sense for the public interface of a dictionary.
If you implement remove as above, you don't need it anymore.

This was meant to be a time-saver. You can output the Dictionary by
typing "o/output", find the index of the entry you want to delete and
then apply it to the "r/remove" - command

8 -

Very cool!

9 -
Why do you need the index as a return value?
for "remove" - I first searched for the word pair and if found I removed
the pair by calling remove_at(found_index)
@entries.find {|entry| entry == [word1,word2]}
I slowly realize that I will have to clearly go through the Array
methods this afternoon.

10 -
@entries.each_with_index {|entry, i| s << sprintf("%3d %15s |
%15s\n", i, entry.word1 , entry.word2)

This means it is actually never recommended to use
@entries.length.times {|i| ... }
to iterate through an array?

DictionaryEntry

11 -
you might want to dup the words to
avoid aliasing
Do you mean to avoid exceptions when the words are frozen? If yes, I
would throw an Exception if either of the two words is frozen.

12 -
Enumerable doesn't provide the method []
Ok, I was convinced that Enumerables behave like Arrays
I'm not sure you want to keep the equality comparison with an
Enumerable
... then I should replace the Enumberable with i.e. an Array like this?
case other
when Array
@word1 == other[0] && @word2 == other[1]
when DictionaryEntry
@word1 == other.word1 && @word2 == other.word2
else
false
end

Again, thanks for your effort
 
J

Jesús Gabriel y Galán

Hello and thank you for this fast widespread answer!

This totally was my intention, because one word can be translated in
many different ways. Associative Arrays are called "Hashs" in Ruby,
aren't they?

Yes, but the term Dictionary is a common technical term with that
meaning, that's why I was asking...
1 -
def initialize(entries=3Dnil)
=A0 =A0@entries =3D entries || [] =A0 # or Array.new if you prefer
end
Nice idea, I never would consider a boolean operator as a expression,
which returns anything else than "true" or "false"

2 -
def []=3D(word1, word2)
Don't you think that this definition is rather confusing, if Dictionary
is NOT an associative array? I have to admit that I am still not
familiarized with these method-like operators...

Yes, you are right. That syntactic sugar makes more sense when it's an
associative array, not in your case.
3 -
Because strings with one letter aren't any words, they are characters.
My aim was to develop a dictionary where you can organize "real-life"
words

"a" is a word :)
if the string doesn't match you get a
irb(main):160:0> d.insert_words "a", "b"
NoMethodError: undefined method `[]' for nil:NilClass
=A0from (irb):94:in `is_word?'

Ok, fixed it like this:
match =3D word.match(/[a-zA-Z]{2,15}/);
if (match!=3Dnil && match[0].length =3D=3D word.length) then true
def self.is_word? word
=A0word.match(/\A[:alpha:][:alpha:]+\z/) #you can remove the first
[:alpha:] to allow 1-letter words
end

I wanted to limit the word's length to 15 as you can see in my regular
expression. Are the "\A" (start position) and "\z" (end position)
necessary?

It's a way of saying that the full string should match your
restrictions. It avoids having to check that the match is equal to the
length of the original string:

def self.is_word? word
word.match(/\A[:alpha]{2,15}\z/)
end

should do what you want.
5 -
this way, even if the arrays have different size, you store as much as
you can. If you still want the check you can add it.

Hmm, I want the length-check remain.
def insert_array(words1, words2)
=A0 words1.zip(words2).each {|first,second| insert_words(first,second)
if first && second}
end

Why do you prefer to a implementation which produces a temporaly array?

6 -

7 -
This one doesn't make sense for the public interface of a dictionary.
If you implement remove as above, you don't need it anymore.

This was meant to be a time-saver. You can output the Dictionary by
typing "o/output", find the index of the entry you want to delete and
then apply it to the "r/remove" - command

8 -

Very cool!

9 -
Why do you need the index as a return value?
for "remove" - I first searched for the word pair and if found I removed
the pair by calling remove_at(found_index)
@entries.find {|entry| entry =3D=3D [word1,word2]}
I slowly realize that I will have to clearly go through the Array
methods this afternoon.

10 -
@entries.each_with_index {|entry, i| s << sprintf("%3d %15s |
%15s\n", i, entry.word1 , entry.word2)

This means it is actually never recommended to use
@entries.length.times {|i| ... }
to iterate through an array?

It's usually not recommended. If you really, really need the index
each_with_index handles that.
DictionaryEntry

11 -
avoid aliasing
Do you mean to avoid exceptions when the words are frozen? If yes, I
would throw an Exception if either of the two words is frozen.

No, what I mean is that the client can modify the String object after
inserting it in the dictionary:

d =3D Dictionary.new
w1 =3D "hello"
w2 =3D "world"
d.insert_words(w1,w2)
w1 << "something crazy which maybe should be in the dictionary at this poin=
t"

The string referenced by w1 is modified after being inserted. With
your solution, you store the same object, and so the dictionary is
modified by this external action.

The frozen check is just an optimization, because if the string is
frozen it's not going to be modified by the client, and so you are
safe just using it.
12 -
Enumerable doesn't provide the method []
Ok, I was convinced that Enumerables behave like Arrays
I'm not sure you want to keep the equality comparison with an
Enumerable
... then I should replace the Enumberable with i.e. an Array like this?

I don't know, depends on what your use case is. If it's just for
convenience, I think you should drop it, because it's giving those
objects a behaviour that they shouldn't have.
case other
when Array
=A0@word1 =3D=3D other[0] && @word2 =3D=3D other[1]
when DictionaryEntry
=A0@word1 =3D=3D other.word1 && @word2 =3D=3D other.word2
else
=A0false
end

Another idea: if want to see the dictionary as a set of things, a
common thing is to include Enumerable and provide an each method, so
that the clients of the dictionary can use the cool methods Enumerable
provides. In your case:

class Dictionary
include Enumerable

def each (&blk)
@entries.each &blk
end
end

that way people could do:

d =3D Dictionary.new
#populate it somehow
d.find {|entry| entry.word1 =3D=3D "hello"}

and all the rest of the cool stuff Enumerable provides.

Jesus.
 
M

Max Schmidt

3 -
"a" is a word :)
This is a good argument, I was obviously too focused on my native
language :)
if (match!=nil && match[0].length == word.length) then true
def self.is_word? word
�word.match(/\A[:alpha:][:alpha:]+\z/) #you can remove the first
[:alpha:] to allow 1-letter words
end

I wanted to limit the word's length to 15 as you can see in my regular
expression. Are the "\A" (start position) and "\z" (end position)
necessary?

It's a way of saying that the full string should match your
restrictions. It avoids having to check that the match is equal to the
length of the original string:

def self.is_word? word
word.match(/\A[:alpha]{2,15}\z/)
end

should do what you want.
Oh, now I understand. Checking the length of the original string was a
work-a-round, because I did not know about this regexp feature :)

No, what I mean is that the client can modify the String object after
inserting it in the dictionary:
You surely mean the DictionaryEntry
@word1 = word1.dup unless word1.frozen?
@word2 = word2.dup unless word2.frozen?

The frozen check is just an optimization, because if the string is
frozen it's not going to be modified by the client, and so you are
safe just using it.
If you assume that word1 is frozen, then @word1 would be assigned to
nil, wouldn't it? Or is the unless refering to the - and exclusively -
to the ".dup"?
If it's just for
convenience, I think you should drop it, because it's giving those
objects a behaviour that they shouldn't have.

Do you mean that the boolean equality operator (==) normally exclusively
compares with the class it is defined by?
Another idea: if want to see the dictionary as a set of things, a
common thing is to include Enumerable and provide an each method, so
that the clients of the dictionary can use the cool methods Enumerable
provides.
Nice idea, I implemented this.

Last question:
I posted this code on an other ruby forum and was criticized about my
coding style: Indents are always done with two whitespaces rather than
with tabs like I have. Have you heard anything about such a "unwritten
rule"?

Thanks
 
J

Jesús Gabriel y Galán

This is a good argument, I was obviously too focused on my native
language :)

Sorry about that ! You know your requirements :)
You surely mean the DictionaryEntry

If you assume that word1 is frozen, then @word1 would be assigned to
nil, wouldn't it? Or is the unless refering to the - and exclusively -
to the ".dup"?

No, I screwed up and you are right. I'd do it something like:

@word1 = word1.frozen? ? word1 : word1.dup

Do you mean that the boolean equality operator (==) normally exclusively
compares with the class it is defined by?

I think generally yes. I'm not sure if it's a hard rule, though, and
in some cases it might make sense. If you do that, I would certainly
include Enumerable in your class, so that then both are Enumerable, so
you might argue that the comparison really makes sense :)

Last question:
I posted this code on an other ruby forum and was criticized about my
coding style: Indents are always done with two whitespaces rather than
with tabs like I have. Have you heard anything about such a "unwritten
rule"?

Yeah, the typical convention is to use two spaces instead of tabs. But
criticized? Come on, it's a convention and you said you were new to
Ruby...
The reason I dislike tabs most is that I cannot copy and paste code
with tabs into irb, cause it shows me the current directory listing,
like a completion in the shell.

So, in my opinion, if you want to share code with the Ruby community,
I think you should try to adapt to the general conventions, it makes
it easier for everybody.

Jesus.
 
P

Peter Hickman

[Note: parts of this message were removed to make it a legal post.]

Last question:
I posted this code on an other ruby forum and was criticized about my
coding style: Indents are always done with two whitespaces rather than
with tabs like I have. Have you heard anything about such a "unwritten
rule"?
This is a religious issue, (flame)wars have started over less.
 
J

Jesús Gabriel y Galán

This is a religious issue, (flame)wars have started over less.

Ah, I thought it was pretty well-established...
Well, then, Max, what I said before may not hold true.
I still rather not have tabs because what I said about irb, but I'm
certainly not religious about it...

Jesus.
 
J

Josh Cheek

2010/3/31 Jes=FAs Gabriel y Gal=E1n said:
1.-
def initialize(entries=3Dnil)
if entries=3D=3Dnil
@entries =3D Array.new
else
@entries =3D entries
end
end

this could be done like this:

def initialize(entries=3Dnil)
@entries =3D entries || [] # or Array.new if you prefer
end

How about
def initialize( entries =3D Array.new )
@entries =3D entries
end

BTW, I think it is better to use setters and getters than to directly acces=
s
the ivars, because if you restrict access to the method, then you only have
to go to one place if you change implementation.
 
J

Jesús Gabriel y Galán

2010/3/31 Jes=FAs Gabriel y Gal=E1n said:
1.-
=A0 =A0 =A0 =A0def initialize(entries=3Dnil)
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0if entries=3D=3Dnil
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0@entries =3D Array.new
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0else
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0@entries =3D entries
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0end
=A0 =A0 =A0 =A0end

this could be done like this:

=A0 =A0 =A0 =A0def initialize(entries=3Dnil)
=A0 =A0 =A0 =A0 =A0 @entries =3D entries || [] =A0 # or Array.new if you= prefer
=A0 =A0 =A0 =A0end

How about
def initialize( entries =3D Array.new )
=A0@entries =3D entries
end

Yep, that's cleaner.
BTW, I think it is better to use setters and getters than to directly acc= ess
the ivars, because if you restrict access to the method, then you only ha= ve
to go to one place if you change implementation.

Sounds good to me, enter the Self Encapsulate Field refactoring (yes,
I'm now reading Fowler's book :)

attr_accessor :entries

def initialize(entries =3D Array.new)
self.entries =3D entries
end

Although maybe you want the accessor to be private at first, unless
your public interface calls for that.

Jesus.
 
F

Florian Gilcher

=20
This is a religious issue, (flame)wars have started over less.

I was the one who made that comment, because it was the first thing that =
jumped right at me, especially when viewing the code in out code =
highlighter (that uses 4 spaces per tab).

This may be a religious issue, but thats the case with every coding =
standard. I never criticize people for their use of do ... end, {} and =
such, but "2 spaces, no tabs" is such widely used in the ruby world that =
you can indeed call it a common standard and good practice.

It is also not an "unwritten rule", but referred to in many guides, for =
example:

http://www.caliban.org/ruby/rubyguide.shtml#indentation
http://github.com/chneukirchen/styleguide/blob/master/RUBY-STYLE
http://www.pathf.com/blogs/2008/10/elements-of-ruby-style/

... as well as many open repositories to be found on github.

As Max explicitly asked for Cosmetics/Style, and this is one of the =
biggest flaws of his program in this regard, so I had to mention it.

Regards,
Florian

--
Florian Gilcher

smtp: (e-mail address removed)
jabber: (e-mail address removed)
gpg: 533148E2
 
M

Max Schmidt

Hello,
As Max explicitly asked for Cosmetics/Style, and this is one of the
biggest flaws of his program in this regard, so I had to mention it.

I promise to chance this bad style in my next ruby program.
Although maybe you want the accessor to be private at first, unless
your public interface calls for that.

No, @entries should be private (it already is), you are right.

Good night
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,159
Messages
2,570,879
Members
47,413
Latest member
ReeceDorri

Latest Threads

Top