Ensuring uniqueness of an object at creation time

Abinoam Jr.

[Note: parts of this message were removed to make it a legal post.]

Hi all,

I would like to ensure that certain attributes of an object are unique across
all instances of its class.
That is, I would like to prevent the creation of an instance whose
attributes are "equal" to those of an already created instance.

As an example only: creating an instance of a Person
class that has the same name as an existing instance.

My simple person class would be:

http://pastie.org/1610255

I found a way by overriding the Person.new method
(thanks to the book "The Ruby Programming Language"),
so that the new instance is not even allocated if there's already one with
the same name.

http://pastie.org/1610260
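
The linked code is not reproduced in this thread. As a minimal sketch of the idea described above, and not the original pastie, overriding `new` could look roughly like the following. The `@registry` hash and the `DuplicateNameError` class are names assumed here for illustration.

```ruby
class Person
  # Hypothetical error class, raised when a name is already taken.
  class DuplicateNameError < StandardError; end

  # Class-level instance variable mapping name => Person (assumed storage).
  @registry = {}

  attr_reader :name

  def self.new(name)
    # Check before calling super, so a duplicate is never allocated.
    raise DuplicateNameError, "#{name} already exists" if @registry.key?(name)
    @registry[name] = super
  end

  def initialize(name)
    @name = name
  end
end

first = Person.new("Alice")
begin
  Person.new("Alice")
rescue Person::DuplicateNameError => e
  puts e.message
end
```

Because the check runs before `super`, neither `allocate` nor `initialize` is reached for a duplicate name.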

QUESTIONS:

1) Could this be considered a poor design choice?
2) What other ways are there to accomplish this?

Any opinions, tips, or criticism are welcome.

Some more comments:
Doing it this way lets me treat object creation in a generic way.
The logic behind the uniqueness of the instances is held by the class
itself, not by the (running) code.
This is desirable for me in this specific setting because I'm parsing an XML
file with tags in a recursive manner.
I have a Hash that maps tags to classes.
The uniqueness requirements of each class are different.
For example: Person is unique by name, Telephone is unique by number.
The parser only:
1) Takes the tag
2) Looks up the class that is mapped to the tag
3) Creates the instance using the attributes and text as arguments to new
4) Leaves the "when" and "where" of the uniqueness validation to the class
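
The steps above can be sketched as follows. This is only an illustration: the `TAG_CLASSES` table, the `build` helper, and the Struct stand-ins are assumptions, not code from the real parser.

```ruby
# Stand-in classes; the real ones would carry their own uniqueness rules.
Person    = Struct.new(:name)
Telephone = Struct.new(:number)

# Hypothetical tag-to-class table.
TAG_CLASSES = {
  "person"    => Person,
  "telephone" => Telephone
}.freeze

# Steps 1 to 3: take the tag, look up its class, hand the text to new.
def build(tag, text)
  TAG_CLASSES.fetch(tag).new(text)
end

p build("person", "Alice")
p build("telephone", "555-0100")
```

The parser never needs to know which attribute makes a class unique; an unknown tag fails loudly with a KeyError from `fetch`.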

Thanks in advance,
Abinoam Jr.

PS:
My real problem is a little different (not Person or Telephone).
I used the Person class just as an example.
If anyone is curious about the code, tell me and I will post the whole thing.
 
Brian Candler

Abinoam Jr. wrote in post #984123:
I found my way through overriding the Person.new method.
(Thanks "The Ruby Programming Language" book).
So that the new instance is not even allocated if there's already one
with
the same name.

There's no need to do that. You could just raise an exception from
within the initialize method.
The logic behind the uniqueness of the instances is hold by the class
itself, not by the (running) code.
This is desirable for me in this specific set because I'm parsing an xml
file with tags in a recursive manner.

I think it's a poor design choice to enforce uniqueness within the
class, because it limits the usefulness of your Person class - you could
not have two XML parsers parsing two separate documents, for example, or
send and receive Person instances using DRb.

I think it would be better to have a 'person collection' object which
enforces the uniqueness. You create a new person, and get an error if
you try to add it into the collection where one already exists.

This is the same sort of model as you get with SQL uniqueness
constraints within a table, of course.
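
A minimal sketch of such a collection, assuming a plain Person class with a name attribute. The names `PersonCollection`, `add`, and `DuplicateError` are hypothetical.

```ruby
class Person
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

# Hypothetical collection that owns the uniqueness rule.
class PersonCollection
  class DuplicateError < StandardError; end

  def initialize
    @by_name = {}
  end

  # Adds the person, or raises if the name is already present here.
  def add(person)
    if @by_name.key?(person.name)
      raise DuplicateError, "#{person.name} is already in this collection"
    end
    @by_name[person.name] = person
  end

  def [](name)
    @by_name[name]
  end
end

doc_a = PersonCollection.new
doc_b = PersonCollection.new
doc_a.add(Person.new("Alice"))
doc_b.add(Person.new("Alice")) # fine: a separate collection
```

Each collection enforces uniqueness only within itself, so two parsers working on two documents do not interfere with each other.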

Regards,

Brian.
 
Abinoam Jr.

Hi Brian,

Thank you very much for replying.

Abinoam Jr. wrote in post #984123:

There's no need to do that. You could just raise an exception from
within the initialize method.

Person#initialize is called by Person.new.
Person.allocate is called _before_ Person#initialize.

So, if I raise an exception in initialize, the object has already been
allocated.
I think it's a poor design choice to enforce uniqueness within the
class, because it limits the usefulness of your Person class - you could
not have two XML parsers parsing two separate documents, for example, or
send and receive Person instances using DRb.

I think it would be better to have a 'person collection' object which
enforces the uniqueness. You create a new person, and get an error if
you try to add it into the collection where one already exists.

This is the same sort of model as you get with SQL uniqueness
constraints within a table, of course.

Regards,

Brian.

I think I was not clear enough. (I tried to simplify it, and ended up
OVERsimplifying it.)

Look at this xml snippet.

<messages>
  <message>PREPARE_A</message>
  <message>COMPLETE_A</message>

  <message>PREPARE_B</message>
  <message>COMPLETE_B</message>

  <message>PREPARE_C</message>
  <message>COMPLETE_C</message>
</messages>

After declaring all those messages, I just want to use them in my
rule/action table.

<rule id="active_prepare_b">
  <pre>
    <current_state>active</current_state>
  </pre>
  <post>
    <send_message>PREPARE_B</send_message>
    <next_state>completing</next_state>
  </post>
</rule>

Look at the <send_message>PREPARE_B</send_message>.
This "PREPARE_B" message is the same as the previously declared one.
I'm just "using" it.
In this specific case of <message>, its "uniqueness" is based on the message text.
(There are other classes whose uniqueness is based on something different.)
"If it smells like a dog, it should be a dog" (or "THAT specific" dog).
If it's a message and has the same text, it should be the SAME message
(not a new one).
So, I don't want to raise an exception, I just want to return the
existing object without even allocating a new one.

This kind of behavior lets me design my parser in a
"generic"/"agnostic" manner.
I just need a 'table' mapping XML tags to classes.
The parser just gets the tag, sees which class should be instantiated,
calls <class>.new, and moves on to the next tag.
All the logic to ensure uniqueness is up to the class.

But I feel I'm forgetting something.

What do you think?

a = :prepare_b
b = :prepare_b

With Symbol, if it has the same value, it's the SAME Symbol, not different ones.

a = "prepare_b"
b = "prepare_b"

With String, even if they have the same value, they are different objects.

I would like to mimic/extend this kind of behaviour for more general objects.
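
The difference can be checked with `equal?`, which compares object identity. This small sketch uses `String.new` so that the two strings are guaranteed to be separate objects regardless of any frozen-string-literal setting.

```ruby
a = :prepare_b
b = :prepare_b
puts a.equal?(b)    # true: one Symbol object

s1 = String.new("prepare_b")
s2 = String.new("prepare_b")
puts s1 == s2       # true: same content
puts s1.equal?(s2)  # false: two String objects
```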

Thank you again,
Abinoam Jr.
 
Gary Wright

a = :prepare_b
b = :prepare_b

With Symbol, if it has the same value, it's the SAME Symbol, not
different ones.

Don't get hung up on using the default new/initialize framework. You
can define your own constructors that return existing instances instead
of allocating a new instance if you want to model 'value' semantics.

message = Message.find_or_create('message_a', other, args)

Then define find_or_create such that it manages a cached collection of
messages that it can dip into if it finds a match or it can create a new
message if necessary.

In general I think it is better to define your own constructors if the
'normal' semantics of new/initialize aren't what you want instead of
redefining new. You can always make new private if you want to 'force'
the use of your own constructors.
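
A minimal sketch of this approach, assuming the message text alone is the cache key (the call above passes extra arguments, which are left out here). The `@cache` hash is an assumed implementation detail.

```ruby
class Message
  attr_reader :text

  # Cache of instances keyed by text (class-level instance variable).
  @cache = {}

  # Returns the cached message for this text, creating it on first use.
  def self.find_or_create(text)
    @cache[text] ||= new(text)
  end

  def initialize(text)
    @text = text
  end

  # Callers outside the class must go through find_or_create.
  private_class_method :new
end

m1 = Message.find_or_create("PREPARE_B")
m2 = Message.find_or_create("PREPARE_B")
puts m1.equal?(m2) # true: the same object is returned both times
```

With `new` private, `Message.new("PREPARE_B")` from outside the class raises NoMethodError, so the cache cannot be bypassed by accident.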

If the only attribute of your messages is their unique name, then
symbols might be exactly what you need but be aware that symbols are
generally not garbage collected so that if you create them based on
external data you might be opening yourself to a memory exhaustion
attack (i.e. the attacker can cause your memory footprint to grow
without bounds). Whether this is a concern or not just depends on where
the external data is coming from.

Gary Wright
 
Abinoam Jr.

Hi Gary,

It's pretty elegant using a name such as "find_or_create" for the
factory method. It really describes its behavior much better than
"new".
And setting "new" as private, good point too.

In my specific piece of software I just have to give every class a
"find_or_create" method, even if it doesn't have any uniqueness
constraints. This is because my parser is "agnostic", so I don't want
it to decide where to use "new" or "find_or_create". It just has to
use "find_or_create" for every object it tries to create.

Thank you for your comments,
Abinoam Jr.

 
Brian Candler

Abinoam Jr. wrote in post #984287:
Person#initialize is called by Person.new
The Person.allocate is called _before_ Person#initialize.

So, if I raise an exception at initialize point the object was already
allocated.

Yes, but it will be garbage-collected later.
So, I don't want to raise an exception, I just want to return the
existing object without even allocating a new one.

Then you could make a class method:

class MyClass
  @all_objects = {}

  # non-threadsafe version
  def self.create(args)
    return @all_objects[args] if @all_objects[args]
    @all_objects[args] = new(args)
  end
end
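
For comparison, a thread-safe variant could guard the lookup and the insert with a Mutex. This is a sketch under assumptions: the `@lock` variable and the `initialize` shown here are not from the post above.

```ruby
class MyClass
  @all_objects = {}
  @lock = Mutex.new

  attr_reader :args

  def initialize(args)
    @args = args
  end

  # Mutex-guarded variant: lookup and insert happen as one step.
  def self.create(args)
    @lock.synchronize do
      @all_objects[args] ||= new(args)
    end
  end
end

threads = 8.times.map { Thread.new { MyClass.create("PREPARE_A") } }
objects = threads.map(&:value)
puts objects.uniq(&:object_id).size # 1: every thread got the same object
```
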
a = :prepare_b
b = :prepare_b

With Symbol, if it has the same value, it's the SAME Symbol, not
different ones.

a = "prepare_b"
b = "prepare_b"

With String, even if they have the same value, they are different
objects.

That's correct. And normal objects are like String; Symbol is very much
a special case, baked into the language, to give efficient method
dispatch.

Why is it important in your application for your objects to have the
singleton behaviour like Symbol? What bad things would happen if there
were two objects representing the same message?

Regards,

Brian.
 
Brian Candler

Abinoam Jr. wrote in post #984295:
In my specific piece of software I just have to make all the class a
"find_or_create" method, even if it doesn't have any uniqueness
constraints. This is because my parser is "agnostic", so I don't want
it to decide where to use "new" or "find_or_create". It has just to
use "find_or_create" in every object it tries to create.

One option:

class Object
  def self.find_or_create(*args,&blk)
    new(*args,&blk)
  end
end

Then you override it in those classes where you need to.
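
As an illustration of the override, here is a sketch with two hypothetical classes: `Rule` keeps the default, and `Message` replaces it with a cache keyed by text. Both classes and the `@cache` hash are assumptions.

```ruby
class Object
  # Default from the post above: behave exactly like new.
  def self.find_or_create(*args, &blk)
    new(*args, &blk)
  end
end

# Hypothetical class with no uniqueness rule; it keeps the default.
class Rule
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Hypothetical class that overrides the default with a cache.
class Message
  attr_reader :text

  @cache = {}

  def self.find_or_create(text)
    @cache[text] ||= new(text)
  end

  def initialize(text)
    @text = text
  end
end

r1 = Rule.find_or_create("active_prepare_b")
r2 = Rule.find_or_create("active_prepare_b")
m1 = Message.find_or_create("PREPARE_B")
m2 = Message.find_or_create("PREPARE_B")
puts r1.equal?(r2) # false: a new Rule each time
puts m1.equal?(m2) # true: the cached Message
```

Note that reopening Object adds the method to every class in the program, which may or may not be acceptable in a larger code base.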
 
Abinoam Jr.

One option:
class Object
  def self.find_or_create(*args,&blk)
    new(*args,&blk)
  end
end

Then you override it in those classes where you need to.

Good! Thank you.

Abinoam Jr.
 
Abinoam Jr.

So, if I raise an exception at initialize point the object was already
allocated.
Yes, but it will be garbage-collected later.

You're right. In my specific piece of software I think this will not be a problem.
But, if it's a huge one, the computational cost of allocating and
deallocating can be important.
# non-threadsafe version

Thank you for advising me to always think in a "thread-safe" way.
Why is it important in your application for your objects to have the
singleton behaviour like Symbol? What bad things would happen if there
were two objects representing the same message?

Look, there are some problems I have worked around by defining a (class)#==
method.
The "==" method relies on comparing the instance variables
(@message, for example).
So, even if there are 2 instances representing the same message, they
would be considered equal to each other throughout the software.
But... again, I thought it could be a little ugly as code.
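
A minimal sketch of that workaround, assuming a single `text` attribute. Defining `eql?` and `hash` alongside `==` is an addition here, so that Hash keys and Array#uniq agree with `==`.

```ruby
class Message
  attr_reader :text

  def initialize(text)
    @text = text
  end

  # Two messages are equal when their text matches.
  def ==(other)
    other.is_a?(Message) && text == other.text
  end

  # Keep Hash lookups and Array#uniq consistent with ==.
  alias eql? ==

  def hash
    text.hash
  end
end

a = Message.new("PREPARE_B")
b = Message.new("PREPARE_B")
puts a == b          # true: equal by value
puts a.equal?(b)     # false: still two objects
puts({ a => 1 }[b])  # 1: b finds the entry stored under a
```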

Abinoam Jr.
 
Avdi Grimm

You're right. In my specific piece of software I think this will not be a problem.
But, if it's a huge one, the computational cost of allocating and
deallocating can be important.

And you've already proven that allocation is a significant factor in
your program's runtime efficiency using rigorous profiling... right?
 
