Managing metadata about attribute types


Simon Kitching

Hi,

I'm porting the Apache Jakarta Commons Digester (written in Java) to
Ruby at the moment. This module processes xml in a rules-based manner.
It is particularly useful for handling complex xml configuration files.

However some of the very nice features of this module depend upon being
able to introspect a class to find what attributes it has, and what
their datatypes are.

Finding attributes on a Ruby class is simple (just look for "attr="
methods). Unfortunately, determining what object types it is valid to
assign to that attribute is not so simple...

I was wondering if there were any other Ruby projects which have faced
this problem and come up with solutions? I would rather steal a solution
than invent one :)

Example of problem:

Input xml is:
<stock>
<stock-item name="spanner" cost="12.50"/>
<stock-item name="screwdriver" cost="3.80"/>
</stock>

// java
class StockItem {
    public void setName(String name) { .... }
    public void setCost(float cost) { .... }
}

# Ruby
class StockItem
  attr_accessor :name
  attr_accessor :cost
end

In the Java version, when the "cost" attribute is encountered in the XML
input, Digester sees that the target class has a setCost(float) method, so
the string "12.50" is converted to a float before the setCost method is
invoked.

I want to achieve the same effect in the Ruby version. I do *not* want
to effectively invoke this in Ruby:

  stock_item.cost=('12.50')  # string passed
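For concreteness, the desired effect can be sketched in a few lines of Ruby. The ATTR_TYPES table and assign_from_xml helper here are hypothetical names, invented purely for illustration; the point is that the framework, not StockItem, does the conversion:

```ruby
# Hypothetical sketch: the digester, not StockItem, owns the conversion.
class StockItem
  attr_accessor :name, :cost
end

# per-attribute conversion metadata the digester would consult
ATTR_TYPES = { :cost => :to_f, :name => :to_s }

def assign_from_xml(obj, attr, raw)
  conv = ATTR_TYPES.fetch(attr, :to_s)
  obj.send("#{attr}=", raw.send(conv))  # convert first, then call the setter
end

item = StockItem.new
assign_from_xml(item, :cost, '12.50')
item.cost  # => 12.5 (a Float, not the String '12.50')
```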


Anyone have any references to "pre-existing art"???

Thanks,

Simon
 

Ryan Pavlik


I was wondering if there were any other Ruby projects which have faced
this problem and come up with solutions? I would rather steal a solution
than invent one :)

Yep. I'll take this opportunity to shamelessly plug some modules. I
do this in Mephle. First, I use the StrongTyping module and write new
attr_ functions, so I can do this:

attr_accessor_typed String, :foo, :bar

Now #foo= and #bar= complain if they don't get a String. Even better,
I can use StrongTyping's type querying on foo= and bar= to get what
they take, if I need to.
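A plain-Ruby approximation of such a typed accessor might look like the following. This is not StrongTyping's actual implementation; the macro body and the attr_type query method are assumptions for illustration:

```ruby
# Hypothetical sketch of a typed, queryable accessor macro.
class Module
  def attr_accessor_typed(type, *syms)
    @attr_types ||= {}
    syms.each do |sym|
      @attr_types[sym] = type
      attr_reader sym
      define_method("#{sym}=") do |val|
        unless val.is_a?(type)
          raise TypeError, "#{sym} wants #{type}, got #{val.class}"
        end
        instance_variable_set("@#{sym}", val)
      end
    end
  end

  # query interface: what type does this attribute take?
  def attr_type(sym)
    @attr_types && @attr_types[sym]
  end
end

class Widget
  attr_accessor_typed String, :foo, :bar
end

w = Widget.new
w.foo = "ok"            # accepted: it's a String
Widget.attr_type(:foo)  # => String, queryable metadata
```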

Next, I use the MetaTags module for actually tagging what attributes
exist:

class_info <<-DOC
!Class: Foo

!attr foo: Foo
!attr bar: Bar: This is an optional description of bar.
DOC
class Foo
  :
end

Now I can ask for information about the Foo class and look through the
attributes that way.

It works... I generate UIs from this information... and I'm working on
some tools to eliminate redundancy and required typing.

hth,
 

Ara.T.Howard

Date: Wed, 5 Nov 2003 09:38:16 +0900
From: Simon Kitching <[email protected]>
Newsgroups: comp.lang.ruby
Subject: Managing metadata about attribute types

<snip>


i have a similar problem for parsing header information from satellite data, eg

....
samples per scanline: 7322
organization: band interleaved by scanline
....

i want the first to be a Fixnum and the second to be a String. i know it's
simple, but i've taken this approach:

def samples_per_scanline= arg
  @samples_per_scanline =
    case arg
    when String
      raise ArgumentError.new(arg) unless arg =~ FLOAT_PAT
      arg.to_f
    when Numeric
      arg.to_f
    else
      raise TypeError.new(arg.class.to_s)
    end
end

this can be useful when you want to ensure that an @attr is of a certain type,
but want to allow _different_ types in the call to set it...

i'm not sure if modules like StrongTyping allow this. perhaps they do.


i suppose one could automate this somehow using a class_eval.

~/eg/ruby > cat float_attr.rb
def Object.float_attr sym
  template = <<-'template'
    def %s= arg
      float_pat = %%r/^\s*[+-]?\d+(?:\.\d*)?\s*$/o
      case arg
      when String
        raise ArgumentError.new(arg.to_s) unless arg =~ float_pat
        @%s = arg.to_f
      when Numeric
        @%s = arg.to_f
      else
        raise TypeError.new(arg.class.to_s)
      end
      printf "%s = %%s\n", arg
    end
    def %s; @%s; end
  template
  code = template % [sym, sym, sym, sym, sym, sym]
  class_eval code
end

class C
  float_attr :foo
end

c = C.new
c.foo = 42
p c.foo


~/eg/ruby > ruby float_attr.rb
foo = 42
42.0


anyhow - i see the desire for strong typing of attributes, but my personal
opinion is that strong typing of method signatures is a bit anti-ruby: better
to munge inside the method than outside it, lest we all tumble into c-- hell.

__IMHO__

-a
--

ATTN: please update your address books with address below!

===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
| STP :: http://www.ngdc.noaa.gov/stp/
| NGDC :: http://www.ngdc.noaa.gov/
| NESDIS :: http://www.nesdis.noaa.gov/
| NOAA :: http://www.noaa.gov/
| US DOC :: http://www.commerce.gov/
|
| The difference between art and science is that science is what we
| understand well enough to explain to a computer.
| Art is everything else.
| -- Donald Knuth, "Discover"
|
| /bin/sh -c 'for l in ruby perl;do $l -e "print \"\x3a\x2d\x29\x0a\"";done'
===============================================================================
 

Ryan Pavlik

Hi Ryan,

Thanks for your reply.

MetaTags (http://raa.ruby-lang.org/list.rhtml?name=metatags) may be what
I was looking for. I'll try to figure out exactly what it does over the
next few days.

Hope you find it useful. You may still want to couple it with
strongtyping, as this provides a really convenient way to do what you
want (check desired types), but you can do it with metatags alone.

I've thought about doing this, in fact, for "documenting" builtin
classes and their methods before. You shouldn't have a problem
modifying the existing method_info or class_info tagsets to handle
this.
Mephle looks very interesting .. might have to look into it later on.

It's another can of worms. I should have an app or two that uses it
coming out soon, though.

ttyl,
 

Robert Klemme

Simon Kitching said:
Hi,

I'm porting the Apache Jakarta Commons Digester (written in Java) to
Ruby at the moment. This module processes xml in a rules-based manner.
It is particularly useful for handling complex xml configuration files.

However some of the very nice features of this module depend upon being
able to introspect a class to find what attributes it has, and what
their datatypes are.

Finding attributes on a Ruby class is simple (just look for "attr="
methods).

That's not the best way. Better to use "obj.instance_variables".
Unfortunately, determining what object types it is valid to
assign to that attribute is not so simple...

It's impossible.
I was wondering if there were any other Ruby projects which have faced
this problem and come up with solutions? I would rather steal a solution
than invent one :)

Then why not postprocess the output of YAML on dumping and convert the XML
to YAML on reading?
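The idea can be sketched with the standard yaml library, which types scalars by itself; the XML-to-YAML rewriting step is omitted here and would have to be hand-rolled:

```ruby
require 'yaml'

# YAML assigns Ruby types to scalars, so once the XML attributes have
# been rewritten as YAML, the values come back already converted.
doc = <<-YAML
name: spanner
cost: 12.50
YAML

item = YAML.load(doc)
item['cost']   # => 12.5, a Float
item['name']   # => "spanner", a String
```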

Regards

robert
 

Simon Kitching

What a vigorous discussion I seem to have triggered :)

It's really nice to see so many people interested in the topic - thanks
to all who replied.

Rather than reply individually to the several emails which raise
interesting points, I'll try to gather all the different bits here.

I really am interested in the points raised, and am definitely still in
the learning phase with Ruby. So all of the statements below should
really be prefaced with "I think", "It seems to me", "Do you think that
... is right?". However that would double the size of this email. Please
assume all below is tentative, and that comments/corrections are
welcome.

And I hope those people who "really wanted to stay out..." don't stay
out and chip in. I'm interested at the very least!

====
Re "xml-config" module, raised by Austin: there are significant
differences between the "xml-config" and "xmldigester" approaches.

Which is "better" will depend on circumstances and developer taste. The
most significant differences are:
(a)
Xml-config first builds a complete representation of the input xml in
memory, then starts extracting data. For large xml files this is not a
good idea. Xmldigester is "event-based", so the input xml does not have
to be completely loaded into memory.
(b)
I *believe* that the xmldigester rules-based approach will take less
client code, and will bind the "parsing" program less tightly to the API
of the objects being built than the xml-config approach.
(c)
If building inter-object relationships that are more complex than simple
parent/child references, the xml-config approach may prove easier. There
are some tricks that can be played with xmldigester (eg a common
"registry" object used to resolve relations), but having the entire xml
document available (DOM-style) can allow things that an event-style
approach cannot.
(d)
The xmldigester event-based approach is likely to be faster.

Of course the best test of all the above opinions is actually to create
the code, then compare the two. Once I have xmldigester knocked into
reasonable shape, I might port the xmldigester examples to xml-config
(and vice-versa) to see if any of the above is true!

Regardless of the results, I think that both approaches have their
place.

====

Regarding whether the target class should be responsible for accepting a
string and doing the conversion...

I think it is definitely *not* the receiving class's responsibility to
do the conversion.

Here's my original class, with the initial implicit assumptions spelled
out more clearly as comments.

# Ruby
class StockItem
  # contract with user: any value assigned to name must
  # act like a string
  attr_accessor :name

  # contract with user: any value assigned to cost must act
  # like a float object.
  attr_accessor :cost
end

Isn't this a valid API for a class to provide? As far as the author of
StockItem is concerned, cost is a float.

I don't see why the author of StockItem should even consider the
possibility that a string could be assigned to cost; that would violate
the contract of the class, so any program that does so can damn well
take the consequences :). The StrongTyping module can enforce this, but
perhaps does so over-eagerly, as it doesn't allow "duck typing", ie
objects which aren't Float objects but do behave like them.

Now I happen to want to configure this object based on some string-typed
information. But that's my problem, not the problem of the author of the
StockItem class. And if I wanted to use ASN.1 format input to initialise
an instance of StockItem, then it is still my responsibility to convert
the ASN.1 representation to an appropriate type before assigning to
cost, not the StockItem class' responsibility to understand ASN.1 format
input.

Ok, with Ruby's "open" classes, I can alias the original cost= method
and insert some wrapper code. But I will have to restore the original
method after parsing is complete, otherwise during the real "running"
part of the program, the StockItem's cost= method won't behave like
other classes expect it to.
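The alias-and-restore approach just described can be sketched as follows; it works, but it illustrates the awkwardness. The orig_cost= alias name is arbitrary:

```ruby
class StockItem
  attr_accessor :cost
end

# parsing phase: wrap cost= so it converts strings
class StockItem
  alias_method :orig_cost=, :cost=
  def cost=(v)
    self.orig_cost = v.to_f
  end
end

item = StockItem.new
item.cost = '12.50'  # wrapper converts the String
item.cost            # => 12.5

# after parsing: restore the original plain accessor
class StockItem
  alias_method :cost=, :orig_cost=
end
item.cost = 3.5      # back to normal assignment semantics
```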

Not to mention that writing those "conversion" methods by hand is ugly.
You're right, they shouldn't. But if your warehouse management
classes don't do what they can to ensure their data integrity, then
there's a problem with the classes -- not with the XML library. I'm
not trying to be difficult here; just pointing out that I think
you're trying to fix the problem from the wrong end.

The StockItem's contract clearly states that it only accepts Float types
for the cost attribute. It doesn't actually need to enforce its data
integrity - it is the calling code's responsibility to use StockItem
correctly.
attr_accessor proc { |x| x.to_i }, :item_id

That's some very cool code. I can feel my brain expanding just by
looking at it! However I don't feel it does what I want, because this
code actually changes the API of the target class, breaking all other
code that accesses that same attribute thereafter.

The data conversion clearly has to be done somewhere, but I would like
it to be done separately from the target class so as not to muck around
with its API.

Here's the "conversion" code extracted out into a helper class:

class StockItemHelper
  def StockItemHelper.cost=(stock_item, str)
    stock_item.cost = str.to_f
  end
end

In fact, why not use the Java convention and call it StockItemBeanInfo?

Applying a modified version of your attr_accessor code, this could be
written more succinctly as the following, generating effectively the
same code as shown above:

class StockItemBeanInfo
  attr_from_str :name, String
  attr_from_str :cost, Float
end

However I can also use something like Ryan's MetaTag syntax to write
this. I'm not sure which syntax is more convenient.

desc = <<-END
!class StockItem
!name String
!cost Float
END

# parse the string and dynamically create a wrapper class
beanInfo = createBeanInfoClass(desc)

# because cost was declared as a Float in the MetaTag string,
# the beanInfo class knows to convert the second (string) param
# to a float.
beanInfo.set_cost(stock_item, '3.50')


As you can see, I'm not interested in "type strictness" at all.
What I need is simply "what type of object should I generate in order to
be able to validly assign to cost without violating the API contract of
the StockItem class"...

Changing the StockItem class contract is one solution, but that screws
up all other code that really depended on the original contract being
valid.


Oh, and what if the target attribute is a "Date" class, and I want to
globally control the way string->date mapping works? If it is
distributed across every class that has a Date attribute that is much
trickier to handle than if I somehow know that classes X, Y and Z have
date attributes and the xmldigester code does the string->date
conversions before the assignment.
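One way to get that global control is a single converter registry that the digester consults, so the string-to-Date rule lives in exactly one place. The registry shape and the date format below are assumptions:

```ruby
require 'date'

# One registry, consulted by the digester for every typed attribute.
# Changing the Date lambda here changes string->Date mapping globally.
CONVERSIONS = {
  Float => lambda { |s| s.to_f },
  Date  => lambda { |s| Date.strptime(s, '%Y-%m-%d') }
}

def convert(type, str)
  CONVERSIONS.fetch(type).call(str)
end

convert(Float, '12.50')      # => 12.5
convert(Date, '2003-11-05')  # => the Date 2003-11-05
```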

====
From Ryan:

Using #to_* methods is the ruby equivalent of type casting. The
point in this case is not to _convert_ types, it's to provide the
right type in the first place. Instead of giving the attribute a
string and expecting it to be parsed, we want to create the desired
type and hand that off.

It has nothing to do with the #attr= function. Strict type checking
at that point is merely a convenience. It's all about getting the
input into a useful format without writing n^2 functions (for n
classes). This is the primary reason I wrote StrongTyping in fact;
the strict checking has merely helped with debugging a whole lot.

Yep, that's exactly how I see it.

However I don't want the "type enforcement at runtime" feature of
StrongTyping, and I want to avoid changing the target class' behaviour
in any significant way. Is it possible to get the "type info" part of
StrongTyping without the "type enforcement"?

====

The thread about namespaces still has me pondering a little.
I'm not sure it's relevant to my issue, though, is it?

I need to *instantiate* an object in order to assign it to an attribute
on a target object. So I do need to know the name of a concrete class to
instantiate. There's no "duck typing" there, is there?

====

Thanks Ryan, Chad, Austin, Richard, James, David, Christoph (phew!)


As said in the intro, all comments/corrections welcome!


Regards,

Simon
 

Ryan Pavlik

I believe that's by design; as I understand it, the StrongTyping
module performs parameter gatekeeping based exclusively on the
class/module ancestry of an object (the namespaces to which it
belongs, as Rich and Chad were discussing), not on what the object
actually can do. This means, as you say, that objects which might fit
the bill may not get through, if their class/module ancestry is wrong,
and also that objects which do not fit the bill can get through -- for
example:
<snip>

This is the fundamental philosophical disagreement, or
miscommunication, or what have you. If an object fits the bill, and
its class/ancestry is wrong, then there is an error in design.
It should not be the case that this happens, or you have found an
error in your code.

I realize not all of Ruby is documented in this manner; that's a
simple matter to change. A few smaller modules would solve this; for
instance, Set, HashedSet, IndexedSet, etc. Array would be an
IndexedSet; modules such as CGI would include HashedSet. Then you
could ask for the simple behavioral pattern you desire, and know that
you have it. You would further be assured that this #[] means what
you want it to.

This isn't really any different than duck typing, except you're just
making sure that it really does quack, it doesn't just have a bill.
 

Ryan Pavlik

At this point you're waging a battle directly against the design of
Ruby. Ruby allows you to extend objects at runtime; to decide that
this is sloppy or wrong or a second-rate programming technique is
entirely arbitrary.

Not at all. The fact you _can_ do something doesn't mean you must do
it all over the place. Nor does the fact you _can_ extend objects have
any bearing on whether your class hierarchy documents the behavioral
pattern well.
When you check whether or not an object's response to #is_a? includes
an element you've specified, that's the one and only thing you're
checking. It's not as if Ruby somehow pulls up its socks and
straightens its tie and says "Better stop this dynamic stuff!" when it
sees a call to #is_a? It doesn't; it remains dynamic, and the
programming techniques required to ascertain the interface of an
object do not change. (Besides, its socks and tie were just fine to
start with :)

You seem to have the preconception that #is_a? and "this dynamic
stuff" are mutually exclusive in some way. This is not the case. The
fact you're asking #is_a? just means you're asking if at some point it
had a parent that was related to this class.

I see this as both a fundamental problem of blindly using #to_*
methods and a factor that has limited the perception of typing.
When you call #to_f, or #to_s, you get a Float or a String... not a
subclass. (In fact, this is often used to work around having
singleton strings, or not-quite-strings.) Don't confuse this with
actually _having_ a String subclass, which _is_ a String, but (likely)
with some additional behavior.

The primary goal of inheritance is dynamic extension: the ability for
you to add something later to the code and have it fall into the
existing structure.

Consider:

* When writing code, you will never specifically want a type you
  do not know about.

* When using the code later, you can provide a specific subclass
  unknown to the original code.

* Thus, when type checking, you are not limited in any manner,
  because you are asking for what you want (generally), and
  getting it (specifically).

Dynamicism doesn't fall out of the picture here at all. (In fact, you
could in theory construct ST expect() statements dynamically, although
it would be odd.)
I realize not all of Ruby is documented in this manner; that's a
simple matter to change. A few smaller modules would solve this; for
instance, Set, HashedSet, IndexedSet, etc. Array would be an
IndexedSet; modules such as CGI would include HashedSet. Then you
could ask for the simple behavioral pattern you desire, and know that
you have it. You would further be assured that this #[] means what
you want it to.

I think I must be not getting something here; it sounds like you're
suggesting that every possible behavior of every future object in
every Ruby program be anticipated by modules in the core distribution,
with multi-barreled names to suit the purpose. I'm thinking that
can't really be what you mean.

You've got it backwards. The core code should not have every possible
pattern; this is silly. It should provide the ones it uses and needs.
Future classes, if they provide an interface that is useful to this
code, they should document it by including the proper pattern.

For example (let def_abstract define a method that raises
SubclassResponsibility if unimplemented):

module Hashable
  def_abstract :[], :[]=, :has_key?, :keys, :has_value?, :values
end

def show_hash(h)
  expect h, Hashable

  :
end

:

class SomeOddClass
  include Hashable

  def [](k)
    :
  end

  :
end

Now the code knows that SomeOddClass is, in fact, Hashable. If you
fail to implement one of the functions necessary, it will complain.
This is only one example; you could make things a bit finer-grained.
Perhaps have a HashAccessed that defines #[] and #[]=, and a subclass
that defines the rest. I am not sure that level of granularity is
useful, but it is possible.
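def_abstract itself can be sketched in a few lines. SubclassResponsibility and def_abstract are taken as given by the example above; this particular implementation is an assumption:

```ruby
# Hypothetical sketch of def_abstract: each listed method raises
# SubclassResponsibility until a subclass actually implements it.
class SubclassResponsibility < StandardError; end

class Module
  def def_abstract(*syms)
    syms.each do |sym|
      define_method(sym) do |*args|
        raise SubclassResponsibility,
              "#{self.class} must implement ##{sym}"
      end
    end
  end
end

module Hashable
  def_abstract :[], :[]=, :has_key?, :keys, :has_value?, :values
end

class GoodHash
  include Hashable
  def initialize; @h = {}; end
  def [](k); @h[k]; end       # implemented: overrides the abstract stub
  def []=(k, v); @h[k] = v; end
end

g = GoodHash.new
g[:a] = 1
g[:a]      # => 1, the implemented method works
# g.keys   # would raise SubclassResponsibility: not yet implemented
```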
Also, remember that you're never 100% assured that #[] or any other
method means what you want it to. It's ducks all the way down :)
(http://www.the-funneled-web.com/hawking.htm) Every method call
operates under the same conditions as every other. You can light a
candle, dance a jig, call #is_a? twenty times... but in the end,
obj#[] is whatever obj#[] is. You can't change Ruby for a few
nanoseconds at a time through sheer willpower.

This is only correct when blindly assuming that if #[] exists, it must
act like you want. Having the class/module tie gives you a semantic
that further assures you it _does_ act like you want.

This can, of course, be broken. Many things can be broken in ruby,
but it is not desirable to do so.
Hence the quest to harness the dynamism, rather than wishfully think
that it comes and goes.

There is no lack of dynamicism here. Nor does type checking make the
ability to extend classes on the fly go away or become less useful.
I rely heavily on the strongtyping module in Mephle; I also do some
modification to many base classes. Just because something is a
singleton, does not mean it's also not of the original class.

<snip more type-checking-is-static-typing misconceptions>
 

John W. Long

Ryan,
You've got it backwards. The core code should not have every possible
pattern; this is silly. It should provide the ones it uses and needs.
Future classes, if they provide an interface that is useful to this
code, they should document it by including the proper pattern.

For example (let def_abstract define a method that raises
SubclassResponsibility if unimplemented):

module Hashable
  def_abstract :[], :[]=, :has_key?, :keys, :has_value?, :values
end

def show_hash(h)
  expect h, Hashable

  :
end

:

class SomeOddClass
  include Hashable

  def [](k)
    :
  end

  :
end

An intriguing argument. By far the best that I have heard for Strong Typing.

I do wonder: what is really gained by Strong Typing? Are better error
messages the only advantage? An experienced ruby programmer will see a
method-not-defined error message and know that it means the wrong object was
probably passed in to his method. This kind of error message certainly
throws beginning Ruby programmers for a loop, but is it really a weakness of
the language?

If Strong Typing is an advantage how does it help experienced Ruby
programmers? Do you find that it saves a significant amount of time? Does it
help you catch errors you normally would not have known existed? Does it
encourage the right programming habits? If it can be demonstrated that it
does these things, perhaps it should be incorporated into the language.

I like the clean syntax, but I can't help feeling like I'm looking at code
someone with a strong background in statically typed languages would write.
How can you prevent someone from using Strong Typing in the wrong way? Is
there a way to accomplish this through a different implementation/syntax?

All Respect.
___________________
John Long
www.wiseheartdesign.com


----- Original Message -----
From: "Ryan Pavlik" <[email protected]>
To: "ruby-talk ML" <[email protected]>
Sent: Thursday, November 06, 2003 12:50 PM
Subject: Re: Managing metadata about attribute types

<snip>
 

Ryan Pavlik

An intriguing argument. By far the best that I have heard for Strong
Typing.

I do wonder: what is really gained by Strong Typing? Are better error
messages the only advantage?

Well, the main advantage I have gained, aside from debugging bonuses,
is the ability to ask what types something wants. This is what
started the discussion in the first place, I believe.
An experienced ruby programmer will see a method-not-defined error
message and know that it means the wrong object was probably passed
in to his method. This kind of error message certainly throws
beginning Ruby programmers for a loop, but is it really a weakness
of the language?

Actually simply seeing the error is often not much use. For instance,
in Mephle I have defined a whole set of classes that allows one to
treat HTML as a widget set. For instance:

table = Table.new
table << Row["a", "b", "c"]

:

puts table.to_s(renderer)

Often, widgets are passed around and used deeply within the code.
Consider the following mistake:

def add_extra_row(r)
  :
  table << r
  :
end

add_extra_row Object.new

When the widget is actually "rendered" via #to_s, you might see
something like this:

NoMethodError: undefined method `rowsize' for #<Object:0x4019f23c>
    /path/to/Table.rb:500
    :
    :

This shows that a problem occurred in the code---we don't have
something that's a row---but it doesn't show you where the _actual_
error occurred---when someone passed you an Object instead of a Row.
The ST module allows you to catch the culprit immediately.
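A minimal stand-in for that expect() check (not StrongTyping's real implementation) shows why the failure moves to the call site:

```ruby
# Hypothetical minimal expect(): raise where the bad object is passed
# in, instead of deep inside rendering code much later.
class Row; end

def expect(obj, *types)
  return if types.any? { |t| obj.is_a?(t) }
  raise TypeError, "expected #{types.join(' or ')}, got #{obj.class}"
end

rows = []

def add_extra_row(rows, r)
  expect r, Row  # the culprit is named here, at the boundary
  rows << r
end

add_extra_row(rows, Row.new)       # fine
# add_extra_row(rows, Object.new)  # TypeError raised immediately
```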
If Strong Typing is an advantage how does it help experienced Ruby
programmers? Do you find that it saves a significant amount of time?
Does it help you catch errors you normally would not have known
existed? Does it encourage the right programming habits? If it can
be demonstrated that it does these things, perhaps it should be
incorporated into the language.

The error above is a decent example. For me it's not really about the
time saving primarily, it's about the ability to query parameter
types, but the time saving has helped.

However I am not asking or expecting this be integrated into the
language. It's obvious many people dislike it. I think it would be
nice if things were typed like in the snipped example though. Perhaps
this is something I could add to the module.

I love the dynamicism of Ruby. It has allowed me to make this module
that does what I want---something that would be considered a builtin
feature in many other languages---without having to modify the core
language. I would rather see ruby's core remain as simple as
possible.
I like the clean syntax, but I can't help feeling like I'm looking
at code someone with a strong background in statically typed
languages would write.

I have a background in statically-typed languages and
dynamically-typed languages. Both of these are orthogonal to strict
type checking. For instance, Common Lisp (which no-one can argue is
static anything) allows one to specify parameter types in much the
same way as this module. These are used by some compilers for
optimization purposes. They are also optional, and they do not
diminish the dynamicism of the language.
How can you prevent someone from using Strong Typing in the wrong
way?

I don't believe this is the right question. You can't prevent someone
from using something in the wrong way. I tried to demonstrate this
with the ruby-goto module, which abuses exceptions and blocks to
implement labels and goto in ruby. The trick is to make it so there
is no incentive to do things the wrong way, rather than prevent them.
Thus I expect (hope!) no one will ever seriously use ruby-goto,
because there is no incentive to do so.

Similarly with ST. You can build a totally unstructured class
hierarchy which repeatedly breaks the API at every subclass and makes
the module useless. However there is no incentive to do so.
Is there a way to accomplish this through a different
implementation/syntax?

It depends on what you mean by "this"... it could probably be
implemented differently and maybe with different syntax. I could
probably have documented types with something like MetaTags alone.

Actually, I now recall another reason for not doing so. Mephle had a
few requirements I needed to resolve. Type documentation was one.
Mephle provides remote object handling, and is built on the concept
that users will directly manipulate objects. Both remote code and
users have access to objects. "Type safety" takes on a whole new
meaning---remote access means potential for malicious abuse, so even
with limited permissions, it may be possible to pass a
maliciously-constructed object over the network. With the
StrongTyping module that is no longer possible.

(For the record, ST does not verify object content, but that's not
what this is about. Nor can you pass the server an object of a class
the server does not know about. The problem is when the server knows
about both of these classes:

class Privileged
  def foo
    # privileged operation
  end
end

class Unprivileged
  def foo
    # operation anyone can call
  end
end

class SomeoneElse
  def blah(x)
    :
    x.foo
    :
  end
end

Now if the remote end passed the server a Privileged object, bad
things would happen. Yes, you may be able to code around this
specifically in every case. But it's quicker and more reliable to let
the system handle it for you.)
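A hedged sketch of that guard (class names from the post; the check itself is plain Ruby, not the StrongTyping module): refuse anything but an Unprivileged at the dispatch point, so a Privileged object can never reach #foo.

```ruby
class Privileged
  def foo; :privileged_op; end
end

class Unprivileged
  def foo; :public_op; end
end

class SomeoneElse
  def blah(x)
    # reject the wrong class before its #foo can run
    unless x.is_a?(Unprivileged)
      raise ArgumentError, "expected Unprivileged, got #{x.class}"
    end
    x.foo
  end
end

s = SomeoneElse.new
s.blah(Unprivileged.new)    # => :public_op
begin
  s.blah(Privileged.new)    # rejected at the boundary
rescue ArgumentError => e
  puts e.message            # => "expected Unprivileged, got Privileged"
end
```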
 

John W. Long

Ryan,
I don't believe this is the right question.

These two questions were meant to go together, but they are a little
confusing. What I meant was that a good API encourages people to use it in
the right way (it won't prevent them from doing something stupid).

My real question is more: Is there a better way to implement Strong Typing
so that people will not tend to use it in the wrong way?

I'm thinking of something along the lines of:

def my_method(a, b, c)
  expect a to implement this behavior   # this seems more ruby like
  expect b to implement
  ...
end

This is essentially what you have written. However it is also easy to
implement it in a way more like statically typed languages:

def my_method(a, b, c)
  expect a is a String    # this is more like basic, java, or c
  expect b is an Integer
  ...
end

Most of the reaction seems to be against the second option.

I could almost see this as a language feature:

def my_method(a implements StringBehavior, b implements IntegerBehavior, c)
  ...
end

Or even:

def my_method(a has DuckBehavior, b ...)
  ...
end

You could also implement this as an extension to Object:

def my_method(a, b, c)
  if a.has?(DuckBehavior)
    a.waddle
  else                     # waddle is preferred over trot
    a.trot if a.has?(HorseBehavior)
  end
end
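One way such a #has? could be sketched (hypothetical module and method names, purely to illustrate the idea): treat an object as "having" a behavior module if it responds to every method that module defines, whether or not it actually includes the module.

```ruby
module DuckBehavior
  def waddle; "waddle"; end
end

class Object
  # structural check: does this object respond to everything the
  # behavior module defines directly?
  def has?(behavior)
    behavior.instance_methods(false).all? { |m| respond_to?(m) }
  end
end

class RubberDuck
  # duck-typed: supplies #waddle without including DuckBehavior
  def waddle; "squeaky waddle"; end
end

RubberDuck.new.has?(DuckBehavior)   # => true
Object.new.has?(DuckBehavior)       # => false
```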

Strange thoughts.
___________________
John Long
www.wiseheartdesign.com
 

Ryan Pavlik

On Sat, 8 Nov 2003 11:29:11 +0900

My real question is more: Is there a better way to implement Strong Typing
so that people will not tend to use it in the wrong way?

I'm thinking of something along the lines of:

def my_method(a, b, c)
  expect a to implement this behavior   # this seems more ruby like
  expect b to implement
  ...
end

This is essentially what you have written. However it is also easy to
implement it in a way more like statically typed languages:

def my_method(a, b, c)
  expect a is a String    # this is more like basic, java, or c
  expect b is an Integer
  ...
end

Most of the reaction seems to be against the second option.
<snip>

The problem is people seem to think that "implements String behavior"
and "is a String" are two different things. I'm not sure why.

If it is a String, then it implements a String behavior by
definition. If it doesn't implement a String behavior, then it's not
a String. What's wrong with calling something what it is?

"Duck typing" acts in just this way: if it implements the behavior of
an object, then we consider it that object. Except with "duck
typing", it's just blind to the little detail of whether it's actually
the behavior you expect or not.

Confusion also seems to stem from the idea that type checking means
static typing, or that strong typing means static typing. It's been
repeatedly shown that this is not the case, but people persist under
this assumption. Just recently I've seen the strongtyping module
referred to as the "statictyping" module, which is completely wrong.
The fact that people don't understand this difference limits them.

Getting back to the point at hand: if something implements the
behavior of a thing, we call it that thing. We already have a
mechanism---classes and modules---for doing so. Multiple inheritance,
which Ruby implements in a certain form, allows us to call an object
any number of accurate descriptions.

There's just no need to reimplement a wheel that we already have.
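That "wheel we already have" can be shown in a few lines (hypothetical Rowlike/Row names, just to illustrate): a module both names a behavior and lets is_a? vouch for it, across otherwise unrelated implementations.

```ruby
module Rowlike
  def rowsize; cells.size; end
end

class Row
  include Rowlike
  attr_reader :cells
  def initialize(*cells); @cells = cells; end
end

class FancyRow
  include Rowlike
  def cells; %w[a b c]; end   # a second, unrelated implementation
end

Row.new("a", "b").is_a?(Rowlike)   # => true: the mixin is an accurate description
FancyRow.new.is_a?(Rowlike)        # => true: so is this one
Row.new("a", "b").rowsize          # => 2
```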
 

James Britt

Ryan Pavlik wrote:
This shows that a problem occurred in the code---we don't have
something that's a row---but it doesn't show you where the _actual_
error occurred---when someone passed you an Object instead of a Row.

Isn't the error that the object did not know how to respond to 'rowsize'?
The ST module allows you to catch the culprit immediately.

If I understand the ST module, it doesn't matter to it that an object
simply responds to all the required messages; rather, the object must
have the right pedigree. Even though having the right pedigree is no
guarantee that the object was not (perhaps pathologically) munged at
some point, with critical methods removed/altered.

Or does the ST module query an Object with a list of responds_to? messages?

James
 

Ryan Pavlik

Ryan Pavlik wrote:


Isn't the error that the object did not know how to respond to 'rowsize'?

The error is that somewhere I inserted an Object when I should have
inserted a Row. Checking this at the point of insertion is a simple
and elegant way to immediately flag this sort of problem.

The fact it doesn't respond to rowsize is only a symptom. Which leads
us to...
If I understand the ST module, it doesn't matter to it that an object
simply responds to all the required messages; rather, the object must
have the right pedigree.
Right.

Even though having the right pedigree is no guarantee that the
object was not (perhaps pathologically) munged at some point, with
critical methods removed/altered.

As in the previous post, this is possible, but there is no incentive
to do so. If you want to break your own code, there are quicker and
easier ways.
Or does the ST module query an Object with a list of responds_to?
messages?

It doesn't, even though I considered something like that, simply
because I realized it was the wrong thing to do. "Does it respond to
this?" isn't the right question to ask (because it doesn't give you
enough information). "Is this the thing I want?" tells you
everything.

Of course, if there was a standard conversion mechanism, the ST module
could work in conjunction with it to try and provide you with the type
you wanted where possible...

ConversionTable.add(Float, String) { |f| f.to_s }

:

def foo(s)
  expect s, String
  :
end

foo 4.2 # => foo "4.2"

Hmm, maybe I should release that conversion mechanism after all, and
add this...
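For what it's worth, the idea above could be sketched roughly like this (none of these names come from a released module; the ConversionTable here is purely hypothetical): register a block per (from, to) class pair, and fall back to it when the value isn't already of the wanted type.

```ruby
class ConversionTable
  @table = {}   # class-level registry of (from, to) => conversion block

  def self.add(from, to, &block)
    @table[[from, to]] = block
  end

  def self.convert(value, to)
    return value if value.is_a?(to)    # already the wanted type
    block = @table[[value.class, to]]
    raise TypeError, "no conversion #{value.class} -> #{to}" unless block
    block.call(value)
  end
end

ConversionTable.add(Float, String) { |f| f.to_s }

def foo(s)
  s = ConversionTable.convert(s, String)   # stand-in for an expect+convert step
  s.length
end

foo(4.2)     # the Float is converted to "4.2" first => 3
foo("4.2")   # a String passes through unchanged     => 3
```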
 

Ian Hobson

Ryan Pavlik said:
Actually simply seeing the error is often not much use. For instance,
in Mephle I have defined a whole set of classes that allows one to
treat HTML as a widget set. For instance:

table = Table.new
table << Row["a", "b", "c"]

:

puts table.to_s(renderer)

Often, widgets are passed around and used deeply within the code.
Consider the following mistake:

def add_extra_row(r)
  :
  table << r
  :
end

add_extra_row Object.new

When the widget is actually "rendered" via #to_s, you might see
something like this:

NoMethodError: undefined method `rowsize' for #<Object:0x4019f23c>
/path/to/Table.rb:500
:
:

This shows that a problem occurred in the code---we don't have
something that's a row---but it doesn't show you where the _actual_
error occurred---when someone passed you an Object instead of a Row.

Without wishing to pick holes in your design (I don't know why you chose
the structures you did), you have the table class trusting people
passing in things that form table's INTERNAL state.

This is a flaw. If table is going to be used by others, a serious flaw.

Only a table should know it is made up of rows of cells. A table could
be columns of cells. Why not?

I agree that if you are going to trust a programmer to pass you a row
you have to check it is in fact a row.

Why not provide an interface to table that avoids this problem by never
exposing rows or columns to the outside world? A table is a 2D array of
cells. Table methods would include methods to set up spanning, set cell
content and properties, set row properties, and column properties (such
as width). Other methods would add or remove rows or columns, declare
the use of headers or footers.

Every method would maintain the integrity of the table as a unit, and
could validate the data before it is applied to the table.

This makes it "impossible" for a careless programmer to set up an error
condition that will not be visible until rendering time.

No expensive, run-time type checking needed. Simply ensure that every
class maintains its own internal state at all times.
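A rough sketch of that design (my own illustrative names, not code from Mephle): the table owns its 2D cell array, every mutation goes through a validating method, and content is normalized on the way in, so nothing invalid can survive to render time.

```ruby
class Table
  def initialize(rows, cols)
    @cells = Array.new(rows) { Array.new(cols, "") }
  end

  def set_cell(row, col, content)
    # integrity check at mutation time, not at render time
    unless row >= 0 && row < @cells.size && col >= 0 && col < @cells.first.size
      raise IndexError, "cell #{row},#{col} outside table"
    end
    @cells[row][col] = content.to_s   # normalize on the way in
  end

  def add_row
    @cells << Array.new(@cells.first.size, "")
  end

  def to_html
    rows = @cells.map { |r| "<tr>" + r.map { |c| "<td>#{c}</td>" }.join + "</tr>" }
    "<table>#{rows.join}</table>"
  end
end

t = Table.new(1, 2)
t.set_cell(0, 0, "a")
t.set_cell(0, 1, 12.5)   # coerced here, not rejected later
t.to_html   # => "<table><tr><td>a</td><td>12.5</td></tr></table>"
```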

Regards

Ian
 

Ryan Pavlik

On Sun, 9 Nov 2003 10:07:50 +0900
You must not be a big fan of class methods :)

I'm not sure what your point is here. The only conceivable point I
can see is that class methods are singleton objects of type Class, and
that if I ask for a Class, I may get one, but it won't be the exact
API of Class if I use class methods.

You're probably just trying to be funny---which is fine---but the joke
does overlook a few points I've made, including the one you pasted
beneath.

Adding methods does not break type checking. An extended object is
still the object you wanted. (Overriding and replacing an existing
class method might break something, but this is my point in the above
quote---there's not a lot of incentive to break things.)

This is coupled with a previous point: you will never ask
specifically for something you don't know about. For instance, your
code might ask for a Foo, but if it doesn't know about a Bar, a
subclass of Foo, it will not ask for it. Singletons fall under this;
they are still their original class, but with things added.

(Of course you _can_ remove or incompatibly redefine things. This is
usually bad. For instance, when CGI used to return singleton Strings
with #[] redefined in an incompatible manner without any information
on the fact. This caused a lot of problems.)
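The point that extending an object does not change what it is can be checked directly (a trivial sketch with made-up names):

```ruby
class Foo; end

f = Foo.new
def f.extra; :extra; end   # singleton method: extends only this object

f.is_a?(Foo)          # => true: still passes a class check
f.extra               # => :extra: the extension is available too
f.singleton_methods   # => [:extra]
```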

Anyway, you will obviously never accept my position, and I've had too
much success with strongtyping to give it up, so add any closing
points you wish to make and let's wrap this one up so we can get back
to doing something productive... like writing ruby code. ;-)
 

Nathan Weston

Ryan Pavlik said:
The problem is people seem to think that "implements String behavior"
and "is a String" are two different things. I'm not sure why.

I think in the ruby world, when people say "implements String
behavior", they really mean "implements some relevant subset of String
behavior".
This is where duck typing (which really isn't a type system at all,
IMO) becomes more flexible than type systems based on the class of an
object.
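That subset idea can be sketched in a few lines (hypothetical names): a method that only needs #upcase accepts any object supplying it, where a class-based check would have insisted on a real String.

```ruby
def shout(s)
  # only a relevant subset of String behavior is required: #upcase
  raise ArgumentError, "needs #upcase" unless s.respond_to?(:upcase)
  s.upcase + "!"
end

class Loud
  def upcase; "HEY"; end   # not a String, just the one relevant method
end

shout("hey")      # => "HEY!"
shout(Loud.new)   # => "HEY!": an is_a?(String) check would have rejected this
```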

Whether this flexibility is a good thing is up for debate. Any type
system will catch some errors, and throw away some otherwise valid
programs. It's a question of freedom vs. safety, and the ruby
community mostly seems to come down on the side of freedom (at least
where programming languages are concerned).
 

Thien Vuong

Nathan said:
I think in the ruby world, when people say "implements String
behavior", they really mean "implements some relevant subset of String
behavior".

I'm not sure that claiming people in the ruby world really mean one
thing when they say another is quite representative of "people in the
ruby world" :).
This is where duck typing (which really isn't a type system at all,
IMO) becomes more flexible than type systems based on the class of an
object.

If I wanted a relevant subset of String, would that be better (i.e.
safer/faster) represented by a class/module describing that subset, or
by one or a combination of respond_to? checks? I would argue that,
most of the time, it is the former.
Whether this flexibility is a good thing is up for debate. Any type
system will catch some errors, and throw away some otherwise valid
programs. It's a question of freedom vs. safety, and the ruby
community mostly seems to come down on the side of freedom (at least
where programming languages are concerned).

I don't think it is only a flexibility issue. As ruby is a dynamic
language, classes/modules can easily be created/extended/included to
remap the class/module hierarchy to the desired set. It looks more
like the issue is the convenience of being able to say:

- I'm very sure that the this object behaviour/or my need could be
encapsulated fully in this method - so there is no good reason to create
extra code to do needless things.

This is valid and works much of the time, but it does not scale into
the argument that class/module checking is an inferior/crutchy way to
design. Once the needed object behaviour becomes more complex, a
class/module interface is needed to describe that behaviour (of
course, respond_to? combination/manipulation could do it too, but then
we would be reinventing the wheel against a proven, established
system).

I was drawn to ruby because of its OO capability plus the extra
dynamicity added on top of it - and it would be a shame to embrace
only the dynamics and drop the OO.
 
