Software Reliability and method i/o

Trans

I've just finished reading the beginning of an interesting article on
software reliability at:

http://users.adelphia.net/~lilavois/Cosas/Reliability.htm#Abstract

There is one small part that caught my attention because something similar
came up recently with regard to pass-by-reference.

"... all connectors should be unidirectional, i.e., they should be
either male (sender) or female (receiver). This will eliminate mix-ups
and ensure robust connectivity."

It was mentioned in the previous thread that purely pass-by-value is
arguably the most "proper". But as we all know it is inefficient
--having to copy an object going in only to turn around and reassign it
whence it came:

a = amethod(a.dup)

So I'm wondering: might it be possible for the language itself to be
intelligent enough to work as if it were pass-by-value, but catch
situations like the above and handle them on a pass-by-reference
basis for efficiency, all the while using pass-by-reference-to-value
under the hood as Ruby currently does?

In other words, no matter how it works underneath, for the
end-programmer the left side should always be male-output and the right
female-input, and never the twain shall meet.

T.
 
Curt Sampson

It was mentioned in the previous thread that purely pass-by-value is
arguably the most "proper". But as we all know it is inefficient
--having to copy an object going in only to turn around and reassign it
whence it came:

a = amethod(a.dup)

Actually, you need pass by reference for arguments where you want the
function to have side-effects affecting that argument.

I think a much better way of getting the kind of "reliability" you're
talking about is to always pass by reference, but wherever you can, use
value objects. Methods that would "modify" a non-value object instead
return a new value object with the modified state. For example,

p1 = Point.new(17, 35);
p2 = p1;
p1 = p1.move(0, 5);

Now p2 still refers to the point (17,35), but p1 has been changed to
refer to a different point (17, 40) (rather than the point itself
changing). This helps greatly to avoid aliasing.
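
A minimal Ruby sketch of the value-object idea above. The thread never shows how Point is defined, so this class is only an assumption: it freezes each instance and has move return a new Point rather than change the receiver.

```ruby
# Hypothetical value object. The names Point and move come from the example
# above; this implementation is an assumption, not code from the thread.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze # no further changes to this instance are allowed
  end

  # Returns a new Point offset by (dx, dy); the receiver is left untouched.
  def move(dx, dy)
    Point.new(x + dx, y + dy)
  end
end

p1 = Point.new(17, 35)
p2 = p1
p1 = p1.move(0, 5)

puts "p1 = (#{p1.x}, #{p1.y})" # p1 now refers to a different point
puts "p2 = (#{p2.x}, #{p2.y})" # p2 still refers to the original point
```

Because the instance is frozen, any later attempt to assign to its instance variables raises an error instead of silently changing an object that another variable still refers to.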

Avoiding side-effects as much as possible in this way is OO in sort of a
functional programming spirit.

One of the things I think Java did right was to make Strings immutable.
One of the things I think Java did wrong was to make Dates mutable.

cjs
 
Austin Ziegler

One of the things I think Java did right was to make Strings
immutable. One of the things I think Java did wrong was to make
Dates mutable.

I'd have to completely disagree with you here. I think that Java's
choice to make Strings immutable has made the language much more
unpleasant to deal with, and results in a number of inefficiencies
when one -- inevitably -- must modify the contents of a string
object. Always using construction and concatenation isn't
acceptable, and the overhead of using StringBuffer all the time is
undesirable (especially since you have to construct a StringBuffer
every time you need to deal with this since everyone returns String
objects).

Matz got it right when he made Strings mutable. James and Guido
got it wrong.

-austin
 
Trans

Curt said:
Actually, you need pass by reference for arguments where you want the
function to have side-effects affecting that argument.

That's my point. I'm not sure we do NEED them. Side-effects are exactly
the cause of concern which leads to more bugs and thus unreliable
software.

I think a much better way of getting the kind of "reliability" you're
talking about is to always pass by reference, but wherever you can, use
value objects. Methods that would "modify" a non-value object instead
return a new value object with the modified state. For example,

p1 = Point.new(17, 35);
p2 = p1;
p1 = p1.move(0, 5);

Now p2 still refers to the point (17,35), but p1 has been changed to
refer to a different point (17, 40) (rather than the point itself
changing). This helps greatly to avoid aliasing.

Hm... but doesn't that still leave it all up to the programmer? And it
is terribly inefficient.

Avoiding side-effects as much as possible in this way is OO in sort of a
functional programming spirit.

Right. And I wonder if it isn't worth enforcing --and finding
intelligent ways to do it efficiently.

One of the things I think Java did right was to make Strings immutable.
One of the things I think Java did wrong was to make Dates mutable.

Maybe so for the dates.

But I'm hinting at transcending this in a way with regards to argument
passing. Arguments would become as if immutable (and side effect must
enforce a duplication) unless it was seen that the passed object was
returning on output from the same place it came from on input. Does
that make sense?

T.
 
Curt Sampson

I'd have to completely disagree with you here. I think that Java's
choice to make Strings immutable has made the language much more
unpleasant to deal with....

"Much more" unpleasant? I'd like to see some specifics on this.

...and results in a number of inefficiencies when one -- inevitably --
must modify the contents of a string object. ... and the overhead of
using StringBuffer all the time is undesirable (especially since you
have to construct a StringBuffer every time you need to deal with this
since everyone returns String objects).

Do you have some examples of these inefficiencies, with stats to back them
up? I've profiled a fair amount of Java, and I've just not seen these
"inefficiencies" after JDK 1.1.

Regardless, if you look at techniques such as ropes (sorry, I don't
know of an on-line version of the paper) you'll see that making
strings immutable is no barrier to making the implementation of string
manipulation extremely efficient. So efficiency is no argument either
way.

Matz got it right when he made Strings mutable. James and Guido
got it wrong.

Aliasing problems take up programmer time, not computer time, and thus
are going to be, in the general case, far more costly than having to
explicitly specify a mutable string when you want it.

cjs
 
Curt Sampson

That's my point. I'm not sure we do NEED them. Side-effects are exactly
the cause of concern which leads to more bugs and thus unreliable
software.

Indeed. I'm coming more and more over toward the functional side.

Hm... but doesn't that still leave it all up to the programmer?

Sure, but it's not hard if that's how the programming culture works. And
the only option beyond that is to make ruby purely functional, so that
it's impossible to have side-effects. Not even Scheme went that far.

And it is terribly inefficient.

Again, I don't buy this. Show me some proof.

But I'm hinting at transcending this in a way with regards to argument
passing. Arguments would become as if immutable (and side effect must
enforce a duplication) unless it was seen that the passed object was
returning on output from the same place it came from on input. Does
that make sense?

I understand that, but I don't like it because it's no longer simple.
For example, when I read "a.foo(b, c)", which argument allows
side-effects? Or is it neither? You have to go track down the definition
of foo to see what it returns, since there's no indication in that bit
of code.

cjs
 
Trans

Curt said:
On Thu, 27 Jan 2005, Trans wrote:


Indeed. I'm coming more and more over toward the functional side.

Me too.

Sure, but it's not hard if that's how the programming culture works. And
the only option beyond that is to make ruby purely functional, so that
it's impossible to have side-effects. Not even Scheme went that far.

But maybe it _is_ a good idea to go that far.

Again, I don't buy this. Show me some proof.

Well, I shouldn't say terribly I guess. But it will be slower by the
simple fact that when passing by-value, a copy of the object must be
made. That operation, though very quick, will add up. I could do a
benchmark...maybe when I have more time.

I understand that, but I don't like it because it's no longer simple.
For example, when I read "a.foo(b, c)", which argument allows
side-effects? Or is it neither? You have to go track down the definition
of foo to see what it returns, since there's no indication in that bit
of code.

Try this on for size. Calling 'a.foo(b,c)' flags the parameters b and c
as by-value, but nothing actually happens yet. Only if during the
course of executing foo should b (or c) be _affected_ does a copy get
made prior to the actual effect, and the copy is used from then on.
Sort of a lazy pass by-value. (Maybe this is how some systems already
work?)
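
To make the lazy by-value idea concrete, here is a rough Ruby sketch. It is only an illustration: Ruby does not pass arguments this way, the LazyCopy wrapper is a made-up name, and the MUTATORS list is an assumption standing in for whatever knowledge a real evaluator would need about which methods change their receiver.

```ruby
# Hypothetical copy-on-write wrapper (not part of Ruby). It forwards every
# call to the wrapped object and duplicates that object just before the
# first call to a method that is assumed to mutate it.
class LazyCopy < BasicObject
  # Assumed list of mutating methods, for illustration only.
  MUTATORS = [:<<, :concat, :replace, :clear, :push].freeze

  def initialize(target)
    @target = target
    @copied = false
  end

  def method_missing(name, *args, &block)
    if MUTATORS.include?(name) && !@copied
      @target = @target.dup # copy made prior to the first modification
      @copied = true
    end
    @target.public_send(name, *args, &block)
  end
end

def foo(x, y)
  x << y
  x
end

b = "abc"
result = foo(LazyCopy.new(b), "d") # the caller wraps b to simulate by-value

puts b           # the caller's object was not modified
puts result.to_s # the copy carries the change
```

The wrapper only pays for the dup when foo actually modifies its argument, which is the saving being described. The hard part, knowing which calls count as modifications, is hidden in the hand-written MUTATORS list.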

Now take a different example 'b = a.foo(b,c)'. Here b will be flagged as
_potentially_ by-reference. Now let's say foo is defined like:

def foo(x, y)
  x << y
  x
end

In this case b is simply returning to whence it came. If it were
possible for the evaluator to see that fact, then the lazy by-value
copying could be omitted. Of course the trick is seeing that (is it
possible?).

It might help to look at a much simpler example:

x += y

Presently, of course, this creates a new object referenced by x. Yet
does it need to? Well, only if there are other references to x's
original object. If there are not then it may just as well be modified
in place; it won't affect anything.
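
A small Ruby illustration of the difference being discussed, using String. The variable names are made up for the example. With a second reference in play, x += y leaves the other reference alone, while an in-place append does not:

```ruby
x = +"abc"      # unary plus returns an unfrozen String
other = x       # a second reference to the same object
x += "d"        # builds a new String and rebinds x to it

puts x.equal?(other) # false: x now refers to a different object
puts other           # abc: the original object was not touched

s = +"abc"
same = s        # a second reference to the same object
s << "d"        # appends in place, no new object

puts s.equal?(same)  # true: still one object
puts same            # abcd: the change is visible through both references
```

If no second reference like other existed, nobody could tell the two behaviours apart, which is why modifying in place would be safe in that case.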

Seems to me if this could be done then it would be by-value with much
of the efficiency of by-reference; and also out of sight, out of mind to
the end-programmer. At least, that's the idea. I may have totally
overlooked something, and I don't know how feasible it is either, or if
the loss in efficiency from actually doing it counteracts the gain. But
it's an interesting notion at least.

T.
 
Austin Ziegler

"Much more" unpleasant? I'd like to see some specifics on this.

Well, you'll notice that Java has very little that's actually built
for manipulating text, because of the expense of String construction.
Frankly, the only thing that I like about the C++ std::string is
that it *is* mutable -- otherwise it's just as bad as the Java
String for usability.

-austin
 
Curt Sampson

But maybe it _is_ a good idea to go that far.

Well, that may well be, but it seems to me that as soon as a language does
go that far, people start putting stuff on top of it so that they can do
things in a more "imperative-looking" way. But this whole topic gets very
interesting and cool; check this out for an example:

http://www.cs.pdx.edu/~antoy/Courses/TPFLP/lectures/MONADS/Noel/research/monads.html

(Particularly note the bit at the end where it points out that,
"getContents and readFile read an entire stream lazily (stdin or a file
respectively), returning a lazily constructed list. writeFile does the
opposite, writing a (possibly lazy) list as a file." You've seen this
kind of thing with generators in Ruby.)

Well, I shouldn't say terribly I guess. But it will be slower by the
simple fact that when passing by-value, a copy of the object must be
made.

Oh! I agree that that is inefficient. That's why I proposed passing
immutable objects by reference.

Try this on for size. Calling 'a.foo(b,c)' flags the parameters b and c
as by-value, but nothing actually happens yet. Only if during the
course of executing foo should b (or c) be _affected_ does a copy get
made prior to the actual effect, and the copy is used from then on.

Ah, the standard copy-on-write. That gets interesting. Say I have a file
containing "abc" and a reference to an I/O object for it with the file
pointer for next character to read/write pointing to "b". I then,

def read_one_character(file_handle)
  return file_handle.get_next_character
end
c1 = read_one_character(file_handle)
c2 = read_one_character(file_handle)

c1 contains "b". What does c2 contain? "b". Why? Because
read_one_character invokes a method that modifies the file handle object
(moves the location of the file pointer), and so it's copied first,
leaving the caller with the original one that was not modified.
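
The effect can be reproduced in today's Ruby by copying the handle by hand. This is only a sketch: CharReader is a made-up stand-in for the I/O object, and the explicit dup simulates the copy that a copy-on-write argument-passing scheme would make automatically.

```ruby
# Hypothetical stand-in for a file handle: some text plus a read position.
class CharReader
  def initialize(text, pos = 0)
    @text = text
    @pos = pos
  end

  # Returns the character at the current position and advances the pointer,
  # so calling it is a side effect on the reader object.
  def get_next_character
    ch = @text[@pos]
    @pos += 1
    ch
  end
end

def read_one_character(file_handle)
  file_handle.get_next_character
end

# "abc" with the pointer at "b", as in the example above.
by_value = CharReader.new("abc", 1)
c1 = read_one_character(by_value.dup) # the copy advances, the original does not
c2 = read_one_character(by_value.dup)
puts [c1, c2].inspect # ["b", "b"]

# Ordinary Ruby passing shares the one object, so the pointer moves on.
shared = CharReader.new("abc", 1)
d1 = read_one_character(shared)
d2 = read_one_character(shared)
puts [d1, d2].inspect # ["b", "c"]
```

The second half shows why some side effects are wanted: for an object like a file handle, the advancing pointer is the whole point of the call.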
x += y

Presently, of course, this creates a new object referenced by x.

Indeed. That was pretty much what I was proposing: do this for a lot
more classes than we currently do.

Yet does it need to? Well, only if there are other references to x's
original object. If there are not then it may just as well be modified
in place; it won't affect anything.

That's really more a matter of GC tuning. Figuring out if there are
other references to that object is essentially doing a good chunk
of a garbage collection. Doing this on every assignment does not
instinctively strike me as a performance improvement. By separating
the marking and freeing phases of the GC you could make unreferenced
objects available for reuse, with potential modification, and achieve
pretty much the same effect (though in the case above you'd be finding
another now-unreferenced object of the same class, modifying it, and
setting x to reference it, now freeing up the object that x used
to reference for the same thing later on). But given the type of
tactics GCs are using these days to deal with, e.g., short lived versus
long-lived objects, this might just as easily backfire on you and kill
the GC's performance.

For this sort of thing, you need to look at the entire system, not just
object creation, because the system may be tuned such that it's actually
cheaper to create and delete a lot of short lived objects than it is to
keep fewer long-lived objects around.

cjs
 
