Confirm my Performance Test Against Java?

Ben Christensen

Edging away from the data driven performance analysis, I'd like to
explore some of my more subjective and opinionated thoughts on the
matter since several of your answers have touched on them. I'd very much
appreciate your rebuttals to my following comments (if you feel like
spending the time reading and responding).

I am obviously biased by my long use of the Java platform, and very
likely by my many years of focusing on processing large amounts of data,
writing search engines and other such applications very sensitive to
performance - and thus I have profiled and optimized virtually every
aspect of the stack and code - and to great gains for the user
experience. On the other hand, I've also connected to mainframes where
no amount of code optimization on my end could make any difference in
how things performed as I depended on the result from an external
resource, and I hid that from the user as much as possible with
asynchronous user interface magic as opposed to worrying about code
optimization, language choice or even hardware.



## Fast Enough ##

I often hear that Ruby is "fast enough" or that the performance
difference is not important since IO is generally the real source of
performance issues.

I understand that in certain cases - though when I see potential
performance degradation as multiples (2x, 3x, etc) as opposed to small
percentages, it makes me question the decision to use Ruby much more
than if we were talking percentages of 10-20% - such as 115ms vs 100ms
on a page response.

For example, a Java environment my team has built provides SOAP/REST
webservices for product catalog search - and responsiveness is a very
significant measurement criterion of the success of the product and
system. Therefore, our average server side response time is something we
watch very closely.

It's difficult for me to accept the use of a language or platform which
means I'm taking a significant hit in performance - similar to the first
5-8 years of Java.

Even Java is still improving ... Java 5 to Java 6 was a very noticeable
improvement in performance at the JVM level (I've had 30-80% performance
increases from 5 to 6). Same hardware, same code - noticeable
improvement in performance and thus responsiveness on webservices,
webapps and shorter data processing times.

In the late 90s it was worth it to me to use Java and take the
performance hit - as the benefits were so strong over C for so many
things.

However, I'm still struggling to see the strong reasons to adopt Ruby
when I'm penalized performance-wise and the answer becomes "it's good
enough" or "network IO is usually the slowest aspect, so it's negligible
what Ruby adds" or "CPU is cheap".

Yes, "CPU is cheap", but that applies to Java as well. 6 months ago we
upgraded our hardware (systems were only 18 months old when we upgraded)
and shaved off another 30% just by taking advantage of new CPU
architectures and bus speeds that had changed in the previous year.

Perhaps in a straight-forward CRUD app where all that's being done is
retrieving/storing data in a database where IO truly is the bottleneck
that no amount of optimization can improve - then it doesn't matter and
"good enough" rings true where IO takes 100ms and the Ruby/Java portion
is only 10-20ms on top of the IO.



## C Extensions to Ruby ##

This feels akin to saying in the late 90s that to make Java perform
well, use JNI to use C. To me it defeats the whole purpose of the "Ruby
is simple and pleasant" paradigm. If I have to optimize it with C, then
it's no longer simple or a joy to use.



## Maintainability, Speed of Development and 'Enjoyment' ##

I hear "speed of development", "maintainability", and "enjoyment of
coding" as the reasons to move to Ruby - and to accept the negatives in
performance, tools, libraries etc.

I'm still not sold on these reasons - as I truly enjoy working in the
Java ecosystem of tools, projects, libraries etc - despite what may or
may not be "crufty" or verbose in some aspects of the language itself.

Nor am I convinced yet that managing a codebase over 5-10 years and 40+
developers is any easier with Ruby - Java's static typing and now its
generics (the polar opposite approach to 'duck typing' in Ruby) are
actually very nice for readability, navigation of code, refactoring and
other such things on such large codebases when so much of it is from
other coders, teams, or just plain old and forgotten about.

Putting aside these more subjective decision points that I do not yet
have the experience to weigh in on with Ruby - if the performance impact
affects the end user, then that is in my opinion an objective point of
concern. Google and Amazon have both publicized how slowdowns of
100-300ms on a user interface affect user experience and how much those
users utilize their sites. I certainly recognize that fact while
operating a hosted search engine. Speed of search directly impacts how
much someone will choose to use it. Slower performance equals increased
friction to use.

Also, no amount of "throwing hardware at it" will make Ruby faster than
throwing the same hardware at Java - which is ultimately I think the
biggest issue I have with the concept. If I throw the fastest hardware I
can at something, I want my user to get the biggest bang for the buck -
not just make up for me using a slower language.

As for "brevity" equalling "maintainability" and "less bugs" - I tend to
disagree when the pursuit results in code such as this example given:

puts "The number of tokens is: %d." % File.open(ARGV[0]){|f|
f.inject(0){|a,l| a+l.split.length } } ,
"It took #{(Time.now - start) * 1000} ms"

I find it intellectually stimulating and admirable at the power of what
is accomplished in such a short statement.

Understanding it however takes time and thought - and a certain level of
skill.

Perhaps your experiences are different - but most development teams have
a lot of more junior and intermediate developers than senior - and the
more verbose, easy to read, simple to understand code is far more
preferable - even for myself when I must review, debug and profile the
code of dozens of others - as opposed to something that looks like an
academic puzzle to unravel.




## Concluding Thoughts ##

I recognize that many of my concerns are similar to the C versus Java
argument of 10 years past.

Moving to Java from C had a variety of very significant benefits
however, amongst them being: garbage collection and vastly simplified
memory management, no need to worry about pointers, "write once, run
[mostly] everywhere" (as long as you were talking server-side and not
desktop where Java is miserable) and APIs designed better to address the
networked world of the then new (to the common public anyways) internet.

In moving from Java to Ruby I don't see the benefits as strongly
motivating - and therefore am much less willing to accept performance
penalties.

In short, after waiting through 10+ years of maturation to get the Java
performance now enjoyed, it feels somewhat odd to step back
significantly in performance, tool maturity and available libraries.

What types of applications and codebases do you feel truly are served
best by Ruby - and therefore not in need of the highest performance
given to the end-user?
 
pharrington

pharrington, in your response you stated:
"as the code that happens is neither the most "elegant" *nor* fastest
Ruby can do."

Can you please provide me a re-write of the Ruby code I used that is
elegant and fast so I can learn from you?

Hi, I was gone for the day, but numerous people in the thread already
did both, so :)

@Mike

Thank you for providing the Gist link to a file.
(http://gist.github.com/170476)

However, the changes don't improve the performance when I take into
account what was removed that I had in there on purpose. Take note of
item #2 below.

1) Object structure

The modified code removed all of the class/object structure, which I
purposefully had in there to simulate this being an object within a
larger project.

Sticking methods in a class really doesn't simulate an object in a
larger project at all; it's just methods in a class. The general
concept of ***larger project*** isn't really something you can factor
out; it's just how the code ends up needing to be structured for the
task at hand.

That being said, converting the lines of code we're discussing for
performance into a script means nothing to this discussion - but I
purposefully am writing the code in an OO style with classes as opposed
to scripts.

Again, coders code to solve the task at hand. When you say "Yes, this
test is representative of some of the types of applications
and necessary data processing I have current applications doing and am
needing in some future ones" we look at what the code *does*, not
guess at a vague idea of a large project which defining a class is
apparently supposed to imply. The code you posted counts words in a
file. Nothing more; nothing less.

I was also purposefully making the Java and Ruby versions as similar to
each other as possible, so that a performance comparison can be done with
as little difference as possible in the approach to the code.

If you want to write Java code, then why use something that isn't
Java? This is like taking a C program, trying to emulate as closely as
possible, line-for-line the C code in Erlang (using mutable data and
everything) and then dismissing Erlang because it's worse at being C
than C. Different languages express solutions to different problems in
different ways; that's the whole point.

I guess you just wanted to know whether or not the Ruby interpreter is
generally slower than compiled Java bytecode? Of course it is (I
assumed this was common knowledge (to the point of being a cliche
even) but :\). If anyone told you otherwise, I'm sorry you were
blatantly lied to. BUT Ruby lets us *produce* faster and more
accurately, giving us plenty of spare time to optimize the code (even
porting specific parts to C if needed) after we've easily made it
correct.

2) Counting versus Using the Tokens

In the modified code, it is now just counting the tokens:

    num += l.split.length

Obviously that is faster than what I had in the original code. Again
however, I'm doing this on purpose.

Counting the number of tokens in and of itself is not all that I was
doing in the original code or in the Java version. To simulate more
closely what actually occurs in a functional system I am:

- assigning the array of tokens to a variable
- iterating the tokens to do something with each of them

In this case I'm just assigning each token to another variable and then
performing the count.

In a real world use I'd perform some function on the text, put it
somewhere, whatever.
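A rough sketch of the difference being described, for anyone following along. This is not the original benchmark code (which is not shown in this thread); the method names `count_only` and `count_by_iterating` and the sample text are made up for illustration.

```ruby
require 'benchmark'
require 'stringio'

# Variant A: only count the tokens on each line.
def count_only(io)
  total = 0
  io.each_line { |line| total += line.split.length }
  total
end

# Variant B: keep the token array, then visit every token,
# standing in for "do something with each of them".
def count_by_iterating(io)
  total = 0
  io.each_line do |line|
    tokens = line.split
    tokens.each do |token|
      current = token # placeholder for real per-token work
      total += 1
    end
  end
  total
end

# Hypothetical input; a real run would read an actual file instead.
text = "the quick brown fox\njumps over the lazy dog\n" * 10_000

a = b = nil
time_a = Benchmark.realtime { a = count_only(StringIO.new(text)) }
time_b = Benchmark.realtime { b = count_by_iterating(StringIO.new(text)) }

puts "count only:         #{a} tokens, #{(time_a * 1000).round(1)} ms"
puts "iterate each token: #{b} tokens, #{(time_b * 1000).round(1)} ms"
```

Both variants return the same count; only the amount of work per token differs, which is the point under dispute.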

In the real world, the "do something with each of them" is the real
juicy part that we want to compare. What is the something? Does the
real world program just end up counting tokens? Then we realize this,
count tokens, and be on our merry way. Is the real world program
taking each word in a text file, comparing relationships against a
lexical database, then based off whatever relationships in context and
calculations constructing a sort of hash to classify a given text
document? String token = tk.nextToken(); numTokens++ does not begin to
describe or "simulate" this, so what is the point of the benchmark?

This change accounts for the difference in time from "7965.289 ms" to
"4821.399 ms" when I run the original code and the modified code.

So yes, the modified code is "faster", but it's not doing the same thing
as the original and therefore not a valid comparison.

The input is the same. The output is the same. The person running your
code does not care if it's object-oriented, procedural, a script,
functional, etc; he only cares if he gets the expected output in a
reasonable amount of time when he gives his input. Thus the coder only
cares if she can code fast enough to give the client the features he
wants, and if she can do this in a way that's easy to keep up with
his increasing feature demands while keeping the code stable and
fast.





But I dunno, maybe I'm still completely missing the point?
 
pharrington

@Matthew K. Williams

-- 1. How *often* are you going to be processing these files? If they are
-- batch style jobs, then does absolute speed matter over maintainability?

The particular application I'm looking at in the future has a virtually
continuous feed of incoming data from multiple concurrent sources.

Thus I'm looking at what language the processing code would be in. My
default go to is Java - but I want to consider Ruby and not blindly just
use what I'm accustomed to before establishing what will likely be in
existence for the next 3-5 years.

OK, speed and excellent concurrency handling (non-JRuby Ruby strikes
out across the board on the second aspect unfortunately) really is
your priority then. May I then ask what made you consider Ruby in the
first place? Something along the lines of Scala (runs on the JVM,
*relatively* mature, *fantastic* concurrency support, works well with
functional and imperative/object oriented styles) or Erlang (SMP
almost for free, but *almost* strictly immutable and functional coding
(which is much more of a learning curve than a negative)) would make a
ton more sense here.
 
Josh Cheek

puts "The number of tokens is: %d." % File.open(ARGV[0]){|f|
f.inject(0){|a,l| a+l.split.length } } ,
"It took #{(Time.now - start) * 1000} ms"

I find it intellectually stimulating and admirable at the power of what
is accomplished in such a short statement.

Understanding it however takes time and thought - and a certain level of
skill.

Perhaps your experiences are different - but most development teams have
a lot of more junior and intermediate developers than senior - and the
more verbose, easy to read, simple to understand code is far more
preferable - even for myself when I must review, debug and profile the
code of dozens of others - as opposed to something that looks like an
academic puzzle to unravel.

Hello, Ben,
I apologize if my solution turns you off to Ruby (because of the joy, and
excitement you'll miss ;) Ruby is supportive of many paradigms, and so I
adapt my code to my preference. I also write my Java like this, as much as
the language allows. Certainly you can write code which is much more
straightforward than mine. Due to the way I read my code later, I like to
get as much accomplished in as little room as possible (I'm perfectly happy
to let it run off the end of the screen), and then supply a comment telling
me what it does, and if it is esoteric, explaining how it does this. This
allows me to quantify chunks of code very quickly, and narrow my attention to
only the relevant portions. If those portions are complex, my comment helps
me quickly figure it out. If your junior developers read code differently,
then perhaps a less terse style would be more fitting. Ruby also adapts
itself very well to legibility, if you choose to write it that way (in
Rails, people often remark that the code documents itself).

Choosing the right tool for the job is a relevant cliche here, and it sounds
like your job, being so performance oriented, requires a tool well suited to
meet these performance needs. If that need is so pressing that you've had to
replace Java with C, then Ruby is probably not the tool you need. You can do
things in Ruby that will make your head spin (these simple examples do not
even hint at the power Ruby grants you), but that doesn't make it the right
tool for this job. If Java is better suited to your needs for this project,
then that would certainly be the responsible choice.

However, I'd encourage you to keep evaluating the language, even if it is
not the best choice for this particular project, because it can take a
little time to figure out 'the Ruby way'. And as Pharrington pointed out,
"Different languages express solutions to different problems in
different ways". I think that once you play with it to the point of comfort
and see how Ruby addresses various patterns (consider
http://www.amazon.com/Design-Patterns-Ruby-Addison-Wesley-Professional/dp/0321490452),
you may begin to feel that same love some of us here do. And then, I suspect
that instead of seeing how well Ruby can pretend to be Java, you'll find
yourself wondering if Java can't be a bit more like Ruby. Perhaps at that
time, a solution like JRuby would look very desirable. Also, great strides
are being made in regards to speed, which would significantly alleviate the
most relevant objection.

Anyway, regardless of your choice, thank you for taking the time to educate
yourself about Ruby.
 
David Masover

First, I should say that I'm going to present arguments for Ruby here, whether
or not I think it's the best choice. Without knowing what you need, I really
can't say.

Yes, "CPU is cheap", but that applies to Java as well.

But if you are in a position to be able to throw more hardware at the problem,
it does really become a question of CPU time vs programmer time. That is, if
Ruby really does cost 3x the CPU of Java, you can calculate in real dollars
how much it will cost to use.

Speaking for myself, I certainly feel Ruby makes me more than three times as
productive as Java, and programmer time is much more expensive than CPU time.
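To make the "calculate in real dollars" idea concrete, here is a back-of-the-envelope sketch. Every figure in it (server price, developer cost, team size, the 3x factors) is an invented placeholder, not a measurement, and the two helper methods are hypothetical names.

```ruby
# All figures below are invented placeholders, not measurements.
SERVER_COST_PER_YEAR    = 3_000.0   # dollars per machine per year (assumed)
DEVELOPER_COST_PER_YEAR = 100_000.0 # dollars per developer per year (assumed)

# Extra yearly hardware cost if one platform needs `cpu_factor` times the
# machines that the baseline needs.
def extra_hardware_cost(baseline_servers, cpu_factor)
  (baseline_servers * cpu_factor - baseline_servers) * SERVER_COST_PER_YEAR
end

# Yearly developer cost saved if the same work takes 1/productivity_factor
# of the developer time.
def developer_savings(developers, productivity_factor)
  developers * DEVELOPER_COST_PER_YEAR * (1 - 1.0 / productivity_factor)
end

hardware = extra_hardware_cost(10, 3) # 10 machines become 30
people   = developer_savings(5, 3)    # 5 developers, each 3x as productive

puts "Extra hardware per year:   $#{hardware.round}"
puts "Developer time saved/year: $#{people.round}"
puts(people > hardware ? 'Developer savings dominate' : 'Hardware cost dominates')
```

Plug in your own numbers; with a large enough server fleet and a small enough team, the comparison tips the other way.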

Obviously, there are cases where spending programmer time makes sense. For
example, performance-critical desktop apps (CAD, games, etc) cannot use the
"throw more CPU at it" argument, because you're now forcing your clients to
upgrade their hardware to use your product. A large enough organization might
want to optimize as much as they can -- if rewriting it in C saves a thousand
machines and takes a developer a year, it's probably worth it, unless you can
get a thousand machines cheaper than a developer.

On the other hand, there's a case to be made that you should "write one to
throw away" -- if you can do it in Ruby in a few days, and rewrite it in Java
in a few weeks, the rewrite can take lessons learned from that ruby sketch.

It would also give you time to evaluate the "fast enough" argument. If it
turns out that you have plenty of extra capacity, and the Ruby version runs
fast enough, it may not be worth rewriting. If it turns out that Ruby is too
slow (even after trying Ruby 1.9 and JRuby), you've only lost a few days.

Perhaps in a straight-forward CRUD app where all that's being done is
retrieving/storing data in a database where IO truly is the bottleneck
that no amount of optimization can improve - then it doesn't matter and
"good enough" rings true where IO takes 100ms and the Ruby/Java portion
is only 10-20ms on top of the IO.

That depends...

Response time is only part of the story. What you really want to benchmark is
requests per second, and that's not always as simple as multiplying response
time.
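A simplified illustration of why the two numbers diverge, using the 100ms IO plus 10-20ms figures from the quoted paragraph. It assumes enough worker processes or threads that the cores stay busy while other requests wait on IO; the `Request` struct and `max_requests_per_second` are hypothetical names for this sketch.

```ruby
# Hypothetical request profile: time spent waiting on IO vs. time on the CPU.
Request = Struct.new(:io_ms, :cpu_ms) do
  def response_ms
    io_ms + cpu_ms
  end
end

# Upper bound on requests per second when only the CPU is the limit:
# each core can serve 1000 / cpu_ms requests every second.
def max_requests_per_second(request, cores)
  cores * 1000.0 / request.cpu_ms
end

fast = Request.new(100, 10) # 100 ms IO + 10 ms CPU
slow = Request.new(100, 20) # same IO, twice the CPU work

[fast, slow].each do |r|
  puts "response: #{r.response_ms} ms, " \
       "CPU-bound ceiling on 4 cores: #{max_requests_per_second(r, 4).round} req/s"
end
```

The response times differ by under 10% (110 ms vs 120 ms), yet the CPU-bound ceiling differs by a factor of two, so a benchmark of one says little about the other.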

## C Extensions to Ruby ##

This feels akin to saying in the late 90s that to make Java perform
well, use JNI to use C. To me it defeats the whole purpose of the "Ruby
is simple and pleasant" paradigm. If I have to optimize it with C, then
it's no longer simple or a joy to use.

It's a bit like C -- it's going to be fast enough most of the time, but
there's always the possibility you'll find some small part that can be
rewritten in assembly to squeeze some extra performance out of it.

Most of what I said could be summarized as: The speed of a nonworking program
is irrelevant, and premature optimization is the root of all evil. (I don't
remember who I'm quoting, but it was someone important.)

My preference would be, if I can write 97% of the program in Ruby, and 3% in
C, is that really going to be less pleasant than writing 100% of the program
in Java?

Nor am I convinced yet that managing a codebase over 5-10 years and 40+
developers is any easier with Ruby - Java's static typing and now its
generics (the polar opposite approach to 'duck typing' in Ruby) are
actually very nice for readability, navigation of code, refactoring and
other such things on such large codebases when so much of it is from
other coders, teams, or just plain old and forgotten about.

I suspect most of that is due to the tools, more than the type system itself.

The most compelling argument I've heard is: Type checks are just a special
case of unit tests. You need unit tests anyway, and good unit tests will
already more than cover what type checks were meant to save you from.
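As a small sketch of that argument: the method below has no declared types, and a few checks exercise it with the kinds of objects it is expected to handle. The `count_tokens` method and the hand-rolled `check` helper are made up for this example; a real project would use a proper test framework.

```ruby
require 'stringio'

# Counts whitespace-separated tokens in anything that responds to #each_line
# (a File, a StringIO, a String, ...). No type declarations involved.
def count_tokens(source)
  total = 0
  source.each_line { |line| total += line.split.length }
  total
end

# Tiny hand-rolled check, standing in for a real test framework.
def check(description)
  raise "FAILED: #{description}" unless yield
  puts "ok: #{description}"
end

check('counts tokens in a String')   { count_tokens("a b\nc d e\n") == 5 }
check('counts tokens in a StringIO') { count_tokens(StringIO.new("a b\nc\n")) == 3 }
check('rejects objects that cannot be read line by line') do
  begin
    count_tokens(42)
    false
  rescue NoMethodError
    true
  end
end
```

The last check is the closest analogue to a compile-time type error: passing the wrong kind of object fails, but only when the code path is actually run, which is why the tests have to cover it.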

I suppose I'm curious -- when was the last time the type system saved you?
That is, when was the last time you tried to pass an object of the wrong type
to a method, got a type error of some sort, and realized you needed to
do something other than a simple typecast on that object?

Also, no amount of "throwing hardware at it" will make Ruby faster than
throwing the same hardware at Java - which is ultimately I think the
biggest issue I have with the concept.

Indeed -- but again, you're paying for it with increased developer time.

And throwing the same hardware at Java that you would need for Ruby just gives
you a bunch of unused capacity -- you'd be buying less hardware. So it is
pretty much a straight trade.

If I throw the fastest hardware I
can at something, I want my user to get the biggest bang for the buck -

Which is pretty much going to give you benchmark candy. If your site is
slowing down, that's a bug. Once the speed of the site is acceptable, and
you're set up to handle spikes appropriately, more performance doesn't really
buy you anything other than "because you can".

As for "brevity" equalling "maintainability" and "less bugs" - I tend to
disagree when the pursuit results in code such as this example given:

puts "The number of tokens is: %d." % File.open(ARGV[0]){|f|
f.inject(0){|a,l| a+l.split.length } } ,
"It took #{(Time.now - start) * 1000} ms"

Probably true for that -- though, to be fair, I wouldn't have put it that way.

However, there was a study done, at one point, which showed that the ratio of
bugs to lines of code was constant across languages. So while I wouldn't say
it makes sense to make things unreadably brief, Ruby is usually both more
readable and shorter. 100 lines of code is generally easier to read and debug
than a thousand.

Perhaps your experiences are different - but most development teams have
a lot of more junior and intermediate developers than senior - and the
more verbose, easy to read, simple to understand code is far more
preferable - even for myself when I must review, debug and profile the
code of dozens of others - as opposed to something that looks like an
academic puzzle to unravel.

I think that particular code sample was misleading -- certainly, you can play
Perl Golf in any language. But you have coding conventions in Java, and you
would in Ruby.

What types of applications and codebases do you feel truly are served
best by Ruby - and therefore not in need of the highest performance
given to the end-user?

I feel the kind that is served best is any sort of web app, or small scripts
for system administration -- particularly anything that needs to be flexible
and constantly maintained and improved, for which the developer controls the
hardware.

Were there better deployment tools (and Shoes seems to be an effort in that
direction), I'd also say any sort of desktop app that doesn't need the highest
performance possible. Frankly, that's most of them -- an instant messaging
client, for instance, doesn't need to be blazingly fast, just fast enough.

But, it's not always about whether the end-user needs the highest performance.
It's about whether it's possible to throw CPUs at the problem, or whether the
CPU is the bottleneck at all.
 
Brian Candler

Ben said:
## Fast Enough ##

I often hear that Ruby is "fast enough" or that the performance
difference is not important since IO is generally the real source of
performance issues.

I understand that in certain cases - though when I see potential
performance degradation as multiples (2x, 3x, etc) as opposed to small
percentages, it makes me question the decision to use Ruby much more
than if we were talking percentages of 10-20% - such as 115ms vs 100ms
on a page response.

For example, a java environment my team has built provides SOAP/REST
webservices for product catalog search - and responsiveness is a very
significant measurement criteria of the success of the product and
system. Therefore, our average server side response time is something we
watch very closely.

And in that case, your response time may be dominated by the time to
process the SOAP request and format the SOAP response, so evaluating the
performance of those libraries is important.

I still don't buy the "fastest to execute must be best" reasoning. In
reality, there will be a threshold of acceptability - e.g. 90% of
requests must be returned within 150ms - in which case you're free to
choose a platform which meets that goal, and/or apply money to the
hardware platform as required.
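A threshold like that is straightforward to check against measured response times. The sketch below uses the simple nearest-rank percentile method; the sample timings are hypothetical and `percentile` / `meets_target?` are names invented for this example.

```ruby
# Returns the value at or below which `percent` percent of the samples fall,
# using the simple nearest-rank method.
def percentile(samples, percent)
  raise ArgumentError, 'no samples' if samples.empty?
  sorted = samples.sort
  rank = (percent * sorted.length / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# True when at least `percent` percent of responses finished within `limit_ms`.
def meets_target?(samples_ms, percent: 90, limit_ms: 150)
  percentile(samples_ms, percent) <= limit_ms
end

# Hypothetical response times in milliseconds, including one slow outlier.
samples = [80, 95, 110, 120, 125, 130, 135, 140, 145, 400]

puts "90th percentile: #{percentile(samples, 90)} ms"
puts "Meets 90% within 150 ms? #{meets_target?(samples)}"
```

Note that a percentile target tolerates the occasional slow outlier, whereas an average would be pulled up by it.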

However, if your competitor using Ruby gets a product to market in one
third of the time, then it doesn't matter if your Java solution performs
50% better - you won't have any customers.

## Maintainability, Speed of Development and 'Enjoyment' ##

I hear "speed of development", "maintainability", and "enjoyment of
coding" as the reasons to move to Ruby - and to accept the negatives in
performance, tools, libraries etc.

I'm still not sold on these reasons - as I truly enjoy working in the
Java ecosystem of tools, projects, libraries etc

Then you have no need to ask anything more here. You're sold on Java,
you're productive with Java, you enjoy Java, so use Java.

But computing is not static. There are probably still people who use
Algol and Fortran daily, and they are Turing-complete of course, but
newer languages make programming easier. You found the same when you
moved from C to Java, so you can see the benefits of keeping abreast of
new developments. It's always good to stretch yourself out of your
comfort zone to experience how people are using different languages
effectively. Perhaps you should try something more radically different
for a C/Java programmer, like Erlang.

Nor am I convinced yet that managing a codebase over 5-10 years and 40+
developers is any easier with Ruby - Java's static typing and now its
generics (the polar opposite approach to 'duck typing' in Ruby) are
actually very nice for readability, navigation of code, refactoring and
other such things on such large codebases

Again, if you find this a benefit, then go with Java. Most people here
find the opposite, but then, you're on a ruby users' mailing list so
what do you expect :)

That is: most of us hugely value Ruby's speed of development (like
writing the same thing in 1/10th of the number of lines of code), and if
there's a run-time penalty, that's a minor issue.

You won't really get a taste for what we mean until you start writing
Ruby (real Ruby, not Java ported line-by-line to Ruby). Perhaps you
could start with using jruby to wire up your Java objects. You'll get
the raw performance of the underlying Java objects, but using jruby as
the integration glue.

As for "brevity" equalling "maintainability" and "less bugs" - I tend to
disagree when the pursuit results in code such as this example given:

puts "The number of tokens is: %d." % File.open(ARGV[0]){|f|
f.inject(0){|a,l| a+l.split.length } } ,
"It took #{(Time.now - start) * 1000} ms"

I find it intellectually stimulating and admirable at the power of what
is accomplished in such a short statement.

Understanding it however takes time and thought - and a certain level of
skill.

I agree with you, this is an unnecessary use of inject, and people who
push this sort of code at newcomers are not doing them any favours. I would
write this simply as:

    tokens = 0
    File.foreach(ARGV[0]) do |line|
      tokens += line.split.length
    end
    puts "The number of tokens is: #{tokens}"

Regards,

Brian.
 
David Masover

My previous version would probably be better like this:

    start = Time.now
    puts "Starting to read file ..."
    puts "The number of tokens is: %d." % File.open(ARGV[0]){|f|
      f.inject(0){|a,l| a+l.split.length } } ,
      "It took #{(Time.now - start) * 1000} ms"

Just for fun, here's a verbose, somewhat less magical version:

    start = Time.now
    filename = ARGV.first

    puts 'Starting to read file...'

    count = 0
    File.open filename do |file|
      file.each_line do |line|
        count += line.split.length
      end
    end

    puts "The number of tokens is: #{count}."
    duration = Time.now - start
    puts "It took #{duration*1000} ms"


That is intended to be somewhat self-documenting, so a bit more verbose than I
might normally do. It does more or less the same thing, in more or less the
same way. It also seems to be following roughly the pattern you did in Java,
and I find it _much_ more readable.

Of course, it's a short (benchmark) example, so it's difficult to show Ruby
really shining here, unless you want to play golf. But even the readable
version is also far less verbose than the equivalent Java.
 
Rick DeNatale

I'm evaluating Ruby for use in a variety of systems that are planned by
default to be Java.

I've started down a path of doing various performance tests to see what
kind of impact will occur by using Ruby and in my first test the numbers
are very poor - so poor that I have to question if I'm doing something
wrong.

Is this test case in any way representative of the tasks you will
actually be performing?

Test file 1:
java FileReadParse
Starting to read file...
The number of tokens is: 1954
It took 16 ms
ruby -v file_read_parse.rb
ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-linux]
Starting to read file ...
The number of tokens is: 1954
It took 4.951 ms

Test file 2:
java FileReadParse
Starting to read file...
The number of tokens is: 479623
It took 337 ms
ruby file_read_parse.rb
Starting to read file ...
The number of tokens is: 479623
It took 2526.455 ms
ruby file_read_parse-2.rb
Starting to read file ...
It took 588.065 ms
The number of tokens is: 479623

One of the things this 'benchmark' skips over is the time it takes to
initialize the two environments.

There's no measurement of the time between hitting enter on the
command line and the point where the

start = Time.now

line (or its equivalent in the Java program) gets executed.

It might be interesting to execute those benchmarks using something
like the linux time command to measure the differences in "womb to
tomb" execution times.
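If you would rather stay inside Ruby than use the shell's time command, something along these lines measures the same thing. It is only a sketch: `wall_clock_ms` is a name invented here, and it simply times a child process from launch to exit.

```ruby
require 'benchmark'
require 'rbconfig'

# Runs a command as a child process and returns the wall-clock time in
# milliseconds, including interpreter or JVM start-up and shutdown.
def wall_clock_ms(*command)
  seconds = Benchmark.realtime do
    ok = system(*command, out: File::NULL)
    raise "command failed: #{command.join(' ')}" unless ok
  end
  seconds * 1000
end

# Timing an empty Ruby script isolates the start-up cost of the interpreter.
startup = wall_clock_ms(RbConfig.ruby, '-e', '')
puts "Ruby start-up and exit: #{startup.round(1)} ms"

# The benchmark programs from this thread could be timed the same way,
# assuming they are present in the current directory:
#   wall_clock_ms('ruby', 'file_read_parse.rb')
#   wall_clock_ms('java', 'FileReadParse')
```

Subtracting the time each program reports for itself from this total gives a rough figure for the start-up and shutdown overhead of each runtime.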

I suspect, but I may be totally wrong, that Java takes a while to
'warm up' a more complex runtime environment, and that Ruby gets going
faster. The JVM has evolved in an environment where it has tended to
be used for long-running processes.

Depending on the task this may be important.

This can also affect deployment architecture. JRuby tends to
encourage multi-threading in a single process to amortize the start-up
time. Most of us who are using MRI for, say, Rails are pretty happy
with having multiple server processes which can be brought up as
needed (and terminated when not) under something like Passenger
(a.k.a. mod_rails), particularly using the "Enterprise" version of Ruby,
which allows for sharing VM code between the processes.


--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale
 
Ben Christensen

Thank you very much for the excellent answers, and your well reasoned
responses to what could have easily been dismissed as someone "not
getting it" or attempting to start a flame war.

I have quoted various aspects of the responses and added my response or
further comments.



-- May I then ask what made you consider Ruby in the first place?

The reason I'm considering it is because I don't want to blindly choose
Java just because it's the default.

As for why Ruby and not Erlang, Scala, Groovy etc -- the honest answer
is because Ruby is getting so much attention these days, to the point of
religious fervor amongst many I speak to that I need to take an
objective look at it and what it does well.

I greatly dislike religious wars amongst technology - for example Linux,
Mac and Windows - and thus want to understand the objective
benefits/drawbacks as opposed to personal taste.



-- "seeing how well Ruby can pretend to be Java"

My intention is not to see how Ruby can pretend to be Java.

I'm using Java as the point of comparison for a few reasons:

- it's what I have the most expertise in
- it's generally the "default" choice in the types of projects and
development teams I lead
- the language has a very wide range of understanding in the development
world and is therefore a good point of reference to discuss from
- I need a valid point of reference for objective performance
comparisons

That being said, I am trying to figure out what the "Ruby way" is -
which so far is far from clear to me.

I appreciate the reference to the Design Patterns in Ruby book. That is
very much the type of recommendation that will probably help me out, so
thank you.

http://www.amazon.com/Design-Patterns-Ruby-Addison-Wesley-Professional/dp/0321490452



-- You won't really get a taste for what we mean until you start writing
-- Ruby (real Ruby, not Java ported line-by-line to Ruby).

What example opensource projects can you refer me to which espouse the
"real Ruby" style of doing things?

I'd prefer non-Rails projects, as I understand the completely different
approach of webapp dev with Rails.

I'm looking specifically at Ruby.

I keep getting told that I must understand the "Ruby way" - so I'd
appreciate instruction on how to accomplish the "Ruby way" considering I
am apparently boxed in as a "Java/C style programmer" ... despite
disliking C :)




-- But if you are in a position to be able to throw more hardware at the
-- problem, it does really become a question of CPU time vs programmer
-- time. That is, if Ruby really does cost 3x the CPU of Java, you can
-- calculate in real dollars how much it will cost to use.

If scalability was the only issue, this would be a valid response.

For example, if Java and Ruby both performed single threaded
transactions at 150ms each, and both scaled to 10 concurrent threads
equally well, but Java continues to scale to 30 concurrent threads and
Ruby does not, then that's a scenario where I can add 3 machines to
scale Ruby horizontally and truly argue that the cost of the hardware is
more than made up for by lower developer costs.

But, "per request" performance does not get improved by this type of
solution.

Adding faster hardware does not make Ruby catch up to Java - since Java
also improves with faster hardware.

This is why the "add hardware" answer doesn't win me over on the
performance issue, because performance and scalability are two
completely different problems. I haven't even begun to test scalability
with Ruby yet.



-- Response time is only part of the story. What you really want to
-- benchmark is requests per second, and that's not always as simple as
-- multiplying response time.

That's correct ... but supports my point. Requests per second is the
throughput, or scalability - not performance.

That is something I can throw hardware at - performance is not.
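
The distinction can be put into numbers with a small sketch. The figures below (140 ms and 420 ms per request, 10 and 30 workers) are assumptions chosen for illustration, and the model is idealised: each worker handles one request at a time, with no queueing or contention.

```ruby
# Requests per second for a pool of workers that each handle one
# request at a time (idealised: no queueing, no contention).
def throughput_rps(latency_ms, workers)
  workers * 1000.0 / latency_ms
end

java_latency_ms = 140.0 # assumed per-request time
ruby_latency_ms = 420.0 # assumed to be 3x slower per request

java_rps = throughput_rps(java_latency_ms, 10)
ruby_rps = throughput_rps(ruby_latency_ms, 30) # three times the workers

puts format('Java: %.1f req/s at %.0f ms each', java_rps, java_latency_ms)
puts format('Ruby: %.1f req/s at %.0f ms each', ruby_rps, ruby_latency_ms)
```

Under these assumptions tripling the workers brings throughput level, while each individual Ruby request still takes 420 ms.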


-- Which is pretty much going to give you benchmark candy. If your site
-- is slowing down, that's a bug. Once the speed of the site is
-- acceptable, and you're set up to handle spikes appropriately, more
-- performance doesn't really buy you anything other than "because you
-- can".

I disagree. If I can cut 30% of the transaction time off of a search
engine request - that is valuable.

It provides a better experience to the user and (according to Google
and Amazon) increases their usage of the system.

Performance of response (not talking about scalability here but actual
performance) is more than just "bragging rights" or "benchmark candy".
The speed at which an application responds to an end user's request
impacts the overall usability of an application.

It is for this same reason that things such as network compression,
network optimization (CDNs, Akamai route acceleration etc) and client
side caching also all play a role.

In the presentation layer however, I tend to think the performance
degradation of using Ruby is far less of an issue than backend services,
since IO does play such a huge role - which is more or less what
Thoughtworks has come to conclude from their use of it based on their
reported experiences.



-- premature optimization is the root of all evil

I 100% agree. Martin Fowler comes to mind or someone similar.



-- My preference would be, if I can write 97% of the program in Ruby,
-- and 3% in C, is that really going to be less pleasant than writing
-- 100% of the program in Java?

An interesting observation and one I must consider.



-- when was the last time the type system saved you?

This is a valid and interesting question.

I would suggest that it's not that it is "saving" anything - cause there
is nothing to save once the application is running, because the code
can't be compiled if things aren't type-safe.

It's the toolset as you stated that you suspect.

The readability of code: knowing exactly what type a given argument,
variable or array contains.

The IDE telling me as I type when errors are occurring, what objects
relate to what, navigating through code, etc.

I've attempted RubyMine, Aptana and Netbeans. They are attempting this
dynamic interpretation but are far from accomplishing it.

For example, code completion in these tools to suggest the available API
methods is almost useless, as they offer virtually every method
available under the sun, as they are not interpreting what actual type
the variable is. Therefore they'll show me 15 different versions of a
method with the same name, all for different object types from the Ruby
API.

Similarly, looking at an array or collection in Ruby does not tell me
what it is, especially if things are being passed around through
methods, across class boundaries etc. Instead of the method signature
telling me "Collection<ZebraAnimal>" I just see a variable.

Thus, I must now depend on a team of developers properly documenting
everything, using very descriptive naming conventions (and properly
refactoring all of that when changes occur), and wrapping everything in
unit tests.

Now, all of those are "ideal" cases - ones I believe in and stress
continually. I have hundreds and hundreds of unit tests and automated
build servers etc - but in the "real world", getting teams to comment
every method, properly name (and refactor) variable names and cover
everything in unit tests just doesn't happen - unless it's a small team
of very competent people who all believe in the same paradigm and treat
their code as art. I wish that's how all dev teams were, but it's not a
reality.

Perhaps if it's a personal project where I know the code and can ensure
all is covered it's a different story.



-- 100 lines of code is generally easier to read and
-- debug than a thousand.

I'll give you that - but I have yet to see anything that proves to me
that a competent developer fluent in both Ruby and Java (or C# for that
matter) would write 10x as much code in Java as they would in Ruby.

The "cruft" so often referred to are things that I don't even consider
or think of. Boilerplate code ... clutter and sometimes annoying ...
fades into the background and tools remove the pain of it. And with the
advent of annotations, many of these arguments disappear when Java code
is written correctly with modern patterns and frameworks.



-- I think that particular code sample was misleading -- certainly, you
-- can play Perl Golf in any language. But you have coding conventions
-- in Java, and you would in Ruby.

You surely can, and I'm trying to understand what the coding conventions
are in Ruby. The book link offered is something I'm going to go look at.
Amazon referred another book called "The Ruby Way" which may also
provide me good insights. Any experience with that one?




-- ... best served by Ruby ...
-- small scripts for system administration

I completely agree here.

-- any sort of web app ... that needs to be flexible and constantly
-- maintained and improved, for which the developer controls the
-- hardware.

I'm leaning more and more towards this. In fact, I'm trying to figure
out how to rip Java out of my webapps completely and leave that to the
backend webservices and let the presentation layer be as free from
"code" as possible. Java developers typically aren't exactly the best at
client facing solutions (don't attack me on this if you disagree ...
this is of course not a definitive rule, it's just that I find it more
challenging to hire good web developers who are 'Java' skilled as
opposed to PHP, Ruby, Javascript, CSS, etc).

For example, if I can accomplish a dynamic front-end purely driven by
client side Javascript using AJAX techniques with a REST style
webservices backend, I will try to pursue that.

The middle ground seems to be pursuing Ruby or something else that is
still server-side, but better suited to the always changing pace of
webapp dev and more creative, script driven coding style better suited
to web developers and designers.

-- desktop app that doesn't need the highest performance possible

What you say makes sense here, but I am so far removed from desktop apps
that I'm useless in weighing in on this.



-- It's about whether it's possible to throw CPUs at the problem, or
-- whether the CPU is the bottleneck at all.

Understood and I agree.
 
B

brabuhr

-- My preference would be, if I can write 97% of the program in Ruby,
-- and 3% in C, is that really going to be less pleasant than writing
-- 100% of the program in Java?

An interesting observation and one I must consider.

Or, perhaps in your case: 9x% in Ruby, y% in Java.


Example 1:

require 'java'
java_import 'FileReadParse'

FileReadParse.new.do_stuff


Example 2:

require 'java'

java_import 'java.util.StringTokenizer'

File.open("/tmp/file_test.txt") do |file|
  file.each_line do |line|
    tokens = StringTokenizer.new(line)
    tokens.each do |token|
      #do_stuff_with(token)
    end
  end
end

(Though in the token counting case, example 2 is slower than pure
ruby: "tokens = StringTokenizer.new(line)" takes more time than
"tokens = line.split".)
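
For comparison, the pure-Ruby side of that remark can be sketched as follows. This is not the file_read_parse script from earlier in the thread, only a small stand-in: the helper name count_tokens and the throwaway sample file are assumptions made so the example runs anywhere under plain Ruby.

```ruby
require 'tempfile'

# Count whitespace-separated tokens, reading one line at a time.
def count_tokens(path)
  File.foreach(path).sum { |line| line.split.size }
end

# Small throwaway input so the sketch does not depend on /tmp/file_test.txt.
sample = Tempfile.new('file_test')
sample.write("the quick brown fox\njumps over\n\nthe lazy dog\n")
sample.close

start = Time.now
total = count_tokens(sample.path)
elapsed_ms = (Time.now - start) * 1000.0

puts "The number of tokens is: #{total}"
puts format('It took %.3f ms', elapsed_ms)
sample.unlink
```

Pointing count_tokens at a larger file gives a baseline to set against the StringTokenizer version under JRuby.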


Example 3:

require 'java'

java_import 'TokenProcessor'

token_processor = TokenProcessor.new

File.open("/tmp/file_test.txt") do |file|
  file.each_line do |line|
    line.split.each do |token|
      token_processor.process(token)
    end
  end
end


There could also be room in your toolbox for Ruby to help in the
testing of your Java code:
http://jtestr.codehaus.org/
http://wiki.github.com/aslakhellesoy/cucumber/jruby-and-java
 
L

lith

I suspect, but I may be totally wrong, that Java takes a while to
'warm up' a more complex runtime environment, and that Ruby gets going
faster. The JVM has evolved in an environment where it has tended to
be used for long-running processes.

Put in a few require statements, load some gems in the ruby source and
then repeat the comparison. The last time I checked, e.g., groovy's
startup time was comparable to ruby's.
 
M

Mike Sassak

[Note: parts of this message were removed to make it a legal post.]

-- You won't really get a taste for what we mean until you start writing
-- Ruby (real Ruby, not Java ported line-by-line to Ruby).

What example opensource projects can you refer me to which espouse the
"real Ruby" style of doing things?

I'd prefer non-Rails projects, as I understand the completely different
approach of webapp dev with Rails.

I'm looking specifically at Ruby.

I keep getting told that I must understand the "Ruby way" - so I'd
appreciate instruction on how to accomplish the "Ruby way" considering I
am apparently boxed in as a "Java/C style programmer" ... despite
disliking C :)

Hi Ben,

Three books come to mind when discussing "real Ruby" style:

"The Ruby Way", by Hal Fulton: http://isbn.nu/9780672328848 (the oldest
of the three, and I don't *think* it covers 1.9)
"The Well-Grounded Rubyist", by David A. Black: http://isbn.nu/9781933988658
"Ruby Best Practices", by Gregory Brown: http://isbn.nu/9780596523008

For projects, I'd recommend Rake or FasterCSV. There are no doubt many, many
more, but those two are written by very well respected Rubyists, and are
also in widespread use.

HTH,
Mike
 
M

Martin DeMello

-- May I then ask what made you consider Ruby in the first place?

The reason I'm considering it is because I don't want to blindly choose
Java just because it's the default.

As for why Ruby and not Erlang, Scala, Groovy etc -- the honest answer
is because Ruby is getting so much attention these days, to the point of
religious fervor amongst many I speak to that I need to take an
objective look at it and what it does well.

If speed and static typing are must-haves, and if you have already got
a lot invested in the jvm, scala is very well worth a look. It also
has some features that let you write concise, maintainable code in
much the same way that you could with ruby.
http://www.cordinc.com/blog/2009/04/combinatorial-iterators-in-jav.html
is an interesting look at the same task implemented in all three
languages.

martin
 
B

Ben Christensen

Thank you Martin and Mike for the book references, I will go pursue
further education on the subject from those.

This thread has been very instructional to me and I appreciate your
willingness to discuss this subject.

Have a nice weekend everyone.

Ben
 
P

Peter Booth

[Note: parts of this message were removed to make it a legal post.]

Ben,

Thanks for provoking a productive discussion.
I think we all agree that Ruby is slow, very slow, and my impression
is that you underestimate the slowdown.

But ...

The "is Ruby fast enough?" discussion suffers from the same flaws as
many that precede it (Fortran vs Assembler, C vs Fortran, C++ vs C,
C++ vs Fortran, Java vs C++ ...)

The discussion rests on some faulty assumptions:

That language runtime performance will dictate system performance. In
fact, rarely is that true and in performance engineering the truth is
much more farcical than anyone might think...

The first time I was paid to write Fortran code I'd been warned that
Fortran would be unacceptably slow compared to Assembler. It didn't
matter. I simply was not capable of writing sophisticated time series
analysis code in Assembler. In fact a large part of that project was
built and deployed with GW-Basic on a 6MHz 8086 CPU with 512K of RAM.
As a newbie programmer I didn't realize that an interpreted language
could not perform, and the app successfully predicted windshifts in
real time in about 5% of the time that had been budgeted. Perhaps with
more work experience I would have known better ;-)

Since then I've done a bunch of performance critical coding and, over
the past few years, a bunch of tuning work.

Ruby's 3x performance penalty is enormous.

But it's dwarfed by the performance degradation caused by typical
coding and typical physical architectures.

Two real, typical datapoints, from a list of hundreds ...

In Dec 2008 I tuned a production Rails app that had 100,000 users,
improving the client side build time, for the test page,
from 2.2 sec to 181 ms, (a factor of 12). In an appendix to that
project I identified more than a dozen unimplemented tunings that
could further lower that build time to about 7 msec.

In 2003 I worked on a similar Java project, and spent much longer
tuning a similar dynamic page (running on much slower hardware). The
team implemented more than 500 performance fixes over six months,
improving page build times from approx 2.5 sec to 14 ms (a factor of
180x).

Neither app was built by weak programmers - in fact they were two
extremely smart development teams.

So when you describe a web service that responds (server side) in 140
ms, and ask why you should consider a toolset that might triple that
response time, I ask

"has someone else deployed a similar web service that responds in 2
seconds?"
"has someone else deployed a similar web service that responds in 2 ms?"

"what would it take for it to respond in 20 ms?"
"what would it take for it to respond in 5 ms?"
"what would it take for it to respond in 1 ms?"

I hate slow code and slow websites and I resent the time I waste
waiting for both.
But our industry norm is for system response times to be 100x or
more slower than they need to be.

You might think "BS", or "OK, but he's talking about the doofus
programmers, he's not talking about us."

I'm talking about you, me, all of us.

I don't know anything about the web service that you describe but I
will happily wager $50 that we can take any Java web service that is
currently running in production, and replace it with a Ruby equivalent
that is twice as fast.

Note that I'm not saying "I can". I'm saying that you, me or any smart
programmer here can do this.

Here's the thing - I've worked on at least a dozen platform rewrite
projects, going back more than 20 years (
"we need a C version of this hand optimized assembler file IO layer"
"Hey build a Java version of this C++/X app",
"we need a web version of this desktop app",
"we need a script version of this compiled app").

Typically there's an accompanying message that management understand
that it might be twice as slow.

On every single occasion, the surprising outcome is that the new
version, built with a higher level "slower" toolset, outperformed the
"stable, optimized, tuned" version, typically by a factor of 3 or
more. So I'd be a fool to continue being surprised by this. I'm not
saying that I'm a better programmer than anyone. I am saying that the
amount of wasted resources in most deployed systems is much, much
higher than people realize, for a whole set of reasons.


Thanks for initiating such an interesting conversation, and for
persisting with it.

Peter Booth.
 
B

Ben Christensen

Peter,

Taking your experiences one step further, wouldn't it stand to reason
that if a system is being "rebuilt" with all of the lessons learned, but
with the mature "faster performance" language, that it could achieve
higher performance than being rebuilt in a new, less mature, "slower
performance" language?

Your well-founded arguments suggest that many (if not the majority of)
performance issues are in poor design and implementation - not the
language itself. I agree that this is often the case - I find and fix
many of them in the systems I profile. A recent example was an issue
causing 2 orders of magnitude in performance degradation because of
absolutely horrible design - nothing to do with any type of language,
platform or infrastructure.

But if a system is being built by a team capable of achieving the
performance gains you claim with a "slower" toolset, if given a "faster"
toolset, would that same team not accomplish an even better performing
end result?

Of course, I'm not suggesting a difference as extreme as Assembler and C
- which is such a different paradigm that this comparison is very
difficult to do.

Current languages though are so often variations on a theme - rather
than revolutionary changes in approach. For example, working with Ruby
doesn't leave me feeling like I've just experienced some nirvana -- it
feels like just a different approach to things that may or may not
benefit certain tasks -- but principally is not so different from Java
(or Groovy, Scala, C#, Python etc) as to make me feel something earth
shattering has occurred.

Thus, if a team equally skilled in both Ruby (and its "way" of doing
things) and Java could approach a project and avoid the design pitfalls
that cause most of the performance issues you have stated - then
wouldn't the team accomplish higher performance with Java?

Ben
 
D

David Masover

That being said, I am trying to figure out what the "Ruby way" is -
which so far is far from clear to me. [...]
What example opensource projects can you refer me to which espouse the
"real Ruby" style of doing things?

I can't think of any particularly good examples, mainly because...

unless it's a small team
of very competent people who all believe in the same paradigm and treat
their code as art.

I was part of just such a team. We built a set of semi-formal rules, and an
always-outdated document about coding style. Mostly, though, our coding style
evolved together because we were always in each other's code and over each
other's shoulder.

So, unfortunately, I've developed a very visceral and intuitive sense of what
"real Ruby" should be, what's idiomatic, but I find it difficult to express.

I can point to a few things you've probably heard:

- Duck typing. The type and class hierarchy is completely irrelevant. All you
care about is whether the object in question responds to a particular method.
(This means you should more often use #respond_to? rather than #kind_of? if
you're testing your arguments at all.)

- Encapsulation. Not as in enforcing what's private, because you can't
(there's always #send and #instance_variable_get), but as in, push the logic
back into the appropriate object, rather than into something operating on that
logic.

- DSLs. Or, less buzzword-y, define what you'd like to be able to do, and then
figure out how to do it. Go by what's most expressive, and most sounds like
English -- treat code as communication. "Code like a girl."

- Don't Repeat Yourself.
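
The duck typing point above can be sketched in a few lines. The method name write_report and the choice of the << message are assumptions for illustration only; the point is that the check is on behaviour via #respond_to?, not on class.

```ruby
require 'stringio'

# Writes lines to anything that accepts <<, whatever its class.
def write_report(sink, lines)
  raise ArgumentError, 'sink must respond to <<' unless sink.respond_to?(:<<)
  lines.each { |line| sink << line << "\n" }
  sink
end

buffer = write_report(StringIO.new, %w[alpha beta]) # an IO-like object
list   = write_report([], %w[alpha beta])           # an Array works too
text   = write_report(String.new, %w[alpha beta])   # and so does a String

puts buffer.string
p list
puts text
```

None of the three sinks share a useful common ancestor, yet the same method serves all of them.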

I can give you some extreme examples: Rake (or even Capistrano), Hpricot (or
better, Nokogiri), Sinatra, Markaby, and RSpec (or test-spec, etc).

I'm not suggesting you read the source of all of them. Rather, see how they
might be used. Sinatra is a particularly powerful example, especially combined
with Markaby -- though I prefer Haml for real projects. Rails is a fine
framework, but it's beautiful to see a framework dissolve into nothing more
than:

get '/' do
  'Hello, world!'
end
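
To show why that reads so cleanly, here is a toy sketch of how such a DSL can be built in plain Ruby. It is not Sinatra's implementation: the ROUTES hash and the dispatch method are hypothetical names invented for this example, and there is no HTTP involved.

```ruby
# A toy router, NOT Sinatra's implementation: `get` files a block under
# its path, and `dispatch` calls the block registered for a path.
ROUTES = {}

def get(path, &handler)
  ROUTES[path] = handler
end

def dispatch(path)
  handler = ROUTES[path]
  handler ? handler.call : 'Not found'
end

get '/' do
  'Hello, world!'
end

puts dispatch('/')        # Hello, world!
puts dispatch('/missing') # Not found
```

The block passed to get is ordinary Ruby, which is what lets the framework fade out of sight.
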

For example, if Java and Ruby both performed single threaded
transactions at 150ms each, and both scaled to 10 concurrent threads
equally well, but Java continues to scale to 30 concurrent threads and
Ruby does not, then that's a scenario where I can add 3 machines to
scale Ruby horizontally and truly argue that the cost of the hardware is
more than made up for by lower developer costs.

But, "per request" performance does not get improved by this type of
solution.

A good point. Still worth investigating whether Ruby can be "fast enough" for
this. Just for fun, here's a quick presentation:

http://www.slideshare.net/wycats/merb-camp-keynote-presentation

This is also relevant, as there are plans to merge Merb and Rails at some
point, while retaining the advantages of Merb -- particularly performance.

Adding faster hardware does not make Ruby catch up to Java - since Java
also improves with faster hardware.

Yes, you've said this before -- but it doesn't have to. Take your example
above -- if you can get Ruby under 150 ms, that's good enough. Adding faster
hardware gets Ruby under 150 ms. If it gets Java down to 30 ms, what's the
point?

It provides a better experience to the user and (according to Google
and Amazon) increases their usage of the system.

I'm curious what the threshold was for this to make a difference.

Certainly, at a certain point, it doesn't. The difference between 16 ms and 0.6
ms would actually be invisible to the human eye. But while 100 ms vs 50 ms may
make a difference, I'm skeptical. Users are annoyed at having to wait a tenth
of a second for a response?

The speed at which an application responds to an end user's request
impacts the overall usability of an application.

It is for this same reason that things such as network compression,
network optimization (CDNs, Akamai route acceleration etc) and client
side caching also all play a role.

These all make sense -- Akamai in particular -- in the context of having a 100
ms response instead of, say, 500 ms or a full second, or in the context of
scalability.

-- when was the last time the type system saved you?

It's the toolset as you stated that you suspect.

The readability of code: knowing exactly what type a given argument,
variable or array contains.

To me, this falls back into Duck Typing. What type does this argument contain?
Why is this a meaningful question? If I want it to contain a string, for
instance, all I really need to know is whether it responds to #to_s.

More likely, it's a more complex object, but it's still the behavior that I
care about, not the type of it. And this intuitively makes sense -- in the
real world, also. When making a hiring decision, do you care about the "type"
of the person -- their degree, their sex, their skin color? Or do you care
what they can do, and how they'll interact with the rest of the team?

Yes, the degree may be an indication of that, but it's not really what you
care about. And certainly, the other things I mentioned shouldn't enter into
the equation at all.

For example, code completion in these tools to suggest the available API
methods is almost useless, as they offer virtually every method
available under the sun, as they are not interpreting what actual type
the variable is.

Because it probably doesn't have one yet.

While it's a bit different, try running an IRB shell with your favorite
framework loaded and some sort of tab completion. It won't be perfect, but
it'll probably work.

In the meantime, I'm going to say that it isn't an issue for me, simply
because if the framework I'm using is so complex that I need code completion
for daily work, I'm probably using the wrong framework. I can think of some
times it would've been convenient, but not nearly worth having to use one of
these other languages.

Therefore they'll show me 15 different versions of a
method with the same name, all for different object types from the Ruby
API.

Any one of them would probably have been a starting place.

Thus, I must now depend on a team of developers properly documenting
everything, using very descriptive naming conventions (and properly
refactoring all of that when changes occur), and wrapping everything in
unit tests.

These are things you should rely on anyway.

No, not Hungarian notation, but calling the variable something more
descriptive than 'a' and 'b'.

Now, all of those are "ideal" cases - ones I believe in and stress
continually. I have hundreds and hundreds of unit tests and automated
build servers etc - but in the "real world", getting teams to comment
every method, properly name (and refactor) variable names and cover
everything in unit tests just doesn't happen

I don't comment every method. I should comment more than I do, but for
example:

def writable_by? user
  # ...
end

Tell me you don't at least have a guess what that does.

-- 100 lines of code is generally easier to read and
-- debug than a thousand.

I'll give you that - but I have yet to see anything that proves to me
that a competent developer fluent in both Ruby and Java (or C# for that
matter) would write 10x as much code in Java as they would in Ruby.

It's probably an exaggeration, but not much, though I admittedly have limited
experience in Java. But as an example, how much time do you spend writing
interfaces? Maybe it was the nature of the assignment, but I would guess
easily 20-30% of my time doing Java in school was doing things like writing
interface definitions.

That whole file becomes irrelevant in Ruby.

And I would say the same for Ruby or Python, and to a lesser extent, Perl and
Lisp -- it does end up being _significantly_ less code. I'm learning Lisp now,
and this book:

http://gigamonkeys.com/book

opens with just such an anecdote:

"The original team, writing in FORTRAN, had burned through half the money and
almost all the time allotted to the project with nothing to show for their
efforts... A year later, and using only what was left of the original budget,
his team delivered a working application with features that the original team
had given up any hope of delivering. My dad credits his team's success to
their decision to use Lisp.

"Now, that's just one anecdote. And maybe my dad is wrong about why they
succeeded. Or maybe Lisp was better only in comparison to other languages of
the day..."

I could say the same -- certainly Java is going to be better than FORTRAN. But
you'll still occasionally find the story of the team which beat everyone to
market, or swooped in and rewrote a failing project, or won.

The "cruft" so often referred to are things that I don't even consider
or think of. Boilerplate code ... clutter and sometimes annoying ...
fades into the background and tools remove the pain of it.

I don't think tools would remove the pain of looking at it, at least -- and
yes, it is annoying. Even if the language is going to be statically typed,
consider the runtime exception. If the Java compiler knows enough to know that
I forgot to declare what type of exceptions a method might throw, why do I
have to specify them at all? If it's for the sake of other developers, why
can't the tool tell them?

After all, there are going to be plenty of methods which really wouldn't care
about exceptions -- just let them pass through, let some other layer handle
them.
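
A small sketch of what "let them pass through" looks like in Ruby. The class RecordMissing and the three methods are hypothetical names made up for this example; only the outermost layer mentions the exception at all.

```ruby
# Hypothetical three-layer call chain.
class RecordMissing < StandardError; end

# Innermost layer: raises when the record does not exist.
def load_record(id)
  raise RecordMissing, "no record #{id}" unless id == 1
  { id: id, name: 'example' }
end

# Middle layer: no rescue and no declaration; errors simply pass through.
def record_name(id)
  load_record(id)[:name]
end

# Outermost layer: the one place that decides what to do about failure.
def show(id)
  record_name(id)
rescue RecordMissing => e
  "error: #{e.message}"
end

puts show(1) # example
puts show(2) # error: no record 2
```

The middle method would need a throws clause in Java if the exception were checked; here it carries no trace of it.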

I also find it telling that with Ruby, I can get by with just a good text
editor -- TextMate for OS X was excellent, though I now use Kate on Linux --
whereas with Java, I would pretty much need a tool just to remove the pain of
the language.

Amazon referred another book called "The Ruby Way" which may also
provide me good insights. Any experience with that one?

None. I did read a book called "The Rails Way" which was excellent, and seems
to be from the same series, but by a different author.

In fact, I'm trying to figure
out how to rip Java out of my webapps completely and leave that to the
backend webservices and let the presentation layer be as free from
"code" as possible.

Look at Haml and Sass. You'll either love it or hate it.

For example, if I can accomplish a dynamic front-end purely driven by
client side Javascript using AJAX techniques with a REST style
webservices backend, I will try to pursue that.

I like jQuery for this. Rails and Merb seem to be moving back towards
integrating this kind of thing -- "link_to_remote" is an old-school example,
and I suspect we'll see more of this sort of thing in the future.

I've also been a big fan of replacing the X in AJAX with either JSON or HTML,
as the situation demands. While it's a bit sloppy, HTML makes sense in that I
can then have all the HTML-generation stuff in the server-side views, where
they belong, and the Javascript on the client is that much simpler. But if I
was writing a richer client, JSON would be ideal, at least until someone shows
me a decent Javascript YAML library.

The middle ground seems to be pursuing Ruby or something else that is
still server-side, but better suited to the always changing pace of
webapp dev and more creative, script driven coding style better suited
to web developers and designers.

I think this would work well with the above. In particular, Rails has been
very REST-oriented for a very long time.
 
P

Peter Booth

[Note: parts of this message were removed to make it a legal post.]

Ben,

I think the problem is that we are technologists, so we see our work
through a technical lens. But developing systems is a human activity.


Peter,

Taking your experiences one step further, wouldn't it stand to reason
that if a system is being "rebuilt" with all of the lessons learned, but
with the mature "faster performance" language, that it could achieve
higher performance than being rebuilt in a new, less mature, "slower
performance" language?

It wouldn't be true, but it might stand to reason. It would be reasonable
for me to expect that Microsoft Word, on my dual-core MacBook Pro,
should be 100x as responsive as the first dedicated word processor
that I used 25 years ago.

I think that, in general, systems are as slow as is physically possible
whilst still being "good enough." I suspect that the 2 or 3 orders of
magnitude degradations are common today because of current hardware, and
that 20 years ago the same systems might have been only 1 or 2 orders of
magnitude slower than necessary.

Your well-founded arguments suggest that many (if not the majority of)
performance issues are in poor design and implementation - not the
language itself. I agree that this is often the case - I find and fix
many of them in the systems I profile. A recent example was an issue
causing 2 orders of magnitude in performance degradation because of
absolutely horrible design - nothing to do with any type of language,
platform or infrastructure.

My work has shifted in recent years from development to more short-term,
fire-fighting performance work. I've found that making a dramatic
performance improvement in minimal time requires being willing (and able)
to work at all layers in the technology stack: web page construction,
application code, server configuration, physical architecture (what
runs where), DB schema and query tuning, OS kernel tuning, TCP stack
tuning, RDBMS, physical network, hardware, hosting, virtualization, etc.

Unless I'm feeling jaded, I wouldn't say poor design and implementation -
rather incomplete or nonexistent design and physical architecture, and
flawed implementation. A positive spin on this is to say that 95% of
startups fail and 70% of IT projects allegedly fail, thus it's a waste of
effort to unnecessarily invest this time until there is evidence that
the system will live.

But if a system is being built by a team capable of achieving the
performance gains you claim with a "slower" toolset, if given a "faster"
toolset, would that same team not accomplish an even better performing
end result?

Not at all. The same team that builds a web service that responds in
two seconds is capable of building the same web service with a response
time of 200 ms. But what incentive do they have, if they don't have a
200ms SLA?

The "bloat factor" is not a measure of developer strength. It seems to
be more a function of the expected performance, and how visible
performance is.

Here's an example:

I worked on a project that had more than 100 developers building
"portlets", tiny portal widgets that each built a single piece of
content. My team was responsible for performance. After a few months of
furiously tuning everything we could, we made a change that had a
profound impact:

We reconfigured the integration instance of our system so that any user
would see the build time of each portlet, as a small label on the frame
of the portlet. At first, build times ranged up to a few seconds. Within
a few days those developers whose code was especially slow were
proactively asking experienced architects to help them fix performance
issues. Within two months there were no portlets building in more than
100 ms. The force of social disapproval had a much bigger impact than
buying ten licenses of a Java profiler.
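
The real system was a Java portal, and I'm not reproducing its code here. As a rough sketch of the same idea in Ruby, a wrapper can time each build and put the result on the frame. `render_with_build_time` and the label format are hypothetical, chosen only for illustration.

```ruby
# Hypothetical sketch: time a widget build and show the elapsed time
# as a label on its frame, so slow builds are visible to everyone.
def render_with_build_time(title)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  body = yield # the block builds the widget content
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round
  "[#{title} | built in #{elapsed_ms} ms]\n#{body}"
end

# Simulated slow build: the sleep stands in for real work.
puts render_with_build_time("Quotes") { sleep 0.05; "AAPL 123.45" }
```

The point of the design is visibility rather than measurement precision: the number appears next to the content, where every developer and tester sees it.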

... working with Ruby ..... is not so different from Java ... make me
feel something earth shattering has occurred.

I was a bad/intermediate C++ programmer when I learnt Java. My managers,
who were all much stronger at C++, would say similar things. Java did
turn out to be ground-breaking despite looking visually so much like C++.

The significance of a technology change depends upon how it's used, not
just syntax or the obvious feature-set.
I think it took more than a decade to clearly see how Java had changed
things for developers.

In 2015 I will be happy to discuss whether or not Ruby makes the earth
move for me.

Thus, if a team equally skilled in both Ruby (and its "way" of doing
things) and Java could approach a project and avoid the design pitfalls
that cause most of the performance issues you have stated - then
wouldn't the team accomplish higher performance with Java?

No. Like the guy in the bar said, "If my auntie Velma had a pair,
she'd be my uncle ..."

I interviewed with two competing, secretive, equity option dealers in
2004. Each had a team of 3 or 4 developers who had written their own
automated trading system. These dealers competed head to head and used
identical technology (Java, Linux on Intel), and both were successful.
The first place had identified that equity options was an important
business to them, knew that the US markets would be most profitable,
and had provided the group with almost a blank check to get the job
done. They were proud of their 300 server cluster and the profits they
made.

The other team was less optimistic and had a much smaller bankroll.
They didn't realize that the US was the place to be, so they tried to
deal in all global markets. They built a similar system to the first
group but they had a system workload that was about 10x that of the
first company. They were just as proud of their home-grown system,
which made them lots of money, even though it was hosted on a mere 4
servers.

Two teams, similar background, built near-identical systems, and one
had a capacity that was 1000x the capacity of the other.
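
A back-of-the-envelope check of that figure, using only the rough numbers from the story (300 servers versus 4, and "about 10x" the workload). The method name and the relative workload units are assumptions made for illustration.

```ruby
# Rough numbers from the anecdote; workload is in relative units.
FIRST_TEAM  = { servers: 300, relative_workload: 1.0 }
SECOND_TEAM = { servers: 4,   relative_workload: 10.0 }

# Workload handled per server.
def load_per_server(team)
  team[:relative_workload] / team[:servers]
end

# How many times more work each of the second team's servers handled.
def capacity_ratio(efficient, other)
  load_per_server(efficient) / load_per_server(other)
end

puts capacity_ratio(SECOND_TEAM, FIRST_TEAM).round # 750
```

Since the "10x" is itself approximate, this lands in the same ballpark as the 1000x quoted above: roughly three orders of magnitude per server.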

This is not unusual.


Peter.
 

Peter Booth

[Note: parts of this message were removed to make it a legal post.]


I'm curious what the threshold was for this to make a difference.

Certainly, at a certain point, it doesn't. The difference between 16 ms
and 0.6 ms would actually be invisible to the human eye. But while 100 ms
vs 50 ms may make a difference, I'm skeptical. Users are annoyed at
having to wait a tenth of a second for a response?



I've spent some time trying to understand this. My impressions:

Human Factors Research says:
A response is perceived as instantaneous if it occurs within 0.1 or 0.2 sec.
A response is perceived as immediate, and won't interrupt concentration, if it occurs within about 0.5 to 1.0 sec.
If we speed up a website, then a change that's less than 7% to 18% won't be discerned by a user.
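
The thresholds above can be turned into a rough classifier. This is only a sketch: it takes the upper end of each quoted range (0.2 sec, 1.0 sec, 18%) as the cutoff, which is my assumption, and the names `perceived_as` and `likely_noticeable?` are hypothetical.

```ruby
# Cutoffs taken from the upper end of each range quoted above (an assumption).
INSTANT_LIMIT_S     = 0.2
IMMEDIATE_LIMIT_S   = 1.0
NOTICEABLE_FRACTION = 0.18

# Classify a response time in seconds by how a user is likely to perceive it.
def perceived_as(seconds)
  if seconds <= INSTANT_LIMIT_S
    :instantaneous
  elsif seconds <= IMMEDIATE_LIMIT_S
    :immediate
  else
    :slow # beyond the range said not to interrupt concentration
  end
end

# Is the relative change between two response times big enough to be discerned?
def likely_noticeable?(before_s, after_s)
  (before_s - after_s).abs / before_s.to_f >= NOTICEABLE_FRACTION
end

puts perceived_as(0.15)                # instantaneous
puts likely_noticeable?(0.115, 0.100)  # false: about a 13% change
```

By this reading, the 115ms vs 100ms page response mentioned earlier in the thread would fall below what a user can discern.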

The State of the art
Google measured that a 400ms slowdown measurably reduced the number of searches per user by 0.76%.
Shopzilla invested in performance tuning their website and saw revenue increases of 7 to 12%.
Bing saw measurable changes in user behavior with slowdowns of 200ms but not with slowdowns of 50ms.

State of the Art Data Point: A Google search for "state of art
performance testing" from a slow broadband (ADSL 1.5MB/s) IE7 client
in Dulles VA has an average response time of 861 ms (entire page).

The State of the practice
The web is slow, unnecessarily slow, sometimes painfully slow.
Most website owners (and most developers) don't know that their websites are slow, and they don't know how to fix them.

Above Average Data Point: If you search either meetup.com or LinkedIn
for "performance testing" from the same client as above, both have an
average response time of 2.0 sec (entire page).

Sources
Designing and Engineering Time by Steven Seow
Papers from Xerox PARC
The O'Reilly Velocity Conference proceedings for 2007 to 2009
My clients
Remote test tools like Keynote, Neustar, Gomez etc
 

Charles Oliver Nutter

Worth noting again that for any long-running code or benchmarks,
you'll want to pass --server to use the optimizing JVM in JRuby. Much
faster.

1.8.6 is pretty slow, compared to other impls. Ruby 1.9 and JRuby will
perform better, as shown by a few folks. JRuby on a Java 6 JVM with
--fast and --server should perform very well.
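
Why this matters for benchmarks: a JIT-compiling VM needs some warm-up, so a single timed run mostly measures cold code. Here is a minimal sketch of timing several runs in one process. This is not the file_read_parse.rb script from the thread; `count_tokens`, `time_runs` and the sample text are made up for illustration.

```ruby
# Hypothetical workload: count whitespace-separated tokens in a string.
def count_tokens(text)
  text.split.size
end

# Time the same work several times in one process, in milliseconds.
def time_runs(text, runs)
  Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    count_tokens(text)
    ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(3)
  end
end

sample = "the quick brown fox " * 50_000
time_runs(sample, 5).each_with_index do |ms, i|
  puts "run #{i + 1}: #{ms} ms"
end
```

Running the same file under `ruby` and under `jruby --server` and comparing the later runs, rather than the first, gives a fairer picture of long-running performance.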

And, of course JRuby adds other possibilities:

$ java FileReadParse
Starting to read file...
The number of tokens is: 234937
It took 2098 ms

$ java FileReadParse
Starting to read file...
The number of tokens is: 234937
It took 788 ms

$ ruby -v file_read_parse.rb
ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
Starting to read file ...
The number of tokens is: 234937
It took 2666.646 ms

$ jruby -v file_read_parse.rb
jruby 1.3.1 (ruby 1.8.6p287) (2009-06-15 2fd6c3d) (Java HotSpot(TM)
Client VM 1.5.0_16) [ppc-java]
Starting to read file ...
The number of tokens is: 234937
It took 3120.0 ms

$ jruby --fast --server -v file_read_parse.rb
jruby 1.3.1 (ruby 1.8.6p287) (2009-06-15 2fd6c3d) (Java HotSpot(TM)
Client VM 1.5.0_16) [ppc-java]
Starting to read file ...
The number of tokens is: 234937
It took 2809.0 ms

$ jruby -v file_read_parse-2.rb
jruby 1.3.1 (ruby 1.8.6p287) (2009-06-15 2fd6c3d) (Java HotSpot(TM)
Client VM 1.5.0_16) [ppc-java]
Starting to read file...
The number of tokens is: 234937
It took 593 ms

$ java FileReadParse
Starting to read file...
The number of tokens is: 234937
It took 588 ms

$ jruby -v file_read_parse-2.rb
jruby 1.3.1 (ruby 1.8.6p287) (2009-06-15 2fd6c3d) (Java HotSpot(TM)
Client VM 1.5.0_16) [ppc-java]
Starting to read file...
The number of tokens is: 234937
It took 595 ms

$ cat file_read_parse-2.rb
require 'java'
java_import 'FileReadParse'

FileReadParse.new.do_stuff

:)
 
