asynchronous network access with Rack?

Nick Brown

I've read that "threading is considered harmful" for Ruby web apps.
Well, I'm writing a Sinatra app which will build a page based on the
responses of several servers (Net::HTTP.get). I want to do these .gets
in parallel, as doing them synchronously would obviously mean the users
would wait for a long time.

Would it be "considered harmful" to do:

resp_a, resp_b, resp_c = nil
thread_a = Thread.new { resp_a = Net::HTTP.get site_a }
thread_b = Thread.new { resp_b = Net::HTTP.get site_b }
thread_c = Thread.new { resp_c = Net::HTTP.get site_c }
thread_a.join
thread_b.join
thread_c.join

Is there any possible harm that could come from this? Can threading
interfere with Rack in some way? I haven't done much previous
development of threaded apps, so I would appreciate any tips.
 
Tom Reilly

I've used threads to fetch several web pages at once for a long time and
I've never had any problem, so long as you catch the error when a
specific web page can't be obtained.
Tom Reilly
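
A minimal sketch of what "catching errors" could look like here, assuming
each fetch runs in its own thread and a failed fetch should simply yield
nil. The method name fetch_all and the fetcher: parameter are made up for
illustration (the parameter only exists so the network call can be
swapped out); they are not part of Net::HTTP or Sinatra.

```ruby
require 'net/http'
require 'uri'

# Fetch every URL in its own thread. An error in one fetch is rescued
# inside that thread, so the remaining responses are still returned.
# Returns a hash of url => body, with nil for any fetch that failed.
# NOTE: `fetch_all` and `fetcher:` are hypothetical names for this sketch.
def fetch_all(urls, fetcher: ->(url) { Net::HTTP.get(URI(url)) })
  threads = urls.map do |url|
    Thread.new do
      begin
        fetcher.call(url)
      rescue StandardError => e
        # Covers SocketError, Errno::ECONNREFUSED, Timeout::Error, etc.
        warn "fetch failed for #{url}: #{e.class}: #{e.message}"
        nil
      end
    end
  end
  # Thread#value joins the thread and returns the block's result.
  urls.zip(threads.map(&:value)).to_h
end
```

Without the rescue, an exception raised inside a thread is re-raised in
the caller as soon as join or value is called on that thread, so one bad
site would take the whole page down with it.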
 
Ezra Zygmuntowicz

IMHO you would be better served using the 'thin' Rack-compatible web
server and its async mode along with EM::HTTP::Request. This way you
can use an event-driven style with zero threads: you basically pause the
client's request connection while you make async calls to all the other
web services, and once they have all returned you fire the async
callback so that thin resumes the client's connection and returns the
results.

Doing it this way will require a bit more mental twisting to get all the
async stuff correct, but it will be far more scalable and will serve you
much better in the end.

Cheers-

Ezra Zygmuntowicz
 
Richard Conroy

There are some historical reasons behind threading == harmful (defaults
for Rails, the GIL and native gems, and a general lack of robustness in
older Ruby thread implementations).


I believe Sinatra/Rack is thread safe, so you should be fine on that count.

What's more important is that this model isn't exactly a good
architecture. You are spawning a lot of threads per request and you have
no real external oversight into how they are working. You can't send
back your response until you have received all your outbound responses,
and you are particularly vulnerable to timeouts: in particular, the
client browser can time out your request while you are still waiting on
responses to outbound connections.

You see a lot of solutions that use process-level concurrency
(BackgroundRb, DelayedJob etc.), but most web solutions that aggregate
content from multiple sites (i.e. mashups) do it all in the browser,
with some cross-site scripting and JavaScript.

Technically I don't see too many issues with the multi-threaded approach
you propose for smaller requests, but you will want to set an aggressive
timeout on the outbound requests.
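
One way to set such a timeout, sketched here with Net::HTTP's own
open_timeout and read_timeout settings. The method name get_with_timeout
and its seconds: parameter are hypothetical, and returning nil on timeout
is just one possible policy.

```ruby
require 'net/http'
require 'uri'

# Fetch a URL, giving up if connecting or waiting for data takes longer
# than `seconds`. Returns the response body, or nil on timeout.
# NOTE: `get_with_timeout` is a hypothetical helper for this sketch.
def get_with_timeout(url, seconds: 4)
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: seconds,   # limit on opening the connection
                  read_timeout: seconds) do |http|  # limit on each read
    http.get(uri.request_uri).body
  end
rescue Net::OpenTimeout, Net::ReadTimeout
  nil
end
```

Only timeouts are rescued in this sketch; other failures such as a
refused connection or a DNS error still raise and need their own
handling. Also note that read_timeout applies to each individual read, so
it is not a hard cap on the total time a slow response can take.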
 
Nick Brown

Tom Reilly:
I've never had any problem...

Awesome! Good to hear :)

Ezra Zygmuntowicz:
you would be better served using the 'thin' Rack-compatible web
server and using its async mode along with EM::HTTP::Request.

Thanks. I've been using Apache+Passenger because that's what I know, but
I will investigate Thin if it is indeed more scalable. Are you referring
to RAM usage when you say it's more scalable?

Richard Conroy:
javascript ... you will want to set an aggressive timeout

This must happen server-side. But you're right about the timeouts. And
some searching has revealed Timeout::timeout() to me! It would appear
that:

resp_a = nil
thread_a = Thread.new { Timeout::timeout(4) { resp_a = Net::HTTP.get site_a } }
...
thread_a.join

will do what I need, so long as I catch exceptions, too. And again, I'm
still open to other suggestions if anyone else has any!
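
A rough sketch of how the timeout and the exception handling might fit
together, assuming a timed-out or failed fetch should just produce a
fallback value. The helper name with_deadline and its fallback: parameter
are invented for this example.

```ruby
require 'timeout'

# Run the given block in a new thread with a deadline. The thread's value
# is the block's result, or `fallback` if the deadline passes or the
# block raises.
# NOTE: `with_deadline` and `fallback:` are hypothetical names.
def with_deadline(seconds, fallback: nil, &work)
  Thread.new do
    begin
      Timeout.timeout(seconds, &work)
    rescue Timeout::Error
      fallback  # the work took longer than `seconds`
    rescue StandardError
      fallback  # e.g. SocketError or Errno::ECONNREFUSED
    end
  end
end
```

With this, the snippet above would become something like
thread_a = with_deadline(4) { Net::HTTP.get(URI(site_a)) } followed by
resp_a = thread_a.value, where site_a is assumed to be a URL string.
Because the rescue happens inside the thread, value never re-raises.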
 
Brian Candler

Or this slightly shorter version:

thread_a = Thread.new { Net::HTTP.get site_a }
thread_b = Thread.new { Net::HTTP.get site_b }
thread_c = Thread.new { Net::HTTP.get site_c }
val1 = thread_a.value
val2 = thread_b.value
val3 = thread_c.value
 
