[LRUG] Multi-threading, Ruby & Rails

Nicolas Overloop nicolas at couchcontrol.com
Mon Sep 17 13:57:28 PDT 2012


Hi Paul,

Assuming that your application is CPU-bound, multi-threading on JRuby
should help. JRuby uses native threads, which are scheduled by the kernel
and can utilize multiple cores.
MRI unfortunately can't run Ruby code on more than one core at a time, due to
the global interpreter lock (1.9) or the use of green threads (1.8).
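
A rough way to check this on your own machine: the sketch below runs the
same CPU-bound step serially and then with one thread per input, and prints
both wall-clock times. burn_cpu and the input sizes are made up for
illustration and only stand in for your real steps. On JRuby I'd expect the
threaded run to be noticeably faster on a multi-core box; on MRI it should
come out about the same as the serial run.

```ruby
require 'benchmark'

# Hypothetical CPU-bound step (sum of squares), standing in for real work.
def burn_cpu(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

# Run one thread per input and collect each thread's return value in order.
def run_in_threads(inputs)
  inputs.map { |n| Thread.new { burn_cpu(n) } }.map(&:value)
end

inputs = [200_000] * 4

serial_time   = Benchmark.realtime { inputs.map { |n| burn_cpu(n) } }
threaded_time = Benchmark.realtime { run_in_threads(inputs) }

puts "engine: #{RUBY_ENGINE}"
puts format('serial %.3fs, threaded %.3fs', serial_time, threaded_time)
```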

If you want to stick with MRI then you need multiple processes to take
advantage of multiple cores.
The problem seems well suited to (distributed) workers that drain
a centralized queue.
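
As a minimal sketch of the multi-process idea on MRI (Unix only, since it
relies on fork): the code below splits the instance ids into slices up
front, forks one worker per slice, and sends the results back to the parent
over a pipe. It is not a real queue; with something like Resque the workers
would pull jobs from a shared store instead. process_instance and
run_with_workers are hypothetical names, and the doubling is just a
placeholder for your 20-30 steps.

```ruby
# Hypothetical unit of work for one instance; replace with the real steps.
def process_instance(id)
  id * 2
end

# Fork roughly worker_count processes, each handling one slice of ids,
# and return a hash of id => result assembled in the parent.
def run_with_workers(ids, worker_count)
  return {} if ids.empty?

  slice_size = (ids.size / worker_count.to_f).ceil
  children = ids.each_slice(slice_size).map do |slice|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      pairs = slice.map { |id| [id, process_instance(id)] }
      writer.write(Marshal.dump(pairs))
      writer.close
      exit!(0) # leave without running at_exit hooks inherited from the parent
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  pairs = children.flat_map do |pid, reader|
    data = reader.read # blocks until the child closes its end
    reader.close
    Process.wait(pid)
    Marshal.load(data)
  end
  Hash[pairs]
end

puts run_with_workers((1..10).to_a, 4).inspect
```

Each child starts as a copy of the parent, so anything expensive (loading
Rails, say) is best done once before forking. Database connections should
be re-established inside each child rather than shared.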

Other thoughts:
- I don't know much about forking
- if your application is IO-bound, consider using EventMachine
- Upstart is great for keeping workers up
- too little information to tell if map/reduce would work

Cheers,
Nicolas



On Mon, Sep 17, 2012 at 6:57 PM, Paul Robinson <paul at 32moves.com> wrote:

> Hi all,
>
> Now the recruiter rant post is on HN, let's move that discussion over
> there and talk about some proper Ruby stuff, eh? Please?
>
> Right, multi-threading, Ruby and Rails.
>
> This is causing me some pain, and I suspect it's because my mid-/low-level
> coding voodoo left my soul sometime around 2004. The beauty of a high-level
> language such as Ruby mixed with the fact I have not had to spend a moment
> thinking about memory management in 6 years has left my deeper coding brain
> soft, flabby and over-obsessed with meta-programming. A little like the
> fattened goose before Christmas (who are *so* into meta-programming, btw).
>
> On our current project we have a linear process that takes some time to
> process. It can easily be parallelised, because it's a discrete set of
> 20-30 steps that need to be done in order for each of the 'x' number of
> instances we're dealing with. Right now it can take hours, and for various
> reasons we need it to take seconds.
>
> My first stab at this was to look at benchmarking profiles and to look for
> single methods that were taking up a lot of wallclock time. There aren't
> any. We're not locking on I/O, we're not sitting in a single method for 30%
> of the time or anything, it's just a long drawn-out set of processes.
> Interestingly, the only headliner (at 8% of wall clock) is Kernel#Integer
> and we can't eliminate that.
>
> So we're moving straight to parallelisation.
>
> My first thought was to either:
>
> a) Split things up into separate fork'ed processes, but I don't like the
> bootstrap/tidy-up overhead that fork provides
>
> b) Throw it out to cloud-like infrastructures like Hadoop/MapReduce, but
> the problem needs direct SQL access and that can get messy
>
> c) Multi-thread it, and at least on a single server be able to get 8x-16x
> performance increase over multiple cores and maybe re-visit b) but with
> something a bit more pure Ruby-esque like Delayed Job, Resque, etc.
>
> The problem is, multi-threading in Ruby - particularly in Rails with
> ActiveRecord model actions - kinda sucks. I can get it working, but it's
> painful. It doesn't look or feel graceful, and frankly I'm not sure if the
> internal methods for doing it are all that careful.
>
> Anybody here with experience in this little niche want to open up the
> discussion, provide some pointers and context, before I start poking around
> the internals of MRI? I've discovered that JRuby has a potentially better
> internal implementation of Thread, but I've not had a chance to play with
> it in anger yet - is it worth it?
>
> Thanks in advance,
>
> Paul