[LRUG] Multi-threading, Ruby & Rails

Tue Sep 18 00:51:56 PDT 2012

If you're comfortable with threading and it feels like a good fit,
switch to JRuby without a second thought. Otherwise, Hadoop.

Remember that, after many, many years, Ruby is moving into the
threaded web app space, with Rails 4, Rubinius, JRuby and Phusion
Passenger all supporting threads. You will no longer be fighting an
uphill battle.
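
As a rough sketch of what that can look like with nothing but the
standard library's Thread and Queue (the `crunch` method and the
`parallel_crunch` helper are made-up names for illustration, not
anything from the original problem): on JRuby the workers below should
run on separate cores, while on MRI the same code runs but CPU-bound
work is still serialized by the global interpreter lock.

```ruby
require "thread" # provides Queue on older Rubies; harmless on newer ones

# Hypothetical CPU-bound unit of work; stands in for the real job.
def crunch(n)
  (1..n).reduce(0) { |sum, i| sum + i * i }
end

# Run crunch over every input using a fixed pool of worker threads.
# Results come back in the same order as the inputs.
def parallel_crunch(inputs, workers = 4)
  jobs = Queue.new
  inputs.each_with_index { |n, i| jobs << [i, n] }

  pool = Array.new(workers) do
    Thread.new do
      local = [] # per-thread results, so no shared state needs locking
      loop do
        begin
          index, n = jobs.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        local << [index, crunch(n)]
      end
      local # becomes the thread's value
    end
  end

  # Thread#value joins each worker and returns its local results.
  pool.flat_map(&:value).sort_by(&:first).map(&:last)
end
```

Calling `parallel_crunch([10_000, 20_000, 30_000])` returns the results
in input order. The pool size of 4 is an arbitrary default, so tune it
to your core count.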

Best,
Sidu.
http://c42.in
http://sidu.in

On 18 September 2012 04:00, Tim Cowlishaw <tim at timcowlishaw.co.uk> wrote:
> On 17 September 2012 22:38, Roland Swingler <roland.swingler at gmail.com> wrote:
>
>>> b) Throw it out to cloud-like infrastructures like Hadoop/MapReduce, but the problem needs direct SQL access and that can get messy
>>
>> I've not tried it and I don't know whether you need SQL or "SQL-like"
>> but there are things like Hive (http://hive.apache.org/), built on top
>> of Hadoop, that may be of some use?
>>
>
> If I recall correctly, Hive provides a SQL-like querying layer on top
> of information that's stored in a Hadoop (HDFS) cluster, rather than
> providing integration with a SQL db. However, there's a DB input
> format [1] for Hadoop that allows you to use the rows returned by a DB
> query as the input to a MapReduce job, which might be helpful in this
> case. It depends a little on the complexity of the query - in my
> fairly limited experience, doing complex joins can get rather messy,
> although there are patterns for writing MR jobs that alleviate this.
> Nathan Marz's 'Big Data' book [2], which is in Manning EAP at the
> moment, is in its infancy, but it looks like it's going to become a
> good reference for this sort of stuff when it's published, as is
> Manning's Hadoop book [3].
>
> Of course, using Hadoop would mean embracing some Java-ish
> infrastructure to a greater or lesser extent (you could use MRI Ruby
> to run your jobs with Hadoop streaming, but Hadoop itself is still a
> Java tool). Alternatively you can use JRuby to access the Java APIs
> directly, and if you're going down this road then:
>
>> JRuby threads are Java threads, so you get their benefits - i.e.
>> proper use of all cores, no global interpreter lock.
>
> ...which might give you the performance increase you need without the
> extra overhead of setting up and maintaining a Hadoop cluster. If you
> go down this route but are keen to use some sort of higher-level
> concurrency primitive than threads, locks, mutexes etc., then you
> might want to take a look at Akka [4], an Erlang-ish library for
> actor-based concurrency on the JVM. It's written and maintained by
> Typesafe, the Scala guys, but is usable from any other JVM language
> too (and it looks like people have had some success using it with
> JRuby [5]), so it might prove fruitful if you decide that JRuby's the
> way you want to go.
>
> Hope this helps!
>
> Tim
>
> REFERENCES
> ------------------
>
> [1] http://hadoop.apache.org/docs/mapreduce/current/api/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.html
> [2] http://www.manning.com/marz/
> [3] http://www.manning.com/lam/
> [4] http://akka.io/
> [5] http://metaphysicaldeveloper.wordpress.com/2010/12/16/high-level-concurrency-with-jruby-and-akka-actors/