[LRUG] Queuing systems

gareth rushgrove gareth.rushgrove at gmail.com
Wed Sep 7 14:40:44 PDT 2011


On 7 September 2011 10:55, Neil Middleton <neil.middleton at gmail.com> wrote:

> We don't really want to get into messing about with
> hosting too much.

To be honest, given what you've described I'd probably get someone
involved in the project who is happy to go 'messing about with
hosting'.

What you're describing seems to call for a highly available queue and a
scalable number of persistent workers. That's going to require some
server juggling however you go about it, because everything involved is
going to fail at some point or other.
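
To make the "everything fails" point concrete, here is a minimal sketch
in Python, assuming an in-memory queue standing in for a real broker.
The names run_workers and make_flaky_handler are made up for
illustration and don't come from any library. A job only counts as done
once the handler returns; a failed job goes back on the queue until it
runs out of attempts.

```python
from collections import deque


def run_workers(jobs, handler, max_attempts=3):
    """Process every job at least once, requeueing on failure.

    Illustrative only: a real system would use a broker, not a deque.
    """
    queue = deque((job, 1) for job in jobs)
    done, dead = [], []
    while queue:
        job, attempt = queue.popleft()
        try:
            handler(job)
            done.append(job)  # "acknowledge" only after success
        except Exception:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))  # requeue at the back
            else:
                dead.append(job)  # give up and park it for inspection
    return done, dead


def make_flaky_handler(fail_first_for):
    """Hypothetical handler that fails once for the given job ids."""
    seen = set()

    def handler(job):
        if job in fail_first_for and job not in seen:
            seen.add(job)
            raise RuntimeError("worker died")

    return handler


done, dead = run_workers([1, 2, 3, 4], make_flaky_handler({2}))
print(done, dead)
```

Note that the retried job completes after the jobs that were queued
behind it, so retrying on failure already costs you strict ordering.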

I'd also see if any of your constraints can be relaxed.
Given the millions of records you're collecting, I'm assuming that the
aggregate data is the end goal (which might be incorrect, it's just an
example), so if you lose a few records, is that going to throw off the
significance of the outputs? Relaxing things from the ideal will make
your life much easier.
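
As a rough way to check that for yourself, here is a small Python
sketch. Everything in it is an assumption for illustration: the data is
synthetic, the 0.1% loss rate is invented, and the mean stands in for
whatever aggregate you actually compute. It drops records at random and
compares the aggregate with and without them.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic stand-in for per-record measurements (assumed, not real data).
values = [rng.expovariate(1.0) for _ in range(200_000)]

LOSS_RATE = 0.001  # assumed: 0.1% of records go missing at random
kept = [v for v in values if rng.random() >= LOSS_RATE]

full_mean = sum(values) / len(values)
kept_mean = sum(kept) / len(kept)
relative_error = abs(kept_mean - full_mean) / full_mean

print(f"records lost: {len(values) - len(kept)}")
print(f"relative error in mean: {relative_error:.6%}")
```

This only holds if the losses are random. If failures cluster (say, one
customer's data or one busy hour goes missing), the aggregate can be
skewed far more than the raw loss rate suggests.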

G

>
> On Wednesday, 7 September 2011 at 10:48, Chris Rode wrote:
>
> Processing every job is accommodated within the AMQP specification (be
> careful about double processing though). Processing in order, however,
> removes the ability to process in parallel.
> In such systems it is much easier to create the job on the queue than to
> process it. If you don't process in parallel when consuming you may create
> a bottlenecked backlog which will get worse over time.
> Use cases off the golden path will not guarantee in-order processing, even
> in the AMQP-compliant techs.
> On 07/09/2011, at 11:31, Neil Middleton <neil.middleton at gmail.com> wrote:
>
> Everything processed in order.  Generally jobs are similar to analytics
> data, and are processed to provide a similar sort of stats set.
> Every single job needs to be processed.
> Neil
>
> On Wednesday, 7 September 2011 at 10:29, Graham Ashton wrote:
>
> On 7 Sep 2011, at 10:18, Neil Middleton wrote:
>
> Hundreds, possibly thousands of new jobs per second.
>
> I was actually wondering in terms of bytes per second, but that's still a
> useful answer.
>
> I know I'm still avoiding the question here, but I'm now wondering how
> you're thinking of going about processing them. That would probably have an
> impact on how I'd approach queueing.
>
> i.e. If your job processing component went off line for six hours for some
> reason, would it (when it came back up) be more important to process the
> earliest queued data, or the most recent data?
>
> If your job processing stuff got a long way behind would it make sense to
> drop really old jobs and just start on recent stuff? I'm wondering if this
> data needs persisting to disk, or whether you can get away with stashing it
> in RAM (possibly in a persistence backed key value store like Redis).
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org



-- 
Gareth Rushgrove
Web Geek

morethanseven.net
garethrushgrove.com
