[LRUG] Queuing systems
Jim Myhrberg
contact at jimeh.me
Wed Sep 7 03:09:04 PDT 2011
AMQP doesn't prohibit you from processing jobs in parallel. A broker will more or less round-robin messages between all consumers of a queue. By default a message is removed as soon as it's been pushed to a consumer, so double processing isn't an issue; it only becomes one if you're acknowledging each message after you've processed it.
Hence, double processing would only happen if your worker starts processing a job, updates database records or other state, and then something goes wrong before it acknowledges the message. The broker will then re-queue the unacknowledged message once the connection dies, and another consumer will process it again; since some of the database writes have already been made, things get weird.
Without using acknowledgements, though, your worker can crash and die and effectively lose the messages that have been pushed to it, since the broker removes each message from the queue as soon as it's been pushed.
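To make the trade-off concrete, here's a minimal sketch of a consumer with manual acknowledgements, in Python with the pika RabbitMQ client (the local broker, the "jobs" queue name and the process() handler are just placeholder assumptions, not anything specific to your setup):

    import pika

    def process(body):
        # Hypothetical job handler; imagine database writes here.
        print("processing", body)

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.queue_declare(queue="jobs", durable=True)

    def handle_job(ch, method, properties, body):
        process(body)
        # Ack only after the work is done. If the worker dies before
        # this line, the broker re-queues the message and another
        # consumer gets it, so process() needs to be idempotent or
        # you risk the double processing described above.
        ch.basic_ack(delivery_tag=method.delivery_tag)

    # auto_ack=True gives the other mode: the broker removes the
    # message as soon as it's pushed, so a crash loses anything
    # already in flight, but nothing is ever processed twice.
    channel.basic_consume(queue="jobs", on_message_callback=handle_job,
                          auto_ack=False)
    channel.start_consuming()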
-jim
On Wednesday, 7 September 2011 at 10:48, Chris Rode wrote:
> Processing every job is accommodated within the AMQP specification (be careful about double processing, though). In-order processing, however, removes the ability to process jobs in parallel.
>
> In such systems it is much easier to create a job on the queue than to process it. If you don't process in parallel when consuming, you may create a bottlenecked backlog that gets worse over time.
>
> Use cases off the golden path will not guarantee in-order processing, even in AMQP-compliant technologies.
>
> On 07/09/2011, at 11:31, Neil Middleton <neil.middleton at gmail.com> wrote:
>
> > Everything is processed in order. Generally jobs are similar to analytics data, and are processed to provide a similar sort of stats set.
> >
> > Every single job needs to be processed.
> >
> > Neil
> >
> > On Wednesday, 7 September 2011 at 10:29, Graham Ashton wrote:
> >
> > > On 7 Sep 2011, at 10:18, Neil Middleton wrote:
> > >
> > > > Hundreds, possibly thousands of new jobs per second.
> > >
> > > I was actually wondering in terms of bytes per second, but that's still a useful answer.
> > >
> > > I know I'm still avoiding the question here, but I'm now wondering how you're thinking of going about processing them. That would probably have an impact on how I'd approach queueing.
> > >
> > > e.g. If your job processing component went offline for six hours for some reason, would it (when it came back up) be more important to process the earliest queued data or the most recent data?
> > >
> > > If your job processing got a long way behind, would it make sense to drop really old jobs and just start on recent stuff? I'm wondering if this data needs persisting to disk, or whether you can get away with stashing it in RAM (possibly in a persistence-backed key-value store like Redis).