[LRUG] Queue-related war stories

James McCarthy james at lety.co
Mon Mar 16 03:01:11 PDT 2015


I've used and swear by RabbitMQ. It avoids all the 
locking/transaction/commit issues you hit when using a database as a queue.

A handy thing with RabbitMQ is its acknowledgement mode, which can be 
automatic or manual.

With manual acknowledgement, a message is only removed from the queue 
once the consumer calls the acknowledge method, so a message whose worker 
dies mid-job is redelivered rather than lost.
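
To illustrate the difference, here is a toy in-memory model in Ruby. It is 
only a sketch: ToyQueue, its methods and the demo helper are made up for 
the example and are not the RabbitMQ client API. With the Bunny gem, the 
corresponding calls should be queue.subscribe(manual_ack: true) and 
channel.ack(delivery_info.delivery_tag).

```ruby
# Toy model of auto vs manual acknowledgement (hypothetical, not RabbitMQ itself).
class ToyQueue
  def initialize
    @ready = []     # messages waiting to be delivered
    @unacked = {}   # delivery tag => message delivered but not yet acknowledged
    @next_tag = 0
  end

  def publish(message)
    @ready.push(message)
  end

  # Hand the next message to a consumer.
  # manual_ack: false -> the queue forgets the message as soon as it is delivered.
  # manual_ack: true  -> the queue holds on to it until ack(tag) is called.
  def deliver(manual_ack:)
    return nil if @ready.empty?

    message = @ready.shift
    @next_tag += 1
    @unacked[@next_tag] = message if manual_ack
    [@next_tag, message]
  end

  # The consumer confirms the work is done; only now is the message gone for good.
  def ack(tag)
    @unacked.delete(tag)
  end

  # The consumer went away without acknowledging: put its messages back.
  def consumer_lost
    @ready = @unacked.values + @ready
    @unacked.clear
  end

  def size
    @ready.size + @unacked.size
  end
end

# Deliver one message, then simulate the worker dying before it finishes.
# Returns how many messages the queue still holds afterwards.
def demo(manual_ack:)
  queue = ToyQueue.new
  queue.publish("send welcome SMS")
  _tag, _message = queue.deliver(manual_ack: manual_ack)
  queue.consumer_lost
  queue.size
end

puts "auto ack, messages left after crash:   #{demo(manual_ack: false)}"
puts "manual ack, messages left after crash: #{demo(manual_ack: true)}"
```

In auto mode the crashed job's message is gone; in manual mode it is back 
on the queue for another worker. The flip side is that a job which does 
its work and then dies before acknowledging will be run again, so jobs 
still need to be safe to repeat.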

James.

On 16/03/15 09:12, Najaf Ali wrote:
> Hi all,
>
> I'm trying to identify some general good practices (based on real-life 
> problems) when it comes to working with async job queues (think DJ, 
> Resque and Sidekiq).
>
> So far I've been doing this by collecting stories of how they've 
> failed catastrophically (e.g. sending thousands of spurious SMSs to 
> your customers) and seeing if I can identify any common themes based 
> on those.
>
> Here are some examples of what I mean (anonymised to protect the 
> innocent):
>
> * Having a (e.g. hourly) cron job that checks if a job has been done 
> and then enqueues the job if it hasn't. It knows this because the 
> successfully completed job would leave some sort of evidence of 
> completion in e.g. the database. If your workers go down for a day, 
> this means the same job would be enqueued over and over again 
> superfluously.
>
> * Sending multiple emails (hundreds) in a single job led to a problem 
> where if just one of those emails (say the 24th) fails to be 
> delivered, the entire job fails and emails 1-23 get sent again when 
> your worker retries it again and again and again.
>
> * With the workers/app running the same codebase but on different 
> virtual servers, deploying only to the application server (and not the 
> server running the workers) resulted in the app servers queueing jobs 
> that the workers didn't know how to process.
>
> It would be great to hear what sort of issues/incidents you've come 
> across while using async job queues like the above. I don't think I 
> have enough examples to make any generalisations about the "right way" 
> to use them yet, so more interested in just things that went wrong and 
> how you fixed them at the moment.
>
> Feel free to reply off-list if you'd rather not share with everyone, I 
> intend to put the findings together in a blog post with a few guesses 
> as to how to avoid these sorts of problems.
>
> All the best,
>
> -Ali

-- 
James McCarthy

Software Consultant

LetyCo

Mob:  07577006897

Email:  james at lety.co

lety.co
