[LRUG] A general question about exception handling in services

Wed Apr 10 16:07:41 PDT 2013

Thanks Gabe.

Any other thread-local variable supporters or opponents out there?

I'm thinking it may make sense for us to build an abstraction whose first
implementation uses thread-local ids, and to have a plan to move to a
logging server the first time this bites us.
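
A minimal sketch of what such an abstraction might look like, assuming the id
lives in `Thread.current` for now. The names `RequestContext`,
`ThreadLocalStore` and the `:request_context_id` key are hypothetical, not
taken from any existing library; the point is only that callers never touch
the thread-local directly, so the backend can be swapped later.

```ruby
require "securerandom"

# Hypothetical abstraction: callers only talk to RequestContext, so the
# storage underneath can be replaced without touching call sites.
module RequestContext
  KEY = :request_context_id # hypothetical key name

  # Default backend. Thread.current[] is per-thread (and per-fiber on
  # Ruby 1.9+), so the value is never shared between threads.
  class ThreadLocalStore
    def initialize(key)
      @key = key
    end

    def get
      Thread.current[@key]
    end

    def set(value)
      Thread.current[@key] = value
    end
  end

  class << self
    # Assign a different store here to change the backend.
    attr_writer :store

    def store
      @store ||= ThreadLocalStore.new(KEY)
    end

    def id
      store.get
    end

    # Sets the id for the duration of the block and always restores the
    # previous value, so a reused thread never keeps a stale id.
    def with_id(id = SecureRandom.uuid)
      previous = store.get
      store.set(id)
      yield id
    ensure
      store.set(previous)
    end
  end
end
```

An entry point (web request, console, Resque job) would wrap its work in
`RequestContext.with_id { ... }`, and log statements would read
`RequestContext.id`.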

We're using JRuby in places, and my hunch is that as soon as we try to
leverage threads, actors or anything similar we will be in concurrency
debugging hell, so perhaps dumping everything to a shared Mongo or logging
service would be the simplest option.
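
To illustrate the kind of thing I expect to bite us, here is a small sketch
(the `:request_id` key and the `"req-123"` value are made up): a value stored
on the current thread is not visible from a newly spawned thread unless it is
handed over explicitly.

```ruby
Thread.current[:request_id] = "req-123" # hypothetical key and value

# A freshly spawned thread starts with its own, empty thread-locals,
# so the id set on the parent thread is not visible here.
seen_in_child = Thread.new { Thread.current[:request_id] }.value

# Passing the id in explicitly is one way to carry it across the boundary.
parent_id = Thread.current[:request_id]
copied_into_child = Thread.new(parent_id) do |id|
  Thread.current[:request_id] = id
  Thread.current[:request_id]
end.value

puts seen_in_child.inspect     # the child never saw the parent's id
puts copied_into_child.inspect # the explicitly passed id
```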

On Wednesday, 10 April 2013, Gabe da Silveira wrote:

> I wouldn't be totally averse to a global variable (or rather a
> thread-local variable) in this case.  Think of it like a PID, except that it is
> tied to the request.  As long as it is used exclusively for logging and you
> have a reliable way to guarantee it's set on every usage of the code (i.e.
> web requests, consoles, resque jobs, etc) then you are obeying the single
> responsibility principle even if encapsulation is technically broken.
>
> On Wed, Apr 10, 2013 at 9:01 PM, Mark Burns <markthedeveloper at gmail.com> wrote:
>
>> I get the impression there is a pattern for doing this and probably
>> someone on this list has some good input into it.
>>
>> We've been thinking about how to handle failures in internal services
>> whilst integrating with third-party services, trading off robustness
>> against the ability to debug complex requests, and yet still noticing
>> genuine errors in our codebase (e.g. avoiding things like 'try',
>> 'rescue nil' or 'rescue Exception').
>>
>> Let's say we have three internal services A, B, C and some external API
>> providers X, Y, Z.
>>
>> Some object may be responsible for communicating with Z, but this object
>> doesn't have access to the original incoming request.
>> Also it's absolutely critical that if this request to Z fails, the rest
>> of the request can complete, our external API user is shielded from
>> the failure, and some manual or separate automated process resolves the
>> issue.
>>
>> To emphasise how critical this is: a user has paid for a service, and one
>> part of fulfilling the customer's purchase is an API call to an external
>> provider Z. If this doesn't occur we'd have angry customers, so we make
>> sure the request is fulfilled by any means possible (manually if
>> necessary), but still assure the customer we have fulfilled their order.
>>
>> We've been toying with the idea of generating unique identifiers for our
>> incoming requests and sending these in to all other internal services, then
>> we'd be able to log these ids in all our log statements. We'd also ideally
>> use these ids in communications to airbrake.
>>
>> We could pretty easily create middleware that can generate the ids and
>> send/receive them in headers to our other services, but the issue comes
>> with having access to this info in our models.
>>
>> sinatra route/rails controller code
>>  --   some long
>>  --   stack frame
>>  --  model code communicating with Z
>>
>> One solution that would get us to our controller/route code where we can
>> access the request info would be throwing or raising
>> exceptions, but this then prevents us continuing the request in the
>> normal way and completing the required tasks after a call to Z fails. Also
>> it's horrendous goto flow control.
>>
>> Other undesirable hacks would be sticking something on the thread itself,
>> or a global variable.
>>
>> The other thing is to actually ensure we can pass down request info all
>> the way through a stack, but this completely breaks single responsibility
>> and is going to result in complex spaghetti.
>>
>> There has to be some kind of intelligent solution to this that is
>> elegant, readable and maintainable and isn't any of the things I've
>> mentioned.
>>
>> Oh and another idea is to only swallow all exceptions in our protective
>> blocks around API calls in production mode, and to not do this in dev or
>> test. I.e. surface programming errors, but give as cast iron a guarantee as
>> we can that no failure of Z can possibly result in non-completion of the
>> rest of the request in production.
>>
>> Will appreciate hearing your thoughts,
>>
>> Mark
>>
>> _______________________________________________
>> Chat mailing list
>> Chat at lists.lrug.org
>> http://lists.lrug.org/listinfo.cgi/chat-lrug.org
>>
>>
>
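
For reference, the two ideas from the quoted message (middleware that
generates or forwards a request id, and a protective block around the call to
Z that only swallows failures in production) could be sketched roughly as
below. This is an assumption-laden illustration, not our actual code:
`RequestIdMiddleware`, `guard_external_call`, the `X-Request-Id` header name
and the `:request_id` key are all hypothetical. The middleware follows the
Rack calling convention (`call(env)` returning status, headers, body) but
needs no gem to run.

```ruby
require "securerandom"

# Rack-style middleware: reuses an incoming id or generates a new one, and
# exposes it to deeper code through a thread-local for the request's lifetime.
class RequestIdMiddleware
  ENV_KEY    = "HTTP_X_REQUEST_ID" # hypothetical header: X-Request-Id
  THREAD_KEY = :request_id         # hypothetical thread-local key

  def initialize(app)
    @app = app
  end

  def call(env)
    id = env[ENV_KEY] || SecureRandom.uuid
    env[ENV_KEY] = id
    previous = Thread.current[THREAD_KEY]
    Thread.current[THREAD_KEY] = id
    status, headers, body = @app.call(env)
    [status, headers.merge("X-Request-Id" => id), body]
  ensure
    # Always restore, so a pooled thread never carries a stale id.
    Thread.current[THREAD_KEY] = previous
  end
end

# Hypothetical helper: in production, log the failure with the request id
# and carry on; anywhere else, re-raise so programming errors surface.
# Rescues StandardError only, never Exception.
def guard_external_call(environment, log)
  yield
rescue StandardError => e
  raise unless environment == "production"
  log << "[#{Thread.current[:request_id]}] Z failed: #{e.class}: #{e.message}"
  nil
end

# Usage: the call to Z fails, but the request still completes.
log = []
app = lambda do |_env|
  guard_external_call("production", log) { raise IOError, "timeout talking to Z" }
  [200, {}, ["order accepted"]]
end
status, headers, _body = RequestIdMiddleware.new(app).call("HTTP_X_REQUEST_ID" => "req-123")
```

The logged line carries the request id, which is what a manual or automated
follow-up process would need to find and retry the failed call.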