[LRUG] A general question about exception handling in services

Wed Apr 10 16:00:24 PDT 2013

Thanks Chris.

The exception catching and re-raising code is an approach we have used
elsewhere effectively. This is a case where it wouldn't work because of the
need to do something else if it fails.

if risky_thing.success?
  plan_a
else
  plan_b
end

Whilst passing enough info up to the controller or route level to execute
plan_b would be possible, it doesn't feel like a clean way to encapsulate
things.
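
A rough sketch of keeping plan_a/plan_b inside the service rather than in the controller, assuming the risky call can be wrapped in a small result object. Result, Fulfilment, provider and fallback are made-up names for illustration, and the rescued exception classes are only a guess at what a network call might raise.

```ruby
# Hypothetical names throughout: Result, Fulfilment, provider, fallback.
Result = Struct.new(:ok, :value, :error) do
  def success?
    ok
  end
end

class Fulfilment
  def initialize(provider, fallback)
    @provider = provider   # callable that talks to the external API
    @fallback = fallback   # callable that records the order for follow-up
  end

  def call(order)
    result = attempt(order)
    # plan_a needs nothing further; plan_b runs only when the call failed.
    @fallback.call(order, result.error) unless result.success?
    result
  end

  private

  # Turn the expected network failures into a failed Result; anything else
  # (e.g. a NoMethodError from a bug) still propagates.
  def attempt(order)
    Result.new(true, @provider.call(order), nil)
  rescue IOError, SystemCallError => e
    Result.new(false, nil, e)
  end
end
```

The controller then only sees a result and never needs to know how plan_b works.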

The queue sounds like the cleanest way to solve this. I'd forgotten about
it because, under time constraints, we opted for the plan a/b solution. Now
it feels like revisiting queuing would be the best way to avoid
complicating our code further, and it would even make it more robust.

Introducing yet another moving part would increase our debugging complexity
a step further, though, and would make the need for improved cross-service
logging even more apparent.
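
To make the trade-off concrete, here is a minimal sketch of the queue idea with a request id in every log line. It assumes an in-process Queue standing in for a real job queue; ProviderQueue, MAX_ATTEMPTS and the log format are invented for illustration.

```ruby
require "logger"
require "securerandom"

# Hypothetical sketch: an in-process Queue stands in for a real job queue.
class ProviderQueue
  MAX_ATTEMPTS = 3

  attr_reader :failed

  def initialize(provider, logger)
    @provider = provider   # callable that talks to the external API
    @logger = logger
    @jobs = Queue.new
    @failed = []           # jobs left for manual resolution
  end

  # Called from the request: returns straight away with a tracking id.
  def enqueue(payload, request_id = SecureRandom.uuid)
    @jobs << { payload: payload, request_id: request_id, attempts: 0 }
    request_id
  end

  # Process everything currently queued; a real worker would loop forever.
  def drain
    until @jobs.empty?
      job = @jobs.pop
      begin
        job[:attempts] += 1
        @provider.call(job[:payload])
        @logger.info("request_id=#{job[:request_id]} delivered")
      rescue IOError, SystemCallError => e
        @logger.warn("request_id=#{job[:request_id]} attempt=#{job[:attempts]} #{e.class}")
        if job[:attempts] < MAX_ATTEMPTS
          @jobs << job     # retry later
        else
          @failed << job   # give up and flag for a human
        end
      end
    end
  end
end
```

The request id travels with the job, so the worker's log lines can be matched to the original request without the model ever seeing the request object.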

On Wednesday, 10 April 2013, Chris Parsons wrote:

> How about posting the call to the external API providers to a queue?
>
> You can then return from the request with a good level of surety, and
> track the progress of the external API call, handling any failures and
> retrying as appropriate.
>
> If a queue isn't an option, I tend to use different kinds of Exceptions:
>
> * only catch very specific exceptions from the API calls
> * re-raise my own Exception objects
> * specifically catch my own Exception objects in the top level request
> code and handle as appropriate.
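
The three points above could be sketched roughly as follows. ProviderError, ProviderClient and handle_request are hypothetical names, and the list of rescued classes is an assumption about what the transport raises.

```ruby
# Hypothetical names: ProviderError, ProviderClient, handle_request.
class ProviderError < StandardError; end

class ProviderClient
  def initialize(transport)
    @transport = transport   # callable that performs the external API call
  end

  def submit(order)
    @transport.call(order)
  rescue IOError, SystemCallError => e
    # Only the failures we expect from the network are translated;
    # Ruby keeps the original exception available as #cause.
    raise ProviderError, "provider call failed: #{e.class}"
  end
end

# Stand-in for the top level request code.
def handle_request(client, order)
  client.submit(order)
  :fulfilled
rescue ProviderError
  :needs_manual_follow_up
end
```

A programming error such as a NoMethodError is not caught by either rescue, so it still surfaces as a genuine bug.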
>
> HTH,
> Chris
>
> --
> Chris Parsons
> chris.p at rsons.org
> http://twitter.com/chrismdp
> http://chrismdp.com
>
> BDD Kickstart London, May 22-24, http://bddkickstart.com/dates#london
>
>
> On 10 Apr 2013, at 21:01, Mark Burns <markthedeveloper at gmail.com>
> wrote:
>
> I get the impression there is a pattern for doing this and probably
> someone on this list has some good input into it.
>
> We've been thinking about how to handle failures in internal services,
> whilst integrating with third party services and trading off robustness and
> ability to debug complex requests and yet still notice actual genuine
> errors in our codebase. (e.g. avoiding things like 'try' and 'rescue nil'
> or 'rescue Exception')
>
> Let's say we have three internal services A,B,C and some external API
> providers X,Y, Z.
>
> Some object may be responsible for communicating with Z, but this object
> doesn't have access to the original incoming request.
> Also it's absolutely critical that if this request to Z fails, the rest of
> the request can complete, our external API user is shielded from the
> failure, and some manual or separate automated process resolves the issue.
>
> To emphasise the criticality of such a system it would be where a user has
> paid for a service and one part of the fulfilment of the customer's
> purchase is achieved by an API call to an external provider Z. If this
> doesn't occur then we'd have angry customers and so we make sure the
> request is fulfilled by any means possible (manual if necessary), but still
> assure the customer we have fulfilled their order.
>
> We've been toying with the idea of generating unique identifiers for our
> incoming requests and sending these in to all other internal services, then
> we'd be able to log these ids in all our log statements. We'd also ideally
> use these ids in communications to airbrake.
>
> We could pretty easily create middleware that can generate the ids and
> send/receive them in headers to our other services, but the issue comes
> with having access to this info in our models.
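
The middleware half of this might look something like the sketch below. It assumes a Rack-style app, and "X-Request-Id" is an assumed header name; RequestId is a made-up class. It only covers generating and forwarding the id, not getting it into the models.

```ruby
require "securerandom"

# Hypothetical Rack-style middleware; "X-Request-Id" is an assumed header name.
class RequestId
  ENV_KEY = "HTTP_X_REQUEST_ID"

  def initialize(app)
    @app = app
  end

  def call(env)
    env[ENV_KEY] ||= SecureRandom.uuid   # reuse an upstream id if one arrived
    status, headers, body = @app.call(env)
    # Echo the id back so callers and downstream logs can correlate on it.
    [status, headers.merge("X-Request-Id" => env[ENV_KEY]), body]
  end
end
```

An outgoing call to another internal service would send the same value in its own request header.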
>
> sinatra route/rails controller code
>  --   some long
>  --   stack frame
>  --  model code communicating with Z
>
> One solution that would get us to our controller/route code where we can
> access the request info would be throwing or raising
> exceptions, but this then prevents us continuing the request in the normal
> way and completing the required tasks after a call to Z fails. Also it's
> horrendous goto flow control.
>
> Other undesirable hacks would be sticking something on the thread itself,
> or a global variable.
>
> The other thing is to actually ensure we can pass down request info all
> the way through a stack, but this completely breaks single responsibility
> and is going to result in complex spaghetti.
>
> There has to be some kind of intelligent solution to this that is elegant,
> readable and maintainable and isn't any of the things I've mentioned.
>
> Oh and another idea is to only swallow all exceptions in our protective
> blocks around API calls in production mode, and to not do this in dev or
> test. I.e. surface programming errors, but give as cast iron a guarantee as
> we can that no failure of Z can possibly result in non-completion of the
> rest of the request in production.
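
A minimal sketch of that last idea, assuming the environment name comes from a variable called APP_ENV (an assumption) and that Shield.protect is a helper we would write ourselves:

```ruby
# Hypothetical helper; APP_ENV is an assumed environment variable name.
module Shield
  def self.protect(env: ENV.fetch("APP_ENV", "development"), on_error: ->(_e) {})
    yield
  rescue StandardError => e
    raise unless env == "production"   # surface bugs in dev and test
    on_error.call(e)                   # e.g. log or notify, then carry on
    nil
  end
end
```

Usage would be along the lines of Shield.protect(on_error: notifier) { call_provider_z }, where notifier and call_provider_z are placeholders. It rescues StandardError rather than Exception, so signals and exits are never swallowed.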
>
> Will appreciate hearing your thoughts,
>
> Mark
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org