[LRUG] e-petitions site

Aleksandar Simic asimic at gmail.com
Sun Aug 7 09:10:23 PDT 2011


On Sun, Aug 7, 2011 at 5:02 PM, Chris Parsons <chris.p at rsons.org> wrote:
> Hi Paul,
> On 5 Aug 2011, at 09:33, Paul Robinson wrote:
>
> > It would be interesting to hear what caused the e-petitions site to go
> > under. What caching was in use? Which tier gave in first? Was the load
> > profile unexpected in that it was biased to some corner of the app that
> > nobody thought people would be much interested in?
> >
> > 1000/minute is less than 17 requests/second, and to my mind doesn't seem
> > too absurd: about 60ms per request on a single thread. A moderate server
> > in terms of CPU with decent RAM, local MySQL instance and 4-6 Passenger
> > threads should be able to handle that no problem at all, I think. So it
> > would be interesting to hear which part of it failed first.
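
For anyone checking Paul's numbers, here is the arithmetic as a small Ruby
sketch. The helper name time_budget_ms is made up for illustration, and it
assumes requests arrive evenly and are spread perfectly across workers, which
real traffic will not do.

```ruby
# Convert an arrival rate into a per-request time budget.
# Hypothetical helper: assumes evenly spaced requests and perfect balancing.
def time_budget_ms(requests_per_minute, workers = 1)
  # Each worker has 60,000 ms per minute to spread across the requests.
  60_000.0 * workers / requests_per_minute
end

rate_per_second = 1000 / 60.0          # roughly 16.7 requests/second
single_worker   = time_budget_ms(1000)     # 60.0 ms per request
four_workers    = time_budget_ms(1000, 4)  # 240.0 ms per request
```

So the budget only looks comfortable while every request stays well under a
few hundred milliseconds; a request that blocks for seconds on an external
service uses up that headroom very quickly.
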
>
> Sorry for the delay in responding to this thread. I'm planning to blog
> this soon, but here are the facts:
>
> First up, the site didn't 'crash' as was reported: we had a few stuck Rails
> processes and the load balancer is only a basic round robin, so you
> sometimes had to refresh a few times to get your request through. The site
> stayed up throughout if you were persistent.
>
> The Rails processes got stuck waiting on SMTP connections for sending
> emails. As soon as the email sending load decreased, the site popped right
> back up.
>
> The main lesson is not to trust your hosting environment's SMTP server to be
> responsive, and to ensure that you send emails asynchronously right out of
> the gate rather than making the whole request dependent on it. We didn't
> quite get time to do this as part of main dev, but next time I'd proactively
> break this out of the critical path.
>
> Re stress testing: We used JMeter to test the site load extensively, and
> pre-launch we were getting 50 req/second without much trouble. However, the
> one difference was that we were using dummy email addresses in our JMeter
> scripts, which the hosting environment's SMTP service was discarding very
> quickly, so it could take a much higher volume than in real life.
> Should've thought of that possibility, too :)
>
> Hope that clears a few things up: happy to answer any other questions people
> have.

Hello Chris,

thanks for the explanation.
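
To make the "send emails asynchronously" lesson concrete, here is a minimal
sketch of the idea in plain Ruby. It is not how the e-petitions site was
built or fixed: the AsyncMailer class and the message hash are hypothetical,
and an in-process thread loses queued mail if the process dies, so a real
deployment would more likely use a persistent job queue. The point is only
that the request path enqueues and returns, while a background worker does
the slow SMTP conversation.

```ruby
# Hypothetical in-process mail queue (illustration only, not durable).
# The request thread only enqueues; a background thread does the slow work.
class AsyncMailer
  attr_reader :failures

  # The block is whatever actually talks to SMTP (assumed to be supplied
  # by the application, ideally with its own connection timeouts set).
  def initialize(&deliver)
    @deliver  = deliver
    @queue    = Queue.new
    @failures = Queue.new
    @worker   = Thread.new { run }
  end

  # Called from the request path; returns immediately.
  def enqueue(message)
    @queue << message
    nil
  end

  # Deliver everything still queued, then stop the worker.
  def shutdown
    @queue << :stop
    @worker.join
  end

  private

  def run
    while (message = @queue.pop) != :stop
      begin
        @deliver.call(message)
      rescue StandardError => e
        # Record the failure instead of killing the worker thread.
        @failures << [message, e]
      end
    end
  end
end

# Usage with a stand-in for a slow SMTP server.
delivered = []
mailer = AsyncMailer.new { |msg| sleep 0.05; delivered << msg }
mailer.enqueue({ to: "signer@example.com", subject: "Confirm your signature" })
mailer.shutdown
```

With this shape a slow or unresponsive SMTP server makes the queue grow
instead of tying up the processes that serve web requests.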

You mentioned that you used an "nginx + unicorn ruby stack".

What did you use for Unicorn worker monitoring and why?

> Thanks
> Chris

Thank you
Aleksandar


