[LRUG] What's the best way to go from 1 server to 2?

Olly Headey olly at freeagent.com
Tue Apr 16 04:27:25 PDT 2013


Hi Andy

So much to write about this. Where to start?

Having scaled from one server to dozens, where we're running in two DCs
replicating MySQL across multiple sites, I have a lot of thoughts on, and
experience with, this. What I would say in the first instance is: keep it
simple. I wouldn't advise trying to do ALL THE THINGS and aiming for fully
redundant, automated failover straight away. And a lot of this is
incredibly cost and time sensitive, so don't go there. At least not right
away.

Is this a DR solution for catastrophic failure of your primary site (DC
down, unrecoverable)? Or do you just want to improve redundancy in your
stack so you can continue servicing requests if a single point of failure
goes down?

I'm presuming the second, so I won't go into the first (but that's an
interesting thread in itself). In which case, there are three main
considerations:

1. App server redundancy
2. Job server redundancy
3. Database redundancy

The first two are relatively simple assuming your app is stateless. You can
load balance between multiple app servers using something like nginx,
hardware (pricey) or even
DNS <http://dyn.com/dns/dynect-managed-dns/traffic-management-load-balancing-round-robin-cdn-manager/>.
Similarly, if you're using Delayed Job for queuing then you can just spin
up multiple job servers. This involves breaking out your app into a stack
of these distinct components (what I did at FreeAgent back in the day,
which is still effectively the same now - just more moving parts) and
deploying to all of them as part of your Cap process. This is all
reasonably straightforward. We actually used hardware load balancing
originally (someone else's problem at the time - throw money at it), but in
time we moved to nginx to do it in software. Anyway, you get the idea.
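
To illustrate the load-balancing idea only, here is a minimal Ruby sketch of
round-robin selection that skips servers marked as down. It is not how nginx
is implemented or configured; the class name and the app1/app2 host names are
made up for the example.

```ruby
# Hypothetical round-robin picker over a list of stateless app servers.
class RoundRobinBalancer
  def initialize(backends)
    @backends = backends.dup
    @down = {}
    @index = 0
  end

  # Mark a backend as failed so it is skipped until marked up again.
  def mark_down(backend)
    @down[backend] = true
  end

  def mark_up(backend)
    @down.delete(backend)
  end

  # Return the next healthy backend in rotation, or nil if none are available.
  def next_backend
    @backends.size.times do
      candidate = @backends[@index % @backends.size]
      @index += 1
      return candidate unless @down[candidate]
    end
    nil
  end
end

# Usage: with app1 marked down, every request goes to app2.
balancer = RoundRobinBalancer.new(%w[app1.example.com app2.example.com])
balancer.mark_down("app1.example.com")
puts balancer.next_backend # prints app2.example.com
```

This only works because the app servers are stateless: any request can go to
any server, so losing one just shrinks the rotation.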

The database is a whole different problem. I would suggest not considering
auto-failover of this layer at all, at least not right now. Keep the DB as
a single point of failure (in terms of app uptime) but ensure you have data
redundancy. The best option here is maintaining a synchronous replica for
(almost) zero data loss. A simpler option would be full and incremental
backups that are transferred offsite, but here you may lose data if the
server fails catastrophically and is unrecoverable. Depending on your risk
appetite, this may be something you could live with.
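
A replica is only useful if it is actually keeping up. One way to check it,
sketched in Ruby under some assumptions: the Hash stands in for one row of
MySQL's SHOW SLAVE STATUS output (the three column names are real; how you
fetch the row is left out), and the 60-second lag threshold is an arbitrary
placeholder.

```ruby
# Return a list of problems found in one SHOW SLAVE STATUS row.
# An empty list means the replica looks healthy by these (assumed) criteria.
def replica_problems(status, max_lag_seconds = 60)
  problems = []
  problems << "IO thread not running" unless status["Slave_IO_Running"] == "Yes"
  problems << "SQL thread not running" unless status["Slave_SQL_Running"] == "Yes"

  lag = status["Seconds_Behind_Master"]
  if lag.nil?
    # MySQL reports NULL here when replication is not running.
    problems << "lag unknown (Seconds_Behind_Master is NULL)"
  elsif lag.to_i > max_lag_seconds
    problems << "replica is #{lag} seconds behind"
  end

  problems
end
```

Running something like this from cron and alerting on a non-empty result is
one simple way to notice a broken replica before you need it.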

Once you have a replica (slave), you can build a process for switching the
slave to master and vice versa. Keep this manual, at least for now. If your
master goes down, you'll have a tried and tested failover process which
should take at worst 30 mins to get the service up and running with no data
loss. I think that's pretty robust for your needs right now. There are
questions about where the replica lives (if it's in the same DC and network
your life is easier, but again it depends on your risk appetite).
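
A manual failover process can be written down as an ordered checklist. The
Ruby sketch below builds one. The host names, the Step struct and the exact
ordering are illustrative assumptions; the SQL statements are standard MySQL
replication commands. Treat it as a starting point to rehearse and adapt, not
a complete procedure.

```ruby
# One step of a hypothetical manual failover runbook.
Step = Struct.new(:target, :action)

# Build the ordered steps for promoting the replica (new_master).
# If the old master is still reachable, it is made read-only first so no
# further writes land on it during the switch.
def failover_runbook(old_master, new_master, old_master_reachable)
  steps = []
  steps << Step.new(old_master, "SET GLOBAL read_only = ON;") if old_master_reachable
  # Check by hand that the replica has applied everything it received.
  steps << Step.new(new_master, "SHOW SLAVE STATUS;")
  steps << Step.new(new_master, "STOP SLAVE;")
  steps << Step.new(new_master, "SET GLOBAL read_only = OFF;")
  # Record the binlog file and position for re-attaching the old master later.
  steps << Step.new(new_master, "SHOW MASTER STATUS;")
  steps << Step.new("app servers", "point database config at #{new_master} and restart")
  steps
end

# Usage: print the checklist for an unreachable db1 failing over to db2.
failover_runbook("db1.example.com", "db2.example.com", false).each_with_index do |step, i|
  puts "#{i + 1}. [#{step.target}] #{step.action}"
end
```

Switching back is the same list with the roles reversed, after the old master
has been re-attached as a replica with CHANGE MASTER TO and START SLAVE.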

I hope this rambling helps at some level.


Cheers,
Olly

--
*Olly Headey :: Co-Founder and CTO*
FreeAgent
www.freeagent.com

Follow @freeagent <http://twitter.com/freeagent> on Twitter






On Tue, Apr 16, 2013 at 11:50 AM, Andrew Stewart
<boss at airbladesoftware.com> wrote:

> Good afternoon El Rug,
>
> What's the best way to increase from one server to two?
>
> Currently I have everything for my webapp – code, database, background
> jobs, etc – on one server.  Performance is fine but it's a single point of
> failure (see this morning's email thread).  Off the top of my head I'm
> thinking:
>
> - Use a different host in a different city from my current server.
> - Install same operating system as current server and set up identically
> via Chef/whatever.
> - Deploy all code changes to both servers with Capistrano but have second
> server serving Rails maintenance page (just in case anybody finds it).
> - Ideally set up live (mysql) replication...somehow.
> - If/when first server croaks, manually fail over to second server via
> changing DNS.
>
> I'm sure it's more complicated than that, particularly the switching from
> one server to the other (and back).  Does anybody have any tips?
>
> Thanks again,
>
> Andy Stewart
> _______________________________________________
> Chat mailing list
> Chat at lists.lrug.org
> http://lists.lrug.org/listinfo.cgi/chat-lrug.org
>