[lug] Setting up failover in Linux?

Sean Reifschneider jafo at tummy.com
Sun May 6 03:25:39 MDT 2012

On 04/30/2012 04:12 PM, Rob Nagler wrote:
> http://aws.amazon.com/architecture/
> Nothing here says anything about how to build fault tolerant apps.

Are you sure?  One of the links on that page goes to their whitepaper
titled "Building Fault-Tolerant Applications on AWS".

> the system which can sustain a "yum update postgresql", have that stop
> in the middle, and survive.

Right, that is not what Amazon or Linux-HA are aiming to provide.

> disk, the transaction is not coordinated.  What I'm looking for is for
> Xapian, ZFS (if you insist), Postfix, and Postgres to all participate
> in the transaction.  They each have their own mechanisms for

Sure, and that sounds sweet, but that is not something that AWS or Linux-HA
or DRBD address.

If this is what you want in Linux fail-over, a lot of development needs to
happen to get from here to there...

> This is voodoo at best.  If the network is partitioned, you can't send
> a message to the server to shut down.  The two nodes may be visible to

STONITH isn't voodoo; it's out-of-band communication to ensure that a
failed node is no longer serving requests.  If you have devices at
locations where you can't do this sort of out-of-band STONITH, you would
probably use one or more external quorum servers to achieve the same
thing, but it really depends on the exact situation.
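
As a rough sketch of the decision a surviving node has to make, here is
some illustrative Python.  It is not how Linux-HA, Pacemaker, or DRBD
implement this; the names (ClusterView, has_quorum, may_take_over) are made
up for the example, and it assumes a simple one-vote-per-member majority.

```python
from dataclasses import dataclass


@dataclass
class ClusterView:
    """What one node believes about the cluster (hypothetical structure)."""
    total_votes: int      # all voters: nodes plus any external quorum servers
    reachable_votes: int  # voters this node can reach, counting itself
    peer_fenced: bool     # True only if an out-of-band power-off was confirmed


def has_quorum(view: ClusterView) -> bool:
    # Strict majority: an even split (1 of 2) does not count as quorum.
    return view.reachable_votes * 2 > view.total_votes


def may_take_over(view: ClusterView) -> bool:
    # Safe to serve requests if the peer is known to be off (STONITH
    # succeeded) or if this side of the partition holds a majority.
    return view.peer_fenced or has_quorum(view)


if __name__ == "__main__":
    # Two nodes, network partitioned, no fencing: neither side may proceed.
    print(may_take_over(ClusterView(2, 1, False)))
    # Same partition, but the peer was powered off out-of-band.
    print(may_take_over(ClusterView(2, 1, True)))
    # Two nodes plus one external quorum server that this node can reach.
    print(may_take_over(ClusterView(3, 2, False)))
```

The point is that a partitioned two-node cluster can't decide anything by
itself; either the fence is confirmed over a separate channel or a third
voter breaks the tie.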

But I just don't know of anything in the open source arena that is going
to achieve the goals you have laid out.

