[chef-dev] Re: Re: Idiom for adding a node to a Cluster


  • From: Bryan Taylor < >
  • To: Blake Irvin < >
  • Cc: Joseph Holsten < >, Chef Dev < >
  • Subject: [chef-dev] Re: Re: Idiom for adding a node to a Cluster
  • Date: Wed, 16 Oct 2013 16:22:55 -0500

That's very reasonable -- clearly an improvement. The postgres recipe that sets up the replication will need to depend on the backup piece, since we have to bootstrap a slave from the backup data, but I can definitely see that piece being reusable quite beyond just databases.

BUT -- it still just relocates my fundamental problem, because what location would a backup cookbook use for its default file drop location?

On 10/16/2013 03:58 PM, Blake Irvin wrote:
" type="cite">
I've been through this sort of thing quite a few times now, and tend to embrace the Unix model for cookbooks more and more.  That is, have cookbooks that each do one small thing, do it well, and are only loosely coupled (if at all) to anything else in Chef.

So for what you are trying, Bryan, I think I'd end up with something like this (assuming that your 'cluster' is a database cluster or some other type of datastore):

- A cookbook that manages the data layer itself (for example, Postgres)
   - This cookbook includes a recipe for making local backups/dumps of data in some universally-understood format (for example, .tgz)
   - Local backup directory is an overridable attribute

- A cookbook that manages backups
   - Likely creates a cronjob
   - Knows about a few types of remote data storage (NFS/FTP/Joyent Manta/Amazon S3/etc)
   - Remote backup location and protocol are overridable attributes
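In attribute form, that split might look something like this; every name below is an illustrative placeholder rather than any published cookbook's actual interface:

```ruby
# postgres cookbook -- attributes/backup.rb (illustrative names)
default['postgres']['backup']['local_dir'] = '/var/backups/postgres'
default['postgres']['backup']['format']    = 'tgz'

# backup cookbook -- attributes/default.rb (illustrative names)
default['backup']['protocol']   = 's3'                       # or 'nfs', 'ftp', 'manta', ...
default['backup']['remote_url'] = 's3://example-bucket/db'   # override with a real location
default['backup']['schedule']   = '0 3 * * *'                # cron expression
```

The data-layer cookbook only ever writes to its local directory; the backup cookbook only ever reads from it, so either side can change independently.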

By separating concerns, I've found that my infrastructure is much less brittle and individual components can be improved without breaking anything else (Unix model/service-orientedness).

(As an aside, I love using Manta for backups because I get to use all the traditional Unix tools *inside* my object store, (for things like md5sum/gzcat/log analysis, etc), and the cli interfaces are super lightweight and easy to employ inside of cookbooks (npm install manta))


Blake


On Wed, Oct 16, 2013 at 1:42 PM, Bryan Taylor < > wrote:
I think I've arrived at the point of your 2nd paragraph. It really just comes down to how an Opscode community cookbook sets a reasonable default for the backup location. It's easy enough to have the master set up a cron to do a local backup and then copy those files up to this location.

The problem is: where? I see three options:
 1) rsync the backup to the Chef server. It exists. Otherwise, yuck.
 2) Provision a node explicitly for this purpose. Also, yuck.
 3) Use one of the db nodes for this purpose. Also, yuck.

Which one sucks least and would be accepted in a pull request? Or is there another way? My assumption is anybody doing this for real would immediately override the backup location with a "real" location that doesn't suck.
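The "master sets up a cron to do a local backup" half is straightforward regardless of which option wins; a minimal sketch, where the dump command, directory, and `is_master` attribute are all assumptions:

```ruby
# Nightly local dump on the master only; where it gets copied afterwards
# is exactly the open question above.
cron 'db-local-backup' do
  hour    '3'
  minute  '0'
  command "pg_dumpall | gzip > #{node['postgres']['backup']['local_dir']}/dump-$(date +\\%F).sql.gz"
  only_if { node['postgres']['is_master'] }  # assumed attribute marking the master
end
```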


On 10/16/2013 03:16 PM, Joseph Holsten wrote:
If you're using something with autoclustering, adding node addresses to config files and doing rolling restarts is safe to do with chef. Use role/tag search to find nodes and populate host lists, notify a service restart when the config file changes, and Bob's your uncle. We do this for elasticsearch and hazelcast.
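That pattern (search for peers, render the config, notify a restart) looks roughly like the sketch below; the role name, template, and service name are illustrative assumptions:

```ruby
# Recipe sketch: discover cluster members via role search and roll the config.
# Sorting keeps the rendered file stable across runs, avoiding spurious restarts.
peers = search(:node, "role:elasticsearch AND chef_environment:#{node.chef_environment}")
        .map { |n| n['ipaddress'] }
        .sort

template '/etc/elasticsearch/elasticsearch.yml' do
  source 'elasticsearch.yml.erb'
  variables(hosts: peers)
  notifies :restart, 'service[elasticsearch]', :delayed  # restart only on change
end

service 'elasticsearch' do
  action [:enable, :start]
end
```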

If you're setting up slaves/replicas, you can probably set up a run-once resource to bootstrap the server from a backup, authenticate itself with the master, and turn on replication. We did this for FreeIPA (LDAP).
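A run-once resource is typically a guarded `execute`; in this sketch the script path, marker file, and service name are placeholders:

```ruby
# One-shot replica bootstrap: restore from backup, enroll with the master,
# then drop a marker file so the resource never runs again.
execute 'bootstrap-replica' do
  command '/usr/local/bin/bootstrap_replica.sh && touch /var/lib/db/.bootstrapped'
  creates '/var/lib/db/.bootstrapped'   # run-once guard: skip if marker exists
  notifies :restart, 'service[postgresql]', :immediately
end
```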

If you need something that needs STONITH-style singletons, doesn't handle split-brain on its own, &c, you need automation designed for that. Pacemaker & Corosync are old school; things built on ZooKeeper, Doozer, or etcd are what the cool kids are doing. Everything I've heard of actually being in production does this out of band from Chef, typically with a command-and-control tool like Capistrano, Fabric, MCollective, Rundeck, &c. We use this approach for most things, notably MySQL.

If you're looking for a magic bullet, etcd-chef < https://github.com/coderanger/etcd-chef> has that hard consistency in its data store and supports triggers on config changes, so (if you're daring) that might meet your needs perfectly. I'm hoping to spike some work on it as soon as I migrate my entire company into Rackspace Chicago. But I doubt I'll be doing a production master failover via etcd-chef in the immediate future.

In a broader sense I think our industry's terms for clusters are lacking, and our tools suffer for it.
--
~j
info janitor @ simply measured

On 2013-10-15, at 22:20, Bryan Taylor < > wrote:

I'm wondering what the chef idioms are for a certain problem that comes up a lot when expanding a cluster. Let's say I have some kind of persistence store and I want to enable replication, or add a new node with replication to an already running cluster. The replication will communicate on some custom protocol, but in order to work, I have to move stateful data, like db logs or whatever, from the master to a new node. The master is "the master right now", so it needs to be dynamically discovered, and accessed via rsync or scp, say, to pull the files down. I'm thinking for this I should just provision every cluster node with a fixed static public/private key.
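One way to sketch that discover-then-pull step, assuming a hypothetical db-master role, placeholder paths, and the fixed keypair described above already distributed to every node:

```ruby
# Find whichever node is currently the master, then rsync its data down.
# Assumes the shared replication SSH key is already in place on all nodes.
master = search(:node, "role:db-master AND chef_environment:#{node.chef_environment}").first

unless master.nil?
  execute 'pull-master-data' do
    command "rsync -az #{master['ipaddress']}:/var/backups/db/ /var/lib/db/seed/"
    not_if { ::File.exist?('/var/lib/db/seed/.seeded') }  # only seed once
  end
end
```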






Archive powered by MHonArc 2.6.16.
