[chef] Re: Chef Replication


  • From: Sam Pointer < >
  • To:
  • Subject: [chef] Re: Chef Replication
  • Date: Wed, 11 Sep 2013 09:46:09 +0100

Hey Nikhil,

The way in which you would go about this depends on what you're trying to do, why, and at what volume. A middle-man process between the two APIs is certainly the simplest option, as it avoids you having to think about the index state or getting involved with the chef-server 'machinery'. That said, here are some alternatives if you're running at volume.

If you're trying to have a warm local backup, you may be better off using the replication facility in the data store. Assuming you're running the open source version of chef-server, that would be couchdb for 10.x or postgres for 11.x. When we were attempting to scale a 10.8.x chef-server there was a replication bug in the version of couch that shipped at that time. Unfortunately I don't have all of the version numbers to hand.

This way you're not reinventing tried and tested replication logic yourself.

Assuming you're on an 11.x version of the server and you want to replicate to a remote data center for warm backup purposes, I would make sure postgres is storing everything at an lvm-managed location and take periodic snapshots that you can then stream over a socket to your warm standby box and restore, presumably followed by a re-index. You can of course attempt "proper" postgres replication, but I would be wary unless you've a good handle on latency between your two locations and/or you need the warm backup to be as current as it can be. Even then, you'll still run into the problem of needing to re-index.
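As a rough illustration of the snapshot-and-stream idea, here is a minimal sketch in Python that only builds the shell commands for one cycle and does not run anything. The volume group, logical volume, snapshot size, standby host and remote path are all hypothetical, and using ssh as the transport is my assumption. It also assumes the whole postgres data directory lives on the one logical volume, so the snapshot is crash-consistent. Restoring on the standby and re-indexing are left out.

```python
import shlex


def snapshot_plan(vg, lv, snap_name, snap_size, standby_host, remote_path):
    """Return the shell commands, in order, for one snapshot-and-ship cycle.

    Nothing is executed here; the caller decides how (and whether) to run them.
    """
    q = shlex.quote
    origin = f"/dev/{vg}/{lv}"        # volume holding the postgres data directory
    snap = f"/dev/{vg}/{snap_name}"   # device node of the snapshot once created
    remote_cmd = "cat > " + q(remote_path)
    return [
        # 1. Take a copy-on-write snapshot of the origin volume.
        f"lvcreate --snapshot --size {q(snap_size)} --name {q(snap_name)} {q(origin)}",
        # 2. Stream the snapshot, compressed, to the standby (ssh is an assumption).
        f"dd if={q(snap)} bs=4M | gzip -c | ssh {q(standby_host)} {q(remote_cmd)}",
        # 3. Drop the snapshot so it stops consuming copy-on-write space.
        f"lvremove --force {q(snap)}",
    ]


if __name__ == "__main__":
    # All of these values are placeholders for illustration only.
    for cmd in snapshot_plan("vg0", "pgdata", "pgsnap", "5G",
                             "standby.example.com", "/var/backups/pgdata.img.gz"):
        print(cmd)
```

The snapshot size only needs to cover the writes that land on the origin volume while the copy is in flight, so it can usually be much smaller than the volume itself.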

You also mention a write-master, read-slave setup in which the client would perform all of its queries against the slave and provide updates to the master.

Firstly, I'm not sure how you're going to achieve this without modifying the client. The only way I can possibly think of doing such a thing without client modifications is to front-up the API with a routing load balancer that sends GET and other read verbs to your slave whilst sending changes to the master. Again, you need to think about how you're going to keep the indexes up-to-date.
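To make the routing rule concrete, here is a minimal sketch of the decision such a load balancer would make, written in Python purely for illustration. The two backend URLs are hypothetical and the set of "read verbs" is my assumption; in practice this logic would live in the load balancer's own configuration rather than in application code.

```python
# Hypothetical backends; substitute your own master and slave addresses.
MASTER = "https://chef-master.example.com"
SLAVE = "https://chef-slave.example.com"

# HTTP methods treated as reads (an assumption, not a Chef-defined list).
READ_METHODS = frozenset({"GET", "HEAD", "OPTIONS"})


def choose_backend(method: str) -> str:
    """Send read verbs to the slave and everything else to the master."""
    return SLAVE if method.upper() in READ_METHODS else MASTER


if __name__ == "__main__":
    for verb in ("GET", "PUT", "POST", "DELETE"):
        print(verb, "->", choose_backend(verb))
```

Note that a rule based purely on the verb sends every POST to the master, including any request that is logically a read but happens to be issued as a POST.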

When your clients are querying the API in the manner of "get me all of my backend servers" you are most likely doing an index search rather than a .cdb_load (in 10.x money). If your performance problems really are severe enough to warrant this kind of set-up, then first investigate which component of chef-server is causing them. You can do this by examining the expander queues, interrogating solr and postgres, and looking at the consumption and performance of the API tier.

We recently performed a benchmark where we concurrently launched and built 500 EC2 instances against a single Chef 11.x open source server install on a single m1.xlarge with postgres storing everything on the root ephemeral volume (i.e. the worst possible disk layout). The built client nodes included 50 load balancers, 150 jetty application servers and 300 mysql servers organised into 100 self-discovering shards in a m-s-s configuration. The server easily had the head-room to double that; in other words, we could have concurrently built over 1000 machines with inter-tier discovery before we'd even come close to the limits of the box or the default heap sizes, etc. If you're considering this because you're on 10.x, move to 11.x.

The above experience is worth noting in that we had previously been running a "full-scaled" 10.x backend where we had split everything out onto different hosts and then doubled-up "most" of the components to build an HA setup. We have since moved to 11.x and seen the improvements in the installation process brought by the "everything from libc up" packaging. For disaster and backup purposes you are better off, in my experience, ensuring you have a good backup of your data store and a runbook for installing, restoring and re-indexing the back-end again than attempting anything more convoluted, unless you really have to.

If you'd like to talk about this issue further, please feel free to contact me off-list. (I am not affiliated with Opscode.)

Hope this helps,

Sam Pointer - www.opsunit.com



On 10 September 2013 15:20, Nikhil Shah < > wrote:
I did a bit of reading up and wasn't able to find the configuration I was hoping to find. I was looking for a way to have my chef server replicated to a production data center. With that being said, I was hoping to have something like a master/slave where the master copies the data to the slave and the slave is read only. This would allow the production 'nodes' to pull data from the slave only, which would also decrease latency.

Any advice?

--
Nikhil Shah / System Administrator

nshah@theorchard.com


The Orchard® / www.theorchard.com

t (+1) 212.308.5648 / f (+1) 212.201.9203
23 E. 4th St., 3rd Fl / New York, NY 10003

The Daily Rind™ / www.dailyrindblog.com

Facebook / @orchtweets





