[chef] Re: RE: Re: Re: Automated check-ins or not...


  • From: Daniel DeLeo < >
  • To:
  • Subject: [chef] Re: RE: Re: Re: Automated check-ins or not...
  • Date: Mon, 13 Jan 2014 13:49:21 -0800


On Monday, January 13, 2014 at 1:16 PM, Phillip Roberts wrote:

The problem isn’t my coworker; the problem is a lack of understanding of the tool.

 

Chef is my baby, and I am perfectly fine with automated check-ins. However, just like any business, there are politics at play. There are fears due to a lack of understanding as well.

 

I am purposely asking for others’ use cases because I am interested in them and because they help me form my arguments as to why chef nodes should be checking in (running chef-client) automatically.

 

I am not asking anyone to tell me whether we should be using chef, or how we should be using chef; I am interested in how it is being used in other environments. I have seen plenty of other environments where I have implemented chef, but in all of those cases I implemented both chef and the policies that surround it, and this question (or argument) has never come up.

 

I appreciate the responses thus far.

 

Thanks,

 

Phillip Roberts | Sr. Linux Systems Administrator

There’s a joke that goes around Twitter every so often: “to err is human, to propagate your error to 1000 machines automatically is devops.” I think this joke does a good job of getting to the heart of your coworkers’ concerns: what prevents a potentially destructive mistake from getting applied to your whole infrastructure?

As you’ve implied, one option is to run chef-client manually on each machine to apply updates as desired. The pros of this approach:

* You can use why-run mode to get some indication of what’s going to change before you run chef-client for real (see the sketch after this list)
* The workflow is very simple: you don’t need to invest in a lot of testing or extra infrastructure, just upload cookbooks to the server and run them
* If you’re using chef for app deployments, you don’t need any additional logic or tooling for orchestration, just run chef on the boxes in the right order.
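
To make the why-run point concrete, a dry run is just an extra flag on the normal invocation; this is a minimal sketch and assumes you normally run chef-client with sudo:

    # Report what chef-client *would* change, without actually converging the node
    sudo chef-client --why-run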

The downsides:

* You have to manually check whether chef has run recently on all your machines. If you miss one, you could be missing an important security/bug/performance patch. This can get you into problems such as missing a patch on the passive node in a failover pair. When the cluster fails over, the service doesn’t work correctly on the now-active node. I’m sure you can imagine plenty of similar cases.
* Related to the above, you can get a different delta from starting state to desired state than you expected if a machine is a few cookbook iterations behind. This can cause chef-client to fail or to apply a change incorrectly based on the assumptions in your cookbook code.
* Your team has to be fairly disciplined about communicating when changes are made to the chef-server. Say Alice uploads her change, runs it on a trial node, and it works correctly. Now she starts a parallel SSH session to run chef-client on the remaining nodes (something like the knife ssh sketch after this list). In the meantime, Bob uploads a change to some “base” cookbook and it’s incompatible with Alice’s change. Alice’s chef-client runs fail or cause an outage on those systems.
* Humans spend a lot of time running chef-client.
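
For reference, the “parallel SSH session” in that Alice/Bob scenario is often just a knife ssh invocation like the one below; this is only a sketch, and the search query and SSH user are assumptions about your environment:

    # Converge every node matching the search query, in parallel over SSH
    # ('name:*' matches all registered nodes; narrow the query as needed)
    knife ssh 'name:*' 'sudo chef-client' -x deploy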

My view is that running chef-client in some periodic fashion is a good forcing function: it requires you to implement good workflow practices, whether that be cookbook testing with automated uploads to the chef-server from CI, partitioning your infrastructure so that you have sub-clusters running the “future” cookbook version before the majority of similar machines, testing cookbooks locally in Vagrant/Test Kitchen/whatever, etc. If you have this stuff in place, running chef-client manually (or via an orchestration tool) vs. on an interval won’t make a big difference.
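
In case it helps frame the discussion, “periodic” here usually just means running chef-client daemonized with an interval and some splay (or an equivalent cron entry); the numbers below are only illustrative:

    # Converge roughly every 30 minutes, with up to 5 minutes of random splay
    # so the whole fleet doesn't hit the chef-server at the same moment
    chef-client --daemonize --interval 1800 --splay 300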


-- 
Daniel DeLeo



