Yeah, but he's talking about a more fundamental problem: his management/co-workers aren't okay with the basic idea of an automated job running which might change system config.
You're off on a completely different planet where you've accepted the basic premise of "DevOps" (for lack of a better term), and it's a question not of "should we do it?" but "how aggressive?" That's influenced by how far along the road to continuous integration / continuous deployment you are, and explaining that to them would be like trying to explain quantum mechanics to a caveman.
On 1/13/14 5:28 PM, Greg Zapp wrote:
My cookbooks hook into our orchestration server via REST calls to pull down information about which sites should be configured, etc. During the POC build-out I had Chef run every minute, but most of my machines are Windows servers and Chef is very CPU hungry there. We have modified our orchestration server to set the updated time for the "pool" when any resource contained in the "pool" is modified. I wrapped Chef in a .NET app/service that first checks whether the pool has changed since the last successful Chef run, and only runs Chef if it has. This is how we chose to mitigate Chef's CPU hunger and allow for faster converge times.
-Greg
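The gating logic Greg describes can be sketched roughly as follows. This is not his .NET service, only a minimal Python illustration of the idea. The names `should_run`, `maybe_converge`, `fetch_pool_updated_at` and `run_chef` are hypothetical, and how the pool's updated time is fetched from the orchestration server is left to the caller.

```python
from datetime import datetime, timezone
from typing import Callable, Optional


def should_run(pool_updated_at: datetime, last_success_at: Optional[datetime]) -> bool:
    """Return True when the pool changed after the last successful Chef run."""
    if last_success_at is None:
        return True  # never converged successfully, so run
    return pool_updated_at > last_success_at


def maybe_converge(
    fetch_pool_updated_at: Callable[[], datetime],  # hypothetical: asks the orchestration server
    run_chef: Callable[[], bool],                   # hypothetical: runs chef-client, True on success
    last_success_at: Optional[datetime],
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> Optional[datetime]:
    """Run Chef only if the pool changed; return the new last-success timestamp."""
    # Record the start time before checking, so a change made while Chef is
    # running is still newer than the stored timestamp and triggers another run.
    started_at = now()
    if not should_run(fetch_pool_updated_at(), last_success_at):
        return last_success_at  # cheap path: no Chef run, no CPU spike
    if run_chef():
        return started_at
    return last_success_at  # failed run: keep the old timestamp so we retry
```

The wrapper can then poll as often as it likes (every minute, say), since the expensive Chef run only happens when the timestamp comparison says something changed.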
On Tue, Jan 14, 2014 at 1:56 PM, Lamont Granquist wrote:
Yeah, in places that I've been where managers were afraid of config management (CFengine at the time) running on a schedule, the result was an accretion of changes over time. Once enough changes had queued up that we had to run it on a server, and the change window was scheduled and it was approved by our CRB and appropriate offerings were burned to the gods of ITIL, the changes would often wind up causing outages, because so many changes hit the server at once and it was hard to determine the impact ahead of time. But the outages were all contained to change windows and were approved, so I guess that makes it okay.
A tactic that I've used in the past is to run CM only once per day, with a 12-hour random splay, timed for 8pm-8am. Changes can be committed during the business day and they don't immediately take effect, so they can get tested or pushed out manually. And if anything goes wrong, it'll start hitting servers at 8pm and you have a longer window before it hits your entire infrastructure, and more time for you to get monitoring alerts and stop the changes rolling out. If you just run Chef every 30 minutes with a 5 minute random splay, then it's likely that by the time your monitoring alerts you and you start taking action the change has hit your entire infrastructure. By only doing the "scheduled" runs once per day you still keep the deltas between runs small, you allow yourself some time to stop your CM tool before it all rolls out, and you also reduce the load on your chef server infrastructure (or on our HEC infrastructure).
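To put rough numbers on the splay argument above, here is a small Python sketch. It assumes each host picks its run time uniformly at random inside the 12-hour window starting at 8pm; the function names are made up for illustration and this is not how any particular CM tool implements splay.

```python
import random
from datetime import date, datetime, time, timedelta

WINDOW_START = time(20, 0)   # 8pm, from the schedule described above
SPLAY = timedelta(hours=12)  # 12-hour random splay, so runs land in 8pm-8am


def nightly_run_time(day: date, rng: random.Random) -> datetime:
    """Pick one host's run time for the night starting on `day`."""
    start = datetime.combine(day, WINDOW_START)
    offset_seconds = rng.uniform(0, SPLAY.total_seconds())
    return start + timedelta(seconds=offset_seconds)


def fraction_hit(elapsed: timedelta, splay: timedelta = SPLAY) -> float:
    """Expected share of hosts that have already run, `elapsed` after the
    window opens, assuming run times are uniform across the splay."""
    ratio = elapsed / splay  # timedelta / timedelta yields a float
    return min(1.0, max(0.0, ratio))
```

Under that uniform assumption, `fraction_hit(timedelta(hours=1))` is 1/12, so an alert that takes an hour to reach you leaves roughly eleven twelfths of the fleet untouched. With a 5 minute splay the same one-hour delay means every host has already run.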
The other thing is that if you only run Chef once a week or once a month on-demand, then you're not getting the "self-repairing" and SOX/PCI-DSS "prevent control" features of configuration management. If you're running it nightly then any junior SA or malicious attacker that logs into the server and manually changes the state of critical files will have those changes rolled back by the next run. That produces prevent controls that auditors really like. That also trains your junior SAs to not make with the typey-typey on the keyboard and to use the CM program. Otherwise they tend to fall back to old behaviors of making changes on the console, and then it's not their fault they did that, it's going to be Chef's fault when it's eventually run, reverts those changes, and the service crashes.
On 1/13/14 1:32 PM, David Petzel wrote:
We had quite a few discussions about this as well, and at the end of the day we opted for the ability to do both on-demand and scheduled runs. There were concerns that without a scheduled check-in the amount of drift in systems could become large over time on servers that don't routinely get deployments done. With that drift comes a slew of unknown issues. By enforcing a scheduled run we could be sure that hand-modified configurations didn't stick around very long.
We've set up a report to notify us if a node has not checked in in the last day. This helps us catch cases where the scheduled run might be failing and other notification mechanisms might not be catching it (e.g. some nasty compile error super early in the run).
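A report like this boils down to a filter over last check-in times. The Python sketch below assumes you already have a mapping from node name to the time of its last successful check-in (how you obtain it from your Chef server is up to you); `stale_nodes` and its parameters are hypothetical names, not part of any Chef tooling.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def stale_nodes(
    last_checkins: Dict[str, Optional[datetime]],
    now: datetime,
    max_age: timedelta = timedelta(days=1),
) -> List[str]:
    """Return the names of nodes whose last check-in is older than `max_age`.

    A value of None means the node has never checked in, which is also
    worth reporting.
    """
    stale = []
    for name, seen in last_checkins.items():
        if seen is None or now - seen > max_age:
            stale.append(name)
    return sorted(stale)
```

Feeding the result into whatever alerting you already have (email, chat, a ticket) gives you a safety net that does not depend on the failing run being able to report on itself.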
From there we extended an existing in-house tool that lets anyone with access request a chef run without needing access to the servers.
On Mon, Jan 13, 2014 at 4:16 PM, Phillip Roberts wrote:
The problem isn't my coworker; the problem is a lack of understanding of the tool.
Chef is my baby, and I am perfectly fine with automated check-ins. However, just like in any business, there are politics at play. There are fears due to a lack of understanding as well.
I am purposely asking for others' use cases because I am interested in them to help me form my arguments as to why chef nodes should be checking in (running chef-client) automatically.
I am not asking for anyone to tell me whether we should be using chef, or how we should be using chef; I am interested in how it is being used in other environments. I have seen plenty of other environments, but in all of them I implemented both chef and the policies that surround it, and this question, or this argument, never came up.
I appreciate the responses thus far.
Thanks,
Phillip Roberts | Sr. Linux Systems Administrator
San Mateo | Ann Arbor | New York | London
O 734.922.7014 | C 614.423.9871 | www.MyBuys.com
From: Christopher Armstrong
Sent: Monday, January 13, 2014 4:09 PM
Subject: [chef] Re: Re: Automated check-ins or not...
Chef as a tool is used for orchestration, converging nodes to a desired state. If your coworker doesn't want nodes checking in automatically, then perhaps Chef isn't the ideal tool for you. What does your use case look like?
On Mon, Jan 13, 2014 at 1:05 PM, Ranjib Dey wrote:
By check-in, do you mean chef runs or chef registrations? I am aware of 3 different ways:
1) on demand: use rundeck, mco, or capistrano-like tools to invoke the chef run. pros: on demand :-), which helps if you deploy your application via chef; you can also eliminate the need for a validation certificate. cons: requires additional tooling, special security considerations, etc.
2) as a service: specify a splay time, and use the standard init scripts to run chef-client as a service. pros: no additional configuration required, no dependency on any other tools. cons: memory leaks and stale processes used to be a pain.
3) as a scheduled job: use cron or a rufus-like system to run chef at a periodic interval. pros: simple, less prone to memory leaks. cons: infra has to be designed as eventually consistent; on-demand application deployment cannot be done; additional consideration is needed when deciding cron times on individual servers, else you'll storm the chef server.
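On the last point in option 3, one common way to avoid every server hitting the chef server at the same minute is to derive each host's cron minute from a hash of its hostname. The Python sketch below is only an illustration of that idea; `cron_offset_minutes` and `cron_line` are hypothetical helpers, and the bare `chef-client` command line is an assumption you would adjust for your environment.

```python
import hashlib


def cron_offset_minutes(hostname: str, interval_minutes: int = 30) -> int:
    """Map a hostname to a stable offset in [0, interval_minutes)."""
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % interval_minutes


def cron_line(hostname: str, interval_minutes: int = 30, command: str = "chef-client") -> str:
    """Build a crontab line that runs `command` every `interval_minutes`,
    shifted by the host's own offset so hosts do not all run together."""
    if interval_minutes <= 0 or 60 % interval_minutes != 0:
        raise ValueError("interval_minutes must be a positive divisor of 60")
    offset = cron_offset_minutes(hostname, interval_minutes)
    # List the minutes explicitly, e.g. "7,37" for a 30-minute interval.
    minutes = ",".join(str(m) for m in range(offset, 60, interval_minutes))
    return f"{minutes} * * * * {command}"
```

Because the offset depends only on the hostname, a host keeps the same slot across reboots and re-renders of the crontab, and a large fleet spreads roughly evenly across the interval.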
I have used pretty much all three of these, and I think all of them have merits. Choose any one depending upon what you do, how you are doing it, and how comfortable you are with chef and those tools. Most of the issues with running chef as a service are now sorted (or workarounds are known).
best
ranjib
On Mon, Jan 13, 2014 at 12:52 PM, Phillip Roberts wrote:
I am interested in hearing what others are doing in terms of allowing nodes to automatically check in with chef or not. It has recently come up as a concern with a party in our company: he would prefer not to see nodes check in automatically with chef (I currently have a cron job that runs chef-client every X number of minutes).
I am just interested in hearing how others manage this. I am not certain that manually running chef-client is a good solution.
I am being slightly vague on purpose, because I am looking for full case examples from others using chef and how they are using it.
Thanks,
Phillip Roberts | Sr. Linux Systems Administrator