[chef] Re: Re: Re: Re: Re: Re: Re: Re: Re: One-Shot runlists with inheritance


  • From: Dan Nemec
  • To:
  • Subject: [chef] Re: Re: Re: Re: Re: Re: Re: Re: Re: One-Shot runlists with inheritance
  • Date: Wed, 2 Feb 2011 09:08:24 -0500

Sean,

You bring up a good point. As a new Chef user in a 10-year-old environment, it's hard to change the way you think and go from a "discrete unit of work" mentality to a "state" mentality. Our whole operations staff (myself included) has always been in the world of "I file a change request for a discrete piece of work, then perform that one piece of work", where a piece of work is something like updating the configurations of one application. We've always thought of it that way, and we have a hard time with the possibility that "bringing the system to a new state" might have unforeseen consequences. The risk in the latter case is that if you want to make one change, you must make sure that no one else has changed any other part of the configuration, or you might accidentally make a change you didn't intend. In the first case I know only one thing can change; in the latter I have to do some extra work to make sure only one thing is going to change.

Rather than a one-shot runlist, it is really more of a discrete-work runlist. I know Chef wasn't designed for it. I'm wondering if it should support it better or if I should change the way I think about configuration management.
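
As a rough illustration of the discrete-work idea, the run list for one run can be thought of as the node's persistent base items with the work items appended for that run only. This is a plain-Ruby sketch and not Chef's API; the helper name effective_run_list is made up for the example, and the item names are borrowed from elsewhere in this thread.

```ruby
# Plain-Ruby sketch; this is not Chef's API. All names are hypothetical.

# Returns the run list for a single run: the node's persistent base items
# first, then the discrete-work items, with duplicates dropped.
# Neither input array is modified.
def effective_run_list(base, discrete_work)
  (base + discrete_work).uniq
end

base = ["role[datacenter]", "role[servergroup]", "role[tier]"]
work = ["recipe[one-shot::infrastructure]"]

# The combined list exists only for this run.
puts effective_run_list(base, work).inspect

# The stored base list is untouched, so the next run starts from it again.
puts base.inspect
```

The point of returning a new list is that the appended work never becomes part of the node's stored state, which is the property I am after.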

Am I being too paranoid? Who is using Chef with a team of 10+ admins with long runlists run for every change? Do you get to the point where your code and your processes are good enough that you can rely on knowing that only intended changes are applied only at the time you intend them? How would an ITIL-type shop use Chef with their change control? Are there any case studies published?

Thanks,

Dan

On Tue, Feb 1, 2011 at 5:25 PM, Sean OMeara wrote:
To clarify, chef-client in this scenario would be run in the context
of a normal posix user.

The deployment team would log into the machine with their individual
accounts, then invoke chef-client. chef-client's runlist would do its
thing, writing to directories that the user had group write access to.

If a recipe attempted to modify /etc/hosts or /etc/shadow, it would
fail, throwing a permissions error.

-s

On Tue, Feb 1, 2011 at 5:14 PM, Sean OMeara wrote:
> Chef (and puppet and cfengine and bcfg2 and.....) are really meant to
> converge an entire system against a (composed) state description, not
> to do "one-off" operations.
>
> Keeping that in mind, if you're absolutely sure that the "one-off" run
> list won't step on the toes of the "normal" runlist, you could try:
>
> Adding an additional Client (SSL user) to your platform account, and
> calling chef-client -c /path/to/an/alternate/configfile.rb, that
> specifies a special separate node name.
>
> For example: foonode-oneoff.yourdomain.here.
>
> Add your runlist to that node object, then call chef-client against
> the special node.
>
> YMMV
>
> -s
>
>
> On Tue, Feb 1, 2011 at 5:03 PM, Dan Nemec wrote:
>> Matt,
>> That's a great idea. I looked it over and I think it does solve the problem
>> I'm thinking of with a one-shot runlist.
>> I'm relatively new to Chef so I don't know everything that is possible and
>> have a few questions.
>> 1) Your attribute that contains a list of recipes, can it contain roles with
>> runlists as well?
>> I'm still left with the problem that I require a "base" runlist as well. The
>> way I see your one-shot runlist is that if I want to use it with
>> "chef-client -j" then I still need to include all of the roles and recipes I
>> consider "base" in the list of the -j option. Your solution just provides me
>> the mechanism to attach some recipes (maybe roles) to an existing runlist
>> where it will be removed after the run. That is exactly what I need for the
>> second half of my problem.
>> [I'll interject here that one of my design goals is that I have
>> environment-specific configuration in as few places as possible. Commands,
>> especially, cannot be environment dependent. I just want to run "doit" not
>> "doit.prod".]
>> 2) Suppose I have a databag with some configuration that is a runlist that
>> can contain roles or recipes (here in this databag is my node-specific
>> information, and nowhere else). Is it possible to have a recipe that will
>> read from the databag, then construct a runlist on the fly and run that
>> runlist?
>> That way I can say something like:
>> chef-client -j infrastructure.json
>> where infrastructure.json looks like
>> { "run_list": [ "recipe[node-manager::base-runlist]",
>> "recipe[one-shot::infrastructure]" ] }
>> Then when that runs, it concatenates the runlist from the node databag
>> attribute and the runlist from the infrastructure attributes.
>> Let me know if you think I'm getting too far out in my quest for base
>> runlists and one-shot runlists. I can't run a monolithic runlist that does
>> everything.
>> Dan
>> On Tue, Feb 1, 2011 at 2:43 PM, Matt Ray wrote:
>>>
>>> Thinking about this problem, I've written a "one-shot" cookbook that
>>> may be used to solve simple cases of this problem.
>>>
>>> https://github.com/mattray/cookbooks/tree/master/one-shot
>>>
>>> This cookbook provides a framework for making single-use, one-shot
>>> recipes. By including the "one-shot" recipe in the node's run_list, on
>>> the next chef-client run the contents of the "one-shot::one-shot"
>>> recipe will be called. This is parametrized as an attribute, so you
>>> can change these out by setting the ["one_shot"]["recipe"] to include
>>> different recipes (and uploading dependencies if necessary). The file
>>> roles/one-shot.rb is included so you can simply change the role
>>> instead of changing the source directly.
>>>
>>> Thanks,
>>> Matt Ray
>>> Technical Evangelist | Opscode, Inc
>>> T: (512) 731-2218
>>> Twitter, Github: mattray
>>>
>>>
>>>
>>> On Fri, Jan 28, 2011 at 2:18 PM, Dan Nemec wrote:
>>> > Sorry for the really long post.
>>> > Here is our use case:
>>> > I agree that one-off runlists are a component of overall orchestration.
>>> > Right now we use Control Tier for orchestration. It can handle the
>>> > workflow
>>> > [take server out of load, wait for connections to drain, deploy code to
>>> > server, run smoke test, put server back in load]. We want to use Chef
>>> > for
>>> > the "Deploy Code" step. Actually, we plan to use it to deploy
>>> > configuration
>>> > and all configuration dependencies where Control Tier deploys just the
>>> > code.
>>> > (We don't have any Chef implemented, so these are currently only plans.
>>> > We
>>> > do have Control Tier running and have been using it for over a year
>>> > orchestrating deployments).
>>> > In our case the thought process is that Control Tier would dispatch a
>>> > "chef-client -j <runlist>" or some such thing to the node that is being
>>> > acted upon. We want that runlist to have only what is important to that
>>> > activity. For a code deployment the runlist would deploy application
>>> > code.
>>> > For system updates the runlist would update system things. Any runlist
>>> > that
>>> > runs on the node is going to need some shared set of attributes on the
>>> > node.
>>> > We need a whole lifecycle of keeping the node attributes up to date so
>>> > that
>>> > all the new configuration for the upcoming deployment is loaded prior to
>>> > the
>>> > deployment.
>>> > Answering your second question here: before we knew all the details
>>> > about
>>> > Chef, we had the concept of an "attribute runlist" and an "action
>>> > runlist"
>>> > where the attribute runlist would be one runlist used to manage all node
>>> > attributes and would not have any recipes that would actually perform
>>> > work
>>> > on the node. Then, we would maintain a collection of activity runlists
>>> > that
>>> > perform sets of system actions relying on the existing attributes on the
>>> > node.
>>> > Now, we plan on one more variation. I'll preface it with
>>> > a disclaimer that
>>> > we are an "old-school" shop learning new tricks. We have a 10+ year old
>>> > code
>>> > base and 10 years of process built around the caution that comes from
>>> > countless painful deployments. We don't have the luxury of wiping the
>>> > slate
>>> > clean so we have to make incremental improvements and build on each
>>> > success.
>>> > That being said, we plan to "pre-deploy" most of our changes. So, the
>>> > day
>>> > before the scheduled deployment we plan to lay down all the code and
>>> > configuration that is needed for the deployment in a location near the
>>> > running code. Then, the deployment becomes more of [Stop, flip links,
>>> > update
>>> > database, Start]. In this case we would have a runlist that would
>>> > pre-deploy
>>> > configurations and a separate one that would activate the
>>> > configurations.
>>> > Let me know if I am unaware of a feature here: Expanding on the notion
>>> > of an
>>> > "attribute runlist", node attributes should be a persistent feature of a
>>> > node.
>>> > If I set an attribute that says my administrator's email address is
>>> > "> , then I shouldn't have to have a role in every runlist
>>> > to
>>> > assure that my admin email is always set. "chef-client -j" is
>>> > destructive in
>>> > that it only maintains attributes in the runlist that it ran. This does
>>> > create the problem that if you have persistent attributes you need a
>>> > method
>>> > of removing them. It is a challenging problem when specifying your
>>> > attributes to make a process to be able to remove ones you no longer
>>> > need.
>>> > Chef will provide the way to delete; the user must figure out what to
>>> > delete.
>>> > Now, to answer your first questions: I do not think that maintaining one
>>> > node object per activity set would be practical in the long run.
>>> > Dan
>>> >
>>> > On Fri, Jan 28, 2011 at 2:31 PM, Charles Duffy wrote:
>>> >>
>>> >> This speaks more to orchestration than to one-off run lists, but let me
>>> >> comment --
>>> >> The most interesting workflow I've been interested in modeling is along
>>> >> the lines of the following:
>>> >> "If average load across all application servers is less than 1.0, no
>>> >> more
>>> >> than 1/5 of all app servers are out of the pool, and this node is
>>> >> flagged as
>>> >> having at least one recipe in the pending-downtime list:
>>> >>  - remove this node from the load balancer's pool
>>> >>  - wait for all requests to drain
>>> >>  - run all recipes in the pending-downtime list, removing each from
>>> >> said
>>> >> list after successful completion
>>> >>  - when pending-downtime list is empty, put this server back into the
>>> >> pool"
>>> >> ...where several different recipes have the ability to add their own
>>> >> entries to the pending-downtime list (which could be anything from a
>>> >> firewall reconfiguration to an application restart to a full-system
>>> >> reboot)
>>> >> Of course, the "no more than 1/5 of all app servers are out of the
>>> >> pool"
>>> >> requirement calls for some care to avoid race conditions.
>>> >> If y'all are working on an orchestration solution, I would be very
>>> >> interested to hear how it addresses this kind of use case.
>>> >> On Fri, Jan 28, 2011 at 1:12 PM, Chris Walters wrote:
>>> >>>
>>> >>> Dan,
>>> >>> Absolutely. One-off run lists are one of the most requested features.
>>> >>> They also fit into some of the preliminary discussions we've had about
>>> >>> orchestration models. We plan to get a design together for one-off run
>>> >>> lists
>>> >>> in the next few weeks to share with the community for feedback.
>>> >>> If you're willing to comment on your use case more, here are a few
>>> >>> questions that I have.
>>> >>> For your use case, does the multi-node solution with a shared base run
>>> >>> list work, or do you actually need to have only one node object for
>>> >>> the
>>> >>> purpose of searching?
>>> >>> Should run lists be first-class objects instead of just properties on
>>> >>> nodes and roles? Should they be able to contain not only roles and
>>> >>> recipes
>>> >>> but run list-containing entities (nodes and other dis-embodied run
>>> >>> lists),
>>> >>> as well?
>>> >>> If anyone else has opinions on any aspect of one-off run lists, please
>>> >>> respond, as well.
>>> >>> Thank you for your input.
>>> >>> -chris
>>> >>>
>>> >>> On Fri, Jan 28, 2011 at 7:31 AM, Dan Nemec wrote:
>>> >>>>
>>> >>>> So, the obligatory next question is:
>>> >>>> "Is this anywhere on the roadmap?"
>>> >>>> Thanks for the suggestion about multiple nodes. We'll play with that
>>> >>>> and
>>> >>>> see if it may be a workable, if not ideal, solution.
>>> >>>> Dan
>>> >>>>
>>> >>>> On Wed, Jan 26, 2011 at 5:46 PM, Chris Walters wrote:
>>> >>>>>
>>> >>>>> Hi Dan,
>>> >>>>> There isn't currently a way that I can think of to run one run list
>>> >>>>> after another except to package up the main run list into a role and
>>> >>>>> prepend
>>> >>>>> that role to the one-off run list's items.
>>> >>>>> As for one-off run lists, there isn't currently a built-in solution.
>>> >>>>> Since a single server can be managed by many chef nodes, one way to
>>> >>>>> do it is
>>> >>>>> to have different JSON files like you do, but run them as different
>>> >>>>> nodes.
>>> >>>>> Something like:
>>> >>>>>
>>> >>>>> infrastructure maintenance runs:
>>> >>>>> "chef-client -j infra-maint.json -n node-XYZ-infra-maint"
>>> >>>>> deployment team runs:
>>> >>>>> "chef-client -j deployment.json -n node-XYZ-deployment"
>>> >>>>>
>>> >>>>> etc.
>>> >>>>>
>>> >>>>> Does that help?
>>> >>>>> -chris
>>> >>>>> On Wed, Jan 26, 2011 at 1:55 PM, < "> > wrote:
>>> >>>>>>
>>> >>>>>> We have run into an interesting problem. We want to segregate
>>> >>>>>> runlists
>>> >>>>>> by
>>> >>>>>> activity (e.g. infrastructure maintenance, deployment, one-off,
>>> >>>>>> etc…).
>>> >>>>>> But we
>>> >>>>>> want all the runlists to share some common role information about a
>>> >>>>>> node. We
>>> >>>>>> have a node that has some roles (datacenter, servergroup, tier)
>>> >>>>>> that
>>> >>>>>> are
>>> >>>>>> important identifiers and drive selection of certain attributes. We
>>> >>>>>> want
>>> >>>>>> different groups to be able to do maintenance on their parts at
>>> >>>>>> different times
>>> >>>>>> without impacting others. So if a sysadmin wants to update
>>> >>>>>> /etc/hosts
>>> >>>>>> he
>>> >>>>>> shouldn’t have to worry if the application team has put in a new
>>> >>>>>> attribute
>>> >>>>>> for a deployment later. The sysadmin can run a runlist that only
>>> >>>>>> affects the
>>> >>>>>> parts of the system he is responsible for without worrying that an
>>> >>>>>> application
>>> >>>>>> deployment recipe will run. Conversely in a software deployment the
>>> >>>>>> deployment
>>> >>>>>> team should be able to update the applications without updating the
>>> >>>>>> operating
>>> >>>>>> system (given the os changes are not part of the software
>>> >>>>>> deployment).
>>> >>>>>>
>>> >>>>>> I thought “chef-client -j” would do this, but it didn’t. This is
>>> >>>>>> what
>>> >>>>>> I
>>> >>>>>> did: I created a node and bootstrapped it with a runlist of its
>>> >>>>>> identity roles.
>>> >>>>>> I then made a json file with a runlist for a set of activities and
>>> >>>>>> ran
>>> >>>>>> the
>>> >>>>>> runlist via “chef-client -j <jsonfile>”. The problem is that the
>>> >>>>>> runlist
>>> >>>>>> for the node that existed before chef-client gets wiped out and
>>> >>>>>> only
>>> >>>>>> the
>>> >>>>>> runlist in the json file gets run thus wiping out its “identity”
>>> >>>>>> and
>>> >>>>>> breaking the one-off runlist because certain attributes no longer
>>> >>>>>> exist.
>>> >>>>>>
>>> >>>>>> I’d like to be able to append a runlist on the fly to an existing
>>> >>>>>> runlist on
>>> >>>>>> the node where the new runlist exists on the node only for the
>>> >>>>>> duration of the
>>> >>>>>> chef-client run. The node has a “base” runlist that should always
>>> >>>>>> be
>>> >>>>>> run,
>>> >>>>>> but I want to run some other recipes and roles one at a time while
>>> >>>>>> keeping the
>>> >>>>>> “base” runlist. I do not want to have to copy the base runlist into
>>> >>>>>> the
>>> >>>>>> json file of the one-shot runlist that I am running as I’m trying
>>> >>>>>> to
>>> >>>>>> keep the
>>> >>>>>> “activity” runlists environment independent.
>>> >>>>>>
>>> >>>>>> Is there a way to run a one-off runlist on a node that is
>>> >>>>>> effectively
>>> >>>>>> appended
>>> >>>>>> to the runlist that is already on the node and is removed after the
>>> >>>>>> run?
>>> >>>>>>
>>> >>>>>> Thanks,
>>> >>>>>>
>>> >>>>>> Dan
>>> >>>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>> >
>>
>>
>



