[chef] Re: Why does chef have a two-pass execution flow?


  • From: Ian MacLeod
  • Subject: [chef] Re: Why does chef have a two-pass execution flow?
  • Date: Mon, 28 Jan 2013 23:28:22 -0500
  • Accept-language: en-US

Heh, fantastic example resource.  I'm still a bit fuzzy, though!

Even though we're collecting a bunch of resources and their desired states during the compilation phase, in practice it seems that, for the most part, the result mirrors the ordering and conditionals put forth by a given recipe.

By "in practice", I'm referring to my fractured understanding and that gleaned off of tutorials/docs; not necessarily The Right Way :P 

How often do recipes converge out of their declarative ordering (due to triggers, or other recipes)?  Maybe I'm missing a point here, but that seems to be the main benefit of the two-phase execution.  Most examples I see are pretty sequential in nature.  Say, for example, the openssh cookbook:

  • Install the openssh package
  • Configure it as a service
  • Lay down the configuration file (declared twice?)
  • (Re)start the server if the config file was touched

Wouldn't that cookbook (and many/most others) be just as successful if they converged immediately?

Apologies if I'm coming off as dogmatic/combative - I'm mostly just trying to understand the rationales here; I've got a few hundred hosts and a myriad of roles to manage with chef, and I want to make sure I do it properly.


"LWRPs that declare inline resource execute in the compilation phase"

Sorry, let me clarify.  Say I have a recipe:

evil_cat "mr-bigglesworth" do
  action :stroke
  variant "persian"
end
And the following evil_cat provider:

use_inline_resources

action :stroke do
  converge_by("ensure that we have an evil #{new_resource.variant} cat on hand") do
    unless have_henchman? "cat-wrangler"
      shell_out! "hire-henchman cat-wrangler"
    end

    henchman "cat-wrangler" do
      action :dispatch
      destination "animal-shelter"
      not_if { have_cat_variant? new_resource.variant }
    end
  end

  converge_by("stroke cat in a most evil manner") do
    Chef::Log.info "#{new_resource.name} is most pleased."
  end
end

I'll get the following ordering:

Starting Chef Client, version 11.0.0.beta.0
Compiling Cookbooks...
Converging 1 resources
Recipe: world-domination::default
  * evil-cat[mr-bigglesworth] action stroke
    - ensure that we have an evil persian cat on hand
    - stroke cat in a most evil manner
Recipe: <Dynamically Defined Resource>
  * henchman[cat-wrangler] action dispatch
    - Dispatching cat-wrangler to animal-shelter

The inlined henchman resource ends up executing after the action block returns.  This is good in that it's consistent with how recipes behave.  However, it's confusing because most examples of LWRPs out there perform work directly in the action block (frequently via calls to shell_out!).  You're forced into an all-or-nothing choice: either do everything through resources, or cheat a bit and use the action :nothing, run_action(…) trick.
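
A toy model of that ordering, in plain Ruby, may make it easier to see.  This is only a sketch and not how Chef implements it: `log`, `declared`, and `stroke` are made-up names, and the queue simply stands in for whatever Chef does with resources declared inside an action block.

```ruby
# Toy model only; `log`, `declared`, and `stroke` are made-up names, not Chef's API.
log = []
declared = []   # stands in for the resources an action block declares

stroke = lambda do
  log << "action: direct work (shell_out!-style) runs immediately"
  # Declaring an inline resource only queues it; nothing runs yet.
  declared << -> { log << "henchman[cat-wrangler] action dispatch" }
  log << "action: stroke cat"
end

stroke.call               # the action block runs to completion first
declared.each(&:call)     # then the queued inline resources converge

puts log
```

In this model the henchman line always lands last, after both pieces of direct work, which is the same shape as the output above.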

-Ian

On Jan 28, 2013, at 7:01 PM, Sean OMeara wrote:

Hi Ian.

This is one of the biggest conceptual stumbling blocks with new Chef users.

Chef is very much not "just imperative programming" (despite what
you'll read at certain places on the internet). Chef is about building
a set of convergent resource statements called a Resource Collection.
By doing this, it allows you to create subscribe/notify relations
between resources, so you can restart a service when you fix a
configuration file, for example. Having a compilation phase allows you
to make decisions about what to put on the collection.

It allows for things like this:

1.upto(99) { |i|
 beer_bottle "#{i}" do
   action [:take_down, :pass_around]
 end
}

That would place 99 uniquely named convergent beer_bottles on the
resource collection.
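
As a rough illustration of the two passes (plain Ruby, not Chef's
real classes), the sketch below records resources during a "compile"
step and only acts on them in a later "converge" step. ToyResource,
ToyRun, and the notifies: option are hypothetical names, and the toy
assumes every resource changes something, so its notification always
fires.

```ruby
# Toy model of a two-pass run. None of these classes are Chef's own.
ToyResource = Struct.new(:name, :action, :notifies)

class ToyRun
  attr_reader :collection, :log

  def initialize
    @collection = []   # filled during the compile pass
    @log = []          # what happened during the converge pass
  end

  # Compile pass: declaring a resource only records it.
  def declare(name, action, notifies: nil)
    @collection << ToyResource.new(name, action, notifies)
  end

  # Converge pass: walk the collection in declaration order, then run
  # each delayed notification once at the end.
  def converge
    delayed = []
    @collection.each do |res|
      @log << "#{res.action} #{res.name}"
      delayed << res.notifies if res.notifies && !delayed.include?(res.notifies)
    end
    delayed.each { |target| @log << "restart #{target}" }
    @log
  end
end

run = ToyRun.new
1.upto(3) { |i| run.declare("beer_bottle[#{i}]", :take_down) }
run.declare("template[lb_config]", :create, notifies: "service[lb]")

puts run.collection.size   # four resources recorded, nothing converged yet
puts run.converge
```

Because the whole collection exists before anything converges, a
resource can refer to another one that is declared elsewhere in the
run, which is what makes the notify/subscribe relations workable.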

More usefully, it lets you take advantage of ruby libraries (sql,
chef, anything) to drive data about what to declare.

mah_nodes = search(:node, "role:webserver")
template "lb_config" do
 variables( :members => mah_nodes)
end

Chef was designed with a larger infrastructure in mind, not just
single nodes. Spinning up a new node can automatically be integrated
without having to manually track topology information, which is very
hard, if not impossible, to do on IaaS providers.

As to your concerns:

- "It is very easy to execute tasks out of order by accidentally
defining work outside of a resource"

With great power comes great responsibility. There's also nothing
stopping you from dd'ing /dev/urandom into your boot sector. After
some practice with Chef and an understanding of the resource
collection, instances of "doing work in the compile phase"
stand out at a glance.
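
A minimal sketch of the pitfall, again in plain Ruby rather than
Chef itself: events and collection are made-up names, and each
"resource" is just a queued lambda. The bare Ruby line sits between
two declarations but runs first, because it executes while the
"recipe" is still being read.

```ruby
# Toy model only; `events` and `collection` are made-up names.
events = []
collection = []

# "Recipe" code, evaluated top to bottom during the compile pass.
collection << ["package[openssh]", -> { events << "converge: install package" }]
events << "compile: plain Ruby ran here"   # work done outside a resource
collection << ["service[ssh]", -> { events << "converge: start service" }]

# Converge pass: only now do the queued resources do their work.
collection.each { |_name, work| work.call }

puts events
```

Wrapping that middle step in a resource (ruby_block or execute, in
real Chef) would queue it with the others and keep it in order.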

Repeat: "Chef is Not Just Ruby"

- "LWRPs that declare inline resource execute in the compilation phase"

Actually, no, LWRPs (or "custom types", as I've started calling them)
are evaluated in the compilation phase, and added to the resource
collection just like any core resource type. Currently, custom_type
with nested_resource1 and nested_resource2 will appear as three
resources on the collection. In Chef 11, there will be the option to
have it appear as a single resource, through the magic of
run_contexts.

- "It is not very obvious to newcomers"

I could not agree more. We make a point to spend a bit of time
explaining this and making it as clear as possible in our training
sessions, but the docs could definitely be more clear about it. Making
this more widely understood is a bit of a personal mission of mine.
Unfortunately, the "Chef is just pure Ruby" myth remains widespread.

Hope that helps,

-s

On Mon, Jan 28, 2013 at 9:16 PM, Ian MacLeod wrote:
Sorry if this is (most likely) a repost; I wasn't able to find any dupes in
the list archive, though :(

Chef executes recipes with a two pass model: First it compiles each
resource/provider by walking through the run list.  Then it converges each
of the compiled resource/providers to get the host to the desired state.

While I (finally) understand how it works, I don't understand the why.  It
seems like there are quite a few pitfalls w/ this model:

* It is very easy to execute tasks out of order by accidentally defining
work outside of a resource (forgetting to use execute/ruby_block providers,
dropping to lower level Ruby, etc)

* LWRPs that declare inline resources have a confusingly different execution
order (their actions occur in the convergence phase, but any resources they
declare get run after the current provider)

* It is not very obvious to newcomers; they seem to expect that recipes are
executed immediately.

What are the benefits of this two pass architecture?  Why was Chef built
this way?  (I'm not finding obvious docs on it)

Thanks,
-Ian




