Thank you very much Christine and Daniel for your helpful discussion. I think this line is really what answers my question:
> So, for the policyfile focused stuff, I opted to make it work more like `knife bootstrap` and `knife cloud create`
That means you were specifically looking at the case where a provisioning recipe handles just one machine. I don't think I will bother with the `chef provision` tool in that case, since I don't want to enforce a "one machine per recipe" rule in our code, and the allure of chef-provisioning for us is being able to bring up whole "stacks" of related machines (what Christine called "topologies", I think).
> I did think about clustering scenarios when I was writing `chef provision`, but it turns out that it can be complicated depending on what the exact use case is. Do you want a “throwaway” cluster to integration-test your cookbooks as a whole? How do you keep different developers’ throwaway clusters from conflicting with each other?
So far my solution to this (though we haven't worked with it long enough to say whether it's a good one long term) is to require that ENV['DEPLOYMENTID'] exists and to use it in the name of every chef-provisioning resource. That way, each deployment of a stack has its own machine names and can be managed as a distinct entity of related machines.
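For illustration, here's a minimal sketch of that naming scheme. The guard at the top and the machine names are just my conventions, and the driver-specific machine_options are elided:

```ruby
# Provisioning recipe: every machine name is scoped by the deployment id.
raise 'Set DEPLOYMENTID before running this recipe' unless ENV['DEPLOYMENTID']
deployment_id = ENV['DEPLOYMENTID']

machine "web-#{deployment_id}" do
  # ... driver-specific machine_options ...
end

machine "db-#{deployment_id}" do
  # ... driver-specific machine_options ...
end
```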
The workflow I've settled on is similar to how Daniel describes it:
1. Set up the policyfile
2. Install/Update/Push the policyfile
3. Write chef-provisioning code to set up machines to use a named policyfile
4. Run the provisioning recipes with something like `chef-client -z -o my-provisioning::mystack`
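To make steps 1 and 2 concrete, the policyfile itself is just a Policyfile.rb along these lines (the policy name, group, and cookbook are the placeholders from my examples, nothing canonical):

```ruby
# Policyfile.rb (step 1) -- names here are placeholders
name 'mypolicy'
default_source :supermarket

# This is the run_list the machines will actually execute:
run_list 'mycookbooks::default'

# Pin my own cookbook to a local checkout:
cookbook 'mycookbooks', path: '../cookbooks/mycookbooks'
```

Step 2 is then `chef install` (or `chef update` after changes) followed by `chef push mypolicygroup` to publish the lock to the `mypolicygroup` policy group.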
This thread was about whether I should be using `chef provision` in step 4, because running chef-zero or chef-solo on my provisioning node feels weird.
For step 3, I've done a more cookbook-heavy version of Daniel's suggestion:
> From there, you need to set the policy_group and policy_name on the nodes via the client.rb (not sure how well this is documented, but you can pass configuration as a string with { chef_config: “your client.rb content” } as convergence_options)
Here's what I do:
In an attributes file:

```ruby
# Attributes that put each node into policyfile mode via the chef-client cookbook:
default['my-provisioning']['node_attrs'] = {
  :chef_client => {
    :config => {
      :use_policyfile => true,
      :policy_document_native_api => true,
      :policy_group => 'mypolicygroup',
      :policy_name => 'mypolicy'
    }
  }
}
```

And in the provisioning recipe:

```ruby
# Every machine gets the shared policyfile attributes plus its own settings:
machine "mymachine-#{ENV['DEPLOYMENTID']}" do
  run_list ['recipe[chef-client]', 'recipe[chef-client::config]']
  attributes node['my-provisioning']['node_attrs'].merge(
    :mycookbooks => {
      :config => {
        :stuff => 'values'
      }
    }
  )
end
```
This uses the chef-client cookbook to put the node into policyfile mode. Actual control of the run_list lives in the policyfile.
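With those attributes, the `chef-client::config` recipe should render a client.rb on the node along these lines (a sketch of the expected result, not verbatim output):

```ruby
# /etc/chef/client.rb (approximately) -- switches chef-client into policyfile mode
use_policyfile true
policy_document_native_api true
policy_group "mypolicygroup"
policy_name "mypolicy"
```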
There are some downsides to this:
* You need to run a second converge on the node in order to get it running the right run_list; that can either happen automatically via the chef-client service configuration, or you can add the second converge right in your chef-provisioning recipe.
* IIRC, subsequent executions of the provisioning recipe don't actually run the real run_list; they just ensure that the node is still set up for the right policy.
* You need to serve a version of the chef-client cookbook without the benefit of policyfile locking
* Your run_list is separate from your machine definition, which is a bit awkward: every machine ends up with the same uninformative run_list specification in your provisioning recipes.
I think I will experiment with using `chef_config: "your client.rb content"` instead of the chef-client cookbook; it might cut down on the complexity a bit.
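Something like the sketch below is what I have in mind. This is untested: I'm assuming convergence_options can be passed through machine_options for my driver, and the policy names are the placeholders from the example above:

```ruby
# client.rb content handed straight to the node, replacing the chef-client cookbook:
client_rb = <<-CONFIG
  use_policyfile true
  policy_document_native_api true
  policy_group "mypolicygroup"
  policy_name "mypolicy"
CONFIG

machine "mymachine-#{ENV['DEPLOYMENTID']}" do
  machine_options convergence_options: { chef_config: client_rb }
  # No chef-client recipes in a run_list here; once the node converges in
  # policyfile mode, the policy's own run_list takes over.
  attributes :mycookbooks => { :config => { :stuff => 'values' } }
end
```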
Thanks again, this is exactly what I was looking for from this thread. I'd be interested in hearing any additional thoughts about my workflow, and especially any insight into how you see the coordination between these tools evolving over the next year or so.