[chef] Re: Re: Re: Re: Re: Problem accessing databag from node


Chronological Thread 
  • From: Jay Feldblum < >
  • To:
  • Subject: [chef] Re: Re: Re: Re: Re: Problem accessing databag from node
  • Date: Fri, 9 Dec 2011 19:18:28 -0500

Chef is a distributed system. Using search would be a good approach, so long as your code takes that into account. If you go that route, you don't rely on timing; rather, you code for the fact that timing may be "off." I don't think that's sloppy - I think it's part of what it means for nodes to "converge" over time, over multiple runs, to a final "finished" state.

For example, to bring up a cluster, here's what you can do. Add a normal-level attribute on a first node indicating it is the seed node (don't add that attribute to any other node). Then, during a convergence, each node will only create the cluster if it is the seed node, and will only add itself to the cluster if some other node is the seed node (found by search) and that node is reachable and healthy. The seed node might not be the primary node coordinating the cluster, but any node should be able to find the primary node by asking the seed node what the primary is.

If the original seed node fails at a later point and needs to be removed from the cluster, just delete that node object from the chef-server and make a different node the new seed node (by setting that specific normal-level attribute) before bringing up any new nodes into the cluster. Note that normal-level attributes may be set via knife or via the webui, and that they persist across chef-client runs. You can use node tags for the purpose of the seed attribute, since node tags are normal-level attributes and are easily edited via the CLI (`knife tag create NODE TAG`) without requiring an editor.

There is certainly room for extension and variation - for example, to eliminate the SPOF where there can only be one seed node.
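The per-run decision each node makes can be sketched in plain Ruby. This is a hypothetical model, not Chef's actual DSL: `cluster_action`, the `cluster-seed` tag name, and the `seed_healthy` check are all assumed names, and `my_tags`/`seed_nodes` stand in for the node's tags and the results of a `search` call.

```ruby
# Hypothetical sketch of the seed-node convergence logic described above.
SEED_TAG = "cluster-seed"  # assumed tag name; set via `knife tag create NODE cluster-seed`

# Decide this node's action for the current chef-client run.
#   my_tags:    this node's tags (normal-level attributes)
#   seed_nodes: nodes found by something like search(:node, "tags:cluster-seed")
def cluster_action(my_tags, seed_nodes, seed_healthy: true)
  if my_tags.include?(SEED_TAG)
    :create_cluster      # only the seed node ever creates the cluster
  elsif !seed_nodes.empty? && seed_healthy
    :join_cluster        # join via the seed once search finds it
  else
    :wait_for_next_run   # converge again later; timing may be "off"
  end
end
```

A node that finds no healthy seed simply does nothing this run and converges again later, which is what lets the cluster reach its finished state over multiple runs.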

- Jay Feldblum

On Fri, Dec 9, 2011 at 6:39 PM, Michael Glenney wrote:
The use case here is clustering.  The first node that comes up would search the data bag for an item named after the deployment id.  If it doesn't find an item with that name, it would create the item and, within that item, state that it is the master.  The next node to come up with that deployment id would find that item and extract the master node info.

A coworker suggested I use Chef's search capabilities instead.  Search Chef for a node with that deployment id.  If none is found, set an attribute claiming itself as the master.  Subsequent nodes would find that with search and extract what they need from the search results.  This is the method I'm going to use now, but I initially rejected his idea (and am not happy that we're having to go this route) because I have to rely on Solr properly indexing my node before the next node comes up.  I don't like having to rely on timing when it comes to automation.  It's sloppy in my eyes.  With the data bag solution, I "know" the data is there because my call to write it is successful.
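The Solr-indexing lag can be softened by polling search instead of trusting a single lookup. A minimal sketch in plain Ruby (the `find_master` helper and the retry counts are hypothetical; the block stands in for a real `search` call such as `search(:node, "deployid:... AND master:true")`):

```ruby
# Hypothetical retry loop for the indexing-lag concern above: rather than
# assuming the master is already indexed when a new node converges, poll
# search a few times before concluding that no master exists yet.
def find_master(max_tries: 5, delay: 10)
  max_tries.times do
    results = yield            # the caller supplies the actual search call
    return results.first unless results.empty?
    sleep delay                # give Solr time to index the master node
  end
  nil                          # no master found; this node may claim mastership
end
```

This doesn't eliminate the race entirely (two nodes can still both see an empty result), but it removes the dependency on a single, perfectly-timed index pass.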

A third solution I came up with is to use Amazon SQS, but I'd be happier with a pure Chef solution.

MG


On Fri, Dec 9, 2011 at 10:55 AM, Jay Feldblum wrote:
What's a use-case for a node rewriting a data bag item during a converge, that can't be solved by some other method?

Note that it would have to scale to 100,000 nodes talking to the same server and converging every 5 minutes, just as it would have to scale to 2 nodes talking to the same server and converging only when you SSH in to run the chef-client.

- Jay Feldblum


On Fri, Dec 9, 2011 at 12:48 PM, Michael Glenney wrote:
I figured out the issue.  First of all, the 403 was because I was trying to write to a data bag from within a recipe and didn't read the disclaimer in the wiki that says I'd have to give the node's API client admin privileges to be able to do that.  Of course we don't want to do that, so I'll have to come up with another way to solve the problem.

BTW, I wouldn't mind seeing Chef gain the capability to give a node permission to write to a particular data bag without giving the node admin rights, in case anyone is accepting Christmas wishes.

For the shef localhost:4000 error, that was just because, when running shef from a client node, I forgot I need to launch shef with 'shef -c /etc/chef/client.rb'.

MG

On Wed, Dec 7, 2011 at 10:09 PM, Peter Donald wrote:
Hi,

We had this exception when the chef-solr service died on the chef
server. Figuring out what killed it and restarting it was our approach.

* This was a result of some process updating the owner/permissions on
/var/log/chef and /var/run/chef so that solr failed during startup.

On Thu, Dec 8, 2011 at 4:02 PM, Michael Glenney wrote:
> I'm having problems with a 403 error when trying a new cookbook and I think
> I've tracked it down to databag access.
>
> Chef Server 10.0
> chef-client 10.4
>
> The first several lines of the stacktrace:
>
> Generated at 2011-12-08 04:12:07 +0000
> Net::HTTPServerException: 403 "Forbidden"
> /usr/lib/ruby/1.9.1/net/http.rb:2303:in `error!'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:237:in `block in
> api_request'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:288:in
> `retriable_rest_request'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:218:in
> `api_request'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/rest.rb:130:in `put_rest'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/data_bag_item.rb:227:in
> `save'
> /var/chef/cache/cookbooks/ejabberd/recipes/default.rb:45:in `from_file'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/mixin/from_file.rb:30:in
> `instance_eval'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/mixin/from_file.rb:30:in
> `from_file'
> /usr/lib/ruby/gems/1.9.1/gems/chef-0.10.4/lib/chef/cookbook_version.rb:578:in
> `load_recipe'
>
>
> The relevant part of that recipe is:
>
> # Find cluster master, or create item and become master
> if search(:ejabberd, "id:#{deployid}").count == 0
>   masternode = "#{node[:ipaddress]}"
>   h = {}
>   h[deployid] = {"id" => deployid, "master" => masternode, "members" =>
> [masternode]}
>
>   # Create new data bag item for cluster
>   databag_item = Chef::DataBagItem.new
>   databag_item.data_bag("ejabberd")
>   databag_item.raw_data = h[deployid]
>   databag_item.save
> else
>
> This is for a first chef run on a new node.  If I run shef, switch to recipe
> context, and run 'search(:ejabberd, "id:deployment_000010182")' I get back:
>
> chef:recipe > search(:ejabberd, "id:deployment_000010182")
> [Thu, 08 Dec 2011 04:51:05 +0000] ERROR: Connection refused connecting to
> localhost:4000 for /search/ejabberd, retry 1/5
> [Thu, 08 Dec 2011 04:51:10 +0000] ERROR: Connection refused connecting to
> localhost:4000 for /search/ejabberd, retry 2/5
> [Thu, 08 Dec 2011 04:51:15 +0000] ERROR: Connection refused connecting to
> localhost:4000 for /search/ejabberd, retry 3/5
> [Thu, 08 Dec 2011 04:51:20 +0000] ERROR: Connection refused connecting to
> localhost:4000 for /search/ejabberd, retry 4/5
> [Thu, 08 Dec 2011 04:51:25 +0000] ERROR: Connection refused connecting to
> localhost:4000 for /search/ejabberd, retry 5/5
> Errno::ECONNREFUSED: Connection refused - Connection refused connecting to
> localhost:4000 for /search/ejabberd, giving up
>
> but my /etc/chef/client.rb has the proper url:port for my chef server.  If I
> run the same command from my local box I get back:
>
> chef:recipe > search(:ejabberd, "id:deployment_000010182")
>  => []
>
> which is what I expect.  Any ideas where I should be looking?
>
> Thanks,
>
> MG
>



--
Cheers,

Peter Donald






