[chef] Re: Re: Re: Re: Data Bag Search Delay


  • From: Greg Zapp
  • To:
  • Subject: [chef] Re: Re: Re: Re: Data Bag Search Delay
  • Date: Fri, 4 Oct 2013 10:41:48 +1300

Thanks too, Zac. I may be able to create more data bags than I originally planned, to ease the pain of iterating. It would be nice if Chef used something like Couchbase, or even upgraded to Solr 4 with its soft commits, though.

I'm also setting up a "shared" hosting environment like Steven, and in load-balanced pools to boot. I find myself debating, more and more as I get into the devilish details, whether I should bother implementing certain things in Chef or just do them through our agent (it can run jobs).

@Steven: I have a few ideas around this myself.
* Create a data bag for each node and place the domains it needs in there. Then you can iterate over all items efficiently. You can also set up the domain on multiple hosts, which makes migrating services easy enough.
* Have nodes remove data items after working them. Implement a unique ID and revision numbers. This will allow updates and give the node the ability to detect which item is the latest. Store the info in node attributes (or at least the revision number) and save before removing the data item from the bag. This will keep the domain bag cheap to iterate and create a job queue of sorts.
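
A rough sketch of the second idea, in plain Ruby so it runs outside Chef. The item layout (a 'uid' field and an integer 'revision' field) and the helper name pending_items are assumptions made for illustration, not anything Chef defines. In a recipe the items would come from the data bag, and the applied revisions would live in node attributes.

```ruby
# Hypothetical item layout: each queued item carries a 'uid' naming the
# domain and an integer 'revision'. `applied` maps uid => last revision
# the node has already worked (what would be stored in node attributes).
def pending_items(items, applied)
  # Keep only the newest revision of each uid.
  latest = items.group_by { |item| item['uid'] }
                .map { |_uid, revs| revs.max_by { |item| item['revision'] } }
  # Drop anything the node has already applied.
  latest.select { |item| item['revision'] > applied.fetch(item['uid'], 0) }
end

queue = [
  { 'uid' => 'example.com', 'revision' => 1 },
  { 'uid' => 'example.com', 'revision' => 2 },
  { 'uid' => 'example.org', 'revision' => 1 }
]
applied = { 'example.org' => 1 }

pending_items(queue, applied).each do |item|
  # Configure the domain here, then record the revision before the
  # item is removed from the bag.
  applied[item['uid']] = item['revision']
end
```

Recording the revision before deleting the item means a run that dies halfway does not lose track of what was already applied.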

I have other ideas but can't recall them all just yet; it's too early.  Looking forward to what others suggest.



On Fri, Oct 4, 2013 at 7:47 AM, Steven Barre wrote:
Thanks for the info, Stephen and Seth.

I'm still new to this and only expect to have 200 nodes or so max. I've only got 15 right now.


> What you may want to ask instead, is what is it about your usage of databags that necessitates real-time search?

Maybe you have a better idea of how I should be doing things? I'm setting up web servers with shared hosting. I've got a data bag for all the domains; each domain document has an attribute saying which node it belongs on. Then the recipe does

search(:domains, "nodes:#{node['hostname']}") do |domain|

to find all the domains it needs and to configure them.
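
For comparison, the same selection can be made without going through the search index, by loading the items directly and filtering in Ruby. The sketch below is plain Ruby with made-up sample items; the helper name domains_for is hypothetical, and it assumes the `nodes` field holds either one hostname or a list of them. In a recipe the items would be loaded with data_bag and data_bag_item, which read from the server directly rather than from Solr.

```ruby
# Hypothetical helper: select the domain items assigned to one host.
# `items` stands in for what a recipe would load with
#   data_bag('domains').map { |id| data_bag_item('domains', id) }
def domains_for(items, hostname)
  # Array() lets 'nodes' be a single hostname or a list of hostnames.
  items.select { |item| Array(item['nodes']).include?(hostname) }
end

items = [
  { 'id' => 'example_com', 'nodes' => 'web01' },
  { 'id' => 'example_org', 'nodes' => %w[web01 web02] },
  { 'id' => 'example_net', 'nodes' => 'web03' }
]

domains_for(items, 'web01').each { |domain| puts domain['id'] }
```

The cost is one request per item on every run, so this only stays cheap while the bag is small.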

We then have a webui to allow people to create domains. And the plan is to just add the document to the databag then call chef-client on the node and then return a success message to the user. So having that webui take a minute is a little undesirable.

What would be a better way to handle this?


=================================================
Steven Barre, RHCE, ZCE, MCP


Systems Administrator / Programmer
Real Estate Webmasters - 250-753-9893
==================================================
On 2013-10-02 18:09, Stephen Delano wrote:
Solr 1.4, the Solr included with the Chef server, is asynchronous in "committing" saved objects to the index. The rate at which Solr commits is tunable. The defaults are set to commit every 60 seconds or every 1000 documents, as seen here: https://github.com/opscode/omnibus-chef-server/blob/master/files/chef-server-cookbooks/chef-server/attributes/default.rb#L74-L75

You can tune these to your heart's content by editing the /etc/chef-server/chef-server.rb file to override the default values, but you should be aware of the tradeoffs that you're making by doing so.
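
As an illustration only, an override might look like the fragment below. The setting names are an assumption based on the chef-solr attributes in the defaults file linked above, and the 5-second interval is an arbitrary example value, not a recommendation.

```ruby
# /etc/chef-server/chef-server.rb
# Assumed setting names, mirroring the chef-solr attributes in the
# linked defaults file. Check them against your installed version.
chef_solr['commit_interval'] = 5000  # milliseconds between commits (default 60000)
chef_solr['max_commit_docs'] = 1000  # commit after this many documents (default 1000)
```

The change takes effect after running chef-server-ctl reconfigure.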

Every time Solr commits to the index, it blocks all incoming updates. As you shorten the interval between commits, the time that chef-expander has available to send updates to Solr decreases, and under heavy load you may find yourself in a state where your update rate outruns the rate at which you can commit objects to the index. If you're going to be putting this server under heavy load, proceed with caution.

What you may want to ask instead, is what is it about your usage of databags that necessitates real-time search?

-Stephen

On Wed, Oct 2, 2013 at 5:07 PM, Noah Kantrowitz wrote:
Get more CPU for Solr. There have been some experiments with replacing Solr with ElasticSearch, which can have better insertion performance, so you could also look at working on that patch.

--Noah

On Oct 2, 2013, at 4:44 PM, Steven Barre wrote:

> It takes 60 seconds from when I call "knife data bag from file somebag path/to/some.json" until "knife search somebag" will return the answer. Is there anything that can be done to make that faster?
>
> http://community.opscode.com/questions/436 also describes the issue.
>
> CentOS 6.4
> chef-server-11.0.8-1.el6.x86_64
>
> --
> =================================================
> Steven Barre, RHCE, ZCE, MCP
>
> Systems Administrator / Programmer
> Real Estate Webmasters - 250-753-9893
> ==================================================
>




--
Stephen Delano
Software Development Engineer
Opscode, Inc.
1008 Western Avenue
Suite 601
Seattle, WA 98104




