Regarding that thread, I (woohoo!) finally got around to
submitting a pull request that splits the Windows ohai kernel plugin
into separate plugins so it will be easier to cut down on unneeded
ohai data for Windows.
On 6/17/12 3:56 AM, Jeremiah Snapp wrote:
I'm just adding my two cents to the great suggestions from MC
and Sascha.
As KC suggested, you may want to keep your nodes from all
converging at the same time, which reduces the number of concurrent
requests to the server.
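To make the effect of staggering concrete, here is a rough sketch in Python. It is not how chef-client implements its splay setting; it only models each node waiting a random delay before starting a run. The 160-node count comes from this thread; the 600-second splay, the 60-second window, and the helper names `start_offsets` and `peak_starts_per_window` are assumptions made up for illustration.

```python
import random
from collections import Counter


def start_offsets(node_count, splay_seconds, seed=0):
    """Return one start delay (in seconds) per node.

    With splay_seconds == 0 every node starts at offset 0. Otherwise each
    node waits a uniformly random whole number of seconds in
    [0, splay_seconds]. The seed only makes the sketch repeatable.
    """
    rng = random.Random(seed)
    if splay_seconds == 0:
        return [0] * node_count
    return [rng.randint(0, splay_seconds) for _ in range(node_count)]


def peak_starts_per_window(offsets, window_seconds=60):
    """Largest number of runs that begin inside any one fixed window."""
    buckets = Counter(offset // window_seconds for offset in offsets)
    return max(buckets.values())


# 160 nodes as in this thread; the 600 s splay is a hypothetical value.
no_splay = start_offsets(160, 0)
with_splay = start_offsets(160, 600)

print("peak starts per minute without splay:", peak_starts_per_window(no_splay))
print("peak starts per minute with splay:", peak_starts_per_window(with_splay))
```

Without any splay all 160 runs land in the same minute; with a random delay they spread across the splay interval, so the server sees a much smaller burst at any one time.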
Given the large amount of Windows Ohai data, you may want to
look at a Chef thread from May 29 with the subject
"Knife search note returning a node". It mentions disabling a
Windows Ohai plugin to reduce the amount of content.
Refer to http://wiki.opscode.com/display/chef/Disabling+Ohai+Plugins
On Jun 17, 2012 4:31 AM, "Madhurranjan Mohaan" wrote:
Thanks yet again for the response!
@KC - Yes, we're running just one thread and the nodes are
converging every hour. I'll spike out the unicorn + nginx
setup on a new box with CentOS 6.2, see how that behaves,
and then probably move these out to that setup. Thanks
for the tip!
@Sascha - It's a mix of Windows 2003 32-bit servers and Windows
2003 64-bit, mostly. I'm not sure if the sheer amount of Ohai
data is causing this. Any other parameters I should consider?
Ranjan
On Sun, Jun 17, 2012 at 3:26 AM, Sascha Bates wrote:
Could it be the number of Windows servers and the
astonishing amount of ohai data collected for Windows? My
understanding is that Windows ohai has an awful lot of
data. I haven't worked with it in a few months so my
memory is fading a bit and I was chef-solo anyway. 120
Windows nodes might produce a lot of data.
On 6/16/12 3:47 PM, KC Braunschweig wrote:
On Sat, Jun 16, 2012 at 12:41 PM, Madhurranjan Mohaan wrote:
Do you think we should scale out? If yes, what services do you
think we should run on different servers? Also, on my end, I am
trying to see if all
Regarding the instability, I can tell you I had issues on RHEL 5.7
because the versions of couchdb and erlang were old. Newer packages
probably would have fixed it, but I upgraded to RHEL 6.1 which also
had newer versions and things were happier. Doesn't sound exactly like
your instability, but worth considering.
Regarding the performance issues, I hope that Josh was joking. 160
nodes is nothing. Are they converging every 30 minutes? Do you have a
reasonable splay? Are your recipes very search heavy? It could be a
lot of things, but I'd start with considering the concurrency on the
server API. Are you running a single Thin process for the API server?
If so, consider running multiple processes with proxy balancer or some
such in front of them. Alternatively switch the server to run in
unicorn with nginx in front of it. I've been happy with unicorn so
far.
I don't think you should be there yet, but 4gb is probably not gonna
be enough forever. Eventually solr will want more heap and you'll need
memory as you add api server workers and couch will take whatever's
left. Which leads back to either adding memory or Josh's point of
splitting components on different servers. That's eventually though,
I'd hope you could get at least a couple hundred nodes with your
current VM and 1000+ with 8gb without too much trouble.
To give you an example, I have a preprod server with
about 1000 nodes:
RHEL 6.1 VM
8gb
4 virtual cores
unicorn - 8 api workers, 2 webui workers
solr - 2gb heap
chef 0.10.4
KC
On Sat, Jun 16, 2012 at 7:25 PM, Joshua Timberman wrote:
Are you running all the chef server services on
one machine? What is the
hardware spec of it? 160 nodes is quite a few.
Sounds like you may need to
start scaling out the server and run services on
separate systems.