It doesn't look like you have logging set to DEBUG either. Try that and maybe check your log rotation to protect your disk space.
Bryan
From “knife node show --long" right now on both nodes the attributes look fine. Spits out 140KB file for one node and 98KB for the other. (Most of that is the list of cpus and abilities)
Will keep that in mind and run it immediately after it breaks next time.
Cheers,
Dave.
From: Bryan McLellan [mailto: " target="_blank"> ]
Sent: Friday, 20 January 2012 10:40 a.m.
To: " target="_blank">
Subject: [chef] Re: chef-client massive memory usage spikes
I wonder if a node attribute is growing out of control. Have you looked at the node attributes on the server after this run finishes?
Bryan
On Jan 19, 2012 3:40 PM, "David Leaver" < " target="_blank"> > wrote:
We are running ubuntu 10.04 LTS, x64.
chef-client is running with the default service installed with the chef .deb packages.
On two of our machines when chef-client runs and does a check it randomly (once a day to once a week) takes longer than normal (30s becomes 3 minutes+) and takes up a lot more memory than chef should.
5GB memory usage: http://pastebin.com/YKXXFRaR (Taken from a machine running right now which had the issue earlier this morning, I’ve left chef-client running at 5GB in case there is anything that can be pulled from it).
For each of the machines having this problem we have an identically configured machine, same chef roles etc (secondary for failover) which is not exhibiting any of the issues (These machines also have almost no load, whereas the ones having issues have plenty of load).
I really don’t have any idea where to start with investigating this, so any help will be much appreciated!
The runlist for each machine has apt (currently 1.1.1, will update that) which installs a few packages, however none of these have changed recently, so no packages should be being installed.
Each machine has been running with the exact same runlist for quite a while (6 months+), and only started exhibiting this problem in the last month.
The chef logs when these memory/time issues occur appear in the log as follows:
[Fri, 20 Jan 2012 07:48:06 +1300] INFO: Processing easy_install_package[jonpy] action install (mapnik::default line 43)
[Fri, 20 Jan 2012 07:49:57 +1300] INFO: Processing package[php5-gd] action install (mapscript::default line 10)
(jonpy is already installed, yet it took 2 minutes to check?)
Or more commonly like this:
[Mon, 09 Jan 2012 01:05:31 +1300] INFO: Processing package[snmpd] action install (snmpd::default line 10)
[Mon, 09 Jan 2012 01:08:12 +1300] INFO: Processing service[snmpd] action nothing (snmpd::default line 14)
….
[Mon, 09 Jan 2012 01:08:27 +1300] INFO: Processing package[apache2] action install (apache2::default line 10)
[Mon, 09 Jan 2012 01:10:05 +1300] INFO: Processing service[apache2] action nothing (apache2::default line 14)
(Both packages already installed and happily running, yet took a long time)
One machine is on chef 10.8, the other on 10.0.
Runlist from machines:
If there are any further details that may help investigate this I will happily supply them.
Cheers,
Dave.
Archive powered by MHonArc 2.6.16.