I found it. /var filled up because the views had grown too large. I can’t believe I missed that.
I found the info I needed in Harold’s blog: http://blog.agoragames.com/blog/2011/08/10/chef-explosion/
It is still slow, but functional now. Status on 100 nodes still takes 3-5 minutes.
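For anyone hitting the same thing, here is a minimal Python sketch for checking how full a filesystem is and which files under a directory are the largest. The function names are made up for illustration, and where CouchDB keeps its view files is an assumption that differs between installs.

```python
import os
import shutil


def usage_percent(path):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, count=5):
    """Return up to `count` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:count]
```

On a chef-server one might call usage_percent("/var") and largest_files("/var/lib/couchdb"), assuming that directory is where CouchDB stores its database and view files.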
Randy
From: Van Fossan,Randy
Sent: Saturday, March 17, 2012 10:23 PM
Subject: RE: [chef] R Re: os_process_error in CouchDB
Well, after upgrading to couchdb 1.0.2 things have steadily gone downhill. On Friday, other than performance, things seemed a little better. Now I cannot even get the database to start up any more unless I reboot.
I get many of the following (different seq number and different doc._id) in /var/log/couchdb/couch.log (see http://pastie.org/3618670):
[Fri, 16 Mar 2012 15:22:28 GMT] [info] [<0.1354.0>] checkpointing view update at seq 89668 for chef _design/nodes
[Fri, 16 Mar 2012 15:22:28 GMT] [info] [<0.238.0>] OS Process #Port<0.3913> Log :: function raised exception (new TypeError("doc.attributes has no properties", "", 3)) with doc._id 0f696c00-1d54-4677-888a-a364b3b94bb2
[Fri, 16 Mar 2012 15:22:29 GMT] [info] [<0.1354.0>] checkpointing view update at seq 89671 for chef _design/nodes
[Fri, 16 Mar 2012 15:22:29 GMT] [info] [<0.238.0>] OS Process #Port<0.3913> Log :: function raised exception (new TypeError("doc.attributes has no properties", "", 3)) with doc._id e09374fd-2fe8-44da-9a98-6a2070df11f7
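To see which documents trip the view function, the doc._id values can be pulled out of couch.log. A rough Python sketch, assuming the log lines look like the ones above; the function name and the regular expression are illustrative and not part of Chef or CouchDB.

```python
import re
from collections import Counter

# Matches the "function raised exception (...) with doc._id <id>" log lines.
FAILURE = re.compile(
    r"function raised exception \((?P<error>.*)\) with doc\._id (?P<doc_id>\S+)"
)


def failing_doc_ids(lines):
    """Count how often each doc._id appears in view exception log lines."""
    counts = Counter()
    for line in lines:
        match = FAILURE.search(line)
        if match:
            counts[match.group("doc_id")] += 1
    return counts
```

Feeding it the lines of an open log file, e.g. failing_doc_ids(open("/var/log/couchdb/couch.log")), gives a tally of the documents that have no usable attributes field.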
This chef server is really starting to drive me crazy. This will be a show stopper if I cannot get a stable chef server.
Randy
From: Van Fossan,Randy
Sent: Friday, March 16, 2012 11:04 AM
Subject: RE: [chef] R Re: os_process_error in CouchDB
I have upgraded couchdb to 1.0.2 and it still seems to be very, very slow. Did I mention very slow? :/ We have fewer than 100 nodes (in chef), about 50 cookbooks and 20 or so roles. Very little is applied to the nodes at this time. If I click on Status in the webui, it takes 3-5 minutes or longer to return a status.
The server is a VM with 2 vCPUs and 4 GB RAM. It is not starved for virtual resources. I do notice 60% paging when attempting to reindex with knife index rebuild. I also receive timeouts during the index rebuild, with a message saying it is trying again. This eventually times out completely and fails.
ERROR: Timeout connecting to myserver.myco.org:4000 for //search/reindex, retry 1/5
….
ERROR: Timeout connecting to myserver.myco.org.org:4000 for //search/reindex, retry 3/5
ERROR: Failed to authenticate to http:// myserver.myco.org:4000 as myself with key /home/myself/.chef/myself.pem
Response: Failed to authenticate. Please synchronize the clock on your client
The chef server and knife client are on the same host.
It is unusable in its current state.
Any ideas?
Randy
From: Chris
Sent: Thursday, March 15, 2012 4:19 PM
Subject: [chef] Re: RE: Re: RE: RE: RE: Re: os_process_error in CouchDB
This is where I got mine
On Thu, Mar 15, 2012 at 1:03 PM, Van Fossan,Randy wrote:
Yes,
However, I could not find a couchdb 1.0.x from a trusted source for
CentOS 5.x. We have built a new server (CentOS 6 and couchdb 1.0.2)
but do not have time to migrate right now. If anyone knows where I
can find couchdb 1.0.x for CentOS 5.x that is from a reputable source, I
am all for it.
A few weeks ago, I cloned the server (VM) and downloaded a couchdb rpm
and tested that I could in fact just upgrade couchdb and it worked. But
again, I need a reputable source.
Thanks
Randy
-----Original Message-----
From: Chris
Sent: Thursday, March 15, 2012 2:49 PM
Subject: [chef] Re: RE: RE: RE: Re: os_process_error in CouchDB
There's an RPM available for 1.0.1; I don't think it's in CentOS or EPEL.
You should be able to get it via rpmfind. I agree with Ian, you should
upgrade.
Sent from a phone
On Mar 15, 2012, at 11:39 AM, <> wrote:
> Randy,
>
> Is it possible to upgrade to 1.0.1 or greater? What OS are you running?
> CouchDB 1.0.1 is included in Ubuntu 11.10.
>
> Ian D. Rossi
>
> -----------------------------
> From: Van Fossan,Randy
> Sent: Thursday, March 15, 2012 2:10 PM
> Cc: Kiner, Kari; Zulauf, Graham
> Subject: [chef] RE: RE: Re: os_process_error in CouchDB
>
> Ian / Adam,
> On my current chef-server, my couch db is version 0.11.2
>
> Randy
>
> -----Original Message-----
> Sent: Thursday, March 15, 2012 2:06 PM
> Subject: [chef] RE: Re: os_process_error in CouchDB
>
> Hi Adam,
>
> couch.log is 125000 lines long, so I'll include the beginning
> (http://pastie.org/3602674) and the end (http://pastie.org/3602677).
> I'll post to CouchDB's mailing list too.
>
> I have to add to my below description that after I rebuilt the
> chef-server, I reloaded all of our cookbooks, roles and databags.
>
> I do believe that Randy has the same issue, although I'm not sure what
> version of CouchDB he is using.
>
> Ian D. Rossi
>
> ________________________________________
> From: Adam Jacob
> Sent: Thursday, March 15, 2012 1:02 PM
> Subject: [chef] Re: os_process_error in CouchDB
>
> Can you show us the full stack trace from CouchDB?
>
> Have you shown it to the folks on the CouchDB list?
>
> Adam
>
> On Wed, Mar 14, 2012 at 6:51 AM, <> wrote:
>> We are seeing a strange error in CouchDB that causes Chef to become
>> unusable and unrecoverable. The knife command ceases to respond, and
>> the chef webui ceases to respond. /var/log/couch.log shows an
>> os_process_error with exit status 0.
>>
>> This is the second time this has happened. The first time, it
>> happened to our chef-server that was running properly for several
>> weeks. On Monday, at about 11 AM EST, the error occurred and our
>> chef-server became unrecoverable.
>> We tried to research and recover the issue for about a day.
>>
>> We then rebuilt the chef-server this morning. During the
>> setup/installation, we encountered this issue
>> (http://tickets.opscode.com/browse/CHEF-2346)
>> which we had encountered in the past. We then applied the fix, by
>> increasing maxFieldLength in the mainIndex section of the chef solr
>> config file.
>>
>> Very shortly after that, while doing a chef run on a lab node, running a
>> knife command and trying to access the web UI all at the same time,
>> the os_process_error occurred again and the chef-server became
>> unusable.
>>
>> Our chef-server is running on a vSphere VM with 2 cores (2 cores in 1
>> socket), 2GB of RAM. It's running Ubuntu 10.04 LTS, Chef 0.10.8 and
>> CouchDB 0.10. The VM was generated from a pre-existing VM that
>> originally had only 1 core.
>>
>> Another detail about our environment that may be important is that we
>> use Centrify on our Linux server for Active Directory integration.
>> This is why we were affected by CHEF-2346. Since chef pulls in all
>> authorized users on a node as an automatic attribute, there can be
>> thousands of users in a list that gets gathered by chef.
>>
>> Is perhaps CouchDB dying because of the size of the node data that we
>> are asking chef to gather? Has anyone else encountered this error?
>> Much thanks for any help. Let me know if I can provide any more
>> information.
>>
>> Ian D. Rossi
>>
>
>
>
> --
> Opscode, Inc.
> Adam Jacob, Chief Customer Officer
> T: (206) 619-7151
>
>
--
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.