[chef] RE: Re: RE: Re: os_process_error in CouchDB


  • From: "Van Fossan,Randy" < >
  • To: < >
  • Subject: [chef] RE: Re: RE: Re: os_process_error in CouchDB
  • Date: Thu, 15 Mar 2012 13:42:25 -0400

Hi Chris,

 

Here are the relevant portions of the CouchDB config and init.d scripts. I am not sure what needs to be done. Are you suggesting that we change COUCHDB_RESPAWN_TIMEOUT=0 in /etc/init.d/couchdb to a different value, or os_process_timeout in /etc/couchdb/default.ini?

 

-----

/etc/couchdb/default.ini

[couchdb]

os_process_timeout = 5000 ; 5 seconds. for view and external servers

-----------------

/etc/init.d/couchdb (and /etc/sysconfig/couchdb)

COUCHDB_RESPAWN_TIMEOUT=0
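
For anyone unsure which os_process_timeout CouchDB actually ends up using, here is a minimal sketch that reads the ini files in the order CouchDB loads them (default.ini first, then local.ini, with later files overriding earlier ones). The /etc/couchdb/local.ini path and the helper name effective_os_process_timeout are assumptions for illustration, not something taken from this thread, so adjust them for your install.

```python
import configparser


def effective_os_process_timeout(paths):
    """Return os_process_timeout (milliseconds) from the [couchdb] section.

    `paths` is a list of ini files in load order; later files override
    earlier ones. Returns None if the option is not set in any file.
    """
    parser = configparser.ConfigParser(
        inline_comment_prefixes=(";",),  # CouchDB ini files use "; comment"
        interpolation=None,              # treat "%" in values literally
        strict=False,                    # tolerate repeated options
    )
    parser.read(paths)  # files that do not exist are silently skipped
    if parser.has_option("couchdb", "os_process_timeout"):
        return parser.getint("couchdb", "os_process_timeout")
    return None


if __name__ == "__main__":
    # Assumed locations; local.ini is where overrides conventionally go.
    files = ["/etc/couchdb/default.ini", "/etc/couchdb/local.ini"]
    print(effective_os_process_timeout(files))
```

If this prints 5000, the default from default.ini is still in effect and no override in local.ini has been picked up.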

 

 

From: Chris [mailto:
Sent: Thursday, March 15, 2012 1:23 PM
To:
Subject: [chef] Re: RE: Re: os_process_error in CouchDB

 

Do you have CouchDB heartbeat turned on? I *believe* it's off by default on CentOS.

On Thu, Mar 15, 2012 at 10:13 AM, Van Fossan,Randy < > wrote:

I believe I am getting the same thing: "os_process_error occurred"

See ->   http://pastie.org/3602358

Randy


-----Original Message-----
From: Adam Jacob [mailto: ]
Sent: Thursday, March 15, 2012 1:03 PM
To:
Subject: [chef] Re: os_process_error in CouchDB

Can you show us the full stack trace from CouchDB?

Have you shown it to the folks on the CouchDB list?

Adam

On Wed, Mar 14, 2012 at 6:51 AM, < > wrote:
> We are seeing a strange error in CouchDB that causes Chef to become
> unusable and unrecoverable. The knife command ceases to respond, and
> the chef webui ceases to respond. /var/log/couch.log shows an
> os_process_error with exit status 0.
>
> This is the second time this has happened. The first time, it happened
> to our chef-server that was running properly for several weeks. On
> Monday, at about 11 AM EST, the error occurred and our chef-server
> became unrecoverable.
> We tried to research and recover from the issue for about a day.
>
> We then rebuilt the chef-server this morning. During the
> setup/installation, we encountered this issue
> (http://tickets.opscode.com/browse/CHEF-2346)
> which we had encountered in the past. We then applied the fix, by
> increasing maxFieldLength in the mainIndex section of the chef solr
> config file.
>
> Very shortly after that, while doing a chef run on a lab node, running a
> knife command and trying to access the web UI all at the same time,
> the os_process_error occurred again and the chef-server became
> unusable.
>
> Our chef-server is running on a vSphere VM with 2 cores (2 cores in 1
> socket), 2GB of RAM. It's running Ubuntu 10.04 LTS, Chef 0.10.8 and
> CouchDB 0.10. The VM was generated from a pre-existing VM that
> originally had only 1 core.
>
> Another detail about our environment that may be important is that we
> use Centrify on our Linux server for Active Directory integration.
> This is why we were affected by CHEF-2346. Since chef pulls in all
> authorized users on a node as an automatic attribute, there can be
> thousands of users in a list that gets gathered by chef.
>
> Is perhaps CouchDB dying because of the size of the node data that we
> are asking chef to gather? Has anyone else encountered this error?
> Much thanks for any help. Let me know if I can provide any more
> information.
>
> Ian D. Rossi
>



--
Opscode, Inc.
Adam Jacob, Chief Customer Officer
T: (206) 619-7151 E:



 

--
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.



