[chef] Debugging Chef errors


  • From: Michael Fischer < >
  • Subject: [chef] Debugging Chef errors
  • Date: Wed, 11 Jun 2014 10:35:59 -0700

Hi folks,

We recently upgraded to Chef Server 11.1.1 and are still occasionally seeing erchef errors like:

Dreaded "no connections" error:

2014-06-11 17:22:03.351 [error] {<<"method=POST; path=/search/node; status=500; ">>,{error,{error,function_clause,[{chef_wm_search,'-make_bulk_get_fun/5-lc$^2/1-0-',[{error,no_connections},#Fun<chef_wm_routes.3.34923093>,[{<<"name">>,[<<"name">>]},{<<"hostname">>,[<<"hostname">>]},{<<"fqdn">>,[<<"fqdn">>]},{<<"ipaddress">>,[<<"ipaddress">>]},{<<"rsa">>,[<<"keys">>,<<"ssh">>,<<"host_rsa_public">>]},{<<"dsa">>,[<<"keys">>,<<"ssh">>,<<"host_dsa_public">>]}],node],[{file,"src/chef_wm_search.erl"},{line,336}]},{chef_wm_search,fetch_result_rows,4,[{file,"src/chef_wm_search.erl"},{line,447}]},{chef_wm_search,make_search_results,5,[{file,"src/chef_wm_search.erl"},{line,414}]},{chef_wm_search,to_json,2,[{file,"src/chef_wm_search.erl"},{line,131}]},{chef_wm_search,process_post,2,[{file,"src/chef_wm_search.erl"},{line,271}]},{webmachine_resource,resource_call,3,[{file,"src/webmachine_resource.erl"},{line,186}]},{webmachine_resource,do,3,[{file,"src/webmachine_resource.erl"},{line,142}]},{webmachine_decision_core,resource_call,1,[{file,"src/webmachine_decision_core.erl"},{line,48}]}]}}}

"badrecord" error:

2014-06-11 17:21:00.154 [error] {<<"method=POST; path=/environments/production/cookbook_versions; status=500; ">>,{error,{error,{badrecord,chef_cookbook_version},[{chef_wm_depsolver,'-assemble_response/3-lc$^0/1-0-',2,[{file,"src/chef_wm_depsolver.erl"},{line,291}]},{chef_wm_depsolver,'-assemble_response/3-lc$^0/1-0-',2,[{file,"src/chef_wm_depsolver.erl"},{line,293}]},{chef_wm_depsolver,assemble_response,3,[{file,"src/chef_wm_depsolver.erl"},{line,291}]},{webmachine_resource,resource_call,3,[{file,"src/webmachine_resource.erl"},{line,186}]},{webmachine_resource,do,3,[{file,"src/webmachine_resource.erl"},{line,142}]},{webmachine_decision_core,resource_call,1,[{file,"src/webmachine_decision_core.erl"},{line,48}]},{webmachine_decision_core,decision,1,[{file,"src/webmachine_decision_core.erl"},{line,486}]},{webmachine_decision_core,handle_request,2,[{file,"src/webmachine_decision_core.erl"},{line,33}]}]}}}

And a new one:

2014-06-11 17:30:47.316 [error] Supervisor pooler_chef_depsolver_member_sup had child chef_depsolver_worker started with {chef_depsolver_worker,start_link,undefined} at <0.19090.54> exit with reason killed in context child_terminated

For the "no connections" error, we tried raising the db client pool to 400 and Postgres max connections to 500, without much success.  (As a side note, I'd really like to understand why erchef uses a db connection pool at all, unless Postgres connections are really much more expensive to establish than MySQL's.  The last time I remember connection pools being useful was way back in my Oracle PL/SQL days.)

For /search queries, are we tuning the right thing?  I'd expect them to be directed to Solr, not Postgres.  Can someone clue me in?  Is there a different tunable we should be looking at?

Just a general comment about erchef logging: the log information is almost inscrutable to the hapless administrator.  Is there any work being done to make the messaging clearer?
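
In the meantime, a short script can at least pull the readable fields out of lines like the ones above. This is only a rough sketch in Python, assuming the line layout seen in these samples; it is not an erchef tool, and `summarize`, `SAMPLE`, and the regexes are names made up for illustration. It reports only the outermost error reason, so a nested cause such as `no_connections` still has to be read from the full term.

```python
import re

# Timestamp and log level at the start of the line.
PREFIX_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \[(?P<level>\w+)\]')
# Request summary embedded as an Erlang binary, e.g.
# <<"method=POST; path=/search/node; status=500; ">>
REQUEST_RE = re.compile(
    r'<<"method=(?P<method>\w+); path=(?P<path>[^;]+); status=(?P<status>\d+); ">>')
# Outermost error reason: a bare atom or a flat {tuple}.
REASON_RE = re.compile(r'\{error,\{error,(?P<reason>\{[^{}]*\}|\w+)')
# First stack frame that carries a source file and line number.
FRAME_RE = re.compile(r'\{file,"(?P<file>[^"]+)"\},\{line,(?P<line>\d+)\}')


def summarize(line):
    """Return the readable fields of one log line; missing fields are None."""
    prefix = PREFIX_RE.match(line)
    request = REQUEST_RE.search(line)
    reason = REASON_RE.search(line)
    frame = FRAME_RE.search(line)
    return {
        "timestamp": prefix.group("ts") if prefix else None,
        "level": prefix.group("level") if prefix else None,
        "method": request.group("method") if request else None,
        "path": request.group("path") if request else None,
        "status": int(request.group("status")) if request else None,
        "reason": reason.group("reason") if reason else None,
        "file": frame.group("file") if frame else None,
        "line": int(frame.group("line")) if frame else None,
    }


# The start of the "badrecord" line above, cut off after the first stack frame.
SAMPLE = (
    r'2014-06-11 17:21:00.154 [error] {<<"method=POST; '
    r'path=/environments/production/cookbook_versions; status=500; ">>,'
    r'{error,{error,{badrecord,chef_cookbook_version},'
    r"[{chef_wm_depsolver,'-assemble_response/3-lc$^0/1-0-',2,"
    r'[{file,"src/chef_wm_depsolver.erl"},{line,291}]}'
)

if __name__ == "__main__":
    for key, value in summarize(SAMPLE).items():
        print(f"{key}: {value}")
```

Lines that carry no request summary, such as the supervisor report, come back with only the timestamp and level filled in.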

Thanks,

--Michael



