Hi all - I could use some guidance tracking down a frustrating intermittent issue we've been having with open source Chef Server. The issue started when we were running version 11.0.8 on CentOS 6.4 and has continued after upgrading to 11.6.1. We interact with Chef Server frequently using the chef-api gem v0.5.0, and every so often a request fails with a 401 like this:
{"error":["Invalid signature for user or client 'bethany'"]}
Corresponding logs on the Chef server:
==> /var/log/chef-server/erchef/requests.log.2 <==
2014-10-16T21:19:42Z
method=GET; path=/cookbooks/pp-chef-server?num_versions=1; status=401; user=bethany; req_id=8tlCJk/Z9R+mPVS/ztvVzw==; msg=bad_sig; req_time=3; rdbms_time=0; rdbms_count=2;
==> /var/log/chef-server/erchef/crash.log <==
2014-10-16 21:19:42 =ERROR REPORT====
{<<"method=GET; path=/cookbooks/pp-chef-server; status=401; ">>,"Unauthorized"}
==> /var/log/chef-server/erchef/erchef.log <==
2014-10-16 21:19:42.950 [error] {<<"method=GET; path=/cookbooks/pp-chef-server; status=401; ">>,"Unauthorized"}
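To see whether the failures cluster around particular users, methods, or endpoints, the request log can be tallied with a small script. The following is only a rough Python sketch: it assumes every line in requests.log uses the same semicolon-separated key=value layout as the excerpt above, and the function names `parse_request_line` and `count_bad_sig` are made up for illustration.

```python
from collections import Counter


def parse_request_line(line):
    """Split one erchef request log line into a dict of its key=value fields.

    Assumes the semicolon-separated layout seen in the excerpt above.
    """
    fields = {}
    for part in line.split(";"):
        part = part.strip()
        if "=" not in part:
            continue
        # Split on the first "=" only, so values such as the base64 req_id
        # or a path with a query string keep their own "=" characters.
        key, _, value = part.partition("=")
        key_tokens = key.split()
        if not key_tokens:
            continue
        # Anything before the key (for example a leading timestamp) is dropped.
        fields[key_tokens[-1]] = value.strip()
    return fields


def count_bad_sig(lines):
    """Count 401/bad_sig entries, grouped by (user, method, path without query)."""
    counts = Counter()
    for line in lines:
        fields = parse_request_line(line)
        if fields.get("status") == "401" and fields.get("msg") == "bad_sig":
            path = fields.get("path", "").split("?", 1)[0]
            counts[(fields.get("user"), fields.get("method"), path)] += 1
    return counts
```

Feeding it the open log file, for example `count_bad_sig(open("/var/log/chef-server/erchef/requests.log.2"))`, gives a counter whose `most_common()` output shows which combinations fail most often.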
It happens for GET and PUT requests against nodes, cookbooks, and searches, and for many different users in our organization. Retrying the request always works. I have yet to see a 401/bad_sig when using knife, but we also rarely use knife. I'm currently running knife commands in a loop to see if I can trigger a 401, but so far I have had no errors.
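Since a retry always succeeds, a stopgap could be to retry on 401 while recording each failure, so it can be matched against the server logs later. The following is a minimal Python sketch of that idea, not a fix for the underlying problem. Our real calls go through the chef-api gem in Ruby, so `send_request` here is a hypothetical zero-argument callable returning a `(status, body)` pair, and the sketch assumes the request is safe to repeat.

```python
import time


def call_with_retry(send_request, max_attempts=3, delay_seconds=0.5, sleep=time.sleep):
    """Call send_request() until it returns a non-401 status or attempts run out.

    send_request is a hypothetical stand-in for whatever issues the signed
    request; it must return a (status, body) pair.

    Returns (status, body, failures), where failures holds one
    (attempt, unix_time, body) tuple per 401 response seen.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    failures = []
    for attempt in range(1, max_attempts + 1):
        status, body = send_request()
        if status != 401:
            return status, body, failures
        # Keep the time of each failure so it can be matched to requests.log.
        failures.append((attempt, time.time(), body))
        if attempt < max_attempts:
            sleep(delay_seconds)
    # Every attempt returned 401; hand back the last response as-is.
    return status, body, failures
```

The `failures` list is the useful part for debugging: logging it alongside the exact headers that were sent would show whether anything differs between the rejected attempt and the one that succeeds.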
System load on the server is always low, with plenty of available memory and no iowait or other signs of disk-related performance problems on /var/opt/chef-server/, which is a DRBD disk on SSD. Requests come in via a Heartbeat-managed virtual IP, but there is no additional load-balancing or proxying layer.
Any ideas what might be causing the client to only occasionally present an invalid signature? Should I be looking more closely at the chef-api gem source rather than the chef server itself?
Bethany
--
Bethany Erskine
Senior Technical Operations Engineer