Thanks a lot Steven & Stephen!
> I'm not sure I would go as far to call it a recommendation, but I

<jesse> I will try a 1:1 ratio, plus some headroom, between erchef['db_pool_size'] and the number of concurrent chef-clients. That should consume more CPU and memory. Is there a limit on the number of concurrent chef-clients a single Chef Server 11 can serve? The number of depsolver workers looks like the bottleneck unless I add more CPU cores. If I do add cores, will increasing erchef['depsolver_worker_count'] and nginx['worker_processes'] solve the 503 issue? Are the default values of these two attributes derived from the number of CPU cores? (A rough sketch of the values I plan to try is below.)

> By returning a 503, the client should retry said request with an exponential backoff of up to 60 seconds between retries. Are these 503s stopping chef-client runs?

<jesse> Yes, the chef-client exits after getting a 503. How can I tell the chef-client to retry syncing cookbooks or calling the node or search APIs? Is there a setting in /etc/chef/client.rb? If not, I may re-spawn chef-client whenever it hits a 503. Some other observations and questions below:

<jesse> With the default value on Chef Server 11.0.8, 'knife cookbook upload -a' hangs forever, so I increased it. I'm not sure whether this still reproduces on the latest Chef Server 11.1.1.
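To make the first point concrete, here is a rough sketch of what I plan to put in /etc/chef-server/chef-server.rb. The attribute names are the ones discussed above, but the numbers are only my guesses (the 8-core count is hypothetical), not a recommendation:

    # /etc/chef-server/chef-server.rb -- hypothetical values, not a recommendation
    # db_pool_size roughly 1:1 with concurrent chef-clients, plus some headroom
    erchef['db_pool_size'] = 350
    # assuming these two should track the CPU core count (8 cores in this example)
    erchef['depsolver_worker_count'] = 8
    nginx['worker_processes'] = 8

followed by a 'chef-server-ctl reconfigure'. Please correct me if the 1:1 mapping or the per-core assumption for the last two attributes is wrong.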
<jesse> I tried 'no_lazy_load false' on my 300-node cluster, and it still threw 503 errors. I see there is a commit from 7 days ago that makes no_lazy_load the default. As it describes, my chef-client runs can take a long time (1-2 hours), so I think I'd better set no_lazy_load to true.

Thanks,
Jesse Hu
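P.S. On the client side, this is the /etc/chef/client.rb I plan to experiment with. I believe http_retry_count and http_retry_delay are valid client settings, but I'm not sure whether they cover 503 responses during cookbook sync or only connection errors, so treat the whole block as a guess:

    # /etc/chef/client.rb -- experimental values, assuming these settings are
    # honored by our chef-client version
    http_retry_count 10    # default is 5, I believe
    http_retry_delay 30    # seconds to wait between retries
    no_lazy_load true      # fetch all cookbook files up front instead of lazily

If the retry settings don't help, I'll fall back to re-spawning chef-client from a wrapper whenever it exits on a 503.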