Hey Michael,

This isn't my area of expertise, so I'll poke one of the other Chef devs who knows more to weigh in, but I'm curious if you can provide more info on your setup. I'd like to know what your Chef 10 setup looked like, just for comparison to Chef 11. While your Chef 11 setup seems to be hurting, what was the baseline hardware/number of boxes you had with Chef 10? Also, for complete clarity: this is open source Chef, right? I just want to make sure we're all on the same page.

On to actually trying to solve your problem. Having everything on a single box might be causing some of the backup. While Chef 11 is much more performant than Chef 10, if you're throwing 5000 nodes at it with everything on a single box, that might hurt some.

Typically we haven't seen much need to tune postgres. You might need to look at upping the connection count on postgres, but as far as I know that is usually the only tuning that is done. I'm not aware of much rabbit tuning that typically happens either, but Solr, which sits on the other end of rabbit, might need some tuning. Out of the box it has some fairly vanilla settings, so you might see improvements if you look there.

What Jeff said is valid. Cutting down on the node data sent reduces not only network traffic but also what Solr has to ingest.

Could you possibly do some more monitoring on the box and try to figure out where the bottleneck is? That would certainly make it easier to give recommendations.

In the meantime I'll ask one of the other engineers to weigh in. I'll also follow up and see if we can't get a doc page on ways to tune Chef, as that seems like it could prove helpful.

- Mark Mzyk
Opscode Software Engineer