[chef] Re: Re: Re: Re: aws autoscaling and chef cleanup


  • From: Morgan Blackthorne < >
  • To: " " < >
  • Cc: Brian Hatfield < >
  • Subject: [chef] Re: Re: Re: Re: aws autoscaling and chef cleanup
  • Date: Mon, 20 May 2013 03:09:34 -0700

Sam, that's not exactly what it does. It generates a list of all the EC2 instance_id values, then does a knife search for all nodes with the ec2_instance_id attribute. (You can duplicate this on the command line with: knife search node "ec2_instance_id:*". Note that the equivalent attribute for -a parameters to other commands would be ec2:instance_id; this bit of syntax is somewhat confusing IMO, but I'm guessing there's a good reason for "flattening" the Ohai trees for search.) It then compares the two lists and prunes any Chef nodes that have the attribute but no corresponding EC2 instance.
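That compare-and-prune step is easy to sketch in Ruby. This is a minimal illustration rather than the actual script: it assumes you've already pulled the node list from a Chef search and the live instance ids from an EC2 describe-instances call, and stale_nodes is a hypothetical helper name, not something from the script itself.

```ruby
require 'set'

# Given a map of Chef node name => its ec2_instance_id attribute, and the
# list of instance ids EC2 actually reports, return the node names that
# no longer have a live instance behind them.
def stale_nodes(chef_nodes, live_instance_ids)
  live = live_instance_ids.to_set
  chef_nodes.reject { |_name, instance_id| live.include?(instance_id) }.keys
end
```

Each name returned would then be removed with the equivalent of knife node delete (and knife client delete) against the Chef server.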

I haven't run into any problems with this personally, but it occurs to me that if there's a non-fatal error with the EC2 calls (say a region is having issues with the API service while EC2 itself is running fine), it could end up pruning a node that's still valid. However, assuming you're not running chef-client::delete_validation (and I don't see much of an issue leaving that on a running system vs. configuring that in an AMI...), the next chef-client run will re-register the node and everything should likely be fine. (Although I'm not sure whether that re-registration would honor the -j flag passed to chef-client on startup if it's running via init... which could lead to a node with an empty run list. If you're running it via cron, or manually, and always specifying your JSON file or otherwise configuring the run list, it shouldn't be an issue either.)
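One cheap guard against that failure mode (a partially-failing API call making live nodes look gone) is a circuit breaker: refuse to prune when the prune set is implausibly large. A rough sketch, with an illustrative threshold and hypothetical names:

```ruby
# Crude circuit breaker: don't prune if we'd be deleting more than some
# fraction of all registered nodes, which more likely means the EC2
# listing was incomplete than that many instances actually went away.
def safe_to_prune?(stale, total_nodes, max_fraction: 0.25)
  return false if total_nodes.zero?
  stale.length <= total_nodes * max_fraction
end
```

A wrongly skipped prune just gets retried on the next run, so erring on the side of not deleting is the cheap direction to fail in.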

YMMV; this is certainly an easy way to get up and running while you look into other solutions. If you find that another one resolves issues you see with this, I'd certainly be interested in hearing about it.

--
~*~ StormeRider ~*~

"Every world needs its heroes [...] They inspire us to be better than we are. And they protect from the darkness that's just around the corner."

(from Smallville Season 6x1: "Zod")

On why I hate the phrase "that's so lame"... http://bit.ly/Ps3uSS


On Mon, May 20, 2013 at 2:55 AM, Sam Darwin < > wrote:
Thanks for all the great replies!!

Morgan: that is checking for terminated instances in AWS, but they may
completely vanish from AWS (not even show as 'terminated' anymore)
and not necessarily get processed, unless the script is run often?
Cassiano: very cool that knife doesn't have to be set up.
Thom: SNS, SQS looks like the way to go
Alex: the Nuvole article covers that
Brian: has an implementation based on the Nuvole approach
Each reply goes a step further, it seems... will look into these suggestions.


On Fri, May 17, 2013 at 8:52 PM, Brian Hatfield < > wrote:
> Based upon that Nuvole Computing article, I wrote this:
>
> https://github.com/bmhatfield/chef-deregistration-manager
>
> Might be useful.
>
>
> On Fri, May 17, 2013 at 1:49 PM, Alex Corley < > wrote:
>>
>>
>> http://www.nuvolecomputing.com/2012/07/02/chef-node-de-registration-for-autoscaling-groups/
>>
>> (Not my article)
>>
>> This is not a self-cleanup, requires a backend workflow/process, but it
>> can be expanded in any number of ways.
>>
>> - alex
>>
>>
>> On 05/17/2013 11:43 AM, Sam Darwin wrote:
>>>
>>>
>>> If using AWS auto-scaling + Chef, the final step of instance cleanup
>>> seems to
>>> be slightly unclear.
>>>
>>> One solution is to run a script in /etc/rc0.d which is called on
>>> shutdown and does a "knife node delete". This requires knife to be
>>> configured and working on the instance, which is a (minor) pain. This
>>> method will also fail on an abrupt machine crash.
>>>
>>> Another solution is to have a script which queries the Chef server for
>>> instances that haven't checked in for a while, and removes those. That
>>> would require having chef-client running very often or as a daemon.
>>>
>>> I wonder what the security implications would be of adding functionality
>>> into
>>> chef-client:
>>>
>>> chef-client --remove-self-from-server
>>>
>>> Some people have posted about a script which checks for terminated
>>> instances and removes them. This sounds like the best way. Perhaps they
>>> mean to query AWS first, and then make changes to the Chef server. Now
>>> to figure out how...
>>>
>>
>>
>> --
>> Alex Corley | Software as a Service Engineer
>> Zenoss, Inc. | Transforming IT Operations
>> | Skype: acorley_zenoss | Github: anthroprose
>
>




Archive powered by MHonArc 2.6.16.
