chef - [chef] Delayed notifications and failure (was: Re: Monitoring chef runs)


General discussion about Chef

From: Daniel DeLeo
To:
Subject: [chef] Delayed notifications and failure (was: Re: Monitoring chef runs)
Date: Fri, 7 Sep 2012 18:28:40 -0700

On Friday, September 7, 2012 at 3:22 PM, KC Braunschweig wrote:

> On Fri, Sep 7, 2012 at 3:11 PM, Daniel DeLeo wrote:
> > There's a patch in master that makes delayed notifications always run.
> > This is a pretty significant behavior change, so we're waiting until
> > Chef 11 to ship it.
>
> Interesting. I suspect that would be good much of the time, and more
> obvious, but it sorta goes against the normal Chef behavior that if
> something bad happens we bail out immediately. Thanks,
>
> KC

It was an easy patch, but not an easy decision.

The basic argument in favor is that delayed notifications are generally used to apply configuration changes to resources whose state Chef cannot verify. For example, Chef usually cannot tell whether a running service is using the application version and config currently on disk, so it has no idempotent way to enforce the policy (service "foo" should be running with the app version and config on disk). For simple, single-process services, workarounds are possible (e.g., using something like `ps -o etime` and a disk-based notification queue), but there is no general solution that works for all cases.

The argument against is that when a run fails, some resource may be left partially or incorrectly configured, and running the delayed notification could therefore cause an outage.

After much discussion, we decided that two things outweighed the concern about (possibly, temporarily) leaving a resource in an incorrect state: the idea that Chef made a promise (of sorts) to run an action on a resource, and the benefit of ending up with a correctly configured system, according to your policy, by re-running Chef. Furthermore, running an incorrect version of a service, or an incorrectly configured one, can also lead to severe problems, so most of the "win" ends up on the side of always running the notifications.
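
To make the trade-off concrete, here is a minimal sketch in plain Ruby of the two behaviors discussed above. It is a toy model, not Chef's implementation: `converge`, the `always_run_delayed` flag, and the hash-based resources are hypothetical names invented for illustration. It only shows what happens to a queued delayed notification when a later resource fails.

```ruby
# Toy model of a converge with delayed notifications (not Chef's real code).
# Each resource is a hash with a :name, an :action lambda that returns true
# when the resource was updated (and may raise), and an optional :notifies
# key naming a handler to run at the end of the run.
def converge(resources, handlers, always_run_delayed:)
  log = []
  delayed = [] # queued handler names, each queued at most once
  error = nil

  begin
    resources.each do |res|
      changed = res[:action].call
      log << "converged #{res[:name]}"
      target = res[:notifies]
      delayed << target if changed && target && !delayed.include?(target)
    end
  rescue StandardError => e
    # The first failure stops the converge; later resources are skipped.
    error = e
    log << "failed: #{e.message}"
  end

  # Old behavior: a failed run drops the queue.
  # Proposed behavior: the queue is flushed even after a failure.
  if error.nil? || always_run_delayed
    delayed.each do |name|
      handlers.fetch(name).call
      log << "notified #{name}"
    end
  end

  { log: log, error: error }
end

# A config file is updated and queues a restart, then a later resource fails.
resources = [
  { name: "template[foo.conf]", action: -> { true }, notifies: "restart service[foo]" },
  { name: "package[bar]", action: -> { raise "install failed" } }
]
handlers = { "restart service[foo]" => -> {} }

old_run = converge(resources, handlers, always_run_delayed: false)
new_run = converge(resources, handlers, always_run_delayed: true)

puts old_run[:log]
puts new_run[:log]
```

In the first run the restart is silently dropped, so service "foo" keeps running with its old config even though the file on disk changed. In the second run the restart still happens, and the run is still reported as failed.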

Daniel DeLeo
