chef - [chef] Re: Re: Re: Re: Re: Simultaneous node edits

From: Rafał Trójniak < >
To: Lamont Granquist < >
Cc: , Maxime Brugidou < >
Subject: [chef] Re: Re: Re: Re: Re: Simultaneous node edits
Date: Fri, 10 Jul 2015 21:08:19 +0200

Hello,

Thanks for the feedback; comments inline.

On Fri, 10 Jul 2015 11:22:22 -0700
Lamont Granquist
< >
wrote:

> https://github.com/chef/chef-server/issues/419
Thanks for creating this issue.
Should we switch our discussion over there?

> I'm not sure that the chef-client should implement this, however, at
> least not without the desired state split in place.  Otherwise any
> time you successfully updated a node's run_list while a run was
> happening you would fail the node.save at the end of the run. That's
> the tradeoff.  We can keep it so that edits which succeed are never
> overwritten but at the cost that in the situation where you have
> concurrent access, if your knife command succeeds and wins, then the
> chef-client run node.save must fail (and one of the clients will
> fail, randomly, depending on which one wins the race).
>
> In the client we could rescue the exception, re-fetch the node data,
> re-clean default/override/automatic and then diff against the prior
> version that we got at the start, then merge changes if we can and
> then node.save again, but....   ouch...

Are we treating the node object as one big, always-consistent data
structure? I suppose so.

For example, each node has:
- a run_list
- the list of cookbooks used in the last chef-client run
If we fetch that information (e.g. from a chef search), I would expect
the two to be consistent with each other. Am I wrong? What would the
node look like if the conflicting edit were a run_list change?

In that context I would like to see one of the following:
- No data saved to the Chef server by a chef-client run that detects a
  conflict
- Only auto-generated data (complete and unmodified) saved, without
  conflicting with or overwriting manually edited data.
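
To make the two options concrete, here is a minimal sketch. It is not
Chef code: FakeServer, ConflictError, save_or_fail and
save_automatic_only are hypothetical names, the "server" is an
in-memory hash, and the token is simply a hash of the stored JSON. It
only assumes that a write can carry the token received on read, as in
the If-Match scenario quoted further down.

```ruby
require "digest/sha2"
require "json"

# Hypothetical in-memory stand-in for the server; not the Chef Server API.
class ConflictError < StandardError; end

class FakeServer
  def initialize
    @nodes = {}
  end

  # Returns [copy of the node data, token], like a GET that also sends an ETag.
  def get(name)
    data = @nodes.fetch(name)
    [JSON.parse(JSON.generate(data)), etag_for(data)]
  end

  # Stores data only if if_match equals the current token.
  # A nil if_match skips the check, so clients that send no token still work.
  def put(name, data, if_match: nil)
    current = @nodes[name]
    if if_match && current && etag_for(current) != if_match
      raise ConflictError, "#{name} was modified since it was fetched"
    end
    @nodes[name] = JSON.parse(JSON.generate(data))
    etag_for(@nodes[name])
  end

  private

  def etag_for(data)
    Digest::SHA256.hexdigest(JSON.generate(data))
  end
end

# Option 1: on conflict, save nothing and report the run as failed.
def save_or_fail(server, name, data, etag)
  server.put(name, data, if_match: etag)
  :saved
rescue ConflictError
  :conflict
end

# Option 2: on conflict, re-fetch and replace only the auto-generated part,
# leaving manually edited fields (here "run_list") as the server has them.
def save_automatic_only(server, name, data, etag)
  server.put(name, data, if_match: etag)
rescue ConflictError
  fresh, fresh_etag = server.get(name)
  fresh["automatic"] = data["automatic"]
  server.put(name, fresh, if_match: fresh_etag)
end

server = FakeServer.new
server.put("web1", { "run_list" => ["role[base]"], "automatic" => { "uptime" => 1 } })

run_copy, run_etag = server.get("web1")    # chef-client run starts
edit_copy, edit_etag = server.get("web1")  # operator starts editing the node

edit_copy["run_list"] << "role[web]"
server.put("web1", edit_copy, if_match: edit_etag)  # the manual edit wins the race

run_copy["automatic"]["uptime"] = 2
puts save_or_fail(server, "web1", run_copy, run_etag)  # prints "conflict"

save_automatic_only(server, "web1", run_copy, run_etag)
p server.get("web1").first
```

With option 1 the stale run writes nothing at all. With option 2 the
run's auto-generated data lands next to the operator's run_list, which
is only safe if the two parts really are independent of each other.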

Is diff-and-merge really a good idea? The logic that leads to those
attributes is largely unknown. Producing such data without proof that
it will be logically consistent does not sound like a good idea to me.
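
As an illustration of that concern, here is a small sketch of a
key-by-key three-way merge. The function name three_way_merge and the
flat "run_list" / "cookbooks" keys are made up for the example; real
node data is nested and this is not how chef-client works today.

```ruby
# Three-way merge of flat hashes:
#   base   - what the run fetched at the start
#   mine   - what the run wants to save at the end
#   theirs - what the server holds now
# Returns [merged hash, list of keys changed differently on both sides].
def three_way_merge(base, mine, theirs)
  merged = {}
  conflicts = []
  (base.keys | mine.keys | theirs.keys).each do |key|
    b, m, t = base[key], mine[key], theirs[key]
    if m == t || t == b
      merged[key] = m    # both sides agree, or only the run changed it
    elsif m == b
      merged[key] = t    # only the server copy changed
    else
      conflicts << key   # both changed it, in different ways
    end
  end
  [merged.compact, conflicts]  # compact drops keys deleted by the winning side
end

base   = { "run_list" => ["recipe[nginx]"], "cookbooks" => [] }
theirs = { "run_list" => ["recipe[apache2]"], "cookbooks" => [] }   # manual edit
mine   = { "run_list" => ["recipe[nginx]"], "cookbooks" => ["nginx"] }  # run result

merged, conflicts = three_way_merge(base, mine, theirs)
p merged     # {"run_list"=>["recipe[apache2]"], "cookbooks"=>["nginx"]}
p conflicts  # []
```

The merge reports no conflict, yet the saved object pairs the new
run_list with a cookbook list produced from the old one. A search
would return a node that never existed in that state.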

Looking forward to your opinion on that.

Regards,

>
> On 07/10/2015 10:02 AM, Rafał Trójniak wrote:
> > Hello Maxime,
> >
> > On Wed, 8 Jul 2015 22:46:56 +0200
> > Maxime Brugidou
> > < >
> >  wrote:
> >
> >> I don't know about the server side but wouldn't it be better to
> >> have this feature directly usable from the Chef gem? Having
> >> something like node.safe_save in addition to node.save instead of
> >> only implementing this for the knife node edit command. This could
> >> help other gems use the feature.
> > If we want to use the 'If-Match' HTTP header and 'ETag' value, those
> > have to be generated/validated on the server side. Doing it on the
> > client side does not sound like a good idea: it would cause many
> > concurrency and implementation-dependent problems (many clients in
> > different versions). What needs to be done on the client side is
> > remembering the received ETag and sending the If-Match header
> > whenever needed.
> >
> >> And should we do a safe_save or a save at the end of a Chef run?
> >> Because that's where we get most of the concurrent edits: a Chef
> >> run happening while the node gets edited, and doing a safe_save
> >> from knife won't solve this. But do we want to make the Chef run
> >> fail for that? Or retry?
> > I would appreciate community/developers suggestions here.
> > Usually, situations like that are resolved in 'first-wins' way.
> > - If the knife save will finish first - chef-client run can fail
> > - If the chef-client will finish first - knife save will fail
> > How to respond to those failures is a second problem.
> >
> > Knife save can be re-tried by the operator. When the operator knows
> > what has to be changed - it can be applied to new object again.
> >
> > A chef-client run can be re-run. I don't like any 'merging' idea
> > because the result of the chef-client run is still based on the old
> > data. I suppose we can treat it as just a 'failed' chef-client run,
> > like an exception in the process: we do not push the result data to
> > the Chef server. The next chef-client run will be based on the new
> > object, and will probably succeed.
> >
> > What are your opinions ? Would you be able to help me implement
> > Etag/If-Match logic on chef server side ?
> >
> > Regards,
> >
> >>   On Jul 8, 2015 10:18 PM, "Rafał Trójniak"
> >> < >
> >> wrote:
> >>
> >>> On Tue, 7 Jul 2015 14:56:17 -0700
> >>> Noah Kantrowitz
> >>> < >
> >>>  wrote:
> >>>
> >>>> On Jul 7, 2015, at 2:33 PM, Rafał Trójniak
> >>>> < >
> >>>> wrote:
> >>>>
> >>>>> On Mon, 06 Jul 2015 10:20:15 -0700
> >>>>> Lamont Granquist
> >>>>> < >
> >>>>>  wrote:
> >>>>>
> >>>>>>
> >>>>>> On 07/01/2015 11:40 PM, Maxime Brugidou wrote:
> >>>>>>> I would like to stop using Chef nodes as file but use the new
> >>>>>>> chefDK provision command with a special driver that would
> >>>>>>> "pick" a node from the firstboot pool (so basically my
> >>>>>>> "cloud" provider is the pool of firstboot nodes in Chef).
> >>>>>>> Without dealing with concurrent access to Chef provision,
> >>>>>>> this seem doable: to allocate a node I can "tag" a firstboot
> >>>>>>> node and delete it once the machine is ready.
> >>>>>>>
> >>>>>>> But how to do this with concurrent access? It seems almost
> >>>>>>> impossible. And the way things are going with Policy files
> >>>>>>> will tend towards a separate git repo and provision cookbook
> >>>>>>> per policy, all sharing the same pool of firstboot nodes (for
> >>>>>>> now I don't use Policy files).
> >>>>>>>
> >>>>>>> I wish I could have a way to "lock" a node or something like
> >>>>>>> that.
> >>>>>>>
> >>>>>>>
> >>>>>> The way to do this is to make sure only one agent on your
> >>>>>> network can move the node between states.  A simple design
> >>>>>> would be to have the node responsible for publishing that its
> >>>>>> done with firstboot by tagging itself and then the node.save
> >>>>>> at the end publishes the write.  Then write a simple web
> >>>>>> endpoint which is your API to 'allocate' a new firstboot'ed
> >>>>>> node.  By centralizing it you don't have to worry about race
> >>>>>> conditions between multiple clients all trying to get the same
> >>>>>> node at the same time.  You can then write command line tools
> >>>>>> that talk to the endpoint you wrote to get a node, rather than
> >>>>>> wanting a distributed lock that the CLI commands can grab on
> >>>>>> the node object itself.  If you've already got etcd or
> >>>>>> something similar that you're using internally you could
> >>>>>> probably use that instead.
> >>>>>>
> >>>>> Hello
> >>>>>
> >>>>> Has anyone analysed a lock-free, optimistic approach using the
> >>>>> 'If-Match:' HTTP header at the write stage?
> >>>>>
> >>>>> The scenario would look like :
> >>>>> - Every object (node, role, environment) would have some token
> >>>>>   (could be timestamp, or any other value changed on each edit)
> >>>>> - When the user invokes 'knife node edit' the version is sent to
> >>>>> client (possibly in HTTP Header)
> >>>>> - When the user edits the object, the value is stored somewhere
> >>>>> - When the user sends write API call to the server, it sends
> >>>>> 'If-Match' header with value received in first call
> >>>>>   - If the token matches the old one - the object is updated
> >>>>>   - If the token does not match the old one - the update is
> >>>>> rejected.
> >>>>>
> >>>>> That won't solve all the problems, but it will fix many of them
> >>>>> with (I suppose) less work and fewer changes. Such behaviour
> >>>>> would also be a non-breaking change.
> >>>> This was discussed way back at the first community summit, but no
> >>>> one has written the code. I'm sure it would be accepted if
> >>>> someone sent in a patch though.
> >>>>
> >>>> --Noah
> >>> Hello Noah,
> >>>
> >>> Thanks for the information. Just spent some time on the analysis
> >>> on how to achieve that.
> >>> - On the knife side
> >>>    - Extracting 'Etag' information from the response on the
> >>> Chef::REST
> >>>    - Storing it during the edit (probably in
> >>> Chef::Knife::NodeEditor)
> >>>    - Updating node from the NodeEditor class instead of Node class
> >>> - On the chef-server side
> >>>    - I have no idea at all where to start the analysis :/ I
> >>> haven't touched Erlang in a few months.
> >>>    - We would generally require :
> >>>      - Generating Etag (dynamically or storing it somewhere) for
> >>> the object (like node, environment...)
> >>>      - Sending Etag in HTTP Response header when requested by
> >>> client
> >>>      - On Modification request, detecting 'If-Match' header
> >>>        - Comparing received value with current Etag,
> >>> accepting/rejecting update
> >>>
> >>> I hope that pattern will work for all objects (roles, nodes,
> >>> environments..)
> >>> Also, this should not introduce a breaking change with older
> >>> chef-server/knife implementations.
> >>>
> >>> Is there any way someone could help me with chef-server side ?
> >>>
> >>> Regards,
> >>> --
> >>> Rafał Trójniak
> >>> WEB : http://trojniak.net/
> >>>
> >>>  :
> >>>
> >>> Jid :
> >>>
> >>> GPG  key-ID : 9A9A9E98
> >>> ABC8 83DF E717 6B76 CE49
> >>> BAFD 4F6F 854F 9A9A 9E98
> >>>
> >
> >
>

--
Rafał Trójniak
WEB : http://trojniak.net/

:

Jid :

GPG  key-ID : 9A9A9E98
ABC8 83DF E717 6B76 CE49
BAFD 4F6F 854F 9A9A 9E98

Attachment: pgpfAXO9wu9KN.pgp
Description: OpenPGP digital signature
