[chef-dev] Re: Re: Re: Re: local file copy resource?


Chronological Thread 
  • From: Jesse Campbell < >
  • To: Daniel DeLeo < >
  • Cc: Chef Dev < >
  • Subject: [chef-dev] Re: Re: Re: Re: local file copy resource?
  • Date: Mon, 10 Dec 2012 21:57:56 -0500

I agree with you that there needs to be a better understanding of the end goal.

My end goal is this:
1. I want to be able to pick up a text or binary file from anywhere (local, cookbook, remote, text string) and drop it somewhere, and have it just work.
2. I want to be able to pick up a text file from anywhere (local, cookbook, remote, text string), feed it through template rendering, and drop it somewhere, and have it just work.
3. Not break current functionality (existing resources should continue to exist and should function the same way)

Right now I can do many of these but I can't:
- copy/move a local binary file without an execute resource shelling out to cp, or rubycode running FileUtils.cp
- fetch a remote file from anything except for non-authenticated HTTP/HTTPS without shelling out to wget/curl
- feed a remote file through template rendering, without having first a remote_file to drop it into tmp, then a template with local option to pick it back up

Additionally there are three different resources that all just make a file.

I do not see a connection with directory, so I would not put directory or remote directory in with file (though puppet does exactly this) I'm not going into it here, but why the f does remote_directory pull things from a cookbook, and remote_file pull things from the web? who named these things? okay, rant over.

If I combine remote_file, template, cookbook_file, and file all together, at least the way I think of them, we end up with the following attributes:
path/name: the target location
diff: i don't know what this does, but its in the base file
backup: number of backups to keep for new file
content: text string to use for file content
local: boolean to say whether the file is local instead of cookbook
remote: boolean to say whether the file is remote instead of cookbook
source: accepts path on filesystem when local, path inside cookbook/files, path inside cookbook/templates when a template, URL when remote; causes content to be ignored
cookbook: name of the cookbook when source is specified, local is false, remote is false, ignored otherwise
template: boolean to say whether to perform template rendering
variables: the template variables hash, ignored unless template is true

These are added to base file: local, remote and template booleans, source, cookbook, and variables.
Combinations that don't make sense:
1. specifying local or remote, then specifying cookbook **you can already do this in template**
2. specifying source, then specifying content **you can already do this in template, cookbook_file, and remote_file**
3. specifying variables, but not specifying template **this one is new, but we could throw an error**
4. specifying local and remote at the same time **this one is new and should throw an error**

Any other combination should work.

You are correct that the implementations are currently different, but why should they be? this is not a complex problem, yet it is handled in a very complex way.
remote file fetches should be using a ruby standard library to fetch a file, not the custom chef rest client.
template should be able to be applied to any generic thing, so have the provider stage every file first with deploy_tempfile, then if template=true apply template processing in place on the tempfile, then deploy.

-Jesse



On Mon, Dec 10, 2012 at 8:57 PM, Daniel DeLeo < " target="_blank"> > wrote:

Forwarded message:

From: Daniel DeLeo < " target="_blank"> >
To: Bryan McLellan < " target="_blank"> >
Date: Monday, December 10, 2012 12:10:43 PM
Subject: Re: [chef-dev] Re: Re: Re: local file copy resource?

On Monday, December 10, 2012 at 10:07 AM, Bryan McLellan wrote:
On Sun, Dec 9, 2012 at 1:44 PM, Jesse Campbell < " target="_blank"> > wrote:

You could create pull request for this sort of patch, even if you're
not done. That would allow people to comment in the PR against the
specific parts of the code.

It looks like #action_create no longer differentiates between creating
a file that doesn't exist and updating the content of a file that does
exist. This seems minor, but my gut tells me it is important for Chef
to know the difference so we can be completely honest with the user
about what we did or are going to do, for purposes like whyrun,
auditing, reporting, etc.

I don't think I like overloading the 'create' action to also be a
'copy' action, differing on if a source file exists or not, as we're
not really creating a file.

In #action_move, you remote the source file if it already matches the
destination file. Conceptually this makes sense, but I'd be pretty
surprised if I asked Chef to move a file and it told me that it
deleted it instead.

#copy reports "create a new cookbook_file " but
is used in both Chef::File#action_move and Chef::File#action_create
(when copying a file). Again, we need to be accurate in what we're
reporting.

Personally the 'to' attribute made more sense to me than 'source.'
When I think about what I'm copying or what I'm moving, I think what I
am describing is the source, not the destination. But I can totally
see how this is possibly personal preference. We see this in
remote_file, cookbook_file and template today. The resource manages
the destination and there is a source attribute. In link, where we
already deal with two files on disk, we use 'to' since we realized
this made more sense in CHEF-30 [1].

Bryan


Oops, I failed at reply-all sending this originally :/

 
I'm against merging all of these behaviors into a single resource/provider, until there's a better understanding of what we're trying to do, what the related impact will be and what we'll do to mitigate that.

In particular, one problem we've had is where a base kind of resource (service, package, etc.) has fields (like the options field for packages) that are only supported by/relevant to some of the providers it uses. Users then expect these fields to modify behavior in a certain way, when they are in fact silently ignored. The problem also goes the other way; for example, one can create a package resource and set the provider to GemPackage, but then you can't specify all of the relevant parameters that the gem provider expects/respects. The first case could be addressed with validation, but would be more complicated than necessary, because the validation logic needs to know which provider would be used and which options that provider implements.

In this particular case, we're adding a bunch of options that could potentially be combined in invalid ways and also result in very different behaviors. At the same time, we're only going half way: will template be rolled into file? remote file? What about remote_directory and directory?

Bryan made the point that which resource to use should be determined by the kind of thing you're managing (file in this case), but we can't completely divorce what you're managing and how you manage it in the resource declaration. A hypothetical do-everything file resource would need to support cookbook/template name, local template switch and path, content parameter, URL of source, and (now) path of source. These are all mutually exclusive, so we need to catch this and return a sane error to the user if, say, they accidentally leave one parameter in there when changing the resource to be backed by cookbook file to being a template. The actual back-end implementation of all of these things is also quite different, so ideally there would also be some sort of dynamic provider switching based on the attributes present so that the provider doesn't devolve into a mess of case statements based on the selected mode of managing the file.

Despite my reservations above, I'm not opposed to unifying similar resources if we're convinced it makes chef cleaner or easier to use, but I'm also not currently convinced we aren't just moving complexity around and creating a bunch of work in terms of user education and doc changes for what might be a bike shed issue. If reality really is the first case, then we need to make sure we come up with an end result that handles edge cases and invalid input which are now caught by NoMethodError and the like.

-- 
Daniel DeLeo






Archive powered by MHonArc 2.6.16.

§