[chef] File created from template has broken (binary) and incorrect data in it after Chef run


  • From: Ian Marlier < >
  • To: chef < >
  • Subject: [chef] File created from template has broken (binary) and incorrect data in it after Chef run
  • Date: Tue, 5 Jun 2012 10:13:24 -0400

Hi, Chefs --

After a Chef run on one of my machines last night, a text file on the system, created from a template, had binary data in it.  On the next Chef run, everything was fine.  I'm wondering if this is something that anyone else has experienced.  (This is also filed as http://tickets.opscode.com/browse/CHEF-3179, but I wanted to cast a wide net since it's not an easily searchable issue.)

Basic details:
Chef 0.10.10, Ruby 1.9.2p180, Ohai 0.6.12


Many more details: read on!

I have a resource definition called "nrpe_service".  It's used to define services for NRPE (Nagios Remote Plugin Executor).  In this case, it's defined like this:
nrpe_service "check_couch_two_way_replication" do
  command "check_couch_replication.py --database $ARG1$ --source --dest"
end

The definition does some sanity checking (making sure that needed attributes are defined, etc), then does this if the sanity check works out:
        template "#{node[:nrpe][:confdir]}/#{params[:name]}.cfg" do
            cookbook "nagios"
            source "nrpe_service.cfg.erb"
            owner "root"
            group "root"
            mode "0644"
            variables({:params => params})
        end


So far, so good.

The template itself is pretty simple (though, wow, what was I thinking when I put some of the modification logic in the template instead of the resource definition?  That was silly...):
<% if @params[:command][0] == '/' %>
<% cmd = @params[:command] %>
<% else %>
<% cmd = "#{@node[:nagios][:plugin_path]}/#{@params[:command]}" %>
<% end %>
command[<%= @params[:name] -%>]=<%= cmd %>
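
For reference, here is a minimal sketch of what that same logic might look like if it lived outside the template. The helper name nrpe_command_line is hypothetical and is not part of the cookbook as it exists today; it only mirrors the if/else above in plain Ruby, assuming the plugin path is passed in as a string:

```ruby
# Hypothetical helper (not in the actual cookbook): builds the NRPE command
# line the same way the template's if/else does.
def nrpe_command_line(command, plugin_path)
  # An absolute command is used as-is.
  return command if command.start_with?('/')
  # Anything else is assumed to live under the Nagios plugin directory.
  "#{plugin_path}/#{command}"
end
```

With something like this, the definition could compute the command once and hand the finished string to the template through variables, so the template would only have to print it.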


The resulting service definition looks like this:
[imarlier nrpe.d]$ cat check_couch_two_way_replication.cfg
command[check_couch_two_way_replication]=/usr/lib64/nagios/plugins/check_couch_replication.py --database $ARG1$ --source --dest
[imarlier nrpe.d]$


This has been in place, and has been working just fine, for months now.  Then last night, chef-client ran (a scheduled run) and the file turned into this:
[imarlier nrpe.d]$ cat check_couch_two_way_replication.cfg.chef-20120604182416
command[1?en
            way_replication]=/usr/lib64/nagios/plugins/["uid", 10003]
[imarlier nrpe.d]$


This is...wrong.

The next time that Chef ran, everything went back to how it should be.  There were no alterations to the Chef server at this time, and no local modifications on the client either.

It's interesting to me that the things that appear to be corrupted are parameter values that should be derived entirely locally -- they're elements of the resource definition, and not parameters that are being pulled down from the server or anything like that.

Has anyone seen similar behavior?  Given that this isn't something that seems to happen with any frequency -- and it's not something that I can reproduce at will -- anyone have ideas as to what I might do to debug this?  Are there any known data corruption bugs in Chef?  In Ruby?  I'm kind of at a loss, but seeing something like this is pretty terrifying.

Thanks,

- Ian


--
Ian Marlier | Senior Systems Engineer
Brightcove, Inc.
290 Congress Street, 4th Floor, Boston, MA 02110

§