chef - [chef] The future of the database and application cookbooks

From: Noah Kantrowitz
To: ,
Subject: [chef] The future of the database and application cookbooks
Date: Tue, 30 Aug 2011 19:39:55 -0700

As some people have noticed from my musings on IRC and elsewhere, I have
embarked on a (probably long overdue) overhaul of the application and
database cookbooks. This is largely orthogonal (for now) to the recent
LWRPs added to the database cookbook, though in the end they would be
removed. What follows is a set of syntax ideas for where I think this should
head. Not all of this is implemented, but you can see what is in my cookbooks
branch https://github.com/coderanger/cookbooks/tree/COOK-634.

Database clusters
=================

In the new LWRP hierarchy the top level resource is a database_cluster. This
defines an abstract model of a single-master-multi-slave cluster setup.
Within a cluster are one or more database_servers, each of which defines a
single type of database service (MySQL, Postgres, Redis, Cassandra, etc)
mapped on to the containing cluster. In general I would expect most clusters
to only contain a single database_server block, but this isn't required and
if you have a beefy primary box for, say, both Postgres and Redis and wanted
to have both use the same backup machine you could put those both in the same
cluster definition. Within a database_server you can define databases and
users, though this isn't needed for all types of servers, and the exact
implementation of what those two constructs do is left up to the backend
plugin implementing the specified server type.

With the model laid out, let's look at some syntax:

database_cluster "prod" do
  master_role "prod_database_master"
  slave_role "prod_database_slave"

  database_server do
    type :mysql
    database "prod" do
      engine "innodb"
    end
    user "myuser" do
      password "xxx"
      grant do
        select "prod"
        insert "prod"
        update "prod"
      end
    end
  end

  database_server do
    type :redis
  end
end

This will create a cluster running a MySQL server with one database named
prod and a Redis server (in the case of Redis there are no specific tuning
params we need to worry about so far). This example shows a very verbose form
of the desired syntax; below is an equivalent version that uses more sugar
and allows for sane defaults:

database_cluster "prod" do
  mysql do
    database do
      engine "innodb"
    end
    user "myuser" do
      password "xxx"
      grant "prod" do
        select
        insert
        update
      end
    end
  end

  redis
end

You can also pass Hashes to database and user instead of blocks if you want
to feed them from an external data source (such as a data bag):

database_cluster "prod" do
  mysql do
    database "prod", {:engine => "innodb"}
    user "myuser", {:password => "xxx", :grant => {:select => "prod", ...}}
  end

  redis
end
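
As a rough illustration of how one method could accept either form, here is a minimal plain-Ruby sketch that normalizes a Hash or a block into the same settings. The names SettingsCollector and normalize_options are hypothetical and are not part of the branch; this only shows the general technique, not how the cookbook does it.

```ruby
# Hypothetical sketch, not the cookbook's actual implementation.
# Collects DSL-style calls such as `engine "innodb"` into a Hash.
class SettingsCollector
  attr_reader :settings

  def initialize
    @settings = {}
  end

  # Any unknown method called with arguments is stored as a setting.
  def method_missing(name, *args)
    return super if args.empty?
    @settings[name] = args.length == 1 ? args.first : args
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

# Accept an options Hash, a block, or both, and return one normalized Hash.
def normalize_options(options = nil, &block)
  result = {}
  (options || {}).each { |key, value| result[key.to_sym] = value }
  if block
    collector = SettingsCollector.new
    collector.instance_eval(&block)
    result.merge!(collector.settings)
  end
  result
end

from_hash  = normalize_options({ "engine" => "innodb" })
from_block = normalize_options { engine "innodb" }
```

One caveat with this instance_eval approach: names that Ruby already defines on every object (select, for example, is Kernel#select) never reach method_missing, so a real implementation would need to define such methods explicitly.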

As mentioned before, the actual implementation can choose how to handle
databases and users. In the above examples, if you added any to the redis
section it would simply be a no-op (or maybe a runtime error?). Also as a
structural piece, database and user definitions will only ever be processed
on the database master (are there databases where this isn't the case and
slaves should create users and such too?). The individual backends for this
would live in the cookbook for that program, so the database cookbook would
only hold the core infrastructure and plumbing to run things. The idea is
that this resource block would be placed somewhere central and assigned to
all servers so they can use it for reference (see below in the app cookbook
section). Only nodes with the master or slave role would actually do
anything. As a special case, if the system detects that the master role
doesn't actually exist, it will run in a degraded mode that assumes a
single-node cluster (so the current node is the master).
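
The role rules described here could be sketched roughly as follows. The helper name cluster_membership and its arguments are assumptions made for illustration; the real lookup would presumably query the Chef server instead of taking arrays.

```ruby
# Hypothetical helper: names and arguments are illustrative only.
# node_roles:     roles applied to the current node
# existing_roles: every role known to the Chef server
def cluster_membership(node_roles, existing_roles, master_role, slave_role)
  # Degraded mode: the master role does not exist at all, so assume a
  # single-node cluster in which the current node is the master.
  return :master unless existing_roles.include?(master_role)

  return :master if node_roles.include?(master_role)
  return :slave  if node_roles.include?(slave_role)

  # Any other node only reads the definition for reference.
  :none
end

roles = ["prod_database_master", "prod_database_slave", "web"]
app_node = cluster_membership(["web"], roles,
                              "prod_database_master", "prod_database_slave")
```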

The database cookbook would also grow an associated knife-database plugin to
handle replication setup and teardown, since for some databases you have to
perform an out-of-band data sync before attaching the slave.

Seen in isolation, does this seem suitably flexible to cover most needs? I am
mostly familiar with SQL databases and a few smaller NoSQL products, so for
the greater NoSQL world is this still a viable model?

Application deployment
======================

The model for the application LWRPs is generally simpler in terms of data,
but is also more callback-driven. The top-level application resource just
contains the central data used for all applications, and then any framework
or platform specific bits are contained in sub-resources similar to the
database_server arrangement. Below is an example of deploying the Radiant CMS:

application "radiant" do
  path "/srv/radiant"
  owner "nobody"
  group "nogroup"
  repository "git://github.com/radiant/radiant.git"
  revision "master"
  packages ["libxml2-dev", "libxslt1-dev", "libsqlite3-dev"]
  migrate true

  rails do
    gems "bundler" => nil, "RedCloth" => "4.2.3"
    database :mysql => "radiant_production" do
      user "radiant"
      reconnect true
      encoding "utf8"
    end
    migration_command "bundle exec rake db:migrate"
  end

  unicorn do
    port 8000
  end
end

The global data maps very directly to fields in the existing application data
bags, except that they are not Hashes indexed by environment name. In the
rails sub-resource you see that again things mostly map to the data bag. The
biggest exception is the database section which doesn't actually state any
information about the database. Instead things are looked up by reference to
the information in the database resource. It would still be possible to
specify all information manually if you aren't using the database cookbook.
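
To make the lookup-by-reference idea concrete, here is a small plain-Ruby sketch. Everything in it is an assumption for illustration: the CLUSTERS structure, the host name, and the resolve_database helper are hypothetical stand-ins for whatever the database resource would actually expose.

```ruby
# Hypothetical in-memory stand-in for the data a database_cluster might expose.
CLUSTERS = {
  "prod" => {
    mysql: {
      host: "db-master.example.com", # assumed address of the master node
      databases: ["radiant_production"],
      users: { "radiant" => { password: "xxx" } }
    }
  }
}.freeze

# Find connection details by reference; explicit overrides always win.
def resolve_database(clusters, type, database, user, overrides = {})
  clusters.each_value do |servers|
    server = servers[type]
    next unless server && server[:databases].include?(database)

    account = server[:users].fetch(user, {})
    looked_up = {
      adapter: type.to_s, host: server[:host], database: database,
      username: user, password: account[:password]
    }
    return looked_up.merge(overrides)
  end

  # No matching cluster definition: rely on manually supplied values only.
  { adapter: type.to_s, database: database, username: user }.merge(overrides)
end

settings = resolve_database(CLUSTERS, :mysql, "radiant_production", "radiant",
                            { reconnect: true, encoding: "utf8" })
```

The fallback branch corresponds to the manual case: with no matching cluster definition, whatever the caller passes in overrides is all there is.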

Application backends define actions as callbacks more or less, to provide
code to execute at different points during the deploy process. The current
callbacks are :before_compile, :before_deploy, :before_migrate,
:before_symlink, :before_restart, and :after_restart. The last 4 just map to
the existing callbacks in the deploy resource system. before_compile takes
place as the first thing when the application resource executes (so when the
provider is compiled), while before_deploy takes place during the execute
phase in the provider, but after the main application folder structure is
created.
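
A minimal sketch of the callback ordering, in plain Ruby. The callback names and their order come from the list above; the BackendCallbacks class and its methods are hypothetical, and the sketch only shows ordering, not the deploy steps that run between the callbacks.

```ruby
# Callback order as listed in the text; the rest is a hypothetical sketch.
CALLBACK_ORDER = [
  :before_compile, :before_deploy, :before_migrate,
  :before_symlink, :before_restart, :after_restart
].freeze

class BackendCallbacks
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a handler for one of the known callbacks.
  def on(name, &block)
    unless CALLBACK_ORDER.include?(name)
      raise ArgumentError, "unknown callback #{name}"
    end
    @handlers[name] << block
  end

  # Run every registered handler, phase by phase, in CALLBACK_ORDER.
  def run_all
    CALLBACK_ORDER.each do |name|
      @handlers[name].each(&:call)
    end
  end
end

log = []
rails = BackendCallbacks.new
rails.on(:before_migrate) { log << "install gems" }
rails.on(:before_compile) { log << "validate settings" }
rails.on(:after_restart)  { log << "warm cache" }
rails.run_all
```

Handlers fire in the fixed callback order regardless of the order in which they were registered.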

The plan is to convert all the existing application cookbook recipes into a
small wrapper just piping data bags into this LWRP structure, so they should
continue to work as they always have.

So, what do people think? Is this a step in the right direction for
application deployment? Do you like the idea but want a different syntax for
it? Are there big, glaring use cases I've missed?

--Noah Kantrowitz

tl;dr Application deployment with Chef is going to be awesome!
