- From: Adam Jacob
- To:
- Subject: [chef] Re: race conditions during resource allocation
- Date: Tue, 13 Oct 2009 17:02:39 -0700
On Fri, Oct 2, 2009 at 3:08 PM, Nick Ohanian wrote:
> I have a question about possible race conditions during resource allocation.
> Let me give a simple (but somewhat silly) example. Imagine I have two
> types of nodes: web-server nodes and database nodes. As far as Chef is
> concerned, new nodes can come alive at any random time (in reality, they
> come alive when an auto-scaler determines that more resources are needed in
> the deployment). I want to make sure that each web-server, once configured
> and running, is connected to exactly one database. I can accomplish this by
> storing an attribute on the web-server nodes called "using-database". When
> a new web-server comes alive, it searches the chef-server db for the list of
> web-server and database nodes, and it looks for a database node that is not
> being used by any other web-server. Then it sets that database node in its
> "using-database" attribute.
>
> This works fine, but you get the usual race condition problem with shared
> read-write resources. If two web-servers come alive at the same time, they
> will search for an available database at the same time, and possibly select
> the same one, and save it to their "using-database" attributes. The
> one-web-server-per-database constraint is then violated.
>
> Has anyone else come across this problem? Does this situation go against
> the intent of a Chef system? Or is there support for it already, or some
> other Chef best-practice?
> Thanks in advance.

This is an interesting one, Nick. Nobody has brought it up before,
but the potential certainly exists. Because updates to the search
indexes are not immediate, the window for the race is larger still.

As things stand right now, you can't reliably get this sort of
exclusive allocation from Chef. CouchDB provides some internal
mechanisms for doing conflict resolution on documents, and Chef
basically uses them to get a 'last write wins' model.
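
To make the 'last write wins' point concrete, here is a toy in-memory
sketch in Python. It is not Chef or CouchDB code: `RevStore`,
`save_checked`, and `save_lww` are made-up names, and it assumes, purely
for illustration, that the claim is recorded on the database's own
document rather than on the web-server node as in Nick's example. It
only shows how a revision-checked write would reject a stale claim while
a last-write-wins write silently replaces it.

```python
class Conflict(Exception):
    """Raised when a write carries a stale revision."""


class RevStore:
    """Toy document store; every document has a revision counter."""

    def __init__(self):
        self.docs = {}  # doc_id -> (rev, value)

    def read(self, doc_id):
        return self.docs.get(doc_id, (0, None))

    def save_checked(self, doc_id, rev, value):
        # Refuse the write if the document changed since it was read.
        current_rev, _ = self.read(doc_id)
        if rev != current_rev:
            raise Conflict(doc_id)
        self.docs[doc_id] = (current_rev + 1, value)

    def save_lww(self, doc_id, value):
        # "Last write wins": ignore what the writer originally saw,
        # take the current revision, and overwrite.
        current_rev, _ = self.read(doc_id)
        self.save_checked(doc_id, current_rev, value)


# Both web servers read db1 while it is still unclaimed (rev 0).
store = RevStore()
rev_a, _ = store.read("db1")
rev_b, _ = store.read("db1")

# Last write wins: both saves succeed, the second replaces the first.
store.save_lww("db1", {"used_by": "web-a"})
store.save_lww("db1", {"used_by": "web-b"})
lww_owner = store.read("db1")[1]["used_by"]

# Revision-checked: the second save carries a stale revision and fails.
store2 = RevStore()
store2.save_checked("db1", rev_a, {"used_by": "web-a"})
try:
    store2.save_checked("db1", rev_b, {"used_by": "web-b"})
    second_claim_rejected = False
except Conflict:
    second_claim_rejected = True
```

In the last-write-wins run neither web server learns that the other
claimed the same database, which is the violation Nick describes.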

What you really wind up needing is an external locking mechanism that
handles the correct distribution of these sorts of resources, which
you then integrate with via a library.
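
A minimal sketch of what such a locking mechanism has to guarantee,
again in Python and again under assumptions: `ClaimRegistry` and
`claim_free_database` are hypothetical names, and a single in-process
`threading.Lock` stands in for whatever external service would actually
be used. The point is only that finding a free database and claiming it
happen as one atomic step.

```python
import threading


class ClaimRegistry:
    """Hypothetical stand-in for an external locking service."""

    def __init__(self, databases):
        self._lock = threading.Lock()
        self._owner = {db: None for db in databases}

    def claim_free_database(self, web_server):
        # The search for a free database and the write that claims it
        # run under one lock, so no two callers can pick the same one.
        with self._lock:
            for db, owner in self._owner.items():
                if owner is None:
                    self._owner[db] = web_server
                    return db
        return None  # nothing free right now


registry = ClaimRegistry(["db1", "db2", "db3"])
results = {}


def boot(web_server):
    # Each simulated web server asks for a database as it comes alive.
    results[web_server] = registry.claim_free_database(web_server)


# Five web servers come alive at once, competing for three databases.
threads = [threading.Thread(target=boot, args=(f"web{i}",)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

claimed = [db for db in results.values() if db is not None]
```

Every database ends up with at most one web server, and the web servers
that find nothing free get `None` back and can retry later.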

Another way to solve this problem would be to always bring up the
web servers and databases in pairs, which removes the need for the
lock altogether.
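
The pairing idea can be sketched the same way. This assumes whatever
launches the nodes creates both halves of a pair together;
`provision_pair` and the dictionary layout are illustrative only, not
Chef node objects.

```python
def provision_pair(index):
    """Return the two node definitions for one web/database pair."""
    database = {"name": f"db-{index}", "role": "database"}
    web_server = {
        "name": f"web-{index}",
        "role": "web-server",
        # Fixed when the pair is created; no search or claim step needed.
        "using-database": database["name"],
    }
    return web_server, database


pairs = [provision_pair(i) for i in range(1, 4)]
assigned = [web["using-database"] for web, _ in pairs]
```

Because the assignment is decided at creation time, there is no shared
state to race over.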

Regards,
Adam
--
Opscode, Inc.
Adam Jacob, CTO
T: (206) 508-7449