- From: Bryan Berry
- To: Erik Hollensbe
- Cc: Chef Dev
- Subject: [chef] Re: [chef-dev] Fwd: How do I know if my application has really been "provisioned"? a suggestion
- Date: Sun, 9 Dec 2012 20:14:28 +0100
Hey Erik,
thanks for your thoughtful comments!
>
service "foo" do
>
action :start
>
ensure do
>
if success
>
# check socket
>
else
>
# maybe kill process?
>
end
>
end
>
end
I think that your example above could work for a lot of use cases and
I could definitely see myself using it. However, it doesn't really
apply to the specific use case I have in mind. I need chef to loop for
a maximum specified timeout value, checking if a condition is true.
For example, starting a JBoss instance that uses the standalone-full
configuration will take around 20 seconds to start. A one-time check
after an indeterminate period will not be sufficient for my needs.
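
To make the requirement concrete, here is a minimal sketch of the kind of polling loop I mean. The helper name `wait_until` and its `timeout:`/`interval:` parameters are made up for illustration and are not part of Chef.

```ruby
# Hypothetical helper, not part of Chef: call the block repeatedly until it
# returns a truthy value or the timeout (in seconds) expires.
def wait_until(timeout:, interval: 1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield                 # condition met, stop polling
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval                       # wait before the next check
  end
end

# Example use (assumed port, adjust to the service being started):
#   require 'socket'
#   wait_until(timeout: 60, interval: 2) do
#     begin
#       TCPSocket.new('localhost', 8080).close
#       true
#     rescue SystemCallError
#       false
#     end
#   end
```

The condition is always checked at least once, so a service that is already up returns immediately instead of sleeping for a fixed period.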
I will have to take some more time to read through your full response.
Thanks for taking the time to make a thoughtful response!
On Sun, Dec 9, 2012 at 7:28 PM, Erik Hollensbe wrote:
>
Sorry for breaking the thread -- when I first signed up I used a plus hack
>
address and your list software is stricter than I thought. :)
>
>
Anyhow my reply is included here.
>
>
Begin forwarded message:
>
>
From: Erik Hollensbe
>
Subject: Re: How do I know if my application has really been "provisioned"? a suggestion
>
Date: December 9, 2012 10:21:21 AM PST
>
To: Bryan Berry
>
Cc: Chef Dev
>
>
>
On Dec 9, 2012, at 4:22 AM, Bryan Berry wrote:
>
>
Erik Hollensbe is doing some freaking awesome work on workflow
>
orchestration w/ chef-workflow and I think it illustrates the problem
>
here
>
>
require 'chef-workflow/helper'
>
class MyTest < MiniTest::Unit::VagrantTestCase
>
def before
>
@json_msg = '{ "id": "dumb message json msg" }'
>
end
>
def setup
>
provision('elasticsearch')
>
provision('logstash')
>
wait_for('elasticsearch')
>
wait_for('logstash')
>
inject_logstash_message(@json_msg)
>
end
>
>
def test_message_indexed_elasticsearch
>
assert es_has_message?(@json_msg)
>
end
>
end
>
>
If I understand Erik's code correctly, the `wait_for('elasticsearch')`
>
only waits for the vagrant provisioner to return. The vagrant
>
provisioner in turn only waits for `service elasticsearch start` to
>
return a zero exit code.
>
>
>
Not exactly. It doesn't matter for the purposes of this discussion, but I
>
feel compelled to explain anyway: chef-workflow's provisioner is
>
multithreaded and dependency-based out of the box. When you ask something to
>
be provisioned, it gets scheduled for a provision and a scheduler in the
>
background tries to provision it as soon as all dependencies are satisfied
>
for it, but it doesn't actually wait for anything to happen other than the
>
message to be sent to the scheduler. In this time it may provision other
>
machines that are needed to satisfy any requirements of the machine or
>
groups of machines.
>
>
The wait_for statement is simply a way to say, "I can't continue until this
>
machine actually exists" but is not coupled with a provision statement at
>
all -- the behavior you're seeing is partially a side effect of being unable
>
to multithread vagrant and virtualbox for provisioning (the knife side of
>
this is already multithreaded, and the gains are huge when you provision
>
more than one machine at a time for a specific role).
>
>
This is relevant because my current task is supporting ec2 as a first-class
>
provisioner which means that in your test this would actually be quite a bit
>
faster:
>
>
def setup
>
provision('elasticsearch')
>
provision('logstash', 1, %w[elasticsearch]) # logstash depends on ES here
>
wait_for('logstash')
>
inject_logstash_message(@json_msg)
>
end
>
>
Because the scheduler cares about the logstash dependency on ES now. If you
>
needed to provision other machines, you could throw these wait_for
>
statements in the unit tests themselves and literally have your tests be
>
provisioning tons of machines in the background but not actually halt the
>
testing process unless the one you care about hasn't been provisioned yet, which
>
has a significant gain over time as they're all provisioning at once whether or
>
not your test suite has made it to the point where they matter yet.
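
The scheduling idea described above can be sketched in plain Ruby. This is a hypothetical illustration, not chef-workflow's actual code: the class name `TinyScheduler` and its method signatures are assumptions, and it assumes dependencies are scheduled before the machines that depend on them.

```ruby
# Hypothetical sketch, not chef-workflow's implementation: each machine is
# provisioned in its own thread, which first waits for its dependencies.
class TinyScheduler
  def initialize
    @threads = {}
    @done = []
    @lock = Mutex.new
  end

  # Schedule a provision and return immediately. Dependencies must already
  # have been scheduled, otherwise fetch raises KeyError.
  def provision(name, deps = [], &work)
    dep_threads = deps.map { |dep| @threads.fetch(dep) }
    @threads[name] = Thread.new do
      dep_threads.each(&:join)           # block until dependencies finish
      work.call                          # the actual provisioning step
      @lock.synchronize { @done << name }
    end
    nil
  end

  # Block the caller until the named machine has been provisioned.
  def wait_for(name)
    @threads.fetch(name).join
    nil
  end

  # Names of provisioned machines, in completion order.
  def completed
    @lock.synchronize { @done.dup }
  end
end
```

Only `wait_for` blocks the caller; `provision` just records the work, so unrelated machines keep provisioning in the background while the tests run.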
>
>
Anyhow, this is important to point out because I think this dependency
>
system and parallelism code can be adapted to chef converges -- because
>
resource converge lists and this work are extremely similar from a conceptual
>
standpoint, and I'm about to suggest an alternative that would solve this
>
problem in a way that lets that happen, should the actual patches be
>
written. Please raise your hand if you'd like chef to try and parallelize as
>
much as it can about your converge. :P
>
>
We need an optional way to determine whether a server has been
>
completely provisioned, or that all the resources have entered a "done"
>
state. The only way I know that elasticsearch has started
>
successfully is if I see in the log "Elasticsearch has started" w/ a
>
timestamp more recent than when I started the service.
>
>
The before block would run before the service is actually actioned.
>
Now Chef would need some additional machinery to collect all the done
>
:after blocks and the related @before_results. This could be done by
>
chef_handler but may be better as part of chef itself. Let's call it
>
the done_handler for now. This done_handler would mark the time before
>
it starts handling any done_after blocks, then loop through the
>
collected done_after blocks for the specified timeout. Once all blocks
>
are complete it would continue onto other handlers, such as the
>
minitest_handler.
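
As a rough illustration of the done_handler described above, the following sketch collects check blocks and loops over them until all pass or a timeout expires. The `DoneHandler` class and its methods are hypothetical; nothing like this exists in Chef under these names.

```ruby
# Hypothetical sketch of the proposed done_handler. Each registered check is
# a block that returns a truthy value once its resource is "done".
class DoneHandler
  def initialize(timeout:, interval: 1)
    @timeout = timeout
    @interval = interval
    @checks = {}
  end

  # Collect a named check block for later evaluation.
  def register(name, &check)
    @checks[name] = check
  end

  # Loop over the collected checks until all pass or the timeout expires.
  # Returns the names still pending; an empty array means all are done.
  def run
    pending = @checks.dup
    deadline = now + @timeout            # mark the time before handling
    loop do
      pending.reject! { |_name, check| check.call }
      return [] if pending.empty?
      return pending.keys if now >= deadline
      sleep @interval
    end
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

A check that has already passed is dropped from the pending set, so slow resources do not cause fast ones to be re-checked on every pass.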
>
>
>
I think I have a more general suggestion that takes its cue from typical
>
exception handling schemes in languages, but not exactly.
>
>
When you have an exception in ruby, the program aborts unless you catch it.
>
Here's an example:
>
>
def foo
>
something_that_might_raise
>
rescue
>
$stderr.puts "omg! we raised"
>
end
>
>
This is a common problem in writing routines like 'foo':
>
>
def foo
>
create_a_file
>
something_that_might_raise
>
delete_that_file
>
rescue
>
$stderr.puts "omg! we raised"
>
end
>
>
The problem being that if "something_that_might_raise" does indeed raise an
>
exception, no amount of error handling is going to get "delete_that_file"
>
called.
>
>
Luckily ruby (and other languages that use exceptions) provides us with
>
"ensure", which allows us to specify a bit of code that always runs
>
no matter what happens. The right way to write the last example:
>
>
def foo
>
create_a_file
>
something_that_might_raise
>
rescue
>
$stderr.puts "omg! we raised"
>
ensure
>
delete_that_file if file_exists
>
end
>
>
Saying that a chef converge is exceptions-as-flow-control isn't exactly a
>
leap of logic -- a resource application breaks and chef blows up -- that's
>
the end of the story. Your job is to write your cookbooks and recipes in a
>
way that's tolerant of these issues.
>
>
Our ensure block can raise a clearer error, but it can also clean up and it
>
can also verify that indeed, some side effect worked. You can see above that
>
it checks if the file exists before attempting to delete it -- in the event
>
the create_a_file call failed, it does nothing.
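
For anyone who wants to try this, here is a runnable version of the same pattern. The file name and the raised error are made up for illustration; the directory is passed in so the example can be pointed at a temporary location.

```ruby
# Runnable variant of the ensure example above; names are illustrative only.
def foo(dir)
  path = File.join(dir, 'scratch.txt')
  File.write(path, 'temporary data')     # stands in for create_a_file
  raise 'something went wrong'           # stands in for something_that_might_raise
rescue RuntimeError => e
  $stderr.puts "omg! we raised: #{e.message}"
ensure
  # path is nil if we failed before it was assigned, so guard both cases
  File.delete(path) if path && File.exist?(path)
end
```

Calling `foo` with a scratch directory (for example one created by `Dir.mktmpdir` from the `tmpdir` standard library) leaves that directory empty afterwards, even though the method raised internally.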
>
>
Anyhow, this long-winded explanation more or less amounts to a
>
simplification of what you're asking for -- an ensure block that spans all
>
resource classes.
>
>
service "foo" do
>
action :start
>
ensure do
>
sleep 10
>
# ensures the socket the service foo created is open
>
TCPSocket.new('localhost', 8675309)
>
end
>
end
>
>
But it's also general enough to gracefully handle failures:
>
>
cookbook_file "foo.tar.gz" do
>
action :create
>
end
>
>
execute "untar foo.tar.gz" do
>
command <<-EOF
>
tar xzf foo.tar.gz
>
EOF
>
ensure do
>
FileUtils.rm('foo.tar.gz') # always runs, even if the above untar fails
>
end
>
end
>
>
A corollary would be allowing some kind of state predicate to determine
>
if the ensure block is being fired due to success or failure. These could be
>
implemented as super-sets of ensure:
>
>
service "foo" do
>
action :start
>
success do
>
# check socket
>
end
>
>
failure do
>
# maybe kill process if it got started anyway?
>
end
>
end
>
>
(This is the way things like jQuery's ajax tooling work.)
>
>
Or with a simple predicate you check yourself:
>
>
service "foo" do
>
action :start
>
ensure do
>
if success
>
# check socket
>
else
>
# maybe kill process?
>
end
>
end
>
end
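
Outside of Chef, the semantics of an ensure block with a success predicate could be modelled roughly as follows. This is a plain-Ruby sketch under assumed names (`run_with_ensure`, `on_ensure`), not a proposal for the actual implementation.

```ruby
# Hypothetical sketch, not Chef's API: run an action, then always invoke the
# ensure-style callback with a flag saying whether the action succeeded.
def run_with_ensure(action, on_ensure)
  success = false
  action.call
  success = true                         # only reached if action did not raise
ensure
  on_ensure.call(success)                # runs on both success and failure
end
```

If the action raises, the callback sees `false` and the exception still propagates afterwards, which matches how a failed resource would continue to abort the converge.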
>
>
It'd be nice if notifies worked here too -- so you could signal another
>
resource to run depending on what happened.
>
>
Anyhow, I think this is considerably more general and solves many more
>
use-cases than this specific problem, but does not exempt it from being
>
handled. Anyhow, back to my hole. :)
>
>
-Erik
>
>