On Tuesday, May 7, 2013 at 1:10 AM, Bryan Berry wrote:
I have some ideas on extending test-kitchen to test clustered
applications and I would love some feedback before I go coding off in
a particular direction.

Problem: I deal primarily with distributed applications, and testing
the related cookbooks can be a pain. I also have to make sure these
cookbooks work across different Linux distros. Test-Kitchen was not
originally created with this use case in mind, though at this time I
don't see any reason it couldn't support it.

Vagabond[1], written by Chris Roberts, extends .kitchen.yml to include
a clusters component, among other things. I would love to see
test-kitchen absorb some of that functionality or at least provide
extension points to make this more easily pluggable.

Testing a cluster works differently than testing an individual node. I
want to interrogate the state of the cluster as a whole, not look
inside each individual server. To do this I need to wait until all
servers in a cluster converge, or at least a quorum of them do. Once a
quorum of nodes has converged, run a series of tests against the
cluster. These tests execute on the client executing `kitchen` rather
than inside the nodes of the cluster.

Here are the steps in brief:

1. Converge all nodes in a cluster
2. Wait for quorum of nodes to converge
3. Execute tests against the cluster

I would put the tests for a cluster in
my_cluster/test/cluster/cluster_name

Applications like Elasticsearch, Zookeeper, or Cassandra don't have a
master node, so each node has an identical run_list and attribute set:

    clusters:
      default:
        - member: zk1
        - member: zk2
        - member: zk3
    platforms:
    - name: ubuntu-12.04
    - name: centos-6.3
    suites:
    - name: default
      run_list: [ "recipe[zookeeper]" ]

To test the default cluster on CentOS:
`kitchen test --cluster default --platform centos-6.3`

Let's make this even DRYer:

    clusters:
      default:
        node_count: 3
        quorum: 2
    platforms:
    - name: ubuntu-12.04
    - name: centos-6.3
    suites:
    - name: default
      run_list: [ "recipe[zookeeper]" ]

The test for this example zookeeper cluster would connect to one
zookeeper node and make sure it sees the other zookeeper nodes. I am
paraphrasing the zookeeper code because I am not certain of what the
actual call would be.

my_cluster/test/cluster/default/check_members.rb

    require 'zk'

    describe "zookeeper cluster" do
      before(:all) do
        @zk = ZK.new(some_ip)
      end

      it "sees the other members" do
        peers = @zk.get("system/peers")
        peers.should == ACTUAL_PEERS
      end
    end

The primary challenge here is resolving the names of the members of
the cluster. One way to do this is to access the statefiles for the
nodes in .kitchen/*.yml. Another would be to somehow access the
@instances array for the current Kitchen config. Yet another option
would be to stand up a chef-zero server.

What about a distributed app where each node does not have an
identical run_list? Here is how I would handle that for something like
HBase that has a "hmaster" that stores metadata for the whole cluster.

    clusters:
      default:
        - member: head1
          run_list: [ "recipe[hbase::hmaster]" ]
        - member: store1
          run_list: [ "recipe[hbase::data_node]" ]
        - member: store2
          run_list: [ "recipe[hbase::data_node]" ]
    platforms:
    - name: ubuntu-12.04
    - name: centos-6.3
    suites:
    - name: default
      run_list: [ "recipe[zookeeper]" ]

I know that test-kitchen is an unopinionated tool and attempts to be
workflow agnostic, but I feel that what I have shown here is a fairly
simple workflow. The cluster definition I have presented is not
suitable for modeling failover in a cluster or specific states. At
that point one should use a custom workflow tool like chef-workflow or
simply custom rake tasks.

Depending on the feedback I get from this, I plan to extend
test-kitchen or Vagabond to handle the workflow I have described here.

Thanks for reading!
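
P.S. To make step 2 (wait for a quorum) and the statefile idea a bit
more concrete, here is a rough sketch in plain Ruby. It is only an
illustration under assumptions, not test-kitchen's actual API: the
helper names (load_cluster_state, quorum_reached?, wait_for_quorum) are
hypothetical, and I am assuming each .kitchen/*.yml statefile is a YAML
hash with string keys "hostname" and "last_action", which may not match
what test-kitchen really writes.

```ruby
require 'yaml'

# ASSUMPTION: actions that imply the node has finished converging.
CONVERGED_ACTIONS = %w[converge setup verify].freeze

# Hypothetical helper: read every statefile in a .kitchen directory and
# return one hash per member. The "hostname" and "last_action" keys are
# assumptions about the statefile contents.
def load_cluster_state(dir = '.kitchen')
  Dir.glob(File.join(dir, '*.yml')).sort.map do |path|
    state = YAML.load_file(path) || {}
    {
      name: File.basename(path, '.yml'),
      hostname: state['hostname'],
      converged: CONVERGED_ACTIONS.include?(state['last_action'].to_s)
    }
  end
end

# True once at least `quorum` members report a finished converge.
def quorum_reached?(members, quorum)
  members.count { |m| m[:converged] } >= quorum
end

# Poll the statefiles until the quorum is reached or the timeout expires.
# Returns the member list so the cluster tests can pick a hostname.
def wait_for_quorum(dir, quorum, timeout: 600, interval: 5)
  deadline = Time.now + timeout
  loop do
    members = load_cluster_state(dir)
    return members if quorum_reached?(members, quorum)
    raise "quorum of #{quorum} not reached within #{timeout}s" if Time.now >= deadline

    sleep interval
  end
end
```

A cluster test could then call wait_for_quorum('.kitchen', 2) and hand
the hostname of any converged member to the ZK client, instead of
hard-coding some_ip.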