[chef] Re: Re: Re: Upstart problem with MongoDB recipe...


  • From: Sam Pointer
  • Subject: [chef] Re: Re: Re: Upstart problem with MongoDB recipe...
  • Date: Tue, 8 Oct 2013 08:52:04 +0100

"I observe that the other public MongoDB cookbooks I found do not use Upstart (init.d instead). I presume because they are trying to support non-Upstart platforms and so choose not to use Upstart on Ubuntu."

I think it is more likely that they're trying to avoid Upstart for exactly the reasons you're encountering: it is broken by design. From what I gather, a number of common daemons simply cannot work correctly with Upstart, nginx seeming to be the prime example. You're better off avoiding it completely and going with a traditional init script. If you're so inclined, some searches for "trying to get x running with Upstart" and "upstart cannot restart x" show a very long tail of things Upstart just cannot support in its present state.

The single biggest problem with Upstart (IMHO, and note that biggest means "worst in a set of many") is the short-sighted decision to only express two values for the number of forks permitted. What makes this worse is the thinking behind it, which takes a naive and inexperienced view that daemons should only need to fork twice and so that's all we'll support.

Out in the real world, away from trying to optimise the desktop for quick boot times, this simply isn't the case with nginx, Sinatra, MongoDB (as you found), and countless other daemons. You might say that these relatively new pieces of software are running against the grain of a stone-hewn UNIX convention. I'd say that it is incredibly short-sighted not to express the number of forks to expect as an integer, so that anything deviating from the norm can be accommodated, as presumably the daemon author has good reason for doing so.

I find all of the above very puzzling if you consider the following pieces from Upstart's initial author:


"The only way I was ever going to get Colin to agree that writing a new init system was a good idea was by promising to make it backwards compatible with the old one."

It clearly isn't. Combined with the designer and original author's failed starting assumptions that daemons need only ever fork twice, insistence that "[traditional init is the] least understood and maintained [part of the system]. Nobody actually uses sysvinit's features" (speak for yourselves!) and willingness to throw away decades of accumulated knowledge:

"I firmly believe that sometimes you've just got to ditch the past and start over from scratch. (The standard library inside Upstart is called libnih for a reason :p) To steal a phrase from a favourite author of mine, I am to backwards compatibility what King Herod was to the Bethlehem Playgroup Association."

I'm surprised such a system made it into a distribution, into a server-orientated distribution, and into Fedora. Reinventing things that you don't understand has to be the worst part of the modern UNIX ecosystem.
 


On 4 October 2013 21:52, Russell Bateman wrote:
Well, I made some interesting, if not completely helpful, observations while looking deeper into this. If I were beginning to look into this problem, I'd want the slight advantage of knowing what I'm writing here.

Thanks to Sam Pointer and others, including in the MongoDB forum, for help in circling the wagons.

1. Following Upstart's instructions to the letter, I discovered that MongoDB (mongod) forks (or clones) itself 63 times starting up.
--I don't really believe that, and I'm skeptical of the documentation's exact procedure, but that's what its advice (using strace and a regular expression) led to. I spent time in the trace itself and, while I'm an old guy with 2+ decades in C and Unix internals and recognized all the system calls made, I don't actually grok the cloning going on that pushes the count so high.
2. Feeling pretty bummed, but nimble-fingered, I experimented with all the possible permutations of
 expect fork
 expect daemon
 expect stop
 respawn
plus none of these at all (what MongoDB ships with). No permutation worked. Some left a state of claiming that the service was running when ps -ef | grep [m]ongo proved that it was not. No permutation left my VM with a running mongod daemon, even after reboot, something that the edelight cookbook succeeds at (read on).

3. Conducting a survey of all the public MongoDB cookbooks I could find, I observed that edelight's (the one I originally followed to begin learning Chef) uses what MongoDB ships with, and its single.rb recipe succeeds in getting mongod launched and working, something I had already observed a couple of months ago. Upstart, however, is nonplussed and behaves exactly as I have complained. That said, once such a node is rebooted, it gets over the problem and Upstart accepts the installation as legitimate from then on.
--a possible direction to pursue here would be to cannibalize edelight's cookbook to see if I can't make it do all the things my own does so well.
4. I observe that the other public MongoDB cookbooks I found do not use Upstart (init.d instead). I presume because they are trying to support non-Upstart platforms and so choose not to use Upstart on Ubuntu. I thought to investigate walking that path which strikes me as more promising even though I'm not interested in any platform except Ubuntu. Nevertheless, ...

5. On the way to #4, I stumbled upon a hack which works very well, just as well as #3. I don't like it much because I don't think Upstart scripts were meant to be written this way, but it solves the problem in my exact environment, Ubuntu 12.04 (LTS). Here is what my script looks like for starting a mongod (or mongos). What tricks Upstart is launching the daemon from pre-start, where the forks apparently don't count.
description "Keeps MongoDB running between boots"

limit nofile 20000 20000
kill timeout 300         # wait 300s between SIGTERM and SIGKILL.

pre-start script
  exec start-stop-daemon --start --quiet --chuid mongodb \
      --exec /usr/bin/mongod -- --config /data/mongodb/mongodb.conf
end script

script
  sleepWhileDaemonIsUp()
  {
    while pidof $1 > /dev/null; do
      sleep 1
    done
  }

  sleepWhileDaemonIsUp /usr/bin/mongod
end script

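post-stop script
  # Run pidof as the condition; "[ pidof ... ]" tests the words, it does not run the command.
  if pidof /usr/bin/mongod > /dev/null; then
    kill `pidof /usr/bin/mongod`
  fi
end script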
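
A possible reading of the count of 63 in item 1: a regular expression that matches every clone() call in an strace log also matches thread creation, because on Linux threads are created with clone() carrying the CLONE_THREAD flag. The sketch below is not from the thread; it splits the matches into thread-creating and process-creating calls. The function name classify_clones and the log file are made up for illustration, and whether threads really account for mongod's count is an assumption that would need checking against a real trace.

```shell
#!/bin/sh
# Hypothetical helper: classify the fork/clone calls found in an strace log.
# $1 = path to a log written by "strace -f -o <log> ..."
classify_clones() {
  log="$1"
  # Every fork( or clone( call, as the fork-counting procedure would count them.
  total=$(grep -E -c '(^|[^[:alnum:]_])(fork|clone)\(' "$log" || true)
  # clone() calls that carry CLONE_THREAD create threads, not processes.
  threads=$(grep -E '(^|[^[:alnum:]_])clone\(' "$log" | grep -c 'CLONE_THREAD' || true)
  echo "total=$total threads=$threads processes=$((total - threads))"
}
```

This is only a rough filter: calls that strace splits into "unfinished" and "resumed" lines may show their flags on a different line than the one matched here.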



On 10/3/2013 5:10 PM, Sam Pointer wrote:
Hi Russell,

Whilst not a complete answer I realize, in my experience Upstart losing track of process state is generally down to its insistence on fork following rather than pid files as a way of keeping track of what it thinks is running or not. The number of forks it expects to see is determined by the `expect fork|daemon` definition.

The unfortunate thing about Upstart is that it only gives you two options, which causes problems with daemons that fork multiple times to really-really be detached from the parent. I guess it depends where you sit on the pragmatic/idealistic line as to whether this is a good thing or not.
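
For contrast, the pid-file approach that traditional init scripts rely on can be sketched in a few lines of shell. This is an illustration only, not something Upstart does; the function name pid_file_alive is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process recorded in a pid file is alive.
# $1 = path to the pid file
pid_file_alive() {
  pidfile="$1"
  [ -r "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  # Reject empty or non-numeric contents.
  case "$pid" in
    ''|*[!0-9]*) return 1 ;;
  esac
  # kill -0 delivers no signal; it only checks that the PID exists
  # and that the caller is allowed to signal it.
  kill -0 "$pid" 2>/dev/null
}
```

With the pid file used elsewhere in this thread, a call would look like `pid_file_alive /var/run/mongodb.pid && echo running`. A stale pid file whose number has been reused by another process would fool this check, which is the usual objection to pid files.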

My inclination would be to investigate exactly how many forks are involved starting and stopping the daemon via `service` and to see if they match the `expect` definition in the configuration file. The Upstart documentation has a good introduction to the strace incantations you can use to count forks: http://upstart.ubuntu.com/cookbook/#how-to-establish-fork-count
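
As a rough sketch of that procedure (an approximation, not the cookbook's exact text): run the daemon under `strace -f -o <log>` and count the fork and clone calls in the log, then compare the count with what the two `expect` values allow. The function names count_forks and suggest_expect are invented for this example.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a fork count with Upstart's expect stanza.

# Count fork( and clone( calls in an strace log, e.g. one produced by:
#   strace -f -o /tmp/strace.log /usr/bin/mongod --config /data/mongodb/mongodb.conf
count_forks() {
  grep -E -c '(^|[^[:alnum:]_])(fork|clone)\(' "$1" || true
}

# Map a fork count to the stanza Upstart offers for it, if any.
suggest_expect() {
  case "$1" in
    0) echo "no expect stanza" ;;
    1) echo "expect fork" ;;
    2) echo "expect daemon" ;;
    *) echo "unsupported: $1 forks" ;;
  esac
}
```

Usage would be along the lines of `suggest_expect "$(count_forks /tmp/strace.log)"`; the log path is just an example.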

Sam Pointer
Lead Consultant


I have this MongoDB recipe that's working pretty well, but suffers from a problem that I've not been able to solve. The basic question is: why does upstart not start the service at the end of the recipe, nor when the node(s) are rebooted?

My use of upstart mechanisms is identical to what comes out of the MongoDB package and to other MongoDB community recipes. Or so I think.

First, particulars: Ubuntu 12.04 (LTS) everywhere, Chef server and clients (11.x).

The Chef recipe appears to work and finish perfectly, but upstart won't start the service. What I'm seeing is:
  1. upstart service says MongoDB daemon not launched (mongodb stop/waiting)
  2. performing service mongodb start by hand does not start it (mongodb stop/waiting)
  3. executing the start-stop-daemon by hand just as in /etc/init/mongodb.conf will launch the daemon
  4. once launched, upstart doesn't see the service as running (mongodb stop/waiting)
  5. process status (ps) shows the daemon is running
Is #4 something to do with pid?
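
The disagreement in #4 and #5 can be detected mechanically by comparing what upstart reports with what the process table shows. A minimal sketch, assuming the status line has the usual "job goal/state" shape seen in this message; the function names are invented for the example.

```shell
#!/bin/sh
# Hypothetical helpers: compare Upstart's view of a job with the process table.

# Succeed if an Upstart status line reports the job as running.
upstart_reports_running() {
  case "$1" in
    *" start/running"*) return 0 ;;
    *) return 1 ;;
  esac
}

# $1 = status line, $2 = PIDs found in the process table (empty if none)
check_job() {
  if upstart_reports_running "$1"; then upstart=yes; else upstart=no; fi
  if [ -n "$2" ]; then process=yes; else process=no; fi
  if [ "$upstart" = "$process" ]; then
    echo "consistent"
  else
    echo "mismatch: upstart=$upstart process=$process"
  fi
}
```

On a live node this might be called as `check_job "$(service mongodb status)" "$(pidof mongod)"`.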

I hope someone sees what I've done wrong here. I very much appreciate the help.

Russ

Console scrape illustrating the list above, from end of Chef run:

.
.
.
[2013-10-03T18:33:59+00:00] INFO: template[/data/mongodb/mongodb.conf] sending restart action to service[mongodb] (immediate)
  * service[mongodb] action restart[2013-10-03T18:33:59+00:00] INFO: Processing service[mongodb] action restart (mongodb::replica line 23)
[2013-10-03T18:33:59+00:00] INFO: service[mongodb] restarted

    - restart service service[mongodb]

  * service[mongodb] action enable[2013-10-03T18:33:59+00:00] INFO: Processing service[mongodb] action enable (mongodb::replica line 42)
 (up to date)
  * service[mongodb] action start[2013-10-03T18:33:59+00:00] INFO: Processing service[mongodb] action start (mongodb::replica line 42)
[2013-10-03T18:33:59+00:00] INFO: service[mongodb] started

    - start service service[mongodb]

[2013-10-03T18:33:59+00:00] INFO: Chef Run complete in 5.902176274 seconds
[2013-10-03T18:33:59+00:00] INFO: Running report handlers
[2013-10-03T18:33:59+00:00] INFO: Report handlers complete
Chef Client finished, 7 resources updated
:~# service mongodb status
mongodb stop/waiting
:~# service mongodb start
mongodb stop/waiting
:~# cat /etc/init/mongodb.conf
description "Keeps MongoDB running between boots"
limit nofile 20000 20000
kill timeout 300         # wait 300s between SIGTERM and SIGKILL.
start on runlevel [2345]
stop on runlevel [06]
expect fork
script
  ENABLE_MONGODB="yes"
  PIDFILE=/var/run/mongodb.pid
  if [ -f /etc/default/mongodb ]; then . /etc/default/mongodb; fi
  if [ "x$ENABLE_MONGODB" = "xyes" ]; then
    exec start-stop-daemon --start --quiet --chuid mongodb -m --pidfile $PIDFILE \
        --exec /usr/bin/mongod -- --config /data/mongodb/mongodb.conf
  fi
end script
:~# PIDFILE=/var/run/mongodb.pid
:~# exec start-stop-daemon --start --chuid mongodb -m --pidfile $PIDFILE --exec /usr/bin/mongod -- --config /data/mongodb/mongodb.conf
about to fork child process, waiting until server is ready for connections.
forked process: 3969
all output going to: /data/mongodb/mongodb.log
child process started successfully, parent exiting
:~$ service mongodb status
mongodb stop/waiting
:~$ ps -ef | grep [m]ongo
mongodb 3969  1  0 18:35 ?  00:00:00 /usr/bin/mongod --config /data/mongodb/mongodb.conf









