- From: Alex Kiernan
- To:
- Subject: [chef-dev] Re: Re: [chef] Re: Re: Intermittent chef-expander problem
- Date: Wed, 31 Aug 2011 17:23:29 +0100
On Wed, Aug 31, 2011 at 4:21 PM, Daniel DeLeo wrote:
> On Wednesday, August 31, 2011 at 5:00 AM, Alex Kiernan wrote:
> > [moved to chef-dev]
> >
> > On Wed, Aug 31, 2011 at 11:12 AM, Alex Kiernan wrote:
> > > > > > > I've seen something like this in a different context that leads me
> > > > > > > to believe it's a bug in the JSON gem or possibly Ruby. It seems
> > > > > > > to only occur on Red Hat systems. What version of ruby are you
> > > > > > > using? Is your system 64 bit? Is ruby 64 bit?
> > > >
> > > > > I'm guessing this:
> > > > >
> > > > > https://github.com/flori/json/issues/46
> > > > >
> > > > > might be the problem. Time to get that omnibus build working!
> > >
> > > > Tried this just on a client (once I'd hacked the gemspecs so it didn't
> > > > continue to pick up 1.5.2) and I still got some corrupt JSON through,
> > > > but I'm guessing the JSON that a client starts with comes from the
> > > > server, so any corruption there would propagate through?
> > > >
> > > > To test that theory I've just hacked up the server similarly...
> >
> > And it's just died in the same way. Though this time I just restarted
> > rabbitmq and chef-expander is (as expected) happy.
> >
> > Anyone any ideas? It definitely looks like it's native code corruption
> > :( My thinking at the moment is to patch chef-expander to discard (and
> > log) invalid messages since there's little point in it getting into a
> > die/fork/die loop.
>
> Just to be clear, you're now running with the json gem version 1.5.4 [1] on
> both the client and the server and you're still seeing this issue?

Almost but not quite 1.5.4 - what I had was HEAD from github with
commits up to fe046d68c5ed88b32b1cf3343babcf367b5cc79f, but I see
there's work since then with more GC guard stuff so I'll pull that and
give it a whirl (and the log references that exact defect - I could've
sworn I had those changes :|)

I've not got this code on all our clients, but the failure I observed
was from the client I upgraded (and verified using lsof... got bitten
by the version constraints first time around)
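
For anyone wanting to do the same check from inside Ruby rather than with lsof, something along these lines should work. It is only a sketch: `loaded_json_info` is a made-up helper name, and the file-extension pattern for the native library is an assumption that may need adjusting per platform.

```ruby
require 'json'

# Hypothetical helper: report which json gem this process actually loaded.
def loaded_json_info
  spec = Gem.loaded_specs['json'] # nil if json was not activated through RubyGems
  {
    version: JSON::VERSION,
    gem_dir: spec && spec.full_gem_path,
    # Native extension files pulled in by the require, if any (empty when the
    # pure-Ruby or Java implementation is in use).
    native_files: $LOADED_FEATURES.grep(%r{json/ext/.*\.(so|bundle|dll)\z})
  }
end

puts loaded_json_info.inspect if __FILE__ == $PROGRAM_NAME
```

Running it under the same Ruby and gem environment as the client or server should show whether an older json is still winning the version constraints.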

> When I've seen this problem on the client side, the server will drop the
> invalid JSON and will end up returning a 400 or 500. So I think what you're
> seeing is that the server gets valid JSON in from the client, then
> generates the data to send to chef-expander, and then triggers the bug when
> converting this data to JSON.

Yeah, that fits.

> Also, I agree that chef-expander should drop and log invalid messages. It's
> a tricky balance between ensuring that messages get retried for
> intermittent errors and dropping messages that will always cause an error,
> but in this case log and drop is definitely correct.

Cool will give that a go when I get a chance!
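
Roughly what I have in mind is below. This is a minimal sketch and not chef-expander's actual code: `handle_message` and the `:processed` / `:dropped` / `:retry` return values are hypothetical names, and wiring them up to the real queue ack/requeue calls is left out.

```ruby
require 'json'
require 'logger'

# Hypothetical handler: decide what to do with one raw message off the queue.
# Returns :processed, :dropped or :retry so the caller can ack or requeue.
def handle_message(raw_message, logger)
  payload = JSON.parse(raw_message)
  yield payload
  :processed
rescue JSON::ParserError => e
  # Invalid JSON will fail the same way on every retry, so log it and drop it
  # rather than dying and being handed the same message again.
  logger.error("dropping invalid message (#{e.class}): #{raw_message.inspect[0, 200]}")
  :dropped
rescue StandardError => e
  # Anything else may be intermittent, so leave the message eligible for retry.
  logger.warn("will retry message after #{e.class}: #{e.message}")
  :retry
end
```

The JSON::ParserError clause has to come before the StandardError one, since Ruby picks the first rescue clause that matches.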
--
Alex Kiernan