[chef-dev] Re: Re: Re: goiardi postgres backed search preview


  • From: Jeremy Bingham
  • To: Stephen Delano
  • Cc: Daniel DeLeo, Adam Jacob
  • Subject: [chef-dev] Re: Re: Re: goiardi postgres backed search preview
  • Date: Tue, 23 Jun 2015 10:32:27 -0700

Regarding "issues with hitting Postgres even more than before": I mentioned that specifically because Seth Thomas had expressed concern about it with Hosted Chef. In my testing so far, the 10,000 fauxhai nodes led to about a quarter million rows in the search items table. The queries themselves ran OK, though; simple queries like "name:foo*" complete in about 50ms (IIRC), but more complex queries do take longer. Something like "datacenter.city:Vagra* AND name:(server2* OR server4*)" took more like 10-12 seconds. There's definitely room for improvement. So far, using WITH clauses in the queries has helped immensely, but I'm sure there are more improvements that can be made.
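
To make the WITH-clause idea concrete, here is a minimal sketch of how a flat list of Solr-style "field:pattern" terms might be turned into one parameterized query. It is not goiardi's actual query builder: the table name search_items, its columns (item_name, path, value), the function names, and the psycopg2-style %s placeholders are all assumptions for illustration. It handles only terms joined by AND (no nested OR groups) and never connects to a database.

```python
from typing import List, Sequence, Tuple

Term = Tuple[str, str]  # (field path, Solr-style pattern), e.g. ("name", "foo*")


def like_pattern(pattern: str) -> str:
    """Convert a Solr-style wildcard pattern to a SQL LIKE pattern."""
    # Escape characters special to LIKE (backslash is the default escape).
    escaped = pattern.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escaped.replace("*", "%")


def build_and_query(terms: Sequence[Term]) -> Tuple[str, List[str]]:
    """Build one parameterized query that ANDs all terms together.

    Assumes a hypothetical table search_items(item_name, path ltree, value text).
    """
    if not terms:
        raise ValueError("at least one term is required")
    params: List[str] = []
    # The CTE narrows the table once to rows whose path matches a queried field.
    path_filter = " OR ".join("path ~ %s::lquery" for _ in terms)
    params.extend(field for field, _ in terms)
    # Each term becomes a SELECT over the narrowed rows; INTERSECT keeps only
    # the items that satisfy every term.
    selects = []
    for field, pattern in terms:
        selects.append(
            "SELECT item_name FROM found_items "
            "WHERE path ~ %s::lquery AND value LIKE %s"
        )
        params.extend([field, like_pattern(pattern)])
    sql = (
        "WITH found_items AS (\n"
        "    SELECT item_name, path, value FROM search_items\n"
        f"    WHERE {path_filter}\n"
        ")\n" + "\nINTERSECT\n".join(selects)
    )
    return sql, params
```

Calling build_and_query([("datacenter.city", "Vagra*"), ("name", "server2*")]) returns the SQL text plus a parameter list to hand to a database driver. The point of the CTE in this sketch is that the large table is filtered once, and the per-term conditions then run against the much smaller intermediate result.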

One thing that would help, I think, is having a separate schema for each organization rather than one massive table. Goiardi uses its own schema instead of dumping everything in public, but I'm sure that erchef could deal with querying different schemas.

Another possibility is that eventually one could query these Postgres tables more directly, rather than going through the solr parser. Doing direct SQL is a bad idea, but being able to use the Postgres ltree and trigram syntax directly would be nice.
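
For readers unfamiliar with the two building blocks, the sketch below illustrates what they operate on: a dotted label path of the kind ltree stores, and the three-character fragments that trigram matching compares. It is a simplified, ASCII-only approximation written for illustration, not the code of the ltree or pg_trgm extensions, and the function names are hypothetical.

```python
import re
from typing import Iterable, Set


def ltree_path(keys: Iterable[str]) -> str:
    """Join nested attribute keys into an ltree-style dotted path.

    ltree labels have a restricted character set, so anything other than
    ASCII letters, digits, and underscore is replaced with an underscore.
    """
    labels = [re.sub(r"[^A-Za-z0-9_]", "_", key) for key in keys]
    if not labels or any(label == "" for label in labels):
        raise ValueError("every key must yield a non-empty label")
    return ".".join(labels)


def trigrams(text: str) -> Set[str]:
    """Approximate trigram extraction: lowercase, split into words, pad each."""
    result: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Two leading spaces and one trailing space mark the word boundaries.
        padded = "  " + word + " "
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams in either string."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

A path such as ltree_path(["datacenter", "city"]) is what a pattern like "datacenter.*" would be matched against, while similarity("server2", "server4") gives a rough feel for why near-identical names score highly under trigram comparison.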

Anyway, I'm up for chatting about this sometime. I'm certainly open to suggestions and comments.

Thanks,

-j

On Tue, Jun 23, 2015 at 9:40 AM, Stephen Delano wrote:
Jeremy, this is awesome. It’s continually on my list of “if I had free time” things to do, and I’m excited to poke around with what you’ve built.

When you mention “because of issues with hitting Postgres even more than before” - do you have any specific numbers / data on the load that this feature adds to the PostgreSQL server? While mulling over this sort of “new / different search” feature the past few months, our primary concern was with adding additional load to Postgres, which was already starting to see some scaling issues with very large installs. Over the past few months, and wrapping up with the Chef Server 12.1.0 release, we’ve significantly reduced the load that the Chef Server places on Postgres with a combination of query tuning / optimization and optimizations to the amount of “chatter” we do over the wire while making queries. With the increase in performance, search-via-postgres is now back on the table as an option for replacing the Rabbit / chef-expander / Solr stack.

I’d love to take a few minutes sometime to chat about this, and I’m sure there are more people here at Chef that would like to hear about your experiences building this.

Cheers!


Stephen Delano - Engineering Lead, Chef


On Tue, Jun 23, 2015 at 9:11 AM, Daniel DeLeo wrote:

Yep, this is awesome. Would actually get rid of quite a few services in the erchef app.

--
Daniel DeLeo


On Monday, June 22, 2015 at 5:30 PM, Adam Jacob wrote:

> Super cool. I've thought about prototyping this - it would, obviously, be great to get rid of a component.
>
> Adam
>
> On Mon, Jun 22, 2015 at 5:27 PM, Jeremy Bingham wrote:
> > As some of you know, I've been working off and on for a while on a new way of handling chef search that doesn't need solr but is more robust than the built-in goiardi ersatz solr I cooked up, using Postgres to handle the backend. I'm happy to say that there's at last a preview version available in goiardi now. To save space in this email, I'll just link the writeup of the preview that I did: http://goiardi.gl/blog/2015/06/22/postgres-search-in-preview/.
> >
> > There's more here than just announcing that a new goiardi feature is coming, though. While it may or may not be particularly useful for something like hosted Chef (because of issues with hitting Postgres even more than before), it may be useful as an option for standalone self-hosted Chef installations. The tables aren't particularly tied to goiardi at all (you can see them at https://github.com/ctdk/goiardi-schema/blob/pg-search/postgres/deploy/ltree.sql, https://github.com/ctdk/goiardi-schema/blob/pg-search/postgres/deploy/ltree_del_col.sql, and https://github.com/ctdk/goiardi-schema/blob/pg-search/postgres/deploy/ltree_del_item.sql), and I think Chef Server could use the same structure just fine.
> >
> > I thought I'd bring this to the attention of the greater Chef developer community and see what people thought about it, get comments on the implementation, and discuss plans to make it available to erchef if there's interest. The easiest way, probably, to use it with erchef is to make a standalone search program (like I did a while back with a standalone universe endpoint that I snipped out of goiardi as a proof of concept), but I'm sure there are other ways. The downside is that a standalone implementation would need to wait until these search changes are merged back into goiardi's 1.0.0-dev branch and that branch is finished, both for multi-org support and to take advantage of the rewritten http multiplexer. On the plus side, the solr query parser in goiardi is already written.
> >
> > Thoughts? Is this something anyone would like to hear more about?
> >
> > -j






