- From: Seth Falcon
- To:
- Subject: [chef] Re: Re: Re: Using unary NOT in knife search
- Date: Fri, 20 Mar 2015 10:09:24 -0700
Hi there,
I did some of the work "way back when" on modifying how Chef handles
incoming data and indexes it in solr. As Dan explained, we condense
the contents of each Chef object into a single 'content' field. The
Chef JSON objects are transformed to key value pairs. The tricky bit
is dealing with the nested structures and arrays in JSON. Skipping
over those details of the nested structure, you can think of the
transform applied like this:
```
{ "name": "server1", "version": "1.2.3" }
  ---> content: name__=__server1 version__=__1.2.3
```
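To make the transform concrete, here is a minimal sketch in Python. It only handles a flat object; how nested structures and arrays are expanded is deliberately left out, as above. The function name `to_content` is hypothetical and is not taken from the Chef code base.

```python
# Hypothetical illustration of the flattening step, not Chef's actual code.
SEP = "__=__"

def to_content(obj):
    """Join the top-level key/value pairs of a flat dict into one string."""
    # Nested objects and arrays are NOT handled here (assumption: flat input).
    return " ".join(f"{key}{SEP}{value}" for key, value in obj.items())

content = to_content({"name": "server1", "version": "1.2.3"})
print(content)  # name__=__server1 version__=__1.2.3
```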
So that's what we put into solr. Then incoming queries are parsed and
re-mapped. So a query comes in as: "name:server*" and we remap that
to: "content:name__=__server*". This is why a leading wildcard search,
which is invalid lucene search syntax, works for Chef. A query like:
"name:*" is really sent to solr (and thus lucene) as:
"content:name__=__*".
Now to your question about NOT and AND. The thing about lucene search
is that it is built for natural language search. For natural language
search, term frequency and other scoring signals tend to be much more
valuable for good results than a strict boolean test of whether a term
is present. So NOT in lucene behaves as a filter but doesn't itself
return results (so a bare NOT query is not valid because there is
nothing to filter).
Similarly, AND can be confusing because you are scoring documents.
Just listing multiple terms will give "and"-ish behavior. Here are a
couple of links that give some detail and might explain this better
than I've been able to :)
https://lucidworks.com/blog/why-not-and-or-and-not/
https://stackoverflow.com/questions/17969461/not-operator-doesnt-work-in-query-lucene
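If it helps, the "NOT is only a filter" idea can be pictured with a toy model in Python. This is not how lucene is implemented and it leaves scoring out entirely; the `search` function and the sample documents are invented for illustration. Positive terms select the candidates, NOT terms remove some of them, and with no positive term there is nothing to remove from.

```python
# Toy model only: positive terms select candidates, NOT terms filter them.
def search(docs, must=(), must_not=()):
    # With no positive term there are no candidates to start from,
    # which mirrors a bare NOT query returning nothing.
    if not must:
        return []
    hits = [d for d in docs if all(t in d["terms"] for t in must)]
    return [d["id"] for d in hits if not any(t in d["terms"] for t in must_not)]

# Invented sample data.
docs = [
    {"id": 1, "terms": {"role:web", "env:prod"}},
    {"id": 2, "terms": {"role:web", "env:dev"}},
    {"id": 3, "terms": {"role:db", "env:prod"}},
]

print(search(docs, must=["role:web"], must_not=["env:dev"]))  # [1]
print(search(docs, must_not=["env:dev"]))                     # []
```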
Anyhow, hope this context is useful.
Best,
+ seth