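I. Structure of a search request

1. Choosing the search scope

2. Basic building blocks of a search request

3. Request-body search

4. Structure of the response

II. Queries and filters

1. match

2. term

3. query_string

III. Compound queries

1. bool query

2. bool filter

IV. Other queries and filters

1. range query and filter

2. prefix query and filter

3. wildcard query

4. exists filter

5. missing filter

6. Turning any query into a filter

V. Choosing the best query for the job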

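Figure 1 shows how ES executes a search request. The index in the figure has two shards, and each shard has one replica. After documents are located and scored, only the top 10 are fetched by default. A REST API search request goes to the node you are connected to. Based on the indices being queried, that node forwards the request to every relevant shard (either the primary or one of its replicas). Once enough sorting and ranking information has been collected from all shards, only the shards that hold the required documents are asked to return the actual content. This routing behavior is configurable; the default shown in Figure 1 is called query_then_fetch.

Figure 1: How a search request is routed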

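I. Structure of a search request

An ES search is either a JSON-document-based request or a URL-based request.

1. Choosing the search scope

All REST search requests use the _search REST endpoint, with either GET or POST. You can search the whole cluster, or narrow the scope by naming indices or types in the search URL: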

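# Search the whole cluster with no conditions
curl '172.16.1.127:9200/_search?pretty'
curl '172.16.1.127:9200/_all/_search?pretty'
curl '172.16.1.127:9200/*/_search?pretty'

# Search the get-together index with no conditions, like SQL's select * from get-together;
curl '172.16.1.127:9200/get-together/_search?pretty'

# ES 6 deprecates the concept of types, so this does the same as the previous command
curl '172.16.1.127:9200/get-together/_doc/_search?pretty'

# Search two indices, get-together and dbinfo, with no conditions
curl '172.16.1.127:9200/get-together,dbinfo/_doc/_search?pretty'

# Wildcard on index names: indices starting with get-toge, but not get-together itself
# (the original command wrote +get-toge*; the leading + is implied, and ES 6 no longer accepts it)
curl '172.16.1.127:9200/get-toge*,-get-together/_search?pretty'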

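As with a database, restrict the query to as few indices as possible for better performance. Every search request has to be sent to the shards of every index it targets (comparable to a full index scan in a DB), so the more indices you target, the more shards are involved.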
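As a small illustration of how the scope ends up in the URL, the sketch below joins index names with commas and falls back to a cluster-wide search when none are given. The helper name search_url and the ES_HOST variable are assumptions made for this example; they are not part of ES.

```shell
# Hypothetical helper: build a _search URL for zero or more index names.
# ES_HOST is an assumed variable; the default is the host used in this article.
ES_HOST="${ES_HOST:-172.16.1.127:9200}"

search_url() {
  if [ "$#" -eq 0 ]; then
    # No index given: search the whole cluster.
    printf '%s/_search?pretty\n' "$ES_HOST"
  else
    # Join the index names with commas: a,b,c
    indices=$(IFS=,; printf '%s' "$*")
    printf '%s/%s/_search?pretty\n' "$ES_HOST" "$indices"
  fi
}

search_url                      # whole cluster
search_url get-together         # one index
search_url get-together dbinfo  # two indices
```

The printed URL can be passed straight to curl, for example curl "$(search_url get-together)".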

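2. Basic building blocks of a search request

Compare with a SQL query:

select ...
  from ...
 where ...
 order by ...
 limit ...

where       <-> query
select ...  <-> _source
limit       <-> size + from
order by    <-> sort

The basic building blocks of a search request are:

query: configures the query and filter DSL that restricts the search, like the where clause of a SQL query.
size: the number of documents to return, like the row count in a SQL limit clause.
from: used together with size for paging, like the offset in a SQL limit clause. As the result set grows, fetching pages far from the start becomes expensive. (The deferred-join idea from SQL should carry over to ES: first search for the IDs on a given page, then fetch the fields by ID.)
_source: controls how the _source field is returned. By default the complete _source is returned, like select * in SQL. Configuring _source filters the returned fields.
sort: like the order by clause in SQL. The default sort is by document score.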

Here are a few simple examples.
(1) Return the second page of 10 results

# from is zero-based in ES
curl '172.16.1.127:9200/get-together/_search?from=10&size=10&pretty'

(2) Sort by date ascending and return the first 10 results

curl '172.16.1.127:9200/get-together/_search?sort=date:asc&pretty'

(3) Sort by date ascending and return only the title and date fields of the first 10 results

curl '172.16.1.127:9200/get-together/_search?sort=date:asc&_source=title,date&pretty'

(4) Match every document whose title contains "elasticsearch" (compared in lowercase) and return the results by date ascending

curl '172.16.1.127:9200/get-together/_search?sort=date:asc&q=title:elasticsearch&pretty'
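The paging arithmetic behind from and size can be sketched as below. The function names are made up for this example. The cost estimate assumes that each shard has to produce its own top from + size entries before the coordinating node can pick the requested page, which is why pages far from the start get expensive.

```shell
# Hypothetical helpers; the names are illustrative and not part of ES.

# from offset for a 1-based page number: (page - 1) * size
page_from() {
  echo $(( ($1 - 1) * $2 ))
}

# Rough upper bound on the entries the coordinating node has to sort for one
# page, assuming every shard returns its own top (from + size) entries.
# Arguments: shards from size
deep_page_cost() {
  echo $(( $1 * ($2 + $3) ))
}

page_from 2 10            # second page of 10 results
deep_page_cost 2 10 10    # 2 shards, from=10, size=10
deep_page_cost 2 10000 10 # the same page size, 1000 pages further in
```

With two shards, page 2 touches at most 40 entries, while a page at from=10000 touches about 20,020 even though only 10 documents are returned.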

3. Request-body search

The search requests so far were all URL-based. For more advanced searches, a request-body search gives you more flexibility and more options. ES recommends request-body search.

(1) Return the second page of 10 results

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  },
  "from": 10,
  "size": 10
}'

(2) Return specific fields

# Return only the name and date fields
curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  },
  "_source": [
    "name",
    "date"
  ]
}'

(3) Return fields using wildcards in _source

# Return the fields starting with location plus the date field, but not location.geolocation
# (includes/excludes are the current key names; the singular include/exclude forms are deprecated)
curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": [
      "location.*",
      "date"
    ],
    "excludes": [
      "location.geolocation"
    ]
  }
}'

(4) Sorting results

# Like SQL's order by created_on asc, name desc, _score
# name is a text field, so fielddata is enabled on it first to make it sortable
curl -XPOST "172.16.1.127:9200/get-together/_mapping/_doc?pretty" -H 'Content-Type: application/json' -d'{
  "properties": {
    "name": {
      "type": "text",
      "fielddata": true
    }
  }
}'

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "created_on": "asc"
    },
    {
      "name": "desc"
    },
    "_score"
  ]
}'

(5) Combining the basic building blocks

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 10,
  "_source": [
    "name",
    "organizer",
    "description"
  ],
  "sort": [
    {
      "created_on": "desc"
    }
  ]
}'

This is similar to the following SQL query:

select name, organizer, description
  from get-together
 order by created_on desc
 limit 0, 10;

Note that when a document has no value for one of the requested fields, the field name does not appear in the returned _source at all by default. This differs from SQL, where the column is still returned with a null.

4. Structure of the response

Now look at the data structure an ES search returns.

curl '172.16.1.127:9200/_search?q=title:elasticsearch&_source=title,date&pretty'

The response:

{
  "took" : 13,                              # milliseconds the query took
  "timed_out" : false,                      # whether the search timed out
  "_shards" : {
    "total" : 28,                           # shards searched
    "successful" : 28,                      # shards that succeeded
    "skipped" : 0,                          # shards that were skipped
    "failed" : 0                            # shards that failed
  },
  "hits" : {
    "total" : 7,                            # number of matching documents
    "max_score" : 1.0128567,                # highest document score
    "hits" : [                              # array of matching documents
      {
        "_index" : "get-together",          # index the document belongs to
        "_type" : "_doc",                   # type the document belongs to
        "_id" : "103",                      # document ID
        "_score" : 1.0128567,               # relevance score
        "_routing" : "2",                   # routing value that decides the document's shard
        "_source" : {                       # the requested _source fields
          "date" : "2013-04-17T19:00",
          "title" : "Introduction to Elasticsearch"
        }
      },
      {
        "_index" : "get-together",
        "_type" : "_doc",
        "_id" : "105",
        "_score" : 1.0128567,
        "_routing" : "2",
        "_source" : {
          "date" : "2013-07-17T18:30",
          "title" : "Elasticsearch and Logstash"
        }
      },
      ...
    ]
  }
}

If neither the document's _source nor the individual fields are stored, there is no way to get the values back from ES!

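II. Queries and filters

Queries and filters both work like the where clause of a SQL query: they select documents by a condition. They differ in scoring and in search performance. A query computes a score for the matching terms, while a filter only answers "does this document match?" with a simple yes or no. Figure 2 shows the main differences between queries and filters.

Figure 2: Because no score is computed, a filter needs less processing and can be cached

Because of this difference, filters can be faster than ordinary queries and can also be cached.

1. match

(1) match_all
Matches all documents, like a SQL query with no where condition.

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_all": {}
  }
}'

In ES 6, every document returned by a match_all query has a _score of 1.0.

(2) match
Matches a condition on a field, roughly like where column='xxx' in SQL. The following query finds documents with "hadoop" in the title:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match": {
      "title": "hadoop"
    }
  }
}'

The match query is case-insensitive here: with the default analyzer, both the indexed terms and the input text are lowercased before they are compared. A match query also returns a relevance _score for each document.

By default, the match query uses the OR operator. For example, if you search for the text "Elasticsearch Denver", ES searches for "Elasticsearch OR Denver", which matches both "Elasticsearch Amsterdam" and "Denver Clojure". The following query returns only results that contain both "Elasticsearch" and "Denver":

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match": {
      "name": {
        "query": "Elasticsearch Denver",
        "operator": "and"
      }
    }
  }
}'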
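The difference between the two operators can be imitated locally. The sketch below is only a model: analyze is a crude stand-in for the analyzer (lowercase, split on spaces), and match_text is a made-up helper, not an ES command.

```shell
# Crude stand-in for analysis: lowercase the text and emit one term per line.
analyze() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' '\n'
}

# match_text OPERATOR QUERY FIELD_TEXT
# Exit status 0 if the field text matches the query under "or" or "and".
match_text() {
  op=$1
  field_terms=$(analyze "$3")
  hits=0
  total=0
  for term in $(analyze "$2"); do
    total=$((total + 1))
    # -x: whole line must match, -F: treat the term as a fixed string
    if printf '%s\n' "$field_terms" | grep -qxF -e "$term"; then
      hits=$((hits + 1))
    fi
  done
  if [ "$op" = "and" ]; then
    [ "$total" -gt 0 ] && [ "$hits" -eq "$total" ]
  else
    [ "$hits" -gt 0 ]
  fi
}

match_text or  "Elasticsearch Denver" "Elasticsearch Amsterdam" && echo "or: match"
match_text and "Elasticsearch Denver" "Elasticsearch Amsterdam" || echo "and: no match"
match_text and "Elasticsearch Denver" "Denver Elasticsearch Meetup" && echo "and: match"
```

Under "or" a single shared term is enough; under "and" every query term has to be present in the field, in any order.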

(3) match_phrase
The following query finds documents whose name field contains the phrase "enterprise london", allowing one word between "enterprise" and "london":

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_phrase": {
      "name": {
        "query": "enterprise london",
        "slop": 1
      }
    }
  },
  "_source": [
    "name",
    "description"
  ]
}'

(4) phrase_prefix
In the example below, the phrase prefix is "Elasticsearch den". ES uses "den" for prefix matching and looks in the name field for values with a term that starts with "den". max_expansions sets the maximum number of terms the prefix may expand to. Because the expansion can produce a very large set, the number of expansions should be limited.

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "match_phrase_prefix": {
      "name": {
        "query": "Elasticsearch den",
        "max_expansions": 1
      }
    }
  },
  "_source": [
    "name"
  ]
}'

(5) multi_match
Matches several terms across several fields, similar to SQL's where name like '%elasticsearch%' or name like '%hadoop%' or description like '%elasticsearch%' or description like '%hadoop%':

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "multi_match": {
      "query": "elasticsearch hadoop",
      "fields": [
        "name",
        "description"
      ]
    }
  }
}'

Just as a match query can be turned into a phrase or phrase_prefix query, a multi_match query can be turned into a phrase or phrase_prefix query by setting the type key. Apart from searching several fields instead of one, multi_match can be used like a match query.

2. term

The term query and filter let you specify the document field and the term to search for. Note that the search term is not analyzed, so a term in the document must match it exactly for the document to be returned.
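To make "not analyzed" concrete, the sketch below compares a term-style lookup with a match-style lookup against one indexed term. It is a local model under a stated assumption: the field was indexed with a lowercasing analyzer, so the index holds "elasticsearch". The function names are invented for the example.

```shell
# Assumption: the indexed term is lowercase because the field was analyzed
# with a lowercasing analyzer when the document was indexed.
indexed_term="elasticsearch"

# term: the input is compared as-is, with no analysis.
term_query() {
  [ "$1" = "$indexed_term" ]
}

# match: the input is lowercased first (a stand-in for analysis), then compared.
match_query() {
  [ "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" = "$indexed_term" ]
}

term_query  "Elasticsearch" || echo "term 'Elasticsearch': no hit"
term_query  "elasticsearch" && echo "term 'elasticsearch': hit"
match_query "Elasticsearch" && echo "match 'Elasticsearch': hit"
```

This is why a term search for a capitalized word can come back empty on an analyzed text field even though a match search for the same word succeeds.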

(1) term query

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "term": {
      "tags": "elasticsearch"
    }
  },
  "_source": [
    "name",
    "tags"
  ]
}'

(2) term filter
Like the term query, the term filter restricts the results to documents that contain a specific term, but without computing a score.

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "tags": "elasticsearch"
        }
      }
    }
  }
}'

(3) terms query
Like the term query, the terms query searches a document field, but for several terms at once. The following query finds documents tagged "jvm" or "hadoop".

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "terms": {
      "tags": [
        "jvm",
        "hadoop"
      ]
    }
  },
  "_source": [
    "name",
    "tags"
  ]
}'

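You can also require a minimum number of matching terms in each document by setting the minimum_should_match parameter. The terms query itself does not accept this parameter in ES 6, and minimum_should_match only counts should clauses, so list each term as a should clause of a bool query. The following query returns documents that carry at least two of the three tags:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "minimum_should_match": 2,
      "should": [
        { "term": { "tags": "jvm" } },
        { "term": { "tags": "hadoop" } },
        { "term": { "tags": "lucene" } }
      ]
    }
  }
}'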
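The counting rule behind minimum_should_match can be sketched locally as follows. Both functions are hypothetical helpers written for this example; they model the rule on space-separated tag lists and do not call ES.

```shell
# count_matching_tags "DOC_TAGS" "WANTED_TAGS"
# Both arguments are space-separated lists. Prints how many of the wanted
# tags the document carries.
count_matching_tags() {
  count=0
  for wanted in $2; do
    for tag in $1; do
      if [ "$tag" = "$wanted" ]; then
        count=$((count + 1))
        break
      fi
    done
  done
  echo "$count"
}

# min_should_match "DOC_TAGS" "WANTED_TAGS" MIN
# Exit status 0 if at least MIN of the wanted tags are present.
min_should_match() {
  [ "$(count_matching_tags "$1" "$2")" -ge "$3" ]
}

min_should_match "jvm hadoop big-data" "jvm hadoop lucene" 2 && echo "doc 1 matches"
min_should_match "lucene search"       "jvm hadoop lucene" 2 || echo "doc 2 does not match"
```

The first document carries two of the three wanted tags and passes; the second carries only one and is rejected.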

3. query_string

The following queries find documents that contain "nosql". The two are equivalent: the first is sent in the URL, the second in the request body:

curl -XGET '172.16.1.127:9200/get-together/_search?q=nosql&pretty'

curl -XPOST '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "query_string": {
      "query": "nosql"
    }
  }
}'

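In older versions, the query_string query searched the _all field by default, a field built by combining all other fields. In ES 6 the _all field is disabled by default for newly created indices, and query_string falls back to searching all fields (the index.query.default_field setting, which defaults to *). Either way, you can set the field with default_field: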

curl -XPOST '172.16.1.127:9200/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "nosql"
    }
  }
}'

You can also run the query on several fields, in which case use fields:

curl -XPOST '172.16.1.127:9200/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "query_string": {
      "fields": ["description", "tags"],
      "query": "nosql"
    }
  }
}'

The following query finds all documents with "nosql" in the name, but excludes those with "mongodb" in the description:

curl -XPOST '172.16.1.127:9200/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "query_string": {
      "query": "name:nosql AND -description:mongodb"
    }
  }
}'

The following command finds all documents tagged "search" or "lucene" that were created between 1999 and 2001:

curl -XPOST '172.16.1.127:9200/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "query_string": {
      "query": "(tags:search OR tags:lucene) AND (created_on:[1999-01-01 TO 2001-01-01])"
    }
  }
}'

Recommended replacements for the query_string query include the term, terms, match, and multi_match queries.

III. Compound queries

1. bool query

The bool query combines any number of queries into a single query. Its clauses state which parts must match, should match, or must not (must_not) match the data in the ES index.

In the example below, the attendees field must contain "david", should contain "clint" or "andy", and date must be on or after 2013-06-30. minimum_should_match is the minimum number of should clauses a document has to match to be returned. Its default value has some hidden behavior: if a must clause is specified, minimum_should_match defaults to 0; if no must clause is specified, it defaults to 1.

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "attendees": "david"
          }
        }
      ],
      "should": [
        {
          "term": {
            "attendees": "clint"
          }
        },
        {
          "term": {
            "attendees": "andy"
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "date": {
              "lt": "2013-06-30T00:00"
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}'

The query can be rewritten as follows. It matches the same documents as the previous query but uses only the must option of bool, so it is shorter.

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "attendees": "david"
          }
        },
        {
          "range": {
            "date": {
              "gte": "2013-06-30T00:00"
            }
          }
        },
        {
          "terms": {
            "attendees": [
              "clint",
              "andy"
            ]
          }
        }
      ]
    }
  }
}'

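2. bool filter

The bool filter behaves much like the bool query, except that it combines filters. In ES 6 it is simply a bool query placed inside a filter clause. In that filter context, at least one should clause has to match by default (an effective minimum_should_match of 1), even when a must clause is present. The older standalone bool filter did not support the minimum_should_match property at all and always used 1.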

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "attendees": "david"
              }
            }
          ],
          "should": [
            {
              "term": {
                "attendees": "clint"
              }
            },
            {
              "term": {
                "attendees": "andy"
              }
            }
          ],
          "must_not": [
            {
              "range": {
                "date": {
                  "lt": "2013-06-30T00:00"
                }
              }
            }
          ]
        }
      }
    }
  }
}'

IV. Other queries and filters

1. range query and filter

(1) Query

# where created_on > 2012-06-01 and created_on < 2012-09-01
curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "range": {
      "created_on": {
        "gt": "2012-06-01",
        "lt": "2012-09-01"
      }
    }
  }
}'

(2) Filter

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "created_on": {
            "gt": "2012-06-01",
            "lt": "2012-09-01"
          }
        }
      }
    }
  }
}'

The range query also supports string ranges. To find documents whose name falls between "c" and "e", use the following search:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "range": {
      "name": {
        "gt": "c",
        "lt": "e"
      }
    }
  }
}'

When you use a range query, consider carefully whether a filter would be the better choice. Whether a document falls inside the range is a binary match ("yes, the document is in the range" or "no, it is not"), so a range condition does not have to be a query. For better performance it should be a filter. If you are unsure whether to use a query or a filter, use a filter. In 99% of use cases, a range filter is the right choice.

2. prefix query and filter

The prefix query and filter search for terms by a given prefix. The prefix is not analyzed before the search. For example, to find all documents in the index whose title starts with "liber", use the following query:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "prefix": {
      "title": "liber"
    }
  }
}'

The same can be done with a filter:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d '{
  "query": {
    "bool": {
      "filter": {
        "prefix": {
          "title": "liber"
        }
      }
    }
  }
}'

Because the prefix is not analyzed, the prefix query and filter are case-sensitive.

3. wildcard query

# Create the index by adding two documents
curl -XPOST '172.16.1.127:9200/wildcard-test/_doc/1?pretty' -H 'Content-Type: application/json' -d '{
  "title":"The Best Bacon Ever"
}'

curl -XPOST '172.16.1.127:9200/wildcard-test/_doc/2?pretty' -H 'Content-Type: application/json' -d '{
  "title":"How to raise a barn"
}'

# "ba*n" matches both bacon and barn
curl '172.16.1.127:9200/wildcard-test/_search?pretty' -H 'Content-Type: application/json' -d'{
  "query": {
    "wildcard": {
      "title": {
        "wildcard": "ba*n"
      }
    }
  }
}'

# "ba?n" matches only barn, not bacon
curl '172.16.1.127:9200/wildcard-test/_search?pretty' -H 'Content-Type: application/json' -d'{
  "query": {
    "wildcard": {
      "title": {
        "wildcard": "ba?n"
      }
    }
  }
}'

Keep in mind that the wildcard query is not as lightweight as match and other queries. The earlier a wildcard (* or ?) appears in the search term, the more work ES has to do to find the matches.
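The two wildcards behave like the * and ? of shell patterns, so the matching rule can be tried locally without a cluster. The sketch below uses a case pattern; wildcard_match is a made-up helper. One caveat: shell patterns also understand [...] character classes, which are not part of the wildcard syntax described here.

```shell
# wildcard_match PATTERN TERM
# Exit status 0 if the whole term matches the pattern.
# * stands for any run of characters (including none), ? for exactly one.
wildcard_match() {
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}

for term in bacon barn; do
  for pattern in 'ba*n' 'ba?n'; do
    if wildcard_match "$pattern" "$term"; then
      echo "$pattern matches $term"
    fi
  done
done
```

"ba?n" rejects bacon because ? consumes exactly one character, leaving "bacon" one character too long.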

4. exists filter

The exists filter keeps only the documents that have a value in a specific field:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d'{
  "query": {
    "bool": {
      "filter": {
        "exists": {
          "field": "location_event.geolocation"
        }
      }
    }
  }
}'

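5. missing filter

The missing filter finds documents that have no value in a field. In older versions it could also treat the default value set in the mapping (the null_value) as missing. ES 6 no longer has a dedicated missing filter; the same result is obtained by putting an exists clause inside must_not. Note that an explicit null in a field mapped with null_value is indexed as that value, so exists treats the field as present. To find documents that lack the reviews field, use the following filter:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d'{
  "query": {
    "bool": {
      "must_not": {
        "exists": {
          "field": "reviews"
        }
      }
    }
  }
}'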

6. Turning any query into a filter

ES lets you turn any query into a filter. Older versions did this with a dedicated query filter; in ES 6 you place the query inside the filter clause of a bool query. For example, a query_string query that matches the name "Elasticsearch" can be turned into a filter like this:

curl '172.16.1.127:9200/get-together/_search?pretty' -H 'Content-Type: application/json' -d'{
  "query": {
    "bool": {
      "filter": {
        "query_string": {
          "query": "name:\"Elasticsearch\""
        }
      }
    }
  }
}'

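V. Choosing the best query for the job

Table 1 is a guide to which query to use in common ES use cases.

Use case	Query type to use
You want to take user input from a Google-like interface and search documents based on it	If you want to support +/- or searching in specific fields, use the simple_query_string query
You want to treat the input as a phrase and find documents containing it, possibly with some gaps (slop) between the words	To find phrases similar to what the user searched for, use the match_phrase query with some slop
You want to search for a single keyword in a not_analyzed field (a keyword field in ES 6) and know exactly how the word should appear	Use the term query, because the search term is not analyzed
You want to combine many different search requests or different types of searches into a single search	Use the bool query to combine any number of subqueries into one query
You want to search for specific words across several fields of a document	Use the multi_match query, which behaves like the match query but searches several fields
You want a single search to return all documents	Use the match_all query to return all documents in one search
You want to search a field for values within a certain range	Use the range query to find documents whose values fall within the range
You want to search a field for values that start with a specific string	Use the prefix query to find terms that start with the given string
You want to offer single-keyword autocomplete based on what the user has typed so far	Use the prefix query: send what the user has typed and get the matches that start with that text
You want to find all documents that have no value in a specific field	Use the missing filter (in ES 6, an exists clause inside must_not) to find documents that lack the field

Table 1: Which type of query to use in common use cases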

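Copyright notice:

This article was compiled by 大數據技術與架構 under exclusive authorization from the original author. Reposting without the original author's permission will be pursued as infringement.

Editor | 冷眼丶

WeChat public account | import_bigdata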


Elasticsearch by Analogy, "Outclass Your Peers" Series: Search

一、搜索請求的結(jié)構(gòu)

1. 確定搜索范圍

2. 搜索請求的基本模塊

3. 基于請求主體的搜索請求

4. 回復(fù)的結(jié)構(gòu)

二、查詢和過濾器

1. match

2. term

3. query_string

三、復(fù)合查詢

1. bool查詢

2. bool過濾器

四、其它查詢和過濾器

1. range查詢和過濾器

2. prefix查詢和過濾器

3. wildcard查詢

4. exists過濾器

5. missing過濾器

6. 將任何查詢轉(zhuǎn)變?yōu)檫^濾器

五、為任務(wù)選擇最好的查詢

一、搜索請求的結(jié)構(gòu)

三、復(fù)合查詢

四、其它查詢和過濾器

五、為任務(wù)選擇最好的查詢