ElasticSearch Document Queries: Basic Operation Examples

2023-09-12 18:01:43

Querying Documents & Basic Operations

To make this easier to follow, all examples in this section reuse the index from the previous section.

Single document by ID

GET class_1/_doc/1

Result:

{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "_seq_no" : 4,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "l",
    "num" : 6
  }
}

Multiple documents by ID (batch)

GET class_1/_mget
{
"ids":[1,2,3]
}

Response:

{
  "docs" : [
    {
      "_index" : "class_1",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 4,
      "_seq_no" : 4,
      "_primary_term" : 3,
      "found" : true,
      "_source" : {
        "name" : "l",
        "num" : 6
      }
    },
    {
      "_index" : "class_1",
      "_type" : "_doc",
      "_id" : "2",
      "found" : false
    },
    {
      "_index" : "class_1",
      "_type" : "_doc",
      "_id" : "3",
      "_version" : 3,
      "_seq_no" : 10,
      "_primary_term" : 4,
      "found" : true,
      "_source" : {
        "num" : 9,
        "name" : "e",
        "age" : 9,
        "desc" : [
          "hhhh"
        ]
      }
    }
  ]
}
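
When an _mget response is handled in application code, it often helps to separate the documents that were found from the IDs that were not. The following is a minimal Python sketch, assuming the response has already been parsed into a dict shaped like the one above. The name split_mget_docs is a hypothetical helper, not part of any Elasticsearch client.

```python
from typing import Any, Dict, List, Tuple


def split_mget_docs(response: Dict[str, Any]) -> Tuple[Dict[str, dict], List[str]]:
    """Split an _mget response into found sources (keyed by ID) and missing IDs."""
    found: Dict[str, dict] = {}
    missing: List[str] = []
    for doc in response.get("docs", []):
        if doc.get("found"):
            found[doc["_id"]] = doc.get("_source", {})
        else:
            missing.append(doc["_id"])
    return found, missing


# Trimmed copy of the response shown above.
sample = {
    "docs": [
        {"_index": "class_1", "_id": "1", "found": True,
         "_source": {"name": "l", "num": 6}},
        {"_index": "class_1", "_id": "2", "found": False},
        {"_index": "class_1", "_id": "3", "found": True,
         "_source": {"num": 9, "name": "e", "age": 9, "desc": ["hhhh"]}},
    ]
}

found, missing = split_mget_docs(sample)
print(sorted(found))  # IDs that exist in the index
print(missing)        # IDs that were requested but not found
```

With the sample data, document 2 ends up in the missing list, mirroring the "found" : false entry in the response.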

Checking whether a document exists (by ID)

HEAD class_1/_doc/1

Response:

200 - OK

HEAD class_1/_doc/1000

Response:

404 - Not Found

Retrieving only some fields

GET class_1/_doc/1?_source_includes=name

Response:

{
  "_index" : "class_1",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 4,
  "_seq_no" : 4,
  "_primary_term" : 3,
  "found" : true,
  "_source" : {
    "name" : "l"
  }
}

As you can see, only the name field is returned. Those are the basic operations; next, let's look at conditional queries.

Querying Documents & Conditional Queries

How complex a query is depends on the conditions attached to it, just like writing SQL. Let's walk through conditional queries in ES step by step.

No conditions

GET class_1/_search

Response:

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : 1.0,
        "_source" : {
          "name" : "b",
          "num" : 6
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iGFt-4UBECmbBdQAnVJe",
        "_score" : 1.0,
        "_source" : {
          "name" : "g",
          "age" : 8
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iWFt-4UBECmbBdQAnVJg",
        "_score" : 1.0,
        "_source" : {
          "name" : "h",
          "age" : 9
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "imFt-4UBECmbBdQAnVJg",
        "_score" : 1.0,
        "_source" : {
          "name" : "i",
          "age" : 10
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "name" : "f",
          "age" : 10,
          "num" : 10
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "RWlfBIUBDuA8yW5cu9wu",
        "_score" : 1.0,
        "_source" : {
          "name" : "一年級",
          "num" : 20
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "l",
          "num" : 6
        }
      }
    ]
  }
}

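All the data in the class_1 index was added in the previous section. Note that we can also search several indices in one request and get a combined result; just separate the index names with commas.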

GET class_1,class_2,class_3/_search

Response:

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 9,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : 1.0,
        "_source" : {
          "name" : "b",
          "num" : 6
        }
      },
      {
        "_index" : "class_2",
        "_type" : "_doc",
        "_id" : "RWlfBIUBDuA8yW5cu9wu",
        "_score" : 1.0,
        "_source" : {
          "name" : "一年級",
          "num" : 20
        }
      },
      ....
    ]
  }
}

The response includes data from the class_2 index, merged into the same result.

Explanation of the response fields

Some readers may not be familiar with the fields in the response, so here is an explanation of each:

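{
    "took":"time taken by the query, in milliseconds",
    "timed_out":"whether the query timed out",
    "_shards":{
        "total":"total number of shards",
        "successful":"number of shards that executed successfully",
        "skipped":"number of shards that were skipped",
        "failed":"number of shards that failed"
    },
    "hits":{
        "total":{
            "value":"number of documents matching the query",
            "relation":"how to read value (eq: the count is exact / gte: the count is a lower bound)"
        },
        "max_score":"highest relevance score among the hits",
        "hits":[
            {
                "_index":"index of the hit",
                "_id":"ID of the hit",
                "_score":"relevance score of the hit",
                "_source":"original document of the hit"
            }
        ]
    }
}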
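
To show how these fields might be read in practice, here is a small Python sketch that condenses a parsed search response into one line. It assumes the response shape shown above, where hits.total is an object with value and relation. The name summarize_search is a hypothetical helper used only for illustration.

```python
from typing import Any, Dict


def summarize_search(response: Dict[str, Any]) -> str:
    """Build a one-line summary from the fields explained above."""
    total = response["hits"]["total"]
    # relation "gte" means value is only a lower bound on the real count.
    prefix = "" if total["relation"] == "eq" else "at least "
    shards = response["_shards"]
    ids = [hit["_id"] for hit in response["hits"]["hits"]]
    return (f"{prefix}{total['value']} hits in {response['took']} ms, "
            f"{shards['successful']}/{shards['total']} shards ok, ids={ids}")


# Trimmed response in the same shape as the examples in this article.
sample = {
    "took": 5,
    "timed_out": False,
    "_shards": {"total": 3, "successful": 3, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 8, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_index": "class_1", "_id": "iWFt-4UBECmbBdQAnVJg", "_score": 1.0},
            {"_index": "class_1", "_id": "imFt-4UBECmbBdQAnVJg", "_score": 1.0},
        ],
    },
}

summary = summarize_search(sample)
print(summary)
```

Note that hits.total.value counts every matching document, while hits.hits holds only the page of results that was actually returned.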

Now let's look at queries with conditions.

Basic pagination

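Basic syntax: ES controls basic pagination through the size and from parameters.

  • from: the number of documents to skip
  • size: the number of documents to return

Here are some examples: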

URL parameters

GET class_1/_search?from=2&size=2

Response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 8,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "iWFt-4UBECmbBdQAnVJg",
        "_score" : 1.0,
        "_source" : {
          "name" : "h",
          "age" : 9
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "imFt-4UBECmbBdQAnVJg",
        "_score" : 1.0,
        "_source" : {
          "name" : "i",
          "age" : 10
        }
      }
    ]
  }
}

Body parameters

GET class_1/_search
{
    "from" : 2,
    "size" : 2
}

The result is the same as above.
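
Applications usually think in page numbers rather than offsets. As a minimal sketch, the from and size values above can be derived from a 1-based page number as follows. The function name page_to_from_size is a hypothetical helper, and the returned dict is only the request body, not a call to any client.

```python
from typing import Dict


def page_to_from_size(page: int, page_size: int) -> Dict[str, int]:
    """Convert a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must both be >= 1")
    # from is the number of documents to skip before the requested page.
    return {"from": (page - 1) * page_size, "size": page_size}


body = page_to_from_size(page=2, page_size=2)
print(body)  # same values as the request body shown above
```

Page 2 with two documents per page skips the first two documents, which corresponds to the from=2, size=2 request in the example. Keep in mind that, with default index settings, Elasticsearch rejects requests whose from + size exceeds 10,000 (the index.max_result_window setting), so this style of paging suits shallow pages only.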

Full-text query on a single field

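This one should be familiar from the previous sections. Use query.match to search: match is suited to full-text search against a single field. For full-text fields, match analyzes the query string with the field's analyzer and searches on the resulting terms. For exact-value fields, match can also perform exact matching. When the query is a phrase, match splits it into terms and searches for each term; by default, a document matches if it contains any one of them.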

GET class_1/_search
{
  "query": {
    "match": {
      "name":"i"
    }
  }
}

Response:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3862942,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "imFt-4UBECmbBdQAnVJg",
        "_score" : 1.3862942,
        "_source" : {
          "name" : "i",
          "age" : 10
        }
      }
    ]
  }
}

Phrase query on a single field

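Use query.match_phrase to search. Unlike match, it treats the query as one phrase: the text is still analyzed, but a document matches only if it contains all the terms, adjacent to each other and in the same order. That may sound abstract in words alone, so here is an example that makes the difference clear: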

First, add some data:

PUT class_1/_bulk
{ "create":{  } }
{"name":"I eat apple so haochi1~","num": 1}
{ "create":{  } }
{ "name":"I eat apple so zhen haochi2~","num": 1}
{ "create":{  } }
{"name":"I eat apple so haochi3~","num": 1}

Suppose we have these sentences, and the requirement is to match only the sentence I eat apple so zhen haochi2~.

Result with match (query split into terms)

GET class_1/_search
{
  "query": {
    "match": {
      "name": "apple so zhen"
    }
  }
}

Response:

{
  "took" : 15,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 2.2169428,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 2.2169428,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 1.505254,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 1.505254,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      }
    ]
  }
}

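As the result shows, all of the sentences we just added were returned, which is not quite what we wanted. Looking at the scores, "_score" : 2.2169428 is the highest and ranks first. Its sentence is I eat apple so zhen haochi2~, which means it is the best match, and it is exactly the sentence we want.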

Result with match_phrase (phrase query)

GET class_1/_search
{
  "query": {
    "match_phrase": {
      "name": "apple so zhen"
    }
  }
}

Result:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 2.2169428,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 2.2169428,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

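The result is as expected: only the sentence we wanted is returned. So why did match return everything? This is the term splitting mentioned earlier: match first splits the query apple so zhen into the terms apple, so and zhen, and a document matches if it contains any one of them, so every document containing apple is returned.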
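
The difference between the two queries can be reproduced with a small Python simulation. This is only a sketch under stated assumptions: tokenize is a rough stand-in for the default analyzer (lowercase, split on non-alphanumeric characters), and match and match_phrase here are hypothetical local functions that imitate the matching rules, not Elasticsearch's scoring or its real implementation.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Rough stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match(doc: str, query: str) -> bool:
    """match (default OR behaviour): any query term present in the document."""
    doc_terms = set(tokenize(doc))
    return any(term in doc_terms for term in tokenize(query))


def match_phrase(doc: str, query: str) -> bool:
    """match_phrase: all query terms present, adjacent and in the same order."""
    d, q = tokenize(doc), tokenize(query)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


docs = [
    "I eat apple so haochi1~",
    "I eat apple so zhen haochi2~",
    "I eat apple so haochi3~",
]
query = "apple so zhen"

match_hits = [d for d in docs if match(d, query)]
phrase_hits = [d for d in docs if match_phrase(d, query)]
print(len(match_hits))  # every sentence shares at least one term with the query
print(phrase_hits)      # only the sentence containing the exact phrase
```

In this simulation all three sentences pass match because each contains apple and so, while only the second sentence contains the three terms consecutively.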

Of course, real-world requirements are far more complex than this; the example is only meant to show the difference. Let's continue.

Full-text query on multiple fields

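This is equivalent to running a match query against several fields. Note that the type of the query value must be compatible with the types of the fields; otherwise a type exception is raised.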

GET class_1/_search
{
  "query": {
    "multi_match": {
      "query": "apple",
      "fields": ["name","desc"]
    }
  }
}

Response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 0.752627,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 0.752627,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 0.7389809,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

Range query

Use range for range queries; it applies to fields such as numbers and dates.

GET class_1/_search
{
  "query": {
    "range": {
      "num": {
        "gt": 5,
        "lt": 10
      }
    }
  }
}

Response:

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "h2Fg-4UBECmbBdQA6VLg",
        "_score" : 1.0,
        "_source" : {
          "name" : "b",
          "num" : 6
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "l",
          "num" : 6
        }
      }
    ]
  }
}
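
When range queries are built in application code, the bounds are often optional. The following Python sketch assembles the request body for a range query from whichever bounds are supplied. The function name range_query is a hypothetical helper; gt, gte, lt and lte are the bound names the range query itself accepts.

```python
import json
from typing import Any, Dict, Optional


def range_query(field: str, *, gt: Optional[Any] = None, gte: Optional[Any] = None,
                lt: Optional[Any] = None, lte: Optional[Any] = None) -> Dict[str, Any]:
    """Build a _search body with a range query, keeping only the bounds that were given."""
    candidates = {"gt": gt, "gte": gte, "lt": lt, "lte": lte}
    bounds = {name: value for name, value in candidates.items() if value is not None}
    if not bounds:
        raise ValueError("at least one bound is required")
    return {"query": {"range": {field: bounds}}}


body = range_query("num", gt=5, lt=10)
print(json.dumps(body, indent=2))  # same body as the request above
```

Using gt and lt excludes the endpoints, which is why documents with num equal to 10 do not appear in the response above; gte and lte would include them.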

Exact query on a single field

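Use term for exact queries on non-analyzed fields. Note that for analyzed fields, the query will not match even if the value is a phrase identical to the stored text, because term does not analyze the query value while the index holds the individual analyzed terms.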

GET class_1/_search
{
 "query": {
   "term": {
     "num": {
       "value": "9"
     }
   }
 }
}

Response:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        }
      }
    ]
  }
}
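
The caveat about analyzed fields is easier to see with a small Python simulation. It is a sketch under assumptions: analyze is a rough stand-in for the default analyzer (lowercase, split on non-alphanumeric characters), and the two functions are hypothetical models of how term compares its value against an analyzed field and a non-analyzed field, not real Elasticsearch code.

```python
import re
from typing import List


def analyze(text: str) -> List[str]:
    """Rough stand-in for the default analyzer: lowercase, split on non-alphanumerics."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_on_analyzed_field(stored: str, value: str) -> bool:
    """term compares the raw value against each indexed term of an analyzed field."""
    return value in analyze(stored)


def term_on_exact_field(stored: str, value: str) -> bool:
    """For a non-analyzed field the whole stored value is the single indexed term."""
    return stored == value


stored = "I eat apple so haochi1~"

print(term_on_analyzed_field(stored, stored))   # the full phrase is not an indexed term
print(term_on_analyzed_field(stored, "apple"))  # a single indexed term matches
print(term_on_analyzed_field(stored, "Apple"))  # term does not lowercase the value
print(term_on_exact_field(stored, stored))      # exact fields match the whole value
```

This is why term works well for numbers and other exact values, as in the num example above, but is usually the wrong tool for sentences stored in analyzed fields.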

Exact query on a field & multiple values

Same as term, except that it can match several values for one field; a document matches if it satisfies any one of them.

GET class_1/_search
{
 "query": {
   "terms": {
     "num": [
      9,
      1
     ]
   }
 }
}

Response:

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "b8fcCoYB090miyjed7YE",
        "_score" : 1.0,
        "_source" : {
          "name" : "I eat apple so haochi1~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "ccfcCoYB090miyjed7YE",
        "_score" : 1.0,
        "_source" : {
          "name" : "I eat apple so haochi3~",
          "num" : 1
        }
      },
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "cMfcCoYB090miyjed7YE",
        "_score" : 1.0,
        "_source" : {
          "name" : "I eat apple so zhen haochi2~",
          "num" : 1
        }
      }
    ]
  }
}

Querying documents that contain a field

To find which documents in the current index contain a given field, ES provides the exists query.

GET class_1/_search
{
  "query": {
    "exists": {
      "field": "desc"
    }
  }
}

Response:

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "class_1",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "num" : 9,
          "name" : "e",
          "age" : 9,
          "desc" : [
            "hhhh"
          ]
        }
      }
    ]
  }
}
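
As a rough illustration of what exists checks, the Python sketch below applies a simplified version of the rule to a few _source objects taken from this article: a field counts as present only if it holds at least one non-null value, so a missing field, null, or an empty array does not count. The name has_value is a hypothetical helper, and the rule is a simplification that ignores mapping settings.

```python
from typing import Any, Dict


def has_value(source: Dict[str, Any], field: str) -> bool:
    """Simplified exists rule: the field holds at least one non-null value."""
    value = source.get(field)
    if value is None:
        return False
    if isinstance(value, list):
        return any(item is not None for item in value)
    return True


# _source objects copied from the responses above, keyed by document ID.
sources = {
    "1": {"name": "l", "num": 6},
    "3": {"num": 9, "name": "e", "age": 9, "desc": ["hhhh"]},
    "4": {"name": "f", "age": 10, "num": 10},
}

with_desc = [doc_id for doc_id, src in sources.items() if has_value(src, "desc")]
print(with_desc)  # only the document that carries a desc value
```

Only document 3 carries a desc value, which agrees with the single hit in the response above.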

Conclusion

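This section covered the document query APIs in ES. There is a lot of material here, so we will continue in the next section; digest this much for now. Don't try to memorize the APIs: type them out a few times and they will stick. The key is to use them often and to summarize what you learn.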
