Elasticsearch Search: Basic Search (Chapter 1) #2

BlackHole1 · 2018-12-27T08:11:24Z

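Elasticsearch Search: Basic Search (Chapter 1)

This post can also be read as a set of notes on the book 《从Lucene到Elasticsearch:全文检索实战》.

Preparation

You need to install Elasticsearch, Kibana, and elasticsearch-analysis-ik.

The installation steps are not covered here. (After installing, remember to restart Elasticsearch.)

Once the restart has finished, open Dev Tools in Kibana, enter the DSL below, and run it: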

PUT books
{
  "settings": {
    "number_of_replicas": 1,
    "number_of_shards": 3
  },
  "mappings": {
    "IT": {
      "properties": {
        "id": {
          "type": "long"
        },
        "title": {
          "type": "text",
          "analyzer": "ik_max_word"
        },
        "language": {
          "type": "keyword"
        },
        "author": {
          "type": "keyword"
        },
        "price": {
          "type": "double"
        },
        "year": {
          "type": "date",
          "format": "yyyy-MM-dd"
        },
        "description": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
    }
  }
}

Once the request has succeeded, download the books.json file and import it. If your Elasticsearch version is earlier than 6.0, import books.json with the following command:

curl -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json

If your Elasticsearch version is 6.0 or later, the request must declare its content type, so use the following command instead:

curl -H "Content-Type: application/json" -XPOST "http://localhost:9200/_bulk?pretty" --data-binary @books.json
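The contents of books.json are not reproduced in this post, but the _bulk endpoint expects newline-delimited JSON: one action line, then the document source on the next line, with a final newline at the end of the body. A minimal sketch of producing such a body, assuming made-up sample documents and a helper name (toBulkBody) invented here for illustration:

```javascript
// Hypothetical sample documents; the real books.json is not shown in this post.
const sampleBooks = [
  { id: 1, title: "Example Book A", language: "java", author: "Author A", price: 10.5, year: "2018-01-01" },
  { id: 2, title: "Example Book B", language: "python", author: "Author B", price: 20.0, year: "2017-06-15" }
];

// Build a _bulk request body: an "index" action line followed by the document itself.
// _type matches the "IT" mapping type used above (Elasticsearch 6.x and earlier).
function toBulkBody(docs, index, type) {
  const lines = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: index, _type: type, _id: String(doc.id) } }));
    lines.push(JSON.stringify(doc));
  }
  // The bulk API requires the body to end with a newline.
  return lines.join("\n") + "\n";
}

const bulkBody = toBulkBody(sampleBooks, "books", "IT");
console.log(bulkBody);
```

Writing this string to a file gives something that can be sent with the curl commands above in place of books.json.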

Basic Search

Return all documents in a given index

GET books/_search
{
  "query": {
    "match_all": {}
  }
}

This can be shortened to:

GET books/_search

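Find documents whose field contains a given term

Use a term query for this. The query string of a term query is not analyzed: a document matches only if the query term is exactly equal to a term stored in the index for that field. Typical use cases are values that need exact matching, such as personal names and place names.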
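To make "exact match" concrete, here is a rough local imitation of the comparison a term query performs. The token list is a hypothetical analyzer output (the actual tokens produced by ik_max_word may differ), and termMatches is a helper invented for this sketch, not an Elasticsearch API:

```javascript
// Hypothetical tokens stored in the index for one analyzed title.
const indexedTitleTokens = ["java", "编程", "思想"];

// A term query compares the query string with the stored tokens as-is:
// no lowercasing, no splitting into words.
function termMatches(tokens, term) {
  return tokens.includes(term);
}

console.log(termMatches(indexedTitleTokens, "思想"));      // true
console.log(termMatches(indexedTitleTokens, "Java"));      // false: the case differs
console.log(termMatches(indexedTitleTokens, "java 思想")); // false: the whole string is not a single token
```

This is why a term query against an analyzed text field only finds something when the query string happens to equal one of the tokens the analyzer emitted.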

Find books whose title field contains 思想 ("thought"):

GET books/_search
{
  "query": {
    "term": {
      "title": "思想"
    }
  }
}

[Screenshot: response to the query above]

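Paginating query results

A query can match thousands of documents, and that is where pagination comes in.

Pagination is controlled by two parameters, from and size:

from: the zero-based offset of the first document to return (default 0)
size: the maximum number of documents to return (default 10)

Put another way: skip to position from, take everything after it, then let size cap how many documents are returned.

In JavaScript terms:

const from = 100 - 1; // arrays are zero-based, so the 100th item sits at index 99
const size = 10;
const data = Array.from({ length: 1000 }, (_, i) => i + 1); // [1, 2, 3, ..., 999, 1000]

const fromData = data.slice(from);      // everything from position `from` onward
const result = fromData.slice(0, size); // keep at most `size` items
console.log(result); //=> [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]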
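In an actual search request, from and size sit at the top level of the request body, next to query. A small sketch of turning a 1-based page number into such a body; buildPageRequest is a hypothetical helper, and the 10000 limit assumes the default index.max_result_window has not been changed:

```javascript
// Default value of index.max_result_window; from + size may not exceed it.
const MAX_RESULT_WINDOW = 10000;

// Convert a 1-based page number into a search body with from/size.
function buildPageRequest(page, size, query = { match_all: {} }) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be an integer >= 1");
  }
  const from = (page - 1) * size; // page 1 starts at offset 0
  if (from + size > MAX_RESULT_WINDOW) {
    throw new RangeError("from + size exceeds the result window");
  }
  return { from, size, query };
}

// Third page, ten results per page: from = 20, size = 10.
console.log(JSON.stringify(buildPageRequest(3, 10), null, 2));
```

The printed object can be pasted as the body of GET books/_search in Dev Tools.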

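Limiting the returned fields

Most of the time a query is run to inspect one or two fields, not all of them. By default, however, Elasticsearch returns every field of each matching document, which adds noise and makes the response larger than it needs to be. The _source parameter restricts the fields that are returned. Suppose I only need the title and author fields: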

GET books/_search
{
  "_source": ["title", "author"],
  "query": {
    "term": {
      "title": "java"
    }
  }
}

[Screenshot: query result]
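As a rough picture of what the _source list does to each hit, the sketch below keeps only the listed top-level fields of a document. The document is hypothetical and filterSource is a helper invented here; real _source filtering also supports wildcards and includes/excludes, which this imitation does not model:

```javascript
// Keep only the listed top-level fields of a document's _source.
function filterSource(source, fields) {
  const picked = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(source, field)) {
      picked[field] = source[field];
    }
  }
  return picked;
}

// Hypothetical document, not taken from books.json.
const hypotheticalSource = {
  id: 1,
  title: "Example Java Book",
  author: "Example Author",
  price: 10.5
};

// Only title and author remain in the output.
console.log(filterSource(hypotheticalSource, ["title", "author"]));
```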

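Filtering by minimum score

An ordinary Elasticsearch search ranks results by relevance, and relevance is expressed as a score (_score). A loosely matching search can therefore return documents that are only weakly relevant. The min_score parameter sets a minimum score: documents that score below it are left out of the results.

For example, to find documents whose title contains java and whose score is at least 0.7: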

GET books/_search
{
  "min_score": 0.7,
  "query": {
    "term": {
      "title": "java"
    }
  }
}

[Screenshot: query result]
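The effect of min_score can be imitated locally as a filter over scored hits. The hits and scores below are made up for illustration, and applyMinScore is a hypothetical helper; in Elasticsearch the filtering happens on the server while hits are collected:

```javascript
// Hypothetical hits with made-up scores.
const hypotheticalHits = [
  { _id: "1", _score: 1.2 },
  { _id: "2", _score: 0.7 },
  { _id: "3", _score: 0.4 }
];

// Drop every hit that scores below the threshold; a hit exactly at the threshold is kept.
function applyMinScore(hits, minScore) {
  return hits.filter((hit) => hit._score >= minScore);
}

console.log(applyMinScore(hypotheticalHits, 0.7).map((hit) => hit._id)); // [ '1', '2' ]
```

Scores depend on the index contents and the similarity model, so a threshold such as 0.7 is only meaningful for a particular index and query.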

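Highlighting keywords

Sometimes Elasticsearch results are rendered straight into a web page. In that case it helps to highlight the matched keywords so users can spot what they were looking for. Elasticsearch provides the highlight parameter for this. For example, to highlight the keyword in the title of each result: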

GET books/_search
{
  "_source": ["title"],
  "min_score": 0.7,
  "query": {
    "term": {
      "title": "java"
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

[Screenshot: query result]

The default tags are <em></em>. To use your own, set pre_tags and post_tags. The final query is:

GET books/_search
{
  "_source": ["title"],
  "min_score": 0.7,
  "query": {
    "term": {
      "title": "java"
    }
  },
  "highlight" : {
    "pre_tags" : ["<h1>"],
    "post_tags" : ["</h1>"],
    "fields" : {
      "title" : {}
    }
  }
}

[Screenshot: query result]
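For a sense of what the highlighted title looks like with the default and the custom tags, here is a string-level imitation. It is only a sketch: the real highlighter works on analyzed tokens and returns fragments in a separate highlight object on each hit, and both highlightTerm and the sample title are invented for illustration:

```javascript
// Wrap every literal occurrence of `term` in the given tags.
function highlightTerm(text, term, preTag = "<em>", postTag = "</em>") {
  if (term === "") {
    return text; // nothing to highlight
  }
  return text.split(term).join(preTag + term + postTag);
}

// Hypothetical title, not taken from books.json.
const hypotheticalTitle = "java in practice";

console.log(highlightTerm(hypotheticalTitle, "java"));
// <em>java</em> in practice
console.log(highlightTerm(hypotheticalTitle, "java", "<h1>", "</h1>"));
// <h1>java</h1> in practice
```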

Next: Elasticsearch Search: Full-Text Queries (Chapter 2)


759803573 · 2019-10-29T02:56:57Z

This can be shortened to:
GET books/search

That shorthand is wrong; it should be _search.

BlackHole1 · 2019-10-29T05:54:31Z

@759803573 Thanks for pointing that out; it has been corrected.

BlackHole1 mentioned this issue Dec 27, 2018

Elasticsearch Search: Full-Text Queries (Chapter 2) #3

Open
