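Structured search targets structured data such as dates, times, and numbers. These values have well-defined formats, so we can apply logical operations to them, such as range checks and comparisons. The result of such an operation is binary: a document either satisfies the condition and is in the result set, or it does not and is left out. There is no notion of similarity.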
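As a rough illustration of this binary behaviour, here is a plain-Python sketch (not Elasticsearch itself). The sample records and the `in_range` helper are made up for the example; the point is only that a structured condition splits documents into matches and non-matches, with nothing in between.

```python
# Hypothetical sample records; only the fields needed for the illustration.
songs = [
    {"name": "song-a", "length": 53},
    {"name": "song-b", "length": 55},
    {"name": "song-c", "length": 65},
]


def in_range(doc, field, gte=None, lte=None):
    """Return True or False: a structured condition has no partial match."""
    value = doc[field]
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


# Every song lands in exactly one of the two lists.
matched = [s["name"] for s in songs if in_range(s, "length", gte=50, lte=60)]
excluded = [s["name"] for s in songs if not in_range(s, "length", gte=50, lte=60)]

print(matched)
print(excluded)
```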
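## Preface

The structured search chapters contain many search examples. We use a "music app" as the running case study and build searches and data analyses related to it. This keeps the practical goal in view and reinforces what we have learned along the way.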
For now, we define the fields of a song as follows:

| Name | Field | Type | Notes |
| :---- | :--: | :--: | -----: |
| ID | id | keyword | Document ID |
| Artist | author | text | |
| Song title | name | text | |
| Lyrics | content | text | |
| Language | language | text | |
| Tags | tags | text | |
| Duration | length | long | In seconds |
| Likes | likes | long | Incremented by 1 each time "like" is clicked |
| Released | isRelease | boolean | true = released, false = not released |
| Release date | releaseDate | date | |
The index mapping, which we define manually, is as follows:

```
PUT /music
{
  "mappings": {
    "children": {
      "properties": {
        "id": { "type": "keyword" },
        "author_first_name": { "type": "text", "analyzer": "english" },
        "author_last_name": { "type": "text", "analyzer": "english" },
        "author": {
          "type": "text",
          "analyzer": "english",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "name": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "content": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "language": { "type": "text", "analyzer": "english", "fielddata": true },
        "tags": { "type": "text", "analyzer": "english" },
        "length": { "type": "long" },
        "likes": { "type": "long" },
        "isRelease": { "type": "boolean" },
        "releaseDate": { "type": "date" }
      }
    }
  }
}
```

We load a batch of data in advance:

```
POST /music/children/_bulk
{ "index": { "_id": 1 }}
{ "id" : "34116101-7fa2-5630-a1a4-1735e19d2834", "author_first_name":"Peter", "author_last_name":"Gymbo", "author" : "Peter Gymbo", "name": "gymbo", "content":"I hava a friend who loves smile, gymbo is his name", "language":"english", "tags":["enlighten","gymbo","friend"], "length":53, "likes": 5, "isRelease":true, "releaseDate": "2019-12-20" }
{ "index": { "_id": 2 }}
{ "id" : "34117101-54cb-59a1-9b7a-82adb46fa58d", "author_first_name":"John", "author_last_name":"Smith", "author" : "John Smith", "name": "wake me, shark me", "content":"don't let me sleep too late, gonna get up brightly early in the morning", "language":"english", "tags":["wake","early","morning"], "length":55, "likes": 8,"isRelease":true, "releaseDate": "2019-12-21" }
{ "index": { "_id": 3 }}
{ "id" : "34117201-8d01-49d4-a495-69634ae67017", "author_first_name":"Jimmie", "author_last_name":"Davis", "author" : "Jimmie Davis", "name": "you are my sunshine", "content":"you are my sunshine, my only sunshine, you make me happy, when skies are gray", "language":"english", "tags":["sunshine","happy"], "length":65,"likes": 12, "isRelease":true, "releaseDate": "2019-12-22" }
{ "index": { "_id": 4 }}
{ "id" : "55fa74f7-35f3-4313-a678-18c19c918a78", "author_first_name":"Peter", "author_last_name":"Raffi", "author" : "Peter Raffi", "name": "brush your teeth", "content":"When you wake up in the morning it's a quarter to one, and you want to have a little fun You brush your teeth", "language":"english", "tags":"teeth", "length":45,"likes": 17, "isRelease":true, "releaseDate": "2019-12-22" }
{ "index": { "_id": 5 }}
{ "id" : "1740e61c-63da-474f-9058-c2ab3c4f0b0a", "author_first_name":"Jean", "author_last_name":"Ritchie", "author" : "Jean Ritchie", "name": "love somebody", "content":"love somebody, yes I do", "language":"english", "tags":"love", "length":38, "likes": 3,"isRelease":true, "releaseDate": "2019-12-22" }
```

## Exact-value lookup

Based on the mapping above, we can look up documents by exact values such as ID or date.
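### Search for a song by ID

```
GET /music/children/_search
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : { "id" : "34116101-7fa2-5630-a1a4-1735e19d2834" }
      }
    }
  }
}
```

Note that the `id` field is mapped as `keyword`, so the ID is not tokenized at index time. If the field were mapped as `text`, the UUID would be split into tokens at index time, and a `term` query for the full UUID would find nothing.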
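To see why a `text` mapping would break this lookup, here is a rough plain-Python approximation of what a word-based analyzer does to a UUID. It is a sketch under the assumption that the analyzer splits on non-alphanumeric characters and lowercases the result; a real analyzer has more rules, and `rough_tokenize` is a hypothetical helper, not an Elasticsearch API.

```python
import re


def rough_tokenize(text):
    """Approximate a word-based analyzer: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


uuid = "34116101-7fa2-5630-a1a4-1735e19d2834"

# keyword mapping: the whole value is indexed as a single term.
keyword_terms = [uuid]
# text mapping (approximated): the value is broken into several terms.
text_terms = rough_tokenize(uuid)

# A term query compares the query value with the indexed terms as-is,
# so the full UUID only matches when it was indexed as one term.
print(uuid in keyword_terms)  # True
print(uuid in text_terms)     # False
print(text_terms)
```

On a live cluster, the `_analyze` API shows the tokens an analyzer actually produces for a given value, which is the reliable way to check this.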
### Search for songs by date

```
GET /music/children/_search
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : { "releaseDate" : "2019-12-21" }
      }
    }
  }
}
```

### Search by song duration

```
GET /music/children/_search
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : { "length" : 53 }
      }
    }
  }
}
```

### Search for released songs

```
GET /music/children/_search
{
  "query" : {
    "constant_score" : {
      "filter" : {
        "term" : { "isRelease" : true }
      }
    }
  }
}
```

Together with the ID lookup, these examples show that exact-value search works natively on keyword, date, numeric, and boolean values.
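The exact-value requests above differ only in the field name and the value, so the request body can be generated by a small helper. The following is a minimal Python sketch that only builds the JSON body; `term_filter` is a hypothetical helper name, and sending the body to Elasticsearch with whichever client you use is left out.

```python
import json


def term_filter(field, value):
    """Build a constant_score query that wraps a single term filter."""
    return {"query": {"constant_score": {"filter": {"term": {field: value}}}}}


# One body per value type used in the examples above.
bodies = [
    term_filter("id", "34116101-7fa2-5630-a1a4-1735e19d2834"),  # keyword
    term_filter("releaseDate", "2019-12-21"),                   # date
    term_filter("length", 53),                                  # number
    term_filter("isRelease", True),                             # boolean
]

for body in bodies:
    # json.dumps turns Python's True into JSON's true.
    print(json.dumps(body))
```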
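## Combining filters

Each of the four examples above filters on a single condition. Real requirements usually involve several conditions, but the principle stays the same: however complex a search requirement is, it is built up from basic conditions. Let's look at a few simple examples of combined filters.

First, a quick review of the logic covered earlier:
bool: combines multiple conditions and can be nested
must: the condition must match
should: the condition may match (similar to OR; the alternative conditions are listed inside should)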