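Note: Elasticsearch version 7.10
The path_hierarchy tokenizer treats a hierarchical value as a file path: it splits the text on the path separator and emits one term for each node in the tree.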
POST _analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/one/two/three"
}
This produces the following terms:
[ /one, /one/two, /one/two/three ]
Configuration
The path_hierarchy tokenizer accepts the following parameters:
Parameter | Description |
---|---|
delimiter | The character to use as the path separator. Defaults to /. |
replacement | An optional replacement character to use for the delimiter. Defaults to the delimiter. |
buffer_size | The number of characters read into the term buffer in a single pass. Defaults to 1024. The term buffer will grow by this size until all the text has been consumed. It is advisable not to change this setting. |
reverse | If set to true, emits the tokens in reverse order. Defaults to false. |
skip | The number of initial tokens to skip. Defaults to 0. |
Example configuration:
In this example, we configure the path_hierarchy tokenizer to split on - characters, and to replace them with /. The first two tokens are skipped:
PUT my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "-",
          "replacement": "/",
          "skip": 2
        }
      }
    }
  }
}

POST my-index-000001/_analyze
{
  "analyzer": "my_analyzer",
  "text": "one-two-three-four-five"
}
The above example produces the following terms:
[ /three, /three/four, /three/four/five ]
If we were to set reverse to true, it would produce the following (output in reverse order):
[ one/two/three/, two/three/, three/ ]
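To make the delimiter, replacement, skip, and reverse rules concrete, here is a minimal pure-Python sketch that approximates the tokenizer's output for the examples above. The function name path_hierarchy_tokens is hypothetical and is not part of Elasticsearch or Lucene. The sketch ignores buffer_size and does not try to reproduce edge cases such as repeated or trailing delimiters.

```python
def path_hierarchy_tokens(text, delimiter="/", replacement=None, skip=0, reverse=False):
    """Approximate the path_hierarchy tokenizer (illustrative sketch only)."""
    if replacement is None:
        replacement = delimiter  # by default the delimiter is kept as-is
    if not text:
        return []

    if reverse:
        # Reverse mode: each part ends with its delimiter ("one-", "two-", ..., "five").
        cuts = [i + 1 for i, ch in enumerate(text) if ch == delimiter and i + 1 < len(text)]
    else:
        # Forward mode: each part starts with its delimiter ("one", "-two", "-three", ...).
        cuts = [i for i, ch in enumerate(text) if ch == delimiter and i > 0]

    bounds = [0] + cuts + [len(text)]
    parts = [text[a:b].replace(delimiter, replacement) for a, b in zip(bounds, bounds[1:])]

    if reverse:
        # skip drops parts from the end; tokens are progressively shorter suffixes.
        parts = parts[:max(len(parts) - skip, 0)]
        return ["".join(parts[i:]) for i in range(len(parts))]

    # skip drops parts from the start; tokens are progressively longer prefixes.
    parts = parts[skip:]
    return ["".join(parts[:i + 1]) for i in range(len(parts))]


print(path_hierarchy_tokens("/one/two/three"))
print(path_hierarchy_tokens("one-two-three-four-five", delimiter="-", replacement="/", skip=2))
print(path_hierarchy_tokens("one-two-three-four-five", delimiter="-", replacement="/", skip=2, reverse=True))
```

The three calls correspond to the three token lists shown above: the default configuration, the skip example, and the same example with reverse enabled. Note that in reverse mode skip removes parts from the end of the input rather than the start.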
Detailed example (path queries):
PUT file-path-test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_path_tree": {
          "tokenizer": "custom_hierarchy"
        },
        "custom_path_tree_reversed": {
          "tokenizer": "custom_hierarchy_reversed"
        }
      },
      "tokenizer": {
        "custom_hierarchy": {
          "type": "path_hierarchy",
          "delimiter": "/"
        },
        "custom_hierarchy_reversed": {
          "type": "path_hierarchy",
          "delimiter": "/",
          "reverse": "true"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "file_path": {
        "type": "text",
        "fields": {
          "tree": {
            "type": "text",
            "analyzer": "custom_path_tree"
          },
          "tree_reversed": {
            "type": "text",
            "analyzer": "custom_path_tree_reversed"
          }
        }
      }
    }
  }
}
POST file-path-test/_doc/1
{
  "file_path": "/User/alice/photos/2017/05/16/my_photo1.jpg"
}

POST file-path-test/_doc/2
{
  "file_path": "/User/alice/photos/2017/05/16/my_photo2.jpg"
}

POST file-path-test/_doc/3
{
  "file_path": "/User/alice/photos/2017/05/16/my_photo3.jpg"
}

POST file-path-test/_doc/4
{
  "file_path": "/User/alice/photos/2017/05/15/my_photo1.jpg"
}

POST file-path-test/_doc/5
{
  "file_path": "/User/bob/photos/2017/05/16/my_photo1.jpg"
}
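A match query on file_path ranks documents by relevance: the query text is analyzed, and documents that contain only some of the resulting terms are still returned, ordered by score, even when none of them matches the full path.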
GET file-path-test/_search
{
  "query": {
    "match": {
      "file_path": "/User/bob/photos/2017/05"
    }
  }
}
For exact matching, use a term query against the file_path.tree sub-field. This is the usual way to query by path:
GET file-path-test/_search
{
  "query": {
    "term": {
      "file_path.tree": "/User/alice/photos/2017/05/16"
    }
  }
}
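The reason the term query works is that every ancestor directory of a document's path is indexed as its own term in file_path.tree, and a term query compares its value against those terms without analyzing it. The following pure-Python sketch simulates that lookup for the five sample documents. The names DOCS, tree_tokens, and term_query are hypothetical, and the sketch assumes absolute paths that start with the delimiter.

```python
from itertools import accumulate

# The five sample documents indexed above, keyed by document id.
DOCS = {
    1: "/User/alice/photos/2017/05/16/my_photo1.jpg",
    2: "/User/alice/photos/2017/05/16/my_photo2.jpg",
    3: "/User/alice/photos/2017/05/16/my_photo3.jpg",
    4: "/User/alice/photos/2017/05/15/my_photo1.jpg",
    5: "/User/bob/photos/2017/05/16/my_photo1.jpg",
}


def tree_tokens(path, delimiter="/"):
    """Emit every prefix of an absolute path, one per path node."""
    pieces = [delimiter + p for p in path.split(delimiter)[1:]]
    return list(accumulate(pieces))  # "/User", "/User/alice", ...


def term_query(term):
    """Return ids of documents whose token list contains the exact term."""
    return sorted(doc_id for doc_id, path in DOCS.items() if term in tree_tokens(path))


print(term_query("/User/alice/photos/2017/05/16"))  # documents under that directory
print(term_query("/User/alice"))                    # everything under /User/alice
print(term_query("alice/photos"))                   # not a prefix from the root, so no match
```

In this simulation the first lookup returns documents 1, 2, and 3, the second returns 1 through 4, and the third returns nothing, because only prefixes that start at the root are emitted as terms.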
Likewise, this query can be combined with other conditions in a compound query.
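As one possible illustration of such a combination, the sketch below builds a bool query body as a Python dict and prints it as JSON that could be sent to GET file-path-test/_search. The specific pairing of clauses is an assumption chosen for illustration and does not come from the original text: a term filter on file_path.tree restricts results to a directory, and a second term filter on file_path.tree_reversed is intended to match the file name.

```python
import json

# Hypothetical compound query: both filters must hold for a document to match.
query_body = {
    "query": {
        "bool": {
            "filter": [
                # Restrict to documents somewhere under /User/alice.
                {"term": {"file_path.tree": "/User/alice"}},
                # Match on the file name via the reversed hierarchy sub-field.
                {"term": {"file_path.tree_reversed": "my_photo1.jpg"}},
            ]
        }
    }
}

# Print the body in a form that can be pasted after "GET file-path-test/_search".
print(json.dumps(query_body, indent=2))
```

With the five sample documents, this would be expected to return alice's two copies of my_photo1.jpg (documents 1 and 4) and to exclude bob's file. Because filter clauses do not contribute to scoring, they suit this kind of yes-or-no path restriction.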