Elasticsearch query rules are now generally available

2024-08-11 15:55:42

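Query rules let you make fine-grained adjustments to the results of specific queries or search use cases. This is useful for campaigns that need brand or sponsored results pinned to the top of the result list, and it can also help you "fix" the top results for common queries.

Introduction

We are excited to announce that query rules are now generally available in our serverless offering and will be generally available starting with version 8.15.0. Query rules were first introduced as a technical preview in version 8.10.0. They allow an index maintainer to curate specific documents based on contextual query criteria and pin them to the top of the results.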

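How do query rules work?

[Figure: query rules flow]

Query rules are rules defined against specific query metadata. You first define a query ruleset that identifies which documents to promote when particular metadata is sent with a query. At search time, you send that metadata along with a rule query. If the metadata in the rule query matches any rules, those rules are applied to your result set.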

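New query rules features

We added several new features on the way to general availability.

A brief summary of the changes:

  • We renamed the rule query from rule_query to rule, for consistency with our other API calls.
  • We now support specifying multiple rulesets in a single rule query.
  • We expanded the query ruleset management APIs to support managing individual query rules.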

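What does this look like?

Suppose we have an index of dog breed information with two fields: dog_breed and advert. We want to set up rules that pin a breed and its mixes in the same rule. Here is a sample ruleset containing two rules: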

PUT _query_rules/dog-breed-ruleset
{
 "ruleset_id": "dog-breed-ruleset",
 "rules": [
   {
     "rule_id": "pug-mixes",
     "type": "pinned",
     "criteria": [
       {
         "type": "exact",
         "metadata": "breed",
         "values": [
           "pug"
         ]
       }
     ],
     "actions": {
       "ids": [
         "pug",
         "puggle",
         "chug",
         "pugshire"
       ]
     },
     "priority": 5
   },
   {
     "rule_id": "chi-mixes",
     "type": "pinned",
     "criteria": [
       {
         "type": "exact",
         "metadata": "breed",
         "values": [
           "chihuahua"
         ]
       }
     ],
     "actions": {
       "ids": [
         "chihuahua",
         "chiweenie",
         "chug"
       ]
     },
     "priority": 10
   }
 ]
}

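Here is what this ruleset does:

  1. If breed: pug is sent in the rule query, the top results are returned in this order: pug, puggle, chug, and pugshire. Any organic results are returned after these pinned results.
  2. If breed: chihuahua is sent in the rule query, the top results are returned in this order: chihuahua, chiweenie, and chug. Any organic results are returned after these pinned results.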

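Note the priority of each rule in the ruleset. Priority is optional for each rule, but it lets you control where an individual rule is inserted within the ruleset. Rulesets are sorted by ascending priority and, as this example shows, priorities do not have to be contiguous. Inserting a new rule with a priority of 4 or lower into this ruleset places it at the beginning of the ruleset, while a priority between 6 and 9 places it between the two existing rules. A rule added without a priority is simply appended to the end of the ruleset.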
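
The ordering described above can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not Elasticsearch's implementation: the insert_rule helper is hypothetical, and keeping the existing order for equal priorities is an assumption the text does not confirm.

```python
def insert_rule(ruleset, rule):
    """Return a new rule list with `rule` placed according to its optional priority."""
    def sort_key(r):
        priority = r.get("priority")
        # Rules without a priority sort after every rule that has one.
        return (priority is None, priority if priority is not None else 0)

    # sorted() is stable, so rules with equal keys keep their current order
    # (an assumption of this sketch, not documented behavior).
    return sorted(ruleset + [rule], key=sort_key)


def ids(rules):
    return [r["rule_id"] for r in rules]


existing = [
    {"rule_id": "pug-mixes", "priority": 5},
    {"rule_id": "chi-mixes", "priority": 10},
]

print(ids(insert_rule(existing, {"rule_id": "first", "priority": 4})))
print(ids(insert_rule(existing, {"rule_id": "middle", "priority": 7})))
print(ids(insert_rule(existing, {"rule_id": "last"})))
```

The three calls cover the three cases from the paragraph: a priority below 5, a priority between the two existing rules, and no priority at all.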

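Another useful feature is that you can now pass multiple rulesets to a rule query. This lets you organize and define more rules. Previously you were limited to the number of rules that fit in a single ruleset (100 by default, configurable up to 1,000 per ruleset).

Next, let's create a new ruleset dedicated to a July promotion. Its single rule uses the "always" criteria type, so it is applied no matter what metadata is sent: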

PUT _query_rules/promo-ruleset
{
  "rules": [
    {
      "rule_id": "july-promo",
      "type": "pinned",
      "criteria": [
        {
          "type": "always"
        }
      ],
      "actions": {
        "ids": [
          "july-promotion"
        ]
      }
    }
  ]
}

Now we can query these rulesets together with the following query:

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "query_string": {
          "query": "chihuahua mixes"
        }
      },
      "ruleset_ids": [
        "promo-ruleset",
        "dog-breed-ruleset"
      ],
      "match_criteria": {
        "breed": "chihuahua"
      }
    }
  }
}

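Because two rulesets are specified, each ruleset is processed in the order given in the request. This means the July promotion is always returned as the first result, followed by the pinned documents for the matching dog breed, as shown in these results: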

   "hits": [
     {
       "_index": "query-rules-test",
       "_id": "july-promotion",
       "_score": 1.7014128e+38,
       "_source": {
         "advert": "EVERYTHING ON SALE!"
       }
     },
     {
       "_index": "query-rules-test",
       "_id": "chihuahua",
       "_score": 1.7014126e+38,
       "_source": {
         "dog_breed": "chihuahua"
       }
     },
     {
       "_index": "query-rules-test",
       "_id": "chiweenie",
       "_score": 1.7014124e+38,
       "_source": {
         "dog_breed": "chiweenie"
       }
     },
     {
       "_index": "query-rules-test",
       "_id": "chug",
       "_score": 1.7014122e+38,
       "_source": {
         "dog_breed": "chug"
       }
     },
     ... // organic results follow ...
   ]
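
To make the effect of ruleset order concrete, here is a small Python simulation of the two rulesets used in this request. It is a simplified model under stated assumptions, not how Elasticsearch evaluates rules internally: the RULESETS dictionary and the pinned_ids helper are hypothetical, and only the "exact" and "always" criteria types are modeled.

```python
# Trimmed copies of the two rulesets defined earlier in this post.
RULESETS = {
    "dog-breed-ruleset": [
        {"criteria": [{"type": "exact", "metadata": "breed", "values": ["pug"]}],
         "actions": {"ids": ["pug", "puggle", "chug", "pugshire"]}},
        {"criteria": [{"type": "exact", "metadata": "breed", "values": ["chihuahua"]}],
         "actions": {"ids": ["chihuahua", "chiweenie", "chug"]}},
    ],
    "promo-ruleset": [
        {"criteria": [{"type": "always"}],
         "actions": {"ids": ["july-promotion"]}},
    ],
}


def criterion_matches(criterion, match_criteria):
    if criterion["type"] == "always":
        return True
    if criterion["type"] == "exact":
        return match_criteria.get(criterion["metadata"]) in criterion["values"]
    raise ValueError(f"criteria type not modeled in this sketch: {criterion['type']}")


def pinned_ids(ruleset_ids, match_criteria):
    """Collect pinned document ids, visiting rulesets in the order requested."""
    pinned = []
    for ruleset_id in ruleset_ids:
        for rule in RULESETS[ruleset_id]:
            if all(criterion_matches(c, match_criteria) for c in rule["criteria"]):
                pinned.extend(rule["actions"]["ids"])
    return pinned


print(pinned_ids(["promo-ruleset", "dog-breed-ruleset"], {"breed": "chihuahua"}))
print(pinned_ids(["dog-breed-ruleset", "promo-ruleset"], {"breed": "chihuahua"}))
```

Swapping the two entries in ruleset_ids moves the promotion from the first pinned position to the last one in this model.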

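Those are simple examples, so what about something more complex?

We've got you covered!

Let's stay with the dog theme, but expand the mappings in our test index: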

PUT query-rules-test
{
 "mappings": {
   "properties": {
     "breed": {
       "type": "keyword"
     },
     "age": {
       "type": "integer"
     },
     "sex": {
       "type": "keyword"
     },
     "name": {
       "type": "keyword"
     },
     "bio": {
       "type": "text"
     },
     "good_with_animals": {
       "type": "boolean"
     },
     "good_with_kids": {
        "type": "boolean"
     }
   }
 }
}

We can index the following JSON into this index as sample data. For consistency, we assume each document's _id matches the id field in the JSON:

[
  {"id":"buddy_pug","breed":"pug","sex":"Male","age":3,"name":"Buddy","bio":"Buddy is a charming pug who loves to play and snuggle. He is looking for a loving home.","good_with_animals":true,"good_with_kids":true},
  {"id":"lucy_beagle","breed":"beagle","sex":"Female","age":7,"name":"Lucy","bio":"Lucy is a friendly beagle who enjoys long walks and is very affectionate.","good_with_animals":true,"good_with_kids":true},
  {"id":"rocky_chihuahua","breed":"chihuahua","sex":"Male","age":2,"name":"Rocky","bio":"Rocky is a tiny chihuahua with a big personality. He is very playful and loves attention.","good_with_animals":false,"good_with_kids":false},
  {"id":"zoe_dachshund","breed":"dachshund","sex":"Female","age":5,"name":"Zoe","bio":"Zoe is a sweet dachshund who loves to burrow under blankets and is very loyal.","good_with_animals":true,"good_with_kids":true},
  {"id":"max_poodle","breed":"poodle","sex":"Male","age":6,"name":"Max","bio":"Max is a smart and active poodle who loves learning new tricks and is very friendly.","good_with_animals":true,"good_with_kids":true},
  {"id":"bella_yorkie","breed":"yorkie","sex":"Female","age":4,"name":"Bella","bio":"Bella is a cute yorkie who loves to be pampered and is very affectionate.","good_with_animals":true,"good_with_kids":true},
  {"id":"jack_puggle","breed":"puggle","sex":"Male","age":8,"name":"Jack","bio":"Jack is a friendly puggle who loves to play and is very social.","good_with_animals":true,"good_with_kids":true},
  {"id":"lola_chiweenie","breed":"chiweenie","sex":"Female","age":9,"name":"Lola","bio":"Lola is a playful chiweenie who loves to chase toys and is very affectionate.","good_with_animals":true,"good_with_kids":true},
  {"id":"charlie_chug","breed":"chug","sex":"Male","age":3,"name":"Charlie","bio":"Charlie is an energetic chug who loves to play and is very curious.","good_with_animals":false,"good_with_kids":true},
  {"id":"daisy_pugshire","breed":"pugshire","sex":"Female","age":7,"name":"Daisy","bio":"Daisy is a gentle pugshire who loves to cuddle and is very sweet.","good_with_animals":true,"good_with_kids":true},
  {"id":"buster_pug","breed":"pug","sex":"Male","age":10,"name":"Buster","bio":"Buster is a calm pug who loves to lounge around and is very loyal.","good_with_animals":true,"good_with_kids":true},
  {"id":"molly_beagle","breed":"beagle","sex":"Female","age":12,"name":"Molly","bio":"Molly is a sweet beagle who loves to sniff around and is very friendly.","good_with_animals":true,"good_with_kids":true},
  {"id":"toby_chihuahua","breed":"chihuahua","sex":"Male","age":4,"name":"Toby","bio":"Toby is a lively chihuahua who loves to run around and is very playful.","good_with_animals":false,"good_with_kids":false},
  {"id":"luna_dachshund","breed":"dachshund","sex":"Female","age":1,"name":"Luna","bio":"Luna is a young dachshund who loves to explore and is very curious.","good_with_animals":true,"good_with_kids":true},
  {"id":"oliver_poodle","breed":"poodle","sex":"Male","age":17,"name":"Oliver","bio":"Oliver is a wise poodle who loves to relax and is very gentle.","good_with_animals":true,"good_with_kids":true}]
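
One way to load an array like this is the _bulk API, which expects newline-delimited JSON: an action line followed by a source line for each document. The sketch below builds such a body in Python and reuses each document's id field as the _id, as assumed above. The to_bulk_body helper is hypothetical, and the two sample documents are abbreviated versions of entries from the list.

```python
import json


def to_bulk_body(docs, index):
    """Build an NDJSON body for the _bulk API, using each doc's "id" as its _id."""
    lines = []
    for doc in docs:
        # Action line: index this document under an explicit _id.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


sample = [
    {"id": "buddy_pug", "breed": "pug", "age": 3},
    {"id": "lucy_beagle", "breed": "beagle", "age": 7},
]

body = to_bulk_body(sample, "query-rules-test")
print(body)
```

The resulting string can be sent as the body of a POST _bulk request with the Content-Type set to application/x-ndjson.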

Next, create a ruleset:

PUT _query_rules/rescue-dog-search-ruleset
{
  "rules": [
    {
      "rule_id": "pugs_and_pug_mixes",
      "type": "pinned",
      "criteria": [
        {
          "type": "contains",
          "metadata": "query_string",
          "values": [
            "pug"
          ]
        }
      ],
      "actions": {
        "ids": [
          "buddy_pug",
          "buster_pug",
          "jack_puggle",
          "daisy_pugshire"
        ]
      }
    },
    {
      "rule_id": "puppies",
      "type": "pinned",
      "criteria": [
        {
          "type": "contains",
          "metadata": "query_string",
          "values": [
            "puppy",
            "puppies"
          ]
        }
      ],
      "actions": {
        "ids": [
          "luna_dachshund"
        ]
      }
    },
    {
      "rule_id": "puppies2",
      "type": "pinned",
      "criteria": [
        {
          "type": "lte",
          "metadata": "preferred_age",
          "values": [
            2
          ]
        }
      ],
      "actions": {
        "ids": [
          "luna_dachshund"
        ]
      }
    },
    {
      "rule_id": "adult",
      "type": "pinned",
      "criteria": [
        {
          "type": "gt",
          "metadata": "preferred_age",
          "values": [
            2
          ]
        },
        {
          "type": "lt",
          "metadata": "preferred_age",
          "values": [
            10
          ]
        }
      ],
      "actions": {
        "ids": [
          "lucy_beagle"
        ]
      }
    },
    {
      "rule_id": "special_needs",
      "type": "pinned",
      "criteria": [
        {
          "type": "exact",
          "metadata": "work_from_home",
          "values": [
            "true"
          ]
        },
        {
          "type": "exact",
          "metadata": "fenced_in_yard",
          "values": [
            "true"
          ]
        },
        {
          "type": "exact",
          "metadata": "kids_in_house",
          "values": [
            "false"
          ]
        },
        {
          "type": "exact",
          "metadata": "cats_in_house",
          "values": [
            "false"
          ]
        }
      ],
      "actions": {
        "ids": [
          "toby_chihuahua"
        ]
      }
    }
  ]
}

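Here is what this ruleset does:

  • The pugs_and_pug_mixes rule pins all pugs and pug mixes if the query string contains "pug".
  • The puppies rule pins the youngest dog in the index if the query string contains "puppy" or "puppies".
  • The puppies2 rule pins the same dog if the preferred age is less than or equal to 2.
  • The adult rule pins a specific beagle if the preferred age is greater than 2 and less than 10.
  • The more complex special_needs rule fires only if the prospective owner works from home, has a fenced-in yard, and has no kids or cats in the house.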
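
The matching logic behind these rules can be approximated in Python. This sketch covers only the criteria types used in this ruleset and makes two labeled assumptions: metadata missing from match_criteria never matches, and "contains" is modeled as a plain substring test. The helper names are hypothetical and three of the five rules are included to keep it short.

```python
import operator

# Numeric criteria types used in this ruleset, mapped to Python comparisons.
COMPARATORS = {"lt": operator.lt, "lte": operator.le, "gt": operator.gt}


def criterion_matches(criterion, match_criteria):
    kind = criterion["type"]
    if kind == "always":
        return True
    value = match_criteria.get(criterion["metadata"])
    if value is None:
        return False  # assumption: metadata that was not sent never matches
    if kind == "exact":
        return value in criterion["values"]
    if kind == "contains":
        # assumption: plain substring test
        return any(candidate in value for candidate in criterion["values"])
    if kind in COMPARATORS:
        return any(COMPARATORS[kind](value, bound) for bound in criterion["values"])
    raise ValueError(f"criteria type not modeled in this sketch: {kind}")


def rule_matches(rule, match_criteria):
    # Every criterion of a rule has to match for the rule to fire.
    return all(criterion_matches(c, match_criteria) for c in rule["criteria"])


RULES = [
    {"rule_id": "pugs_and_pug_mixes",
     "criteria": [{"type": "contains", "metadata": "query_string", "values": ["pug"]}]},
    {"rule_id": "puppies2",
     "criteria": [{"type": "lte", "metadata": "preferred_age", "values": [2]}]},
    {"rule_id": "adult",
     "criteria": [{"type": "gt", "metadata": "preferred_age", "values": [2]},
                  {"type": "lt", "metadata": "preferred_age", "values": [10]}]},
]


def matching_rule_ids(match_criteria):
    return [r["rule_id"] for r in RULES if rule_matches(r, match_criteria)]


print(matching_rule_ids({"query_string": "I like pugs and pug mixes", "preferred_age": 5}))
print(matching_rule_ids({"query_string": "I like pugs and pug mixes", "preferred_age": 1}))
```

Because every criterion in a rule must match, the adult rule fires for a preferred age of 5 but not for 1 or 10 in this model.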

Let's look at some sample queries. These examples use match_none as the organic query, so the only results returned are the ones pinned by rules.

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "match_none": {}
      },
      "ruleset_ids": [
        "rescue-dog-search-ruleset"
      ],
      "match_criteria": {
        "query_string": "I like pugs and pug mixes",
        "preferred_age": 5
      }
    }
  }
}

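The query above matches the pugs_and_pug_mixes rule and returns the 4 pug results. Because the preferred age is 5, the adult rule also matches, so Lucy the beagle is returned as the fifth pinned result.

Changing the preferred age to 1 makes the fifth result the puppy instead: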

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "match_none": {}
      },
      "ruleset_ids": [
        "rescue-dog-search-ruleset"
      ],
      "match_criteria": {
        "query_string": "I like pugs and pug mixes",
        "preferred_age": 1
      }
    }
  }
}

To match the special needs result, all of the criteria must match:

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "match_none": {}
      },
      "ruleset_ids": [
        "rescue-dog-search-ruleset"
      ],
      "match_criteria": {
        "work_from_home": "true",
        "fenced_in_yard": "true",
        "kids_in_house": "false",
        "cats_in_house": "false"
      }
    }
  }
}

If any one of the criteria does not match, the rule is not triggered:

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "match_none": {}
      },
      "ruleset_ids": [
        "rescue-dog-search-ruleset"
      ],
      "match_criteria": {
        "work_from_home": "true",
        "fenced_in_yard": "true",
        "kids_in_house": "false",
        "cats_in_house": "true"
      }
    }
  }
}


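Troubleshooting query rules

It can be hard to tell whether a pinned query rule has been applied. There are two ways to determine whether a document was pinned by a query rule or returned organically.

We do have the explain API, but its output is not obvious: the rule query is rewritten into a pinned query, which is in turn rewritten into a constant score query, so a pinned document simply shows up with what looks like the maximum possible score: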

"_explanation": {
  "value": 1.7014128e+38,
  "description": "max of:",
  "details": [
    {
      "value": 1.7014128e+38,
      "description": "max of:",
      "details": [
        {
          "value": 1.7014128e+38,
          "description": "ConstantScore(_id:([ff 63 68 69 68 61 75 68 75 61]))^1.7014128E38",
          "details": []
        },
        ...
      ]
    },
    ...
  ]
}

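For pinned rules, you can also return only the rule results by combining the rule query with an organic query that matches nothing, as we did earlier for demonstration purposes: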

GET query-rules-test/_search
{
  "query": {
    "rule": {
      "organic": {
        "match_none": {}
      },
      "ruleset_ids": [
        "dog-breed-ruleset",
        "super-special-ruleset"
      ],
      "match_criteria": {
        "breed": "pug"
      }
    }
  }
}

Alternatively, you can run the organic query on its own and compare its results with those of the rule query.
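
That comparison is easy to script once you have the ordered _id lists from both responses. The sketch below is one possible approach, not an official tool: compare_results is a hypothetical helper and the document ids are made up for illustration.

```python
def compare_results(rule_ids, organic_ids):
    """Compare ordered id lists from the rule query and the organic query."""
    organic_rank = {doc_id: rank for rank, doc_id in enumerate(organic_ids)}
    # Ids that appear only in the rule query results were introduced by a rule.
    added = [doc_id for doc_id in rule_ids if doc_id not in organic_rank]
    # Ids that rank higher in the rule query results than organically were promoted.
    promoted = [
        doc_id
        for rank, doc_id in enumerate(rule_ids)
        if doc_id in organic_rank and rank < organic_rank[doc_id]
    ]
    return {"added_by_rules": added, "promoted": promoted}


# Hypothetical id lists extracted from the two search responses.
organic = ["doc-a", "doc-b", "doc-c"]
with_rules = ["doc-x", "doc-c", "doc-a", "doc-b"]

print(compare_results(with_rules, organic))
```

Here doc-x shows up only because of a rule, and doc-c ranks higher than it would organically, so both are likely pinned.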

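What's next?

Query rules being generally available doesn't mean our work is done!

We have many exciting features on our roadmap, including:

  • Support for excluding documents as well as pinning them
  • A UI for managing query rules easily without API calls
  • Analyzer support
  • ... and more!

We are also exploring other ways people might want to use query rules. We would love to hear your feedback on our community page!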
