Elasticsearch探索:7.0版本精确的总 hits 数

2020-12-31 18:06:46 浏览数 (1)

简介

从 Elasticsearch 7.0之后,为了提高搜索的性能,在 hits 字段中返回的文档数有时不是最精确的数值。Elasticsearch 限制了最多的数值为10000。

当文档的数值大于10000时,返回的 total 数值为10000,并在 relation 中指出 gte。

实操

启动elasticsearch-7.2.1版本 kibana-7.2.1版本

选中“Add data”:

这样我们就把Sample flight data的数据加载到Elasticsearch中去了。

我们在Dev tools中来查询我们的文档个数:

代码语言:javascript复制
GET kibana_sample_data_flights/_count

{
  "count" : 13059,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

我们可以看到有13059个数值。假如我们使用如下的方式来进行搜索的话:

代码语言:javascript复制
GET kibana_sample_data_flights/_search

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 13059,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "kibana_sample_data_flights",
        "_type" : "_doc",
        "_id" : "ZOgJuHYBrB5g-f-njyTu",
        "_score" : 1.0,
        "_source" : {
          "FlightNum" : "9HY9SWR",
          "DestCountry" : "AU",
          "OriginWeather" : "Sunny",
          "OriginCityName" : "Frankfurt am Main",
          "AvgTicketPrice" : 841.2656419677076,
          "DistanceMiles" : 10247.856675613455,
          "FlightDelay" : false,
          "DestWeather" : "Rain",
          "Dest" : "Sydney Kingsford Smith International Airport",
          "FlightDelayType" : "No Delay",
          "OriginCountry" : "DE",
          "dayOfWeek" : 0,
          "DistanceKilometers" : 16492.32665375846,
          "timestamp" : "2020-12-21T00:00:00",
          "DestLocation" : {
            "lat" : "-33.94609833",
            "lon" : "151.177002"
          },
          "DestAirportID" : "SYD",
          "Carrier" : "Kibana Airlines",
          "Cancelled" : false,
          "FlightTimeMin" : 1030.7704158599038,
          "Origin" : "Frankfurt am Main Airport",
          "OriginLocation" : {
            "lat" : "50.033333",
            "lon" : "8.570556"
          },
          "DestRegion" : "SE-BD",
          "OriginAirportID" : "FRA",
          "OriginRegion" : "DE-HE",
          "DestCityName" : "Sydney",
          "FlightTimeHour" : 17.179506930998397,
          "FlightDelayMin" : 0
        }
      }
    ]
  }
}

显然我们得到的文档的数目是10000个,但是它并不是我们的实际的满足条件的所有文档数。假如我们想得到所有的文档数,那么我们可以做如下的方式:

代码语言:javascript复制
GET kibana_sample_data_flights/_search
{
  "track_total_hits":true
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "kibana_sample_data_flights",
        "_type" : "_doc",
        "_id" : "ZOgJuHYBrB5g-f-njyTu",
        "_score" : 1.0,
        "_source" : {
          "FlightNum" : "9HY9SWR",
          "DestCountry" : "AU",
          "OriginWeather" : "Sunny",
          "OriginCityName" : "Frankfurt am Main",
          "AvgTicketPrice" : 841.2656419677076,
          "DistanceMiles" : 10247.856675613455,
          "FlightDelay" : false,
          "DestWeather" : "Rain",
          "Dest" : "Sydney Kingsford Smith International Airport",
          "FlightDelayType" : "No Delay",
          "OriginCountry" : "DE",
          "dayOfWeek" : 0,
          "DistanceKilometers" : 16492.32665375846,
          "timestamp" : "2020-12-21T00:00:00",
          "DestLocation" : {
            "lat" : "-33.94609833",
            "lon" : "151.177002"
          },
          "DestAirportID" : "SYD",
          "Carrier" : "Kibana Airlines",
          "Cancelled" : false,
          "FlightTimeMin" : 1030.7704158599038,
          "Origin" : "Frankfurt am Main Airport",
          "OriginLocation" : {
            "lat" : "50.033333",
            "lon" : "8.570556"
          },
          "DestRegion" : "SE-BD",
          "OriginAirportID" : "FRA",
          "OriginRegion" : "DE-HE",
          "DestCityName" : "Sydney",
          "FlightTimeHour" : 17.179506930998397,
          "FlightDelayMin" : 0
        }
      }
    ]
  }
}

我们在请求的参数中加入 track_total_hits,并设置为true,那么我们可以看到在返回的参数中,它正确地显示了所有满足条件的文档个数。

0 人点赞