Execution Flow of Search and CRUD Operations on Documents in Elasticsearch

In Elasticsearch, operations on documents fall into two main categories: Search and CRUD.

1 Search

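In a distributed environment, the Search API has a more involved execution flow than the single-document CRUD APIs. A single-document CRUD request normally carries the document identifier (_id), so the routing rule immediately tells Elasticsearch which shard of the cluster holds the document. A search request, by contrast, matches on field content stored in _source. As the sample document below shows, none of the systemName, moduleName, or message values can be mapped to a shard by the routing rule, so Elasticsearch has to ask one copy of every shard in the index, either the primary shard or one of its replica shards, whether it contains matching documents.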

{
    "_index": "elk-2021.02.04",
    "_type": "_doc",
    "_id": "uSBua3cBbZJ5iJayCZqN",
    "_score": null,
    "_source": {
        "systemName": "ccn",
        "logLevel": "INFO",
        "moduleName": "ccn-admin",
        "lineNum": "256",
        "methodName": "report",
        "className": "c.c.cnfusion.ccn.pm.scheduler.CapacityReportScheduler",
        "message": "capacity report, resp body: {\"code\":200,\"data\":[],\"msg\":\"\"}",
        "threadName": "scheduling-1",
        "timestamp": "2021-02-04 13:05:00"
    }
}
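
The routing rule mentioned above can be sketched roughly as "hash the routing value, then take it modulo the number of primary shards". The snippet below is only an illustration under stated assumptions: Elasticsearch hashes with Murmur3 internally, whereas this sketch substitutes CRC32 from the Python standard library, and the function name `route_to_shard` is hypothetical.

```python
import zlib


def route_to_shard(routing_value: str, number_of_primary_shards: int) -> int:
    """Map a routing value (the document _id by default) to a shard number."""
    # Assumption: CRC32 is a stand-in hash; Elasticsearch itself uses Murmur3.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


doc_id = "uSBua3cBbZJ5iJayCZqN"
shard = route_to_shard(doc_id, number_of_primary_shards=2)
print(f"_id {doc_id} -> shard {shard}")
```

A field value such as systemName never enters this computation, which is why a search cannot be narrowed to a single shard the way a request carrying an _id can.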

Internally, the Search API runs as a two-phase process called query-then-fetch. The query phase and the fetch phase are described in turn below.

1.1 Query Phase

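As the figure above shows, the Elasticsearch cluster has three nodes; the index has two primary shards, and each primary shard has two replica shards. The query phase proceeds as follows.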

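The client's search request lands on Node 3, which thereby becomes the coordinating node. The coordinating node creates an empty priority queue.
The coordinating node broadcasts the search request to one copy of every shard in the index, here shard P1 on Node 1 and shard R0 on Node 2. Each of the two shards runs the query locally, builds its own priority queue, and stores its matching documents in it.
P1 and R0 each return the _id values (together with the sort values) of the documents in their priority queues to the coordinating node, which merges them into its own priority queue. At this point the coordinating node's priority queue holds a globally sorted result.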
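
The scatter-gather steps above can be imitated in a few lines. This is a toy sketch, not Elasticsearch code: the hit lists, the scores, and the helper names `shard_query_phase` and `coordinator_merge` are all made up for illustration, and each hit is represented as a `(score, _id)` pair.

```python
import heapq


def shard_query_phase(hits, from_, size):
    """Return one shard's local top (from_ + size) hits, best score first."""
    return heapq.nlargest(from_ + size, hits)


def coordinator_merge(per_shard_results, from_, size):
    """Merge the per-shard lists into a global order and cut out the page."""
    all_hits = (hit for shard_hits in per_shard_results for hit in shard_hits)
    merged = heapq.nlargest(from_ + size, all_hits)
    return merged[from_:from_ + size]


# Hypothetical (score, _id) pairs held by the two shard copies that were queried.
p1_hits = [(0.9, "a"), (0.4, "b"), (0.7, "c")]
r0_hits = [(0.8, "d"), (0.3, "e"), (0.6, "f")]

from_, size = 1, 2
page = coordinator_merge(
    [shard_query_phase(p1_hits, from_, size), shard_query_phase(r0_hits, from_, size)],
    from_,
    size,
)
print(page)  # [(0.8, 'd'), (0.7, 'c')]
```

Note that every shard has to return from_ + size candidates, not just size, because the coordinating node cannot know in advance which shard the requested page will come from.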

1.2 Fetch Phase

The query phase only identifies the _id values of the matching documents; it does not return their content. The fetch phase therefore retrieves the full documents.

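The coordinating node, Node 3, sends a multi-get request to shard P1 and to shard R0.
P1 and R0 load the requested documents and return them to the coordinating node.
The coordinating node returns the final result to the client.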
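
As a rough sketch of the fetch phase, the coordinating node can group the winning _id values by the shard that reported them, fetch each group, and reassemble the documents in the global order. The `shard_store` contents and the `fetch_phase` helper below are hypothetical and only model the data flow.

```python
from collections import defaultdict

# Hypothetical shard contents: shard name -> {_id: _source}.
shard_store = {
    "P1": {"a": {"message": "doc a"}, "c": {"message": "doc c"}},
    "R0": {"d": {"message": "doc d"}},
}


def fetch_phase(page, store):
    """page: globally ordered (shard_name, _id) pairs chosen by the query phase."""
    ids_by_shard = defaultdict(list)
    for shard_name, doc_id in page:
        ids_by_shard[shard_name].append(doc_id)

    # One batched lookup per shard that owns at least one hit (the multi-get).
    fetched = {}
    for shard_name, doc_ids in ids_by_shard.items():
        for doc_id in doc_ids:
            fetched[doc_id] = store[shard_name][doc_id]

    # Reassemble the documents in the order decided by the query phase.
    return [{"_id": doc_id, "_source": fetched[doc_id]} for _, doc_id in page]


result = fetch_phase([("R0", "d"), ("P1", "c")], shard_store)
print(result)
```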

Priority Queue

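A priority queue is simply a sorted list that holds the matching documents. Its size equals the sum of the pagination parameters, from + size. For the sample request below, for instance, the priority queue has a size of 100.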

GET /_search
{
   "from": 90,
   "size": 10
}

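Now consider deep pagination over many shards: every shard has to build a priority queue of size from + size, and the coordinating node has to sort number_of_shards * (from + size) entries. For this reason, from/size pagination is a poor fit for deep-pagination scenarios.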
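
To make the growth concrete, the two sizes can be computed directly. The helper name `pagination_cost` and the second set of numbers are illustrative assumptions; the first call uses the sample request above on the two-shard index from the query-phase example.

```python
def pagination_cost(from_, size, number_of_shards):
    """Return (entries per shard queue, entries the coordinating node sorts)."""
    per_shard = from_ + size
    coordinator = number_of_shards * per_shard
    return per_shard, coordinator


print(pagination_cost(90, 10, 2))      # (100, 200)
print(pagination_cost(10_000, 10, 5))  # (10010, 50050)
```

In the second case the client still receives only 10 documents, yet 50,050 entries have to be collected and sorted to find them.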

2 CRUD

2.1 Indexing a Document (`INDEX`)

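The client sends the request.
The node that receives the request acts as the coordinating node. It uses the routing rule to determine the primary shard the document belongs to and forwards the request to that shard.
The primary shard adds the document to the in-memory buffer and appends the operation to the transaction log.
The primary shard performs a refresh: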

1) The docs in the in-memory buffer are written to a new segment.
2) The segment is opened to make it visible to search (searchable).
3) The in-memory buffer is cleared.

The primary shard performs a flush:

1) Any docs in the in-memory buffer are written to a new segment.
2) The buffer is cleared.
3) A commit point file is written to disk, which includes the name of the new segment.
4) The filesystem cache is flushed with an fsync.
5) A new translog is created, the old translog is deleted.

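During a refresh, Elasticsearch writes the documents in the in-memory buffer to a new segment. The new segment is not persisted to disk right away; it is first written to the filesystem cache (once a file is in the cache, it can be opened and read just like any other file), which means the segment data can be lost if the machine fails. A key step of a flush is therefore to call fsync so that the new segment in the filesystem cache is persisted to disk. Once that succeeds, the transaction log is no longer needed, so Elasticsearch deletes it and creates a new one.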
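
The interplay of buffer, translog, refresh, and flush can be modelled with a small in-memory class. This is a deliberately simplified sketch: `ToyShard` and its attributes are hypothetical, segments are plain Python lists, and the commit point plus fsync are reduced to moving a segment from one list to another.

```python
class ToyShard:
    """Toy model of one shard's write path; not how Lucene stores data."""

    def __init__(self):
        self.buffer = []              # in-memory buffer, not yet searchable
        self.translog = []            # operations recorded since the last flush
        self.cached_segments = []     # searchable, but only in the filesystem cache
        self.committed_segments = []  # searchable and persisted to disk

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        if self.buffer:
            self.cached_segments.append(list(self.buffer))  # new segment
            self.buffer.clear()

    def flush(self):
        self.refresh()
        # Stands in for writing the commit point and fsyncing the cache.
        self.committed_segments.extend(self.cached_segments)
        self.cached_segments.clear()
        self.translog = []  # old translog deleted, new one created

    def search(self):
        segments = self.committed_segments + self.cached_segments
        return [doc for segment in segments for doc in segment]


shard = ToyShard()
shard.index({"_id": "1"})
visible_before_refresh = shard.search()  # [] because the doc is only in the buffer
shard.refresh()
visible_after_refresh = shard.search()   # the doc is searchable now
translog_before_flush = len(shard.translog)
shard.flush()
print(visible_before_refresh, visible_after_refresh, translog_before_flush, shard.translog)
```

The sketch shows why a freshly indexed document is not searchable until the next refresh, and why the translog can be discarded only after the flush.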

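Transaction Log: The transaction log is used mainly for crash recovery. When Elasticsearch restarts, it replays the operations recorded in the transaction log since the last commit point, which brings those documents back into the in-memory buffer.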
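wait_for_active_shards: Before an index operation proceeds, it normally has to wait until a certain number of shard copies are active. The wait_for_active_shards parameter sets that number. Its default value is 1, meaning the primary shard alone is enough; its maximum is the total number of copies of the shard, that is, the primary plus all of its replicas (number_of_replicas + 1). If fewer copies are active than wait_for_active_shards requires, the index operation waits and retries until enough copies become active or the request times out.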
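
The check behind wait_for_active_shards can be sketched as a simple comparison. The helper names `resolve_required_copies` and `can_proceed` are hypothetical; the accepted values (a positive integer up to number_of_replicas + 1, or the string "all") follow the parameter's documented behavior, while the waiting and retrying are left out.

```python
def resolve_required_copies(wait_for_active_shards, number_of_replicas):
    """Translate the setting into the number of shard copies that must be active."""
    total_copies = number_of_replicas + 1  # the primary plus its replicas
    if wait_for_active_shards == "all":
        return total_copies
    required = int(wait_for_active_shards)
    if not 1 <= required <= total_copies:
        raise ValueError(f"expected a value between 1 and {total_copies}")
    return required


def can_proceed(active_copies, wait_for_active_shards=1, number_of_replicas=2):
    """True if the index operation may start; otherwise it would wait and retry."""
    required = resolve_required_copies(wait_for_active_shards, number_of_replicas)
    return active_copies >= required


print(can_proceed(1))                                # True: the default needs only the primary
print(can_proceed(2, wait_for_active_shards="all"))  # False: 3 copies are required
```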

2.2 Retrieving a Document (`GET`)

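The client sends the request.
The node that receives the request becomes the coordinating node. From the document's _id it determines the shard the document belongs to (when both the primary shard and replica shards are available, it picks one copy in round-robin fashion) and forwards the request to that copy.
The shard copy retrieves the document and returns it to the coordinating node.
The coordinating node returns the document to the client.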
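
Round-robin selection among the copies of a shard can be illustrated with a cycling iterator. The copy labels below are hypothetical placeholders, and the sketch ignores node failures and any smarter selection strategy a real cluster may apply.

```python
from itertools import cycle

# Hypothetical copies of shard 0: one primary and two replicas.
copies_of_shard_0 = ["P0@node3", "R0@node1", "R0@node2"]

picker = cycle(copies_of_shard_0)

# Four consecutive GET requests for documents that live on shard 0.
chosen = [next(picker) for _ in range(4)]
print(chosen)  # ['P0@node3', 'R0@node1', 'R0@node2', 'P0@node3']
```

Spreading reads over all copies this way is what lets replicas increase read throughput.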

2.3 Updating a Document (`UPDATE`)

Segments are immutable, so a document in an older segment cannot be updated in place. Instead, every commit point includes a .del file that lists which documents in which segments have been deleted. When a document is updated, the old version is marked as deleted and the new version is indexed in a new segment. Both versions of the document may match a query, but the older, deleted version is removed before the query results are returned.

2.4 Deleting a Document (`DELETE`)

Segments are immutable, so documents cannot be removed from older segments. Instead, every commit point includes a .del file that lists which documents in which segments have been deleted.
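
The update and delete behavior described in sections 2.3 and 2.4 can be imitated with append-only segments plus a set of deletion marks. This is a toy sketch: `SegmentStore` is hypothetical, each segment is an immutable tuple of `(doc_id, version, source)` entries, and the `deleted` set merely stands in for the .del file.

```python
class SegmentStore:
    """Toy model of immutable segments with deletion marks."""

    def __init__(self):
        self.segments = []    # list of immutable tuples of (doc_id, version, source)
        self.deleted = set()  # stand-in for .del: (segment index, position)

    def _mark_deleted(self, doc_id):
        for seg_idx, segment in enumerate(self.segments):
            for pos, (existing_id, _, _) in enumerate(segment):
                if existing_id == doc_id:
                    self.deleted.add((seg_idx, pos))

    def index(self, doc_id, source):
        """Create or update: mark old versions deleted, write a new segment."""
        version = 1 + sum(1 for seg in self.segments for doc in seg if doc[0] == doc_id)
        self._mark_deleted(doc_id)
        self.segments.append(((doc_id, version, source),))

    def delete(self, doc_id):
        self._mark_deleted(doc_id)

    def search(self):
        """Match everything, then drop entries that carry a deletion mark."""
        return [
            doc
            for seg_idx, segment in enumerate(self.segments)
            for pos, doc in enumerate(segment)
            if (seg_idx, pos) not in self.deleted
        ]


store = SegmentStore()
store.index("1", {"logLevel": "INFO"})
store.index("1", {"logLevel": "WARN"})   # update: old version only gets marked
store.index("2", {"logLevel": "DEBUG"})
after_update = store.search()
store.delete("2")
after_delete = store.search()
print(after_update, after_delete, len(store.segments))
```

All three segments are still present at the end; nothing is physically removed by an update or a delete, only filtered out of the results.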

References

https://www.elastic.co/guide/en/elasticsearch/guide/2.x/translog.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/near-real-time.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-translog.html
