实现一个简单的Database9(译文)

* GreatSQL社区原创内容未经授权不得随意使用，转载请联系小编并注明来源。

前文回顾

实现一个简单的Database1(译文）
实现一个简单的Database2(译文）
实现一个简单的Database3(译文）
实现一个简单的Database4(译文）
实现一个简单的Database5(译文）
实现一个简单的Database6(译文)
实现一个简单的Database7(译文)
实现一个简单的Database8(译文)

译注：cstack在github维护了一个简单的、类似sqlite的数据库实现，通过这个简单的项目，可以很好的理解数据库是如何运行的。本文是第九篇，主要是实现B-tree的二叉搜索并解决主键重复问题

Part 9 二叉搜索与主键重复

上次注意到我们的 B 树存储 key 时仍然是非排序的。现在我们来解决这个问题，并检测和拒绝主键的重复（插入）。

现在我们的 execute_insert() 函数在插入时，选择的位置是在表的结尾处。作为替换，我们需要搜索表（树）中正确的插入位置，然后把 key 插入到这个位置上。如果 key 在这个位置上已经存在，那么就返回（键重复）报错。

代码语言：javascript复制

ExecuteResult execute_insert(Statement* statement, Table* table) {
   void* node = get_page(table->pager, table->root_page_num);
-  if ((*leaf_node_num_cells(node) >= LEAF_NODE_MAX_CELLS)) {
   uint32_t num_cells = (*leaf_node_num_cells(node));
   if (num_cells >= LEAF_NODE_MAX_CELLS) {
     return EXECUTE_TABLE_FULL;
   }

   Row* row_to_insert = &(statement->row_to_insert);
-  Cursor* cursor = table_end(table);
   uint32_t key_to_insert = row_to_insert->id;
   Cursor* cursor = table_find(table, key_to_insert);
 
   if (cursor->cell_num < num_cells) {
     uint32_t key_at_index = *leaf_node_key(node, cursor->cell_num);
     if (key_at_index == key_to_insert) {
       return EXECUTE_DUPLICATE_KEY;
     }
   }

   leaf_node_insert(cursor, row_to_insert->id, row_to_insert);

这样的话，我们就不再需要 table_end() 函数了。

代码语言：javascript复制

-Cursor* table_end(Table* table) {
-  Cursor* cursor = malloc(sizeof(Cursor));
-  cursor->table = table;
-  cursor->page_num = table->root_page_num;
-
-  void* root_node = get_page(table->pager, table->root_page_num);
-  uint32_t num_cells = *leaf_node_num_cells(root_node);
-  cursor->cell_num = num_cells;
-  cursor->end_of_table = true;
-
-  return cursor;
-}

我们用一个方法替换 table_end() 函数，方法实现使用给定的 key 来搜索 Btree。

代码语言：javascript复制

 /*
 Return the position of the given key.
 If the key is not present, return the position
 where it should be inserted
 */
 Cursor* table_find(Table* table, uint32_t key) {
   uint32_t root_page_num = table->root_page_num;
   void* root_node = get_page(table->pager, root_page_num);
 
   if (get_node_type(root_node) == NODE_LEAF) {
     return leaf_node_find(table, root_page_num, key);
   } else {
     printf("Need to implement searching an internal noden");
     exit(EXIT_FAILURE);
   }
 }

我先把内部节点实现的分支存根，因为我还没有实现内部节点（internal node）。我们可以在叶子节点中使用二叉搜索法来进行搜索。

代码语言：javascript复制

 Cursor* leaf_node_find(Table* table, uint32_t page_num, uint32_t key) {
   void* node = get_page(table->pager, page_num);
   uint32_t num_cells = *leaf_node_num_cells(node);
 
   Cursor* cursor = malloc(sizeof(Cursor));
   cursor->table = table;
   cursor->page_num = page_num;
 
   // Binary search
   uint32_t min_index = 0;
   uint32_t one_past_max_index = num_cells;
   while (one_past_max_index != min_index) {
     uint32_t index = (min_index   one_past_max_index) / 2;
     uint32_t key_at_index = *leaf_node_key(node, index);
     if (key == key_at_index) {
       cursor->cell_num = index;
       return cursor;
     }
     if (key < key_at_index) {
       one_past_max_index = index;
     } else {
       min_index = index   1;
     }
   }
 
   cursor->cell_num = min_index;
   return cursor;
 }

这将返回：

key 的位置
如果我们需要插入一个新的 key，需要移动 key 的话，返回这个 key 的位置
超出最后一个 key 的位置

因为我们现在会检查节点的类型，所以需要一个函数在节点中获取和设置节点的类型值。

代码语言：javascript复制

 NodeType get_node_type(void* node) {
   uint8_t value = *((uint8_t*)(node   NODE_TYPE_OFFSET));
   return (NodeType)value;
 }
 
 void set_node_type(void* node, NodeType type) {
   uint8_t value = type;
   *((uint8_t*)(node   NODE_TYPE_OFFSET)) = value;
 }

需要先把值转为 uint8_t 类型来确保它作为一个序列化的单字节。还需要初始化节点的类型。

代码语言：javascript复制

-void initialize_leaf_node(void* node) { *leaf_node_num_cells(node) = 0; }
 void initialize_leaf_node(void* node) {
   set_node_type(node, NODE_LEAF);
   *leaf_node_num_cells(node) = 0;
 }

最后，我们需要创建并处理一个新的错误码。

代码语言：javascript复制

-enum ExecuteResult_t { EXECUTE_SUCCESS, EXECUTE_TABLE_FULL };
 enum ExecuteResult_t {
   EXECUTE_SUCCESS,
   EXECUTE_DUPLICATE_KEY,
   EXECUTE_TABLE_FULL
 };

代码语言：javascript复制

case (EXECUTE_SUCCESS):
  printf("Executed.n");
  break;
       case (EXECUTE_DUPLICATE_KEY):
         printf("Error: Duplicate key.n");
         break;
case (EXECUTE_TABLE_FULL):
  printf("Error: Table full.n");
  break;

有了这些修改，我们的测试可以更改为检查排序顺序：

代码语言：javascript复制

"db > Executed.",
"db > Tree:",
"leaf (size 3)",
-      "  - 0 : 3",
-      "  - 1 : 1",
-      "  - 2 : 2",
       "  - 0 : 1",
       "  - 1 : 2",
       "  - 2 : 3",
"db > "
])
end

并且我们能够为重复键增加一个新的测试：

代码语言：javascript复制

   it 'prints an error message if there is a duplicate id' do
     script = [
       "insert 1 user1 person1@example.com",
       "insert 1 user1 person1@example.com",
       "select",
       ".exit",
     ]
     result = run_script(script)
     expect(result).to match_array([
       "db > Executed.",
       "db > Error: Duplicate key.",
       "db > (1, user1, person1@example.com)",
       "Executed.",
       "db > ",
     ])
   end

就是这样！接下来：实现拆分叶节点和创建内部节点。

Enjoy GreatSQL :)

《零基础学习MySQL》视频课程

戳此小程序即可直达B站

https://www.bilibili.com/video/BV1Da411W7Va?spm_id_from=333.999.0.0&vd_source=ae1951b64ea7b9e6ba11f1d0bbcff0e4

文章推荐：

GreatSQL 加入龙蜥社区，打造基于“龙蜥 GreatSQL”的开源技术底座
“采访”ChatGPT看看它对我们GreatSQL社区有什么看法
MySQL中sp运行check表版本更新流程解析
GreatSQL社区月报 | 2023.01
MySQL 8.0.32如期而至

关于 GreatSQL

GreatSQL是由万里数据库维护的MySQL分支，专注于提升MGR可靠性及性能，支持InnoDB并行查询特性，是适用于金融级应用的MySQL分支版本。

GreatSQL社区官网: https://greatsql.cn/

Gitee: https://gitee.com/GreatSQL/GreatSQL

GitHub: https://github.com/GreatSQL/GreatSQL

Bilibili：https://space.bilibili.com/1363850082

https 云数据库SQLServer 网络安全数据库 sql

0 人点赞