Implementing a Simple Database, Part 4 (Translation)

2023-02-23 14:44:23

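* This is original content from the GreatSQL community and may not be used without authorization. To repost it, contact the editor and credit the source.

Previously

  • Implementing a Simple Database, Part 1 (Translation)
  • Implementing a Simple Database, Part 2 (Translation)
  • Implementing a Simple Database, Part 3 (Translation)

Translator's note: cstack maintains a simple, SQLite-like database implementation on GitHub. Working through this small project is a good way to understand how a database runs. This is the fourth article; it uses rspec to test the features implemented so far and fixes the bugs the tests expose.

Part 4: Our First Tests (and Bugs)

We now have the ability to insert rows into our database and to print out all rows. Let's take a moment to test what we have so far.

I'm going to use rspec to write my tests because I'm familiar with it and its syntax is fairly readable.

Translator's note: rspec is a Ruby-based testing framework. Its syntax is simple, and it makes it easy to drive any executable program and check its output.

I'll define a short helper that sends a list of commands to our database program and then makes assertions about the output: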

describe 'database' do
  def run_script(commands)
    raw_output = nil
    IO.popen("./db", "r+") do |pipe|
      commands.each do |command|
        pipe.puts command
      end

      pipe.close_write

      # Read entire output
      raw_output = pipe.gets(nil)
    end
    raw_output.split("\n")
  end

  it 'inserts and retrieves a row' do
    result = run_script([
      "insert 1 user1 person1@example.com",
      "select",
      ".exit",
    ])
    expect(result).to match_array([
      "db > Executed.",
      "db > (1, user1, person1@example.com)",
      "Executed.",
      "db > ",
    ])
  end
end

This simple test makes sure we get back what we put in. And indeed it passes:

bundle exec rspec
.

Finished in 0.00871 seconds (files took 0.09506 seconds to load)
1 example, 0 failures

Now it's feasible to test inserting a large number of rows into the database:

it 'prints error message when table is full' do
  script = (1..1401).map do |i|
    "insert #{i} user#{i} person#{i}@example.com"
  end
  script << ".exit"
  result = run_script(script)
  expect(result[-2]).to eq('db > Error: Table full.')
end

Running the tests again:

bundle exec rspec
..

Finished in 0.01553 seconds (files took 0.08156 seconds to load)
2 examples, 0 failures

Sweet, it works! Our database can hold 1400 rows right now because we set the maximum number of pages to 100, and 14 rows fit in a page.
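
As a rough check of the 1400-row figure, the sketch below recomputes it. The 4096-byte page size and the 4-byte id are assumptions carried over from the earlier parts of this series (they are not restated in this part), and `rows_per_page` and `table_max_rows` are hypothetical helper names used only for illustration.

```c
#include <stdint.h>

/* Assumed value: PAGE_SIZE comes from the earlier parts of this series;
   it is not stated in this part. */
#define PAGE_SIZE 4096
#define TABLE_MAX_PAGES 100
#define COLUMN_USERNAME_SIZE 32
#define COLUMN_EMAIL_SIZE 255

/* Hypothetical helper: how many whole rows of row_size bytes fit in one page. */
uint32_t rows_per_page(uint32_t row_size) {
  return PAGE_SIZE / row_size;
}

/* Hypothetical helper: total row capacity of the table. */
uint32_t table_max_rows(uint32_t row_size) {
  return rows_per_page(row_size) * TABLE_MAX_PAGES;
}
```

With a 4-byte id, a 32-byte username and a 255-byte email, a row takes 291 bytes, so `rows_per_page(291)` is 14 and `table_max_rows(291)` is 1400.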

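Reading through the code we have so far, I realized we might not be handling the storage of text fields correctly. That is easy to test with the following example, which inserts strings of the maximum length: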

it 'allows inserting strings that are the maximum length' do
  long_username = "a"*32
  long_email = "a"*255
  script = [
    "insert 1 #{long_username} #{long_email}",
    "select",
    ".exit",
  ]
  result = run_script(script)
  expect(result).to match_array([
    "db > Executed.",
    "db > (1, #{long_username}, #{long_email})",
    "Executed.",
    "db > ",
  ])
end

And the test fails:

Failures:

  1) database allows inserting strings that are the maximum length
     Failure/Error: raw_output.split("\n")

     ArgumentError:
       invalid byte sequence in UTF-8
     # ./spec/main_spec.rb:14:in `split'
     # ./spec/main_spec.rb:14:in `run_script'
     # ./spec/main_spec.rb:48:in `block (2 levels) in <top (required)>'

If we try it by hand, we see some strange characters when the row is printed (I have abbreviated the long strings in this example):

db > insert 1 aaaaa... aaaaa...
Executed.
db > select
(1, aaaaa...aaa�, aaaaa...aaa�)
Executed.
db >

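What's going on? If you look at our definition of the Row struct, we allocate exactly 32 bytes for username and exactly 255 bytes for email. But C strings are supposed to end with a null character, and we did not allocate space for it. The solution is to allocate one additional byte to hold that null character: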

 const uint32_t COLUMN_EMAIL_SIZE = 255;
 typedef struct {
   uint32_t id;
-  char username[COLUMN_USERNAME_SIZE];
-  char email[COLUMN_EMAIL_SIZE];
+  char username[COLUMN_USERNAME_SIZE + 1];
+  char email[COLUMN_EMAIL_SIZE + 1];
 } Row;

And indeed that fixes it (rerunning the maximum-length test above):

bundle exec rspec
...

Finished in 0.0188 seconds (files took 0.08516 seconds to load)
3 examples, 0 failures
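
To see why the extra byte matters, here is a minimal sketch that is separate from the database code. `copy_if_fits` is a hypothetical helper: it copies a string only when the destination has room for every character plus the terminating null character.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the tutorial code.
   Copies src into dest only if dest (dest_size bytes) can hold all of
   src's characters plus the terminating '\0'. Returns 1 on success,
   0 if src would not fit. */
int copy_if_fits(char* dest, size_t dest_size, const char* src) {
  size_t needed = strlen(src) + 1; /* +1 for the null terminator */
  if (needed > dest_size) {
    return 0;
  }
  memcpy(dest, src, needed);
  return 1;
}
```

A 32-character username is rejected for a 32-byte buffer but accepted for a 33-byte one, which is why each text field in Row needs one additional byte.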

We should not allow inserting a username or email that is longer than the fixed column size. The spec for that looks like this:

it 'prints error message if strings are too long' do
  long_username = "a"*33
  long_email = "a"*256
  script = [
    "insert 1 #{long_username} #{long_email}",
    "select",
    ".exit",
  ]
  result = run_script(script)
  expect(result).to match_array([
    "db > String is too long.",
    "db > Executed.",
    "db > ",
  ])
end

To support this, we need to upgrade our parser. As a reminder, we are currently using sscanf():

if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
  statement->type = STATEMENT_INSERT;
  int args_assigned = sscanf(
      input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
      statement->row_to_insert.username, statement->row_to_insert.email);
  if (args_assigned < 3) {
    return PREPARE_SYNTAX_ERROR;
  }
  return PREPARE_SUCCESS;
}

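But sscanf() has some disadvantages. If the string it reads is larger than the buffer it is reading into, it causes a buffer overflow and writes into unexpected places. So we want to check the length of each string before copying it into a Row structure, and to do that we need to split the input on spaces.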
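
One partial mitigation, shown here only as a sketch and not what the tutorial does, is to give each `%s` a maximum field width. `parse_insert_bounded` is a hypothetical name, and the widths 32 and 255 assume the column sizes used in this part.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the tutorial code.
   username must point to at least 33 bytes and email to at least 256
   bytes: a width of N lets sscanf store N characters plus the '\0'.
   Returns the number of fields sscanf assigned. */
int parse_insert_bounded(const char* line, int* id, char* username,
                         char* email) {
  return sscanf(line, "insert %d %32s %255s", id, username, email);
}
```

With a 33-character username this no longer overflows, but the 33rd character is silently parsed as the start of the email, so the call still reports three fields. Bounding the read prevents the overflow but cannot report that a value was too long, which is one reason to split the input and check the lengths explicitly.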

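Translator's note: strtok is a string-handling function declared as char * strtok ( char * str, const char * delimiters ); it splits a string into a sequence of tokens. str is the string to split, and delimiters holds the delimiter characters (if it contains several characters, each one is treated as a delimiter).

I'm going to use strtok() to do that. I think it's easiest to understand if you see it in action: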

+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
+  statement->type = STATEMENT_INSERT;
+
+  char* keyword = strtok(input_buffer->buffer, " ");
+  char* id_string = strtok(NULL, " ");
+  char* username = strtok(NULL, " ");
+  char* email = strtok(NULL, " ");
+
+  if (id_string == NULL || username == NULL || email == NULL) {
+    return PREPARE_SYNTAX_ERROR;
+  }
+
+  int id = atoi(id_string);
+  if (strlen(username) > COLUMN_USERNAME_SIZE) {
+    return PREPARE_STRING_TOO_LONG;
+  }
+  if (strlen(email) > COLUMN_EMAIL_SIZE) {
+    return PREPARE_STRING_TOO_LONG;
+  }
+
+  statement->row_to_insert.id = id;
+  strcpy(statement->row_to_insert.username, username);
+  strcpy(statement->row_to_insert.email, email);
+
+  return PREPARE_SUCCESS;
+}
+
 PrepareResult prepare_statement(InputBuffer* input_buffer,
                                 Statement* statement) {
   if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+    return prepare_insert(input_buffer, statement);
-    statement->type = STATEMENT_INSERT;
-    int args_assigned = sscanf(
-        input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
-        statement->row_to_insert.username, statement->row_to_insert.email);
-    if (args_assigned < 3) {
-      return PREPARE_SYNTAX_ERROR;
-    }
-    return PREPARE_SUCCESS;
   }

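Calling strtok() successively on the input buffer breaks it into substrings by writing a null character wherever it reaches a delimiter (a space, in our case). Each call returns a pointer to the start of the next substring.

We can then call strlen(), which returns the length of a string, on each text value to see whether it is too long.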
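
The sketch below shows the same splitting technique in isolation. `split_on_spaces` is a hypothetical helper, not part of the tutorial code. Note that strtok() modifies the buffer it is given, so it must be called on writable memory rather than on a string literal.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line on spaces in place and stores up to
   max_tokens pointers in tokens. Returns the number of tokens stored.
   line must be writable, because strtok() overwrites the delimiter
   that ends each token with '\0'. */
size_t split_on_spaces(char* line, char* tokens[], size_t max_tokens) {
  size_t count = 0;
  char* token = strtok(line, " ");
  while (token != NULL && count < max_tokens) {
    tokens[count++] = token;
    token = strtok(NULL, " ");
  }
  return count;
}
```

Given a writable buffer holding `insert 1 user1 person1@example.com`, the helper stores four tokens (`insert`, `1`, `user1`, `person1@example.com`), and strlen() can then be applied to each one.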

We can handle this error the same way we handle any other error code:

 enum PrepareResult_t {
   PREPARE_SUCCESS,
+  PREPARE_STRING_TOO_LONG,
   PREPARE_SYNTAX_ERROR,
   PREPARE_UNRECOGNIZED_STATEMENT
 };

 switch (prepare_statement(input_buffer, &statement)) {
   case (PREPARE_SUCCESS):
     break;
+  case (PREPARE_STRING_TOO_LONG):
+    printf("String is too long.\n");
+    continue;
   case (PREPARE_SYNTAX_ERROR):
     printf("Syntax error. Could not parse statement.\n");
     continue;

That makes our test pass:

bundle exec rspec
....

Finished in 0.02284 seconds (files took 0.116 seconds to load)
4 examples, 0 failures

While we're here, we might as well handle one more error case (inserting a negative id):

it 'prints an error message if id is negative' do
  script = [
    "insert -1 cstack foo@bar.com",
    "select",
    ".exit",
  ]
  result = run_script(script)
  expect(result).to match_array([
    "db > ID must be positive.",
    "db > Executed.",
    "db > ",
  ])
end

 enum PrepareResult_t {
   PREPARE_SUCCESS,
+  PREPARE_NEGATIVE_ID,
   PREPARE_STRING_TOO_LONG,
   PREPARE_SYNTAX_ERROR,
   PREPARE_UNRECOGNIZED_STATEMENT
@@ -148,9 +147,6 @@ PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
   }

   int id = atoi(id_string);
+  if (id < 0) {
+    return PREPARE_NEGATIVE_ID;
+  }
   if (strlen(username) > COLUMN_USERNAME_SIZE) {
     return PREPARE_STRING_TOO_LONG;
   }
@@ -230,9 +226,6 @@ int main(int argc, char* argv[]) {
     switch (prepare_statement(input_buffer, &statement)) {
       case (PREPARE_SUCCESS):
         break;
+      case (PREPARE_NEGATIVE_ID):
+        printf("ID must be positive.\n");
+        continue;
       case (PREPARE_STRING_TOO_LONG):
         printf("String is too long.\n");
         continue;
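
The negative-id check relies on atoi(), which returns 0 when the input is not a number, so an id such as `abc` would be stored as 0. As a possible refinement that the tutorial does not use, the sketch below swaps in the standard strtol() function to reject such input as well. `parse_id` is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the tutorial code.
   Parses text as a non-negative base-10 id and stores it in *out.
   Returns 1 on success, 0 if text is not a number, has trailing
   characters, is negative, or does not fit in an int. */
int parse_id(const char* text, int* out) {
  char* end = NULL;
  errno = 0;
  long value = strtol(text, &end, 10);
  if (end == text || *end != '\0') {
    return 0; /* no digits at all, or junk after the number */
  }
  if (errno == ERANGE || value < 0 || value > INT_MAX) {
    return 0; /* overflow, negative, or too large for int */
  }
  *out = (int)value;
  return 1;
}
```

A caller could map a return value of 0 to an error code in the same way the negative-id case is handled.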

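Alright, that's enough testing for now. Next up is a very important feature: persistence! We are going to save our database to a file and read it back out again. (For now it lives only in memory.)

It keeps getting better.

Here is the complete diff against the previous part: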

@@ -22,6 +22,8 @@

 enum PrepareResult_t {
   PREPARE_SUCCESS,
+  PREPARE_NEGATIVE_ID,
+  PREPARE_STRING_TOO_LONG,
   PREPARE_SYNTAX_ERROR,
   PREPARE_UNRECOGNIZED_STATEMENT
 };
@@ -34,8 +36,8 @@
 #define COLUMN_EMAIL_SIZE 255
 typedef struct {
   uint32_t id;
-  char username[COLUMN_USERNAME_SIZE];
-  char email[COLUMN_EMAIL_SIZE];
+  char username[COLUMN_USERNAME_SIZE + 1];
+  char email[COLUMN_EMAIL_SIZE + 1];
 } Row;

@@ -150,18 +152,40 @@ MetaCommandResult do_meta_command(InputBuffer* input_buffer, Table *table) {
   }
 }

-PrepareResult prepare_statement(InputBuffer* input_buffer,
-                                Statement* statement) {
-  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+PrepareResult prepare_insert(InputBuffer* input_buffer, Statement* statement) {
   statement->type = STATEMENT_INSERT;
-  int args_assigned = sscanf(
-     input_buffer->buffer, "insert %d %s %s", &(statement->row_to_insert.id),
-     statement->row_to_insert.username, statement->row_to_insert.email
-     );
-  if (args_assigned < 3) {
+
+  char* keyword = strtok(input_buffer->buffer, " ");
+  char* id_string = strtok(NULL, " ");
+  char* username = strtok(NULL, " ");
+  char* email = strtok(NULL, " ");
+
+  if (id_string == NULL || username == NULL || email == NULL) {
      return PREPARE_SYNTAX_ERROR;
   }
+
+  int id = atoi(id_string);
+  if (id < 0) {
+     return PREPARE_NEGATIVE_ID;
+  }
+  if (strlen(username) > COLUMN_USERNAME_SIZE) {
+     return PREPARE_STRING_TOO_LONG;
+  }
+  if (strlen(email) > COLUMN_EMAIL_SIZE) {
+     return PREPARE_STRING_TOO_LONG;
+  }
+
+  statement->row_to_insert.id = id;
+  strcpy(statement->row_to_insert.username, username);
+  strcpy(statement->row_to_insert.email, email);
+
   return PREPARE_SUCCESS;
+
+}
+PrepareResult prepare_statement(InputBuffer* input_buffer,
+                                Statement* statement) {
+  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+      return prepare_insert(input_buffer, statement);
   }
   if (strcmp(input_buffer->buffer, "select") == 0) {
     statement->type = STATEMENT_SELECT;
@@ -223,6 +247,12 @@ int main(int argc, char* argv[]) {
     switch (prepare_statement(input_buffer, &statement)) {
       case (PREPARE_SUCCESS):
         break;
+      case (PREPARE_NEGATIVE_ID):
+        printf("ID must be positive.\n");
+        continue;
+      case (PREPARE_STRING_TOO_LONG):
+        printf("String is too long.\n");
+        continue;
       case (PREPARE_SYNTAX_ERROR):
         printf("Syntax error. Could not parse statement.\n");
         continue;

And we added some tests:

 describe 'database' do
   def run_script(commands)
     raw_output = nil
     IO.popen("./db", "r+") do |pipe|
       commands.each do |command|
         pipe.puts command
       end
 
       pipe.close_write
 
       # Read entire output
       raw_output = pipe.gets(nil)
     end
     raw_output.split("\n")
   end
 
   it 'inserts and retrieves a row' do
     result = run_script([
       "insert 1 user1 person1@example.com",
       "select",
       ".exit",
     ])
     expect(result).to match_array([
       "db > Executed.",
       "db > (1, user1, person1@example.com)",
       "Executed.",
       "db > ",
     ])
   end
 
   it 'prints error message when table is full' do
     script = (1..1401).map do |i|
       "insert #{i} user#{i} person#{i}@example.com"
     end
     script << ".exit"
     result = run_script(script)
     expect(result[-2]).to eq('db > Error: Table full.')
   end
 
   it 'allows inserting strings that are the maximum length' do
     long_username = "a"*32
     long_email = "a"*255
     script = [
       "insert 1 #{long_username} #{long_email}",
       "select",
       ".exit",
     ]
     result = run_script(script)
     expect(result).to match_array([
       "db > Executed.",
       "db > (1, #{long_username}, #{long_email})",
       "Executed.",
       "db > ",
     ])
   end
 
   it 'prints error message if strings are too long' do
     long_username = "a"*33
     long_email = "a"*256
     script = [
       "insert 1 #{long_username} #{long_email}",
       "select",
       ".exit",
     ]
     result = run_script(script)
     expect(result).to match_array([
       "db > String is too long.",
       "db > Executed.",
       "db > ",
     ])
   end
 
   it 'prints an error message if id is negative' do
     script = [
       "insert -1 cstack foo@bar.com",
       "select",
       ".exit",
     ]
     result = run_script(script)
     expect(result).to match_array([
       "db > ID must be positive.",
       "db > Executed.",
       "db > ",
     ])
   end
 end

Enjoy GreatSQL :)


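About GreatSQL

GreatSQL is a MySQL branch maintained by 万里数据库 (Wanli Database). It focuses on improving MGR reliability and performance, supports InnoDB parallel query, and is a MySQL branch suited to financial-grade applications.

GreatSQL community website: https://greatsql.cn/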

Gitee: https://gitee.com/GreatSQL/GreatSQL

GitHub: https://github.com/GreatSQL/GreatSQL

Bilibili:

https://space.bilibili.com/1363850082/video
