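Part 1 of the series: "Build a Database System from Scratch, Step by Step"
We are building a clone of SQLite. SQLite's front end is a SQL compiler that takes a string and outputs an internal representation called bytecode.
This bytecode is passed to the virtual machine, which executes it.
Splitting the work into two steps like this has two advantages:
It reduces the complexity of each part (for example, the virtual machine does not have to worry about syntax errors).
It allows common queries to be compiled once and the resulting bytecode cached, which improves performance.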
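The two-step split described above can be illustrated with a toy sketch. This is not SQLite's real bytecode format: the `OpCode`, `Program`, `compile`, and `run` names are hypothetical and exist only to show a front end that rejects bad input and a back end that runs an already validated program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes for illustration; SQLite's real bytecode is far richer. */
typedef enum { OP_INSERT, OP_SELECT, OP_HALT } OpCode;

typedef struct {
  OpCode ops[4];
  size_t count;
} Program;

/* Front end: the only place that deals with malformed input.
   Returns 0 on success and -1 if the input is not recognized. */
int compile(const char* source, Program* out) {
  assert(source != NULL && out != NULL);
  out->count = 0;
  if (strncmp(source, "insert", 6) == 0) {
    out->ops[out->count++] = OP_INSERT;
  } else if (strcmp(source, "select") == 0) {
    out->ops[out->count++] = OP_SELECT;
  } else {
    return -1;
  }
  out->ops[out->count++] = OP_HALT;
  return 0;
}

/* Back end: executes a compiled program and never sees raw text.
   Returns the number of instructions executed before OP_HALT. */
int run(const Program* program) {
  assert(program != NULL);
  int executed = 0;
  for (size_t pc = 0; pc < program->count; pc++) {
    switch (program->ops[pc]) {
      case OP_INSERT:
      case OP_SELECT:
        executed++;
        break;
      case OP_HALT:
        return executed;
    }
  }
  return executed;
}
```

Under these assumptions, a caller could call `compile` once and then call `run` on the same `Program` as many times as needed, which is the idea behind caching compiled queries.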
With this design in mind, let's refactor the main function and add support for two new keywords in the process:
 int main(int argc, char* argv[]) {
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt();
     read_input(input_buffer);
-    if (strcmp(input_buffer->buffer, ".exit") == 0) {
-      exit(EXIT_SUCCESS);
-    } else {
-      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
+    if (input_buffer->buffer[0] == '.') {
+      switch (do_meta_command(input_buffer)) {
+        case (META_COMMAND_SUCCESS):
+          continue;
+        case (META_COMMAND_UNRECOGNIZED_COMMAND):
+          printf("Unrecognized command '%s'\n", input_buffer->buffer);
+          continue;
+      }
     }
+
+    Statement statement;
+    switch (prepare_statement(input_buffer, &statement)) {
+      case (PREPARE_SUCCESS):
+        break;
+      case (PREPARE_UNRECOGNIZED_STATEMENT):
+        printf("Unrecognized keyword at start of '%s'.\n",
+               input_buffer->buffer);
+        continue;
+    }
+
+    execute_statement(&statement);
+    printf("Executed.\n");
   }
 }
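Non-SQL statements such as .exit are called "meta-commands". They all start with a dot, so we check for that and handle them in a separate function.
Next, we add a step that converts the input line into our internal representation of a statement. This is our home-made version of the SQLite front end.
Finally, we pass the prepared statement to execute_statement. This function will eventually become our virtual machine.
Two of the new functions return enums that indicate success or failure: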
typedef enum {
  META_COMMAND_SUCCESS,
  META_COMMAND_UNRECOGNIZED_COMMAND
} MetaCommandResult;
typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;
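"Unrecognized statement" looks a bit like an exception. Throwing exceptions is something I would rather avoid (and C does not even support them), so we use enum result codes wherever practical. The C compiler can warn when a switch statement does not handle every member of an enum (GCC and Clang do this under -Wall), so we can be more confident that we handle every result a function returns. Expect more result codes to be added later.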
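To make the result-code pattern concrete, here is a minimal sketch. The `classify` and `prepare_result_message` helpers are hypothetical and not part of the article's code; only the `PrepareResult` enum is taken from it. The switch deliberately has no `default` case so that a compiler with enum switch warnings enabled can flag a member that was added to the enum but not handled.

```c
#include <assert.h>
#include <string.h>

typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;

/* Hypothetical helper: report the outcome as a result code, not an exception. */
PrepareResult classify(const char* input) {
  assert(input != NULL);
  if (strncmp(input, "insert", 6) == 0 || strcmp(input, "select") == 0) {
    return PREPARE_SUCCESS;
  }
  return PREPARE_UNRECOGNIZED_STATEMENT;
}

/* Hypothetical helper: map each result code to a message.
   No default case on purpose, so an unhandled enum member can be
   reported by the compiler's switch warning. */
const char* prepare_result_message(PrepareResult result) {
  switch (result) {
    case PREPARE_SUCCESS:
      return "ok";
    case PREPARE_UNRECOGNIZED_STATEMENT:
      return "unrecognized statement";
  }
  /* Reached only if an out-of-range value is passed in. */
  return "unknown result code";
}
```

If a third member were later added to `PrepareResult`, the switch in `prepare_result_message` would be the place where the compiler warning points, which is the safety net the paragraph above relies on.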
do_meta_command is just a wrapper around the existing functionality, which keeps main short and leaves room for more commands:
MetaCommandResult do_meta_command(InputBuffer* input_buffer) {
  if (strcmp(input_buffer->buffer, ".exit") == 0) {
    exit(EXIT_SUCCESS);
  } else {
    return META_COMMAND_UNRECOGNIZED_COMMAND;
  }
}
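One possible way to grow this wrapper as more meta-commands appear is a lookup table. The sketch below is an assumption, not the article's approach: `MetaCommand`, `dispatch_meta_command`, and the `.help` command are hypothetical, and `.exit` is left out so the example can be called without terminating the process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef enum {
  META_COMMAND_SUCCESS,
  META_COMMAND_UNRECOGNIZED_COMMAND
} MetaCommandResult;

typedef void (*MetaHandler)(void);

typedef struct {
  const char* name;     /* command text, including the leading dot */
  MetaHandler handler;  /* function to run when the name matches */
} MetaCommand;

/* Hypothetical command used only for illustration. */
static void handle_help(void) { printf("Meta-commands: .help\n"); }

static const MetaCommand META_COMMANDS[] = {
    {".help", handle_help},
};

/* Look the line up in the table; run the handler on an exact match. */
MetaCommandResult dispatch_meta_command(const char* line) {
  assert(line != NULL);
  size_t count = sizeof(META_COMMANDS) / sizeof(META_COMMANDS[0]);
  for (size_t i = 0; i < count; i++) {
    if (strcmp(line, META_COMMANDS[i].name) == 0) {
      META_COMMANDS[i].handler();
      return META_COMMAND_SUCCESS;
    }
  }
  return META_COMMAND_UNRECOGNIZED_COMMAND;
}
```

With a table like this, adding a command means adding one row rather than another branch, and the caller's switch on `MetaCommandResult` stays unchanged.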
Our "prepared statement" currently contains only an enum with two possible values. It will hold more data once we allow parameters in statements:
typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;
typedef struct {
  StatementType type;
} Statement;
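As a rough illustration of how the statement could carry parameters later, the sketch below adds two fields and fills them with sscanf. Everything beyond `StatementType` is an assumption: the `StatementWithArgs` type, the field names, the buffer sizes, and the `parse_insert_args` helper are hypothetical and chosen only to match the shape of an input such as "insert cstack foo@bar.com".

```c
#include <assert.h>
#include <stdio.h>

typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;

/* Hypothetical extension: the article's Statement holds only `type`. */
typedef struct {
  StatementType type;
  char username[33]; /* assumed limit: 32 characters plus terminator */
  char email[256];   /* assumed limit: 255 characters plus terminator */
} StatementWithArgs;

/* Returns 1 if both arguments were parsed, 0 otherwise.
   The field widths in the format string keep sscanf inside the buffers. */
int parse_insert_args(const char* line, StatementWithArgs* out) {
  assert(line != NULL && out != NULL);
  out->type = STATEMENT_INSERT;
  int matched = sscanf(line, "insert %32s %255s", out->username, out->email);
  return matched == 2;
}
```

A failed parse would be a natural candidate for a new member of PrepareResult, but which codes to add is left open here.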
prepare_statement (our "SQL compiler") cannot compile SQL yet. In fact, it currently recognizes only two words:
PrepareResult prepare_statement(InputBuffer* input_buffer,
                                Statement* statement) {
  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
    statement->type = STATEMENT_INSERT;
    return PREPARE_SUCCESS;
  }
  if (strcmp(input_buffer->buffer, "select") == 0) {
    statement->type = STATEMENT_SELECT;
    return PREPARE_SUCCESS;
  }
  return PREPARE_UNRECOGNIZED_STATEMENT;
}
Because the insert keyword is followed by data, we match it with strncmp, which compares only the first 6 characters (for example, insert cstack foo@bar.com).
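The difference between the two comparisons can be checked in isolation. The helper names `starts_with_insert` and `is_select` below are hypothetical wrappers around the same strncmp and strcmp calls used in prepare_statement.

```c
#include <assert.h>
#include <string.h>

/* Prefix match: true when the first 6 characters are "insert". */
int starts_with_insert(const char* line) {
  assert(line != NULL);
  return strncmp(line, "insert", 6) == 0;
}

/* Exact match: true only when the whole line is "select". */
int is_select(const char* line) {
  assert(line != NULL);
  return strcmp(line, "select") == 0;
}
```

Note that a pure prefix match also accepts input such as "inserted", and that the exact match rejects "select" followed by anything else. A stricter check could additionally require the character after the keyword to be a space or the end of the string.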
Finally, execute_statement looks like this:
void execute_statement(Statement* statement) {
  switch (statement->type) {
    case (STATEMENT_INSERT):
      printf("This is where we would do an insert.\n");
      break;
    case (STATEMENT_SELECT):
      printf("This is where we would do a select.\n");
      break;
  }
}
Note that this function does not return any error codes yet.
With this refactoring, the program now recognizes two new keywords!
~ ./db
db > insert foo bar
This is where we would do an insert.
Executed.
db > delete foo
Unrecognized keyword at start of 'delete foo'.
db > select
This is where we would do a select.
Executed.
db > .tables
Unrecognized command '.tables'
db > .exit
~
The skeleton of our database is taking shape. In the next part, we will implement insert and select. Here is the complete diff for this part:
@@ -10,6 +10,23 @@ struct InputBuffer_t {
 } InputBuffer;
+typedef enum {
+  META_COMMAND_SUCCESS,
+  META_COMMAND_UNRECOGNIZED_COMMAND
+} MetaCommandResult;
+
+typedef enum { PREPARE_SUCCESS, PREPARE_UNRECOGNIZED_STATEMENT } PrepareResult;
+
+typedef enum { STATEMENT_INSERT, STATEMENT_SELECT } StatementType;
+
+typedef struct {
+  StatementType type;
+} Statement;
+
 InputBuffer* new_input_buffer() {
   InputBuffer* input_buffer = malloc(sizeof(InputBuffer));
   input_buffer->buffer = NULL;
@@ -40,17 +57,67 @@ void close_input_buffer(InputBuffer* input_buffer) {
   free(input_buffer);
 }
+MetaCommandResult do_meta_command(InputBuffer* input_buffer) {
+  if (strcmp(input_buffer->buffer, ".exit") == 0) {
+    close_input_buffer(input_buffer);
+    exit(EXIT_SUCCESS);
+  } else {
+    return META_COMMAND_UNRECOGNIZED_COMMAND;
+  }
+}
+
+PrepareResult prepare_statement(InputBuffer* input_buffer,
+                                Statement* statement) {
+  if (strncmp(input_buffer->buffer, "insert", 6) == 0) {
+    statement->type = STATEMENT_INSERT;
+    return PREPARE_SUCCESS;
+  }
+  if (strcmp(input_buffer->buffer, "select") == 0) {
+    statement->type = STATEMENT_SELECT;
+    return PREPARE_SUCCESS;
+  }
+
+  return PREPARE_UNRECOGNIZED_STATEMENT;
+}
+
+void execute_statement(Statement* statement) {
+  switch (statement->type) {
+    case (STATEMENT_INSERT):
+      printf("This is where we would do an insert.\n");
+      break;
+    case (STATEMENT_SELECT):
+      printf("This is where we would do a select.\n");
+      break;
+  }
+}
+
 int main(int argc, char* argv[]) {
   InputBuffer* input_buffer = new_input_buffer();
   while (true) {
     print_prompt();
     read_input(input_buffer);
-    if (strcmp(input_buffer->buffer, ".exit") == 0) {
-      close_input_buffer(input_buffer);
-      exit(EXIT_SUCCESS);
-    } else {
-      printf("Unrecognized command '%s'.\n", input_buffer->buffer);
+    if (input_buffer->buffer[0] == '.') {
+      switch (do_meta_command(input_buffer)) {
+        case (META_COMMAND_SUCCESS):
+          continue;
+        case (META_COMMAND_UNRECOGNIZED_COMMAND):
+          printf("Unrecognized command '%s'\n", input_buffer->buffer);
+          continue;
+      }
     }
+
+    Statement statement;
+    switch (prepare_statement(input_buffer, &statement)) {
+      case (PREPARE_SUCCESS):
+        break;
+      case (PREPARE_UNRECOGNIZED_STATEMENT):
+        printf("Unrecognized keyword at start of '%s'.\n",
+               input_buffer->buffer);
+        continue;
+    }
+
+    execute_statement(&statement);
+    printf("Executed.\n");
   }
 }
Original article: https://cstack.github.io/db_tutorial/parts/part2.html