Sub-Plan Naming in Distributed Databases

Original post by 晓楚, 2023-06-23 07:35:12. Blog category: OceanBase.

Copyright belongs to the author. This is an original work by 晓楚 on the 51CTO blog; contact the author for permission to repost.

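In a parallel database, a query plan is divided into multiple sub-plans. Each system has its own name for these sub-plans; this article summarizes the names based on a survey of the systems below.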

DB	Sub Plan Name
OceanBase	DFO
Oracle	DFO
SQL Server	Branch
Greenplum	Slice
Flink	Operator Chain
TiDB	Task
Presto	Stage / Fragment
ClickHouse	N/A

OceanBase

DFO

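A distributed plan is cut at its data redistribution points into logical sub-plans that can execute in parallel. Each sub-plan is encapsulated by a DFO.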

The figure below shows a plan tree containing 6 DFOs.

[Figure: plan tree containing 6 DFOs]
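As a rough illustration of cutting a plan at redistribution points, here is a minimal Python sketch. `PlanNode`, `is_exchange`, and `split_into_subplans` are hypothetical names made up for this example, not OceanBase classes, and the sketch assumes each exchange node stays with the sub-plan that consumes its output (real systems typically split an exchange into a sender and a receiver).

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """A node in a toy plan tree (hypothetical, not a real database class)."""
    name: str
    children: list["PlanNode"] = field(default_factory=list)
    is_exchange: bool = False  # True marks a data redistribution point


def split_into_subplans(root: PlanNode) -> list[list[str]]:
    """Cut the tree below every exchange node and return each sub-plan's operators."""
    subplans: list[list[str]] = []

    def visit(node: PlanNode, current: list[str]) -> None:
        current.append(node.name)
        for child in node.children:
            if node.is_exchange:
                # Everything below an exchange starts a new sub-plan.
                fresh: list[str] = []
                subplans.append(fresh)
                visit(child, fresh)
            else:
                visit(child, current)

    top: list[str] = []
    subplans.append(top)
    visit(root, top)
    return subplans


# Toy plan: two scans feed a join through exchanges, and the join feeds the result.
scan_a = PlanNode("SCAN t1")
scan_b = PlanNode("SCAN t2")
ex_a = PlanNode("EXCHANGE a", [scan_a], is_exchange=True)
ex_b = PlanNode("EXCHANGE b", [scan_b], is_exchange=True)
join = PlanNode("HASH JOIN", [ex_a, ex_b])
ex_root = PlanNode("EXCHANGE root", [join], is_exchange=True)
plan = PlanNode("RESULT", [ex_root])

subplans = split_into_subplans(plan)
for i, ops in enumerate(subplans):
    print(i, ops)
```

With three exchanges, this toy plan falls apart into four sub-plans. The same cutting rule is what the SQL Server and Greenplum descriptions later in this article state in their own vocabulary (Parallelism operators and motions, respectively).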

Oracle

DFO

A parallel execution plan is carried out as a series of producer/consumer operations. Parallel execution (PX) servers that produce data for subsequent operations are called producers, PX servers that require the output of other operations are called consumers. Each producer or consumer parallel operation is performed by a set of PX servers called PX server sets. The number of PX servers in PX server set is called Degree of Parallelism (DOP). The basic unit of work for a PX server set is called a data flow operation (DFO).

SQL Server

branch

If you think of an execution plan as a tree, a branch is an area of the plan that groups one or more operators between Parallelism operators, also called Exchange Iterators.

Greenplum

slice

To achieve maximum parallelism during query execution, Greenplum divides the work of the query plan into slices. A slice is a portion of the plan that segments can work on independently. A query plan is sliced wherever a motion operation occurs in the plan, with one slice on each side of the motion.

Flink

operator chain

An Operator Chain consists of two or more consecutive Operators without any repartitioning in between. Operators within the same Operator Chain forward records to each other directly without going through serialization or Flink’s network stack.

task (instance of operator chain?)

Multiple operations/operators can be chained together using a feature called chaining. A group of one or multiple (chained) operators that Flink considers as a unit of scheduling is called a task.
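To make the chaining rule concrete, here is a minimal Python sketch for a linear pipeline. It is not Flink code: `chain_operators` and `repartition_after` are hypothetical names, and the sketch models only the "no repartitioning in between" condition, while Flink itself applies further conditions before chaining two operators.

```python
def chain_operators(operators: list[str], repartition_after: set[int]) -> list[list[str]]:
    """Group a linear pipeline into chains, cutting after each repartitioning index."""
    chains: list[list[str]] = []
    current: list[str] = []
    for i, op in enumerate(operators):
        current.append(op)
        if i in repartition_after:
            # Records leaving this operator are repartitioned, so the chain ends here.
            chains.append(current)
            current = []
    if current:
        chains.append(current)
    return chains


# Assumed pipeline: source -> map -> (repartition by key) -> window -> sink
pipeline = ["source", "map", "window", "sink"]
tasks = chain_operators(pipeline, repartition_after={1})
print(tasks)
```

Each resulting group corresponds to what the second quote calls a task: one or more chained operators treated as a single unit of scheduling.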

TiDB

TBD

task is a new version of PhysicalPlanInfo. It stores cost information for a task. A task may be CopTask, RootTask, MPPTaskMeta or a ParallelTask.

Presto

stage

When Presto executes a query, it does so by breaking up the execution into a hierarchy of stages. For example, if Presto needs to aggregate data from one billion rows stored in Hive, it does so by creating a root stage to aggregate the output of several other stages all of which are designed to implement different sections of a distributed query plan.
The hierarchy of stages that comprises a query resembles a tree. Every query has a root stage which is responsible for aggregating the output from other stages. Stages are what the coordinator uses to model a distributed query plan, but stages themselves don’t run on Presto workers.

fragment

Each plan fragment is executed by one or more Presto nodes. The separation between fragments represents the data exchange between Presto nodes. The fragment type specifies how the fragment is executed by Presto nodes and how the data is distributed between fragments.

ClickHouse

Not Available

There is no global query plan for distributed query execution. Each node has its local query plan for its part of the job. We only have simple one-pass distributed query execution: we send queries for remote nodes and then merge the results. But this is not feasible for complicated queries with high cardinality GROUP BYs or with a large amount of temporary data for JOIN. In such cases, we need to “reshuffle” data between servers, which requires additional coordination. ClickHouse does not support that kind of query execution, and we need to work on it.
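The one-pass model described in the quote (send the query to each node, then merge the results) can be sketched in a few lines of Python. This is a toy model, not ClickHouse code: shards are plain lists, and `local_group_count` and `one_pass_distributed_count` are hypothetical names for this example.

```python
from collections import Counter


def local_group_count(rows: list[str]) -> Counter:
    """What each node does on its own: a local GROUP BY key, COUNT(*)."""
    return Counter(rows)


def one_pass_distributed_count(shards: list[list[str]]) -> Counter:
    """The initiating node sends the query to every shard and merges the partial results."""
    merged: Counter = Counter()
    for rows in shards:
        merged.update(local_group_count(rows))  # update() adds the counts per key
    return merged


shards = [["a", "b", "a"], ["b", "c"]]
result = one_pass_distributed_count(shards)
print(dict(result))
```

Note that the merge step holds one entry per distinct key on a single node. With a high-cardinality GROUP BY that merged state grows large, which is the limitation the quote points to: there is no step that reshuffles data between servers by key.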
