目录
- 1. 基本使用指南
- 1.1 修改密码
- 1.2 创建用户并授权
- 1.3 语法帮助命令
- 1.4 创建数据库
- 1.5 表的创建
- 1.6 查询数据
- 2. 高级使用指南
- 2.1 表结构变更
- 2.2 数据表的查询
- 2.2.1 内存限制
- 2.2.2 查询超时时间
- 2.2.3 broadcast join和shuffle join
- 2.2.4 查询重试和高可用
因为Doris使用mysql协议,所以可以像连接mysql一样连接Doris
1. 基本使用指南
1.1 修改密码
- doris默认是没有密码
- 连接端口是fe.conf中的query_port参数值
[root@bigdata005 opt]#
[root@bigdata005 opt]# mysql -h bigdata001 -P 9030 -u root
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.0 Doris version 0.14.0-release-Unknown
Copyright (c) 2000, 2021, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
mysql> set password for 'root' = password('Root_123');
Query OK, 0 rows affected (0.04 sec)
mysql>
1.2 创建用户并授权
- 新创建的用户,是没有任何权限,连读的权限都没有
mysql>
mysql> create user 'test' identified by 'Test_123';
Query OK, 0 rows affected (0.13 sec)
mysql>
mysql> create database test_db;
Query OK, 0 rows affected (0.01 sec)
mysql> grant all on test_db to test;
Query OK, 0 rows affected (0.02 sec)
mysql>
1.3 语法帮助命令
mysql>
mysql> help create;
Many help items for your request exist.
To make a more specific request, please type 'help <item>',
where <item> is one of the following
topics:
CREATE CLUSTER
CREATE DATABASE
CREATE FILE
CREATE FUNCTION
CREATE INDEX
CREATE MATERIALIZED VIEW
CREATE ROLE
CREATE USER
CREATE VIEW
ROUTINE LOAD
SHOW CREATE FUNCTION
mysql>
mysql> help create database;
Name: 'CREATE DATABASE'
Description:
该语句用于新建数据库(database)
语法:
CREATE DATABASE [IF NOT EXISTS] db_name;
Examples:
1. 新建数据库 db_test
CREATE DATABASE db_test;
mysql>
1.4 创建数据库
mysql>
mysql> create database if not exists test_db;
Query OK, 0 rows affected (0.02 sec)
mysql>
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| test_db |
+--------------------+
2 rows in set (0.01 sec)
mysql>
mysql> use test_db;
Database changed
mysql>
- information_schema是为了兼容mysql,实际数据不准确
1.5 表的创建
mysql>
mysql> create table table1(
-> id int default '0',
-> name varchar(32) default '',
-> city_code smallint,
-> pv bigint sum default '0'
-> )
-> aggregate key(id, name, city_code)
-> distributed by hash(id) buckets 10
-> properties('replication_num' = '3');
Query OK, 0 rows affected (0.03 sec)
mysql>
mysql> create table table2(
-> id int default '0',
-> name varchar(32) default '',
-> city_code smallint,
-> event_day date,
-> pv bigint sum default '0'
-> )
-> aggregate key(id, name, city_code, event_day)
-> partition by range(event_day)
-> (
-> partition p202107 values less than ('2021-08-01'),
-> partition p202108 values less than ('2021-09-01'),
-> partition p202109 values less than ('2021-10-01')
->
-> )
-> distributed by hash(id) buckets 10
-> properties('replication_num' = '3');
Query OK, 0 rows affected (0.34 sec)
mysql>
mysql> desc table1;
+-----------+-------------+------+-------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-------+---------+-------+
| id | INT | Yes | true | 0 | |
| name | VARCHAR(32) | Yes | true | | |
| city_code | SMALLINT | Yes | true | NULL | |
| pv | BIGINT | Yes | false | 0 | SUM |
+-----------+-------------+------+-------+---------+-------+
4 rows in set (0.01 sec)
mysql>
mysql> desc table2;
+-----------+-------------+------+-------+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-------+---------+-------+
| id | INT | Yes | true | 0 | |
| name | VARCHAR(32) | Yes | true | | |
| city_code | SMALLINT | Yes | true | NULL | |
| event_day | DATE | Yes | true | NULL | |
| pv | BIGINT | Yes | false | 0 | SUM |
+-----------+-------------+------+-------+---------+-------+
5 rows in set (0.00 sec)
mysql>
mysql> show create table table1;
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table1 | CREATE TABLE `table1` (
`id` int(11) NULL DEFAULT "0" COMMENT "",
`name` varchar(32) NULL DEFAULT "" COMMENT "",
`city_code` smallint(6) NULL COMMENT "",
`pv` bigint(20) SUM NULL DEFAULT "0" COMMENT ""
) ENGINE=OLAP
AGGREGATE KEY(`id`, `name`, `city_code`)
COMMENT "OLAP"
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_num" = "3",
"in_memory" = "false",
"storage_format" = "V2"
); |
+--------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
mysql>
mysql> show create table table2;
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| table2 | CREATE TABLE `table2` (
`id` int(11) NULL DEFAULT "0" COMMENT "",
`name` varchar(32) NULL DEFAULT "" COMMENT "",
`city_code` smallint(6) NULL COMMENT "",
`event_day` date NULL COMMENT "",
`pv` bigint(20) SUM NULL DEFAULT "0" COMMENT ""
) ENGINE=OLAP
AGGREGATE KEY(`id`, `name`, `city_code`, `event_day`)
COMMENT "OLAP"
PARTITION BY RANGE(`event_day`)
(PARTITION p202107 VALUES [('0000-01-01'), ('2021-08-01')),
PARTITION p202108 VALUES [('2021-08-01'), ('2021-09-01')),
PARTITION p202109 VALUES [('2021-09-01'), ('2021-10-01')))
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
"replication_num" = "3",
"in_memory" = "false",
"storage_format" = "V2"
); |
+--------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
mysql>
- pv字段会自动根据id、name、city_code、event_day四个字段聚合求和
- 可以只分桶(distributed by),不分区(partition by); 如果同时指定,先分区再分桶
- 分区只支持整形和日期字段,且需指定分区范围
- 分桶根据字段进行hash分桶
- 可以按分区删除数据,也可进入指定分区删除数据
- 一行数据所有列的总字节长度不能超过100KB
1.6 查询数据
先进行数据的插入
mysql>
mysql> insert into table1(id, name, city_code, pv) values(2, 'grace', 1, 2),(5, 'helen', 3, 3),(3, 'tom', 2, 2);
Query OK, 3 rows affected (0.17 sec)
{'label':'insert_66da289c7bcf4462-a36c8ee68a0f9c0d', 'status':'VISIBLE', 'txnId':'2'}
mysql>
mysql> insert into table2(id, name, city_code, pv, event_day) values(2, 'grace', 1, 2, '2021-07-15'),(5, 'helen', 3, 3, '2021-08-15'),(3, 'tom', 2, 2, '2021-09-15');
Query OK, 3 rows affected (0.07 sec)
{'label':'insert_3bd9c0c6438840dc-8547d9e13fd4867b', 'status':'VISIBLE', 'txnId':'3'}
mysql>
再查询数据
mysql>
mysql> select * from table1 limit 3;
+------+-------+-----------+------+
| id | name | city_code | pv |
+------+-------+-----------+------+
| 2 | grace | 1 | 2 |
| 3 | tom | 2 | 2 |
| 5 | helen | 3 | 3 |
+------+-------+-----------+------+
3 rows in set (0.12 sec)
mysql>
mysql> select * from table1 order by city_code;
+------+-------+-----------+------+
| id | name | city_code | pv |
+------+-------+-----------+------+
| 2 | grace | 1 | 2 |
| 3 | tom | 2 | 2 |
| 5 | helen | 3 | 3 |
+------+-------+-----------+------+
3 rows in set (0.04 sec)
mysql>
mysql> show data;
+--------------+-------------+--------------+
| TableName | Size | ReplicaCount |
+--------------+-------------+--------------+
| table1 | 18.223 KB | 45 |
| table2 | 7.764 KB | 90 |
| Total | 25.986 KB | 135 |
| Quota | 1024.000 GB | 1073741824 |
| Left | 1024.000 GB | 1073741494 |
+--------------+-------------+--------------+
9 rows in set (0.09 sec)
mysql>
- ReplicaCount表示所有副本的大概数据量之和,除以replica副本数,可以得到表的数据量
join查询
mysql>
mysql> select sum(table1.pv) from table1 join table2 on table1.id = table2.id;
+--------------------+
| sum(`table1`.`pv`) |
+--------------------+
| 7 |
+--------------------+
1 row in set (0.12 sec)
mysql>
子查询
mysql>
mysql> select sum(pv) from table1 where id in (select id from table2 where id > 2);
+-----------+
| sum(`pv`) |
+-----------+
| 5 |
+-----------+
1 row in set (0.04 sec)
mysql>
2. 高级使用指南
2.1 表结构变更
- 添加列并查看变更进度
mysql> alter table table1 add column uv int sum default '0' after pv;
Query OK, 0 rows affected (0.00 sec)
mysql>
mysql> show alter table column;
+-------+-----------+---------------------+---------------------+-----------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| JobId | TableName | CreateTime | FinishTime | IndexName | IndexId | OriginIndexId | SchemaVersion | TransactionId | State | Msg | Progress | Timeout |
+-------+-----------+---------------------+---------------------+-----------+---------+---------------+---------------+---------------+----------+------+----------+---------+
| 10442 | table1 | 2021-10-17 21:56:29 | 2021-10-17 21:56:56 | table1 | 10443 | 10400 | 1:1214538836 | 9 | FINISHED | | NULL | 86400 |
+-------+-----------+---------------------+---------------------+-----------+---------+---------------+---------------+---------------+----------+------+----------+---------+
1 rows in set (0.00 sec)
mysql>
- 可以对State不是FINISHED的列修改进行撤销:
cancel alter table column from table1;
- 修改列的类型
mysql>
mysql> alter table table1 modify column city_code int;
Query OK, 0 rows affected (0.01 sec)
mysql>
- 删除列
mysql>
mysql> alter table table1 drop column uv;
Query OK, 0 rows affected (0.01 sec)
mysql>
2.2 数据表的查询
2.2.1 内存限制
- 一个BE默认内存限制为2GB
mysql>
mysql> show variables like '%mem_limit%';
+----------------+------------+
| Variable_name | Value |
+----------------+------------+
| exec_mem_limit | 2147483648 |
| load_mem_limit | 0 |
+----------------+------------+
2 rows in set (0.00 sec)
mysql>
- 修改内存限制(Session级别),如8GB
mysql>
mysql> set exec_mem_limit = 8589934592;
Query OK, 0 rows affected (0.04 sec)
mysql>
- 修改内存限制(Global级别),如8GB
mysql>
mysql> set global exec_mem_limit = 8589934592;
Query OK, 0 rows affected (0.01 sec)
mysql>
2.2.2 查询超时时间
- 默认超时时间为300秒,超过该时间的查询会被cancel
- 超时检查间隔为5秒,所以查询时间为304秒也可能不会被cancel
mysql>
mysql> show variables like '%query_timeout%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| query_timeout | 300 |
+---------------+-------+
1 row in set (0.01 sec)
mysql>
2.2.3 broadcast join和shuffle join
mysql>
mysql> select sum(a.pv) from table1 a join table2 b on a.id = b.id where a.city_code = 2;
+---------------+
| sum(`a`.`pv`) |
+---------------+
| 2 |
+---------------+
1 row in set (0.44 sec)
mysql>
mysql> select sum(a.pv) from table1 a join [broadcast] table2 b on a.id = b.id where a.city_code = 2;
+---------------+
| sum(`a`.`pv`) |
+---------------+
| 2 |
+---------------+
1 row in set (0.04 sec)
mysql>
mysql> select sum(a.pv) from table1 a join [shuffle] table2 b on a.id = b.id where a.city_code = 2;
+---------------+
| sum(`a`.`pv`) |
+---------------+
| 2 |
+---------------+
1 row in set (0.03 sec)
mysql>
- broadcast join:默认join方式,会将小表broadcast到所有节点,从而避免大表shuffle, 如果小表过大,会报内存不足。命中需要是两表字段的等值join
- shuffle join:根据key, 大表小表都进行shuffle
- 两种join的结果一样,只是内部实现过程不一样;系统会根据情况自动判断用哪种join方式,但如果报内存不足,可以显示指定shuffle join
2.2.4 查询重试和高可用
使用mysql jdbc connector来连接Doris,使用jdbc的自动重试机制
jdbc:mysql://host:port[,host:port][/database][?propertyName1=propertyValue1][&propertyName2=propertValue2]