数据删除(delete操作)

Clickhouse删除/更新数据(UPDATE/DELETE/DROP)与MySQL的sql语法有点区别,因此做一下记录。

按分区删除

ALTER TABLE db_name.table_name DROP PARTITION '20200601'

按条件删除

ALTER TABLE db_name.table_name DELETE WHERE day = '20200618'

数据更新(update操作)

ALTER TABLE <table_name> UPDATE col1 = expr1, ... WHERE <filter>

注意:
1. 该命令必须在版本号大于1.1.54388才可以使用,适用于 mergeTree 引擎
2. 该命令是异步执行的,可以通过查看表 system.mutations 来查看命令的是否执行完毕

举例:
:) select event_status_key, count(*) from test_update where event_status_key in (0, 22) group by event_status_key;
┌─event_status_key─┬──count()─┐
│                0 │ 17824710 │
│               22 │     1701 │
└──────────────────┴──────────┘

:) ALTER TABLE test_update UPDATE event_status_key=0 where event_status_key=22;
0 rows in set. Elapsed: 0.067 sec.

:) select event_status_key, count(*) from test_update where event_status_key in (0, 22) group by event_status_key;
 ┌─event_status_key─┬──count()─┐
 │                0 │ 17826411 │
 └──────────────────┴──────────┘

不适合频繁更新或point更新

由于Clickhouse更新操作非常耗资源,如果频繁的进行更新操作,可能会弄崩集群,请谨慎操作。
参考:
https://www.altinity.com/blog/2018/10/16/updates-in-clickhouse

关于MaterializeMySQL数据库引擎

Clickhouse截止目前的版本(21.8.5.7)可以支持使用MaterializeMySQL数据库引擎订阅mysql的binLog来同步mysql的数据,支持RENAME DATABASE, TRUNCATE, OVER, RENAME TABLE, DROP query, RENAME DICTIONARY, DETACH, EXCHANGE DICTIONARIES, RENAME query, DROP, create query, CREATE, EXCHANGE TABLES, ALTER TABLE, alter queryRENAME DATABASE, TRUNCATE, OVER, RENAME TABLE, DROP query, RENAME DICTIONARY, DETACH, EXCHANGE DICTIONARIES, RENAME query, DROP, create query, CREATE, EXCHANGE TABLES, ALTER TABLE等SQL语法操作来同步数据,clickhouse服务重启也能正常同步数据,但是遇到以下情况是无法继续同步数据的:
①在同步mysql的数据库中新建了一张表中没有主键:这会导致clickhouse从这一刻开始停止同步所有数据
②同步的mysql的数据库被删除然后重新建同样名字的数据库:这个操作clickhouse无法同步,这会导致clickhouse库中一直报错表已经存在,然后会停止同步后面的数据
③mysql的建表语句为create table table_name1 like table_name2这样的语法在clickhouse中是不被支持的

突变操作没有作用?杀掉突变表的异常记录

现象:clickhouse在集群下执行update操作后查询数据仍然是修改之前的数据,等待几个小时后还是以前的数据
解决思路:查询 system.Mutations 表 sql :select * from system.mutations where is_done = 0;发现任务在突变表中状态一直是未执行,查询日志发现如下报错 ↓

2021.04.09 10:59:12.971073 [ 6363 ] {} <Error> void DB::BackgroundProcessingPool::workLoopFunc(): Code: 393, e.displayText() = DB::Exception: There is no query, Stack trace (when copying this message, always include the lines below):

0. DB::Context::getQueryContext() const @ 0xd87f6ae in /usr/bin/clickhouse
1. DB::InterpreterSelectWithUnionQuery::getSampleBlock(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&) @ 0xdd71271 in /usr/bin/clickhouse
2. ? @ 0xded66ce in /usr/bin/clickhouse
3. DB::getDatabaseAndTablesWithColumns(std::__1::vector<DB::ASTTableExpression const*, std::__1::allocator<DB::ASTTableExpression const*> > const&, DB::Context const&) @ 0xded77c0 in /usr/bin/clickhouse
4. DB::JoinedTables::resolveTables() @ 0xdda3ad4 in /usr/bin/clickhouse
5. DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, std::__1::shared_ptr<DB::IBlockInputStream> const&, std::__1::optional<DB::Pipe>, std::__1::shared_ptr<DB::IStorage> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&) @ 0xdbc5b34 in /usr/bin/clickhouse
6. DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) @ 0xdbc504d in /usr/bin/clickhouse
7. DB::InterpreterSelectWithUnionQuery::InterpreterSelectWithUnionQuery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) @ 0xdd6fa64 in /usr/bin/clickhouse
8. DB::interpretSubquery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::SelectQueryOptions const&) @ 0xdedd5cb in /usr/bin/clickhouse
9. DB::ActionsMatcher::makeSet(DB::ASTFunction const&, DB::ActionsMatcher::Data&, bool) @ 0xdc41089 in /usr/bin/clickhouse
10. DB::ActionsMatcher::visit(DB::ASTFunction const&, std::__1::shared_ptr<DB::IAST> const&, DB::ActionsMatcher::Data&) @ 0xdc3a7c8 in /usr/bin/clickhouse
11. DB::ActionsMatcher::visit(DB::ASTFunction const&, std::__1::shared_ptr<DB::IAST> const&, DB::ActionsMatcher::Data&) @ 0xdc3b96a in /usr/bin/clickhouse
12. DB::InDepthNodeVisitor<DB::ActionsMatcher, true, std::__1::shared_ptr<DB::IAST> const>::visit(std::__1::shared_ptr<DB::IAST> const&) @ 0xdc18b8a in /usr/bin/clickhouse
13. DB::ExpressionAnalyzer::getRootActions(std::__1::shared_ptr<DB::IAST> const&, bool, std::__1::shared_ptr<DB::ActionsDAG>&, bool) @ 0xdc18916 in /usr/bin/clickhouse
14. DB::SelectQueryExpressionAnalyzer::appendPrewhere(DB::ExpressionActionsChain&, bool, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) @ 0xdc1ba39 in /usr/bin/clickhouse
15. DB::ExpressionAnalysisResult::ExpressionAnalysisResult(DB::SelectQueryExpressionAnalyzer&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, bool, bool, bool, std::__1::shared_ptr<DB::FilterInfo> const&, DB::Block const&) @ 0xdc232a0 in /usr/bin/clickhouse
16. DB::InterpreterSelectQuery::getSampleBlockImpl() @ 0xdbd0164 in /usr/bin/clickhouse
17. ? @ 0xdbcb025 in /usr/bin/clickhouse
18. DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, std::__1::shared_ptr<DB::IBlockInputStream> const&, std::__1::optional<DB::Pipe>, std::__1::shared_ptr<DB::IStorage> const&, DB::SelectQueryOptions const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&) @ 0xdbc662b in /usr/bin/clickhouse
19. DB::InterpreterSelectQuery::InterpreterSelectQuery(std::__1::shared_ptr<DB::IAST> const&, DB::Context const&, std::__1::shared_ptr<DB::IStorage> const&, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::SelectQueryOptions const&) @ 0xdbc8802 in /usr/bin/clickhouse
20. DB::MutationsInterpreter::MutationsInterpreter(std::__1::shared_ptr<DB::IStorage>, std::__1::shared_ptr<DB::StorageInMemoryMetadata const> const&, DB::MutationCommands, DB::Context const&, bool) @ 0xddbf6eb in /usr/bin/clickhouse
21. DB::StorageMergeTree::tryMutatePart() @ 0xe0d8564 in /usr/bin/clickhouse
22. DB::StorageMergeTree::mergeMutateTask() @ 0xe0da733 in /usr/bin/clickhouse
23. DB::BackgroundProcessingPool::workLoopFunc() @ 0xe294b53 in /usr/bin/clickhouse
24. ? @ 0xe295683 in /usr/bin/clickhouse
25. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x7b8963d in /usr/bin/clickhouse
26. ? @ 0x7b8d153 in /usr/bin/clickhouse
27. start_thread @ 0x7e65 in /usr/lib64/libpthread-2.17.so
28. __clone @ 0xfe88d in /usr/lib64/libc-2.17.so
 (version 20.10.3.30 (official build))

观察日志发现clickhouse认为这个语句不是一个查询,应该是内部处理出现问题,考虑终止Mutations操作;官方文档对于Mutations操作有如下描述:一旦mutation完成提交,就不能回退了,但是如果因为某种原因操作被卡住了, 可以通过 KILL MUTATION操作来取消它的执行,查找官方文档中 KILL MUTATION语法:

KILL MUTATION KILL MUTATION [ON CLUSTER cluster] WHERE <where expression to SELECT FROM system.mutations query> [TEST] [FORMAT format] 
尝试取消和删除 突变 当前正在执行。 要取消的突变选自 system.mutations 表使用由指定的过滤器 WHERE 《公约》条款 KILL 查询。 测试查询 (TEST)仅检查用户的权限并显示要停止的查询列表。 
例:
-- Cancel and remove all mutations of the single table: KILL MUTATION WHERE database = 'default' AND table = 'table';
-- Cancel the specific mutation: KILL MUTATION WHERE database = 'default' AND table = 'table' AND mutation_id = 'mutation_3.txt' ;
The query is useful when a mutation is stuck and cannot finish (e.g. if some function in the mutation query throws an exception when applied to the data contained in the table). 
已经由突变所做的更改不会回滚

使用如下语句终止了突变:

KILL MUTATION [ON CLUSTER cluster_name] WHERE database = 'default' AND table = 'table_name' AND IS_DONE = 0;

再次查询 system.Mutations 表发现已提交的突变已经消失,再次提交突变成功运行。
Clickhouse官方文档中对于mutations操作的建议性描述如下:
已有的表已经支持mutations操作(不需要转换)。但是在首次对表进行mutation操作以后,它的元数据格式变得和和之前 的版本不兼容,并且不能回退到之前版本。