1. Prerequisites
Install Flink 1.10 (this version or later is required); see the Flink download page
Install Kafka
Install ZooKeeper
Install MySQL
2. Download the provided package
Follow the WeChat official account 【LarkMidTable】 and reply 【flinksql学习】 to get it
3. Replace Flink's lib directory with the one provided in the package
Use flinkx-sql/lib to replace flink-1.10.0/lib
4. Create the database and table in MySQL
Database: flink-test
Table: pvuv_sink (DDL provided in pvuv_sink.sql)
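The tutorial references pvuv_sink.sql without showing its contents; below is a minimal sketch of what that DDL might look like, derived from the sink schema in q1.sql further down (the column names come from the sink table there, but the VARCHAR length and other details are assumptions — use the provided pvuv_sink.sql as the authoritative version):

```sql
-- Hypothetical DDL matching the pvuv_sink schema in q1.sql;
-- column lengths are assumptions, adjust as needed.
CREATE DATABASE IF NOT EXISTS `flink-test`;
USE `flink-test`;
CREATE TABLE IF NOT EXISTS pvuv_sink (
    dt VARCHAR(32),   -- hour bucket, e.g. '2017-11-26 01:00'
    pv BIGINT,        -- page views within the hour
    uv BIGINT         -- distinct users within the hour
);
```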
5. With Flink, MySQL, and Kafka all running, copy the following files to your Linux server:
flinkx-sql\target\flink-sql-submit.jar
flinkx-sql\src\main\resources\q1.sql
In q1.sql you need to change: 1. the ZooKeeper IP, 2. the Kafka IP, 3. the MySQL IP, username, and password
-- source
CREATE TABLE user_log (
    user_id VARCHAR,
    item_id VARCHAR,
    category_id VARCHAR,
    behavior VARCHAR,
    ts TIMESTAMP
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'user_behavior',
    'connector.startup-mode' = 'earliest-offset',
    'connector.properties.0.key' = 'zookeeper.connect',
    'connector.properties.0.value' = 'localhost:2181',
    'connector.properties.1.key' = 'bootstrap.servers',
    'connector.properties.1.value' = 'localhost:9092',
    'update-mode' = 'append',
    'format.type' = 'json',
    'format.derive-schema' = 'true'
);

-- sink
CREATE TABLE pvuv_sink (
    dt VARCHAR,
    pv BIGINT,
    uv BIGINT
) WITH (
    'connector.type' = 'jdbc',
    'connector.url' = 'jdbc:mysql://localhost:3306/flink-test',
    'connector.table' = 'pvuv_sink',
    'connector.username' = 'root',
    'connector.password' = '123456',
    'connector.write.flush.max-rows' = '1'
);

-- hourly PV/UV aggregation
INSERT INTO pvuv_sink
SELECT
    DATE_FORMAT(ts, 'yyyy-MM-dd HH:00') dt,
    COUNT(*) AS pv,
    COUNT(DISTINCT user_id) AS uv
FROM user_log
GROUP BY DATE_FORMAT(ts, 'yyyy-MM-dd HH:00');
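Because the source table declares 'format.type' = 'json' with 'format.derive-schema' = 'true', each Kafka message must be a JSON object whose fields match the user_log columns. A hypothetical record (all field values invented for illustration; the actual records come from the SourceGenerator in step 6):

```json
{"user_id": "543462", "item_id": "1715", "category_id": "1464116", "behavior": "pv", "ts": "2017-11-26T01:00:00Z"}
```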
6. Start and run
1. Feed data into Kafka
java -cp flink-sql-submit.jar com.github.wuchong.sqlsubmit.SourceGenerator 1000 | /home/hadoop/data/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic user_behavior
2. Start the Flink job
${basepath}/flink-1.10.0/bin/flink run -d -p 1 flink-sql-submit.jar -w ${directory containing q1.sql} -f q1.sql
3. Verify the results
Check the Flink web UI at http://localhost:8081
Check whether the database table contains the aggregated data
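To verify from the MySQL side, a query along these lines can be run against the database and table created in step 4 (names taken from the steps above):

```sql
-- Inspect the hourly PV/UV aggregates written by the Flink job
SELECT dt, pv, uv
FROM `flink-test`.pvuv_sink
ORDER BY dt;
```

Because 'connector.write.flush.max-rows' is set to '1', each update is flushed to MySQL immediately, so rows should appear shortly after the job starts consuming.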