1. Introduction

A DataStream or DataSet can be converted to a Table, and a Table can be converted back to a DataStream or DataSet:


  • DataStream/DataSet to Table
  • Table to DataStream/DataSet

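2. Examples

2.1 Converting a Table to a DataStream


There are two modes for converting a Table to a DataStream:



  • Append Mode: usable only when the table is changed by INSERT operations alone; each row is appended to the stream and never updated afterwards.
  • Retract Mode: always usable; every change is encoded as an add message or a retract message, distinguished by a boolean flag.


Syntax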


// Get a StreamTableEnvironment; env is the StreamExecutionEnvironment.
// (Older releases used TableEnvironment.getTableEnvironment(env).)
val tableEnv = StreamTableEnvironment.create(env)
// Table with two fields (String name, Integer age)
val table: Table = ...
// convert the Table into an append DataStream of Row
val dsRow: DataStream[Row] = tableEnv.toAppendStream[Row](table)
// convert the Table into an append DataStream of Tuple2[String, Int]
val dsTuple: DataStream[(String, Int)] =
  tableEnv.toAppendStream[(String, Int)](table)
// convert the Table into a retract DataStream of Row.
// A retract stream of type X is a DataStream[(Boolean, X)].
// The boolean field indicates the type of the change.
// True is INSERT, false is DELETE.
val retractStream: DataStream[(Boolean, Row)] = tableEnv.toRetractStream[Row](table)
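
To make the `(Boolean, X)` encoding concrete, here is a minimal plain-Scala sketch with no Flink dependency. It models the change messages that a running count per key would produce. The object name `RetractEncodingSketch` and the helper `countChanges` are hypothetical, introduced only for this illustration; the sketch mimics the semantics of the encoding, not Flink's implementation.

```scala
// Plain-Scala model of retract encoding; nothing here is Flink API.
object RetractEncodingSketch {

  // Hypothetical helper: for each incoming key, emit the change messages
  // that a running COUNT per key would produce in retract encoding.
  def countChanges(keys: Seq[String]): Seq[(Boolean, (String, Long))] = {
    val counts = scala.collection.mutable.Map.empty[String, Long]
    keys.flatMap { key =>
      val previous = counts.get(key)
      val updated = previous.getOrElse(0L) + 1L
      counts(key) = updated
      // Retract the previously emitted row (if any), then add the new row.
      previous.map(old => (false, (key, old))).toSeq :+ ((true, (key, updated)))
    }
  }

  def main(args: Array[String]): Unit =
    countChanges(Seq("Hello", "Hello World", "Hello")).foreach(println)
}
```

The second "Hello" first retracts the old result `(Hello,1)` with `false` and then adds `(Hello,2)` with `true`. A table that only ever receives inserts never produces a `false` message, which is why Append Mode is sufficient for it.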


Example


// Imports assume the Flink 1.9/1.10 package layout; from 1.11 on, the Scala
// bridge classes live in org.apache.flink.table.api.bridge.scala.
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.scala._

object TableToDataStream {
  def main(args: Array[String]): Unit = {
    // Build the sample data that will be converted to a Table
    val data = List(
      Peoject(1L, 1, "Hello"),
      Peoject(2L, 2, "Hello"),
      Peoject(3L, 3, "Hello"),
      Peoject(4L, 4, "Hello"),
      Peoject(5L, 5, "Hello"),
      Peoject(6L, 6, "Hello"),
      Peoject(7L, 7, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(20L, 20, "Hello World"))
    // Create the streaming environment and a Blink-planner table environment
    val bsEnv = StreamExecutionEnvironment.getExecutionEnvironment
    val bsSettings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    val tEnv = StreamTableEnvironment.create(bsEnv, bsSettings)
    val stream = bsEnv.fromCollection(data)
    val table = tEnv.fromDataStream(stream)
    // Append Mode: every row of the Table is appended to the stream
    val appendStream: DataStream[Peoject] = tEnv.toAppendStream[Peoject](table)
    // Retract Mode: true marks an add message, false marks a retract message
    val retractStream: DataStream[(Boolean, Peoject)] = tEnv.toRetractStream[Peoject](table)
    retractStream.print()
    bsEnv.execute()
  }
  case class Peoject(user: Long, index: Int, content: String)
}


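Output (the N> prefix is the index of the parallel subtask that printed the record, so the order varies between runs)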


3> (true,Peoject(6,6,Hello))
2> (true,Peoject(5,5,Hello))
5> (true,Peoject(8,8,Hello World))
6> (true,Peoject(1,1,Hello))
8> (true,Peoject(3,3,Hello))
7> (true,Peoject(2,2,Hello))
7> (true,Peoject(20,20,Hello World))
4> (true,Peoject(7,7,Hello World))
1> (true,Peoject(4,4,Hello))
6> (true,Peoject(8,8,Hello World))
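
Every flag in the output above is `true` because the table is insert-only. The following sketch shows a case where Retract Mode is actually required: a grouped count, whose earlier results are updated as more rows arrive. It assumes the Flink 1.9/1.10 Scala API (Blink planner, Symbol-based expressions such as `'content`); the object name `RetractAggregationSketch` and the shortened data set are illustrative and not part of the original example.

```scala
// Assumes the Flink 1.9/1.10 package layout; from 1.11 on, the Scala bridge
// classes live in org.apache.flink.table.api.bridge.scala instead.
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.EnvironmentSettings
import org.apache.flink.table.api.scala._

object RetractAggregationSketch {
  case class Peoject(user: Long, index: Int, content: String)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    val tEnv = StreamTableEnvironment.create(env, settings)

    val stream = env.fromCollection(List(
      Peoject(1L, 1, "Hello"),
      Peoject(2L, 2, "Hello"),
      Peoject(7L, 7, "Hello World")))

    // A grouped count revises its previous result whenever a key repeats.
    val counts = tEnv.fromDataStream(stream)
      .groupBy('content)
      .select('content, 'user.count as 'cnt)

    // toAppendStream would be rejected for this table, since it is not
    // insert-only; toRetractStream encodes each update as retract + add.
    val changes: DataStream[(Boolean, (String, Long))] =
      tEnv.toRetractStream[(String, Long)](counts)

    changes.print()
    env.execute()
  }
}
```

With this input, the second "Hello" row should cause the count for that key to be retracted and re-added, so the printed stream contains `false` messages as well as `true` ones.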

2.2 Converting a Table to a DataSet


Syntax


// Get a BatchTableEnvironment; env is the ExecutionEnvironment.
// (Older releases used TableEnvironment.getTableEnvironment(env).)
val tableEnv = BatchTableEnvironment.create(env)
// Table with two fields (String name, Integer age)
val table: Table = ...
// convert the Table into a DataSet of Row
val dsRow: DataSet[Row] = tableEnv.toDataSet[Row](table)
// convert the Table into a DataSet of Tuple2[String, Int]
val dsTuple: DataSet[(String, Int)] = tableEnv.toDataSet[(String, Int)](table)



Example


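// Imports assume the Flink 1.9/1.10 package layout; from 1.11 on, the Scala
// bridge classes live in org.apache.flink.table.api.bridge.scala.
import org.apache.flink.api.scala._
import org.apache.flink.table.api.Table
import org.apache.flink.table.api.scala._

object TableToDataSet {
  def main(args: Array[String]): Unit = {
    // Build the sample data that will be converted to a Table
    val data = List(
      Peoject(1L, 1, "Hello"),
      Peoject(2L, 2, "Hello"),
      Peoject(3L, 3, "Hello"),
      Peoject(4L, 4, "Hello"),
      Peoject(5L, 5, "Hello"),
      Peoject(6L, 6, "Hello"),
      Peoject(7L, 7, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(8L, 8, "Hello World"),
      Peoject(20L, 20, "Hello World"))
    // Initialize the batch environment and load the data into a Table
    val fbEnv = ExecutionEnvironment.getExecutionEnvironment
    val fbTableEnv = BatchTableEnvironment.create(fbEnv)
    val collection: DataSet[Peoject] = fbEnv.fromCollection(data)
    val table: Table = fbTableEnv.fromDataSet(collection)
    // Convert the Table back to a DataSet
    val toDataSet: DataSet[Peoject] = fbTableEnv.toDataSet[Peoject](table)
    // print() on a DataSet triggers execution, so no explicit execute() call is needed
    toDataSet.print()
  }
  case class Peoject(user: Long, index: Int, content: String)
}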


Output


Peoject(1,1,Hello)
Peoject(2,2,Hello)
Peoject(3,3,Hello)
Peoject(4,4,Hello)
Peoject(5,5,Hello)
Peoject(6,6,Hello)
Peoject(7,7,Hello World)
Peoject(8,8,Hello World)
Peoject(8,8,Hello World)
Peoject(20,20,Hello World)

2.3 Converting a DataStream to a Table

// get TableEnvironment
// registration of a DataSet is equivalent
val tableEnv = ... // see "Create a TableEnvironment" section
val stream: DataStream[(Long, String)] = ...
// convert the DataStream into a Table with default fields '_1, '_2
val table1: Table = tableEnv.fromDataStream(stream)
// convert the DataStream into a Table with fields 'myLong, 'myString
val table2: Table = tableEnv.fromDataStream(stream, 'myLong, 'myString)
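
The snippet above renames tuple fields by position. For a case class, fields can also be referenced by name and renamed with `as`. The following is a minimal sketch under the assumption of the Flink 1.9/1.10 Scala API with a planner on the classpath; the `Person` case class, the sample values, and the object name `NamedFieldsSketch` are hypothetical.

```scala
// Assumes the Flink 1.9/1.10 package layout; from 1.11 on, the Scala bridge
// classes live in org.apache.flink.table.api.bridge.scala instead.
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.Table
import org.apache.flink.table.api.scala._

object NamedFieldsSketch {
  // Hypothetical record type used only for this illustration.
  case class Person(name: String, age: Int)

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tableEnv = StreamTableEnvironment.create(env)

    val stream: DataStream[Person] =
      env.fromElements(Person("Alice", 30), Person("Bob", 25))

    // Default mapping: the Table columns take the case class field names.
    val byDefault: Table = tableEnv.fromDataStream(stream)

    // Name-based mapping: reference each field by name and rename it.
    val renamed: Table = tableEnv.fromDataStream(stream, 'name as 'myName, 'age as 'myAge)

    // printSchema only inspects the Table definition; no job is executed.
    byDefault.printSchema()
    renamed.printSchema()
  }
}
```

Name-based mapping does not depend on field order, which makes it less fragile than positional renaming when the case class changes.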

2.4 Converting a DataSet to a Table

// get TableEnvironment
// registration of a DataSet is equivalent
val tableEnv = ... // see "Create a TableEnvironment" section
val dataSet: DataSet[(Long, String)] = ...
// convert the DataSet into a Table with default fields '_1, '_2
val table1: Table = tableEnv.fromDataSet(dataSet)
// convert the DataSet into a Table with fields 'myLong, 'myString
val table2: Table = tableEnv.fromDataSet(dataSet, 'myLong, 'myString)

