文章目录
- 问题产生原因
- 问题重点及解决
问题产生原因
公司使用的是mysql5.7版本数据库,具体再小版本就无所谓了,最近在表中加了两个字段,需要存json类型数据,所以就理所当然的把这两个字段类型也设置为json了,如下图:
然后用了往这两个字段中分别存了测试数据,使用mybatisPlus读取是没出现中文乱码的;但现在真实的数据是从另一个服务发送的,通过这两个字段可以看出,要存的是数据库的表结构和样例数据,另一个服务获取这些数据时,使用的不是mybatisPlus,而是调研的一个获取源数据的工具,由于这里的代码不是我写的,所以可能说的不是非常清楚,抱歉
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
<version>2.9.1</version>
</dependency>
<!--主要使用的就是这个工具,获取表结构和样例数据-->
<dependency>
<groupId>us.fatehi</groupId>
<artifactId>schemacrawler</artifactId>
<version>16.12.2</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
<version>2.4.0</version>
</dependency>
由于获取表结构的代码是用scala写的,而且还不是我写的,因此这里代码我不太熟悉,所以这里我就只粘贴部分代码
//请求的接口地址,这里用的是akka写请求接口的方式,和spring有些不同
case HttpRequest(POST, Uri.Path("/getDataSourceMeta"), headers, entity, protocol) => {
//createDataSourceParamVo方法是把接口传送的值封装为实体,
//接口需要的值主要就是连接数据库的几个属性:url,driver,username,password,sql
//我们这里暂时只支持获取mysql和hive的连接(获取mysql和hive的源数据)
val dataSourceParamsVo: DataSourceParamsVo = createDataSourceParamVo(entity)
val response = API.getDataSourceMeta(dataSourceParamsVo)
println("getDataSourceMeta result: " + response)
if (response != "") {
Future.successful(HttpResponse(SUCCESS_CODE, entity = response))
} else {
Future.successful(HttpResponse(FAIL_CODE, entity = "getDataSourceMeta error!"))
}
}
//封装接口接收到的值
def createDataSourceParamVo(entity: MessageEntity) = {
val data = toJson(entity)
val bundle = data.get("bundle").getOrElse("").asInstanceOf[String]
var driverClassName = data.get("driverClassName").getOrElse("").asInstanceOf[String]
var url = data.get("url").getOrElse("").asInstanceOf[String]
val user = data.get("user").getOrElse("").asInstanceOf[String]
val password = data.get("password").getOrElse("").asInstanceOf[String]
val querySql = data.get("querySql").getOrElse("").asInstanceOf[String]
val dataSourceParamsVo: DataSourceParamsVo = bundle match {
case "cn.piflow.bundle.hive.SelectHiveQL" =>
val hiveUris = PropertyUtil.getPropertyValue("hive.metastore.uris")
if (hiveUris != null) {
url = "jdbc:hive2:" + hiveUris.substring(hiveUris.indexOf("/"), hiveUris.lastIndexOf(":")) + ":10000/"
} else {
throw new IllegalArgumentException("hive.metastore.uris is null")
}
driverClassName = "org.apache.hive.jdbc.HiveDriver"
new DataSourceParamsVo(bundle
, driverClassName
, url
, user
, password
, querySql
)
case "cn.piflow.bundle.hive.SelectHiveQLByJDBC" =>
driverClassName = "org.apache.hive.jdbc.HiveDriver"
new DataSourceParamsVo(bundle
, driverClassName
, url
, user
, password
, querySql
)
case "cn.piflow.bundle.jdbc.MysqlRead" =>
new DataSourceParamsVo(bundle
, driverClassName
, url
, user
, password
, querySql
)
case _ => throw new IllegalArgumentException("unknown bundle!")
}
dataSourceParamsVo
}
API的getDataSourceMeta和getDataSourceMeta方法
//获取表的源数据
def getDataSourceMeta(dataSourceParamsVo: DataSourceParamsVo): String = {
var response: DataSourceMetaResponse = new DataSourceMetaResponse(false, null)
try {
val dataSourceMetaServiceImpl: DataSourceMetaService = new DataSourceMetaServiceImpl()
val dataSourceMeta: DataSourceMetaVo = dataSourceMetaServiceImpl.getDataSourceMeta(dataSourceParamsVo)
val mapper: ObjectMapper = new ObjectMapper()
if (dataSourceMeta != null) {
response = new DataSourceMetaResponse(true, dataSourceMeta)
}
val responseJson: String = mapper.writeValueAsString(response)
responseJson
} catch {
case ex: Exception => ""
}
}
//获取表的样例数据
def getDataSourceSampleData(dataSourceParamsVo: DataSourceParamsVo): String = {
var response: DataSourceDataResponse = new DataSourceDataResponse(false, null)
try {
val dataSourceMetaServiceImpl: DataSourceMetaService = new DataSourceMetaServiceImpl()
val dataSourceSampleData: DataSourceDataVo = dataSourceMetaServiceImpl.getDataSourceSampleData(dataSourceParamsVo)
val mapper: ObjectMapper = new ObjectMapper()
if (dataSourceSampleData != null) {
response = new DataSourceDataResponse(true, dataSourceSampleData)
}
val responseJson: String = mapper.writeValueAsString(response)
responseJson
} catch {
case ex: Exception => ""
}
}
dataSourceMetaServiceImpl逻辑代码(因为写这代码的同事,不会scala,因为这个功能的实现,是java和scala代码的混合)
//实现的接口我这里就不粘贴了,就声明了两个方法,没啥意义
@Service
public class DataSourceMetaServiceImpl implements DataSourceMetaService {
//获取表源数据方法(表结构)
@Override
public DataSourceMetaVo getDataSourceMeta(DataSourceParamsVo dataSourceParamsVo) {
DataSourceMetaHandler metaHandler = DataSourceMetaHandlerManager.getOrCreate(dataSourceParamsVo.getDriverClassName(), dataSourceParamsVo.getUrl(), dataSourceParamsVo.getUser(), dataSourceParamsVo.getPassword());
// DataSourceMetaHandler metaHandler = DataSourceMetaHandlerManager.getCache().get(generateKey(dataSourceParamsVo.getDriverClassName(), dataSourceParamsVo.getUrl(), dataSourceParamsVo.getUser(), dataSourceParamsVo.getPassword()), o -> new DataSourceMetaHandler(dataSourceParamsVo.getDriverClassName(), dataSourceParamsVo.getUrl(), dataSourceParamsVo.getUser(), dataSourceParamsVo.getPassword()));
DataSourceMetaVo meta = null;
if (metaHandler != null && metaHandler.isConnected()) {
meta = metaHandler.getDataSourceMetaBySql(dataSourceParamsVo.getQuerySql());
}
return meta;
}
//获取表样例数据方法
@Override
public DataSourceDataVo getDataSourceSampleData(DataSourceParamsVo dataSourceParamsVo) {
DataSourceMetaHandler metaHandler = DataSourceMetaHandlerManager.getOrCreate(dataSourceParamsVo.getDriverClassName(), dataSourceParamsVo.getUrl(), dataSourceParamsVo.getUser(), dataSourceParamsVo.getPassword());
DataSourceDataVo meta = null;
if (metaHandler != null && metaHandler.isConnected()) {
meta = metaHandler.getDataSourceSampleData(dataSourceParamsVo.getQuerySql());
}
return meta;
}
}
DataSourceMetaHandlerManager类
@Slf4j
public class DataSourceMetaHandlerManager {
public static final char SEPARATOR = ':';
@Getter
private static Cache<String, DataSourceMetaHandler> cache = Caffeine.newBuilder().maximumSize(6000).expireAfterAccess(24, TimeUnit.HOURS)
.expireAfterWrite(24, TimeUnit.HOURS).recordStats()
.build(new CacheLoader<String, DataSourceMetaHandler>() {
@Override
public @Nullable DataSourceMetaHandler load(@NonNull String key) throws Exception {
// try {
// MetaHandler metaHandler = MetaHandler.class.getConstructor(String.class,
// String.class,
// String.class,
// String.class).newInstance(slipKey(key));
// return metaHandler;
// } catch (Exception e) {
// log.error("load MetaHandler error", e);
// return null;
// }
return null;
}
});
/**
* 获取缓存中的数据源处理实例,无缓存则创建
* @param driverClassName
* @param url
* @param user
* @param password
* @return
*/
public static DataSourceMetaHandler getOrCreate(String driverClassName, String url, String user, String password) {
DataSourceMetaHandler metaHandler = cache.get(generateKey(driverClassName, url, user, password), new Function<String, DataSourceMetaHandler>() {
@Override
public DataSourceMetaHandler apply(String o) {
return new DataSourceMetaHandler(driverClassName, url, user, password);
}
});
return metaHandler;
}
/**
* 生成缓存的key
* @param input
* @return
*/
public static String generateKey(String... input) {
return StringUtils.join(input, SEPARATOR);
}
public static String[] slipKey(String input) {
return StringUtils.split(input, SEPARATOR);
}
}
DataSourceMetaHandler 类
/**
* 数据源元数据处理类
*/
@Data
@Slf4j
public class DataSourceMetaHandler {
public static final int DB_RETRY_TIMES = 3;
private final String driverClassName;
private final String url;
private final String user;
private final String password;
private Connection con;
private DatabaseMetaData dbMetaData;
private Statement statement;
private boolean isConnected;
private AtomicInteger initTimes = new AtomicInteger(0);
public DataSourceMetaHandler(String driverClassName, String url, String user, String password) {
// driverClassName = "com.mysql.jdbc.Driver";
// url = "jdbc:mysql://127.0.0.1:3306/database";
// user = "root";
// password = "123456";
this.driverClassName = driverClassName;
this.url = url;
this.user = user;
this.password = password;
init();
}
// @PostConstruct
public void init() {
if (initTimes.getAndIncrement() >= DB_RETRY_TIMES) {
log.error("init fail! {} {}", driverClassName, url);
throw new RuntimeException("db init fail!");
}
try {
// DriverManager.registerDriver(new com.mysql.jdbc.Driver());
// String url = "jdbc:mysql://10.0.90.155:3306/piflow_web_1.2_bigflow_155?allowMultiQueries=true&useUnicode=true&characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull&useSSL=false";
Class.forName(driverClassName);
connect();
dbMetaData = con.getMetaData();
isConnected = true;
} catch (Exception e) {
log.error("init error! {} {}", driverClassName, url, e);
init();
}
}
@PreDestroy
public void colseCon() {
try {
if (con != null) {
con.close();
}
if (statement != null) {
statement.close();
}
} catch (SQLException e) {
// TODO: handle SQLException
e.printStackTrace();
}
}
/**
* 根据查询生成视图,获得视图中的所有列信息
*/
public DataSourceMetaVo getDataSourceMetaBySql(String querySql) {
String viewName = getViewBySql(querySql);
return getDataSourceMetaByTable(viewName);
}
/**
* 创建视图
*
* @param querySql
* @return
*/
public String getViewBySql(String querySql) {
String viewName = fillViewName(querySql);
if (StringUtils.isNotBlank(viewName)) {
if (checkHasView(viewName)) {
return viewName;
}
boolean executed = createView(querySql, viewName, 0);
if (executed) {
return viewName;
}
}
return null;
}
private boolean createView(String querySql, String viewName, int i) {
if (checkHasView(viewName)) {
return true;
}
if (i >= DB_RETRY_TIMES) {
log.error("createView fail! {} {} {}", driverClassName, url, querySql);
throw new RuntimeException("db createView fail!");
}
boolean executed;
try {
executed = statement.execute("CREATE VIEW " + viewName + " AS " + querySql);
if (!executed) {
executed = createView(querySql, viewName, ++i);
}
} catch (SQLException e) {
log.error("createView error! {} {} {}", driverClassName, url, querySql, e);
connect();
executed = createView(querySql, viewName, ++i);
}
return executed;
}
private void connect() {
try {
con = DriverManager.getConnection(url, user, password);
statement = con.createStatement();
} catch (SQLException e) {
log.error("connect error!", e);
}
}
/**
* 拼接视图名称
*
* @param querySql
* @return
*/
public static String fillViewName(String querySql) {
String viewName = null;
if (StringUtils.isNotBlank(querySql)) {
if (StringUtils.containsIgnoreCase(querySql, "FROM ")) {
int hashCode = querySql.hashCode();
String code;
if (hashCode < 0) {
code = '_' + StringUtils.substring(Integer.toString(hashCode), 1);
} else {
code = Integer.toString(hashCode);
}
if (StringUtils.isNotBlank(StringUtils.substringBetween(querySql, "FROM ", " "))) {
viewName = "view_" + StringUtils.substringBetween(querySql, "FROM ", " ") + "_" + code;
} else {
viewName = "view_" + StringUtils.substringAfterLast(querySql, "FROM ") + "_" + code;
}
}
}
return viewName;
}
/**
* 是否存在视图
*/
public boolean checkHasView(String viewName) {
boolean result = false;
try {
String[] types =
{"VIEW"};
ResultSet rs = dbMetaData.getTables(null, null, viewName, types);
while (rs.next()) {
result = Objects.equals(viewName, rs.getString("TABLE_NAME"));
if (result) {
break;
}
}
} catch (SQLException e) {
log.error("checkHasView fail!", e);
}
return result;
}
/**
* 获得表或视图中的所有列信息
*/
public DataSourceMetaVo getDataSourceMetaByTable(String tableName) {
DataSourceMetaVo metaVo = new DataSourceMetaVo();
if (StringUtils.isNotBlank(tableName)) {
List<DataSourceMetaColumnVo> metaColumns = Lists.newArrayList();
metaVo.setColumnList(metaColumns);
try {
ResultSet rs = dbMetaData.getColumns(null, null, tableName, "%");
while (rs.next()) {
DataSourceMetaColumnVo metaColumn = DataSourceMetaColumnVo.builder()
.columnName(rs.getString("COLUMN_NAME"))
.typeName(rs.getString("TYPE_NAME") + "(" + rs.getInt("COLUMN_SIZE") + ")")
.remarks(rs.getString("REMARKS"))
.build();
System.out.println(rs.getString("COLUMN_NAME") +
"----" + rs.getString("TYPE_NAME") + "(" + rs.getInt("COLUMN_SIZE") + ")" +
"----" + rs.getString("REMARKS"));
metaColumns.add(metaColumn);
}
} catch (SQLException e) {
// TODO: handle SQLException
e.printStackTrace();
}
}
return metaVo;
}
/**
* 获取表或视图中的抽样记录
*/
public DataSourceDataVo getDataSourceSampleData(String querySql) {
DataSourceDataVo dataVo = new DataSourceDataVo();
if (StringUtils.isNotBlank(querySql)) {
List<String> columnNameList = Lists.newArrayList();
List<List<String>> rowDataList = Lists.newArrayList();
dataVo.setColumnNameList(columnNameList);
dataVo.setRowDataList(rowDataList);
try {
ResultSet rs = executeQuery(querySql, 0);
int columnCount = 0;
if (rs != null) {
ResultSetMetaData metaData = rs.getMetaData();
columnCount = metaData.getColumnCount();
if (columnCount > 0) {
for (int i = 1; i <= columnCount; i++) {
columnNameList.add(metaData.getColumnName(i));
System.out.print(metaData.getColumnName(i) + " || ");
}
System.out.println();
System.out.println("--------------------------");
}
}
if (columnCount > 0) {
while (rs.next()) {
List<String> rowData = Lists.newArrayList();
for (int i = 1; i <= columnCount; i++) {
rowData.add(rs.getString(i));
System.out.print(rs.getString(i) + " || ");
}
rowDataList.add(rowData);
System.out.println();
System.out.println("--------------------------");
}
}
} catch (SQLException e) {
log.error("getDataSourceSampleData error", e);
}
}
return dataVo;
}
private ResultSet executeQuery(String querySql, int i) {
ResultSet rs;
if (i >= DB_RETRY_TIMES) {
log.error("executeQuery fail! {} {} {}", driverClassName, url, querySql);
throw new RuntimeException("db executeQuery fail!");
}
try {
String executeSql = wrapSqlLimit(querySql);
rs = statement.executeQuery(executeSql);
} catch (SQLException e) {
log.error("executeQuery error! {} {} {}", driverClassName, url, querySql, e);
connect();
rs = executeQuery(querySql, ++i);
}
return rs;
}
/**
* 获取表或视图中的抽样记录拼接sql
*
* @param querySql
* @return
*/
private String wrapSqlLimit(String querySql) {
String executeSql;
if (!StringUtils.containsIgnoreCase(querySql, "limit ")) {
executeSql = querySql + " limit 10";
} else {
executeSql = StringUtils.substring(querySql, 0, StringUtils.lastIndexOfIgnoreCase(querySql, "limit ")) + " limit 10";
}
return executeSql;
}
}
DataSourceDataResponse 类(用来构建返回值)
/**
* 数据中心注册和更新请求的接口返回
**/
@Data
@Builder
@NoArgsConstructor
public class DataSourceDataResponse implements Serializable {
private Boolean isConnected;
private DataSourceDataVo dataSourceDataVo;
public DataSourceDataResponse(Boolean isConnected, DataSourceDataVo dataSourceDataVo) {
this.isConnected = isConnected;
this.dataSourceDataVo = dataSourceDataVo;
}
}
DataSourceDataVo 类(用来存储表的样例数据)
/**
* 数据源的数据记录
*/
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class DataSourceDataVo {
List<String> columnNameList; //表的所有列名
List<List<String>> rowDataList; //获取的表的样例数据集合:里面的list存的是每行的所有字段数据;外层的list存的是所有行数据
}
DataSourceMetaColumnVo 类(存放表结构的实体,这里代码没用到)
/**
* 数据源的表结构-列
*/
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class DataSourceMetaColumnVo {
private String columnName; //表的具体某列的名字
private String typeName; //对应列的数据类型
private String remarks; //对应列的备注
}
代码就粘贴这么多吧
这里还有个小bug,同事在自己的项目中写测试类测试,没什么问题;但通过我的项目调用他的项目,在service中有时metaHandler连接上了,但获取不到源数据和样例数据,这个是工具的问题,解决方法:在创建连接的时候重试几次(感兴趣的可以搜DB_RETRY_TIMES
常量在哪里使用了);这个问题基本解决
问题重点及解决
好,接下来进入问题的重点:
因为数据库可能存的有json数据,就用极端方式测试了:获取样例的数据中,有字段存的json数据,而且json中又有中文,就会乱码(比如获取的是A表的样例数据,A表中有B、C、D3个字段,BC都是varchar类型,D是json类型,那么B、C即使存的中文,读出来也不会乱码,但如果D中有中文,D自己会发生中文乱码);后来一步一步跟代码,找原因,最后找到了,这里直接说结论吧:其中是在代码ResultSet rs = executeQuery(querySql, 0);
下一行打的断点,主要看rs
的值,rs有个field属性,见下图:
图片截图比较长,有兴趣的可以放大图片看:右下角画横线的地方,meta_data和data_example的charsetName=ISO-8859-1
;虽然数据库和表的设计都是指定的utf-8编码,但meta_data和data_example的数据类型指定的是json,json不是utf-8,这里也找到了一个比较靠谱的博主说的
恰好我使用的也是mysql5.7版本,参考这个博主的说法,需要表改为utf8mb4编码,读取json时才不会乱码(由于个人项目原因,这种方式我没试,有遇到这问题的小伙伴可以自己尝试下这种方法)
我是直接把meta_data和data_example字段的数据类型改为了text类型,就没有出现了乱码;改为text类型后,还是打断点查看rs的field属性的值,如下图:
可以看出meta_data和data_example改为text类型后,charsetName=utf-8
,这里变成了utf-8,那最终的数据也就没有乱码