MariaDB Connector设计文档

环境简介

基于官方文档推荐的开发环境,简单介绍一下本人使用的开发环境

WSL(Ubuntu18.04.5)+WSLg(搭配IDEA的GUI进行开发调试)

Maven3.6.3

Java openjdk 1.8.0_312

Docker20.10.14(Docker Desktop 4.8.2 (79419) 整合WSL)

实现思路

由于本身MariaDB是Mysql的分支项目,也是支持JDBC协议的数据源,因此本次的Connector开发主要参照Mysql Connector的代码思路

此外,openLooKeng针对支持JDBC协议的数据源抽象出了一层BaseJdbcXXX,因此我们可以参照MysqlConnector的实现,在其基础上进行参照和修改,通过集成和实现已有的BaseJdbcXXX编程API来开发MariaDB Connector

功能概述

和Mysql Connector功能类似,我们可以在OLK中针对配置的MariaDB数据源进行操作

部署完毕后启动OLK的server

link@XiaoHuangYa:~/hetu-server-1.7.0-SNAPSHOT/bin$ ./launcher start
 Started as 6874

通过jar包作为客户端连接

link@XiaoHuangYa:~/hetu-server-1.7.0-SNAPSHOT/bin$ java -jar hetu-cli-1.7.0-SNAPSHOT-executable.jar --server localhost:8080 --catalog maria --schema test
 
 lk:test> select * from user;
  id |    name
 ----+-------------
   1 | tom
   2 | link
   3 | maria
   4 | xiaohuangya
   6 | xiaohuangya
   9 | hyy
 (6 rows)
 
 Query 20220617_035442_00002_a8n5b, FINISHED, 1 node
 Splits: 17 total, 17 done (100.00%)
 0:00 [6 rows, 0B] [14 rows/s, 0B/s]

初步配置

从gitee clone最新版的项目,这里clone过来的OLK版本是1.7.0

之后先要部署下载所有需要的依赖包(否则运行源码的UT会出现部分的类找不到),跳过所有的测试进行编译mvn clean install -DskipTests

新建模块

创建名为hetu-mariadb的模块来作为openLooKeng针对MariaDB的连接器

配置依赖

主要是参照Mysql Connector的pom.xml,但是针对依赖中Driver的实现类引入MariaDB官方的Driver驱动进行数据源的连接

需要注意parent标签是否为presto-root以及packaging标签是否为hetu-plugin,在打包编译时会将当前工程打包到heto-core的plugin目录下

<?xml version="1.0" encoding="UTF-8"?>
 <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
     <modelVersion>4.0.0</modelVersion>
 
     <parent>
         <groupId>io.hetu.core</groupId>
         <artifactId>presto-root</artifactId>
         <version>1.7.0-SNAPSHOT</version>
     </parent>
 
     <artifactId>hetu-mariadb</artifactId>
     <description>Hetu - MariaDB Connector</description>
     <packaging>hetu-plugin</packaging>
 
     <properties>
         <air.main.basedir>${project.parent.basedir}</air.main.basedir>
     </properties>
 
     <dependencies>
         <dependency>
             <groupId>io.hetu.core</groupId>
             <artifactId>presto-base-jdbc</artifactId>
         </dependency>
 
         <dependency>
             <groupId>io.airlift</groupId>
             <artifactId>configuration</artifactId>
         </dependency>
 
         <dependency>
             <groupId>com.google.guava</groupId>
             <artifactId>guava</artifactId>
         </dependency>
 
         <dependency>
             <groupId>com.google.inject</groupId>
             <artifactId>guice</artifactId>
         </dependency>
 
         <dependency>
             <groupId>javax.validation</groupId>
             <artifactId>validation-api</artifactId>
         </dependency>
 
         <dependency>
             <groupId>javax.inject</groupId>
             <artifactId>javax.inject</artifactId>
         </dependency>
 
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>
             <artifactId>jackson-core</artifactId>
         </dependency>
 
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>
             <artifactId>jackson-databind</artifactId>
         </dependency>
 
         <dependency>
             <groupId>io.airlift</groupId>
             <artifactId>json</artifactId>
         </dependency>
 
         <!-- Presto SPI -->
         <dependency>
             <groupId>io.hetu.core</groupId>
             <artifactId>presto-spi</artifactId>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>io.airlift</groupId>
             <artifactId>slice</artifactId>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>io.airlift</groupId>
             <artifactId>units</artifactId>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>
             <artifactId>jackson-annotations</artifactId>
             <scope>provided</scope>
         </dependency>
 
         <dependency>
             <groupId>org.openjdk.jol</groupId>
             <artifactId>jol-core</artifactId>
             <scope>provided</scope>
         </dependency>
 
         <!-- for testing -->
         <dependency>
             <groupId>org.testng</groupId>
             <artifactId>testng</artifactId>
             <scope>test</scope>
         </dependency>
 
         <dependency>
             <groupId>org.assertj</groupId> 
            <artifactId>assertj-core</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.airlift</groupId> 
            <artifactId>testing</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.hetu.core</groupId> 
            <artifactId>presto-main</artifactId> 
            <exclusions> 
                <exclusion> 
                    <groupId>org.bouncycastle</groupId> 
                    <artifactId>bcprov-jdk15on</artifactId> 
                </exclusion> 
            </exclusions> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.hetu.core</groupId> 
            <artifactId>presto-tpch</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.airlift.tpch</groupId> 
            <artifactId>tpch</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.hetu.core</groupId> 
            <artifactId>presto-tests</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.airlift</groupId> 
            <artifactId>testing-mysql-server</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>io.hetu.core</groupId> 
            <artifactId>presto-testing-docker</artifactId> 
            <scope>test</scope> 
        </dependency> 
 
        <dependency> 
            <groupId>org.jetbrains</groupId> 
            <artifactId>annotations</artifactId> 
            <scope>test</scope> 
        </dependency> 
        <!-- MariaDB Driver --> 
        <dependency> 
            <groupId>org.mariadb.jdbc</groupId> 
            <artifactId>mariadb-java-client</artifactId> 
            <version>2.7.3</version> 
        </dependency> 
        <!--Testing MariaDB with container--> 
<!--        <dependency>--> 
<!--            <groupId>org.testcontainers</groupId>--> 
<!--            <artifactId>mariadb</artifactId>--> 
<!--            <version>1.16.3</version>--> 
<!--        </dependency>--> 
    </dependencies> 
 
</project>

注册插件

我们需要在hetu-server的xml中注册MariaDB的Connector

<artifactSet to="plugin/mariadb">
         <artifact id="${project.groupId}:hetu-mariadb:zip:${project.version}">
             <unpack/>
         </artifact>
     </artifactSet>

核心实现

外部函数注册和下推

主要是参考博客和视频学习

外部函数注册和下推

想在openLooKeng中直接使用外部数据源的函数,该怎么操作?

外部函数下推

只需要继承抽象模板类ApplyRemoteFunctionPushDown即可

public class MariaApplyRemoteFunctionPushDown
         extends ApplyRemoteFunctionPushDown
 {
     public MariaApplyRemoteFunctionPushDown(BaseJdbcConfig baseJdbcConfig, String connectorName)
     {
         super(baseJdbcConfig, connectorName);
     }
 }

外部函数注册

通过继承JdbcExternalFunctionHub绑定字符串,数字运算,日期相关的函数,参考Mysql

public class MariaExternalFunctionHub
         extends JdbcExternalFunctionHub
 {
     private final BaseJdbcConfig jdbcConfig;
 
     @Inject
     public MariaExternalFunctionHub(BaseJdbcConfig jdbcConfig)
     {
         this.jdbcConfig = requireNonNull(jdbcConfig, "jdbcConfig is null");
     }
 
     @Override
     public Optional<CatalogSchemaName> getExternalFunctionCatalogSchemaName()
     {
         return jdbcConfig.getConnectorRegistryFunctionNamespace();
     }
 
     @Override
     public Set<ExternalFunctionInfo> getExternalFunctions()
     {
         return ImmutableSet.<ExternalFunctionInfo>builder().addAll(MariaExternalMathFunctions.getFunctionsInfo()).addAll(MariaExternalStringFunctions.getFunctionsInfo()).addAll(MariaExternalDateTimeFunctions.getFunctionsInfo()).build();
     }
 }
  • MariaExternalDateTimeFunctions
public final class MariaExternalDateTimeFunctions
 {
     public static Set<ExternalFunctionInfo> getFunctionsInfo()
     {
         return ImmutableSet.<ExternalFunctionInfo>builder().add(MARIA_MONTH_NAME_FUNCTION_INFO).add(MARIA_HOUR_FUNCTION_INFO).add(MARIA_TIMESTAMP_FUNCTION_INFO).add(MARIA_DATE_FORMAT_FUNCTION_INFO).build();
     }
 
     private static final ExternalFunctionInfo MARIA_MONTH_NAME_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("monthname").inputArgs(StandardTypes.DATE).returnType(StandardTypes.VARCHAR).deterministic(true).calledOnNullInput(false).description("return name of the month for the input date").build();
     private static final ExternalFunctionInfo MARIA_HOUR_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("hour").inputArgs(StandardTypes.TIME).returnType(StandardTypes.INTEGER).deterministic(true).calledOnNullInput(false).description("return hour value for the input time").build();
     private static final ExternalFunctionInfo MARIA_TIMESTAMP_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("timestamp").inputArgs(StandardTypes.TIMESTAMP, StandardTypes.TIME).returnType(StandardTypes.TIMESTAMP).deterministic(true).calledOnNullInput(false).description("adds time expr2 to expr1 and return the result as a timestamp value").build();
 
     private static final ExternalFunctionInfo MARIA_DATE_FORMAT_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("date_format").inputArgs(StandardTypes.TIMESTAMP, StandardTypes.VARCHAR).returnType(StandardTypes.VARCHAR).deterministic(true).calledOnNullInput(false).description("format the date value according to the format string").build();
 
     private MariaExternalDateTimeFunctions()
     {
     }
 }
  • MariaExternalMathFunctions
public final class MariaExternalMathFunctions
 {
     private MariaExternalMathFunctions()
     {
     }
 
     private static final ExternalFunctionInfo MARIA_TRUNCATE_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("truncate").inputArgs(StandardTypes.DOUBLE, StandardTypes.INTEGER).returnType(StandardTypes.DOUBLE).deterministic(true).calledOnNullInput(false).description("return the value x that is reserved to y decimal places").build();
 
     private static final ExternalFunctionInfo MARIA_ABS_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("abs").inputArgs(StandardTypes.INTEGER).returnType(StandardTypes.INTEGER).deterministic(true).calledOnNullInput(false).description("return the absolute value of x").build();
 
     private static final ExternalFunctionInfo MARIA_BIG_INT_ABS_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("abs").inputArgs(StandardTypes.BIGINT).returnType(StandardTypes.BIGINT).deterministic(true).calledOnNullInput(false).description("return the absolute value of x").build();
 
     public static Set<ExternalFunctionInfo> getFunctionsInfo()
     {
         return ImmutableSet.<ExternalFunctionInfo>builder().add(MARIA_TRUNCATE_FUNCTION_INFO).add(MARIA_ABS_FUNCTION_INFO).add(MARIA_BIG_INT_ABS_FUNCTION_INFO).build();
     }
 }
  • MariaExternalStringFunctions
public final class MariaExternalStringFunctions
 {
     public static Set<ExternalFunctionInfo> getFunctionsInfo()
     {
         return ImmutableSet.<ExternalFunctionInfo>builder().add(MARIA_FORMAT_FUNCTION_INFO).add(MARIA_FORMAT_DECIMAL_FUNCTION_INFO).add(MARIA_LOWER_FUNCTION_INFO).build();
     }
 
     private static final ExternalFunctionInfo MARIA_FORMAT_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("format").inputArgs(StandardTypes.DOUBLE, StandardTypes.INTEGER).returnType(StandardTypes.VARCHAR).deterministic(true).calledOnNullInput(false).description("format the number 'num' to a format like'#,###,###.##', " + "rounded to 'lo' decimal places, and returns the result as a string").build();
 
     private static final ExternalFunctionInfo MARIA_FORMAT_DECIMAL_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("format").inputArgs(StandardTypes.DECIMAL, StandardTypes.INTEGER).returnType(StandardTypes.VARCHAR).deterministic(true).calledOnNullInput(false).description("format the number 'num' to a format like'#,###,###.##', " + "rounded to 'lo' decimal places, and returns the result as a string").build();
 
     private static final ExternalFunctionInfo MARIA_LOWER_FUNCTION_INFO = ExternalFunctionInfo.builder().functionName("lower").inputArgs(StandardTypes.VARCHAR).returnType(StandardTypes.VARCHAR).deterministic(true).calledOnNullInput(false).description("returns the string str with all characters changed to lowercase").build();
 
     private MariaExternalStringFunctions()
     {
     }
 }

MariaPlugin

主要是注册MariaClientModule,配置一些连接器基本的介绍元数据信息

public class MariaPlugin
         extends JdbcPlugin
 {
     public MariaPlugin()
     {
         super("maria", new MariaClientModule());
     }
 
     @Override
     public Optional<ConnectorWithProperties> getConnectorWithProperties()
     {
         ConnectorConfig connectorConfig = MariaPlugin.class.getAnnotation(ConnectorConfig.class);
         Optional<ConnectorWithProperties> connectorWithProperties = ConnectorUtil.assembleConnectorProperties(connectorConfig, Arrays.asList(BaseJdbcConfig.class.getDeclaredMethods()));
         ConnectorUtil.addConnUrlProperty(connectorWithProperties, "jdbc:mariadb://host:port");
         return connectorWithProperties;
     }
 }

MariaClientModule

绑定MariaClient、外部函数以及Config配置类

参考Mysql Connector的实现,但是由于我们Driver的实现是MariaDB官方提供的java-client-driver,因此需要修改针对URL的解析逻辑

在MysqlClientModule中setup的时候需要调用ensureCatalogIsEmpty方法,首先我们需要解析我们配置文件中的url地址,确保地址有效,否则抛出异常

除此之外,在url中也不能指定我们的数据库(catalog)

  • 确保url有效不为null
  • 确保在封装的propertis对象中不存在DBNAME等相关的key

在MariaDB提供的Driver中,针对URL的处理是放在了URLParser

我们可以进入到这里面进行详细查看

通过源码我们发现,其实在Mysql中关于url参数的解析都是在driver类中,但是在MariaDB的驱动中式在URLParser中

//通过URLParser的静态方法parse解析并返回URLParser实例
   public static UrlParser parse(final String url) throws SQLException {
     return parse(url, new Properties());
   }
   public static UrlParser parse(final String url, Properties prop) throws SQLException {
     if (url != null
         && (url.startsWith("jdbc:mariadb:")
             || url.startsWith("jdbc:mysql:") && !url.contains(DISABLE_MYSQL_URL))) {
       UrlParser urlParser = new UrlParser();
       parseInternal(urlParser, url, (prop == null) ? new Properties() : prop);
       return urlParser;
     }
     return null;
   }

如果解析失败返回的就是null,所以我们确保url有效不为null的实现就是判断parse方法的返回值是否为null

值得注意的是我们的UrlParser中有一个database成员,这个就是表示url解析后的数据库字段,所以确保在封装的propertis对象中不存在DBNAME等相关的key等价于确保urlParser实例的database成员不为null

private static void ensureCatalogIsEmpty(String connectionUrl)
     {
         try {
             //The implement for parsing url of MariaDB exists in the UrlParser class
             UrlParser urlParser = UrlParser.parse(connectionUrl);
             checkArgument(urlParser != null, "Invalid JDBC URL for MariaDB connector");
             checkArgument(urlParser.getDatabase() == null, "Database (catalog) must not be specified in JDBC URL for MariaDB connector");
         }
         catch (SQLException e) {
             throw new RuntimeException(e);
         }
     }

整体逻辑如下

public class MariaClientModule
         extends AbstractConfigurationAwareModule
 {
     @Override
     protected void setup(Binder binder)
     {
         binder.bind(JdbcClient.class).to(MariaClient.class).in(Scopes.SINGLETON);
         binder.bind(ExternalFunctionHub.class).to(MariaExternalFunctionHub.class).in(Scopes.SINGLETON);
         ensureCatalogIsEmpty(buildConfigObject(BaseJdbcConfig.class).getConnectionUrl());
         configBinder(binder).bindConfig(MariaConfig.class);
     }
 
     private static void ensureCatalogIsEmpty(String connectionUrl)
     {
         try {
             //The implement for parsing url of MariaDB Driver exists in the UrlParser class
             UrlParser urlParser = UrlParser.parse(connectionUrl);
             checkArgument(urlParser != null, "Invalid JDBC URL for MariaDB connector");
             checkArgument(urlParser.getDatabase() == null, "Database (catalog) must not be specified in JDBC URL for MariaDB connector");
         }
         catch (SQLException e) {
             throw new RuntimeException(e);
         }
     }
 
     @Provides
     @Singleton
     public static ConnectionFactory createConnectionFactory(BaseJdbcConfig config, MariaConfig mariaConfig)
             throws SQLException
     {
         Properties connectionProperties = basicConnectionProperties(config);
         connectionProperties.setProperty("useInformationSchema", "true");
         connectionProperties.setProperty("nullCatalogMeansCurrent", "false");
         connectionProperties.setProperty("useUnicode", "true");
         connectionProperties.setProperty("characterEncoding", "utf8");
         connectionProperties.setProperty("tinyInt1isBit", "false");
         if (mariaConfig.isAutoReconnect()) {
             connectionProperties.setProperty("autoReconnect", String.valueOf(mariaConfig.isAutoReconnect()));
             connectionProperties.setProperty("maxReconnects", String.valueOf(mariaConfig.getMaxReconnects()));
         }
         if (mariaConfig.getConnectionTimeout() != null) {
             connectionProperties.setProperty("connectTimeout", String.valueOf(mariaConfig.getConnectionTimeout().toMillis()));
         }
 
         return new DriverConnectionFactory(new Driver(), config.getConnectionUrl(), Optional.ofNullable(config.getUserCredentialName()), Optional.ofNullable(config.getPasswordCredentialName()), connectionProperties);
     }
 }

MariaClient关键实现

表示的是支持JDBC协议的一个客户端代表,主要是基于JDBC来操作数据源,是在BaseJdbcClient基础上针对MairaDB的一套适配

主要包括基本的针对表的CRUD

listSchemas=查看所有的数据库

@Override
     protected Collection<String> listSchemas(Connection connection)
     {
         // for MariaDB, we need to list catalogs instead of schemas
         try (ResultSet resultSet = connection.getMetaData().getCatalogs()) {
             ImmutableSet.Builder<String> schemaNames = ImmutableSet.builder();
             while (resultSet.next()) {
                 String schemaName = resultSet.getString("TABLE_CAT");
                 // skip internal schemas
                 if (!schemaName.equalsIgnoreCase("information_schema") && !schemaName.equalsIgnoreCase("mysql")) {
                     schemaNames.add(schemaName);
                 }
             }
             return schemaNames.build();
         }
         catch (SQLException e) {
             throw new RuntimeException(e);
         }
     }

getPreparedStatment获取类似JDBC中针对SQL语句的封装对象

@Override
     public PreparedStatement getPreparedStatement(Connection connection, String sql)
             throws SQLException
     {
         PreparedStatement statement = connection.prepareStatement(sql);
         if (statement.isWrapperFor(MariaDbStatement.class)) {
             statement.unwrap(MariaDbStatement.class);
         }
         return statement;
     }

这里针对不同的驱动,有不同的实现,在MysqlConnector中,wrap的是Mysql的Statement实现类,在这里我们就wrap MariaDB的实现类即可,

除此之外,使用Wrapper接口来对获取的preparedStatment来判断

值得注意的是我们的MariaDB的驱动并没有提供关于SQL语法ErrorCode和SqlState的映射关系

  • 在Mysql驱动提供的SQLError中,以SQL_STATE_ER_TABLE_EXISTS_ERROR为例
public static final String SQL_STATE_ER_TABLE_EXISTS_ERROR = "42S01";
         mysqlToSqlState.put(MysqlErrorNumbers.ER_TABLE_EXISTS_ERROR, SQL_STATE_ER_TABLE_EXISTS_ERROR);
         mysqlToSql99State.put(MysqlErrorNumbers.ER_TABLE_EXISTS_ERROR, SQL_STATE_ER_TABLE_EXISTS_ERROR);

而在MysqlErrorNumbers中

public final static int ER_TABLE_EXISTS_ERROR = 1050; //SQLSTATE: 42S01 Message: Table '%s' already exists

本质上就是把我们的ErrorCode和SqlState进行了一次映射

mariadb java连接 mariadb connector_xml

但是在MariaDB中并未提供映射,也没有列出所有相关的SQL_STATE字符串

参考Trino社区的实现,我们可以在createTable和renameColumn两个方法中针对抛出的异常来调用getCause方法来判断是否instance of具体表已经存在和版本不支持的异常

  • 针对Connector对应表已经存在的异常,亲测抛出的是SQLSyntaxErrorException
  • 针对当前版本不支持该语法的异常,抛出的也是SQLSyntaxErrorException

因此在MariaClient的createTable和renameColumn中重写父类后针对异常的判断实现如下

@Override
     public void createTable(ConnectorSession session, ConnectorTableMetadata tableMetadata)
     {
         try {
             createTable(session, tableMetadata, tableMetadata.getTable().getTableName());
         }
         catch (PrestoException e) {
             if (e.getCause() instanceof SQLSyntaxErrorException) {
                 throw new PrestoException(ALREADY_EXISTS, format("Table already exists", e));
             }
         }
         catch (SQLException e) {
             throw new PrestoException(JDBC_ERROR, e);
         }
     }
 
     @Override
     public void renameColumn(JdbcIdentity identity, JdbcTableHandle handle, JdbcColumnHandle jdbcColumn, String inputNewColumnName)
     {
         String newColumnName = inputNewColumnName;
         try (Connection connection = connectionFactory.openConnection(identity)) {
             DatabaseMetaData metadata = connection.getMetaData();
             if (metadata.storesUpperCaseIdentifiers()) {
                 newColumnName = newColumnName.toUpperCase(ENGLISH);
             }
             String sql = format("ALTER TABLE %s RENAME COLUMN %s TO %s", quoted(handle.getCatalogName(), handle.getSchemaName(), handle.getTableName()), quoted(jdbcColumn.getColumnName()), quoted(newColumnName));
             execute(connection, sql);
         }
         catch (PrestoException e) {
             // MariaDB versions earlier than 10.5.2 do not support the RENAME COLUMN syntax
             if (e.getCause() instanceof SQLSyntaxErrorException) {
                 throw new PrestoException(NOT_SUPPORTED, format("Rename column not supported  for the MariaDB server version", e));
             }
         }
         catch (SQLException e) {
             throw new PrestoException(JDBC_ERROR, e);
         }
     }

getQueryGenerator

@Override
     public Optional<QueryGenerator<JdbcQueryGeneratorResult, JdbcConverterContext>> getQueryGenerator(DeterminismEvaluator determinismEvaluator, RowExpressionService rowExpressionService, FunctionMetadataManager functionManager, StandardFunctionResolution functionResolution)
     {
          //默认情况下出于性能考虑是把查询默认Base模式下推到数据源
         JdbcPushDownModule mysqlPushDownModule = pushDownModule == DEFAULT ? BASE_PUSHDOWN : pushDownModule;
         JdbcPushDownParameter pushDownParameter = new JdbcPushDownParameter(getIdentifierQuote(), this.caseInsensitiveNameMatching, mysqlPushDownModule, functionResolution);
         return Optional.of(new MariaQueryGenerator(determinismEvaluator, rowExpressionService, functionManager, functionResolution, pushDownParameter, config));
     }

在这个QueryGenerator中主要做了下面的事情

1)创建JDBC查询下推的模块JdbcPushDownModule,默认下推的策略是BASE

这个JdbcPushDownModule是一个枚举

public enum JdbcPushDownModule
 {
     /**
      * Default Module
      */
     DEFAULT,
     /**
      * Push down all supported PlanNodes to TableScan
      */
     FULL_PUSHDOWN,
     /**
      * Only push down filter, aggregation, limit, topN, project to tableScan
      */
     BASE_PUSHDOWN;
 
     private static final Set<Class<? extends PlanNode>> BASE_PUSH_DOWN_NODE = ImmutableSet.of(
             FilterNode.class,
             AggregationNode.class,
             LimitNode.class,
             TopNNode.class,
             ProjectNode.class,
             TableScanNode.class);
 
     public boolean isAvailable(PlanNode node)
     {
         switch (this) {
             case FULL_PUSHDOWN:
                 return true;
             case BASE_PUSHDOWN:
                 return BASE_PUSH_DOWN_NODE.contains(node.getClass());
             default:
                 throw new PrestoException(NOT_SUPPORTED, "Unsupported push down module");
         }
     }
 }

2)之后封装为完整的JdbcPushDownParameter对象

在配置文件中指定前缀,以是否开启下推为例,查看BaseJdbcConfig配置类的源码,其他类比同理原理是一样的

jdbc.pushdown-enabled=true
public class BaseJdbcConfig
 {    
     // Hetu: JDBC query push down enable
     private boolean pushDownEnable = true;
      //以是否开启
     @Config("jdbc.pushdown-enabled")
     @ConfigDescription("Allow jdbc pushDown")
     public BaseJdbcConfig setPushDownEnable(boolean pushDownEnable)
     {
         this.pushDownEnable = pushDownEnable;
         return this;
     }
 }

测试实现

TestMariaPlugin

测试模块插件是否配置无误

public class TestMariaPlugin
 {
     @Test
     public void testCreateConnector()
     {
         Plugin plugin = new MariaPlugin();
         ConnectorFactory factory = getOnlyElement(plugin.getConnectorFactories());
         factory.create("test", ImmutableMap.of("connection-url", "jdbc:mariadb://test"), new TestingConnectorContext());
     }
 }

mariadb java连接 mariadb connector_mariadb java连接_02

TestMariaConfig

测试配置信息是否映射成功

public class TestMariaConfig
 {
      //测试默认实现
     @Test
     public void testDefaults()
     {
         assertRecordedDefaults(recordDefaults(MariaConfig.class)
                 .setAutoReconnect(true)
                 .setMaxReconnects(3)
                 .setConnectionTimeout(new Duration(10, TimeUnit.SECONDS)));
     }
      //测试具体的参数
     @Test
     public void testExplicitPropertyMappings()
     {
         Map<String, String> properties = new ImmutableMap.Builder<String, String>()
                 .put("mysql.auto-reconnect", "false")
                 .put("mysql.max-reconnects", "4")
                 .put("mysql.connection-timeout", "4s").build();
 
         MariaConfig expected = new MariaConfig()
                 .setAutoReconnect(false)
                 .setMaxReconnects(4)
                 .setConnectionTimeout(new Duration(4, TimeUnit.SECONDS));
 
         assertFullMapping(properties, expected);
     }
 }

MariaQueryRunner

依赖于airlift针对Mysql数据库的测试依赖,通过TestingMySqlServer来操作

经过研究发现其实这里的TestingMysqlServer本质上是内部又启动了一个Mysql的服务,因此不需要我们手动去配置外部测试环境,只需要运行测试案例即可

类似的,后面我们大量的测试都是基于airlift提供的TestingMysqlServer来创建Mysql的环境

public static final TestingMySqlServer createTestingMySqlServer(String userName, String password, String schemaName, String dataBase)
             throws Exception
     {
         TestingMySqlServer tempServer = null;
         final int createRetry = 3;
 
         for (int i = 0; i < createRetry; i++) {
             try {
                 if (dataBase == null) {
                     tempServer = new TestingMySqlServer(userName, password, schemaName);
                 }
                 else {
                     tempServer = new TestingMySqlServer(userName, password, schemaName, dataBase);
                 }
             }
             catch (Exception e) {
                 if (i == (createRetry - 1)) {
                     throw e;
                 }
                 continue;
             }
             break;
         }
 
         return tempServer;
     }

默认是三次创建Mysqld服务的重试机会,否则就抛出异常

进入源码可以看到核心的实现是在EmbeddedMysql

public TestingMySqlServer(String user, String password, Iterable<String> databases)
             throws Exception
     {
         this.user = requireNonNull(user, "user is null");
         this.password = requireNonNull(password, "password is null");
         this.databases = ImmutableSet.copyOf(requireNonNull(databases, "databases is null"));
 
       //初始化的时候默认是创建了EmbeddedMySql实例
         server = new EmbeddedMySql();
         port = server.getPort();
           //之后尝试连接我们的Mysql
         try (Connection connection = server.getMySqlDatabase()) {
             version = connection.getMetaData().getDatabaseProductVersion();
             try (Statement statement = connection.createStatement()) {
                 execute(statement, format("CREATE USER '%s'@'%%' IDENTIFIED BY '%s'", user, password));
                 execute(statement, format("GRANT ALL ON *.* to '%s'@'%%' WITH GRANT OPTION", user));
                 for (String database : databases) {
                     execute(statement, format("CREATE DATABASE %s", database));
                 }
             }
         }
         catch (SQLException e) {
             close();
             throw e;
         }
 
         log.info("MySQL server ready: %s", getJdbcUrl());
     }
public EmbeddedMySql()
             throws IOException
     {
         serverDirectory = createTempDirectory("testing-mysql-server");
 
         log.info("Starting MySQL server in %s", serverDirectory);
 
         try {
             unpackMySql(serverDirectory);
             initialize();
             mysqld = startMysqld();
         }
         catch (Exception e) {
             close();
             throw e;
         }
     }

但是我们启动测试后爆出了[ERROR] Fatal error: Please read "Security" section of the manual to find out how to run mysqld as root!的异常,原因在于一直编辑的身份是root,出于安全考虑,mysql默认是不支持以root身份启动在测试案例中启动的mysqld进程的

解决方案

修改源码包,在源码的启动参数中追加--user=root

具体如何修改源码

mariadb java连接 mariadb connector_mariadb java连接_03

关键修改的部分是EmbeddedMysql类中的startMysqld函数,追加启动参数--user=root即可,之后提交PR的时候再把这个关掉也就是默认使用io.airlift.testing.mysql的EmbeddedMysql的实现,不追加启动参数

DockerizedMariaServer

首先需要确保本机的环境是docker的环境,之后其实只需要修改我们拉取的镜像名称以及对应的参数名即可通过DockerContainer作为客户端连接到本机的Docker服务,创建对应的容器

public class DockerizedMariaServer
         implements Closeable
 {
     private static final int MYSQL_PORT = 3306;
 
     private static final String MYSQL_ROOT_USER = "root";
     private static final String MYSQL_ROOT_PASSWORD = "mysqlrootpassword";
     private static final String MYSQL_USER = "testuser";
     private static final String MYSQL_PASSWORD = "testpassword";
 
     private final DockerContainer dockerContainer;
 
     public DockerizedMariaServer()
     {
         try {
             //pull newest mariadb image
             this.dockerContainer = new DockerContainer(
                     "mariadb",
                     ImmutableList.of(MYSQL_PORT),
                     ImmutableMap.of(
                             "MYSQL_ROOT_PASSWORD", MYSQL_ROOT_PASSWORD,
                             "MYSQL_USER", MYSQL_USER,
                             "MYSQL_PASSWORD", MYSQL_PASSWORD,
                             "MYSQL_DATABASE", "tpch"),
                     DockerizedMariaServer::healthCheck);
         }
         catch (Exception e) {
             System.out.println("## Docker environment not properly set up. Skip test. ##");
             System.out.println("Error message: " + e.getStackTrace());
             throw new SkipException("Docker environment not initialized for tests");
         }
     }
 }

官方文档我们可以知道,其实对于2.x的MariaDB Java Client而言,我们的jdbcUrl其实以jdbc://mysqljdbc://mariadb都是可以的

mariadb java连接 mariadb connector_ide_04

使用上DockerizedMariaServer的主要是我们测试MariaDB的CaseInsensitiveMapping这一特性,具体没有特别大的变化,代码参照Mysql的实现

TestMariaDistributedQueries

参照Mysql的具体测试实现,运行测试可以看到大部分的测试都没有问题,除了我们MariaClient中获取columns方法中出现空指针,导致我们14个测试方法没有pass

进入到MariaClient中查看getColumns方法,补充追加针对结果集元数据信息的Null判断

@Override
     public Map<String, ColumnHandle> getColumns(ConnectorSession session, String sql, Map<String, Type> types)
     {
         try (Connection connection = connectionFactory.openConnection(JdbcIdentity.from(session)); PreparedStatement statement = connection.prepareStatement(sql)) {
             ResultSetMetaData metaData = statement.getMetaData();
             ImmutableMap.Builder<String, ColumnHandle> columnBuilder = new ImmutableMap.Builder<>();
             if(metaData!=null) {
             ...
             }
         }
     }

TestMariaIntegrationSmokeTest

主要是针对MariaDB除了BaseJdbc之外的一些功能测试

以IntegerColumn为例,主要是设置catalog为连接器提供的maria以及对应语句

@Test
     public void integerColumn()
             throws Exception
     {
         Session session = testSessionBuilder()
                 .setCatalog("maria")
                 .setSchema("test_database")
                 .build();
 
         assertUpdate("CREATE TABLE maria.test_database.testInteger(a tinyint, b smallint, c integer, d bigint)");
         assertUpdate("INSERT INTO maria.test_database.testInteger VALUES (tinyint '-2', smallint '1', 5, 158)", 1);
         assertQuery("SELECT * FROM maria.test_database.testInteger", "SELECT -2 a, 1 b, 5 c, 158 d");
         assertUpdate("DROP TABLE maria.test_database.testInteger");
     }

TestMariaRegisterRemoteUdf

用于测试用户自定义函数,主要测试MariaExternalFunctionHub,注册外部函数

总结

代码梳理

MariaPlugin

主要是注册我们的MariaClientModule

之后组装连接器的属性

  • 我们在MairaConfig中通过注解定义的数据比如文档地址docLink
  • BaseJDBCConfig通过配置文件读取的信息,通过反射读取

MariaClientModule

MariaClientModule注册了整个MariaDB Connetcor Plugin所有需要反射注入的类,通过createConnectionFactory,返回一个connector类,其中包含了基于JDBCConnector组件二次开发的MariaDB Connector

MariaConfig

可以通过配置文件对应前缀maria.xxx进行连接器的配置

  • 是否自动连接
  • 最大重连次数
  • 连接超时时间

错误排坑

源码编译不通过各种报错

需要通过mvn clean install -DskipTests先进行对应依赖的下载

如果权限不足可以通过sudo mvn clean install -DskipTests运行

运行TestCredentialPassthrough报错

报错信息:Please read "Security" section of the manual to find out how to run mysqld as root!

原因

TestCredentialPassthrough本质是调用airlift提供的TestingMysql包,在后台自动启动mysqld进程,不需要额外的数据源配置

报错的原因在于以root身份启动程序,导致以root身份启动mysqld,而出于安全考虑,TestingMysql启动的mysqld服务默认是不允许以root身份启动的,需要追加参数--user=root

对于Linux环境下的IDE中运行测试出现上述问题,有两种解决方案

  • 以普通用户身份启动IDE并运行测试
  • 修改源码包,修改源码包io.airlift.testing.mysql下的EmbeddedMySql,在初始化的startMysqld中追加--user=root

mariadb java连接 mariadb connector_mariadb java连接_05

外部数据源环境问题

运行代码的UT时是不需要在当前机器上额外配置任何Mysql或者是MariaDB的基本环境的,基于airlift提供的TestingMysql可以自动创建mysqld进程

而在DockerizedMariaServer中也只需要保证机器当前开启了docker服务即可,UT中会自动拉取最新的镜像并运行对应的container