In Practice | Flink Does Not Support Sharded Databases and Tables: The Road to Retrofitting It (Part 3)

WilliamGates
Published 2022-6-20 17:50

 

@Override
@SuppressWarnings("unchecked")
public ScanRuntimeProvider getScanRuntimeProvider(ScanContext runtimeProviderContext) {
    final JdbcRowDataInputFormat.Builder builder = JdbcRowDataInputFormat.builder()
        .setDrivername(options.getDriverName())
        .setDBUrl(options.getDbURL())
        .setUsername(options.getUsername().orElse(null))
        .setPassword(options.getPassword().orElse(null));

    if (readOptions.getFetchSize() != 0) {
        builder.setFetchSize(readOptions.getFetchSize());
    }
    final JdbcDialect dialect = options.getDialect();
    // Stays null when no partition column is configured
    JdbcNumericBetweenParametersProvider jdbcNumericBetweenParametersProvider = null;
    // Data-split (range partition) configuration
    if (readOptions.getPartitionColumnName().isPresent()) {
        long lowerBound = readOptions.getPartitionLowerBound().get();
        long upperBound = readOptions.getPartitionUpperBound().get();
        int numPartitions = readOptions.getNumPartitions().get();
        jdbcNumericBetweenParametersProvider =
            new JdbcNumericBetweenParametersProvider(lowerBound, upperBound).ofBatchNum(numPartitions);
    }
    // Split by table, combined with the optional range partitioning above
    List<TableItem> tableItems = options.getTables();
    builder.setParametersProvider(new JdbcMultiTableProvider(tableItems)
        .withPartition(jdbcNumericBetweenParametersProvider, physicalSchema,
            readOptions.getPartitionColumnName().orElse(null)));

    final RowType rowType = (RowType) physicalSchema.toRowDataType().getLogicalType();
    builder.setRowConverter(dialect.getRowConverter(rowType));
    builder.setRowDataTypeInfo((TypeInformation<RowData>) runtimeProviderContext
        .createTypeInformation(physicalSchema.toRowDataType()));

    return InputFormatProvider.of(builder.build());
}

 

2.3.5 Modifying JdbcRowDataInputFormat


In open(InputSplit inputSplit) of JdbcRowDataInputFormat, initialize the Connection, the statement, and the SQL query template.

 

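Over the whole lifecycle of JdbcRowDataInputFormat, each parallel instance calls openInputFormat() exactly once. The matching method that shuts down the parallel instance is closeInputFormat().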
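The call order described here can be illustrated with a small simulation. This is only a sketch: LifecycleSketch and runInstance are hypothetical names, the class does not extend any Flink type, and it records method calls instead of touching JDBC.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an input format; it only records lifecycle calls. */
class LifecycleSketch {
    final List<String> calls = new ArrayList<>();

    void openInputFormat() { calls.add("openInputFormat"); }    // once per parallel instance
    void open(int splitNumber) { calls.add("open(" + splitNumber + ")"); } // once per split
    void close() { calls.add("close"); }                        // once per split
    void closeInputFormat() { calls.add("closeInputFormat"); }  // once per parallel instance

    /** Drives one parallel instance through the splits assigned to it. */
    static List<String> runInstance(int[] assignedSplits) {
        LifecycleSketch format = new LifecycleSketch();
        format.openInputFormat();
        for (int split : assignedSplits) {
            format.open(split);
            // rows of this split would be read here
            format.close();
        }
        format.closeInputFormat();
        return format.calls;
    }

    public static void main(String[] args) {
        // Assumption for illustration: this instance was handed splits 0 and 3.
        System.out.println(runInstance(new int[] {0, 3}));
    }
}
```

The practical consequence is that anything tied to a specific shard, such as the connection to that shard's database, has to be set up per split rather than per instance, because consecutive splits may point at different databases.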

 

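Every time the split changes, open(InputSplit inputSplit) is called once (the matching method that closes the current data split is close()). The split number of inputSplit corresponds to x in Serializable[x][y] and increases from split to split, and no split is processed twice across the parallel instances. For example, with 1024 sharded tables and 2 data splits per table, inputSplit.getSplitNumber() ranges over [0, 2047]. The JdbcRowDataInputFormat object holds Serializable[x][y] and uses the split passed to open(InputSplit inputSplit) to locate the partition it has to process, so the partitions are queried concurrently, spread across the configured parallelism.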
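To make the Serializable[x][y] layout concrete, here is a minimal sketch of how such a matrix could be built. The internals of JdbcMultiTableProvider are not shown in this part, so everything here is an assumption: the class ShardSplitSketch, its methods, the sample URL and table names are hypothetical, and the row layout [url, table, rangeStart, rangeEnd] is inferred from the way open() reads index 0 as the URL, index 1 as the table, and index 2 onward as data-split parameters. The range splitting is a plain even split, not Flink's JdbcNumericBetweenParametersProvider.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ShardSplitSketch {

    /** Splits [lower, upper] into numPartitions contiguous {start, end} ranges.
     *  Assumes the range holds at least numPartitions values. */
    static long[][] ranges(long lower, long upper, int numPartitions) {
        long total = upper - lower + 1;
        long base = total / numPartitions;
        long extra = total % numPartitions;   // the first `extra` ranges get one more value
        long[][] out = new long[numPartitions][2];
        long start = lower;
        for (int i = 0; i < numPartitions; i++) {
            long size = base + (i < extra ? 1 : 0);
            out[i][0] = start;
            out[i][1] = start + size - 1;
            start += size;
        }
        return out;
    }

    /** One row per (shard, range): [url, table, rangeStart, rangeEnd]. */
    static Serializable[][] buildParameters(List<String[]> shards, long lower, long upper, int numPartitions) {
        long[][] r = ranges(lower, upper, numPartitions);
        List<Serializable[]> rows = new ArrayList<>();
        for (String[] shard : shards) {          // shard[0] = JDBC URL, shard[1] = table name
            for (long[] range : r) {
                rows.add(new Serializable[] {shard[0], shard[1], range[0], range[1]});
            }
        }
        return rows.toArray(new Serializable[0][]);
    }

    public static void main(String[] args) {
        List<String[]> shards = new ArrayList<>();
        for (int i = 0; i < 1024; i++) {
            shards.add(new String[] {"jdbc:mysql://host/db", "t_order_" + i}); // hypothetical names
        }
        Serializable[][] params = buildParameters(shards, 1, 100, 2);
        System.out.println(params.length); // number of splits: 1024 tables x 2 ranges
    }
}
```

With this layout the row index is the split number, so row x describes exactly one table and one key range, which is what lets open() work from inputSplit.getSplitNumber() alone.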

 

Sample code:

@Override
public void open(InputSplit inputSplit) throws IOException {
    try {
        // Sharded database / sharded table logic: pick the parameter row for this split
        Object[] params = parameterValues[inputSplit.getSplitNumber()];
        // Initialize the database connection; the URL is params[0].toString()
        initConnect(params);
        String url = params[0].toString();
        final JdbcDialect dialect = RdbsDialects.get(url).get();
        // Build the query template; the table name is params[1].toString()
        String queryTemplate = queryTemplate(params, dialect);
        statement = dbConn.prepareStatement(queryTemplate, resultSetType, resultSetConcurrency);

        // Note: params was already dereferenced above, so this check is always true here
        if (inputSplit != null && parameterValues != null) {
            // Data-split parameters start at index 2 and bind to JDBC placeholders 1, 2, ...
            for (int i = 2; i < params.length; i++) {
                Object param = params[i];
                if (param instanceof String) {
                    statement.setString(i - 1, (String) param);
                } else if (param instanceof Long) {
                    statement.setLong(i - 1, (Long) param);
                } else if (param instanceof Integer) {
                    statement.setInt(i - 1, (Integer) param);
                ...
                    //extends with other types if needed
                    throw new IllegalArgumentException("open() failed. Parameter " + i + " of type " + param.getClass() + " is not handled (yet).");
                }
            }
            if (LOG.isDebugEnabled()) {
                LOG.debug(String.format("Executing '%s' with parameters %s", queryTemplate, Arrays.deepToString(params)));
            }
        }
        resultSet = statement.executeQuery();
        hasNext = resultSet.next();
    } catch (SQLException se) {
        throw new IllegalArgumentException("open() failed." + se.getMessage(), se);
    }
}
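The body of queryTemplate(params, dialect) is not shown in the article. The sketch below is one way such a helper might look, under stated assumptions: QueryTemplateSketch, its signature, and the column and table names are all hypothetical, and identifier quoting through the dialect is left out to keep the example free of Flink dependencies. It also spells out the index shift used in open(), where row index i binds to JDBC placeholder i - 1.

```java
class QueryTemplateSketch {

    /** Hypothetical helper: SELECT <columns> FROM <table> [WHERE <col> BETWEEN ? AND ?].
     *  params follows the row layout assumed from open(): [url, table, rangeStart, rangeEnd]. */
    static String queryTemplate(Object[] params, String[] columns, String partitionColumn) {
        String table = params[1].toString();
        StringBuilder sql = new StringBuilder("SELECT ")
            .append(String.join(", ", columns))
            .append(" FROM ")
            .append(table);   // the table name is concatenated, not bound as a placeholder
        if (partitionColumn != null && params.length > 2) {
            sql.append(" WHERE ").append(partitionColumn).append(" BETWEEN ? AND ?");
        }
        return sql.toString();
    }

    /** Row index 2 is the first data-split value and maps to JDBC placeholder 1. */
    static int jdbcIndex(int paramIndex) {
        return paramIndex - 1;
    }

    public static void main(String[] args) {
        Object[] params = {"jdbc:mysql://host/db", "t_order_7", 51L, 100L}; // hypothetical row
        System.out.println(queryTemplate(params, new String[] {"id", "amount"}, "id"));
    }
}
```

The table name has to be written into the SQL text because JDBC placeholders can only stand for values, not identifiers. That is why the template is rebuilt in every open() call instead of being prepared once per parallel instance.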

 

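With the changes above in place, the source side of flink-jdbc-connector moves from supporting a single database and a single table to supporting sharded databases and sharded tables.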

 

Reposted from the WeChat official account 中间件兴趣圈

Last modified 2022-6-20 17:50:30