Hands-On | The Road to Making Flink Support Sharded Databases and Tables (Part 3)
WilliamGates
Published 2022-6-20 17:50
@Override
@SuppressWarnings("unchecked")
public ScanRuntimeProvider getScanRuntimeProvider(ScanContext runtimeProviderContext) {
    final JdbcRowDataInputFormat.Builder builder = JdbcRowDataInputFormat.builder()
            .setDrivername(options.getDriverName())
            .setDBUrl(options.getDbURL())
            .setUsername(options.getUsername().orElse(null))
            .setPassword(options.getPassword().orElse(null));
    if (readOptions.getFetchSize() != 0) {
        builder.setFetchSize(readOptions.getFetchSize());
    }
    final JdbcDialect dialect = options.getDialect();
    JdbcNumericBetweenParametersProvider jdbcNumericBetweenParametersProvider = null;
    // Numeric partitioning configuration: split [lowerBound, upperBound]
    // on the partition column into numPartitions ranges
    if (readOptions.getPartitionColumnName().isPresent()) {
        long lowerBound = readOptions.getPartitionLowerBound().get();
        long upperBound = readOptions.getPartitionUpperBound().get();
        int numPartitions = readOptions.getNumPartitions().get();
        jdbcNumericBetweenParametersProvider =
                new JdbcNumericBetweenParametersProvider(lowerBound, upperBound)
                        .ofBatchNum(numPartitions);
    }
    // Split by table: combine the table list with the optional numeric partitions
    List<TableItem> tableItems = options.getTables();
    builder.setParametersProvider(
            new JdbcMultiTableProvider(tableItems)
                    .withPartition(
                            jdbcNumericBetweenParametersProvider,
                            physicalSchema,
                            readOptions.getPartitionColumnName().orElse(null)));
    final RowType rowType = (RowType) physicalSchema.toRowDataType().getLogicalType();
    builder.setRowConverter(dialect.getRowConverter(rowType));
    builder.setRowDataTypeInfo(
            (TypeInformation<RowData>)
                    runtimeProviderContext.createTypeInformation(physicalSchema.toRowDataType()));
    return InputFormatProvider.of(builder.build());
}
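The JdbcMultiTableProvider wired in above is the custom JdbcParameterValuesProvider from this series; its body is not repeated here. The following is a minimal sketch, under stated assumptions, of how it could expand the table list and the optional numeric partition provider into the Serializable[x][y] parameter matrix that JdbcRowDataInputFormat consumes. The TableItem accessors getUrl()/getTable() and the field names are assumptions for illustration; params[0] (JDBC URL) and params[1] (table name) match how open() reads the matrix later in this article.

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.connector.jdbc.split.JdbcNumericBetweenParametersProvider;
import org.apache.flink.connector.jdbc.split.JdbcParameterValuesProvider;
import org.apache.flink.table.api.TableSchema;

public class JdbcMultiTableProvider implements JdbcParameterValuesProvider {

    // TableItem (shard URL + physical table name) is the custom descriptor
    // from earlier in this series; accessors below are assumed.
    private final List<TableItem> tableItems;
    private JdbcNumericBetweenParametersProvider partitionProvider;
    private String partitionColumn;

    public JdbcMultiTableProvider(List<TableItem> tableItems) {
        this.tableItems = tableItems;
    }

    public JdbcMultiTableProvider withPartition(
            JdbcNumericBetweenParametersProvider partitionProvider,
            TableSchema physicalSchema,
            String partitionColumn) {
        // physicalSchema is kept for parity with the call site above
        this.partitionProvider = partitionProvider;
        this.partitionColumn = partitionColumn;
        return this;
    }

    @Override
    public Serializable[][] getParameterValues() {
        // Per-partition BETWEEN bounds, e.g. [[0, 499], [500, 999]], or a
        // single empty row when no partition column is configured.
        Serializable[][] partitions = partitionProvider != null
                ? partitionProvider.getParameterValues()
                : new Serializable[][] {new Serializable[0]};
        List<Serializable[]> rows = new ArrayList<>();
        for (TableItem item : tableItems) {            // every physical table...
            for (Serializable[] bounds : partitions) { // ...crossed with every range
                Serializable[] row = new Serializable[2 + bounds.length];
                row[0] = item.getUrl();    // params[0]: JDBC URL of the shard
                row[1] = item.getTable();  // params[1]: physical table name
                System.arraycopy(bounds, 0, row, 2, bounds.length);
                rows.add(row);
            }
        }
        return rows.toArray(new Serializable[0][]);
    }
}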
2.3.5 Modifying JdbcRowDataInputFormat
In JdbcRowDataInputFormat#open(InputSplit inputSplit), we initialize the Connection, the statement, and the SQL query template.
Over the lifetime of a JdbcRowDataInputFormat, each parallel instance calls openInputFormat() exactly once; the matching teardown method for that instance is closeInputFormat().
Every switch to a new split triggers one call to open(InputSplit inputSplit) (the matching teardown for the current split is close()). The split number maps to the first index x of the Serializable[x][y] parameter matrix and increases monotonically, and no split is processed twice across parallel instances. For example, with 1024 physical tables and 2 data partitions per table, inputSplit.getSplitNumber() ranges over [0~2047]. Each JdbcRowDataInputFormat instance holds the full Serializable[x][y] matrix and uses open(InputSplit inputSplit) to locate the partition it should read, so the partitions are queried concurrently, scaling with the configured parallelism.
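For reference, the split numbers come from createInputSplits(int minNumSplits). This mirrors the upstream JdbcRowDataInputFormat, which creates one GenericInputSplit per row of the parameter matrix, and is why 1024 tables with 2 partitions each yield split numbers in [0~2047]:

@Override
public InputSplit[] createInputSplits(int minNumSplits) {
    if (parameterValues == null) {
        // No parameter matrix: a single split reads the whole table
        return new GenericInputSplit[] {new GenericInputSplit(0, 1)};
    }
    GenericInputSplit[] ret = new GenericInputSplit[parameterValues.length];
    for (int i = 0; i < ret.length; i++) {
        // splitNumber == i, so getSplitNumber() indexes straight into
        // parameterValues[x]
        ret[i] = new GenericInputSplit(i, ret.length);
    }
    return ret;
}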
The sample code for open(InputSplit inputSplit) follows:
@Override
public void open(InputSplit inputSplit) throws IOException {
    try {
        // Shard routing: each row of the parameter matrix describes one
        // (database, table, partition-range) combination
        Object[] params = parameterValues[inputSplit.getSplitNumber()];
        // Initialize the database connection; the JDBC URL is params[0]
        initConnect(params);
        String url = params[0].toString();
        final JdbcDialect dialect = RdbsDialects.get(url).get();
        // Build the query template; the physical table name is params[1]
        String queryTemplate = queryTemplate(params, dialect);
        statement = dbConn.prepareStatement(queryTemplate, resultSetType, resultSetConcurrency);
        if (inputSplit != null && parameterValues != null) {
            // Entries from index 2 onwards are the partition parameters
            for (int i = 2; i < parameterValues[inputSplit.getSplitNumber()].length; i++) {
                Object param = parameterValues[inputSplit.getSplitNumber()][i];
                if (param instanceof String) {
                    statement.setString(i - 1, (String) param);
                } else if (param instanceof Long) {
                    statement.setLong(i - 1, (Long) param);
                } else if (param instanceof Integer) {
                    statement.setInt(i - 1, (Integer) param);
                // ... extend with other types if needed
                } else {
                    throw new IllegalArgumentException(
                            "open() failed. Parameter " + i + " of type "
                                    + param.getClass() + " is not handled (yet).");
                }
            }
            if (LOG.isDebugEnabled()) {
                LOG.debug(String.format("Executing '%s' with parameters %s",
                        queryTemplate,
                        Arrays.deepToString(parameterValues[inputSplit.getSplitNumber()])));
            }
        }
        resultSet = statement.executeQuery();
        hasNext = resultSet.next();
    } catch (SQLException se) {
        throw new IllegalArgumentException("open() failed." + se.getMessage(), se);
    }
}
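open() relies on two helpers, initConnect(params) and queryTemplate(params, dialect), whose bodies the article does not show. Below is a minimal, purely hypothetical sketch of what they might look like, assuming the modified JdbcRowDataInputFormat keeps dbConn, username, password, selectFields, and partitionColumn as fields (all of these names are assumptions):

// Hypothetical: open a connection to the shard this split belongs to.
private void initConnect(Object[] params) throws SQLException {
    // params[0] carries the JDBC URL of the shard (assumed field layout)
    String url = params[0].toString();
    dbConn = DriverManager.getConnection(url, username, password);
}

// Hypothetical: rebuild the SELECT against the concrete shard table.
private String queryTemplate(Object[] params, JdbcDialect dialect) {
    // params[1] carries the physical table name of this split
    String table = params[1].toString();
    String select = dialect.getSelectFromStatement(table, selectFields, new String[0]);
    if (partitionColumn != null) {
        // The two placeholders are bound in open() from params[2..]
        select += " WHERE " + dialect.quoteIdentifier(partitionColumn) + " BETWEEN ? AND ?";
    }
    return select;
}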
With the modifications above in place, the flink-jdbc-connector source is extended from reading a single database and table to reading sharded databases and tables.
Reposted from the WeChat official account: 中间件兴趣圈