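Dremio's metadata affects both query execution and how BI tools work with it, so metadata handling has two sides: reading and writing. Dremio refreshes metadata on a schedule, and also ad hoc when a source is first created.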

Reference diagram

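Write path (save): for operations driven from the UI, the core is the save method implemented by DatasetSaver. The trace below was captured on a scheduled metadata-refresh thread and ends in the same method.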

[Figure: Dremio metadata handling]

 

Affect(class count: 1 , method count: 3) cost in 314 ms, listenerId: 4
ts=2022-10-12 06:27:29;thread_name=metadata-refresh-modifiable-scheduler-17;id=a3;is_daemon=true;priority=10;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
@com.dremio.exec.catalog.DatasetSaverImpl.save()
at com.dremio.exec.catalog.DatasetSaverImpl.save(DatasetSaverImpl.java:137)
at com.dremio.exec.catalog.MetadataSynchronizer.tryHandleExistingDataset(MetadataSynchronizer.java:311)
at com.dremio.exec.catalog.MetadataSynchronizer.handleExistingDataset(MetadataSynchronizer.java:229)
at com.dremio.exec.catalog.MetadataSynchronizer.synchronizeDatasets(MetadataSynchronizer.java:201)
at com.dremio.exec.catalog.MetadataSynchronizer.go(MetadataSynchronizer.java:134)
at com.dremio.exec.catalog.SourceMetadataManager$RefreshRunner.refreshFull(SourceMetadataManager.java:441)
at com.dremio.exec.catalog.SourceMetadataManager$BackgroundRefresh.run(SourceMetadataManager.java:555)
at com.dremio.exec.catalog.SourceMetadataManager.wakeup(SourceMetadataManager.java:264)
at com.dremio.exec.catalog.SourceMetadataManager.access$300(SourceMetadataManager.java:96)
at com.dremio.exec.catalog.SourceMetadataManager$WakeupWorker.run(SourceMetadataManager.java:203)
at com.dremio.service.scheduler.LocalSchedulerService$CancellableTask.run(LocalSchedulerService.java:226)
at com.jprofiler.agent.callee.RunnableTracking.run(ejt:19)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
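
The call chain in the trace above (refreshFull, then MetadataSynchronizer.go, synchronizeDatasets, handleExistingDataset, and finally DatasetSaverImpl.save) can be pictured with a much-simplified sketch. Everything below is a hypothetical model for illustration only, not Dremio's code: the class name SimpleSynchronizer, the map-based store, and the string "signature" that stands in for a dataset's schema are all assumptions, and the real classes handle namespaces, permissions, and failures that are left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified model of one metadata sync pass; not Dremio's implementation.
class SimpleSynchronizer {
    // Stands in for the persisted catalog: dataset path -> schema signature.
    private final Map<String, String> store = new HashMap<>();
    private int saves = 0;

    // Plays the role that DatasetSaver.save has in the trace: persist one dataset.
    private void save(String path, String signature) {
        store.put(path, signature);
        saves++;
    }

    // One refresh pass: save every dataset that is new or whose signature changed.
    void synchronize(Map<String, String> listedBySource) {
        for (Map.Entry<String, String> entry : listedBySource.entrySet()) {
            String known = store.get(entry.getKey());
            if (!Objects.equals(known, entry.getValue())) {
                save(entry.getKey(), entry.getValue());
            }
        }
    }

    int saveCount() {
        return saves;
    }

    Map<String, String> snapshot() {
        return new HashMap<>(store);
    }
}
```

In this model, running synchronize twice with the same listing saves nothing the second time, which is the point of comparing against stored metadata before calling save.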

Read path (the core is listTableSchemata, implemented on top of InformationSchemaCatalog)

ts=2022-10-12 06:23:25;thread_name=grpc-default-executor-31;id=385;is_daemon=true;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@18b4aac2
@com.dremio.exec.catalog.InformationSchemaCatalogImpl.listTableSchemata()
at com.dremio.exec.catalog.CatalogImpl.listTableSchemata(CatalogImpl.java:1720)
at com.dremio.exec.catalog.SourceAccessChecker.listTableSchemata(SourceAccessChecker.java:514)
at com.dremio.exec.catalog.DelegatingCatalog.listTableSchemata(DelegatingCatalog.java:365)
at com.dremio.exec.catalog.InformationSchemaServiceImpl.listTableSchemata(InformationSchemaServiceImpl.java:164)
at com.dremio.service.catalog.InformationSchemaServiceGrpc$MethodHandlers.invoke(InformationSchemaServiceGrpc.java:663)
at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
at io.opentracing.contrib.grpc.TracingServerInterceptor$2.onHalfClose(TracingServerInterceptor.java:231)
at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
at io.grpc.util.TransmitStatusRuntimeExceptionInterceptor$1.onHalfClose(TransmitStatusRuntimeExceptionInterceptor.java:74)
at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:340)
at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at com.jprofiler.agent.callee.RunnableTracking.run(ejt:19)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
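
In the read trace above, the gRPC service does not call InformationSchemaCatalogImpl directly: the request passes through DelegatingCatalog, SourceAccessChecker, and CatalogImpl first. A minimal sketch of that wrapper pattern follows. The interface and class names here are hypothetical and heavily trimmed, and the filtering rule (hiding names that start with "__") is an assumption made only to show why a checking layer might sit in the chain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical, trimmed-down interface; not the real InformationSchemaCatalog.
interface SchemaLister {
    List<String> listTableSchemata();
}

// Innermost layer: answers from stored metadata (here, a fixed list).
class StoredSchemaLister implements SchemaLister {
    private final List<String> stored;

    StoredSchemaLister(List<String> stored) {
        this.stored = stored;
    }

    public List<String> listTableSchemata() {
        return new ArrayList<>(stored);
    }
}

// Pass-through wrapper: forwards every call to the wrapped lister.
class DelegatingLister implements SchemaLister {
    protected final SchemaLister delegate;

    DelegatingLister(SchemaLister delegate) {
        this.delegate = delegate;
    }

    public List<String> listTableSchemata() {
        return delegate.listTableSchemata();
    }
}

// Wrapper that drops entries the caller should not see (rule is an assumption).
class FilteringLister extends DelegatingLister {
    private final Predicate<String> visible;

    FilteringLister(SchemaLister delegate, Predicate<String> visible) {
        super(delegate);
        this.visible = visible;
    }

    @Override
    public List<String> listTableSchemata() {
        List<String> result = new ArrayList<>();
        for (String schema : delegate.listTableSchemata()) {
            if (visible.test(schema)) {
                result.add(schema);
            }
        }
        return result;
    }
}
```

Stacking the wrappers as new DelegatingLister(new FilteringLister(new StoredSchemaLister(...), rule)) mirrors the order of frames in the trace, with each layer adding one concern before handing the call on.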

 

Scheduled task that refreshes metadata

[Figure: Dremio metadata handling (2)]

 

 

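Handler for listing JDBC datasets. The code lives in the JDBC storage plugin (part of the community edition, but not open source for now). The core implementation depends on a fetcher service, JdbcSchemaFetcherImpl.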

 

// Decompiled inner class of JdbcStoragePlugin: a handle for a single JDBC table.
public class JdbcDatasetHandle implements DatasetHandle {
    private final EntityPath entityPath;
    private final String catalog;
    private final String schema;
    private final String table;
    // Fetched lazily on first use, then cached.
    private JdbcFetcherProto.GetTableMetadataResponse tableMetadataResponse = null;

    JdbcDatasetHandle(String catalog, String schema, String table) {
        this.catalog = catalog;
        this.schema = schema;
        this.table = table;
        // Entity path: source name, then catalog and schema when present, then table.
        ImmutableList.Builder<String> builder = ImmutableList.builder();
        builder.add(JdbcStoragePlugin.this.config.getSourceName());
        if (!Strings.isNullOrEmpty(catalog)) {
            builder.add(catalog);
        }

        if (!Strings.isNullOrEmpty(schema)) {
            builder.add(schema);
        }

        builder.add(table);
        this.entityPath = new EntityPath(builder.build());
    }

    @Override
    public EntityPath getDatasetPath() {
        return this.entityPath;
    }

    // Asks the fetcher service for table metadata once and reuses the response afterwards.
    JdbcFetcherProto.GetTableMetadataResponse getTableMetadataResponse() {
        if (this.tableMetadataResponse == null) {
            this.tableMetadataResponse = JdbcStoragePlugin.this.fetcher.getTableMetadata(
                GetTableMetadataRequest.newBuilder()
                    .setCatalog(this.catalog)
                    .setSchema(this.schema)
                    .setTable(this.table)
                    .build());
        }

        return this.tableMetadataResponse;
    }
}

// Decompiled inner class of JdbcStoragePlugin: lists every table the fetcher reports.
class JdbcIteratorListing implements DatasetHandleListing {
    // Iterators handed out so far, kept so that close() can release them.
    private final Set<CloseableIterator<JdbcFetcherProto.CanonicalizeTablePathResponse>> references = new HashSet<>();

    JdbcIteratorListing() {
    }

    @Override
    public Iterator<DatasetHandle> iterator() {
        CloseableIterator<JdbcFetcherProto.CanonicalizeTablePathResponse> iterator =
            JdbcStoragePlugin.this.fetcher.listTableNames(ListTableNamesRequest.newBuilder().build());
        this.references.add(iterator);
        // Wrap each returned table path in a dataset handle.
        return Iterators.transform(iterator, (input) -> {
            return JdbcStoragePlugin.this.new JdbcDatasetHandle(input.getCatalog(), input.getSchema(), input.getTable());
        });
    }

    @Override
    public void close() {
        try {
            AutoCloseables.close(this.references);
        } catch (Exception e) {
            JdbcStoragePlugin.LOGGER.warn("Error closing iterators when listing JDBC datasets.", e);
        }
    }
}
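
The decompiled handle above does two things worth isolating: it builds the dataset path while skipping an empty catalog or schema, and it fetches table metadata lazily, only once. The sketch below reproduces just those two behaviors with plain JDK types so it can be run without the plugin. SimpleDatasetHandle and its Supplier-based fetcher are hypothetical stand-ins, not part of Dremio.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for the decompiled JdbcDatasetHandle; types are reduced to strings.
class SimpleDatasetHandle {
    private final List<String> path;
    // Stands in for the call to the fetcher service.
    private final Supplier<String> metadataFetcher;
    // Null until metadata is first requested.
    private String cachedMetadata;

    SimpleDatasetHandle(String source, String catalog, String schema, String table,
                        Supplier<String> metadataFetcher) {
        List<String> parts = new ArrayList<>();
        parts.add(source);
        // Some databases have no catalog or no schema level, so skip empty components.
        if (catalog != null && !catalog.isEmpty()) {
            parts.add(catalog);
        }
        if (schema != null && !schema.isEmpty()) {
            parts.add(schema);
        }
        parts.add(table);
        this.path = Collections.unmodifiableList(parts);
        this.metadataFetcher = metadataFetcher;
    }

    List<String> getDatasetPath() {
        return path;
    }

    // Fetch on first call, then return the cached value.
    String getTableMetadata() {
        if (cachedMetadata == null) {
            cachedMetadata = metadataFetcher.get();
        }
        return cachedMetadata;
    }
}
```

Deferring the fetch matters when listing: a source may expose thousands of tables, and in this model creating a handle costs nothing until something actually asks for that table's metadata.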

Scheduling service

  • Reference implementation

[Figure: Dremio metadata handling (3)]
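
The first stack trace shows the refresh running on a daemon thread of a java.util.concurrent.ScheduledThreadPoolExecutor, reached through LocalSchedulerService. A rough sketch of the same idea using only the JDK follows. PeriodicRefresh is a hypothetical class written for this example; it does not use or reproduce Dremio's scheduler API, and the fixed-delay policy is an assumption.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a periodic metadata refresh; Dremio's scheduler service is not used.
class PeriodicRefresh implements AutoCloseable {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "metadata-refresh-sketch");
            // Daemon thread, so background refreshes never keep the JVM alive.
            thread.setDaemon(true);
            return thread;
        });

    // Runs refreshTask immediately, then again periodMillis after each run finishes.
    void start(Runnable refreshTask, long periodMillis) {
        executor.scheduleWithFixedDelay(() -> {
            try {
                refreshTask.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later runs, so report it here.
                System.err.println("refresh failed: " + e.getMessage());
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A caller would pass the refresh work as a Runnable, for example new PeriodicRefresh().start(task, 60_000), and close the instance on shutdown. Catching exceptions inside the scheduled wrapper is the design choice to note: with the JDK executor, a task that throws is silently never run again.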

Notes

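Metadata is central to Dremio, so it is worth having at least a rough picture of how it is handled. To make things easier for BI and other tools, Dremio also includes InformationSchemaCatalog, which provides INFORMATION_SCHEMA capabilities.
For JDBC and SourceMetadataManager, see my earlier write-ups.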

References

https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/store/ischema/Column.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/vector-tools/src/main/java/com/dremio/common/expression/SqlTypeNameVisitor.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/catalog/DatasetSaverImpl.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/catalog/InformationSchemaCatalogImpl.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/catalog/InformationSchemaCatalog.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/catalog/SourceMetadataManager.java
https://github.com/dremio/dremio-oss/blob/d41cb52143b6b0289fc8ed4d970bfcf410a669e8/sabot/kernel/src/main/java/com/dremio/exec/catalog/MetadataSynchronizer.java