I. Loading a UFF model:
1. Create an empty network:
IBuilder* builder = createInferBuilder(gLogger);
INetworkDefinition* network = builder->createNetwork();

2. Create the UFF parser:
IUFFParser* parser = createUffParser();

3. Declare the network's input and output:
parser->registerInput("Input_0", DimsCHW(1, 28, 28), UffInputOrder::kNCHW);
parser->registerOutput("Binary_3");

4. Parse the model into the network at the specified precision:
parser->parse(uffFile, *network, nvinfer1::DataType::kFLOAT);

II. Building the engine
1. Set the configuration parameters maxBatchSize and maxWorkspaceSize (1 << 20 bytes = 1 MiB here), then build the engine:
builder->setMaxBatchSize(maxBatchSize);
IBuilderConfig * config = builder->createBuilderConfig();
config->setMaxWorkspaceSize(1 << 20);
ICudaEngine* engine = builder->buildEngineWithConfig(*network, *config);
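// Serialize the engine
IHostMemory *serializedModel = engine->serialize();
// Save the serialized bytes (serializedModel->data(), serializedModel->size()), e.g. to a file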
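
Saving the serialized data can be sketched as below. This is a minimal sketch, not part of TensorRT: the helper name saveEngineBytes and the file path are assumptions made for illustration. It only writes a raw byte buffer to a binary file.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper (not a TensorRT API): writes `size` bytes starting at
// `data` to a binary file and returns true if the write succeeded.
bool saveEngineBytes(const std::string& path, const void* data, std::size_t size) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;  // the file could not be opened for writing
    }
    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    out.flush();       // surface any buffered write error before reporting
    return out.good();
}
```

With the objects from the step above, a call might look like saveEngineBytes("model.engine", serializedModel->data(), serializedModel->size()); it has to happen before serializedModel is destroyed.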

2. Release the pointers:
engine->destroy();
parser->destroy();
network->destroy();
config->destroy();
builder->destroy();
serializedModel->destroy();
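
The manual destroy() calls above are easy to forget on an early return. One possible alternative, sketched here under assumptions, is a std::unique_ptr with a deleter that calls destroy(). DestroyDeleter, DestroyPtr and FakeObject are hypothetical names introduced for this sketch; FakeObject only stands in for a TensorRT object so the example compiles without the library.

```cpp
#include <memory>

// Deleter that calls destroy() instead of delete, matching the
// destroy() convention used by the objects in these notes.
struct DestroyDeleter {
    template <typename T>
    void operator()(T* obj) const {
        if (obj) {
            obj->destroy();
        }
    }
};

// Owning pointer that calls destroy() when it goes out of scope.
template <typename T>
using DestroyPtr = std::unique_ptr<T, DestroyDeleter>;

// Hypothetical stand-in for a TensorRT object (assumption for illustration):
// destroy() bumps an external counter and then frees the object.
struct FakeObject {
    int* destroyCount;
    void destroy() {
        ++(*destroyCount);
        delete this;
    }
};
```

Under this scheme, something like DestroyPtr<IBuilder> builder(createInferBuilder(gLogger)); would release the builder automatically. Objects declared later are destroyed first, so declaration order decides the release order.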

III. Loading the engine for forward inference:
1. Deserialize the engine from the saved model data:
IRuntime* runtime = createInferRuntime(gLogger);
ICudaEngine* engine = runtime->deserializeCudaEngine(modelData, modelSize, nullptr);
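
The modelData and modelSize arguments have to come from somewhere, typically a file written after serialization. A minimal sketch of reading such a file into memory follows; the helper name loadEngineBytes and the file name are assumptions for illustration, not TensorRT APIs.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a TensorRT API): reads a whole binary file into a
// byte buffer. Returns an empty vector if the file is missing, empty, or
// cannot be read completely.
std::vector<char> loadEngineBytes(const std::string& path) {
    // Open at the end so tellg() reports the file size.
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) {
        return {};
    }
    const std::streamsize size = in.tellg();
    if (size <= 0) {
        return {};
    }
    std::vector<char> buffer(static_cast<std::size_t>(size));
    in.seekg(0, std::ios::beg);
    if (!in.read(buffer.data(), size)) {
        return {};
    }
    return buffer;
}
```

The result could then be passed on as modelData = buffer.data() and modelSize = buffer.size(), after checking that the buffer is not empty.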