TensorRT + YOLOv5 v6: A Complete C++ Deployment Walkthrough
Preface
I had previously tested YOLOv5 v6 on OpenCV DNN, OpenVINO, and ONNXRUNTIME, but version incompatibilities had kept me from testing it on TensorRT. When I ran CUDA 11.0 + cuDNN 8.4.x, I got the following error:

Could not load library cudnn_cnn_infer64_8.dll. Error code 126
Please make sure cudnn_cnn_infer64_8.dll is in your library path!

The real cause was that the cuDNN version was too new for TensorRT to support on CUDA 11.0. After downgrading to cuDNN 8.2.0 and reconfiguring the VS development environment, everything finally worked. So, to save you the same trouble, here are my exact software versions:
Win10 x64
CUDA 11.0.2
cuDNN 8.2.0
TensorRT 8.4.0
VS2017
OpenCV 4.5.4
GPU: 3050 Ti
Setting Up the Development Environment in VS2017
Configure the include directories:

Configure the library directories:

Note: my TensorRT package was extracted to D:\TensorRT-8.4.0.6
Configure the linker's additional dependencies (lib files) as follows:

The full list of lib files is below. (Important: the names differ between versions, so copy with care!)
nvinfer.lib
nvinfer_plugin.lib
nvonnxparser.lib
nvparsers.lib
cublas.lib
cublasLt.lib
cuda.lib
cudadevrt.lib
cudart.lib
cudart_static.lib
cudnn.lib
cudnn64_8.lib
cudnn_adv_infer.lib
cudnn_adv_infer64_8.lib
cudnn_adv_train.lib
cudnn_adv_train64_8.lib
cudnn_cnn_infer.lib
cudnn_cnn_infer64_8.lib
cudnn_cnn_train.lib
cudnn_cnn_train64_8.lib
cudnn_ops_infer.lib
cudnn_ops_infer64_8.lib
cudnn_ops_train.lib
cudnn_ops_train64_8.lib
cufft.lib
cufftw.lib
curand.lib
cusolver.lib
cusolverMg.lib
cusparse.lib
nppc.lib
nppial.lib
nppicc.lib
nppidei.lib
nppif.lib
nppig.lib
nppim.lib
nppist.lib
nppisu.lib
nppitc.lib
npps.lib
nvblas.lib
nvjpeg.lib
nvml.lib
nvrtc.lib
OpenCL.lib
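If you prefer CMake to clicking through VS property pages, the same include/library setup can be sketched roughly as below. This is an illustrative sketch, not from the article: the `yolov5_trt_detector.cpp` source name is an assumption, and since TensorRT ships no official CMake package, it is wired up by path.

```cmake
cmake_minimum_required(VERSION 3.18)
project(yolov5_trt_demo CXX)

# Matches the article's extraction path; adjust for your install (assumption).
set(TENSORRT_DIR "D:/TensorRT-8.4.0.6")

find_package(OpenCV REQUIRED)
find_package(CUDAToolkit REQUIRED)

# yolov5_trt_detector.cpp is a hypothetical file holding YOLOv5TRTDetector.
add_executable(yolov5_trt main.cpp yolov5_trt_detector.cpp)
target_include_directories(yolov5_trt PRIVATE
    ${TENSORRT_DIR}/include ${OpenCV_INCLUDE_DIRS})
target_link_directories(yolov5_trt PRIVATE ${TENSORRT_DIR}/lib)
target_link_libraries(yolov5_trt PRIVATE
    nvinfer nvinfer_plugin nvonnxparser nvparsers   # TensorRT
    CUDA::cudart CUDA::cublas                       # CUDA runtime pieces
    cudnn                                           # cuDNN
    ${OpenCV_LIBS})
```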
Converting the YOLOv5 Model: ONNX -> engine
Simply instantiate the YOLOv5TRTDetector class and call its onnx2engine method to convert the onnx file into an engine file:
auto detector = std::make_shared<YOLOv5TRTDetector>();
detector->onnx2engine("D:/python/yolov5-6.1/yolov5s.onnx", "D:/python/yolov5-6.1/yolov5s.engine", 0);
The output of the run looks like this:

The implementation of the method:
void YOLOv5TRTDetector::onnx2engine(std::string onnxfilePath, std::string enginefilePath, int type) {
    IBuilder* builder = createInferBuilder(gLogger);
    // ONNX models require an explicit-batch network definition
    const auto explicitBatch = 1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    nvinfer1::INetworkDefinition* network = builder->createNetworkV2(explicitBatch);
    auto parser = nvonnxparser::createParser(*network, gLogger);
    parser->parseFromFile(onnxfilePath.c_str(), 2);  // 2 = verbosity level
    for (int i = 0; i < parser->getNbErrors(); ++i) {
        std::cout << "load error: " << parser->getError(i)->desc() << std::endl;
    }
    printf("tensorRT load mask onnx model successfully!!!...\n");
    // create the inference engine
    IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(16 * (1 << 20));  // 16 MB workspace
    if (type == 1) {
        config->setFlag(nvinfer1::BuilderFlag::kFP16);
    }
    if (type == 2) {
        config->setFlag(nvinfer1::BuilderFlag::kINT8);
    }
    auto myengine = builder->buildEngineWithConfig(*network, *config);
    std::cout << "try to save engine file now~~~" << std::endl;
    std::ofstream p(enginefilePath, std::ios::binary);
    if (!p) {
        std::cerr << "could not open plan output file" << std::endl;
        return;
    }
    // serialize the engine and write it to disk
    IHostMemory* modelStream = myengine->serialize();
    p.write(reinterpret_cast<const char*>(modelStream->data()), modelStream->size());
    modelStream->destroy();
    myengine->destroy();
    network->destroy();
    parser->destroy();
    std::cout << "convert onnx model to TensorRT engine model successfully!" << std::endl;
}

Common errors:
Error Code 1: Cuda Runtime (driver shutting down)
Unexpected Internal Error: [virtualMemoryBuffer.cpp::nvinfer1::StdVirtualMemoryBufferImpl::~StdVirtualMemoryBufferImpl::121] Error Code 1: Cuda Runtime (driver shutting down)
The TensorRT objects must be released before the program exits, otherwise you get the error above:

context->destroy();
engine->destroy();
network->destroy();
parser->destroy();

With that, the problem is gone.
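Since TensorRT 8, the destroy() methods are deprecated in favor of plain delete, which means the release-before-exit requirement above can also be enforced with RAII rather than manual destroy() calls. A minimal sketch of the idea, with MockContext/MockEngine standing in for the real IExecutionContext/ICudaEngine (placeholder types, not TensorRT code):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Destruction log so the release order is observable.
static std::vector<std::string> g_destroyed;

struct MockEngine {
    ~MockEngine() { g_destroyed.push_back("engine"); }
};

struct MockContext {
    ~MockContext() { g_destroyed.push_back("context"); }
};

// Declare the engine first and the context second: C++ destroys members
// in reverse declaration order, so the context is always released before
// the engine -- exactly the order TensorRT expects.
struct Detector {
    std::unique_ptr<MockEngine> engine = std::make_unique<MockEngine>();
    std::unique_ptr<MockContext> context = std::make_unique<MockContext>();
};

void runDetector() {
    Detector d;  // leaving scope releases context, then engine
}
```

Because the cleanup happens in the destructor, it also runs on early returns and exceptions, which manual destroy() chains miss.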
Loading the YOLOv5 engine Model and Running Inference
After converting to both FP32 and FP16 engine files, the inference code and run results are as follows:
std::string label_map = "D:/python/yolov5-6.1/classes.txt";

int main(int argc, char** argv) {
    // load class names, one per line
    std::vector<std::string> classNames;
    std::ifstream fp(label_map);
    std::string name;
    while (!fp.eof()) {
        getline(fp, name);
        if (name.length()) {
            classNames.push_back(name);
        }
    }
    fp.close();

    auto detector = std::make_shared<YOLOv5TRTDetector>();
    detector->initConfig("D:/python/yolov5-6.1/yolov5s.engine", 0.4, 0.25);
    std::vector<DetectResult> results;
    cv::VideoCapture capture("D:/images/video/sample.mp4");
    cv::Mat frame;
    while (true) {
        bool ret = capture.read(frame);
        if (!ret) break;  // end of video
        detector->detect(frame, results);
        for (DetectResult dr : results) {
            cv::Rect box = dr.box;
            cv::putText(frame, classNames[dr.classId], cv::Point(box.tl().x, box.tl().y - 10),
                        cv::FONT_HERSHEY_SIMPLEX, .5, cv::Scalar(0, 0, 0));
        }
        cv::imshow("YOLOv5-6.1 + TensorRT8.4 - by gloomyfish", frame);
        char c = cv::waitKey(1);
        if (c == 27) { // press ESC to exit
            break;
        }
        // reset for next frame
        results.clear();
    }
    cv::waitKey(0);
    cv::destroyAllWindows();
    return 0;
}

Run results:
FP32 inference runs at around 80+ FPS

FP16 inference reaches around 100+ FPS, with TensorRT 8.4.0
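To reproduce FPS figures like these from your own run, a small std::chrono-based counter in the demo loop is enough. This FpsCounter is an illustrative helper (an assumed name, not part of the detector class), and nothing in it is TensorRT-specific:

```cpp
#include <cassert>
#include <chrono>

// Counts frames and divides by elapsed wall-clock time since construction.
class FpsCounter {
public:
    // call once per processed frame
    void tick() { ++frames_; }

    // average frames per second since the counter was created
    double fps() const {
        using namespace std::chrono;
        double seconds = duration<double>(steady_clock::now() - start_).count();
        return seconds > 0.0 ? frames_ / seconds : 0.0;
    }

private:
    std::chrono::steady_clock::time_point start_ = std::chrono::steady_clock::now();
    long frames_ = 0;
};
```

In the demo above you would construct one FpsCounter before the while loop, call tick() after each detect() call, and print fps() on exit or overlay it on the frame with cv::putText.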

Summary
