
Deploying the YOLOv5s 4.0 Model with TensorRT Quantization

          2021-02-02 20:13

[GiantPandaCV Introduction] This article is a tutorial on TensorRT int8 quantization and deployment of the yolov5s 4.0 model, with all of the code open-sourced. It mainly shows how to set up the TensorRT environment, convert the PyTorch model to ONNX, apply TensorRT int8 quantization to the ONNX model, and run inference with the quantized model. Measured on a 1070 GPU, inference reaches 3.3 ms per frame! The code is available at https://github.com/Wulingtian/yolov5_tensorrt_int8_tools and https://github.com/Wulingtian/yolov5_tensorrt_int8. Stars are welcome.

0x0. A Brief Introduction to YOLOv5

If there is one family of algorithms that has been deployed most widely in object detection, the YOLO series deserves the title. From yolov1 to today's "yolov5", and however controversial the yolov5 name may be, deployment engineers keep reaching for it, because it really is both fast and accurate: it topped the leaderboard of the Kaggle Global Wheat Detection competition and passed 8k GitHub stars in under a year, proving itself on hard results. In short: use it, use it, use it! (On my 1070 GPU, the yolov5s 4.0 model quantized to int8 with TensorRT runs inference at 3.3 ms per frame!)

Inference process demonstration (figure)

0x1. Environment Setup

• Ubuntu: 18.04
• CUDA: 11.0
• cuDNN: 8.0
• TensorRT: 7.2.1.6
• OpenCV: 3.4.2
• Packages for CUDA, cuDNN, TensorRT and OpenCV (pre-built; you can also download and build them yourself from the official sites) are available at https://pan.baidu.com/s/1dpMRyzLivnBAca2c_DIgGw, password: 0rct
• CUDA installation
  • If a driver is already installed on the system, uninstall it with:
  • sudo apt-get purge nvidia*
  • Disable nouveau:
  • sudo vim /etc/modprobe.d/blacklist.conf
  • Append blacklist nouveau at the end of the file
  • Then run sudo update-initramfs -u, chmod +x cuda_11.0.2_450.51.05_linux.run, sudo ./cuda_11.0.2_450.51.05_linux.run
  • Accept the license agreement: accept
  • Select Install
  • Press Enter to finish
  • vim ~/.bashrc and add the following:
  • export PATH=/usr/local/cuda-11.0/bin:$PATH
  • export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH
  • source ~/.bashrc to activate the environment
• cuDNN installation
  • tar -xzvf cudnn-11.0-linux-x64-v8.0.4.30.tgz
  • cd cuda/include
  • sudo cp *.h /usr/local/cuda-11.0/include
  • cd cuda/lib64
  • sudo cp libcudnn* /usr/local/cuda-11.0/lib64
• TensorRT and OpenCV installation
  • Go to your home directory
  • tar -xzvf TensorRT-7.2.1.6.Ubuntu-18.04.x86_64-gnu.cuda-11.0.cudnn8.0.tar.gz
  • cd TensorRT-7.2.1.6/python; this directory contains TensorRT wheels for four Python versions
  • sudo pip3 install tensorrt-7.2.1.6-cp37-none-linux_x86_64.whl (pick the wheel matching your Python version)
  • pip install pycuda to install the Python CUDA bindings
  • Go to your home directory
  • unzip opencv-3.4.2.zip so OpenCV is ready for the inference step later
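Before moving on, it is worth quickly checking that the Python-side packages import cleanly. The snippet below is only a minimal sketch under the setup above; the file name check_env.py is illustrative:

# check_env.py -- minimal sanity check of the Python-side install (illustrative)
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates a CUDA context on import
import cv2

print("TensorRT:", trt.__version__)   # expect 7.2.1.6
print("GPU:", cuda.Device(0).name())  # e.g. GeForce GTX 1070
print("OpenCV:", cv2.__version__)     # expect 3.4.x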

0x2. Exporting yolov5s to ONNX

• pip install onnx
• pip install onnx-simplifier
• git clone https://github.com/ultralytics/yolov5.git
• cd yolov5/models
• vim common.py
• Replace the activation function in the BottleneckCSP class with ReLU, i.e. change it to self.act = nn.ReLU(inplace=True); TensorRT int8 quantization is unstable with LeakyReLU (this is a deep pitfall, make sure you avoid it). A sketch of this edit follows the list.
• After you have trained your model
• cd yolov5
• python models/export.py --weights <path to your trained weights> --img-size <training input size>
• python3 -m onnxsim <onnx model name> yolov5s-simple.onnx to obtain the final simplified ONNX model
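For reference, the activation swap mentioned above might look roughly like the following inside models/common.py. The class body is abbreviated and the commented-out original line is approximate, since the exact layout depends on the yolov5 version:

# models/common.py -- illustrative sketch of the activation swap (class body abbreviated)
import torch.nn as nn

class BottleneckCSP(nn.Module):
    def __init__(self):
        super().__init__()
        # ... convolution / BN layers unchanged ...
        # was (approximately): self.act = nn.LeakyReLU(0.1, inplace=True)
        self.act = nn.ReLU(inplace=True)  # ReLU quantizes stably under TensorRT int8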

0x3. Converting the ONNX Model to an int8 TensorRT Engine

• git clone https://github.com/Wulingtian/yolov5_tensorrt_int8_tools.git (stars appreciated)
• cd yolov5_tensorrt_int8_tools
• vim convert_trt_quant.py and edit the following parameters (an example configuration is sketched after this list)
  • BATCH_SIZE: how many images are fed per calibration batch
  • BATCH: how many calibration batches are run
  • height, width: input image height and width
  • CALIB_IMG_DIR: path to training images, used for calibration
  • onnx_model_path: path to the ONNX model
• python convert_trt_quant.py; the quantized model is saved under the models_save directory
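As a concrete example, the block of parameters at the top of convert_trt_quant.py might be edited to something like the following; the values are illustrative placeholders, not the repository defaults:

# convert_trt_quant.py -- the parameters the article says to edit (values are illustrative)
BATCH_SIZE = 4                             # images per calibration batch
BATCH = 100                                # number of calibration batches (400 images total here)
height = 640                               # network input height
width = 640                                # network input width
CALIB_IMG_DIR = "/path/to/train/images"    # training images used for int8 calibration
onnx_model_path = "yolov5s-simple.onnx"    # simplified ONNX model from section 0x2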

0x4. TensorRT Model Inference

• git clone https://github.com/Wulingtian/yolov5_tensorrt_int8.git (stars appreciated)

• cd yolov5_tensorrt_int8

• vim CMakeLists.txt

• Set the USER_DIR parameter to your own home directory

• vim yolov5s_infer.cc and edit the following parameters

• output_name1 output_name2 output_name3 (the yolov5 model has 3 outputs)

• You can look up the model's output names with netron

• pip install netron to install netron

• vim netron_yolov5s.py and paste in the following:

  • import netron
  • netron.start('path to the simplified onnx model here', port=3344)
• python netron_yolov5s.py to view the model's output names (a programmatic alternative is sketched after this list)

• trt_model_path: the quantized TensorRT inference engine (the .trt file under models_save)

• test_img: path to a test image

• INPUT_W INPUT_H: input image width and height

• NUM_CLASS: number of classes the model was trained on

• NMS_THRESH: NMS threshold

• CONF_THRESH: confidence threshold

• With the parameters configured, build and run:

  • mkdir build
  • cd build
  • cmake ..
  • make
  • ./YoloV5sEngine
• The program prints the average inference time and saves the prediction images to the current directory. Deployment is complete!
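If you prefer not to open netron just to read the three output names, the sketch below (assuming the onnx Python package installed in section 0x2; the file name is illustrative) prints them directly:

# list_outputs.py -- print the ONNX graph's output tensor names (illustrative alternative to netron)
import onnx

model = onnx.load("yolov5s-simple.onnx")
for out in model.graph.output:
    print(out.name)  # copy these into output_name1 / output_name2 / output_name3 in yolov5s_infer.cc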

0x5. A Look at the Core TensorRT int8 Quantization Code

# Quantization preprocessing is kept identical to the training preprocessing so the data stays aligned
import cv2
import numpy as np

# `width` and `height` are module-level globals holding the network input size
def preprocess_v1(image_raw):
    h, w, c = image_raw.shape
    image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
    # Calculate width and height and paddings
    r_w = width / w
    r_h = height / h
    if r_h > r_w:
        tw = width
        th = int(r_w * h)
        tx1 = tx2 = 0
        ty1 = int((height - th) / 2)
        ty2 = height - th - ty1
    else:
        tw = int(r_h * w)
        th = height
        tx1 = int((width - tw) / 2)
        tx2 = width - tw - tx1
        ty1 = ty2 = 0
    # Resize the image along the long side while maintaining the aspect ratio
    image = cv2.resize(image, (tw, th))
    # Pad the short side with (128,128,128)
    image = cv2.copyMakeBorder(
        image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, value=(128, 128, 128)
    )
    image = image.astype(np.float32)
    # Normalize to [0,1]
    image /= 255.0
    # HWC to CHW format
    image = np.transpose(image, [2, 0, 1])
    # CHW to NCHW format
    #image = np.expand_dims(image, axis=0)
    # Convert the image to row-major order, also known as "C order"
    #image = np.ascontiguousarray(image)
    return image
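Called on a single calibration image, the function returns a CHW float32 tensor in [0, 1]. A minimal usage sketch, assuming the module-level width/height globals are set to the network input size and using a placeholder image path:

# Illustrative usage of preprocess_v1 (the image path is a placeholder)
width, height = 640, 640                       # network input size read by preprocess_v1
img = cv2.imread("/path/to/calib/000001.jpg")  # one calibration image
blob = preprocess_v1(img)
print(blob.shape, blob.dtype)                  # (3, 640, 640) float32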

# Build the calibrator on top of IInt8EntropyCalibrator
import os
import logging
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

logger = logging.getLogger(__name__)  # module-level logger used below

class Calibrator(trt.IInt8EntropyCalibrator):
    def __init__(self, stream, cache_file=""):
        trt.IInt8EntropyCalibrator.__init__(self)
        self.stream = stream
        # device buffer large enough to hold one calibration batch
        self.d_input = cuda.mem_alloc(self.stream.calibration_data.nbytes)
        self.cache_file = cache_file
        stream.reset()

    def get_batch_size(self):
        return self.stream.batch_size

    def get_batch(self, names):
        batch = self.stream.next_batch()
        if not batch.size:
            # an empty batch signals the end of calibration
            return None

        # copy the host batch to the device and hand TensorRT the device pointer
        cuda.memcpy_htod(self.d_input, batch)

        return [int(self.d_input)]

    def read_calibration_cache(self):
        # If there is a cache, use it instead of calibrating again. Otherwise, implicitly return None.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                logger.info("Using calibration cache to save time: {:}".format(self.cache_file))
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            logger.info("Caching calibration data for future use: {:}".format(self.cache_file))
            f.write(cache)
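The Calibrator only assumes four things of the stream object it is handed: a batch_size attribute, a calibration_data buffer (whose nbytes sizes the device allocation), reset(), and next_batch() returning an empty array once the calibration data is exhausted. The repository ships its own stream class; the sketch below is only a minimal, hypothetical illustration of that interface:

# Hypothetical minimal calibration data stream satisfying the interface Calibrator uses
import glob

class CalibDataStream:
    def __init__(self, img_dir, batch_size, batch_count):
        self.batch_size = batch_size
        self.batch_count = batch_count
        self.files = sorted(glob.glob(os.path.join(img_dir, "*.jpg")))
        self.index = 0
        # one batch of NCHW float32 data; its nbytes sizes cuda.mem_alloc in Calibrator
        self.calibration_data = np.zeros((batch_size, 3, height, width), dtype=np.float32)

    def reset(self):
        self.index = 0

    def next_batch(self):
        if self.index >= self.batch_count:
            return np.array([])  # empty batch tells Calibrator.get_batch to stop
        for i in range(self.batch_size):
            img = cv2.imread(self.files[self.index * self.batch_size + i])
            self.calibration_data[i] = preprocess_v1(img)  # preprocessing from above
        self.index += 1
        return np.ascontiguousarray(self.calibration_data, dtype=np.float32)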

# Load the ONNX model and build the TensorRT engine
TRT_LOGGER = trt.Logger()  # module-level TensorRT logger used below

def get_engine(max_batch_size=1, onnx_file_path="", engine_file_path="",
               fp16_mode=False, int8_mode=False, calibration_stream=None, calibration_table_path="", save_engine=False):
    """Attempts to load a serialized engine if available, otherwise builds a new TensorRT engine and saves it."""
    def build_engine(max_batch_size, save_engine):
        """Takes an ONNX file and creates a TensorRT engine to run inference with"""
        with trt.Builder(TRT_LOGGER) as builder, \
                builder.create_network(1) as network, \
                trt.OnnxParser(network, TRT_LOGGER) as parser:

            # parse the ONNX model file
            if not os.path.exists(onnx_file_path):
                quit('ONNX file {} not found'.format(onnx_file_path))
            print('Loading ONNX file from path {}...'.format(onnx_file_path))
            with open(onnx_file_path, 'rb') as model:
                print('Beginning ONNX file parsing')
                parser.parse(model.read())
                assert network.num_layers > 0, 'Failed to parse ONNX model. \
                            Please check if the ONNX model is compatible '

            print('Completed parsing of ONNX file')
            print('Building an engine from file {}; this may take a while...'.format(onnx_file_path))

            # build the TensorRT engine
            builder.max_batch_size = max_batch_size
            builder.max_workspace_size = 1 << 30  # 1GB
            builder.fp16_mode = fp16_mode
            if int8_mode:
                builder.int8_mode = int8_mode
                assert calibration_stream, 'Error: a calibration_stream should be provided for int8 mode'
                builder.int8_calibrator = Calibrator(calibration_stream, calibration_table_path)
                print('Int8 mode enabled')
            engine = builder.build_cuda_engine(network)
            if engine is None:
                print('Failed to create the engine')
                return None
            print("Completed creating the engine")
            if save_engine:
                with open(engine_file_path, "wb") as f:
                    f.write(engine.serialize())
            return engine

    if os.path.exists(engine_file_path):
        # If a serialized engine exists, load it instead of building a new one.
        print("Reading engine from file {}".format(engine_file_path))
        with open(engine_file_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())
    else:
        return build_engine(max_batch_size, save_engine)
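Putting the pieces together, a hedged usage sketch might look like the following; CalibDataStream is the hypothetical stream sketched earlier and every path is a placeholder:

# Hedged usage sketch: build and save the int8 engine with the pieces above (paths are placeholders)
calib_stream = CalibDataStream("/path/to/train/images", batch_size=4, batch_count=100)

engine = get_engine(
    max_batch_size=4,
    onnx_file_path="yolov5s-simple.onnx",
    engine_file_path="models_save/yolov5s_int8.trt",
    fp16_mode=False,
    int8_mode=True,
    calibration_stream=calib_stream,
    calibration_table_path="models_save/calibration.cache",
    save_engine=True,
)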

0x6. A Look at the Core TensorRT Inference Code

// Data preprocessing is identical to the quantization preprocessing, so it is not shown here.
// Parse the model's three output tensors and fill `bboxes` with the predicted boxes.
void postProcessParall(const int height, const int width, int scale_idx, float postThres, tensor_t * origin_output, vector<int> Strides, vector<Anchor> Anchors, vector<Bbox> *bboxes)
{
    Bbox bbox;
    float cx, cy, w_b, h_b, score;
    int cid;
    const float *ptr = (float *)origin_output->pValue;
    for(unsigned long a = 0; a < 3; ++a){
        for(unsigned long h = 0; h < height; ++h){
            for(unsigned long w = 0; w < width; ++w){
                const float *cls_ptr = ptr + 5;
                cid = argmax(cls_ptr, cls_ptr + NUM_CLASS);
                score = sigmoid(ptr[4]) * sigmoid(cls_ptr[cid]);
                if(score >= postThres){
                    cx = (sigmoid(ptr[0]) * 2.f - 0.5f + static_cast<float>(w)) * static_cast<float>(Strides[scale_idx]);
                    cy = (sigmoid(ptr[1]) * 2.f - 0.5f + static_cast<float>(h)) * static_cast<float>(Strides[scale_idx]);
                    w_b = powf(sigmoid(ptr[2]) * 2.f, 2) * Anchors[scale_idx * 3 + a].width;
                    h_b = powf(sigmoid(ptr[3]) * 2.f, 2) * Anchors[scale_idx * 3 + a].height;
                    bbox.xmin = clip(cx - w_b / 2, 0.f, static_cast<float>(INPUT_W - 1));
                    bbox.ymin = clip(cy - h_b / 2, 0.f, static_cast<float>(INPUT_H - 1));
                    bbox.xmax = clip(cx + w_b / 2, 0.f, static_cast<float>(INPUT_W - 1));
                    bbox.ymax = clip(cy + h_b / 2, 0.f, static_cast<float>(INPUT_H - 1));
                    bbox.score = score;
                    bbox.cid = cid;
                    // (debug print omitted)
                    bboxes->push_back(bbox);
                }
                ptr += 5 + NUM_CLASS;
            }
        }
    }
}
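The arithmetic inside the inner loop is the standard YOLOv5 v4.0 box decode. For clarity, here is the same computation for a single anchor at a single grid cell as a NumPy sketch; the function and variable names are illustrative, not taken from the repository:

# NumPy sketch of the per-cell decode performed by the C++ loop above (illustrative names)
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_cell(pred, grid_x, grid_y, stride, anchor_w, anchor_h):
    """pred: raw (5 + NUM_CLASS) output vector for one anchor at one grid cell."""
    cx = (sigmoid(pred[0]) * 2.0 - 0.5 + grid_x) * stride  # box centre x in pixels
    cy = (sigmoid(pred[1]) * 2.0 - 0.5 + grid_y) * stride  # box centre y in pixels
    w = (sigmoid(pred[2]) * 2.0) ** 2 * anchor_w           # box width
    h = (sigmoid(pred[3]) * 2.0) ** 2 * anchor_h           # box height
    cls_scores = sigmoid(pred[5:])                         # per-class probabilities
    cid = int(np.argmax(cls_scores))
    score = sigmoid(pred[4]) * cls_scores[cid]             # objectness * best class score
    return cx, cy, w, h, score, cid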

0x7. Prediction Results

Prediction results (figures)

On my 1070 GPU, the yolov5s 4.0 model quantized to int8 with TensorRT runs inference at 3.3 ms per frame!


Follow GiantPandaCV for exclusive deep-learning content; we stick to original writing and share what we learn every day.


