
Author: Gabriel Guerin

Translated and edited by: ronghuaiyang. Source: AI公园 (AI Park)

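Overview

Collecting data that covers many different scenarios can be difficult. This article presents one way around the problem.

Deep learning models need large amounts of data to produce good results, and object detection models are no exception.

To train a YOLOv5 model that automatically detects your favorite toy, you would need to take thousands of photos of the toy in different contexts and, for each photo, annotate where the toy appears.

This is very time-consuming.

This article proposes a method that uses image segmentation and Stable Diffusion to generate an object detection dataset automatically.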

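Pipeline for generating a custom dataset

The pipeline for generating an object detection dataset consists of 4 steps:

Find a dataset of objects comparable to the one you want to detect (for example, a dog dataset when the target is a toy cat).
Use image segmentation to generate a mask of the dog.
Fine-tune the Stable Diffusion inpainting model.
Generate data using the Stable Diffusion inpainting model and the generated masks.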

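Image segmentation: generating the mask image

The Stable Diffusion inpainting pipeline takes a prompt, an image, and a mask image as input. The model generates new content only where the mask pixels are white.

The PixelLib library performs image segmentation in just a few lines of code. In this example we use the PointRend model to detect dogs. The segmentation code is shown below.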

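import pixellib
from pixellib.torchbackend.instance import instanceSegmentation

# Load the pretrained PointRend weights (the .pkl file must be available locally)
ins = instanceSegmentation()
ins.load_model("pointrend_resnet50.pkl")
# Restrict detection to the "dog" class
target_classes = ins.select_target_classes(dog=True)
# Segment the image and save the blended visualization
results, output = ins.segmentImage(
  "dog.jpg",
  show_bboxes=True,
  segment_target_classes=target_classes,
  output_image_name="mask_image.jpg"
)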

Image segmentation with pixellib

The segmentImage function returns a tuple:

results: a dictionary with the fields 'boxes', 'class_ids', 'class_names', 'object_counts', 'scores', 'masks', and 'extracted_objects'.
output: the original image blended with the mask image, plus bounding boxes if show_bboxes is set to True.
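Since the goal is a detection dataset, the 'boxes' entries are what eventually become labels. The sketch below converts one pixel-space box into the normalized `class x_center y_center width height` line used in YOLOv5 label files. It assumes each box is given as `[x1, y1, x2, y2]` in pixels, which should be checked against the actual PixelLib output. The function name `to_yolo_line` and the class id 0 are illustrative choices, not part of the original article.

```python
def to_yolo_line(box, image_width, image_height, class_id=0):
    """Convert a pixel-space [x1, y1, x2, y2] box to a YOLO label line.

    Assumption: the box uses corner coordinates in pixels. `class_id` is a
    hypothetical index for the new object class.
    """
    x1, y1, x2, y2 = box
    # YOLO stores the box centre and size, normalized by the image dimensions.
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x100 box centred in a 400x400 image.
    print(to_yolo_line([100, 150, 300, 250], 400, 400))
```

Because the coordinates are normalized, the same label line stays valid if the image is later resized without cropping.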

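Generating the mask image

The mask we generate contains only black and white pixels. We make it slightly larger than the dog in the original image so that Stable Diffusion has enough room to inpaint. To achieve this, we shift the mask by 10 pixels to the left, right, top, and bottom and combine the results.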

from PIL import Image
import numpy as np

width, height = 512, 512
image = Image.open("dog.jpg")

# Accumulate the masks of the dogs found by the PointRend model.
# `results` comes from the segmentImage call above. image.size is (width, height),
# so the array is indexed as [x, y] until it is transposed back below.
mask_image = np.zeros(image.size)
for idx, mask in enumerate(results["masks"].transpose()):
  if results["class_names"][idx] == "dog":
    mask_image += mask

# Grow the mask by adding copies of the original shifted 10 pixels in each direction
# (note that np.roll wraps around at the image borders)
base_mask = mask_image.copy()
mask_image += np.roll(base_mask, 10, axis=0)   # shift 10 pixels to the right
mask_image += np.roll(base_mask, -10, axis=0)  # shift 10 pixels to the left
mask_image += np.roll(base_mask, 10, axis=1)   # shift 10 pixels down
mask_image += np.roll(base_mask, -10, axis=1)  # shift 10 pixels up

# Set every non-black pixel to white and restore the (height, width) layout
mask_image = np.clip(mask_image, 0, 1).transpose() * 255
# Save the mask image
mask_image = Image.fromarray(np.uint8(mask_image)).resize((width, height))
mask_image.save("mask_image.jpg")

Generating the mask image from the pixellib output
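One caveat with the code above is that `np.roll` wraps around: pixels pushed past one edge reappear on the opposite edge. For a dog touching the image border, this paints a stray strip on the far side of the mask. The sketch below shows the same four-direction growth without wrap-around. It is written in plain Python on nested lists so the logic is easy to follow; `grow_mask` is a hypothetical helper, not part of PixelLib or NumPy, and a real implementation would more likely use array slicing.

```python
def grow_mask(mask, shift=10):
    """Grow a binary mask by `shift` pixels left, right, up and down.

    `mask` is a list of rows containing 0/1 values. Unlike np.roll, pixels
    shifted past an edge are dropped instead of wrapping to the other side.
    """
    height, width = len(mask), len(mask[0])
    grown = [row[:] for row in mask]  # start from a copy of the original
    offsets = [(0, shift), (0, -shift), (shift, 0), (-shift, 0)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for dy, dx in offsets:
                ny, nx = y + dy, x + dx
                # Only write targets that fall inside the image.
                if 0 <= ny < height and 0 <= nx < width:
                    grown[ny][nx] = 1
    return grown


if __name__ == "__main__":
    tiny = [[0] * 5 for _ in range(5)]
    tiny[2][4] = 1  # a single pixel on the right edge
    for row in grow_mask(tiny, shift=1):
        print(row)
```

In the small example, the pixel on the right edge grows to the left, up, and down, while the leftmost column stays empty.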

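We now have the original dog image and its corresponding mask.

Mask generated from the dog image using pixellib

Fine-tuning the Stable Diffusion inpainting pipeline

Dreambooth is a technique for fine-tuning Stable Diffusion that teaches the model a new concept from only a few photos. We use it to fine-tune the inpainting model. The train_dreambooth_inpaint.py script shows how to fine-tune the Stable Diffusion model on your own dataset.

Hardware requirements for fine-tuning

With gradient_checkpointing and mixed_precision, the model can be fine-tuned on a single 24GB GPU. For a larger batch_size and faster training, use a GPU with at least 30GB of memory.

Installing dependencies

Before running the script, make sure the following dependencies are installed: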

pip install git+https://github.com/huggingface/diffusers.git
pip install -U -r requirements.txt

Then initialize the Accelerate environment:

accelerate config

You need a Hugging Face Hub account and an access token to use this code. Run the following command to log in with your token:

huggingface-cli login

Fine-tuning example

Hyperparameter tuning is critical for compute-heavy training runs like this one, and you should try different values on the machine you train on. I recommend the following parameters:

$ accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-inpainting" \
  --instance_data_dir="dog_images" \
  --output_dir="stable-diffusion-inpainting-toy-cat" \
  --instance_prompt="a photo of a toy cat" \
  --resolution=512 \
  --train_batch_size=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400 \
  --gradient_accumulation_steps=2 \
  --gradient_checkpointing \
  --train_text_encoder
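To read these flags together: with gradient accumulation, each optimizer update sees an effective batch of train_batch_size × gradient_accumulation_steps × number of processes. The arithmetic below works this out for the command above. It assumes a single GPU process and that max_train_steps counts optimizer updates, which is how the diffusers example scripts appear to behave but is worth confirming in the script itself. The variable names simply mirror the flags.

```python
# Values copied from the command above.
train_batch_size = 1
gradient_accumulation_steps = 2
max_train_steps = 400
num_processes = 1  # assumption: a single GPU process

# Samples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes
# Total image presentations over the run (images repeat across steps).
images_seen = effective_batch_size * max_train_steps

print(f"effective batch size: {effective_batch_size}")
print(f"image presentations during training: {images_seen}")
```

With only a handful of instance photos, each one is therefore shown to the model many times during the 400 steps.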

Running the Stable Diffusion inpainting pipeline

Stable Diffusion inpainting is a text2image diffusion model that generates realistic images from a masked image and a text prompt. We use the diffusers library (https://github.com/huggingface/diffusers) to run it.

import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline


# Image and mask (the image is resized to match the 512x512 mask)
image = Image.open("dog.jpg").resize((512, 512))
mask_image = Image.open("mask_image.jpg")


# Inpainting model: load the fine-tuned weights in half precision on the GPU
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stable-diffusion-inpainting-toy-cat",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
image = pipe(prompt="a toy cat", image=image, mask_image=mask_image).images[0]

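Running Stable Diffusion inpainting with the fine-tuned model.

Conclusion

To summarize:

Use pixellib for image segmentation to obtain a mask of the image.
Fine-tune the runwayml/stable-diffusion-inpainting model so that it learns the new toy cat concept.
Run StableDiffusionInpaintPipeline on the dog image with the fine-tuned model and the generated mask.

Final result

After all the steps, we have a new image in which the toy cat takes the place of the original dog, so both images can share the same bounding box.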

![img](Stable Diffusion Inpainting Generate a Custom Dataset for Object Detection.assets/Capturedecran2023-01-22a23_17_25_8025faada328368a6335c61ced262d96_800.jpg)

We can now generate a new image for every image in the dataset.
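A batch run mostly amounts to pairing every source image with its mask and calling the inpainting pipeline on each pair. The sketch below covers only the pairing and looping. The directory layout, the `_mask` filename suffix, and the `inpaint` callable are assumptions introduced for illustration; in practice `inpaint` would wrap the `pipe(...)` call shown earlier and save its result.

```python
from pathlib import Path


def find_pairs(image_dir, mask_dir, suffix="_mask"):
    """Return (image_path, mask_path) pairs, skipping images without a mask.

    Assumption: the mask for `dog1.jpg` is stored as `dog1_mask.jpg`.
    """
    pairs = []
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        mask_path = Path(mask_dir) / f"{image_path.stem}{suffix}.jpg"
        if mask_path.exists():
            pairs.append((image_path, mask_path))
    return pairs


def generate_dataset(image_dir, mask_dir, output_dir, inpaint):
    """Call `inpaint(image_path, mask_path, output_path)` for every pair.

    `inpaint` is a hypothetical callable supplied by the caller.
    Returns the list of output paths that were requested.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for image_path, mask_path in find_pairs(image_dir, mask_dir):
        # Keep the source filename so labels can be matched by name later.
        output_path = output_dir / image_path.name
        inpaint(image_path, mask_path, output_path)
        outputs.append(output_path)
    return outputs
```

Keeping the output filename identical to the source filename makes it straightforward to reuse the source image's bounding-box labels for the generated image.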

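Limitations

Stable Diffusion does not produce a good result every time, so the dataset still needs to be cleaned after generation.

The pipeline is computationally expensive: fine-tuning Stable Diffusion requires a GPU with 24GB of memory, and inference also needs a GPU.

This way of building a dataset is useful when images are hard to obtain. For example, if you need to detect forest fires, it is better to use this approach than to set a forest on fire. For ordinary scenarios, however, manual data annotation remains the standard practice.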


Original English article: https://www.sicara.fr/blog-technique/dataset-generation-fine-tune-stable-diffusion-inpainting

   

Using Stable Diffusion inpainting to generate your own object detection dataset