
          Controlling the Mouse with Your Eyes Using OpenCV


          2022-02-17 22:16


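          How can you control the mouse with your eyes? This project uses a machine-learning approach to eye pose estimation from a single front-facing view. Every time the mouse is clicked, our code crops an image of your eyes. With this data we can train a model in the reverse direction: given your eyes, predict the position of the mouse. Before starting the project, we need to import the third-party libraries.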

          # For monitoring the web camera and performing image manipulations
          import cv2
          # For performing array operations
          import numpy as np
          # For creating and removing directories
          import os
          import shutil
          # For recognizing and performing actions on mouse presses
          from pynput.mouse import Listener

          First, let's look at how pynput's Listener works. pynput.mouse.Listener creates a background thread that records mouse movements and mouse clicks. Here is a simplified version that prints the mouse coordinates whenever you press a mouse button:

          from pynput.mouse import Listener

          def on_click(x, y, button, pressed):
              """
              Args:
                x: the x-coordinate of the mouse
                y: the y-coordinate of the mouse
                button: the pynput.mouse.Button that was used (e.g. Button.left or Button.right)
                pressed: True if the button was pressed, False if it was released
              """
              if pressed:
                  print(x, y)

          with Listener(on_click=on_click) as listener:
              listener.join()
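
The callback can be exercised without a real mouse, which makes it easier to check before wiring it to the listener. The following is a minimal sketch, not part of the original project: the `clicks` list and the `run_listener` helper are hypothetical names, and the pynput import is kept inside `run_listener` so the callback itself can be called directly.

```python
from typing import List, Tuple

# Hypothetical in-memory log of presses: (x, y, button name).
clicks: List[Tuple[int, int, str]] = []

def on_click(x, y, button, pressed):
    """Record presses only; releases are ignored."""
    if pressed:
        # pynput passes a pynput.mouse.Button enum; falling back to str()
        # lets the function be called with a plain string as well.
        name = getattr(button, "name", str(button))
        clicks.append((int(x), int(y), name))

def run_listener():
    """Start the blocking listener (needs pynput and a desktop session)."""
    from pynput.mouse import Listener
    with Listener(on_click=on_click) as listener:
        listener.join()
```

Calling `run_listener()` blocks until the listener stops; in pynput, returning `False` from a callback is the documented way to stop it.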

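          Now let's extend this framework for our purpose. First, though, we need code that crops the bounding boxes of the eyes; we will call it later from inside on_click. We use Haar cascade object detection to find the bounding boxes of the user's eyes. You will need the detector file, haarcascade_eye.xml. Here is a short demo of how it works: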

          import cv2
          import numpy as np

          # Load the cascade classifier detection object
          cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
          # Turn on the web camera
          video_capture = cv2.VideoCapture(0)
          # Read data from the web camera (get the frame)
          _, frame = video_capture.read()
          # Convert the image to grayscale
          gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
          # Predict the bounding box of the eyes
          boxes = cascade.detectMultiScale(gray, 1.3, 10)
          # Filter out images taken from a bad angle with errors
          # We want to make sure both eyes were detected, and nothing else
          if len(boxes) == 2:
              eyes = []
              for box in boxes:
                  # Get the rectangle parameters for the detected eye
                  x, y, w, h = box
                  # Crop the bounding box from the frame
                  eye = frame[y:y + h, x:x + w]
                  # Resize the crop to 32x32
                  eye = cv2.resize(eye, (32, 32))
                  # Normalize
                  eye = (eye - eye.min()) / (eye.max() - eye.min())
                  # Further crop to just around the eyeball
                  eye = eye[10:-10, 5:-5]
                  # Scale between [0, 255] and convert to int datatype
                  eye = (eye * 255).astype(np.uint8)
                  # Add the current eye to the list of 2 eyes
                  eyes.append(eye)
              # Concatenate the two eye images into one
              eyes = np.hstack(eyes)

          Now let's use this to write a function that crops the eye images. First, we need a helper function for normalization:

          def normalize(x):
              minn, maxx = x.min(), x.max()
              return (x - minn) / (maxx - minn)
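
One edge case in min-max scaling is a crop whose pixels all share the same value, which makes the denominator zero. The variant below is a sketch of one way to guard against that; `normalize_safe` is a hypothetical name, and returning all zeros for a constant image is an assumption rather than something the article specifies.

```python
import numpy as np

def normalize_safe(x):
    """Min-max scale to [0, 1]; a constant image maps to all zeros."""
    x = np.asarray(x, dtype=np.float32)
    minn, maxx = x.min(), x.max()
    span = maxx - minn
    if span == 0:
        # Assumed fallback: no contrast to scale, so return zeros.
        return np.zeros_like(x)
    return (x - minn) / span
```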

          Here is our eye-cropping function. If both eyes are found, it returns the image; otherwise it returns None:

          def scan(image_size=(32, 32)):
              _, frame = video_capture.read()
              gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
              boxes = cascade.detectMultiScale(gray, 1.3, 10)
              if len(boxes) == 2:
                  eyes = []
                  for box in boxes:
                      x, y, w, h = box
                      eye = frame[y:y + h, x:x + w]
                      eye = cv2.resize(eye, image_size)
                      eye = normalize(eye)
                      eye = eye[10:-10, 5:-5]
                      eyes.append(eye)
                  return (np.hstack(eyes) * 255).astype(np.uint8)
              else:
                  return None
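
The function stacks the two crops in whatever order the detector returns them. Assuming that order is not guaranteed to be stable between frames, it may help to sort the boxes by their x-coordinate so the same eye always lands on the same side of the stacked image. A small sketch, with `order_left_to_right` as a hypothetical helper:

```python
def order_left_to_right(boxes):
    """Return (x, y, w, h) boxes sorted by their left edge."""
    as_tuples = (tuple(int(v) for v in box) for box in boxes)
    return sorted(as_tuples, key=lambda b: b[0])
```

Inside scan, this would mean iterating over `order_left_to_right(boxes)` instead of `boxes`.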

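          Now let's write the automation that runs every time a mouse button is pressed. (This assumes the variable root was defined earlier in the code as the directory where we want to store the images.)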

          def on_click(x, y, button, pressed):
              # If the action was a mouse PRESS (not a RELEASE)
              if pressed:
                  # Crop the eyes
                  eyes = scan()
                  # If the function returned None, something went wrong
                  if eyes is not None:
                      # Save the image; os.path.join works whether or not
                      # root ends with a path separator
                      filename = os.path.join(root, "{} {} {}.jpeg".format(x, y, button))
                      cv2.imwrite(filename, eyes)

          Now we can bring back the pynput Listener and put together the complete code:

          import cv2
          import numpy as np
          import os
          import shutil
          from pynput.mouse import Listener

          root = input("Enter the directory to store the images: ")
          if os.path.isdir(root):
              resp = ""
              while resp not in ["Y", "N"]:
                  resp = input("This directory already exists. If you continue, the contents of the existing directory will be deleted. If you would still like to proceed, enter [Y]. Otherwise, enter [N]: ")
              if resp == "Y":
                  shutil.rmtree(root)
              else:
                  exit()
          os.mkdir(root)

          # Normalization helper function
          def normalize(x):
              minn, maxx = x.min(), x.max()
              return (x - minn) / (maxx - minn)

          # Eye cropping function
          def scan(image_size=(32, 32)):
              _, frame = video_capture.read()
              gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
              boxes = cascade.detectMultiScale(gray, 1.3, 10)
              if len(boxes) == 2:
                  eyes = []
                  for box in boxes:
                      x, y, w, h = box
                      eye = frame[y:y + h, x:x + w]
                      eye = cv2.resize(eye, image_size)
                      eye = normalize(eye)
                      eye = eye[10:-10, 5:-5]
                      eyes.append(eye)
                  return (np.hstack(eyes) * 255).astype(np.uint8)
              else:
                  return None

          def on_click(x, y, button, pressed):
              # If the action was a mouse PRESS (not a RELEASE)
              if pressed:
                  # Crop the eyes
                  eyes = scan()
                  # If the function returned None, something went wrong
                  if eyes is not None:
                      # Save the image
                      filename = os.path.join(root, "{} {} {}.jpeg".format(x, y, button))
                      cv2.imwrite(filename, eyes)

          cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
          video_capture = cv2.VideoCapture(0)

          with Listener(on_click=on_click) as listener:
              listener.join()

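          When you run this, every mouse click (as long as both eyes are in view) automatically crops the webcam frame and saves the image to the chosen directory. The filename of each image contains the mouse coordinates and whether it was a right or a left click.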
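
Because the label lives in the filename, it is worth checking that a name can be built and parsed back without loss. The sketch below mirrors the "x y button" pattern used above; `click_filename` and `parse_click_filename` are hypothetical helpers, and the button is assumed to render as text such as `Button.left`.

```python
import os

def click_filename(root, x, y, button):
    """Build the image path using the 'x y button.jpeg' pattern."""
    return os.path.join(root, "{} {} {}.jpeg".format(x, y, button))

def parse_click_filename(path):
    """Recover (x, y, button) from a path built by click_filename."""
    stem = os.path.splitext(os.path.basename(path))[0]
    x, y, button = stem.split(" ")
    return int(x), int(y), button
```

Only the final `.jpeg` extension is stripped, so the dot inside `Button.left` survives the round trip.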


          Here is a sample image. For this image, I left-clicked at coordinates (385, 686) on a monitor with a resolution of 2560x1440:

          The cascade classifier is very accurate; so far I have not seen a single error in my own data directory. Now let's write the code that trains a neural network to predict the position of the mouse from an image of your eyes.

          import numpy as np
          import os
          import cv2
          import pyautogui
          from tensorflow.keras.models import *
          from tensorflow.keras.layers import *
          from tensorflow.keras.optimizers import *

          Now let's add the cascade classifier:

          cascade = cv2.CascadeClassifier("haarcascade_eye.xml")
          video_capture = cv2.VideoCapture(0)

          Normalization:

          def normalize(x):
              minn, maxx = x.min(), x.max()
              return (x - minn) / (maxx - minn)

          Capturing the eyes:

          def scan(image_size=(32, 32)):
              _, frame = video_capture.read()
              gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
              boxes = cascade.detectMultiScale(gray, 1.3, 10)
              if len(boxes) == 2:
                  eyes = []
                  for box in boxes:
                      x, y, w, h = box
                      eye = frame[y:y + h, x:x + w]
                      eye = cv2.resize(eye, image_size)
                      eye = normalize(eye)
                      eye = eye[10:-10, 5:-5]
                      eyes.append(eye)
                  return (np.hstack(eyes) * 255).astype(np.uint8)
              else:
                  return None

          Let's define the dimensions of the monitor. You must change the following values to match the resolution of your own screen:

          # Note that there are actually 2560x1440 pixels on my screen
          # I am simply recording one less, so that when we divide by these
          # numbers, we will normalize between 0 and 1. Note that mouse
          # coordinates are reported starting at (0, 0), not (1, 1)
          width, height = 2559, 1439
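
The "one less" convention can be checked with a short round trip between pixel and unit coordinates. This is an illustrative sketch: `to_unit` and `to_pixels` are hypothetical helpers, and clamping out-of-range predictions to the screen is an added assumption.

```python
# Article's values: screen resolution minus one in each dimension.
WIDTH, HEIGHT = 2559, 1439

def to_unit(x, y, width=WIDTH, height=HEIGHT):
    """Map pixel coordinates starting at (0, 0) onto [0, 1] x [0, 1]."""
    return x / width, y / height

def to_pixels(u, v, width=WIDTH, height=HEIGHT):
    """Map unit coordinates back to pixels, clamping to the screen."""
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    return round(u * width), round(v * height)
```

With these values the bottom-right pixel (2559, 1439) maps exactly to (1.0, 1.0), which is what the sigmoid output layer of the model can represent.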

          Now let's load the data (again, assuming you have already defined root). We do not care whether the click was a right or a left click, because our goal is only to predict the position of the mouse:

          filepaths = os.listdir(root)
          X, Y = [], []
          for filepath in filepaths:
              # Filenames look like "x y button.jpeg"; the button is ignored
              x, y, _ = filepath.split(' ')
              x = float(x) / width
              y = float(y) / height
              X.append(cv2.imread(os.path.join(root, filepath)))
              Y.append([x, y])
          X = np.array(X) / 255.0
          Y = np.array(Y)
          print(X.shape, Y.shape)

          Let's define our model architecture:

          model = Sequential()
          model.add(Conv2D(32, 3, 2, activation='relu', input_shape=(12, 44, 3)))
          model.add(Conv2D(64, 2, 2, activation='relu'))
          model.add(Flatten())
          model.add(Dense(32, activation='relu'))
          model.add(Dense(2, activation='sigmoid'))
          model.compile(optimizer="adam", loss="mean_squared_error")
          model.summary()

          Here is our summary:
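
The layer shapes can also be worked out by hand, which explains where input_shape=(12, 44, 3) comes from. The sketch below assumes the Keras default of 'valid' padding for Conv2D and reads the positional arguments as (filters, kernel_size, strides); `conv_out` and `conv_params` are hypothetical helpers.

```python
def conv_out(size, kernel, stride):
    """Output length of a 'valid' convolution along one axis."""
    return (size - kernel) // stride + 1

def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel * kernel * in_ch * out_ch + out_ch

# One eye: resized to 32x32, then eye[10:-10, 5:-5] leaves 12 rows x 22 columns.
eye_h, eye_w = 32 - 10 - 10, 32 - 5 - 5
# Two eyes stacked side by side, 3 color channels.
h, w, c = eye_h, 2 * eye_w, 3

h1, w1 = conv_out(h, 3, 2), conv_out(w, 3, 2)    # Conv2D(32, 3, 2)
h2, w2 = conv_out(h1, 2, 2), conv_out(w1, 2, 2)  # Conv2D(64, 2, 2)
flat = h2 * w2 * 64                              # Flatten()

total_params = (conv_params(3, c, 32) + conv_params(2, 32, 64)
                + (flat * 32 + 32)               # Dense(32)
                + (32 * 2 + 2))                  # Dense(2)
print((h, w, c), (h1, w1, 32), (h2, w2, 64), flat, total_params)
```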

          The next task is to train the model. The idea is to add some noise to the image data, although the loop shown here fits on the images unchanged:

          epochs = 200
          for epoch in range(epochs):
              model.fit(X, Y, batch_size=32)
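
The text mentions adding noise to the images, but the loop does not show how. One possible reading is to draw fresh noise on every pass, which would also explain why fit is called once per epoch. The sketch below is an assumption, not the author's code: Gaussian noise, the `scale` value, and the names `add_noise` and `train_with_noise` are all illustrative.

```python
import numpy as np

def add_noise(images, scale=0.05, rng=None):
    """Return a noisy copy of images that are already scaled to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = images + rng.normal(0.0, scale, size=images.shape)
    # Keep pixel values in the same range the model sees at inference time.
    return np.clip(noisy, 0.0, 1.0)

def train_with_noise(model, X, Y, epochs=200, batch_size=32, scale=0.05):
    """Fit one epoch at a time, drawing new noise for every pass."""
    for _ in range(epochs):
        model.fit(add_noise(X, scale), Y, batch_size=batch_size)
```

Here `model` only needs a Keras-style `fit(X, Y, batch_size=...)` method, so the compiled model from the previous step can be passed in directly.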

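          Now let's use our model to move the mouse in real time. Note that this needs a lot of data to work well. As a proof of concept, however, you will notice that with only about 200 images it does move the mouse to the general area you are looking at. Unless you collect much more data, though, the pointer cannot be controlled precisely.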

          while True:
              eyes = scan()
              if eyes is not None:
                  eyes = np.expand_dims(eyes / 255.0, axis=0)
                  x, y = model.predict(eyes)[0]
                  pyautogui.moveTo(x * width, y * height)
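
With so little training data, consecutive predictions can differ a lot from frame to frame. One common way to damp that, offered here only as a sketch and not as part of the original project, is an exponential moving average over the predicted points. `PointSmoother` and its `alpha` value are hypothetical.

```python
class PointSmoother:
    """Exponential moving average over (x, y) predictions."""

    def __init__(self, alpha=0.2):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.state = None

    def update(self, x, y):
        """Blend the new point into the running average and return it."""
        if self.state is None:
            self.state = (float(x), float(y))
        else:
            sx, sy = self.state
            a = self.alpha
            self.state = (sx + a * (x - sx), sy + a * (y - sy))
        return self.state
```

In the loop, the predicted `x, y` would go through `smoother.update(x, y)` before being scaled and passed to `pyautogui.moveTo`. A smaller `alpha` gives a steadier but slower pointer.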

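          This is a proof-of-concept example. Note that we trained on very little data before making this screen recording. The video shows our mouse moving automatically, guided by the eyes, to the terminal application window. As I said, the movement is unsteady because there is so little data. With more data, it should hopefully become stable enough to allow finer control. With only a few hundred images, you can only move the pointer to the general region you are looking at. Also, if you never captured any images for a particular region of the screen (the edges, for example) during data collection, the model is unlikely to ever predict inside that region.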
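
The last point suggests checking how the collected clicks are spread over the screen before training. A minimal sketch, assuming the click coordinates have already been parsed from the filenames; `coverage_grid`, `empty_cells`, and the grid size are hypothetical choices.

```python
def coverage_grid(points, width, height, cols=4, rows=3):
    """Count clicks per cell of a rows x cols grid laid over the screen."""
    grid = [[0] * cols for _ in range(rows)]
    for x, y in points:
        # Clamp so points on or beyond the screen edge still land in a cell.
        col = min(max(int(x / width * cols), 0), cols - 1)
        row = min(max(int(y / height * rows), 0), rows - 1)
        grid[row][col] += 1
    return grid

def empty_cells(grid):
    """List the (row, col) cells that received no clicks."""
    return [(r, c) for r, row in enumerate(grid)
            for c, n in enumerate(row) if n == 0]
```

Any cell reported by `empty_cells` marks a part of the screen with no training samples, which is where more clicks would be most useful.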

