
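Python offers many crawler-related modules, concise code, and fast development, which makes it a good choice for writing web crawlers. Downloading images one at a time by hand takes a lot of time; a Python crawler solves this by downloading them automatically. This article walks through a hands-on exercise: using a Python crawler to download images automatically.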

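I. Workflow for downloading images automatically

1. Work out the pattern of the page URLs so that the pages can be visited by URL;

2. Following that pattern, fetch the pages in a loop and return their contents;

3. Use regular expressions to extract and return the images.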
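
Steps 1 and 2 depend on the page URLs following a predictable pattern. As a minimal sketch, assuming a search endpoint that takes a keyword plus a numeric offset, the list of page URLs could be generated as shown here. The helper name `build_page_urls`, the `example.com` base URL, and the paging parameter name `pn` are all hypothetical and should be checked against the real URLs seen in the browser.

```python
from urllib.parse import urlencode


def build_page_urls(base, word, pages=4, page_size=10, offset_param="pn"):
    """Return one URL per result page for the keyword `word`.

    `offset_param` is an assumed name for the paging parameter; inspect the
    real URLs in the browser to confirm what the site actually uses.
    """
    urls = []
    for page in range(pages):
        # urlencode percent-encodes non-ASCII keywords as UTF-8
        query = urlencode({"word": word, offset_param: page * page_size})
        urls.append(base + "?" + query)
    return urls


if __name__ == "__main__":
    for u in build_page_urls("https://example.com/search", "cat"):
        print(u)
```

Building the query with `urlencode` also takes care of encoding Chinese keywords, so no separate escaping step is needed for URLs produced this way.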

II. Steps for downloading images automatically with a Python crawler

1. Import the required packages

import os
import re
import string
import urllib.error
import urllib.request
from urllib.parse import quote

import requests

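2. Define the page request function

How to get the cookie: visit the page normally, right-click and choose Inspect (or press F12), and look up your cookie under the Network tab. Because the cookie is long and differs for each user, it is omitted from the code; readers can copy the cookie from their own browser and add it to the code.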
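
Since the cookie is personal and long, one option is to keep it out of the source file altogether. The following is only a sketch, assuming the cookie string copied from the browser has been stored in an environment variable; the variable name `CRAWLER_COOKIE` and the helper `build_headers` are made up for this illustration.

```python
import os


def build_headers(cookie=None):
    """Build request headers, reading the cookie from the environment if not given."""
    if cookie is None:
        # CRAWLER_COOKIE is a hypothetical variable name chosen for this sketch
        cookie = os.environ.get("CRAWLER_COOKIE", "")
    headers = {
        "Accept-Language": "zh-CN,zh;q=0.9",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    }
    # Only send a Cookie header when a cookie is actually available
    if cookie:
        headers["Cookie"] = cookie
    return headers
```

The returned dictionary can be passed as the `headers` argument of `urllib.request.Request` or `requests.get`.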

def askURL(url):
    # Request headers that make the request look like it comes from a browser
    head = {
        "Accept": "image/webp,image/apng,image/*,*/*;q=0.8",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Connection": "keep-alive",
        "Cookie": " ",  # paste your own cookie here
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 "
                      "(KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    }
    # Percent-encode Chinese characters as UTF-8, otherwise an ASCII error is raised
    s = quote(url, safe=string.printable)
    print(s)
    request = urllib.request.Request(s, headers=head)
    html = ""
    try:
        response = urllib.request.urlopen(request)
        html = response.read().decode("utf-8")
        print(html)
    except urllib.error.URLError as e:
        # Report the HTTP status code and/or the failure reason when available
        if hasattr(e, "code"):
            print(e.code)
        if hasattr(e, "reason"):
            print(e.reason)
    return html

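3. Extract and return the images

The returned HTML shows that the page contains four types of image URL: objURL, middleURL, hoverURL, and thumbURL. Regular expressions are therefore used to extract all four types of link, and the results are merged.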
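
To see the extraction in isolation, here is a small sketch that runs the same kind of pattern over a hand-written sample string instead of a live page. The helper `extract_image_urls`, the `sample` text, and the `example.com` links are invented for illustration, and the duplicate removal is an addition that the article's own code does not perform.

```python
import re

# The four field names mentioned in the text, in the order the article merges them
FIELDS = ("middleURL", "objURL", "thumbURL", "hoverURL")


def extract_image_urls(html):
    """Collect the values of the four URL fields, dropping empties and duplicates."""
    found = []
    for field in FIELDS:
        # Non-greedy match of everything up to the closing quote of the value
        pattern = '"' + field + '":"(.*?)"'
        found.extend(re.findall(pattern, html, re.S))
    # dict.fromkeys keeps first-seen order while removing duplicates
    return [u for u in dict.fromkeys(found) if u]


sample = (
    '{"thumbURL":"https://example.com/t.jpg","middleURL":"https://example.com/m.jpg",'
    '"hoverURL":"https://example.com/t.jpg","objURL":"https://example.com/o.jpg"}'
)
print(extract_image_urls(sample))
```

Removing duplicates here avoids downloading the same file twice when two fields point to the same image, as `thumbURL` and `hoverURL` do in the sample.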

i = 1  # running counter used to number the saved files


def savePic(url):
    global i
    html = askURL(url)
    # re.S lets "." match newline characters as well
    pic_url = re.findall('"objURL":"(.*?)",', html, re.S)
    pic_url2 = re.findall('"middleURL":"(.*?)",', html, re.S)
    pic_url3 = re.findall('"hoverURL":"(.*?)",', html, re.S)
    pic_url4 = re.findall('"thumbURL":"(.*?)",', html, re.S)
    result = pic_url2 + pic_url + pic_url4 + pic_url3

    for item in result:
        # Skip any image that cannot be fetched
        try:
            pic = requests.get(item, timeout=5)
        except Exception:
            print("This image could not be downloaded")
            continue

        # Save the image; `word` is the keyword read in the main program,
        # and the folder is the one created there
        path = 'D:/MyData/Python爬蟲/圖片下載器/' + word + "/" + str(i) + ".jpg"
        with open(path, 'wb') as fp:
            fp.write(pic.content)
        print("Downloaded " + str(i) + " image(s)")
        i += 1

4. Define the main program

if __name__ == '__main__':  # main program
    word = input("Enter the keyword for the images to download: ")

    # Create the folder for this keyword if it does not exist yet
    road = "D:/MyData/Python爬蟲/圖片下載器/" + word
    if not os.path.exists(road):
        os.makedirs(road)

    # Build the URL list from the keyword; only four pages are visited here to verify the result.
    # NOTE: the paging parameter was lost in the original snippet; "&pn=" is an assumption.
    urls = [
        'https://image.baidu.com/search/index?tn=baiduimage&ps=1&ct=201326592&lm=-1&cl=2&nc=1&ie=utf-8&word='
        + word + '&pn={}'.format(str(n)) for n in range(0, 40, 10)]

    for url in urls:
        print(url)
        savePic(url)

    print("Download finished!")

That is the whole process of using a Python crawler to download images automatically. Give it a try!
