
          Good news for Honor of Kings (王者榮耀) fans: use Python to fetch hero skin wallpapers

          2020-10-16 08:55

          Produced by: Python數據之道 (ID: PyDataLab)

          Author: 葉庭云, reader submission

          Editor: Lemon

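          I. Introduction

          Most people have played, or at least heard of, the mobile game Honor of Kings. Its heroes come with a wide variety of beautifully made skins, and some of them make great desktop wallpapers. This article shows how to use a Python crawler to download the hero skin wallpapers in one go.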

          1. Goal

          Create a folder that contains one subfolder per hero, named after the hero, holding all of that hero's skin images.

          URL: https://pvp.qq.com/web201605/herolist.shtml
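The folder layout described in the goal can be sketched on its own, before any scraping happens. This is a minimal sketch: the helper name `make_hero_dirs` and the placeholder hero names are made up for illustration and do not come from the article.

```python
import os
import tempfile


def make_hero_dirs(root, hero_names):
    """Create root/<hero name>/ for every hero and return the created paths."""
    paths = []
    for name in hero_names:
        path = os.path.join(root, name)
        # exist_ok=True makes the call safe to repeat on a second run
        os.makedirs(path, exist_ok=True)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Work in a temporary directory so the demo leaves nothing behind
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "skins")
        for created in make_hero_dirs(root, ["hero_a", "hero_b"]):
            print(created)
```

Using `os.makedirs(..., exist_ok=True)` rather than `os.mkdir` means an interrupted run can simply be restarted without failing on folders that already exist.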

          2. Environment

          Runtime environment: PyCharm, Python 3.7

          Required libraries

          import requests
          import os
          import json
          from lxml import etree
          from fake_useragent import UserAgent
          import logging

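          II. Analyzing the Web Page

          First open the official Honor of Kings website and click through to the hero information page.

          On the new page, pick any hero and inspect the page.

          Inspect the pages of a few more heroes and the URL pattern of the hero pages becomes clear: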

          https://pvp.qq.com/web201605/herodetail/152.shtml
          https://pvp.qq.com/web201605/herodetail/150.shtml
          https://pvp.qq.com/web201605/herodetail/167.shtml

          Only the trailing number changes; it can be treated as the page identifier of that hero.
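The pattern can be captured in a small helper. This is a sketch; the names `HERO_DETAIL_URL` and `hero_detail_url` are illustrative, while the URL template itself is the one listed above.

```python
HERO_DETAIL_URL = "https://pvp.qq.com/web201605/herodetail/{}.shtml"


def hero_detail_url(hero_id):
    """Build the detail page URL for a hero from its numeric page identifier."""
    return HERO_DETAIL_URL.format(hero_id)


if __name__ == "__main__":
    # The three identifiers shown in the list above
    for hero_id in (152, 150, 167):
        print(hero_detail_url(hero_id))
```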

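          Open the Network panel and press Ctrl + R to refresh; a herolist.json file shows up in the request list.

          Its preview appears garbled, but that is not a real problem: double-click the json file to download it, then open it in an editor to inspect its contents.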
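Garbled text of this kind is usually a sign that bytes were decoded with the wrong codec, not that the data is damaged. The article does not say which encoding herolist.json uses, so the sketch below only illustrates the general effect with a made-up two-character sample; `gbk` is chosen because the scraper code later in the article decodes the hero pages with it.

```python
# A made-up sample string, not taken from the real file
sample_text = "王者"

# The bytes as a GBK-encoded page would deliver them
raw = sample_text.encode("gbk")

# Decoding with the wrong codec does not fail, it just yields mojibake
garbled = raw.decode("latin-1")

# Decoding with the codec the bytes were written in restores the text
restored = raw.decode("gbk")

if __name__ == "__main__":
    print(repr(garbled))
    print(restored)
```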

          ename is the identifier used in the hero page URL, cname is the hero's name, and skin_name holds the names of that hero's skins.
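As a rough sketch of how these three fields might be read, the snippet below parses a tiny hand-written sample. The sample entries are hypothetical, and two details are assumptions that the article does not confirm: that skin_name separates skin names with "|", and that some entries may lack the field.

```python
import json

# Hypothetical sample in the shape described above; not real herolist.json data
SAMPLE = """[
  {"ename": 1, "cname": "HeroA", "skin_name": "Skin One|Skin Two"},
  {"ename": 2, "cname": "HeroB"}
]"""


def index_heroes(raw_json):
    """Map each ename to the hero's name and its list of skin names."""
    heroes = {}
    for entry in json.loads(raw_json):
        # Assumption: skins are "|"-separated; a missing field means no skins listed
        skins = entry.get("skin_name", "")
        heroes[entry["ename"]] = {
            "name": entry["cname"],
            "skins": [s for s in skins.split("|") if s],
        }
    return heroes


if __name__ == "__main__":
    for ename, info in index_heroes(SAMPLE).items():
        print(ename, info["name"], info["skins"])
```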

          Open any hero page, go through all of the hero's skins, and watch how the image URL changes.

          The URL changes as follows:
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-1.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-2.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-3.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-4.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-5.jpg

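          Paste an image link into the browser to see the high-resolution image.

          For a single hero, the {x} in the trailing -{x}.jpg of the skin image URL starts at 1 and increases by one per skin. Now compare the skin image URLs of different heroes: a different ename, the hero identifier, yields a different hero's images, so the hero is selected by the ename parameter.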

          https://game.gtimg.cn/images/yxzj/img201606/heroimg/152/152-bigskin-1.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/150/150-bigskin-1.jpg
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/153/153-bigskin-1.jpg
          # The image request URL can therefore be constructed as follows
          https://game.gtimg.cn/images/yxzj/img201606/heroimg/{ename}/{ename}-bigskin-{x}.jpg
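The template can be expanded mechanically once a hero's ename and skin count are known. This is a sketch with illustrative names (`SKIN_URL`, `skin_urls`). The template is kept as a parameter because the scraper code further down requests a different path segment (skin/hero-info), so whichever path the site actually serves can be passed in.

```python
# Template as quoted in the analysis above
SKIN_URL = (
    "https://game.gtimg.cn/images/yxzj/img201606/heroimg/"
    "{ename}/{ename}-bigskin-{x}.jpg"
)


def skin_urls(ename, skin_count, template=SKIN_URL):
    """Return the image URLs for skins 1..skin_count of one hero."""
    return [template.format(ename=ename, x=x) for x in range(1, skin_count + 1)]


if __name__ == "__main__":
    for url in skin_urls(152, 5):
        print(url)
```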

          III. Crawler Implementation

          # -*- coding: UTF-8 -*-
          """
          @File    :王者榮耀英雄皮膚壁紙.py
          @Author  :葉庭云
          @Date    :2020/10/2 11:40
          @CSDN    :https://blog.csdn.net/fyfugoyfa
          """

          import requests
          import os
          import json
          from lxml import etree
          from fake_useragent import UserAgent
          import logging

          # Basic logging configuration
          logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s: %(message)s')


          class glory_of_king(object):
              def __init__(self):
                  # Create the root folder; exist_ok avoids an error if it is already there
                  os.makedirs("./王者榮耀皮膚", exist_ok=True)
                  # Use fake_useragent to pick a random User-Agent, to reduce the chance of being blocked
                  ua = UserAgent(verify_ssl=False, path='fake_useragent.json')
                  self.headers = {
                      'User-Agent': ua.random
                  }

              def scrape_skin(self):
                  # Send the request and get the response
                  response = requests.get('https://pvp.qq.com/web201605/js/herolist.json', headers=self.headers)
                  # Parse the response text (str) as JSON
                  data = json.loads(response.text)
                  # Loop over data, read the needed fields and create a folder per hero
                  for i in data:
                      hero_number = i['ename']    # hero ID number
                      hero_name = i['cname']      # hero name
                      # Folder named after the hero; safe to re-run thanks to exist_ok
                      os.makedirs("./王者榮耀皮膚/{}".format(hero_name), exist_ok=True)
                      response_src = requests.get("https://pvp.qq.com/web201605/herodetail/{}.shtml".format(hero_number),
                                                  headers=self.headers)
                      hero_content = response_src.content.decode('gbk')  # decode the returned HTML page as gbk
                      # Build an XPath-capable tree and extract this hero's skin names
                      hero_data = etree.HTML(hero_content)
                      hero_img = hero_data.xpath('//div[@class="pic-pf"]/ul/@data-imgname')
                      # Split the skin names on the "|" separator
                      hero_src = hero_img[0].split('|')
                      logging.info(hero_src)
                      # Walk through the entries and clean up each image name
                      for j in range(len(hero_src)):
                          # Keep only the part before "&"; split() also copes with names that have no "&"
                          skin_name = hero_src[j].split("&")[0]
                          # Request the image
                          response_skin = requests.get(
                              "https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/{}/{}-bigskin-{}.jpg".format(
                                  hero_number, hero_number, j + 1))
                          # Binary content of the image
                          skin_img = response_skin.content
                          # Save the skin image to a file named after the skin
                          with open("./王者榮耀皮膚/{}/{}.jpg".format(hero_name, skin_name), "wb") as f:
                              f.write(skin_img)
                          logging.info(f"{skin_name}.jpg downloaded successfully!")

              def run(self):
                  self.scrape_skin()

          if __name__ == '__main__':
              spider = glory_of_king()
              spider.run()
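The name-cleaning step inside scrape_skin can be isolated and tried without any network access. In this sketch the helper name `parse_skin_names` and the sample attribute value are made up; only the "|" and "&" separators come from the code above. Splitting on "&" is used because slicing with `str.find` drops the last character whenever "&" is missing, since `find` then returns -1.

```python
def parse_skin_names(data_imgname):
    """Turn a data-imgname value such as 'A&0|B&1' into ['A', 'B']."""
    names = []
    for part in data_imgname.split("|"):
        # Everything before the first "&" is the skin name
        name = part.split("&")[0].strip()
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical attribute value, shaped like the one the XPath query returns
    print(parse_skin_names("Skin One&0|Skin Two&1|NoMarker"))

    # The pitfall that splitting avoids: find() returns -1 when "&" is absent
    text = "NoMarker"
    print(text[:text.find("&")])
```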

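          While it runs, the script logs each skin image as it is downloaded.

          After the program has run for a while, all the hero skin wallpapers are saved in the local folder, one subfolder per hero.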
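The download loop writes whatever bytes come back, so a failed request would still produce a .jpg file. One possible guard is sketched below. It assumes a hypothetical `fetch(url)` callable that returns a `(status_code, body_bytes)` tuple, so that the logic can be exercised without network access; `save_skins` is likewise an illustrative name, and the URL template is the one used in the scraper code.

```python
import os
import tempfile

SKIN_TEMPLATE = "https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/{}/{}-bigskin-{}.jpg"


def save_skins(ename, skin_names, fetch, out_dir):
    """Save one image per skin name; skip responses that are not HTTP 200 or are empty."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, name in enumerate(skin_names, start=1):
        status, body = fetch(SKIN_TEMPLATE.format(ename, ename, index))
        if status != 200 or not body:
            continue  # do not write an error page to disk as a .jpg
        path = os.path.join(out_dir, "{}.jpg".format(name))
        with open(path, "wb") as f:
            f.write(body)
        saved.append(path)
    return saved


if __name__ == "__main__":
    def demo_fetch(url):
        # Stand-in for a real HTTP call: pretend the second image is missing
        return (404, b"") if url.endswith("-2.jpg") else (200, b"fake image bytes")

    with tempfile.TemporaryDirectory() as tmp:
        print(save_skins(152, ["first", "second", "third"], demo_fetch, tmp))
```

With requests, `fetch` could be a small function that calls `requests.get(url, timeout=10)` and returns `(response.status_code, response.content)`.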

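          IV. Other Notes

          • Avoid scraping too much data, since it can easily put load on the server; a small taste is enough.
          • This crawler helps you understand how to parse JSON data and extract the fields you need, and how to build request URLs through string concatenation.
          • This article uses a Python crawler to download the Honor of Kings hero skin wallpapers in one go. You will run into some problems along the way; thinking them through and debugging until they are solved also deepens your understanding.
          • The code can be copied and run directly. If you find it useful, remember to give it a like, which is the greatest encouragement for the author; feel free to point out any shortcomings in the comments.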
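One way to act on the first note is to space requests out. The `Throttle` class below is a hypothetical helper, not part of the article's code: it enforces a minimum interval between calls, and it accepts the clock and sleep functions as parameters so the behaviour can be checked without real waiting.

```python
import time


class Throttle:
    """Ensure at least min_interval seconds pass between successive wait() calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(0.05)  # short interval so the demo finishes quickly
    for n in range(3):
        throttle.wait()
        print("request", n + 1, "may go out now")
```

Calling `throttle.wait()` right before each `requests.get` in the download loop would cap the request rate at one per interval.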

          Fixing the error: fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached

          # The error is as follows
          Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.11
          Traceback (most recent call last):
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 1318, in do_open
              encode_chunked=req.has_header('Transfer-encoding'))
            File "/usr/local/python3/lib/python3.6/http/client.py", line 1239, in request
              self._send_request(method, url, body, headers, encode_chunked)
            File "/usr/local/python3/lib/python3.6/http/client.py", line 1285, in _send_request
              self.endheaders(body, encode_chunked=encode_chunked)
            File "/usr/local/python3/lib/python3.6/http/client.py", line 1234, in endheaders
              self._send_output(message_body, encode_chunked=encode_chunked)
            File "/usr/local/python3/lib/python3.6/http/client.py", line 1026, in _send_output
              self.send(msg)
            File "/usr/local/python3/lib/python3.6/http/client.py", line 964, in send
              self.connect()
            File "/usr/local/python3/lib/python3.6/http/client.py", line 1392, in connect
              super().connect()
            File "/usr/local/python3/lib/python3.6/http/client.py", line 936, in connect
              (self.host,self.port), self.timeout, self.source_address)
            File "/usr/local/python3/lib/python3.6/socket.py", line 724, in create_connection
              raise err
            File "/usr/local/python3/lib/python3.6/socket.py", line 713, in create_connection
              sock.connect(sa)
          socket.timeout: timed out

          During handling of the above exception, another exception occurred:

          Traceback (most recent call last):
            File "/usr/local/python3/lib/python3.6/site-packages/fake_useragent/utils.py", line 67, in get
              context=context,
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 223, in urlopen
              return opener.open(url, data, timeout)
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 526, in open
              response = self._open(req, data)
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 544, in _open
              '_open', req)
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 504, in _call_chain
              result = func(*args)
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 1361, in https_open
              context=self._context, check_hostname=self._check_hostname)
            File "/usr/local/python3/lib/python3.6/urllib/request.py", line 1320, in do_open
              raise URLError(err)
          urllib.error.URLError:

          During handling of the above exception, another exception occurred:

          Traceback (most recent call last):
            File "/usr/local/python3/lib/python3.6/site-packages/fake_useragent/utils.py", line 154, in load
              for item in get_browsers(verify_ssl=verify_ssl):
            File "/usr/local/python3/lib/python3.6/site-packages/fake_useragent/utils.py", line 97, in get_browsers
              html = get(settings.BROWSERS_STATS_PAGE, verify_ssl=verify_ssl)
            File "/usr/local/python3/lib/python3.6/site-packages/fake_useragent/utils.py", line 84, in get
              raise FakeUserAgentError('Maximum amount of retries reached')
          fake_useragent.errors.FakeUserAgentError: Maximum amount of retries reached

          The fix is as follows:

          # Copy the content at https://fake-useragent.herokuapp.com/browsers/0.1.11 and save it as a local json file: fake_useragent.json
          # Then load it from that file
          from fake_useragent import UserAgent

          ua = UserAgent(verify_ssl=False, path='fake_useragent.json')
          print(ua.random)

          The output looks like this:

          Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1500.55 Safari/537.36
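If the local json file is also unavailable, the scraper could fall back to a fixed User-Agent string instead of crashing. The helper below is a hypothetical sketch and not part of fake_useragent: `provider` is any zero-argument callable that returns a User-Agent string, and the fallback list reuses the string printed above.

```python
import random

# Fallback pool; the single entry is the User-Agent string shown in the output above
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/27.0.1500.55 Safari/537.36",
]


def pick_user_agent(provider=None, fallbacks=FALLBACK_USER_AGENTS):
    """Return provider() if it yields a value, otherwise a random fallback string."""
    if provider is not None:
        try:
            value = provider()
            if value:
                return value
        except Exception:
            # Any failure in the provider (network error, missing file) lands here
            pass
    return random.choice(fallbacks)


if __name__ == "__main__":
    def broken_provider():
        raise RuntimeError("simulated lookup failure")

    print(pick_user_agent(broken_provider))
```

In the scraper, the provider could be `lambda: UserAgent(verify_ssl=False, path='fake_useragent.json').random`, so the constructor call shown above stays inside the guarded section.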