
          Can't stop scrolling short videos? An intermediate Python scraping tutorial on short-video sites


          2021-10-03 15:55

          Short videos are everywhere these days: people scroll through them at meals, on breaks, and in bed. Today's intermediate Python scraping topic is decoding the obfuscated video addresses on Meipai.

          Reverse-engineering the short-video JS


          Scraping target

          Target URL:

          Tools

          Development environment: Windows 10, Python 3.7
          Development tools: PyCharm, Chrome
          Packages: requests, lxml (for XPath), base64

          Key learning points

          How the scraped data is parsed
          JS debugging techniques
          Reverse-engineering the JS decoding code
          Converting the JS code to Python

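          Project approach

          Open the site's home page and pick a category you are interested in.
          From the home page, collect the hyperlinks that lead to the detail pages.
          On a detail page, find the obfuscated video playback address. It is part of the static page data and is decoded by JS code.
          Find the decoding code: first locate the video's playback address, then locate the JS file that decodes it. The file is triggered when you click play.
          The data looks roughly like a base64-encoded string.
          Search the JS file for keywords to find how the address is obfuscated.
          Notes on some of the JS functions involved: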
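As a rough illustration of the "find the obfuscated playback address in the static page data" step, the sketch below pulls a `data-video` attribute out of a detail-page snippet. The `detailVideo` id and the `data-video` attribute are the ones used in the full script in this article; the HTML snippet, the placeholder value, and the `VideoAttrParser` class are made up for the example. The standard library's `html.parser` is swapped in for lxml here so the block runs without third-party packages.

```python
from html.parser import HTMLParser


class VideoAttrParser(HTMLParser):
    """Collect the data-video attribute of <div id="detailVideo"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.data_video = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "div" and attr_map.get("id") == "detailVideo":
            self.data_video = attr_map.get("data-video")


# Made-up snippet; a real detail page is much larger and may be laid out differently.
sample_html = (
    '<div id="detailVideo" data-video="ENCODED_STRING_HERE">'
    '<img alt="demo title"></div>'
)

parser = VideoAttrParser()
parser.feed(sample_html)
print(parser.data_video)
```

The value printed here is what would then be handed to the decoding function.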

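              # replace(): replaces part of a string with other characters (with a string pattern, only the first match)
              # parseInt: converts a value to an integer, here with radix 16
              # base64.atob: decodes a base64-encoded string
              # substring(start, end): extracts the characters between two indices; substr(start, length) extracts a given number of characters starting at start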
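The Python counterparts of these JS functions can be sketched as follows. The sample strings are arbitrary and only there to show the behaviour. One caveat worth keeping in mind: `int()` is stricter than `parseInt` (it raises on trailing non-digit characters instead of ignoring them), so the mapping is close but not exact.

```python
import base64

text = "abcabc"

# JS: "abcabc".replace("b", "") removes only the FIRST match,
# so the Python call needs count=1 to behave the same way.
first_only = text.replace("b", "", 1)
all_matches = text.replace("b", "")      # Python default: every match

# JS: parseInt("121e", 16)
number = int("121e", 16)

# JS: atob("aGVsbG8=")
decoded = base64.b64decode("aGVsbG8=").decode("utf-8")

# JS: "abcdef".substring(1, 3)  -> slice by two indices
by_indices = "abcdef"[1:3]
# JS: "abcdef".substr(1, 3)     -> slice by start and length
by_length = "abcdef"[1:1 + 3]

print(first_only, all_matches, number, decoded, by_indices, by_length)
```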


          Converting the JS code to Python

          import base64

          def decode(data):
              def getHex(a):
                  # The first 4 characters, reversed, are a hex key; the rest is the payload.
                  return {
                      'str': a[4:],
                      'hex': ''.join(list(a[:4])[::-1]),
                  }

              def getDec(a):
                  # Hex key -> decimal digits: first two describe the front cut, the rest the tail cut.
                  b = str(int(a, 16))
                  return {
                      'pre': list(b[:2]),
                      'tail': list(b[2:]),
                  }

              def substr(a, b):
                  # Remove b[1] junk characters starting at index b[0].
                  c = a[0: int(b[0])]
                  d = a[int(b[0]): int(b[0]) + int(b[1])]
                  # JS replace() with a string pattern removes only the first match, hence count=1.
                  return c + a[int(b[0]):].replace(d, "", 1)

              def getPos(a, b):
                  # The tail cut is measured from the end of the string.
                  b[0] = len(a) - int(b[0]) - int(b[1])
                  return b

              b = getHex(data)
              c = getDec(b['hex'])
              d = substr(b['str'], c['pre'])
              return base64.b64decode(substr(d, getPos(d, c['tail'])))

          print(decode("e121Ly9tBrI84RdnZpZGVvMTAubWVpdHVkYXRhLmNvbS82MGJjZDcwNTE3NGZieXBueG5udnRwMTA5N19IMjY0XzFfNWY3YThmM2U0MTEwNy5tc2JVjAu3EDQ="))
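To make the steps easier to follow, here is the same decoding written out flat on the sample string above, with the intermediate values named. This is only a walk-through sketch: the variable names are invented for the example, it assumes the key has exactly four decimal digits (as it does for this sample), and it removes the junk with slicing instead of `replace`. Slicing gives the same result as a first-match replace here because the junk always sits at the very start of the substring being searched.

```python
import base64

data = ("e121Ly9tBrI84RdnZpZGVvMTAubWVpdHVkYXRhLmNvbS82MGJjZDcwNTE3NGZieXBueG5u"
        "dnRwMTA5N19IMjY0XzFfNWY3YThmM2U0MTEwNy5tc2JVjAu3EDQ=")

# Step 1: split off the 4-character prefix and reverse it to get the hex key.
prefix, body = data[:4], data[4:]
key = str(int(prefix[::-1], 16))          # "e121" -> "121e" -> 4638 -> "4638"

# Step 2: read the four digits (assumes len(key) == 4).
pre_start, pre_len = int(key[0]), int(key[1])
tail_back, tail_len = int(key[2]), int(key[3])

# Step 3: cut pre_len junk characters starting at index pre_start.
body = body[:pre_start] + body[pre_start + pre_len:]

# Step 4: cut tail_len junk characters; the position is measured from the end.
tail_start = len(body) - tail_back - tail_len
body = body[:tail_start] + body[tail_start + tail_len:]

# Step 5: what is left is plain base64.
url = base64.b64decode(body).decode("utf-8")
print(key, url)
```

The decoded value is a protocol-relative address (it starts with `//`), which is why the full script prepends a scheme before downloading.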



          This yields the final video playback address.

          Simple source code

          import requests
          from lxml import etree
          import base64

          def decode_mp4(data):
              def getHex(a):
                  # The first 4 characters, reversed, are a hex key; the rest is the payload.
                  return {
                      'str': a[4:],
                      'hex': ''.join(list(a[:4])[::-1]),
                  }

              def getDec(a):
                  b = str(int(a, 16))
                  return {
                      'pre': list(b[:2]),
                      'tail': list(b[2:]),
                  }

              def substr(a, b):
                  c = a[0: int(b[0])]
                  d = a[int(b[0]): int(b[0]) + int(b[1])]
                  # JS replace() with a string pattern removes only the first match, hence count=1.
                  return c + a[int(b[0]):].replace(d, "", 1)

              def getPos(a, b):
                  b[0] = len(a) - int(b[0]) - int(b[1])
                  return b

              b = getHex(data)
              c = getDec(b['hex'])
              d = substr(b['str'], c['pre'])
              return base64.b64decode(substr(d, getPos(d, c['tail'])))

          # Entry point
          def main():
              url = 'https://www.meipai.com'
              headers = {
                  'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36',
              }
              response = requests.get(url=url, headers=headers)
              html_data = etree.HTML(response.text)
              href_list = html_data.xpath('//div/a/@href')
              # print(href_list)
              for href in href_list:
                  res = requests.get('https://www.meipai.com' + href, headers=headers)
                  html = etree.HTML(res.text)
                  names = html.xpath('//div[@id="detailVideo"]/img/@alt')
                  mp4_list = html.xpath('//div[@id="detailVideo"]/@data-video')
                  if not names or not mp4_list:
                      continue  # not a video detail page, skip it
                  name, mp4_data = names[0], mp4_list[0]
                  # print(name, mp4_data)
                  mp4_url = decode_mp4(mp4_data).decode('utf-8')
                  print(mp4_url)
                  result = requests.get("http:" + mp4_url)
                  # The with block closes the file on exit, so no explicit close() is needed.
                  with open(name + ".mp4", 'wb') as f:
                      f.write(result.content)


          if __name__ == '__main__':
              main()
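Two spots in the script are fragile: building URLs by string concatenation, and using the video title directly as a file name (titles can contain characters that Windows does not allow in file names). The sketch below shows one possible way to handle both with the standard library. `absolute_url` and `safe_filename` are hypothetical helpers, not part of the original script, and `/media/123` and `demo.mp4` are made-up paths used only for the demonstration.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://www.meipai.com"


def absolute_url(link, base=BASE_URL):
    """Resolve a relative or protocol-relative link against the site root."""
    return urljoin(base, link)


def safe_filename(title, default="video"):
    """Replace characters that are invalid in Windows file names with '_'."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', "_", title).strip(" .")
    return (cleaned or default) + ".mp4"


print(absolute_url("/media/123"))                          # hypothetical detail-page path
print(absolute_url("//mvvideo10.meitudata.com/demo.mp4"))  # protocol-relative address
print(safe_filename("a/b:c"))
```

`urljoin` leaves links that are already absolute unchanged, so the same helper can be applied to every href collected from the home page.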

          Feel free to discuss the technical details in the comments, and remember to like, favorite, and share. Wishing everyone all the best!

