
By 某某白米飯

Source: Python教授 (ID: justpython)

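While searching online for some easy-on-the-eyes photos of girls, I came across the site 半次元 (bcy.net). It has sections for cos, jk, drawings, and more. I liked it, so I wrote a crawler to download the pictures.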

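Getting the list data

Type jk into the search box to reach the jk page.

Pick the circle with the most followers and click into it.

To find the URL, open the F12 developer panel and look for a request to https://bcy.net/apiv3/common/circleFeed?circle_id=492&since=0&sort_type=2&grid_type=10. Its response contains the uid, nickname, avatar, and similar fields, plus the item_id needed to open each detail page. This link is the address for fetching the latest jk list.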
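As a minimal sketch of how that address is put together, the query string can be built from a dictionary instead of by hand. The helper name build_feed_url is hypothetical, and the parameter values are simply copied from the URL above and from the code in this article; what since, sort_type, and grid_type actually mean is not documented here, so treat them as assumptions.

```python
import time
from urllib.parse import urlencode

FEED_ENDPOINT = "https://bcy.net/apiv3/common/circleFeed"

def build_feed_url(circle_id=492, since=None):
    """Build the feed URL (hypothetical helper, not part of the article's code).

    `since` defaults to the current Unix time, formatted with a
    `.000000` suffix the same way the article's get_list() does.
    """
    if since is None:
        since = int(time.time())
    params = {
        "circle_id": circle_id,       # 492 is the jk circle used in this article
        "since": f"{since}.000000",
        "sort_type": 2,               # value copied from the observed request
        "grid_type": 10,              # value copied from the observed request
    }
    return f"{FEED_ENDPOINT}?{urlencode(params)}"

print(build_feed_url(since=0))
# https://bcy.net/apiv3/common/circleFeed?circle_id=492&since=0.000000&sort_type=2&grid_type=10
```

Keeping the parameters in a dictionary makes it easier to try another circle_id without editing a long string.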

import requests
import time

# Send a browser-like User-Agent with every request
header = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36'
}

def get_list():
    """Fetch the latest jk feed; return the raw JSON text, or None on failure."""
    try:
        url = 'https://bcy.net/apiv3/common/circleFeed?circle_id=492&since=' + str(int(time.time())) + '.000000&sort_type=2&grid_type=10'
        response = requests.get(url, headers=header)
        response.raise_for_status()
        # Decode the body as UTF-8
        response.encoding = 'utf-8'
        print(response.text)
        return response.text
    except requests.RequestException:
        print("Failed!")
        return None

Extracting the item_id values from the response is easy, since it is just a JSON string.

import json

def parse_list(data):
    """Collect the item_id of every post in the feed response."""
    item_ids = []
    json_data = json.loads(data)
    for item in json_data['data']['items']:
        item_ids.append(item['item_detail']['item_id'])
    return item_ids
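To see the parsing step without hitting the network, here is a small sketch that runs the same lookup on a hand-written payload. The sample data is made up and only mirrors the data -> items -> item_detail -> item_id nesting used above. The name parse_list_safe is hypothetical: it is a more tolerant variant that skips entries missing a field instead of raising KeyError.

```python
import json

def parse_list_safe(data):
    """Return every item_id found, skipping malformed entries (hypothetical variant)."""
    payload = json.loads(data)
    item_ids = []
    for item in payload.get("data", {}).get("items", []):
        item_id = item.get("item_detail", {}).get("item_id")
        if item_id is not None:
            item_ids.append(item_id)
    return item_ids

# Made-up payload that only imitates the structure the article relies on
sample = json.dumps({
    "data": {
        "items": [
            {"item_detail": {"item_id": "1001", "uid": 1}},
            {"item_detail": {"item_id": "1002", "uid": 2}},
            {"item_detail": {}},  # malformed entry, skipped
        ]
    }
})

print(parse_list_safe(sample))  # ['1001', '1002']
```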

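Getting the jk pictures

With the item_id values from above, substitute each one into https://bcy.net/item/detail/{item_id}?_source_page=hashtag. In F12 the response is not a JSON string but an HTML page. Searching it shows that the jk picture data sits inside the page's JavaScript.

Cut the data out by slicing the string, then download the jk pictures to look through at leisure.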

import re

def get_item(item_ids):
    """Download each detail page and cut out the data embedded in its JavaScript."""
    intercepts = []
    marker = 'JSON.parse("'
    for item_id in item_ids:
        url = 'https://bcy.net/item/detail/' + str(item_id) + '?_source_page=hashtag'
        response = requests.get(url, headers=header)
        response.encoding = 'utf-8'
        text = response.text
        # Slice from JSON.parse(" to the first "); after it, then unescape the quotes
        start = text.index(marker) + len(marker)
        end = text.index('");', start)
        intercepts.append(text[start:end].replace(r'\"', '"'))
    return intercepts

def download(intercepts):
    """Save every image path found in the extracted data to disk."""
    # Compile the patterns once, outside the loop
    pattern = re.compile(r'"multi":\[{"path":"(.*?)","type"')
    pattern_item_id = re.compile(r'"post_data":{"item_id":"(.*?)","uid"')
    for i in intercepts:
        paths = pattern.findall(i)
        item_id = pattern_item_id.findall(i)[0]
        for index, url in enumerate(paths, start=1):
            # Decode \uXXXX escapes, then drop any leftover backslashes
            content = re.sub(r'(\\u[a-zA-Z0-9]{4})', lambda x: x.group(1).encode("utf-8").decode("unicode-escape"), url)
            response = requests.get(content.replace('\\', ''))
            # The D:\bcy directory must already exist
            with open('D:\\bcy\\' + str(item_id) + str(index) + '.png', 'wb') as f:
                f.write(response.content)
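The slicing and URL clean-up can be tried offline on a small string. The sketch below is only an illustration: SAMPLE_PAGE is a made-up fragment (not real bcy.net output) shaped to match the patterns used above, and extract_payload and clean_url are hypothetical helper names.

```python
import re

# Made-up page fragment, NOT real bcy.net output. It assumes quotes are
# escaped as \" and slashes appear as \\u002F inside JSON.parse("...").
SAMPLE_PAGE = r'<script>window.__ssr_data = JSON.parse("{\"detail\":{\"post_data\":{\"item_id\":\"123\",\"uid\":1,\"multi\":[{\"path\":\"https:\\u002F\\u002Fexample.com\\u002Fa.jpg\",\"type\":\"image\"}]}}}");</script>'

MARKER = 'JSON.parse("'
PATH_RE = re.compile(r'"multi":\[\{"path":"(.*?)","type"')

def extract_payload(text):
    """Slice out the text between JSON.parse(" and the next "); and unescape quotes."""
    start = text.index(MARKER) + len(MARKER)
    end = text.index('");', start)  # search only after the marker
    return text[start:end].replace(r'\"', '"')

def clean_url(raw):
    """Turn \\uXXXX escapes into characters, then strip leftover backslashes."""
    decoded = re.sub(r'\\u([0-9a-fA-F]{4})',
                     lambda m: chr(int(m.group(1), 16)), raw)
    return decoded.replace('\\', '')

payload = extract_payload(SAMPLE_PAGE)
urls = [clean_url(p) for p in PATH_RE.findall(payload)]
print(urls)  # ['https://example.com/a.jpg']
```

Starting the search for the closing `");` at the marker position avoids picking up an earlier `");` elsewhere in the page.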

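One last note: the image URLs are pulled out with regular expressions rather than by parsing the JSON, because the JSON string contains special characters and some of them cannot be converted.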
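For pages where the embedded text is well-formed, a two-step decode with the json module is one possible alternative. This is a sketch under assumptions, not the article's method: it treats the argument of JSON.parse("...") as a JSON-compatible string literal, uses a made-up sample with a hypothetical detail -> post_data -> multi -> path layout, and returns None when decoding fails, which is exactly the case described above where falling back to the regular expressions makes sense. The function name decode_embedded_json is hypothetical.

```python
import json

# Made-up page fragment, NOT real bcy.net output
SAMPLE_PAGE = r'<script>window.__ssr_data = JSON.parse("{\"detail\":{\"post_data\":{\"item_id\":\"123\",\"uid\":1,\"multi\":[{\"path\":\"https:\\u002F\\u002Fexample.com\\u002Fa.jpg\",\"type\":\"image\"}]}}}");</script>'

MARKER = 'JSON.parse("'

def decode_embedded_json(text):
    """Decode the JSON.parse("...") argument in two steps; None if it is not valid."""
    start = text.index(MARKER) + len(MARKER)
    end = text.index('");', start)
    raw = text[start:end]
    try:
        # Step 1: read the escaped text as one JSON string literal
        inner = json.loads('"' + raw + '"')
        # Step 2: parse the resulting JSON document
        return json.loads(inner)
    except json.JSONDecodeError:
        # Escapes such as \x41 are legal in JavaScript but not in JSON
        return None

data = decode_embedded_json(SAMPLE_PAGE)
print(data["detail"]["post_data"]["multi"][0]["path"])
# https://example.com/a.jpg
```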

Summary

There are countless ways to write a crawler, but safety matters most.

Another day closer to a prison sentence! Scraping pictures of jk girls