Hi everyone, I'm Pipi.

1. Preface

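A few days ago in the Python Silver group, a follower who goes by 【手中的流沙】 asked a question about extracting data with pyquery selectors. Both the question and the source web page were shared as screenshots.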

2. Implementation

Here 【甯同學】 shared the following code:

from pyquery import PyQuery as pq


headers = {
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}
# pq fetches the page itself when it is given url=...
html = pq(url='https://www.cditv.cn/list-3894-1.html', headers=headers)
# html is already a PyQuery object; wrapping it again is kept from the original
doc = pq(html)
# Match every li.list-item in the nested lists, skipping the container's first child div
li = doc('div.style-type3 > div:gt(0) > ul > li.item > ul > li.list-item').items()
for i in li:
    info = {
        # Split each item's text on runs of three newlines
        'city': i.text().split('\n\n\n')
    }
    print(info)

Running the code prints one dictionary per matched list item, as the result screenshot showed.

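That really does the job in one step, very nice! It turns out pq can request the page directly, which saves some work. The main cost is the CSS selector, which takes some time and care to construct.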

The same extraction can also be done with XPath. The code is as follows:

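import requests
from lxml import etree

# Same request headers as in the pyquery example above
headers = {
    'Accept-Language': 'zh-CN,zh;q=0.9',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36'
}

res = requests.get(url='https://www.cditv.cn/list-3894-1.html', headers=headers)
# Let requests guess the encoding from the response body
res.encoding = res.apparent_encoding
html = etree.HTML(res.text)
li_lists = html.xpath('/html/body/div[1]/div[2]/div[2]/div[2]/ul/li')
print(len(li_lists))
for li in li_lists:
    # Collect every text node under the nested list items
    info = li.xpath('./ul//li//text()')
    # shi = li.xpath('./ul//li/h4/text()')
    # qu = li.xpath('./ul//li/strong/text()')
    # jiedao = li.xpath('./ul//li/br/text()')
    print(info)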
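
The commented-out lines hint at pulling each field separately. As a rough sketch of how that could look, the snippet below runs relative XPath queries against a small inline document. The markup, the helper name `extract_fields`, and the field names are assumptions made for illustration, not the real page's structure. One detail worth noting: a `<br>` element has no text children, so `br/text()` matches nothing, while the text that follows it can be reached with `br/following-sibling::text()`.

```python
from lxml import etree

# Hypothetical markup, invented for this sketch; the real page may differ.
SAMPLE_HTML = """
<html><body>
<ul>
  <li class="list-item"><h4>City A</h4><strong>District 1</strong><br>Street X</li>
  <li class="list-item"><h4>City B</h4><strong>District 2</strong><br>Street Y</li>
</ul>
</body></html>
"""


def extract_fields(html_text):
    """Return one dict of per-field XPath results for each li.list-item."""
    root = etree.HTML(html_text)
    rows = []
    for li in root.xpath('//li[@class="list-item"]'):
        # Text placed after <br> is a sibling text node of the <br>, not its child.
        after_br = li.xpath('./br/following-sibling::text()')
        rows.append({
            'city': li.xpath('./h4/text()'),
            'district': li.xpath('./strong/text()'),
            'br_text': li.xpath('./br/text()'),  # empty: <br> has no text children
            'street': [t.strip() for t in after_br if t.strip()],
        })
    return rows


if __name__ == '__main__':
    for row in extract_fields(SAMPLE_HTML):
        print(row)
```

Each XPath query returns a list, so a field that is missing from an item simply comes back as an empty list rather than raising an error.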