Scraping Lagou job listings with Selenium


This example scrapes data-analyst job postings.
Environment:
1. Python 3
2. Anaconda3 (Spyder)
3. Windows 10
Source code:
```python
from selenium import webdriver
import time
import logging
import random
import openpyxl

wb = openpyxl.Workbook()
sheet = wb.active
sheet.append(['job_name', 'company_name', 'city', 'industry', 'salary',
              'experience_edu', 'welfare', 'job_label'])
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s - %(levelname)s: %(message)s')


def search_product(key_word):
    browser.find_element_by_id('cboxClose').click()  # close the city-selection popup
    time.sleep(2)
    browser.find_element_by_id('search_input').send_keys(key_word)  # type the keyword into the search box
    browser.find_element_by_class_name('search_button').click()  # click the search button
    browser.maximize_window()
    time.sleep(2)
    browser.execute_script("scroll(0,2500)")  # scroll down the page
    get_data()  # scrape the first page
    # Click "next page" to paginate. Sleep after each page to throttle the
    # crawl and reduce the chance of triggering the anti-bot CAPTCHA.
    for i in range(4):
        browser.find_element_by_class_name('pager_next').click()
        time.sleep(1)
        browser.execute_script("scroll(0,2300)")
        get_data()
        time.sleep(random.randint(3, 5))


def get_data():
    items = browser.find_elements_by_xpath('//*[@id="s_position_list"]/ul/li')
    for item in items:
        job_name = item.find_element_by_xpath('.//div[@class="p_top"]/a/h3').text
        company_name = item.find_element_by_xpath('.//div[@class="company_name"]').text
        city = item.find_element_by_xpath('.//div[@class="p_top"]/a/span[@class="add"]/em').text
        industry = item.find_element_by_xpath('.//div[@class="industry"]').text
        salary = item.find_element_by_xpath('.//span[@class="money"]').text
        experience_edu = item.find_element_by_xpath('.//div[@class="p_bot"]/div[@class="li_b_l"]').text
        welfare = item.find_element_by_xpath('.//div[@class="li_b_r"]').text
        job_label = item.find_element_by_xpath('.//div[@class="list_item_bot"]/div[@class="li_b_l"]').text
        data = f'{job_name},{company_name},{city},{industry},{salary},{experience_edu},{welfare},{job_label}'
        logging.info(data)
        sheet.append([job_name, company_name, city, industry, salary,
                      experience_edu, welfare, job_label])


def main():
    browser.get('https://www.lagou.com/')
    time.sleep(random.randint(1, 3))
    search_product(keyword)
    wb.save('C:/Users/liz/job_info.xlsx')


if __name__ == '__main__':
    keyword = 'Python 數(shù)據(jù)分析'
    chrome_driver = r'C:/Users/liz/chromedriver.exe'  # path to the chromedriver binary
    options = webdriver.ChromeOptions()
    # Hide the "Chrome is being controlled by automated test software" banner
    options.add_experimental_option('useAutomationExtension', False)
    options.add_experimental_option("excludeSwitches", ['enable-automation'])
    # Note: find_element_by_* and executable_path are Selenium 3 APIs
    browser = webdriver.Chrome(options=options, executable_path=chrome_driver)
    main()
    browser.quit()
```
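The scraped rows above are appended to an Excel sheet via openpyxl. If openpyxl is not available, the same header-plus-rows flow can be sketched with the standard-library csv module (the file name `job_info.csv`, the helper `save_rows`, and the sample row are all invented here for illustration, not from the article):

```python
import csv

# Column header matching the openpyxl sheet in the article
HEADER = ['job_name', 'company_name', 'city', 'industry', 'salary',
          'experience_edu', 'welfare', 'job_label']


def save_rows(path, rows):
    """Write the header plus one row per job posting, mirroring sheet.append()."""
    # utf-8-sig adds a BOM so Excel opens the CSV with correct encoding
    with open(path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows)


# Hypothetical sample row, standing in for one scraped listing
save_rows('job_info.csv', [
    ['Data Analyst', 'ExampleCo', 'Beijing', 'Internet', '15k-25k',
     '3-5 yrs / Bachelor', 'flexible hours', 'SQL,Python'],
])
```

One row per `get_data()` item would be passed in the same way the article passes lists to `sheet.append()`.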
Screenshot of a run:

Notes:
1. The chromedriver version must match the version of the installed Chrome browser.
2. Not entirely original; adapted from code found online.
3. This approach defeats common anti-scraping measures: many sites now load their data via JavaScript specifically to block crawlers, so fetching the page with Python's requests library returns no data. Selenium drives a real browser, which opens the page and executes the JavaScript before the data is extracted, so the crawler sees the fully rendered page.
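Beyond JS rendering, the code also throttles itself with `time.sleep(random.randint(3, 5))` between pages. That idea can be factored into a small reusable helper (a sketch; the name `polite_sleep` is invented here, not from the article):

```python
import random
import time


def polite_sleep(lo=3, hi=5):
    """Pause a random whole number of seconds in [lo, hi] between page
    loads, pacing the crawler to lower the odds of a CAPTCHA challenge."""
    delay = random.randint(lo, hi)
    time.sleep(delay)
    return delay
```

In the pagination loop above, `polite_sleep()` would replace the inline `time.sleep(random.randint(3, 5))` call after each `get_data()`.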



