Scraping all Python job postings on Lagou

# 2. Scrape all Python job postings on Lagou.
from urllib import request, parse
import json
import random


def user_agent(page):
    # List of browser User-Agent strings, so each visit can use a different browser
    user_agent_list = [
        '', '', '',
        'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0',
        '', '',
        'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 Firefox/31.0',
        '', '', '', "", "",
    ]
    # Pick one browser at random, skipping the blank entries
    user_agent = random.choice([ua for ua in user_agent_list if ua])
    # Call the lagou function
    lagou(page, user_agent)


def lagou(page, user_agent):
    # Request URL for the job listings
    base_url = ""
    # Check whether this is the first request; from the second request on, the value in data differs
    if page == 1:
        first = 'true'
    else:
        first = 'false'
    # POST form fields (not shown here); first and page belong in this dict
    data = {}
    # Join and URL-encode the parameters into a string. Note: its length is used in the headers below
    data = parse.urlencode(data)
    # Be sure to compare the subtle differences in each header field as page changes;
    # this is very important and decides whether any data can be scraped.
    # The Content-Length and User-Agent values matter most here.
    # Request headers (not shown here)
    headers = {}
    req = request.Request(url=base_url, data=bytes(data, encoding='utf-8'), headers=headers)
    response = request.urlopen(req)
    html = response.read()
    html = html.decode('utf-8')
    # Parse the JSON into a dict, then read whatever values are needed from it, as done below
    json_data = json.loads(html)
    # print(json_data)
    positionresult = json_data['content']['positionResult']
    # print(positionresult)
    result_list = positionresult['result']
    # print(result_list)
    for result in result_list:
        print(len(result))
        companyfullname = result['companyFullName']
        positionname = result['positionName']
        print(positionname, companyfullname)
    with open('lagou.html', 'a', encoding='utf-8') as f:
        f.write(str(result_list))
    print('~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~')


if __name__ == '__main__':
    # for page in range(1, 31):
    #     user_agent(page)
    user_agent(1)
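
The field extraction at the end of lagou() can be tried without touching the network. The following is a minimal sketch, assuming the decoded response nests its records under the keys content, positionResult and result, and that each record carries positionName and companyFullName. The helper name extract_positions and the sample payload are hypothetical and exist only for illustration.

```python
import json


def extract_positions(payload):
    """Return (positionName, companyFullName) pairs from a decoded response dict."""
    # .get() with defaults avoids a KeyError when the server returns an
    # error payload (for example a rate-limit message) instead of job data.
    results = payload.get("content", {}).get("positionResult", {}).get("result", [])
    return [(r.get("positionName"), r.get("companyFullName")) for r in results]


# Made-up sample in the assumed shape; this is not real Lagou data.
sample = json.loads(
    '{"content": {"positionResult": {"result": ['
    '{"positionName": "Python Developer", "companyFullName": "Example Co."}]}}}'
)
pairs = extract_positions(sample)
for position_name, company_full_name in pairs:
    print(position_name, company_full_name)
```

Returning an empty list for an unexpected payload lets a paging loop carry on or stop cleanly instead of crashing partway through.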

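The comments in lagou() stress that Content-Length has to match the encoded form data. Below is a small sketch of how that value could be derived. The function name build_post_body and the form fields first, pn and kd are assumptions made for illustration, since the original field names are not shown.

```python
from urllib import parse


def build_post_body(fields):
    """URL-encode form fields; return (body_bytes, content_length_value)."""
    body = parse.urlencode(fields).encode("utf-8")
    # Content-Length counts bytes, so measure the encoded body, not the str.
    return body, str(len(body))


# Hypothetical form fields, for illustration only.
body, content_length = build_post_body({"first": "true", "pn": 1, "kd": "python"})
print(body, content_length)
```

Measuring the encoded bytes rather than the string keeps the header correct if a keyword contains non-ASCII characters. As far as I know, urllib.request also adds Content-Length by itself when the header is absent, so setting it by hand mainly matters when mirroring a browser request exactly.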