Python web scraper: 58同城, part 1

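import datetime  # timestamps
import sqlite3   # database module
import requests  # fetches the HTML page; requests.get corresponds to an HTTP GET
from bs4 import BeautifulSoup  # HTML parser used below

for i in range(1, 50):  # pages 1 through 49; change the range to fetch as many pages as you like
    print('Now fetching page', i)
    url = ''.format(i)  # URL template is blank in the source; it needs a {} placeholder for the page number
    html = requests.get(url)  # fetch the 58同城 listing page
    soup = BeautifulSoup(html.text, 'lxml')  # parse the page with lxml
    for li in soup.find('ul', id='list_con').find_all('li'):  # everything we want lives under this element
        title = li.find('span', class_='name').text  # job title
        address = li.find('span', class_='address').text  # address
        salary = li.find('p', class_='job_salary').text  # salary
        # source = tp_soup.find("span", class_="source")
        # source = source.string if source else None
        wel = li.find('div', class_='job_wel clearfix')  # benefits block, missing on some listings
        if wel:
            wel = wel.text
        else:
            wel = None
        comp = li.find('div', class_='comp_name').text  # company name
        cate = li.find('span', class_='cate').text  # job category
        jingyan = li.find('span', class_='jingyan').string  # required experience
        one = (None, title, address, salary, wel, comp, cate, jingyan)  # one row, ready to store
        print('Scraping:', title)
print('Everything you asked for has been scraped')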

Output (store the results in a database, or in Excel or a similar format, yourself)
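Since the post leaves storage to the reader, here is a minimal sketch of one way to persist the `one` tuple with the `sqlite3` module that the script already imports, plus a CSV export that Excel can open. The table name `jobs`, the column names, and the helpers `open_db`, `save_row`, and `export_csv` are assumptions made for illustration; they are not part of the original script.

```python
import csv
import sqlite3

# Hypothetical column names, mirroring the fields of the scraper's `one` tuple.
COLUMNS = ("title", "address", "salary", "wel", "comp", "cate", "jingyan")


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the jobs table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT, address TEXT, salary TEXT, wel TEXT, "
        "comp TEXT, cate TEXT, jingyan TEXT)"
    )
    return conn


def save_row(conn, one):
    """Insert one 8-item tuple; the leading None lets SQLite assign the id."""
    conn.execute("INSERT INTO jobs VALUES (?, ?, ?, ?, ?, ?, ?, ?)", one)
    conn.commit()


def export_csv(conn, path):
    """Write every stored row to a CSV file and return the row count."""
    query = "SELECT " + ", ".join(COLUMNS) + " FROM jobs ORDER BY id"
    rows = conn.execute(query).fetchall()
    # utf-8-sig adds a BOM so Excel detects the encoding of Chinese text.
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
    return len(rows)
```

In the scraper, `conn = open_db('jobs.db')` would go before the page loop and `save_row(conn, one)` right after `one` is built; the file name `jobs.db` is likewise only an example. Committing after every row keeps the sketch simple; committing once per page would be faster on long runs.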

