Summary of Basic BeautifulSoup Usage

pip install beautifulsoup4

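| Parser | Usage | Advantages | Disadvantages |
| --- | --- | --- | --- |
| 1. Python standard library | BeautifulSoup(html, 'html.parser') | Built into Python; reasonably fast | Weaker fault tolerance |
| 2. lxml HTML parser | BeautifulSoup(html, 'lxml') | Fast; strong fault tolerance | Must be installed separately; depends on a C library |
| 3. lxml XML parser | BeautifulSoup(html, ['lxml', 'xml']) or BeautifulSoup(html, 'xml') | Fast; strong fault tolerance; supports XML | Depends on a C library |
| 4. html5lib parser | BeautifulSoup(html, 'html5lib') | Parses the way a browser does; best fault tolerance | Slow |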
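Since lxml and html5lib are optional installs, a script can try the preferred parser first and fall back to the built-in one. The sketch below is only an illustration: `make_soup` and its preference order are hypothetical, and it assumes bs4 raises `FeatureNotFound` when the requested parser is not installed.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html5lib", "html.parser")):
    """Hypothetical helper: parse with the first parser that is installed."""
    for name in preferred:
        try:
            return BeautifulSoup(markup, name), name
        except FeatureNotFound:
            # This parser is not installed; try the next one.
            continue
    raise RuntimeError("no usable parser found")


# Deliberately broken markup: neither <p> nor <a> is closed.
soup, parser_used = make_soup("<p>Unclosed paragraph<a href='#'>link")
print(parser_used)
print(soup.a.get_text())
```

Each parser repairs broken markup in its own way, so the surrounding tree can differ between parsers even when a simple lookup like `soup.a` returns the same tag.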

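from urllib.request import urlopen

from bs4 import BeautifulSoup

url = ''  # fill in the URL of the page to fetch
resp = urlopen(url)
html = resp.read()

# Name the parser explicitly so the result does not depend on what is installed
bs = BeautifulSoup(html, 'html.parser')

# Pretty-print the parsed document
print(bs.prettify())

# Extract tags
print(bs.title)
print(type(bs.title))

# Name and attribute dict of the first <a> tag
print(bs.a.name)
print(bs.a.attrs)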
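To make the tag attributes above concrete without fetching a page, here is a small sketch on an inline document. The markup and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from any real page.
html = (
    '<html><head><title>Demo</title></head>'
    '<body><a class="css" href="/page" id="test">link</a></body></html>'
)
bs = BeautifulSoup(html, "html.parser")

first_link = bs.a                        # first <a> tag in the document
print(first_link.name)                   # tag name
print(first_link.attrs)                  # all attributes as a dict
print(first_link["href"])                # dict-style access to one attribute
print(first_link.get("title"))           # .get() returns None if the attribute is absent
print(bs.title.string)                   # text inside <title>
```

Note that `class` comes back as a list, because an HTML element can carry several classes.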

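# coding: utf-8
from bs4 import BeautifulSoup

# The comment text inside the tag is a placeholder
html = '''
<a class="css" href="" id="test"><!--comment text--></a>
'''
bs = BeautifulSoup(html, "html.parser")
print(bs.a)         # <a class="css" href="" id="test"><!--comment text--></a>
print(bs.a.string)  # comment text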

# Check whether the string is a comment
from bs4.element import Comment

if isinstance(bs.a.string, Comment):
    print(bs.a.string)
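The check matters because `.string` prints the same way for a comment as for ordinary text. A minimal sketch, using made-up markup with one tag of each kind:

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Hypothetical markup: the first tag holds only a comment, the second plain text.
html = '<a id="c"><!--hidden--></a><a id="t">visible</a>'
bs = BeautifulSoup(html, "html.parser")

kinds = []
for tag in bs.find_all("a"):
    # Comment is a subclass of NavigableString, so isinstance tells them apart.
    kind = "comment" if isinstance(tag.string, Comment) else "text"
    kinds.append((tag["id"], kind))
    print(tag["id"], kind, tag.string)
```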

<a class="css1" href="" id="css">abcgh</a>

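# Iterate over the direct children of the first <a> tag
for i in bs.a.contents:
    print(i)

# All text inside the tag, joined into one string
print(bs.a.get_text())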
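The difference between `.contents` and `.get_text()` shows up when a tag mixes text and a comment. In this sketch the comment `def` is an assumed example value: `.contents` lists it as a child, while `.get_text()` leaves it out.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Hypothetical markup: text, then a comment, then more text.
html = '<a class="css1" href="" id="css">abc<!--def-->gh</a>'
bs = BeautifulSoup(html, "html.parser")

children = bs.a.contents                 # direct children, comments included
comment_children = [c for c in children if isinstance(c, Comment)]
text = bs.a.get_text()                   # text nodes only, comments skipped

print(len(children))
print(comment_children)
print(text)
```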

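import re

# Match by tag name
print(bs.find_all('a'))

# Match any tag name in a list
print(bs.find_all(['a', 'b']))

# Match tag names against a regular expression
print(bs.find_all(re.compile('^b')))

# Match with a custom filter function
def has_class_but_not_id(tag):
    return tag.has_attr('class') and not tag.has_attr('id')

print(bs.find_all(has_class_but_not_id))

# Match by attribute value, either exact or by regular expression
print(bs.find_all(id='css'))
print(bs.find_all(id=re.compile('^a')))

# Combine several attribute filters
print(bs.find_all(id='css', href=re.compile('^ex')))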

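# class is a reserved word in Python, so the keyword argument is class_
print(bs.find_all(class_='css'))
# The attrs dict expresses the same kind of filter; this value is an assumed example
print(bs.find_all(attrs={'class': 'css'}))

# Match by text content (string= replaces the older text= argument)
print(bs.find_all(string=re.compile('^abc')))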
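To see what each kind of `find_all` filter returns, the calls can be run against a small document. The markup below is invented for illustration, and the variable names are only for readability.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup.
html = """
<body>
<a class="css" href="example/one" id="css">abc one</a>
<a class="css1" href="other/two" id="a2">two</a>
<b class="css">bold</b>
</body>
"""
bs = BeautifulSoup(html, "html.parser")

by_name = bs.find_all("a")                                  # both <a> tags
by_list = bs.find_all(["a", "b"])                           # <a> and <b> tags
by_regex = bs.find_all(re.compile("^b"))                    # tag names starting with "b"
by_id_regex = bs.find_all(id=re.compile("^a"))              # ids starting with "a"
by_id_href = bs.find_all(id="css", href=re.compile("^ex"))  # both filters must match
by_class = bs.find_all(class_="css")                        # exact class token "css"
no_id = bs.find_all(lambda t: t.has_attr("class") and not t.has_attr("id"))
by_string = bs.find_all(string=re.compile("^abc"))          # returns strings, not tags

print([t.name for t in by_regex])
print([t.get("id") for t in by_class])
print(by_string)
```

A regular expression passed as the first argument is matched against tag names, so `^b` here also picks up `<body>`. The class filter matches whole class tokens, so `css1` is not returned for `css`.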

# Tag selector
print(bs.select('a'))
# Class selector
print(bs.select('.css'))
# ID selector
print(bs.select('#css'))
# Attribute selector
print(bs.select('a[class="css"]'))
# Iterate over the matches
for tag in bs.select('a'):
    print(tag.get_text())
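Selectors can also be combined, which is often shorter than chaining `find_all` calls. The following is a rough sketch on made-up markup. It also uses `select_one`, which returns only the first match (or None).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup.
html = (
    '<div id="nav"><a class="css" href="/a">A</a>'
    '<a class="css1" href="/b">B</a></div>'
    '<a class="css" href="/c">C</a>'
)
bs = BeautifulSoup(html, "html.parser")

all_links = [t.get_text() for t in bs.select("a")]            # every <a>
css_links = [t.get_text() for t in bs.select(".css")]         # class token "css" only
nested = [t["href"] for t in bs.select("#nav a.css")]         # a.css inside #nav
direct = [t.get_text() for t in bs.select("div > a")]         # direct children of <div>
first = bs.select_one('a[href="/c"]')                         # first match or None

print(all_links, css_links, nested, direct, first.get_text())
```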
