Linux environment
1. Installation
Method 1:
Extract: tar -xzvf beautifulsoup4-4.2.0.tar.gz
Install: change into the extracted directory, then run
python setup.py build
sudo python setup.py install
Method 2 (quick install):
(Ubuntu) sudo apt-get install python-bs4
or pip install beautifulsoup4
or easy_install beautifulsoup4
2. Import (in a Python environment)
from bs4 import BeautifulSoup
3. Usage
Example
html_doc = """
<html><head><title>The Dormouse's story</title></head>

<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="" class="sister" id="link1">Elsie</a>,
<a href="" class="sister" id="link2">Lacie</a> and
<a href="" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""
Getting started
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
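Calling BeautifulSoup with only the markup lets the library pick whichever parser happens to be installed, and newer bs4 releases warn when no parser is named. The following is a minimal sketch, assuming Python 3 and the standard-library "html.parser" backend. The short sample string and the names sample and parsed are illustrative only. Because "html.parser" does not insert a missing <body> tag, the sample includes one explicitly so that .body can be used.

```python
from bs4 import BeautifulSoup

# Illustrative sample markup (an assumption, not the article's html_doc).
sample = (
    "<html><head><title>The Dormouse's story</title></head>"
    "<body><p class=\"title\"><b>The Dormouse's story</b></p></body></html>"
)

# Naming the parser explicitly keeps results consistent across machines.
parsed = BeautifulSoup(sample, "html.parser")

# Tag names can be chained as attributes to walk down the tree.
print(parsed.title.string)
print(parsed.body.b.string)
```

If lxml is installed, "lxml" can be passed instead of "html.parser". That parser also fills in missing <html> and <body> tags.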
>>> soup.head()
[<title>The Dormouse's story</title>]
>>> soup.title
<title>The Dormouse's story</title>
>>> soup.title.string
u"The Dormouse's story"
>>> soup.body.b
<b>The Dormouse's story</b>
>>> soup.body.b.string
u"The Dormouse's story"
>>> soup.a
<a class="sister" href="" id="link1">Elsie</a>
Find all the <a> tags:
>>> soup.find_all('a')
[<a class="sister" href="" id="link1">Elsie</a>, <a class="sister" href="" id="link2">Lacie</a>, <a class="sister" href="" id="link3">Tillie</a>]
Print the information in each <a>:
>>> for key in soup.find_all('a'):
...     print key.get('class'), key.get("href")
...
['sister']
['sister']
['sister']
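The session above uses Python 2 syntax (the print statement and u"..." strings). As a rough Python 3 equivalent of the final loop, the sketch below collects the attributes of every <a> tag into a list. The markup is a cut-down copy of the three "sister" links, and the names links_html, link_soup and rows are illustrative assumptions.

```python
from bs4 import BeautifulSoup

# Illustrative markup modelled on the three "sister" links above.
links_html = """
<p class="story">
<a href="" class="sister" id="link1">Elsie</a>,
<a href="" class="sister" id="link2">Lacie</a> and
<a href="" class="sister" id="link3">Tillie</a>
</p>
"""

link_soup = BeautifulSoup(links_html, "html.parser")

# Collect (id, class list, href, text) for every <a> tag.
rows = []
for a in link_soup.find_all("a"):
    # .get() returns None for a missing attribute instead of raising.
    rows.append((a.get("id"), a.get("class"), a.get("href"), a.get_text()))

for row in rows:
    print(row)
```

Note that class comes back as a list, because an HTML element may carry several classes, while href and id come back as plain strings.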