Common Python Techniques for Processing Web Page Strings

First, a few simple, frequently used Python string operations. More will be added as they come up.

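1. Collapse repeated spaces

s = "hello   hello   hello"
# split() with no arguments splits on runs of whitespace; join with a single space
s = ' '.join(s.split())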
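
One detail of the idiom above is easy to miss: `split()` with no arguments splits on any run of whitespace (tabs and newlines included) and drops leading and trailing whitespace. The sketch below contrasts it with a regular expression that collapses only runs of spaces. The helper names `collapse_whitespace` and `collapse_spaces_only` are made up for illustration.

```python
import re

def collapse_whitespace(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    return ' '.join(text.split())

def collapse_spaces_only(text):
    """Collapse runs of two or more spaces; leave tabs and newlines alone."""
    return re.sub(r' {2,}', ' ', text)

print(collapse_whitespace("  hello \t hello\n hello "))
print(repr(collapse_spaces_only("hello   hello\nhello")))
```

Which one to use depends on whether line breaks in the page text carry meaning that should be kept.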

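2. Remove all newlines (or any other character or substring)

s = "hello\nhello\nhello hello\n"
print(s)
# replace() returns a new string with every occurrence removed
s = s.replace("\n", "")
print(s)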
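
When several different characters have to go at once, chaining `replace()` calls gets repetitive. A minimal sketch using `str.translate` is shown below. The helper name `remove_chars` and the sample string are assumptions for illustration.

```python
def remove_chars(text, chars):
    """Return text with every character listed in chars deleted."""
    # The third argument of str.maketrans lists characters to map to None
    return text.translate(str.maketrans('', '', chars))

s = "hello\r\nhello\thello\n"
print(repr(remove_chars(s, "\r\n\t")))
```

This only works for single characters. To remove whole substrings, `replace()` remains the simpler tool.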

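3. Find the position of the first occurrence of a substring (returns -1 if not found)

s = "hello\nhello\nhello hello\n"
print(s.find('\n'))   # index of the first newline
print(s.find('la'))   # 'la' does not occur, so this prints -1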
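
`find` also accepts a start index, which makes it possible to walk through every occurrence instead of only the first. A small sketch follows. The helper name `find_all` is hypothetical, and it reports non-overlapping matches only.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError("sub must not be empty")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume the search just past the match we found
        start = text.find(sub, start + len(sub))
    return positions

s = "hello\nhello\nhello hello\n"
print(find_all(s, '\n'))
print(find_all(s, 'la'))
```

Unlike `find`, the related `index` method raises `ValueError` when the substring is missing, so `find` is the better fit for a loop that ends on "not found".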

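4. Find the position of the first occurrence when searching from the end (returns -1 if not found)

s = "hello\nhello\nhello hello\n"
print(s.rfind('\n'))  # index of the last newline
print(s.rfind('la'))  # 'la' does not occur, so this prints -1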
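
A typical use of `rfind` in web work is taking whatever follows the last separator, for example the file name at the end of a path. The sketch below assumes a hypothetical helper `after_last` and a made-up sample path.

```python
def after_last(text, sep):
    """Return the part of text after the last sep, or text itself if sep is absent."""
    idx = text.rfind(sep)
    if idx == -1:
        # Handle "not found" explicitly rather than slicing with -1
        return text
    return text[idx + len(sep):]

print(after_last("/static/img/logo.png", "/"))
print(after_last("logo.png", "/"))
```

Checking for -1 before slicing matters: using the raw result as a slice index would silently pick the wrong position.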

5. Convert a string to a list

s = "hello\nhello\nhello hello\n"
print(list(s))  # one list element per character, newlines included

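Note that `list(s)` yields one element per character. When the goal is a list of words or lines, `split()` and `splitlines()` are usually closer to what is wanted. The comparison below is a small illustration using the same sample string.

```python
s = "hello\nhello\nhello hello\n"

chars = list(s)          # one element per character, including '\n' and ' '
words = s.split()        # split on any whitespace
lines = s.splitlines()   # split on line boundaries, no trailing empty item

print(len(chars))
print(words)
print(lines)
```
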
6. Find all matching substrings

import re

s = "hello\nhello\nhello hello\n"
print(re.findall('hello', s))  # 'hello' can also be replaced with a regular expression
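
As an illustration of swapping the literal for a real pattern, the sketch below pulls `href` values out of a small HTML fragment. The fragment and the patterns are assumptions for the example. When a pattern contains capturing groups, `re.findall` returns the group contents rather than the whole match.

```python
import re

html = '<a href="/index.html">Home</a> <a href="/about.html">About</a>'

# One capturing group: findall returns a list of strings
links = re.findall(r'href="([^"]*)"', html)

# Two capturing groups: findall returns a list of tuples
pairs = re.findall(r'<a href="([^"]*)">([^<]*)</a>', html)

print(links)
print(pairs)
```

Regular expressions like these are fine for quick extraction from predictable markup. For arbitrary pages, a real HTML parser is more robust.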

Next come the more advanced techniques for processing web page strings, which combine the requests, BeautifulSoup, and re modules:

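1. Fetch the content of a link with requests and write it to a file unchanged

import requests

r = requests.get('')  # supply the page URL here
# Open in binary mode so the bytes are written exactly as received
with open('test.html', 'wb') as fd:
    for chunk in r.iter_content(100):
        fd.write(chunk)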
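
If requests is not installed, the same "save the bytes unchanged" step can be sketched with the standard library. Note that this swaps in `urllib.request` for requests and is not the approach used above. The function name `download_to_file` and the `timeout` default are assumptions for illustration.

```python
import shutil
import urllib.request

def download_to_file(url, path, timeout=10):
    """Copy the raw bytes served at url into path without decoding them."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(path, 'wb') as fd:
            # copyfileobj streams in chunks, so large pages are not held in memory
            shutil.copyfileobj(response, fd)
```

Call it with the page's URL and a target file name. Because nothing is decoded, the saved file keeps whatever encoding the server sent.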

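2. Read the entire contents of a file into a string

# encoding: utf-8
with open('test.html', 'r', encoding='utf-8') as f:
    content = f.readlines()   # list of lines, each keeping its trailing newline
content = ''.join(content)    # join the lines back into one string
# content = content.replace('\n', '')  # uncomment to strip newlines as well
print(content)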
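
Reading the lines and joining them again gives the same result as reading the file in one call. A shorter sketch using `pathlib` is shown below. The helper name `read_text_file` and the temporary demo file are assumptions, used only so the example runs anywhere.

```python
import tempfile
from pathlib import Path

def read_text_file(path, encoding='utf-8'):
    """Return the whole file as one string, decoded with the given encoding."""
    return Path(path).read_text(encoding=encoding)

# Round-trip demo on a temporary file
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / 'demo.html'
    demo.write_text('<p>hello</p>\n<p>world</p>\n', encoding='utf-8')
    content = read_text_file(demo)

print(content)
```

Passing the encoding explicitly avoids depending on the platform default, which matters for pages that contain non-ASCII text.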

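3. Load the web page string into BeautifulSoup for processing

from bs4 import BeautifulSoup

# The class name is case-sensitive: BeautifulSoup, not beautifulsoup
soup = BeautifulSoup(content, 'html.parser')
print(soup.prettify())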

4. Once the string is loaded into BeautifulSoup, you can manipulate it however you like. For example, extract all <a> tags:

soup = BeautifulSoup(content, 'html.parser')
print(soup.find_all('a'))

Or extract all <a> tags and <b> tags:

soup = BeautifulSoup(content, 'html.parser')
# Passing a list matches any of the listed tag names
print(soup.find_all(['a', 'b']))

These are BeautifulSoup features. For details, see the official BeautifulSoup documentation or my other blog post on the topic.
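
For environments where bs4 is not available, a rough equivalent of collecting tags by name can be sketched with the standard library's `html.parser`. This is a swapped-in alternative, not BeautifulSoup's API. The class name `TagCollector` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Record (tag, attributes) for each start tag whose name is in wanted."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.found = []

    def handle_starttag(self, tag, attrs):
        # The parser passes tag names in lower case
        if tag in self.wanted:
            self.found.append((tag, dict(attrs)))

collector = TagCollector(['a', 'b'])
collector.feed('<p><a href="/x.html">x</a> and <b>bold</b></p>')
print(collector.found)
```

This records only the start tags and their attributes. BeautifulSoup's `find_all` returns full elements with their contents, which is why it is the more convenient choice when it is installed.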

5. Split a string on multiple delimiters

import re

# text is the string to split; '|' separates the alternative delimiters
re.split('; |, ', text)

>>> a = 'beautiful, is; better*than\nugly'
>>> import re
>>> re.split(r'; |, |\*|\n', a)
['beautiful', 'is', 'better', 'than', 'ugly']
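
Listing every delimiter together with its trailing space gets fragile as the list grows. One possible variation is sketched below: it builds a character class from the delimiters, then strips whitespace and drops empty pieces. The helper name `split_on_delimiters` and its default delimiter set are assumptions for illustration.

```python
import re

def split_on_delimiters(text, delimiters=";,*\n"):
    """Split text on any single delimiter character, dropping empty pieces."""
    # re.escape keeps characters such as '*' from being read as regex syntax
    pattern = "[" + re.escape(delimiters) + "]"
    return [part.strip() for part in re.split(pattern, text) if part.strip()]

print(split_on_delimiters('beautiful, is; better*than\nugly'))
print(split_on_delimiters('a,, b;;c'))
```

The filtering step means consecutive delimiters do not produce empty strings, which plain `re.split` would return.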
