Install: pip install textblob
Install from a domestic mirror: pip install textblob -i
Reference:
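Note (a small addition, not in the original post): the POS tagging, noun phrase, and WordNet examples below depend on NLTK corpora, which TextBlob ships a downloader for:

python -m textblob.download_corpora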
from textblob import TextBlob

1. POS tagging
text = 'I love natural language processing! I am not like fish!'
blob = TextBlob(text)
blob.tags
[('I', 'PRP'),
 ('love', 'VBP'),
 ('natural', 'JJ'),
 ('language', 'NN'),
 ('processing', 'NN'),
 ('I', 'PRP'),
 ('am', 'VBP'),
 ('not', 'RB'),
 ('like', 'IN'),
 ('fish', 'NN')]
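A small follow-up sketch, not from the original post: since blob.tags is just a list of (word, tag) tuples, it can be filtered directly, for example to keep only the nouns from the blob defined above:

# keep tokens whose Penn Treebank tag marks a noun (NN, NNS, NNP, ...)
nouns = [word for word, tag in blob.tags if tag.startswith('NN')]
print(nouns)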
2. Noun phrase extraction
np = blob.noun_phrases
for w in np:
    print(w)

natural language processing
3. Sentence sentiment
for sentence in blob.sentences:
    print(sentence + '------>' + str(sentence.sentiment.polarity))
I love natural language processing!------>0.3125
I am not like fish!------>0.0
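A small follow-up sketch, not from the original post: besides polarity, the default PatternAnalyzer also reports subjectivity, using the same blob as above:

for sentence in blob.sentences:
    # sentiment is a namedtuple: Sentiment(polarity, subjectivity)
    print(sentence.sentiment.polarity, sentence.sentiment.subjectivity)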
4. Tokenization (splitting text into sentences or words)
token = blob.words
for w in token:
    print(w)

I
love
natural
language
processing
I
am
not
like
fish

sentence = blob.sentences
for s in sentence:
    print(s)
I love natural language processing!
I am not like fish!
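A small follow-up sketch, not from the original post: blob.words drops punctuation, while the tokens property keeps it as separate tokens (same blob as above):

print(blob.words)    # punctuation removed
print(blob.tokens)   # punctuation kept, e.g. '!' as its own token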
5. Word inflection
token = blob.words
for w in token:
    # pluralize
    print(w.pluralize())
    # singularize
    print(w.singularize())
we
I
loves
love
naturals
natural
languages
language
processings
processing
we
I
ams
am
nots
not
likes
like
fish
fish
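A small follow-up sketch, not from the original post: pluralize() and singularize() also work on a standalone Word object:

from textblob import Word
print(Word('language').pluralize())   # languages
print(Word('cats').singularize())     # cat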
6. Word lemmatization
from textblob import Word
w = Word('went')
print(w.lemmatize('v'))
w = Word('octopi')
print(w.lemmatize())
go
octopus
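A small follow-up sketch, not from the original post: lemmatize() treats the word as a noun unless a part-of-speech hint is given, which is why 'went' needed the 'v' argument above:

from textblob import Word
print(Word('running').lemmatize())      # 'running' (interpreted as a noun)
print(Word('running').lemmatize('v'))   # 'run'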
7. WordNet integration
from textblob.wordnet import VERB
word = Word('octopus')
syn_word = word.synsets
for syn in syn_word:
    print(syn)
Synset('octopus.n.01')
Synset('octopus.n.02')

Specify that the returned synsets should be verbs:
syn_word1 = Word("hack").get_synsets(pos=VERB)
for syn in syn_word1:
    print(syn)
Synset('chop.v.05')
Synset('hack.v.02')
Synset('hack.v.03')
Synset('hack.v.04')
Synset('hack.v.05')
Synset('hack.v.06')
Synset('hack.v.07')
Synset('hack.v.08')

View the concrete definitions of a word's synsets:
Word("beautiful").definitions
['delighting the senses or exciting intellectual or emotional admiration',
 '(of weather) highly enjoyable']
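A small follow-up sketch, not from the original post: a Synset object also exposes its own definition, and two synsets can be compared for semantic similarity (the exact score depends on the WordNet data):

from textblob import Word
octopus = Word('octopus').synsets[1]
shrimp = Word('shrimp').synsets[0]
print(octopus.definition())              # definition of this particular sense
print(octopus.path_similarity(shrimp))   # a float in (0, 1], or None if no path exists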
8. Spelling correction
sen = 'I lvoe naturl language processing!'
sen = TextBlob(sen)
print(sen.correct())

I love nature language processing!
Word.spellcheck() returns spelling suggestions along with a confidence score:
w1 = Word('good')
w2 = Word('god')
w3 = Word('gd')
print(w1.spellcheck())
print(w2.spellcheck())
print(w3.spellcheck())
[('good', 1.0)]
[('god', 1.0)]
[('go', 0.586139896373057), ('god', 0.23510362694300518), ('d', 0.11658031088082901), ('g', 0.03626943005181347), ('ed', 0.009067357512953367), ('rd', 0.006476683937823834), ('nd', 0.0038860103626943004), ('gr', 0.0025906735751295338), ('sd', 0.0006476683937823834), ('md', 0.0006476683937823834), ('id', 0.0006476683937823834), ('gdp', 0.0006476683937823834), ('ga', 0.0006476683937823834), ('ad', 0.0006476683937823834)]
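A small follow-up sketch, not from the original post: spellcheck() can be combined with word tokenization to pick the top-ranked suggestion for every word of a sentence:

from textblob import TextBlob
for w in TextBlob('I lvoe naturl language processing!').words:
    suggestion, confidence = w.spellcheck()[0]   # highest-confidence candidate
    print(w, '->', suggestion, confidence)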
9. Parsing
text = TextBlob('I lvoe naturl language processing!')
print(text.parse())

I/PRP/B-NP/O lvoe/NN/I-NP/O naturl/NN/I-NP/O language/NN/I-NP/O processing/NN/I-NP/O !/./O/O

Each token is annotated as word/POS tag/chunk tag/preposition (PNP) tag, following the pattern library's parse format.
10. N-grams
text = TextBlob('I lvoe naturl language processing!')
print(text.ngrams(n=2))

[WordList(['I', 'lvoe']), WordList(['lvoe', 'naturl']), WordList(['naturl', 'language']), WordList(['language', 'processing'])]
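A small follow-up sketch, not from the original post: the same call with n=3 yields trigrams from the text defined above:

print(text.ngrams(n=3))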
Welcome to follow 【AI小白入門】, where we share Python, machine learning, deep learning, natural language processing, artificial intelligence and other topics, along with notes on cutting-edge techniques and job-hunting experience, and grow together with you and your dreams.