ET.parse usage
An introduction to xml.etree.ElementTree, Python 3's XML parsing module
Removing duplicate XML nodes
import xml.etree.ElementTree as ET   # import the XML module
tree = ET.parse('gho.xml')           # parse the given XML file
root = tree.getroot()                # get the root (outermost) element
data = root.find('data')             # find the 'data' child of the root
for obs in data:                     # iterate over every child of 'data'
    for item in obs:                 # iterate over every child of each 'obs' element
        key = item.attrib            # attribute dict (attrib is a property, not a method)
        print(list(key))             # print the attribute names
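The same loop can be tried without a file on disk. The following is a minimal sketch, assuming a made-up document shape: the real layout of gho.xml is not shown here, so the SAMPLE string, its element names other than 'data' and 'obs', and the helper collect_attribute_keys are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real gho.xml layout is not shown in the article.
SAMPLE = """
<root>
  <data>
    <obs id="1">
      <dim category="COUNTRY" code="AFG"/>
      <dim category="YEAR" code="2010"/>
    </obs>
  </data>
</root>
"""

def collect_attribute_keys(xml_text):
    """Return the sorted attribute names of every grandchild of 'data'."""
    root = ET.fromstring(xml_text)      # parse from a string instead of a file
    data = root.find('data')            # first 'data' child of the root
    keys = []
    for obs in data:                    # each child of 'data'
        for item in obs:                # each child of that child
            keys.append(sorted(item.attrib))   # attrib is a plain dict
    return keys

print(collect_attribute_keys(SAMPLE))
```

Note that item.attrib is a dictionary, so item.attrib['code'] or item.get('code') would give the attribute value rather than its name.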
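How to read attributes and node content
How can the id and name attributes of data, together with their values, be read?
Explanation
There are two ways (shown here with the Java DOM API):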
1. Get the Node first
String strId = node.getAttributes().getNamedItem("id").getNodeValue();
String strName = node.getAttributes().getNamedItem("name").getNodeValue();
2. Get the Element first
String strId = element.getAttribute("id");
String strName = element.getAttribute("name");
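Since the rest of this post uses Python, here is a rough ElementTree counterpart of the two snippets above. It is only a sketch: the SAMPLE element and the helper name read_id_name_text are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical element carrying the two attributes discussed above.
SAMPLE = '<data id="42" name="example">node content</data>'

def read_id_name_text(xml_text):
    """Return (id, name, text) of the root element."""
    element = ET.fromstring(xml_text)
    # Element.get returns None when the attribute is absent,
    # whereas element.attrib['id'] would raise KeyError.
    return element.get('id'), element.get('name'), element.text

print(read_id_name_text(SAMPLE))
```

Element.get covers the attribute lookup, and element.text holds the text directly inside the element.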
Exercise
#!/usr/bin/env python
import sys
import xml.etree.ElementTree as ET

tree = ET.parse('abcdefg.xml')
root = tree.getroot()
iter_elem = root.findall('.//*')      # every descendant of the root
print(len(iter_elem))
#elem = root.find('')
#print(iter_elem)
for element in iter_elem:
    if element is None:
        continue
    if element.text is None:          # skip elements without text
        continue
    print("hello")
    context = []                      # unused in this exercise
    src_elem = element.find("source") # first direct child named 'source'
    if src_elem is None:
        continue
    print("attri :%s" % src_elem.attrib)
    print("tag :%s" % src_elem.tag)
    #for item in src_elem:
    #    key = item.text
    #    print(list(key))
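To see what findall('.//*') actually selects in the exercise, the following sketch compares it with Element.iter() on a tiny in-memory document. The SAMPLE string and the summarize helper are hypothetical, since abcdefg.xml itself is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; abcdefg.xml itself is not shown in the article.
SAMPLE = '<a><b>text<source lang="en"/></b><c>more</c></a>'

def summarize(xml_text):
    root = ET.fromstring(xml_text)
    descendants = root.findall('.//*')   # b, source, c (root excluded)
    everything = list(root.iter())       # a, b, source, c (root included)
    # Tags of the elements that have a direct 'source' child
    with_source = [e.tag for e in descendants if e.find('source') is not None]
    return len(descendants), len(everything), with_source

print(summarize(SAMPLE))
```

The path './/*' matches all descendants but not the root itself, while iter() starts at the root. In both cases element.find("source") only looks at direct children of the element it is called on.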
Removing duplicate nodes:
import xml.etree.ElementTree as ET

path = 'in.xml'
tree = ET.parse(path)
root = tree.getroot()
prev = None

def elements_equal(e1, e2):
    # Recursively compare tag, text, tail, attributes, and children.
    if type(e1) != type(e2):
        return False
    if e1.tag != e2.tag: return False
    if e1.text != e2.text: return False
    if e1.tail != e2.tail: return False
    if e1.attrib != e2.attrib: return False
    if len(e1) != len(e2): return False
    return all(elements_equal(c1, c2) for c1, c2 in zip(e1, e2))

for page in root:                     # iterate over pages
    elems_to_remove = []
    for elem in page:
        if elements_equal(elem, prev):
            print("found duplicate: %s" % elem.text)   # equal function works well
            elems_to_remove.append(elem)   # remove later, not while iterating
            continue
        prev = elem
    for elem_to_remove in elems_to_remove:
        page.remove(elem_to_remove)
tree.write("out.xml")
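The duplicate-removal idea can be checked on a small in-memory document. This is a sketch under assumptions: in.xml is not shown, so the SAMPLE string and the wrapper function remove_consecutive_duplicates are hypothetical, and the comparison is written as one boolean expression rather than the step-by-step checks above.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; in.xml itself is not shown in the article.
SAMPLE = ("<doc><page><p>a</p><p>a</p><p>b</p></page>"
          "<page><p>b</p><p>c</p></page></doc>")

def elements_equal(e1, e2):
    """Recursively compare tag, text, tail, attributes, and children."""
    if e1 is None or e2 is None:
        return False
    return (e1.tag == e2.tag and e1.text == e2.text and e1.tail == e2.tail
            and e1.attrib == e2.attrib and len(e1) == len(e2)
            and all(elements_equal(c1, c2) for c1, c2 in zip(e1, e2)))

def remove_consecutive_duplicates(root):
    """Drop each element that equals the previously kept element."""
    prev = None                      # carried across pages, as in the script above
    for page in root:
        to_remove = []
        for elem in page:
            if elements_equal(elem, prev):
                to_remove.append(elem)   # collect first, remove after the loop
                continue
            prev = elem
        for elem in to_remove:
            page.remove(elem)
    return root

root = remove_consecutive_duplicates(ET.fromstring(SAMPLE))
print(ET.tostring(root, encoding='unicode'))
```

Two points are worth keeping in mind. Only an element equal to the most recently kept one is removed, so non-adjacent duplicates survive. And because tail is part of the comparison, pretty-printed files with differing whitespace between elements may not be detected as duplicates.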