Parsing XML files with Python

2022-07-26 07:27:09

Using ET.parse

Introduction to xml.etree.ElementTree, the Python 3 XML parsing module

Removing duplicate XML nodes

import xml.etree.ElementTree as ET  # import the XML module

tree = ET.parse('gho.xml')          # parse the given XML file
root = tree.getroot()               # get the root (top-level) element
data = root.find('data')            # find the 'data' element directly under the root

for obs in data:                    # iterate over every child of 'data'
    for item in obs:                # iterate over every child of each 'obs' element
        key = item.attrib           # attrib is a dict of attributes, not a method
        print(list(key))            # print the attribute names (the dict keys)
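Since gho.xml itself is not shown, the same traversal can be tried on an inline string. The following is a minimal sketch: the sample document, its `dim` elements, and the helper name `collect_attribute_names` are assumptions made for illustration, not the real layout of that file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real structure of gho.xml is not shown in the article.
SAMPLE = """
<root>
  <data>
    <obs>
      <dim category="COUNTRY" code="AFG"/>
      <dim category="YEAR" code="2010"/>
    </obs>
    <obs>
      <dim category="COUNTRY" code="ALB"/>
    </obs>
  </data>
</root>
"""


def collect_attribute_names(xml_text):
    """Return the sorted attribute names of every grandchild of <data>."""
    root = ET.fromstring(xml_text)   # parse from a string instead of a file
    data = root.find('data')         # find() returns None when nothing matches
    if data is None:
        return []
    names = []
    for obs in data:                 # children of <data>
        for item in obs:             # children of each <obs>
            names.append(sorted(item.attrib))
    return names


print(collect_attribute_names(SAMPLE))
```

Checking the result of `find()` for `None` before iterating avoids a `TypeError` when the expected element is absent.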

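How to read attributes and node content

How can the id and name attributes of data, together with their values, be extracted?

Explanation

There are two ways to do this (the snippets below use the Java DOM API):

1. Get the Node first

String strId = node.getAttributes().getNamedItem("id").getNodeValue();
String strName = node.getAttributes().getNamedItem("name").getNodeValue();

2. Get the Element first

String strId = element.getAttribute("id");
String strName = element.getAttribute("name");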
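For comparison, here is a rough Python counterpart using xml.etree.ElementTree, where attributes are exposed through `Element.get()`, `Element.attrib`, and `Element.items()`. The `<data>` element and its values below are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical element, used only for illustration.
elem = ET.fromstring('<data id="42" name="example">some text</data>')

str_id = elem.get("id")                # returns None if the attribute is absent
str_name = elem.attrib["name"]         # dict access; raises KeyError if absent
missing = elem.get("missing", "n/a")   # get() accepts a default value
text = elem.text                       # text directly inside the element

# items() yields (name, value) pairs for all attributes
for key, value in elem.items():
    print(key, "=", value)

print(str_id, str_name, missing, text)
```

`get()` is usually the safer choice when an attribute may be missing, because it does not raise.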

A small exercise

#!/usr/bin/env python

import sys
import xml.etree.ElementTree as ET

tree = ET.parse('abcdefg.xml')
root = tree.getroot()

# './/*' matches every descendant element of the root
iter_elem = root.findall('.//*')
print(len(iter_elem))

# elem = root.find('')
# print(iter_elem)

for element in iter_elem:
    if element is None:
        continue
    if element.text is None:            # skip elements that have no text
        continue
    print("hello")
    # context =
    src_elem = element.find("source")   # first direct child named 'source'
    if src_elem is None:
        continue
    print("attri :%s" % src_elem.attrib)
    print("tag :%s" % src_elem.tag)
    # for item in src_elem:
    #     key = item.text
    #     print(list(key))
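To see what the `.//*` path and `find("source")` return, the same calls can be run on a small inline document. This is only a sketch; the `catalog`, `item`, and `note` tags are invented here, since abcdefg.xml is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, used only for illustration.
SAMPLE = (
    "<catalog>"
    "<item><source lang='en'>x</source></item>"
    "<item><note>y</note></item>"
    "</catalog>"
)
root = ET.fromstring(SAMPLE)

# './/*' returns all descendants in document order, without the root itself
descendants = root.findall('.//*')
tags = [e.tag for e in descendants]

# find('source') only looks at direct children of each element
with_source = [e.tag for e in descendants if e.find('source') is not None]

# iter() walks the whole tree and does include the root
all_tags = [e.tag for e in root.iter()]

# './/source' searches at any depth below the root
source_attrib = root.find('.//source').attrib

print(tags, with_source, all_tags, source_attrib)
```

Because `find()` inspects direct children only, an element is reported here only when `source` sits immediately below it.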

Deleting duplicate nodes:

import xml.etree.ElementTree as ET

path = 'in.xml'
tree = ET.parse(path)
root = tree.getroot()
prev = None

def elements_equal(e1, e2):
    # Two elements are equal when tag, text, tail, attributes and all
    # children (recursively) match. The type check handles prev being None.
    if type(e1) != type(e2):
        return False
    if e1.tag != e2.tag: return False
    if e1.text != e2.text: return False
    if e1.tail != e2.tail: return False
    if e1.attrib != e2.attrib: return False
    if len(e1) != len(e2): return False
    return all(elements_equal(c1, c2) for c1, c2 in zip(e1, e2))

for page in root:                      # iterate over pages
    elems_to_remove = []
    for elem in page:
        if elements_equal(elem, prev):
            print("found duplicate: %s" % elem.text)   # equal function works well
            elems_to_remove.append(elem)               # remember it for removal
            continue
        prev = elem
    # remove after the loop so the page is not modified while iterating over it
    for elem_to_remove in elems_to_remove:
        page.remove(elem_to_remove)

tree.write("out.xml")
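The same idea can be tried without any files. The sketch below is a variation under stated assumptions rather than the script above: the sample document is invented, the helper name `remove_consecutive_duplicates` is hypothetical, and `prev` is reset for each page so that comparisons do not cross page boundaries.

```python
import xml.etree.ElementTree as ET


def elements_equal(e1, e2):
    """Recursively compare tag, text, tail, attributes and children."""
    if e1 is None or e2 is None:
        return False
    if (e1.tag, e1.text, e1.tail, e1.attrib) != (e2.tag, e2.text, e2.tail, e2.attrib):
        return False
    if len(e1) != len(e2):
        return False
    return all(elements_equal(c1, c2) for c1, c2 in zip(e1, e2))


def remove_consecutive_duplicates(root):
    """Drop each child that equals the previously kept sibling; return the count."""
    removed = 0
    for page in root:
        prev = None                  # assumption: do not compare across pages
        for elem in list(page):      # iterate over a copy so removal is safe
            if elements_equal(elem, prev):
                page.remove(elem)
                removed += 1
            else:
                prev = elem
    return removed


# Hypothetical input, written without whitespace so every tail is None.
SAMPLE = "<doc><page><line>a</line><line>a</line><line>b</line><line>a</line></page></doc>"
root = ET.fromstring(SAMPLE)
count = remove_consecutive_duplicates(root)
result = ET.tostring(root, encoding="unicode")
print(count, result)
```

Only adjacent duplicates are removed, so the final `<line>a</line>` survives. In an indented document the last child of a parent usually has a different `tail` from its siblings, so it may be worth dropping the `tail` comparison if such duplicates should also match.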
