Using XPath in ElementTree

My XML file looks like this:

<?xml version="1.0"?> 
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19"> 
    <Items> 
    <Item> 
     <ItemAttributes> 
     <ListPrice> 
      <Amount>2260</Amount> 
     </ListPrice> 
     </ItemAttributes> 
     <Offers> 
     <Offer> 
      <OfferListing> 
      <Price> 
       <Amount>1853</Amount> 
      </Price> 
      </OfferListing> 
     </Offer> 
     </Offers> 
    </Item> 
    </Items> 
</ItemSearchResponse>

All I want to do is extract the ListPrice.

This is the code I am using:

>> from elementtree import ElementTree as ET 
>> fp = open("output.xml","r") 
>> element = ET.parse(fp).getroot() 
>> e = element.findall('ItemSearchResponse/Items/Item/ItemAttributes/ListPrice/Amount') 
>> for i in e: 
>>     print i.text
>> 
>> e 
>>

Absolutely no output. I also tried

>> e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')

It makes no difference.

What am I doing wrong?


2009-08-23 Ryan R. Rosario

You have two problems.

1) element is only the root element, of type Element rather than ElementTree, so the search path is resolved relative to the root and must not repeat the root tag ItemSearchResponse.

2) Your search string needs to use namespaces if you keep the namespace in the XML.
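The second point can be checked directly. The sketch below parses a shortened copy of the XML from the question (inlined as a string, an assumption made only so the example runs on its own) and shows that every tag carries the namespace URI in braces, which is why a path made of bare tag names matches nothing. It uses xml.etree from the standard library instead of the older standalone elementtree package.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the document from the question, inlined for illustration.
XML = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice>
          <Amount>2260</Amount>
        </ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML)
NS = "{http://webservices.amazon.com/AWSECommerceService/2008-08-19}"

# The root tag includes the namespace URI in braces.
print(root.tag)

# Bare tag names do not match namespaced elements, so this list is empty.
plain = root.findall("Items/Item/ItemAttributes/ListPrice/Amount")
print(plain)  # []

# Qualifying every step of the path with the namespace finds the element.
steps = ["Items", "Item", "ItemAttributes", "ListPrice", "Amount"]
qualified = root.findall("/".join(NS + name for name in steps))
print([a.text for a in qualified])  # ['2260']
```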

To fix problem #1:

You need to change:

element = ET.parse(fp).getroot()

to:

element = ET.parse(fp)

To fix problem #2:

You can strip the xmlns from the XML document so that it looks like this:

<?xml version="1.0"?> 
<ItemSearchResponse> 
    <Items> 
    <Item> 
     <ItemAttributes> 
     <ListPrice> 
      <Amount>2260</Amount> 
     </ListPrice> 
     </ItemAttributes> 
     <Offers> 
     <Offer> 
      <OfferListing> 
      <Price> 
       <Amount>1853</Amount> 
      </Price> 
      </OfferListing> 
     </Offer> 
     </Offers> 
    </Item> 
    </Items> 
</ItemSearchResponse>

With this document you can use the following search string:

e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')

The full code:

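from xml.etree import ElementTree as ET  # standard-library successor of the standalone elementtree package

fp = open("output.xml", "r")
element = ET.parse(fp)
# The path is relative to the root element, so it starts at Items
e = element.findall('Items/Item/ItemAttributes/ListPrice/Amount')
for i in e:
    print(i.text)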

Alternative solution to problem #2:

Otherwise, you need to specify the xmlns inside the search string for each element.

The full code:

from xml.etree import ElementTree as ET  # standard-library successor of the standalone elementtree package

fp = open("output.xml", "r")
element = ET.parse(fp)

# Namespace URI in braces, prepended to every tag name in the path
namespace = "{http://webservices.amazon.com/AWSECommerceService/2008-08-19}"
e = element.findall('{0}Items/{0}Item/{0}ItemAttributes/{0}ListPrice/{0}Amount'.format(namespace))
for i in e:
    print(i.text)

Both print:

2260


2009-08-23 20:02:48

Thanks a lot. I was about to bang my head against the wall a few times.

No problem, they should give an example with namespaces in their documentation for find and findall.

well, they could have made this clearer in the documentation ... thanks! – jorrebor

ElementTree uses namespaces, so every element in your XML has a name like {http://webservices.amazon.com/AWSECommerceService/2008-08-19}Items

So make the search include the namespace, e.g.

search = '{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Items/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Item/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}ItemAttributes/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}ListPrice/{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Amount' 
element.findall(search)

gives the element corresponding to 2260
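Repeating the full URI at every step is hard to read. As a possible alternative, the find and findall methods in xml.etree accept an optional second argument that maps prefixes to namespace URIs. The sketch below assumes the question's XML is available as a string (inlined here so the example runs on its own), and the prefix "aws" is an arbitrary name chosen for illustration; only the URI has to match the document.

```python
from xml.etree import ElementTree as ET

# Copy of the document from the question, inlined for illustration.
XML = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice><Amount>2260</Amount></ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing><Price><Amount>1853</Amount></Price></OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML)

# "aws" is a hypothetical prefix; it does not need to appear in the XML itself.
ns = {"aws": "http://webservices.amazon.com/AWSECommerceService/2008-08-19"}

# Each step is written as prefix:tag and expanded using the mapping.
amounts = root.findall(
    "aws:Items/aws:Item/aws:ItemAttributes/aws:ListPrice/aws:Amount", ns
)
print([a.text for a in amounts])  # ['2260']
```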


2009-08-23 20:23:54 Mark

I think you mean: 2260

Yes, laziness: I just saw the Python Element for Amount and its address, and did not go a step further to see what text the Element had – Mark

from xml.etree import ElementTree as ET 
tree = ET.parse("output.xml") 
namespace = tree.getroot().tag[1:].split("}")[0] 
amount = tree.find(".//{%s}Amount" % namespace).text

Also, consider using lxml. It's much faster.

from lxml import etree as ET


2009-08-23 21:11:16 gonsalu

I just switched from xml to lxml and wow, what a speed difference ... lxml is much faster and handles namespaces better.

I ended up stripping the xmlns out of the raw XML like this:

import re

def strip_ns(xml_string):
    # Remove every default namespace declaration (xmlns="...") from the raw text
    return re.sub('xmlns="[^"]+"', '', xml_string)

Obviously be very careful with this, but it worked fine for me.
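To show how strip_ns might be used, here is a minimal sketch that applies it to a shortened copy of the question's XML (inlined as a string, which is an assumption for the sake of a runnable example) and then searches with bare tag names. Note that the regular expression works on the raw text and removes every xmlns="..." it finds, so it is only safe when the document has a single default namespace and no prefixed elements that depend on it.

```python
import re
from xml.etree import ElementTree as ET

def strip_ns(xml_string):
    # Remove every default namespace declaration (xmlns="...") from the raw text.
    return re.sub('xmlns="[^"]+"', '', xml_string)

# Shortened copy of the document from the question, inlined for illustration.
raw = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice><Amount>2260</Amount></ListPrice>
      </ItemAttributes>
    </Item>
  </Items>
</ItemSearchResponse>"""

# Parse the cleaned string; tags are now plain names without a namespace.
root = ET.fromstring(strip_ns(raw))
print(root.tag)  # ItemSearchResponse

# The simple path from the question now matches.
amounts = root.findall("Items/Item/ItemAttributes/ListPrice/Amount")
print([a.text for a in amounts])  # ['2260']
```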


2012-04-27 00:24:28 Franz

One of the most straightforward approaches, which works even with Python 3.0 and other versions, is the following:

Just take the root and keep descending into it until we reach the specified tag "Amount"

from xml.etree import ElementTree as ET 
tree = ET.parse('output.xml') 
root = tree.getroot() 
#print(root) 
e = root.find(".//{http://webservices.amazon.com/AWSECommerceService/2008-08-19}Amount") 
print(e.text)
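One thing to keep in mind with the ".//" form is that it matches an Amount at any depth, and find returns only the first match in document order. The sketch below, which again inlines the question's XML as a string so that it runs on its own, compares find and findall and shows one way to narrow the search to the Amount under ListPrice.

```python
from xml.etree import ElementTree as ET

# Copy of the document from the question, inlined for illustration.
XML = """<?xml version="1.0"?>
<ItemSearchResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2008-08-19">
  <Items>
    <Item>
      <ItemAttributes>
        <ListPrice><Amount>2260</Amount></ListPrice>
      </ItemAttributes>
      <Offers>
        <Offer>
          <OfferListing><Price><Amount>1853</Amount></Price></OfferListing>
        </Offer>
      </Offers>
    </Item>
  </Items>
</ItemSearchResponse>"""

root = ET.fromstring(XML)
NS = "{http://webservices.amazon.com/AWSECommerceService/2008-08-19}"

# find returns the first Amount in document order (the list price here).
first = root.find(".//" + NS + "Amount")
print(first.text)  # 2260

# findall returns every Amount, including the offer price.
all_amounts = root.findall(".//" + NS + "Amount")
print([a.text for a in all_amounts])  # ['2260', '1853']

# Naming the parent narrows the match to the Amount inside ListPrice.
list_price = root.find(".//" + NS + "ListPrice/" + NS + "Amount")
print(list_price.text)  # 2260
```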


2017-10-13 17:08:08
