Giuseppe: Parsing large combined XML document with Python

Saturday, 15 September 2012

I have a large file (400 MB) that contains hundreds of XML documents, each with its own XML declaration. I'm trying to parse each document using ElementTree in Python, but I'm having a lot of trouble splitting the file into individual XML documents so that I can parse the information in each one. An example of the file looks like this:

Ideally I would like to read up to each XML declaration, parse the data, and then continue with the next XML document. Any suggestions would help.
 
You will need to read the documents separately. Here is a generator function that yields each complete XML document from a given file object:

    def xml_documents(fileobj):
        document = []
        for line in fileobj:
            # A new XML declaration marks the start of the next document.
            if line.strip().startswith('<?xml') and document:
                yield ''.join(document)
                document = []
            document.append(line)
        # Yield whatever remains after the last declaration.
        if document:
            yield ''.join(document)

Then use ElementTree.fromstring() to load and parse each document:

    from xml.etree import ElementTree

    with open('file_with_multiple_xmldocuments') as fileobj:
        for xml in xml_documents(fileobj):
            tree = ElementTree.fromstring(xml)
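The line-based generator relies on every XML declaration starting on its own line. If the combined file might place a declaration in the middle of a line (for example, directly after the previous document's closing tag), a variation along the following lines could be used. This is only a sketch under stated assumptions: the names `split_xml_documents` and `MARKER` and the sample data are hypothetical, and it assumes the text `<?xml ` (with a trailing space) never appears inside a comment, CDATA section, or text content.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: every XML declaration starts with exactly this text.
MARKER = '<?xml '


def split_xml_documents(fileobj, marker=MARKER):
    """Yield each XML document as a string, splitting wherever marker occurs."""
    parts = []  # pieces of the document currently being collected
    for line in fileobj:
        pos = line.find(marker)
        while pos != -1:
            # Text before the marker still belongs to the current document.
            parts.append(line[:pos])
            document = ''.join(parts)
            if document.strip():
                yield document
            # Start the next document at the marker itself.
            parts = []
            line = line[pos:]
            # Look for another marker later in the same line.
            pos = line.find(marker, len(marker))
        parts.append(line)
    document = ''.join(parts)
    if document.strip():
        yield document


# Hypothetical sample standing in for the real file: two documents share
# the first line, and a third spans the next two lines.
sample = io.StringIO(
    '<?xml version="1.0"?><a>1</a><?xml version="1.0"?><b>2</b>\n'
    '<?xml version="1.0"?>\n'
    '<c>3</c>\n'
)

roots = [ET.fromstring(doc) for doc in split_xml_documents(sample)]
tags = [root.tag for root in roots]
print(tags)
```

Like the original generator, this keeps only one document in memory at a time, so it should suit a file of several hundred megabytes as long as each individual document is reasonably small.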

 




Posted by Unknown at 03:22











    