XML writing tools for Python

I'm currently trying ElementTree and it looks fine, it escapes HTML entities and so on and so forth. Am I missing something truly wonderful I haven't heard of?

This is similar to what I'm actually doing:

import xml.etree.ElementTree as ET
root = ET.Element('html')
head = ET.SubElement(root,'head')
script = ET.SubElement(head,'script')
script.set('type','text/javascript')
script.text = "var a = 'I love á letters'"
body = ET.SubElement(root,'body')
h1 = ET.SubElement(body,'h1')
h1.text = "And I like the fact that 3 > 1"
tree = ET.ElementTree(root)
tree.write('foo.xhtml')

# more foo.xhtml
<html><head><script type="text/javascript">var a = 'I love &amp;aacute;
letters'</script></head><body><h1>And I like the fact that 3 &gt; 1</h1>
</body></html>


Asked by: Melissa835 | Posted: 06-10-2021






Answer 1

Another way is using the E Factory builder from lxml (available in Elementtree too)

>>> from lxml import etree

>>> from lxml.builder import E

>>> def CLASS(*args): # class is a reserved word in Python
...     return {"class":' '.join(args)}

>>> html = page = (
...   E.html(       # create an Element called "html"
...     E.head(
...       E.title("This is a sample document")
...     ),
...     E.body(
...       E.h1("Hello!", CLASS("title")),
...       E.p("This is a paragraph with ", E.b("bold"), " text in it!"),
...       E.p("This is another paragraph, with a", "\n      ",
...         E.a("link", href="http://www.python.org"), "."),
...       E.p("Here are some reserved characters: <spam&egg>."),
...       etree.XML("<p>And finally an embedded XHTML fragment.</p>"),
...     )
...   )
... )

>>> print(etree.tostring(page, pretty_print=True))
<html>
  <head>
    <title>This is a sample document</title>
  </head>
  <body>
    <h1 class="title">Hello!</h1>
    <p>This is a paragraph with <b>bold</b> text in it!</p>
    <p>This is another paragraph, with a
      <a href="http://www.python.org">link</a>.</p>
    <p>Here are some reservered characters: &lt;spam&amp;egg&gt;.</p>
    <p>And finally an embedded XHTML fragment.</p>
  </body>
</html>

Answered by: John345 | Posted: 07-11-2021



Answer 2

There's always SimpleXMLWriter, part of the ElementTree toolkit. The interface is dead simple.

Here's an example:

from elementtree.SimpleXMLWriter import XMLWriter
import sys

w = XMLWriter(sys.stdout)
html = w.start("html")

w.start("head")
w.element("title", "my document")
w.element("meta", name="generator", value="my application 1.0")
w.end()

w.start("body")
w.element("h1", "this is a heading")
w.element("p", "this is a paragraph")

w.start("p")
w.data("this is ")
w.element("b", "bold")
w.data(" and ")
w.element("i", "italic")
w.data(".")
w.end("p")

w.close(html)

Answered by: Carina737 | Posted: 07-11-2021



Answer 3

I assume that you're actually creating an XML DOM tree, because you want to validate that what goes into this file is valid XML, since otherwise you'd just write a static string to a file. If validating your output is indeed your goal, then I'd suggest

from xml.dom.minidom import parseString

doc = parseString("""<html>
    <head>
        <script type="text/javascript">
            var a = 'I love &amp;aacute; letters'
        </script>
    </head>
    <body>
        <h1>And I like the fact that 3 &gt; 1</h1>
    </body>
    </html>""")

with open("foo.xhtml", "w") as f:
    f.write( doc.toxml() )

This lets you just write the XML you want to output, validate that it's correct (since parseString will raise an exception if it's invalid) and have your code look much nicer.

Presumably you're not just writing the same static XML every time and want some substitution. In this case I'd have lines like

var a = '%(message)s'

and then use the % operator to do the substitution, like

</html>""" % {"message": "I love &amp;aacute; letters"})

Answered by: Agata928 | Posted: 07-11-2021



Answer 4

https://github.com/galvez/xmlwitch:

import xmlwitch
xml = xmlwitch.Builder(version='1.0', encoding='utf-8')
with xml.feed(xmlns='http://www.w3.org/2005/Atom'):
    xml.title('Example Feed')
    xml.updated('2003-12-13T18:30:02Z')
    with xml.author:
        xml.name('John Doe')
    xml.id('urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6')
    with xml.entry:
        xml.title('Atom-Powered Robots Run Amok')
        xml.id('urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a')
        xml.updated('2003-12-13T18:30:02Z')
        xml.summary('Some text.')
print(xml)

Answered by: Joyce847 | Posted: 07-11-2021



Answer 5

don't you actually want something like:

html(head(script(type='text/javascript', content='var a = ...')),
body(h1('And I like the fact that 3 < 1'), p('just some paragraph'))

I think I saw something like that somewhere. This would be wonderful.

EDIT: Actually, I went and wrote a library today to do just that: magictree

You can use it like this:

from magictree import html, head, script, body, h1, p
root = html(
         head(
           script('''var a = 'I love &amp;aacute; letters''', 
                  type='text/javascript')),
         body(
           h1('And I like the fact that 3 > 1')))

# root is a plain Element object, like those created with ET.Element...
# so you can write it out using ElementTree :)
tree = ET.ElementTree(root)
tree.write('foo.xhtml')

The magic in magictree lies in how the importing works: The Element factories are created when needed. Have a look at the source, it is based on an answer to another StackOverflow question.

Answered by: Agata918 | Posted: 07-11-2021



Answer 6

I ended up using saxutils.escape(str) to generate valid XML strings and then validating it with Eli's approach to be sure I didn't miss any tag

from xml.sax import saxutils
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

xml = '''<?xml version="1.0" encoding="%s"?>\n
<contents title="%s" crawl_date="%s" in_text_date="%s" 
url="%s">\n<main_post>%s</main_post>\n</contents>''' %
(self.encoding, saxutils.escape(title), saxutils.escape(time), 
saxutils.escape(date), saxutils.escape(url), saxutils.escape(contents))
try:
    minidoc = parseString(xml)
catch ExpatError:
    print "Invalid xml"

Answered by: Roland164 | Posted: 07-11-2021



Answer 7

For anyone encountering this now, there's actually a way to do this hidden away in Python's standard library in xml.sax.utils.XMLGenerator. Here's an example of it in action:

>>> from xml.sax.saxutils import XMLGenerator
>>> import StringIO
>>> w = XMLGenerator(out, 'utf-8')
>>> w.startDocument()
>>> w.startElement("test", {'bar': 'baz'})
>>> w.characters("Foo")
>>> w.endElement("test")
>>> w.endDocument()
>>> print out.getvalue()
<?xml version="1.0" encoding="utf-8"?>
<test bar="baz">Foo</test>

Answered by: John938 | Posted: 07-11-2021



Answer 8

Try http://uche.ogbuji.net/tech/4suite/amara. It is quite complete and has a straight forward set of access tools. Normal Unicode support, etc.

#
#Output the XML entry
#
def genFileOLD(out,label,term,idval):
    filename=entryTime() + ".html"
    writer=MarkupWriter(out, indent=u"yes")
    writer.startDocument()
    #Test element and attribute writing
    ans=namespace=u'http://www.w3.org/2005/Atom'
    xns=namespace=u'http://www.w3.org/1999/xhtml'
    writer.startElement(u'entry',
       ans,
       extraNss={u'x':u'http://www.w3.org/1999/xhtml' ,
                 u'dc':u'http://purl.org/dc/elements/1.1'})
    #u'a':u'http://www.w3.org/2005/Atom',
    #writer.attribute(u'xml:lang',unicode("en-UK"))

    writer.simpleElement(u'title',ans,content=unicode(label))
    #writer.simpleElement(u'a:subtitle',ans,content=u' ')
    id=unicode("http://www.dpawson.co.uk/nodesets/"+afn.split(".")[0])
    writer.simpleElement(u'id',ans,content=id)
    writer.simpleElement(u'updated',ans,content=unicode(dtime()))
    writer.startElement(u'author',ans)
    writer.simpleElement(u'name',ans,content=u'Dave ')
    writer.simpleElement(u'uri',ans,
      content=u'http://www.dpawson.co.uk/nodesets/'+afn+".xml")
    writer.endElement(u'author')
    writer.startElement(u'category', ans)
    if (prompt):
        label=unicode(raw_input("Enter label "))
    writer.attribute(u'label',unicode(label))
    if (prompt):
        term = unicode(raw_input("Enter term to use "))
    writer.attribute(u'term', unicode(term))
    writer.endElement(u'category')
    writer.simpleElement(u'rights',ans,content=u'\u00A9 Dave 2005-2008')
    writer.startElement(u'link',ans)
    writer.attribute(u'href',
         unicode("http://www.dpawson.co.uk/nodesets/entries/"+afn+".html"))
    writer.attribute(u'rel',unicode("alternate"))
    writer.endElement(u'link')
    writer.startElement(u'published', ans)
    dt=dtime()
    dtu=unicode(dt)
    writer.text(dtu)
    writer.endElement(u'published')
    writer.simpleElement(u'summary',ans,content=unicode(label))
    writer.startElement(u'content',ans)
    writer.attribute(u'type',unicode("xhtml"))
    writer.startElement(u'div',xns)
    writer.simpleElement(u'h3',xns,content=unicode(label))
    writer.endElement(u'div')
    writer.endElement(u'content')
    writer.endElement(u'entry')

Answered by: Connie640 | Posted: 07-11-2021



Similar questions

Writing python client for SOAP with suds

I want to convert a perl SOAP client into a python SOAP client. The perl client is initialized like $url = 'https://host:port/cgi-devel/Service.cgi'; $uri = 'https://host/Service'; my $soap = SOAP::Lite -&gt; uri($uri) -&gt; proxy($url); I tried to replicate this in python 2.4.2 with suds 0.3.6 doing from suds.client import Client url="https://host:port/cgi-devel...


Writing an XML file using python

I have to write an xml file using python standard modules (not using elementtree, lxml etc) The metadata is a SAML identity provider metadata and is of the form - &lt;?xml version="1.0"?&gt; &lt;EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" entityID="http://wsa.saas.com"&gt; &lt;IDPS...


python - writing to data to excel

I have data that I need to export to excel, I just don't know how to go about it, here's the view I'm using, I've commented out my attempts.A push to the right direction will be greatly appreciated. def month_end(request): """ A simple view that will generate a month end report as a PDF response. """ current_date = datetime.now() context = {} context['month'] = current_date.month ...


python - Writing a telnet client

HI, I have a device that exposes a telnet interface which you can log into using a username and password and then manipulate the working of the device. I have to write a C program that hides the telnet aspect from the client and instead provides an interface for the user to control the device. What would be a good way to proceed. I tried writing a simple socket program but it stops at the login prom...


python - Error in writing a class

I am running through a tutorial online at http://www.sthurlow.com/python/lesson08/ and I believe I understand how classes work in python, at least to some degree but when I run this code: class Shape: def __init__(self,x,y): self.x = x self.y = y description = "This shape has not been described yet" ...


error in writing data into file in python

a='aa' &gt;&gt;&gt; f=open("key.txt","w") &gt;&gt;&gt; s=str(a) &gt;&gt;&gt; f.write(s) and still the key.txt file remains blank .. why?


python - Writing string to a file on a new line every time

I want to append a newline to my string every time I call file.write(). What's the easiest way to do this in Python?


python - Writing file line to Django model field

I can't seem to write a line from a file to a field of a Django model. The field is described in the model as: text = models.TextField(null=True, blank=True, help_text='A status message.') However, when I attempt to create a new object I cannot fill this field using the readline function: file = open(filename, 'r') str = file.readline() When I attempt to ...


python - How can I check if a file is empty or not? and empty it before writing if not?

myfile = open('out.txt', 'a') myfile.write(caption) I write something like this whenever my script is run. But after the first run, it keeps on adding same data to the file. How can I check if it's empty or not and tell it to write only if it's empty?


Writing certain lines to a file in python

I have a large file called fulldataset. I would like to write lines from fulldataset to a new file called newdataset. I only want to write the lines from fulldataset though that contain the id numbers present in the listfile. Also all the id numbers start with XY. The id numbers occur in the middle of each line though. Here is an example line from list file: Robert, Brown, "XY-12344343...


Writing to the windows logs in Python

Is it possible to write to the windows logs in python?


Writing unit tests in Django / Python

I've not used Unit Tests before other than a quick introduction in a Uni course. I'm currently writing an application though and would like to teach myself TDD in the process. The problem is, I've no idea what to test or really how. I'm writing a Django application, and so far have only created the models (and customised the admin application). This is how I've written the skeletons of my tests so far:


is there COMMIT analog in python for writing into a file?

I have a file open for writing, and a process running for days -- something is written into the file in relatively random moments. My understanding is -- until I do file.close() -- there is a chance nothing is really saved to disk. Is that true? What if the system crashes when the main process is not finished yet? Is there a way to do kind of commit once every... say -- 10 minutes (and I call this commit myself -- ...


Writing binary data to a socket (or file) with Python

Let's say I have a socket connection, and the 3rd party listener on the other side expects to see data flowing in a very structured manner. For example, it looks for an unsigned byte that denotes a type of message being sent, followed by an unsigned integer that denotes the length of message, then another unsigned byte which is really a bit field with some flags set or unset and etc... How would I do this in python...


Writing a binary buffer to a file in python

I have some python code that: Takes a BLOB from a database which is compressed. Calls an uncompression routine in C that uncompresses the data. Writes the uncompressed data to a file. It uses ctypes to call the C routine, which is in a shared library. This mostly works, except for the ac...


Writing a website in Python

I'm pretty proficient in PHP, but want to try something new. I'm also know a bit of Python, enough to do the basics of the basics, but haven't used in a web design type situation. I've just written this, which works: #!/usr/bin/python def main(): print "Content-type: text/html" print print "&lt;html&gt;&lt;head&gt;" print "&lt;title&gt;Hello World from Python&lt;/title&gt;" ...


Writing python client for SOAP with suds

I want to convert a perl SOAP client into a python SOAP client. The perl client is initialized like $url = 'https://host:port/cgi-devel/Service.cgi'; $uri = 'https://host/Service'; my $soap = SOAP::Lite -&gt; uri($uri) -&gt; proxy($url); I tried to replicate this in python 2.4.2 with suds 0.3.6 doing from suds.client import Client url="https://host:port/cgi-devel...


Writing Sudoku Solver wih Python

Here is my Sudoku Solver written in python language, When I run this program there seems to be a problem with in Update function and Solve function. No matter how much time I look over and move the codes around, I seem to have no luck Can anyone Help me? import copy def display (A): if A: for i in range (9): for j in range (9): if type (A[i][j...


Writing an XML file using python

I have to write an xml file using python standard modules (not using elementtree, lxml etc) The metadata is a SAML identity provider metadata and is of the form - &lt;?xml version="1.0"?&gt; &lt;EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" entityID="http://wsa.saas.com"&gt; &lt;IDPS...


c# - Writing crawler that stay logged in with any server

I am writing a crawler. Once after the crawler logs into a website I want to make the crawler to "stay-always-logged-in". How can I do that? Is a client (like browser, crawler etc.,) make a server to obey this rule? This scenario could occur when the server allows limited logins in day.






Still can't find your answer? Check out these communities...



PySlackers | Full Stack Python | NHS Python | Pythonist Cafe | Hacker Earth | Discord Python



top