Quickstart

So you have some great thoughts in your mind, and you want to save them in a beautiful Microsoft Word document? Here is an example how Moby Dick might be crafted:

  1. Import the docxgen library first
>>> from docxgen import *
  1. Create a Document instance
>>> doc = Document()

3. Add title, heading, and paragraph in the document body, repeat the following steps until the work is finished.

>>> body = doc.body
>>> body.append(title(run('Moby Dick; or, the whale')))
>>> body.append(h1(run('CHAPTER 1. Loomings.')))
>>> body.append(paragraph([run('Call me Ishmael. Some years ago ...')]))
  1. update the document core properties to claim your authorship.
>>> doc.update(title='Moby Dick; or, the Whale', creator='Herman Melville')
  1. Save the document and profit!
>>> doc.save('/tmp/moby-dick.docx')

Key Concepts

The Microsoft Word document is mere a collection of document, stylesheets, image blobs serialized in the zip format. The main document is just a XML file specified as WordprocessingML. So technically, you can create a Word document from scratch just using a text editor and zip utility application.

docxgen leverages the power of lxml library to provide a more convenient interface to create the Word document element than the bare-metal XML manipulation. You do not need to dive into the WordProcessingML spec, but we highly recommend you to understand some key concepts:

  • The smallest build block in the WordprocessingML is a text element, t.
  • Then you can attach visual styles, like bold, italic and foreground / background colors to create a text run, r.
  • The text runs form a paragraph, para, which may be stylized for heading, title, and various list items.
  • All the paragraphs are saved in a body element just like HTML.
  • The body is then saved in the top-level element document.
  • The document is serialized as word/document.xml with other stylesheets and image blobs into a zip container during the serialization.

Please head over to the API Documentation for more information, also it is worthy checking out the lxml document for the convention used in the docxgen.