.. include:: ../header.txt

========================
 The Docutils Publisher
========================

:Author: David Goodger
:Contact: docutils-develop@lists.sourceforge.net
:Date: $Date$
:Revision: $Revision$
:Copyright: This document has been placed in the public domain.

.. contents::

The ``docutils.core.Publisher`` class is the core of Docutils,
managing all the processing and relationships between components.
See `PEP 258`_ for an overview of Docutils components.
Configuration is done via `runtime settings`_ assembled from several
sources. The *Publisher convenience functions* are the normal entry
points for using Docutils as a library.

.. _PEP 258: ../peps/pep-0258.html


Publisher Convenience Functions
===============================

There are several convenience functions in the ``docutils.core`` module.
Each of these functions sets up a `docutils.core.Publisher` object,
then calls its ``publish()`` method. ``docutils.core.Publisher.publish()``
handles everything else.

See the module docstring, ``help(docutils.core)``, and the function
docstrings, e.g., ``help(docutils.core.publish_string)``, for details and
a description of the function arguments.

.. TODO: generate API documentation with Sphinx and add links to it.


publish_cmdline()
-----------------

Function for custom `command-line front-end tools`_
(like ``tools/rst2html.py``) or "console_scripts" `entry points`_
(like `core.rst2html()`) with file I/O.
In addition to writing the output document to a file-like object, it also
returns the document as a `str` instance (or as a `bytes` instance for
binary output document formats).

.. _command-line front-end tools: ../howto/cmdline-tool.html
.. _entry points:
   https://packaging.python.org/en/latest/specifications/entry-points/


publish_file()
--------------

For programmatic use with file I/O.
In addition to writing the output document to a file-like object, it also
returns the document as a `str` instance (or as a `bytes` instance for
binary output document formats).

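
The following is a minimal sketch of programmatic file I/O with
``publish_file()``. It assumes that Docutils is installed and that the
``source_path``, ``destination_path``, and ``writer_name`` keyword
arguments are accepted by your release (check
``help(docutils.core.publish_file)``; newer releases may prefer a
differently named writer argument). The helper name
``render_file_to_html()`` and the file names are hypothetical. The sketch
reads the result back from the destination file instead of relying on
the return value:

.. code:: python

    import tempfile
    from pathlib import Path

    from docutils.core import publish_file


    def render_file_to_html(rst_text):
        """Write `rst_text` to a temporary file, publish it, return the HTML."""
        with tempfile.TemporaryDirectory() as tmp:
            source = Path(tmp) / "example.rst"    # hypothetical file names
            target = Path(tmp) / "example.html"
            source.write_text(rst_text, encoding="utf-8")
            # Assumption: "html" selects an HTML writer in this release.
            publish_file(source_path=str(source),
                         destination_path=str(target),
                         writer_name="html")
            # The output file is encoded with the default output encoding
            # ("utf-8", unless overridden by runtime settings).
            return target.read_text(encoding="utf-8")

For example, ``render_file_to_html("Hello, *world*!\n")`` should return a
complete HTML document as a `str` instance.
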

publish_string()
----------------

For programmatic use with _`string I/O`:

Input
  can be a `str` or `bytes` instance.
  `bytes` are decoded with input_encoding_.

Output
  * is a `bytes` instance, if output_encoding_ is set to an encoding
    registered with Python's "codecs_" module (default: "utf-8"),
  * a `str` instance, if output_encoding_ is set to the special value
    ``"unicode"``.

.. Caution::
   The "output_encoding" and "output_encoding_error_handler"
   `runtime settings`_ may affect the content of the output document:
   Some document formats contain an *encoding declaration*,
   some formats use substitutions for non-encodable characters.

   Use `publish_parts()`_ to get a `str` instance of the output document
   as well as the values of the output_encoding_ and
   output_encoding_error_handler_ runtime settings.

*This function is provisional* because in Python 3 the name and behaviour
no longer match.

.. _codecs: https://docs.python.org/3/library/codecs.html


publish_doctree()
-----------------

Parse string input (cf. `string I/O`_) into a `Docutils document tree`_
data structure (doctree). The doctree can be modified, pickled and
unpickled, etc., and then reprocessed with `publish_from_doctree()`_.


publish_from_doctree()
----------------------

Render from an existing `document tree`_ data structure (doctree).
Returns the output document as a memory object (cf. `string I/O`_).

*This function is provisional* because in Python 3 the name and behaviour
of the *string output* interface no longer match.


publish_programmatically()
--------------------------

Auxiliary function used by `publish_file()`_, `publish_string()`_,
`publish_doctree()`_, and `publish_parts()`_.
Applications should not need to call this function directly.


.. _publish-parts-details:

publish_parts()
---------------

For programmatic use with string input (cf. `string I/O`_).
Returns a dictionary of document parts as `str` instances. [#binary-output]_
Dictionary keys are the part names.
Each Writer component may publish a different set of document parts,
described below.

Example: post-process the output document with a custom function
``post_process()`` before encoding with user-customizable
encoding and errors ::

    from docutils.core import publish_parts

    def publish_bytes_with_postprocessing(*args, **kwargs):
        # `post_process()` is a user-supplied function (str -> str).
        parts = publish_parts(*args, **kwargs)
        out_str = post_process(parts['whole'])
        return out_str.encode(parts['encoding'], parts['errors'])

There are more usage examples in the `docutils/examples.py`_ module.

.. _docutils/examples.py: ../../docutils/examples.py
.. _ODT: ../user/odt.html


Parts Provided By All Writers
`````````````````````````````

_`encoding`
    The `output_encoding`_ setting.

_`errors`
    The `output_encoding_error_handler`_ setting.

_`version`
    The version of Docutils used.

_`whole`
    Contains the entire formatted document. [#binary-output]_

.. [#binary-output] Output documents in binary formats (e.g. ODT_)
   are stored as a `bytes` instance.


Parts Provided By the HTML Writers
``````````````````````````````````

HTML4 Writer
^^^^^^^^^^^^

_`body`
    ``parts['body']`` is equivalent to parts['fragment_']. It is
    *not* equivalent to parts['html_body_'].

_`body_prefix`
    ``parts['body_prefix']`` contains::