XML authoring refers to the use of XML for authoring technical documentation, books, and other types of technical publications and documents.
XML authoring differs from today’s dominant authoring paradigm, which is desktop publishing. Documentation departments can increase productivity and realize significant cost savings by switching to structured authoring with XML.[1][2]
XML stands for Extensible Markup Language. It is a text-based markup language derived from Standard Generalized Markup Language (SGML).
XML is used to store structured data, rather than to format or display information on a page. You can use XML to represent structured information for documents, books, data, manuscripts, and more.
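To illustrate the point that XML stores structure rather than presentation, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The sample document and its element names (`manual`, `chapter`, `step`) are assumptions made up for this example, not part of any standard. The same source is read once as data and once rendered as plain text.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are illustrative only.
SAMPLE = """<manual>
  <title>Coffee Maker Guide</title>
  <chapter id="setup">
    <title>Setup</title>
    <step>Fill the water tank.</step>
    <step>Insert a filter.</step>
  </chapter>
</manual>"""


def outline(xml_text):
    """Return (chapter id, chapter title, step count) for each chapter."""
    root = ET.fromstring(xml_text)
    return [
        (ch.get("id"), ch.findtext("title"), len(ch.findall("step")))
        for ch in root.findall("chapter")
    ]


def render_plain(xml_text):
    """One possible presentation of the same content: numbered plain text."""
    root = ET.fromstring(xml_text)
    lines = [root.findtext("title")]
    for ch in root.findall("chapter"):
        lines.append(ch.findtext("title"))
        for n, step in enumerate(ch.findall("step"), start=1):
            lines.append(f"  {n}. {step.text}")
    return "\n".join(lines)
```

Calling `outline(SAMPLE)` treats the document as data, while `render_plain(SAMPLE)` is just one of many formats that could be generated from the same source. The XML itself says nothing about fonts, numbering, or layout.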
XML is an important development for two reasons. First, it allows the flexible development of user-defined document types, providing a robust, non-proprietary, persistent, and verifiable file format for the storage and transmission of text and data both on and off the Web. Second, it removes the more complex options of SGML, making it easier to program. [3]

XML offers the following benefits for technical communicators:
An authoring paradigm presents technical writers with a particular view of the document model.
Writers use unstructured authoring to create content according to rules and approved styles described in style guides.
A style guide contains a documented approach on how the writing team is supposed to author content, including:
Adherence to the approved style guide is double-checked by editors. This manual process of ensuring style guide adherence is time-consuming.
Writers use desktop publishing tools for the unstructured authoring of documents. These tools, such as Microsoft Word, allow authoring and publishing from the same system. They integrate content and format, and their graphical user interface (GUI) is almost always What You See Is What You Get (WYSIWYG). Desktop publishing tools give writers control over content presentation and delivery.
Structured authoring is a publishing paradigm that defines and enforces consistent organization of information.
Structured authoring incorporates the following:
Structured authoring is a concept or paradigm.
XML is a technology that you can use to implement structured authoring. For XML, the structure and the legal elements and attributes of a document are defined in a Document Type Definition (DTD).
Today, the terms “structured authoring” and “XML” are often used interchangeably.
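The idea of a DTD defining the legal elements of a document can be sketched in a few lines of Python. Python's standard library parser does not validate against a DTD, so this example swaps in a small hand-written rule table as a stand-in. The `ALLOWED_CHILDREN` table and the element names are hypothetical, and a real DTD can express much more, such as element order, cardinality, and attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: parent element -> child elements it may contain.
# A real DTD would express this with <!ELEMENT ...> declarations.
ALLOWED_CHILDREN = {
    "procedure": {"title", "step"},
    "title": set(),
    "step": {"note"},
    "note": set(),
}


def structure_errors(xml_text):
    """Return a list of messages for elements that break the content model."""
    root = ET.fromstring(xml_text)
    errors = []
    for parent in root.iter():
        if parent.tag not in ALLOWED_CHILDREN:
            errors.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in ALLOWED_CHILDREN[parent.tag]:
                errors.append(
                    f"<{child.tag}> not allowed inside <{parent.tag}>"
                )
    return errors
```

A structured authoring tool performs this kind of check continuously while the writer types, which is what replaces the manual style-guide review described earlier.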

XML authoring offers multiple benefits for technical writers and communicators:
An XML authoring tool or structured authoring tool is a text editor that you can use with a markup language such as XML to “tag” content based on a predefined structure or set of rules laid down in a DTD.
Oxygen XML Editor is a great XML authoring tool that you can use to create XML files and XML documents. It offers multiple platforms for XML editing. The tool is renowned among developers as an advanced solution for technical authoring and development. Oxygen XML Editor features an advanced set of editing tools in addition to numerous other helpful tools.
Oxygen XML Editor includes features such as Web Help, XML Author, XML Editor, XML Web Author, and XML Developer. From simple authoring to development and editing, it supports all types of technical communication projects.
Notepad++ is a free text editor with a ready plugin for editing XML files. It helps users copy, paste, and highlight text in XML files, and it lets users work on multiple files at the same time. Notepad++ is written in C++, is built on the Scintilla editing component, and is distributed under the GPL license. It supports code formatting, code folding, syntax highlighting, and auto-completion for scripting, programming, and markup languages. In addition, Notepad++ uses color coding to differentiate content from code in an XML file.
One drawback of Notepad++ is that, out of the box, it does not check XML syntax or offer XML-specific code completion. To work with XML documents more fully, you can add the XML Tools plugin, which is based on libxml2.
Notepad++ helps users define Macros for applying bulk actions to multiple XML files. The tool also supports a ‘Pretty Print layout’ for defining, structuring, and organizing XML files.
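Pretty printing simply re-indents an XML file so that its nesting is visible. The following sketch shows the idea with Python's standard `xml.dom.minidom` module. It is not how Notepad++ or its plugins implement the feature, and the sample input is made up for illustration.

```python
from xml.dom import minidom

# Hypothetical single-line input, as often produced by export tools.
COMPACT = (
    "<topic><title>Setup</title>"
    "<body><p>Plug in the device.</p></body></topic>"
)


def pretty_print(xml_text, indent="  "):
    """Return the XML with one element per line and nested indentation."""
    return minidom.parseString(xml_text).toprettyxml(indent=indent)
```

Calling `print(pretty_print(COMPACT))` shows each nesting level indented by two spaces. This simple approach works best on input without existing line breaks, because `minidom` keeps whitespace that is already in the source.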
XML Notepad is an open-source editor for XML. It boasts a user-friendly interface for browsing and editing XML documents.
Some of the features supported by XML Notepad are:
In addition, the tool offers configurable fonts and colors, an integrated XML diff tool, and custom editors for values such as dates and times. It is one of the best tools for large XML documents. XML Notepad also provides users with XSD schema information and support for XInclude.
XML Notepad's toolbar buttons make it convenient to move nodes around the tree. It is one of the best tools for developers and technical writers alike, given that it provides IntelliSense for expected elements and values. [4]
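XInclude, mentioned above, lets one XML document pull in content from another, which is a common basis for content reuse. Below is a minimal sketch using Python's standard `xml.etree.ElementInclude` module. The `FRAGMENTS` dictionary and the `load_fragment` function are hypothetical stand-ins for files on disk, and the element names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "files"; a real project would read these from disk.
FRAGMENTS = {
    "warning.xml": "<note>Unplug the device before cleaning.</note>",
}

MAIN = """<topic xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Cleaning</title>
  <xi:include href="warning.xml"/>
  <p>Wipe the surface with a dry cloth.</p>
</topic>"""


def load_fragment(href, parse, encoding=None):
    """Loader callback: return an Element for XML includes, text otherwise."""
    if parse == "xml":
        return ET.fromstring(FRAGMENTS[href])
    return FRAGMENTS[href]


def resolve(xml_text):
    """Parse the document and replace each xi:include with its target."""
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=load_fragment)
    return root
```

After `resolve(MAIN)`, the `xi:include` element has been replaced by the `note` element from the fragment, so the warning is written once and can be reused in any number of topics.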
Even though the desktop publishing paradigm based on unstructured authoring is popular, it has many disadvantages.
For example, when employees are asked to create materials for a single presentation, each piece of content originates and resides in a different place in the organization. Over time, a lot of duplicate content is created and a lot of content becomes obsolete. Content that is scattered throughout the organization is difficult to find and difficult to maintain. Moreover, HR departments have to create and maintain training material for the different desktop publishing platforms preferred by employees.
This time-consuming, inefficient, and error-prone approach to content management is frustrating for individual employees and costly for organizations.
For these reasons, organizations are finding that structured content is a much more efficient and reliable way to generate, maintain and publish content.
According to Scott Abel's benchmarking survey published in 2012, 44% of companies were using structured XML content, and 81% of those companies were using DITA. According to DITAWriter, more than 770 companies were using DITA as of 2022. DITA is a popular XML-based authoring model for creating and publishing content.
Making the shift to structured authoring with XML does involve costs, such as XML editor software, training, expertise (in-house or outsourced), and process implementation. In the long term, however, the shift to XML provides multiple benefits, such as easier and cheaper document maintenance and new document creation. The benefits increase further when translation is factored into the equation.
Here are some frequently asked questions about XML.
SGML and XML are metalanguages, that is, languages for defining markup languages. HTML was defined as an application of SGML, and XHTML is an application of XML.
SGML is the “mother tongue”, and has been used for describing different document types, from transcripts of ancient manuscripts to technical documentation, patients’ medical records, and even musical notation. SGML is large and complex, and is overkill for most common office desktop applications.
XML is a simplified subset of SGML, designed to be easier to use over the Web, easier for you to define your own document types, and easier for programmers to write programs to handle them.
HTML, XHTML, and HTML5 are the markup languages most frequently used on the Web. Of these, only XHTML is strictly an XML application.
The OASIS Open Darwin Information Typing Architecture (DITA) is a standard XML-based architecture for representing documents. DITA provides architectural features for content modularity, content reuse, and controlled extension of document vocabularies in a way that ensures interoperability of DITA documents.
The DITA architecture was developed inside IBM for IBM technical publications and was donated to OASIS Open in 2004. DITA is an OASIS Open standard first published in 2005 and last updated in 2015 with the publication of version 1.3.
[1] Chiang (2009). Engineering Information Into Open Documents, 9-19. https://doi.org/10.4018/978-1-60566-246-6.ch002
[2] Hoelzer, Schweiger, Dudeck (2003). Transparent ICD and DRG Coding Using Information Technology: Linking and Associating Information Sources with the eXtensible Markup Language. J Am Med Inform Assoc, 10(5), 463-469. https://doi.org/10.1197/jamia.m1258
[3] Feng, Chang, Dillon (2002). A semantic network-based design methodology for XML documents. ACM Trans. Inf. Syst., 20(4), 390-421. https://doi.org/10.1145/582415.582417
[4] Liu, Hu, Takeichi (2005). An environment for maintaining computation dependency in XML documents. https://doi.org/10.1145/1096601.1096616