Jabber.org's Quick Guide to DocBook

Eliot Landrum

Jabber.org

eliot@landrum.cx

12/14/2000


Introduction

While Jabber.org welcomes documents in any format (the content is what matters!), we prefer documents to be formatted using DocBook SGML. DocBook allows conversion to many formats (PDF, HTML, ASCII text, etc.) and frees the writer from trying to maintain a consistent formatting style. DocBook SGML is easy to learn and can be written with any plain text editor (some WYSIWYG editors exist, but I've found it easier just to use vim!). This document steps through the basics of starting a DocBook document from scratch and processing it into other formats (on Linux).

As can be seen, this guide is very brief. For more information about DocBook, read the online version of DocBook: The Definitive Guide, published by O'Reilly & Associates.

If you're familiar with HTML, you're already well on your way to understanding DocBook SGML. DocBook is a markup language like HTML, just with different elements (tags). Ready to jump into some of the fundamental elements?


Fundamental Elements

Document Type

Before any DocBook SGML can be started, the document type must be stated:

<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
				
This specifies that the document type is going to be an article and that we're using DocBook version 3.1. At the time of this writing, DocBook SGML/XML 4.1 is the most current version. We do not use the XML version or the 4.1 version yet because not all of the current tools can handle the updates.

The entire document must be wrapped within one main element. DocBook has several different main elements, depending on the size of the document. The most commonly used document type for Jabber documents is article. After the DOCTYPE is stated, the actual DocBook elements can be started.


Metadata

First, some metadata must be specified. This usually includes the document title, author(s), contributors, and publication date. All of this information is enclosed within the artheader element. The metadata for this document:

<artheader>
	<title>Jabber.org's Quick Guide to DocBook</title>
	<author>
		<firstname>Eliot</firstname><surname>Landrum</surname>
		<affiliation>
			<orgname>Jabber.org</orgname>
			<address><email>eliot@landrum.cx</email></address>
		</affiliation>
	</author>

	<pubdate>12/14/2000</pubdate>
</artheader>
				


The Content

Now the good stuff can begin! All text must be enclosed within sections. Each section is marked with, originally enough, the section element. Every section must have a title, marked with the title element. Sections can also have subsections, simply by enclosing sections inside other sections. Within a section, all text must be enclosed in para elements. Here's a simple section with an enclosed subsection:

<section>
	<title>Section 1</title>
	<para>
		This is section 1 with a nice little paragraph.
	</para>
	<section>
		<title>Section 1.1</title>
		<para>
			Here we have a subsection and the text of it.
		</para>
	</section>
</section>
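
Putting the pieces so far together, a complete minimal article might look like the sketch below. The title and paragraph text are placeholders made up for illustration; the DOCTYPE and element names are the ones introduced above.

```sgml
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<article>
	<!-- Metadata comes first -->
	<artheader>
		<title>Placeholder Title</title>
	</artheader>

	<!-- Then the content, always inside sections -->
	<section>
		<title>Placeholder Section</title>
		<para>
			Placeholder text.
		</para>
	</section>
</article>
```

Note that the article element wraps everything after the DOCTYPE line, including the artheader.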
				


Linking

You can link to various parts of the document by using the link element. First, you must specify an ID for the element which you would like to link to. Nearly all the elements in DocBook may have the ID attribute. Each ID must be unique. The following is an example of using the ID attribute and the link element.

<section id="section1">
	<title>Section 1</title>
	<para>
		This is section 1 with a nice little paragraph.
	</para>
</section>
<section id="section2">
	<title>Section 2</title>
	<para>
		This is section 2. <link linkend="section1">Section 1</link>
		provides more in-depth information.
	</para>
</section>
				

To link to a URL, use the ulink element. Simply provide the URL in the url attribute:

<ulink url="http://www.jabber.org">Jabber.org</ulink>
				


Lists

A bulleted list:

<itemizedlist>
	<listitem><para>
		Item 1
	</para></listitem>
	<listitem><para>
		Item 2
	</para></listitem>
</itemizedlist>
				

A variable list (useful for any definition lists):

<variablelist>
	<title>List of Variables</title>
	<varlistentry>
		<term>Jabber</term>
		<term>ICQ</term>
		<term>AIM</term>
		<listitem><para>
			Instant messaging systems.
		</para></listitem>
	</varlistentry>
	<varlistentry>
		<term>DocBook SGML</term>
		<listitem><para>
			A markup language for creating structured documents.
		</para></listitem>
	</varlistentry>
</variablelist>
				


Extras

In DocBook there is no "bold" or "italic" formatting; instead, words are marked up according to what they mean. For instance, a command would be wrapped with the command element. Commonly used formatting elements are: wordasword, emphasis, varname, literal, filename and replaceable.
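
As a rough illustration, a paragraph using a few of these elements might look like the following. The sentence itself is invented for the example; only the element names come from the list above.

```sgml
<para>
	Open <filename>file.sgml</filename> with the <command>vim</command>
	command, replacing <replaceable>file</replaceable> with the name of
	your document. The DOCTYPE line <emphasis>must</emphasis> come first,
	and the main element is <literal>article</literal>.
</para>
```

How each element is rendered (bold, italic, monospace) is decided by the stylesheet at processing time, not by the writer.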


Processing

To output to various formats, the jade program is used. Red Hat users will need to install the following packages: docbook, sgml-common, stylesheets, openjade. If you use Emacs, grab the psgml package. Debian users can simply run apt-get install cygnus-stylesheets.

Now that the packages are installed, the SGML can be processed. To generate a single-page HTML document, run (on one line): jade -t sgml -d /usr/lib/sgml/stylesheet/dsssl/docbook/nwalsh/html/docbook.dsl -i html -V nochunks file.sgml > file.html. To generate a more book-like document, with multiple pages, run db2html file.sgml. To make the first page index.html, give the article element an ID of index (e.g. article id="index").
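
For instance, the opening of a document meant to produce index.html as its first page could be sketched as follows. The comment is only a stand-in for the real metadata and sections.

```sgml
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook V3.1//EN">
<article id="index">
	<!-- artheader and sections go here -->
</article>
```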

Cleaning the HTML: I often use tidy from the W3C to clean up the HTML output by jade. More precisely, tidy with the -im flags seems to work best.

Outputting a PDF or PS document is just as easy as outputting HTML. You may have to go through and make sure that your examples aren't destroyed in the process though. Simply run db2pdf file.sgml or db2ps file.sgml to generate the desired output.