Working with DocBook

Steven J. Zeil

Old Dominion University, Dept. of Computer Science

Table of Contents

1. Getting & Installing the Toolset
1.1. Setting Up a New Document Set
2. Editing DocBook Documents
2.1. Linking in DocBook
2.2. Source Code Highlighting
2.3. Mathematics
3. Building a Document Set
3.1. Checking Your Documents
4. Changing Appearances
4.1. Overriding the CSS Rules
4.2. Overriding Javascript
4.3. Overriding Graphics

This document describes the use of the tool set to convert course documents written in DocBook into web pages.

1. Getting & Installing the Toolset

The tool set can be run on Unix (Solaris), Linux, or Windows machines.[1] Requirements are:

  • Java 1.6 or later

  • Apache ant - ant is a multi-platform build manager. It serves much the same purpose as the Unix/Linux make command, but allows most projects to be built without writing commands in a (system dependent) command shell language.

  • A copy of the tool set

    • If you are working on our Solaris network, you can simply use my copy at /home/zeil/usr/local/DocBook/.

    • If you are working on a Linux system, make a copy of that same directory on your Linux system.

    • On a Windows system, you will also need a copy of that directory. You might find it easiest to "shadow" the Solaris path by creating, on your C: drive, a folder named "home", then inside that a folder named "zeil", and so on.

In all of the subsequent examples, I will assume that you are using a tool set at /home/zeil/usr/local/DocBook/. If you have opted to place it elsewhere, you will need to adjust paths accordingly.

1.1. Setting Up a New Document Set

In any convenient directory, create your document directories with your source documents. You will need to create a site map and an outline for the document set.

Finally, copy build.xml from /home/zeil/usr/local/DocBook/tools/ into that directory. Use a text editor to open up build.xml. If necessary, change the docbook, sitemap, and outline properties near the start of that file.
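
As a rough sketch of what those three properties might look like, here is a hypothetical excerpt. The build.xml shipped with the tool set may declare them differently: the attribute forms, the project name, and the file name course.outline are assumptions made for illustration only.

```xml
<!-- Hypothetical excerpt; check your own copy of build.xml for the
     real property definitions before editing. -->
<project name="course">
  <!-- Where the tool set lives (change this if you installed it elsewhere) -->
  <property name="docbook" location="/home/zeil/usr/local/DocBook"/>
  <!-- Site map and outline for this document set (outline name is assumed) -->
  <property name="sitemap" value="course.sitemap"/>
  <property name="outline" value="course.outline"/>
</project>
```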

2. Editing DocBook Documents

DocBook is an XML-based language. As such, you could edit it with any text editor, but editing large XML documents that way gets very finicky and would also require a pretty thorough understanding of the intricacies of DocBook.

As an alternative, the tool set includes a GUI-based editor, xxe, that "understands" DocBook and can lead you through the options available to you.

To run this editor, give the command

/home/zeil/usr/local/DocBook/tools/xxe-perso-4_5_2/bin/xxe documentName.dbk

The editor is a Java program. With appropriate changes to the path in that command, this should work under Unix, Linux, or Windows.

It's definitely worth putting in some time to go through the "Getting Started" portion of the xxe help and practicing on a few existing DocBook documents.

2.1. Linking in DocBook

DocBook actually provides a variety of linking mechanisms, but xxe does not offer all of them as options to insert into your documents. It's when you are doing linking that you are likely to be most aware of the fact that DocBook is, underneath it all, XML. Even more important than the specific tags used to support linking are the attributes:

xml:id

Like the id attribute in modern HTML, this can be attached to any element to serve as an anchor - a mnemonic name for that location within the document.

linkend

Used to link to an xml:id within the same document (not necessarily the same web page when the document is being converted to multi-page formats such as "pages" or "slides"). This is rather like the HTML usage href="#...", but with linkends you do not give the '#'.

xlink:href

Used to link to web URLs outside the document set. Essentially identical to the HTML href.

targetdoc

Used to link to other documents within the same document set. You name the document you want to link to using the same targetdoc that you assigned to it in the sitemap. The DocBook tools will use the information in the sitemap to determine the equivalent URL to insert into any generated web pages.

targetptr

Can be used in conjunction with a targetdoc to specify a location (an xml:id) within the target document that you want to link to.

For consistency, the same attributes are used, without the namespaces, in other linking contexts within this tool set. Both the course outline and the html output format allow the use of href to link outside of the document set and targetdoc/targetptr to link within the document set.

The actual link forms that you can insert into your documents with xxe are links and olinks.

link

A link is used for links that do not refer to other documents within the document set. It can be used as well for documents within the document set that do not have their own targetdoc entry in the sitemap (typically, source code pages or pages of HTML that are copied "as is" rather than transformed via the HTML output form).

A link takes the linkend attribute to link within the same document. It takes the xlink:href attribute to link to anything else.

The easiest way to create a link in xxe is to highlight the text that you want to use as a link and use the "Convert to link" button in the toolbar. Use the attribute boxes in the right column to set the values of the linkend or the xlink:href attributes.
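
If you look at the underlying XML, the two kinds of link might appear roughly as in the following sketch. The section titles, the xml:id value "setup", and the URL are made up for illustration; only the link element and the xml:id, linkend, and xlink:href attributes come from DocBook itself.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<article xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0">
  <title>Linking example</title>
  <!-- xml:id gives this section a name that other elements can point to -->
  <section xml:id="setup">
    <title>Setup</title>
    <para>Install the tools first.</para>
  </section>
  <section>
    <title>Usage</title>
    <para>
      <!-- linkend: a location inside this same document (no '#') -->
      See <link linkend="setup">the setup instructions</link>.
      <!-- xlink:href: anything outside the document set -->
      The build is driven by
      <link xlink:href="http://ant.apache.org/">Apache Ant</link>.
    </para>
  </section>
</article>
```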

olink

An olink links to other documents within the document set that have been assigned a targetdoc name in the sitemap. The advantage offered by olinks is that you don't need to know exactly where the web pages of the target document will be located, which depends both on how the directories were arranged and on the output form(s) selected for that document.

An olink uses the targetdoc and, optionally, the targetptr attributes to name the location to which the link points.

There are two ways to create an olink in xxe:

  • Select a phrase to serve as the linking text. Hit ^T or select the Convert button to wrap this text in an element. Type "olink" (actually, you only need to type the first couple of letters, until the list of options narrows down to the one you want) and hit Enter. Use the attribute boxes in the right column to set the values of the targetdoc and targetptr attributes.

  • With no text selected, position the text cursor where you wish to place the link. Hit ^I or select the Insert button to insert an element at that point. Type "olink" (Actually, you only need to type the first couple of letters till the list of options narrows down to the one you want.) and hit Enter. Use the attribute boxes in the right column to set the values of the targetdoc and targetptr attributes.

    Because you have not supplied any text to serve as the actual link, when the olink is processed the tools will consult the sitemap, then fetch the title of the target document and insert that title as the text for the link.
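
In the XML source, the two forms of olink described above might look something like this sketch. The targetdoc name "syllabus" and the targetptr value "grading" are hypothetical; use the names assigned in your own sitemap and documents.

```xml
<para xmlns="http://docbook.org/ns/docbook">
  <!-- Form 1: you supply the link text yourself -->
  Late work is covered in the
  <olink targetdoc="syllabus" targetptr="grading">grading policy</olink>.
  <!-- Form 2: empty olink; the tools fill in the target document's title -->
  For everything else, see <olink targetdoc="syllabus"/>.
</para>
```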

2.2. Source Code Highlighting

The usual DocBook element for source code is the programlisting. One of the optional attributes for a programlisting is the language. If you set the language attribute to cpp or to java, then the source code will be highlighted accordingly.

Here, for example, is a programlisting with no language value set.

#include <iostream>

using namespace std;
/*  A simple
    program */
int main()
{
  cout << "Hello World!" << endl;
  return 0; // Zero denotes a normal termination
}

and here is the same listing with a language attribute of cpp:

#include <iostream>

using namespace std;
/*  A simple
    program */
int main()
{
  cout << "Hello World!" << endl;
  return 0; // Zero denotes a normal termination
}
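
In the DocBook source, the highlighted version might be written roughly as follows. This is a sketch: wrapping the code in a CDATA section is one common way to avoid escaping the < characters by hand, and xxe may serialize the listing differently.

```xml
<programlisting xmlns="http://docbook.org/ns/docbook"
                language="cpp"><![CDATA[#include <iostream>

using namespace std;
/*  A simple
    program */
int main()
{
  cout << "Hello World!" << endl;
  return 0; // Zero denotes a normal termination
}]]></programlisting>
```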

2.3. Mathematics

DocBook contains elements that serve as "wrappers" for mathematics but does not, itself, provide for formatting and display of mathematical expressions. To create mathematical formulae, the tool set supports ASCIIMath, for example:

` \sum_{i=0}^n i = (n(n+1))/(2) `

To activate this support, your document must have at least one element, anywhere in the document, that has a role attribute of "amath". I usually put this on the DocBook wrapper elements for mathematics such as informalequation or inlineequation.

A typical sequence for creating a line of mathematics, then, is to insert an informalequation just after a paragraph. xxe will automatically insert a mathphrase inside the informalequation. Within the mathphrase you can type text. Type the reverse apostrophe (the backquote, found under the ~ key on most keyboards), then one or more spaces, then the equation you want (in the ASCIIMath format), then one or more spaces, then close with another reverse apostrophe.
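
The resulting XML might look roughly like this sketch, which assumes that the role value is spelled amath and uses the sum of the first n integers as a sample formula.

```xml
<!-- The role attribute switches on ASCIIMath processing (assumed value) -->
<informalequation xmlns="http://docbook.org/ns/docbook" role="amath">
  <!-- backquote, space, formula, space, backquote -->
  <mathphrase>` \sum_{i=0}^n i = (n(n+1))/(2) `</mathphrase>
</informalequation>
```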

3. Building a Document Set

Just cd to the directory where you have placed build.xml and give the command

ant

to build the document set. This may take several minutes the first time you do it. After that, successive invocations of ant will usually rerun only the commands required by source files that have changed since the most recent build.

After the first time that ant has been run, you can also request rebuilds of specific documents with the command

ant documentName

where the documentName matches the targetdoc name given for the desired document in the sitemap.

Finally, you can clean up the generated documents, returning to an almost pure input state, by the command

ant clean

3.1. Checking Your Documents

Obviously, as you modify your document set, things can go wrong.

The ant build process generates a steady stream of messages, some pure information and some errors. You can capture these into a file by using the -l option to write all messages to a log file, e.g.:

ant -l build.log

Unfortunately, that hides the messages from you while the build is running. On a Unix/Linux system, therefore, I prefer

ant 2>&1 | tee build.log

Error Messages

Watch for messages indicating that a file did not exist or could not be read, particularly if you have just made changes to the sitemap.

Most of all, watch for error messages indicating that an Olink was "unresolved". This could mean that you misspelled a targetdoc entry in an olink, or that you forgot to list a document in the sitemap.

Use a Link Checker

Once the document set has been constructed, I recommend using a link checker such as Xenu's Link Sleuth on the documents. It's best to do this before you publish the documents to your website. The reason, besides simply avoiding embarrassment, is that link checkers may have problems with any security you have put on your website, so it's best to do this checking on the file set within your working directory.

As a convenient target for a link checker, a file is created in the output directory named sitemap.html with a link to every output form of every document named in the course.sitemap file.

4. Changing Appearances

As DocBook is converted into the page/pages/slides output formats, the conversion process mainly deals in the order in which things appear. The actual appearances are generally left to CSS style sheets. For example, if you were, in one of your documents, to create a program listing:

int x = 0;

and then look at the HTML generated from that, you would probably see something like

<pre class="programlisting">
<b class="hl-keyword">int</b> x = <span class="hl-number">0</span>;
</pre>

There's nothing directly in that HTML that indicates that the keyword "int" should appear in bold, that program listings appear on a grey background, etc. Those decisions are made via CSS.

Most DocBook elements are translated into a basic HTML element with a class attribute that refers back to the original DocBook.

When you build the document set, a directory named "shared" is created within your output directory. Into this directory is copied a basic set of CSS stylesheets, graphics (e.g., navigation icons), and Javascript files.

Most important of these are docbook.css, the basic set of instructions for rendering HTML generated from DocBook, and slides.css, a set of modifications to the docbook.css rules for slides (mainly larger fonts and allowance for a background graphic).

4.1. Overriding the CSS Rules

There are several ways in which to override the default appearances I have set up.

  1. In your top directory (where you have your sitemap and outline), create your own shared directory. Any file that you place in there that has the same name as one of the defaults will be copied into your output area instead. Thus you can, if you wish, override the entire set of rules in docbook.css by providing your own file with the same name.

  2. Each DocBook output form also attempts to load, from the shared directory, a css file with the same name as that output form: page.css, pages.css, slides.css. You can provide any of these in your own shared directory.

  3. Each directory named in the sitemap can provide a file overrides.css that will be used by all documents in that directory. Thus you can provide a common set of rules for a small handful of related documents.

  4. Finally, each document documentName.dbk can be given, in its own directory, a file documentName.css to make document-specific changes.

Most documents won't use all of these mechanisms. In order to avoid false error messages from link checkers, the actual code to load these files is inserted into web pages only if those files exist at the time that you build the document set. Keep in mind, then, that if you decide to add a new .css file for the first time, you will need to rebuild all the affected documents before they start using it.

4.2. Overriding Javascript

There's not, at the present, a whole lot of Javascript used in this toolset. It's used

  • to provide the ASCIIMath support, and

  • on the mailbox/communications link at the bottom of the page (and the main reason it's used there is simply to encode the email address as it actually appears in the HTML to prevent spammers from harvesting your email address).

If you need to add new Javascript, then the rules are exactly the same as for the CSS files above. Just replace ".css" by ".js".

4.3. Overriding Graphics

The shared directory is also used to hold the graphics for the navigation icons, the callout marks, and a few other things. So, if you decide you don't like the look of the prev/next pointers and want to use something else, just place your own versions in your own shared directory.



[1] Windows use is currently limited to editing documents. I hope to support building document sets on Windows soon.