Documents are grouped into directories. In general, a given directory can hold only a single document that will be published in "pages" form and a single document (possibly the same one) that will be published in "slides" form. Any number of "page", "pdf", or source code documents may occur in a directory.
Typically, a directory name will reflect the name of the pages/slides document that it contains. For example, a directory named "overview" might contain a DocBook file, overview.dbk, together with any graphics files, stylesheets, or other auxiliary content that is used to prepare the final document. Let's suppose that this overview document also links to some C++ source code, foo.cpp, and to an assignment file, asst.html. Let's further suppose that we decide to publish the overview in all available output forms. Then the relation between the input and output files is given by
Input File | Output File | Description |
---|---|---|
overview/overview.dbk | overview/page/overview.html | The "page" output format |
overview/overview.dbk | overview/pages/index.html | The opening page of the multi-page output for the "pages" output form. |
overview/overview.dbk | overview/slides/index.html | The opening page of the multi-page output for the "slides" output form. |
overview/overview.dbk | overview/page/overview.pdf | The "pdf" output format |
overview/foo.cpp | overview/foo.cpp.html | Highlighted version of the source code |
overview/asst.html | overview/html/asst.html | The "html" output format |
The table above is over-simplified in one respect. There will actually be two separate directories named "overview". One is the original input directory and the other is a copy of that directory created in an output area. In general, document conversion leaves the input directories untouched.
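The mapping in the table can be summarized as a small lookup. The following Python sketch is only an illustration derived from the rows above, not the conversion tool's actual code; the names FORM_PATTERNS, output_path, and source_output_path are hypothetical. All paths are relative to the output area.

```python
# Output locations per form, transcribed from the table above.
# {dir} is the directory name, {doc} is the document's base name.
# (Assumption: these patterns generalize beyond the "overview" example.)
FORM_PATTERNS = {
    "page":   "{dir}/page/{doc}.html",
    "pages":  "{dir}/pages/index.html",
    "slides": "{dir}/slides/index.html",
    "pdf":    "{dir}/page/{doc}.pdf",
    "html":   "{dir}/html/{doc}.html",
}


def output_path(directory: str, doc: str, form: str) -> str:
    """Return the output path for one output form of a document."""
    return FORM_PATTERNS[form].format(dir=directory, doc=doc)


def source_output_path(source_file: str) -> str:
    """Highlighted source code keeps its full file name and gains '.html'."""
    return source_file + ".html"


if __name__ == "__main__":
    for form in FORM_PATTERNS:
        print(form, "->", output_path("overview", "overview", form))
    print("source ->", source_output_path("overview/foo.cpp"))
```

A table-driven lookup keeps the per-form rules in one place, which makes it easy to compare them against the table row by row.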
The thing that determines which documents are included in the set and what output forms should be used is the site map, an XML file that lists the directories in which documents are stored, the documents stored there, and the output forms desired for each document.
The site map is normally called "course.sitemap" and looks something like this:
    <?xml version="1.0" encoding="utf-8"?>
    <targetset>
      <targetsetinfo>
        CS 250
      </targetsetinfo>
      <sitemap home="index.html" email="cs250@cs.odu.edu">
        <dir name="cs250Documents">
          <dir name="Directory">
            <document targetdoc="topics">
              <form>page</form>
            </document>
            <document targetdoc="info">
              <form>page</form>
            </document>
            <document targetdoc="buttons">
              <form>html</form>
            </document>
          </dir>
          <dir name="syllabus">
            <document targetdoc="syllabus">
              <form>page</form>
            </document>
          </dir>
          <!--- ============ Lectures =============== -->
          <dir name="cppProgramStructure">
            <document targetdoc="cppProgramStructure">
              <form>slides</form>
              <form>page</form>
            </document>
          </dir>
          <dir name="arrays">
            <document targetdoc="arrays">
              <form>slides</form>
              <form>page</form>
            </document>
          </dir>
          ...
The text of the targetsetinfo element is used in a few places as part of a title, to identify the course.
The home attribute contains a URL that all documents will link back to as a home page for the course. This might be a relative URL to a document inside the document set (e.g., the topics page) or, if the course will be published on BlackBoard or another LMS, the address of the course on that system.
This email address is used to construct the messaging link at the bottom of each page. The purpose of providing such links is so that, when students send messages about a lecture, assignment, or anything else, the URL of the web page they are asking about can be automatically copied into their message.
This introduces a directory named "cs250Documents". Documents and other directories can be nested inside this one. The top-level directory named in the sitemap is a bit special. It gives a name to the directory within which all output forms are written. It also provides a name for the zip file that will eventually contain all the outputs and that can be published to a website.
This introduces an inner directory. For the input, this is a directory at the same level as the sitemap file. In the output, this appears inside the topmost directory. For example, if we had stored the sitemap at ...
This introduces the first of three documents that are stored in this directory. Within the document element there can be one or more form elements. Each names an output form that we wish to generate for that document. In this example, we are defining three documents. Each will be published as a single web page, but the input forms differ. The first two documents are created from inputs ... The "topics" document is actually something of a special case. The DocBook ...
This introduces another document. It will be created from inputs ...
This is actually a more typical entry. It introduces a document (a set of slides for a lecture) that is produced in two output forms. One is the multi-page set of slides and the other is a single web page (for printing). When a document is declared to have multiple output forms, a couple of things need to be noted: ...
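To make the sitemap structure described above concrete, here is a minimal Python sketch that reads a sitemap and lists each document with its requested output forms. It is an illustration only, not the actual conversion tool; the function name list_documents and the shortened, fully closed sample sitemap are assumptions made for this example. Directory paths are reported as they would nest in the output, with inner directories placed inside the topmost one.

```python
import xml.etree.ElementTree as ET

# A shortened, fully closed variant of the example sitemap (hypothetical sample).
SITEMAP = """<targetset>
  <targetsetinfo>CS 250</targetsetinfo>
  <sitemap home="index.html" email="cs250@cs.odu.edu">
    <dir name="cs250Documents">
      <dir name="Directory">
        <document targetdoc="topics"><form>page</form></document>
        <document targetdoc="buttons"><form>html</form></document>
      </dir>
      <dir name="arrays">
        <document targetdoc="arrays"><form>slides</form><form>page</form></document>
      </dir>
    </dir>
  </sitemap>
</targetset>"""


def list_documents(sitemap_xml):
    """Yield (directory path, targetdoc, [forms]) for every document entry."""
    root = ET.fromstring(sitemap_xml)
    sitemap = root.find("sitemap")

    def walk(dir_elem, parents):
        # Build the nested path of this <dir> from its ancestors' names.
        path = parents + [dir_elem.get("name")]
        for doc in dir_elem.findall("document"):
            forms = [f.text.strip() for f in doc.findall("form")]
            yield "/".join(path), doc.get("targetdoc"), forms
        for sub in dir_elem.findall("dir"):
            yield from walk(sub, path)

    for top in sitemap.findall("dir"):
        yield from walk(top, [])


if __name__ == "__main__":
    for directory, name, forms in list_documents(SITEMAP):
        print(directory, name, forms)
```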
It's worth pointing out that nothing in the above sitemap makes any explicit mention of source code documents. That's because source code documents tend to be numerous and are likely to have identical names. Source code documents are handled (almost) implicitly. For every directory named in the site map (whether that directory contains any explicit document entries or not), that directory will be scanned for files ending in ".h", ".cpp", or ".java". Each such file located will be converted into a web page.
Note that, if you should have a directory that contains only source code but no input DocBook or HTML documents, you would list that directory in the sitemap in order to collect and convert the source code.
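The implicit scan for source code could look roughly like the following Python sketch. This is a guess at the behavior described above, not the tool's real implementation: the name find_source_files is hypothetical, and the sketch assumes that only files directly inside the named directory are considered (whether the real scan descends into subdirectories is not stated here).

```python
from pathlib import Path

# The three source-code extensions named in the text.
SOURCE_SUFFIXES = {".h", ".cpp", ".java"}


def find_source_files(directory):
    """Return the source files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )


if __name__ == "__main__":
    # Scan the current directory as a simple demonstration.
    for source in find_source_files("."):
        print(source, "->", str(source) + ".html")
```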
It is also possible to have HTML pages that are not converted in any way. If the input directories contain a *.html file that is not listed in the sitemap, that file simply gets copied to the corresponding output directory. I use this, for example, to provide an index.html for the entire document set that consists of a frameset showing a navigation table (buttons.html) on one side and the topics page on the other.
This is actually not a special case, by the way. All files in a named input directory get copied to the corresponding output directory and to the output form directories. That's how graphics, stylesheets, and other related content stay with your web pages.
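The copying rule can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the tool's actual code: the function name copy_directory is hypothetical, only regular files directly inside the input directory are copied, and the output form directories are assumed to be subdirectories named after each form.

```python
import shutil
from pathlib import Path


def copy_directory(input_dir, output_root, forms):
    """Copy each file in input_dir to its output copy and to every form subdirectory.

    Returns the list of target directories that received the files.
    The input directory itself is never modified.
    """
    src = Path(input_dir)
    out = Path(output_root) / src.name
    targets = [out] + [out / form for form in forms]
    for target in targets:
        target.mkdir(parents=True, exist_ok=True)
        for f in src.iterdir():
            if f.is_file():
                shutil.copy2(f, target / f.name)
    return targets
```

For example, calling copy_directory("overview", "output", ["pages", "slides"]) would place a copy of each file from overview in output/overview, output/overview/pages, and output/overview/slides.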