Map-first pre-processing
DITA-OT provides a map-first pre-processing option as an alternative to the
default preprocess
operation. The method, which was introduced in DITA-OT 2.5 as an experimental
feature, has since been improved and is ready for use in production scenarios. Map-first pre-processing provides
the same functionality as the default preprocess
, but takes a different
approach.
Whereas the default pre-processing routine handles both maps and topics at the same time, often switching back and forth between map operations and topic operations, the map-first approach only begins processing topics after nearly all map processing is complete. This simplifies the processing logic and creates cleaner module responsibilities, which makes it easier to process only those topics that are actually referenced after filtering, for example, or to only process the map to validate the map structure.
The current pre-processing architecture was established during the DITA 1.0 era when there were fewer DITA features that operated on the map level. Initially, the difference between processing modes was not that great. DITA 1.2 and 1.3 introduced many more map-level features, such as keys and key scopes, that make it difficult to reliably work with topics before all map features have been resolved.
The original pre-processing operation already handles many map operations first, but this was not the original design and requires regular refactoring to handle edge cases. The new map-first pre-processing is designed with this model in mind, improving the overall processing flow and making it more formal about the map-first model. The new model also takes advantage of hashed topic file names in the temporary directory, which simplifies many processing steps, and is better able to handle topics referenced outside of the map directory (that case has resulted in a variety of issues with the original model).
As of DITA-OT 4.2, the map-first pre-processing pipeline also supports additional subject scheme features.
How to use map-first pre-processing
To use (or test) map-first pre-processing, call the preprocess2
Ant target in your custom
transformation types instead of the preprocess
target.
For example, if you have a custom HTML5 transformation type named "myhtml", then you may have a plug-in extension that looks this:
<!-- Simple variant: set properties and call default HTML5 -->
<target name="dita2myhtml" depends="myhtml.init,dita2html5"/>
This type of extension is quite common, and is used to set default properties for your environment followed by a normal build to use those properties. As of DITA-OT 4.2, this approach will inherit the map-first pre-processing routine from the HTML5 transformation.
In earlier versions, youโd need to replace dita2html5
with the normal HTML5 steps, swapping
out preprocess
for preprocess2
:
<!-- Simple variant: set properties and call default HTML5 -->
<target name="dita2myhtml"
depends="myhtml.init,
html5.init,
build-init,
preprocess2,
html5.topic,
html5.map,
html5.css"/>
preprocess2
.Some custom transformation types already require you to repeat the default dependencies, in which case you
should already call preprocess
directly, as in the following:
<!-- More complex variant: add processing steps to default HTML5 -->
<target name="dita2myhtml"
depends="myhtml.init,
build-init,
preprocess,
local-extensions-after-preprocess,
html5.topic,
html5.map,
html5.css"/>
In such cases, the modification is much easier โ simply add a 2
to the existing
preprocess
target.
How to test in a production environment
In some cases, you may be responsible for maintaining transformation types that are actually run by many people
on your team or around a company. In this case, you likely need to maintain your existing transformation types
based on the backwards-compatible preprocess
modules, but also want to provide your colleagues
with a way to test their own documents using preprocess2
.
There are several ways to do this. One fairly straightforward approach would be to create a new custom
transformation type that is exactly the same, except for pre-processing. For example, if you have a local HTML
variant called myhtml
as above, instead of modifying that transformation directly, you could
create a second transformation type called myhtml-beta
that provides exactly the same support,
but with the new map-first pre-processing:
<!-- Original "myhtml" is not modified, used for production -->
<target name="dita2myhtml5" depends="myhtml.init,dita2html5"/>
<!-- "myhtml-beta" used to test and provide feedback on preprocess2 -->
<target name="dita2myhtml-beta"
depends="myhtml.init,
html5.init,
build-init,
preprocess2,
html5.topic,
html5.map,
html5.css"/>
Known limitations
The internal extension points that run before or after individual steps in the
original preprocess
pipeline (preprocess.*.pre/preprocess.*.post
) are not
available in the newer map-first pre-processing pipeline (preprocess2
), which is used in the PDF
and HTML Help transformations as of DITA-OT 3.0, and in HTML5 and Normalized DITA output as of DITA-OT 4.2.