Transform a DITA Package¶
The DITA Package Processor transforms a source DITA package into a materialized target package using a strict, deterministic pipeline.
The pipeline consists of:
- Discovery – observe and classify the source package
- Normalize – enforce structural invariants
- Planning – generate an explicit execution plan
- Execution – materialize content into a target directory
Filesystem mutation only occurs when explicitly enabled with --apply.
There is no in-place mutation.
There is no guessing.
There is no implicit execution.
The one command that matters¶
To transform a source DITA package into a materialized target package, run:
dita_package_processor run \
--package /path/to/source-dita-package \
--source-root /path/to/source-dita-package \
--output /path/to/target-package \
--apply
That’s the canonical invocation.
No extra flags.
No hidden defaults.
No accidental overwrites.
What each part actually does (no lies)¶
--package¶
This identifies the logical package root for discovery and planning.
It must contain:
index.ditamap- a resolvable main map
- well-formed XML
Nothing in this directory is mutated.
Ever.
--source-root¶
This is the filesystem truth used during execution.
Key rule (the one that bit you):
All paths in the execution plan are resolved relative to
--source-root.
If this is wrong, execution will “work” but find nothing.
This flag is mandatory for execute and required for run once execution is involved.
--output¶
This is the target directory where the processed package is written.
Rules enforced by the CLI:
- Required when
--applyis present - Must be writable
- Is created if missing
- Is never deleted implicitly
- Is never allowed to overlap the source tree
There is no in-place mutation.
This is intentional and non-negotiable.
--apply¶
This flag explicitly authorizes filesystem mutation.
Without it:
- discovery runs
- normalization runs
- planning runs
- execution runs in dry-run mode only
- no files are written
With it:
- execution switches to the filesystem executor
- actions are applied deterministically
- files are written into
--output - results are recorded in an execution report
No --apply means no writes. Always.
What actually happens internally (the real pipeline)¶
When you run the command above, the pipeline is:
Discovery
↓
Normalize
↓
Planning
↓
Execution
• sandboxed filesystem access
• explicit source-root resolution
• deterministic action dispatch
Important clarifications:
- There is no separate materialization phase anymore
- Layout resolution is encoded directly in the plan
- Collision prevention is enforced by:
- explicit target paths
- sandbox validation
- executor policy checks
Execution never invents paths.
It only executes what the plan describes.
What you get at the end¶
After a successful run with --apply, you have:
- A fully materialized DITA package in
--output - Deterministic directory layout
- Explicitly copied maps, topics, and media
- Normalized map hierarchy
- Optional semantic refactors applied
- A package that opens cleanly in Oxygen
Example:
oxygen /path/to/target-package/<main-map>.ditamap
No cleanup required.
No manual repair.
Dry run (strongly recommended first)¶
dita_package_processor run \
--package ./source \
--source-root ./source \
--output ./out
This:
- executes the full pipeline
- simulates execution
- validates every action
- proves path resolution
- writes no files
If this fails, applying would have failed too.
Writing an execution report¶
dita_package_processor run \
--package ./source \
--source-root ./source \
--output ./out \
--apply \
--report execution_report.json
The report records:
- every action
- execution status
- handler identity
- dry-run vs applied state
- failures and skips
- timing (
started_at,finished_at,duration_ms) - discovery counts (
maps,topics,media,missing_references,external_references)
It is a forensic artifact.
You can diff it, archive it, or feed it into CI.
Quick checks:
jq '{dry_run, summary, discovery}' execution_report.json
jq '{started_at, finished_at, duration_ms}' execution_report.json
jq '.results[] | select(.status == "failed") | {action_id, handler, error_type, error}' execution_report.json
For full usage examples, see Execution Report Guide.
The important meta-point¶
You did not build:
- a script
- a copier
- a batch hack
You built:
- a compiler-style pipeline
- with explicit contracts
- a validated plan
- a sandboxed executor
- and a boring execution phase
The CLI feels strict because it is.
Strict is what makes this safe at scale.