Product Roadmap
markdown-validator — as of 2026-03-20
This roadmap is driven by the findings in the Project Assessment and by real-world usage patterns observed in large documentation repositories.
Guiding principles
- Rule authors should not need to understand the engine to express intent. If a rule requires workflow inversion to express absence, the abstraction has failed.
- Silent wrong answers are worse than loud errors. Flat-DOM structural queries can silently return wrong results; this must be surfaced before v1.0.
- The architecture is the moat. The 4-layer design enables fast, targeted fixes. Each improvement should be local to one module.
Timeline overview
timeline
title markdown-validator Release Timeline
section Patch series
v0.2.x : NLTK download fix
: Rule 8 POS tag correction
: Rule 34 XPath correction
section Minor releases
v0.3.0 : Negation operator (negate flag)
: Not-contains operator
: Multi-result equality semantics fix
: Parallel directory scanning
v0.4.0 : Structured-DOM parser (section wrapping)
: YAML schema for rule sets
: VS Code extension (rule authoring)
section Major release
v1.0.0 : Stable public API guarantee
: Plugin system for custom operators
: Pre-built rule packs (DocFX, Hugo, Docusaurus)
v0.2.x — Patch series (current)
These are targeted bug fixes with no API or schema changes.
| Fix | File | Change |
|---|---|---|
| NLTK data download | README.md + pos.py |
Add nltk.download(quiet=True) guard to pos.py module init; add install step to README |
| Rule 8 POS bug | tests/fixtures/checkworkflow.json |
Change operation to "p3", value to "VB" |
| Rule 34 XPath | tests/fixtures/checkworkflow.json |
Change XPath to .//h2[last()-1] |
| Rules 28/29 inversion | tests/fixtures/checkworkflow.json |
Invert regex to ^([^0-9]) |
No API changes. No migration required.
v0.3.0 — Rule language enhancements
Target: Q2 2026
Negation operator
Add a negate: bool = False field to RuleModel. When True, the boolean result of
evaluate_rule is flipped before it is wrapped in ValidationResult. This eliminates
the entire class of brittle workflow-inversion patterns used to express absence.
flowchart LR
R[Rule evaluated] --> V{negate?}
V -->|False| T[Pass/Fail as-is]
V -->|True| F[Pass becomes Fail<br/>Fail becomes Pass]
T --> VR[ValidationResult]
F --> VR
Affected modules: domain/models.py, domain/evaluator.py
Migration: Existing rule files without negate default to false; fully backward compatible.
Not-contains operator (![])
A dedicated ![] operator for "does not contain". Eliminates the need to use [] with
workflow inversion for absence checks (e.g., "title must not contain the word 'guide'").
Affected modules: domain/operators.py (one new function + one registry entry)
Multi-result equality semantics
When an XPath expression returns multiple elements and the operation is ==, the current
all(truth) aggregation almost always produces the wrong result for real rules. Change
the aggregation to any(truth) for equality checks on multi-element result sets, and
document the behaviour explicitly.
Affected modules: domain/evaluator.py
Parallel directory scanning
Scanner.validate_directory() currently processes files sequentially.
Add an optional workers: int = 1 parameter that uses concurrent.futures.ThreadPoolExecutor
for parallel validation. Default remains sequential (backward compatible).
Affected modules: services/scanner.py
v0.4.0 — Structural improvements
Target: Q3 2026
Structured-DOM parser
The current markdown library produces flat HTML; all headings are direct children of
<body>. This prevents within-section XPath queries. Replace or augment the parser with
one that wraps headings and their content in <section> elements, enabling:
//section[h2[text()='Prerequisites']]//a
Options to evaluate: markdown-it-py + mdit-py-plugins; mistletoe; custom
post-processing of lxml tree.
Affected modules: infrastructure/parser.py (isolated — no other layers change)
YAML schema for rule sets
Provide a JSON Schema (or Pydantic model export) for rule-set files. Enable IDE
validation of rule files. Publish schema to docs/schema/ruleset.schema.json.
VS Code extension — rule authoring support
A thin VS Code extension that:
- Validates rule-set JSON files against the published schema
- Provides IntelliSense for flag and operator values
- Runs md-validate on the active file from the command palette
v1.0.0 — Stable release
Target: Q4 2026
Stable public API guarantee
Commit to semantic versioning for Scanner, ScanReport, RuleModel, and all Pydantic
models. Document breaking-change policy.
Plugin system for custom operators
Allow third-party packages to register operators via a markdown_validator.operators
entry point. This keeps the core small while enabling domain-specific checks
(e.g., terminology validators, link checkers).
Pre-built rule packs
Ship installable rule pack packages for common static site generators:
| Package | Targets |
|---|---|
markdown-validator-docfx |
DocFX metadata, ms.topic values, H1 conventions |
markdown-validator-hugo |
Hugo front-matter, archetypes, shortcode patterns |
markdown-validator-docusaurus |
Docusaurus metadata, sidebar_label, slug |
What is explicitly out of scope
- Real-time editor linting (language server protocol) — this is a CI/batch tool.
- Markdown auto-fixing — the tool reports; it does not rewrite content.
- Non-Markdown file types — scope is
.mdfiles only. - Cloud service / SaaS — the tool is intentionally local-first.