Markdown Validator
Rule-based Markdown document validation for static site content.
What it does
markdown-validator validates Markdown files against a declarative JSON rule set. It checks:
- YAML front-matter metadata — required keys, expected values, date freshness
- Document body content — heading structure, section order, text patterns, character limits
It is designed for large documentation repositories built with static site generators such as DocFX and Hugo.
Quick start
# Install
pip install markdown-validator
# Validate a single file
md-validate validate docs/article.md --rules rules/tutorial.json
# Validate all .md files in a directory
md-validate validate docs/ --rules rules/tutorial.json --output reports/
# Interactive REPL for rule development
md-validate repl
How it works
- A rule set JSON file defines assertions about the document.
- Each rule targets either the metadata (YAML front matter) or the body (via XPath on the rendered HTML).
- Rules can be chained into workflows for conditional logic.
- The tool outputs a score (passing rules / total rules) and a per-rule pass/fail report.
Example rule set
{
"rules": {
"header": [
{
"id": 1,
"name": "topic-must-be-tutorial",
"type": "header",
"query": "ms.topic",
"flag": "value",
"operation": "==",
"value": "tutorial",
"level": "Required",
"mitigation": "Set ms.topic: tutorial in the front matter."
}
],
"body": [
{
"id": 2,
"name": "must-have-h1",
"type": "body",
"query": "/html/body/h1",
"flag": "count",
"operation": "==",
"value": "1",
"level": "Required",
"mitigation": "The document must have exactly one H1 heading."
}
]
},
"workflows": [
{
"name": "topic-check",
"steps": "S-1,1-E",
"fix": "Set ms.topic: tutorial in the front matter."
}
]
}
Python API
from markdown_validator import Scanner
scanner = Scanner()
report = scanner.validate("docs/article.md", "rules/tutorial.json")
print(f"Passed: {report.passed}")
print(f"Score: {report.score}/{report.total_rules}")
for result in report.results:
status = "✓" if result.passed else "✗"
print(f" {status} [{result.rule_id}] {result.rule_name}")
if not result.passed:
print(f" → {result.mitigation}")