Rules Reference
Rule fields
| Field | Type | Required | Description |
|---|---|---|---|
id |
integer | Yes | Unique rule identifier (strings coerced to int) |
name |
string | Yes | Human-readable rule name |
type |
"header" or "body" |
Yes | Whether to query metadata or document body |
query |
string | Yes | Metadata key (header) or XPath expression (body) |
flag |
string | Yes | Processing mode (see Flags table) |
operation |
string | Yes | Comparison operator (see Operators table) |
value |
string | Yes | Expected value; comma-separated for multi-value checks |
level |
"Required" or "Suggested" |
No | Default: "Required" |
mitigation |
string | No | Remediation message shown on failure |
level and CI exit codes: A Required rule that fails sets the CLI exit code to
1, blocking a CI merge gate. A Suggested rule that fails is reported but does not
change the exit code. Use Suggested for style recommendations that should not block
publication.
Flags
| Flag | Applies to | Description |
|---|---|---|
value |
header | Evaluate the metadata value with operation and value |
check |
header | True if the key exists (ignores operation/value) |
date |
header | Compare metadata value as a date |
pattern |
header | Match metadata value against value as a regex |
count |
body | Count matching XPath nodes; compare with operation/value |
text |
body | Extract text of first matching XPath node |
dom |
body | Extract element tag names of matching nodes |
all |
body | Full plain-text content of the page |
Flag examples
value — check that a metadata field equals an expected string:
{
"id": 1, "name": "topic-must-be-tutorial", "type": "header",
"query": "ms.topic", "flag": "value", "operation": "==", "value": "tutorial",
"mitigation": "Set ms.topic: tutorial in the front matter."
}
check — verify a required metadata key is present (value doesn't matter):
{
"id": 2, "name": "description-must-exist", "type": "header",
"query": "description", "flag": "check", "operation": "==", "value": "true",
"mitigation": "Add a description field to the front matter."
}
date — check that ms.date is no more than 365 days in the past.
The value field is the maximum age in days; the operation is < (age must be less
than the threshold):
{
"id": 3, "name": "date-must-be-fresh", "type": "header",
"query": "ms.date", "flag": "date", "operation": "<", "value": "365",
"mitigation": "Update ms.date to a date within the last year."
}
dateutil (e.g.
2026-01-15). The comparison is: (today − ms.date).days < int(value).
pattern — match the metadata value against a regex (alias for r on header fields):
{
"id": 4, "name": "author-github-format", "type": "header",
"query": "author", "flag": "pattern", "operation": "r",
"value": "^[a-z0-9-]+$",
"mitigation": "author must be a lowercase GitHub username (letters, numbers, hyphens)."
}
count — count matching XPath nodes:
{
"id": 5, "name": "must-have-exactly-one-h1", "type": "body",
"query": "/html/body/h1", "flag": "count", "operation": "==", "value": "1",
"mitigation": "The document must have exactly one H1 heading."
}
text — extract the text content of the first matching XPath node:
{
"id": 6, "name": "h1-must-start-with-tutorial", "type": "body",
"query": "/html/body/h1", "flag": "text", "operation": "[:","value": "Tutorial:",
"mitigation": "H1 must begin with 'Tutorial:' for tutorial articles."
}
dom — extract element tag names (useful for checking heading order):
{
"id": 7, "name": "first-heading-after-h1-must-be-h2", "type": "body",
"query": "/html/body/*[2]", "flag": "dom", "operation": "==", "value": "h2",
"mitigation": "The second element in the body must be an H2."
}
all — run a check against the entire plain-text body:
{
"id": 8, "name": "body-must-not-be-empty", "type": "body",
"query": "/html/body", "flag": "all", "operation": ">", "value": "0",
"mitigation": "The document body must not be empty."
}
Operators
| Token | Name | Notes |
|---|---|---|
== |
Equal | Strips whitespace |
!= |
Not equal | Strips whitespace |
> |
Greater than | Numeric comparison |
< |
Less than | Numeric comparison |
[] |
Contains | Case-insensitive |
[: |
Starts with | |
:] |
Ends with | |
r |
Regex match | Python re.search, DOTALL mode |
l |
Length limit | len(result) < int(value) — True if shorter than threshold |
s |
Max sentences | sentence_count <= int(value) — requires NLTK |
p<N> |
Part of speech | p1 = first word POS tag; uses Penn Treebank tags |
Operator examples for non-obvious operators
r (regex) — check that the title matches a pattern:
{
"flag": "text", "operation": "r", "value": "^Tutorial: [A-Z]",
"mitigation": "Title must match 'Tutorial: ' followed by a capital letter."
}
l (length limit) — ensure the title is under 60 characters:
{
"flag": "text", "operation": "l", "value": "60",
"mitigation": "Title must be under 60 characters."
}
s (sentence count) — limit the intro paragraph to 3 sentences:
{
"flag": "all", "operation": "s", "value": "3",
"mitigation": "Introduction must be no more than 3 sentences."
}
python -m nltk.downloader punkt_tab averaged_perceptron_tagger_eng after installing the package.
p<N> (part of speech) — check that the third word of the H1 is a verb (VB):
{
"flag": "text", "operation": "p3", "value": "VB",
"mitigation": "The third word of the H1 should be a base-form verb."
}
NN (noun), VB (base verb), VBZ (3rd-person verb),
JJ (adjective), RB (adverb). Position is 1-indexed.
Common mistakes
Multi-element XPath + equality
When an XPath returns multiple elements and you use ==, the evaluator checks that
every element satisfies the assertion (logical AND). This is almost always wrong.
Wrong: "The second-to-last H2 is 'Clean up resources'"
{ "query": ".//h2[last()]/preceding-sibling::h2", "flag": "text", "operation": "==" }
'Clean up resources' fails on any document with more than two H2s.
Right: Target the specific element directly:
{ "query": ".//h2[last()-1]", "flag": "text", "operation": "==" }
Comma-separated multi-values
The value field is split on , to produce a list for operators that support
multi-value checks ([]). The value "guide, article, topic" becomes three checks:
"guide", " article", " topic" (note the leading space). Whitespace is stripped
before comparison, so this works in practice — but a literal comma inside a single
expected value has no escape mechanism.
Expressing absence (no negation operator)
The rule language has no built-in negation. To express "the title must NOT contain the
word 'guide'", the idiomatic (but fragile) approach is to use workflow branching with a
known-passing rule as a sentinel. The upcoming negate: true field (planned for v0.3)
will eliminate this pattern. See the Roadmap
for details.
Workflow step language
A workflow is a comma-separated string of <source>-<target> tokens that chains rule
evaluations with conditional branching.
| Pattern | Example | Meaning |
|---|---|---|
S-N |
S-1 |
Start: load rule N as the initial state |
N-D |
1-D |
Rule N result becomes the decision point |
M-D |
M-D |
Merge state becomes the decision |
T-N |
T-2 |
If decision is True, load rule N |
F-N |
F-3 |
If decision is False, load rule N |
T-R |
T-R |
If decision is True, reverse (negate) it |
F-R |
F-R |
If decision is False, reverse (negate) it |
N-M |
2-M |
Rule N exits into merge state |
M-N |
M-3 |
Merge state exits to rule N |
M-E |
M-E |
Merge state ends the workflow |
N-E |
4-E |
Rule N result ends the workflow |
N-N |
1-2 |
Both rules must pass |
Worked example 1 — simple rule check
S-1,1-E
Start with rule 1. Rule 1's result ends the workflow. This is the simplest possible workflow: a single rule.
Worked example 2 — conditional branch
S-1,1-D,T-2,F-3,2-M,3-M,M-E
Rule 1 is evaluated and its result becomes the decision. If True, rule 2 runs; if False, rule 3 runs. Both rules exit into merge state, which ends the workflow. The overall workflow passes if the branch that executed passed.
Practical use: Rule 1 checks ms.topic == "tutorial". Rule 2 checks that the H1
starts with "Tutorial:". Rule 3 checks that the H1 starts with "How to". This enforces
topic-type-specific H1 conventions in a single workflow.
Worked example 3 — conditional presence check
S-38,38-D,T-39,F-38,39-M,M-E
Rule 38 counts H2 headings. If the count is > 0 (True), rule 39 checks for H3s. If there are no H2s (False), rule 38's own (passing) result feeds into merge. This implements "H3s are only required if H2s are present."
For the full technical reference, see Design — Workflow Step Language.