Domain API
Domain value objects and Pydantic contract models.
All objects in this module are immutable. They define the contracts between layers and are safe to pass across layer boundaries without defensive copying.
Design pattern: Value Object — each model is frozen after construction; mutation always creates a new instance.
ParsedDocument
dataclass
An immutable representation of a parsed Markdown file.
Attributes:
| Name | Type | Description |
|---|---|---|
| filepath | Path | Absolute path to the source |
| metadata | dict[str, str] | Key-value pairs extracted from the YAML front matter. |
| html | str | HTML string produced by rendering the document body. |
Source code in markdown_validator/domain/models.py
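As a rough illustration of the Value Object behaviour described above, the following sketch uses a stand-in frozen dataclass with the three documented attributes. It is not the library's class, and the path and metadata values are made up for the example.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from pathlib import Path


@dataclass(frozen=True)
class ParsedDocument:
    """Stand-in with the three documented attributes (not the real class)."""

    filepath: Path
    metadata: dict[str, str]
    html: str


doc = ParsedDocument(
    filepath=Path("/docs/intro.md"),  # hypothetical path
    metadata={"title": "Intro"},
    html="<h1>Intro</h1>",
)

try:
    doc.html = "<p>changed</p>"  # attribute assignment is rejected
except FrozenInstanceError:
    print("frozen: assignment refused")

# "Mutation" builds a new instance and leaves the original untouched.
updated = replace(doc, html="<p>changed</p>")
```

Note that freezing a dataclass only blocks attribute rebinding; a dict held in a field can still be modified in place, so callers should treat `metadata` as read-only by convention.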
RuleModel
Bases: BaseModel
A single validation rule loaded from a JSON rule-set file.
Attributes:
| Name | Type | Description |
|---|---|---|
| id | int | Unique positive integer identifier for this rule. |
| name | str | Human-readable rule description. |
| type | Literal['header', 'body'] | Whether this rule targets |
| query | str | For |
| flag | str | Processing mode: controls what |
| operation | str | Comparison operator token. See :mod: |
| value | str | Expected value used in the comparison assertion. |
| level | Literal['Required', 'Suggested'] | Severity: |
| mitigation | str | Human-readable remediation hint shown on failure. |
Source code in markdown_validator/domain/models.py
coerce_id(v: object) -> int
classmethod
Accept string IDs from older JSON files and coerce to int.
Source code in markdown_validator/domain/models.py
id_must_be_positive(v: int) -> int
classmethod
Enforce that rule IDs are positive integers.
Source code in markdown_validator/domain/models.py
normalise_type(v: object) -> str
classmethod
Normalise rule type to lowercase.
Source code in markdown_validator/domain/models.py
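The three validators above can be approximated as plain functions. This is only a sketch of the documented behaviour: the real validators are Pydantic classmethods whose bodies are not shown on this page, and the error messages here are invented.

```python
def coerce_id(v: object) -> int:
    """Accept "7" as well as 7, mirroring the documented string-to-int coercion."""
    if isinstance(v, str):
        return int(v)  # raises ValueError for non-numeric strings
    if isinstance(v, int):
        return v
    raise ValueError(f"rule id must be an integer, got {v!r}")


def id_must_be_positive(v: int) -> int:
    """Reject zero and negative IDs."""
    if v <= 0:
        raise ValueError(f"rule id must be positive, got {v}")
    return v


def normalise_type(v: object) -> str:
    """Lowercase the rule type so "Header" and "header" are equivalent."""
    return str(v).lower()


rule_id = id_must_be_positive(coerce_id("7"))
rule_type = normalise_type("Header")
```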
RuleSetModel
Bases: BaseModel
The top-level schema for a rule-set JSON file.
Attributes:
| Name | Type | Description |
|---|---|---|
| rules | RulesSection | Header and body rule definitions. |
| workflows | list[WorkflowModel] | Optional list of multi-step workflow definitions. |
Source code in markdown_validator/domain/models.py
all_rules: list[RuleModel]
property
Return all rules (header + body) in definition order.
rules_by_id: dict[int, RuleModel]
property
Return a mapping from rule ID to :class:RuleModel.
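The two properties can be pictured with reduced stand-in classes. The names `Rule`, `Section`, and `RuleSet` below are hypothetical simplifications built on stdlib dataclasses, not the Pydantic models themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:  # reduced stand-in for RuleModel
    id: int
    name: str


@dataclass(frozen=True)
class Section:  # reduced stand-in for RulesSection
    header: tuple[Rule, ...] = ()
    body: tuple[Rule, ...] = ()


@dataclass(frozen=True)
class RuleSet:  # reduced stand-in for RuleSetModel
    rules: Section

    @property
    def all_rules(self) -> list[Rule]:
        # Header rules first, then body rules, each in definition order.
        return [*self.rules.header, *self.rules.body]

    @property
    def rules_by_id(self) -> dict[int, Rule]:
        return {rule.id: rule for rule in self.all_rules}


rule_set = RuleSet(
    Section(
        header=(Rule(1, "title present"),),
        body=(Rule(2, "has H1"), Rule(3, "intro length")),
    )
)
ids_in_order = [rule.id for rule in rule_set.all_rules]
```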
RulesSection
Bases: BaseModel
The "rules" section of a rule-set JSON file.
Attributes:
| Name | Type | Description |
|---|---|---|
| header | list[RuleModel] | Rules that operate on YAML front-matter metadata. |
| body | list[RuleModel] | Rules that operate on the HTML-rendered document body. |
Source code in markdown_validator/domain/models.py
inject_type_from_section(data: object) -> object
classmethod
Inject type from the section name when absent.
Older rule JSON files (e.g. concept.json) omit the type
field on each rule because the section name already encodes it.
This validator adds "type": "header" or "type": "body"
to any rule dict that lacks the field.
Source code in markdown_validator/domain/models.py
no_duplicate_ids() -> RulesSection
Fail fast if any two rules share the same ID.
Source code in markdown_validator/domain/models.py
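A minimal sketch of both validators, operating on plain dicts instead of Pydantic models. It assumes that duplicate IDs are checked across the header and body lists together; the rule names and error message are invented for illustration.

```python
from collections import Counter


def inject_type_from_section(data: dict) -> dict:
    """Add "type" to every rule dict that lacks it, using the section name."""
    return {
        section: [
            rule if "type" in rule else {**rule, "type": section}
            for rule in data.get(section, [])
        ]
        for section in ("header", "body")
    }


def no_duplicate_ids(data: dict) -> dict:
    """Raise ValueError if any ID appears more than once in the section."""
    counts = Counter(rule["id"] for rules in data.values() for rule in rules)
    duplicates = sorted(rule_id for rule_id, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate rule IDs: {duplicates}")
    return data


# Legacy-style input: no "type" field on either rule.
legacy = {
    "header": [{"id": 1, "name": "has title"}],
    "body": [{"id": 2, "name": "has h1"}],
}
section = no_duplicate_ids(inject_type_from_section(legacy))
```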
ScanReport
Bases: BaseModel
Aggregated results of running all rules in a rule set against one file.
Attributes:
| Name | Type | Description |
|---|---|---|
| filepath | str | Path to the validated document. |
| score | int | Number of rules that passed. |
| total_rules | int | Total number of rules evaluated. |
| passed | bool | |
| results | list[ValidationResult] | Per-rule validation outcomes. |
Source code in markdown_validator/domain/models.py
ValidationResult
Bases: BaseModel
The outcome of evaluating a single rule against a document.
Attributes:
| Name | Type | Description |
|---|---|---|
| rule_id | int | ID of the rule that was evaluated. |
| rule_name | str | Human-readable name of the rule. |
| passed | bool | |
| level | Literal['Required', 'Suggested'] | Severity of this rule ( |
| expected_value | str | The value the rule expected to find. |
| actual_value | str | The value actually found (or |
| mitigation | str | Remediation hint shown when the rule fails. |
| filepath | str | Path to the document that was validated. |
Source code in markdown_validator/domain/models.py
WorkflowModel
Bases: BaseModel
A single workflow definition from a rule-set JSON file.
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | Descriptive name for the workflow. |
| steps | str | Step string in the workflow step language, e.g. |
| level | Literal['Required', 'Suggested'] | Whether this workflow is |
| fix | str | Human-readable remediation text shown when the workflow fails. |
Source code in markdown_validator/domain/models.py
normalise_steps(v: object) -> str
classmethod
Normalise (S,1)(1,E) format to S-1,1-E format.
Source code in markdown_validator/domain/models.py
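One way to perform the documented rewrite from `(S,1)(1,E)` to `S-1,1-E` is a small regular expression over the parenthesised pairs. This is a sketch rather than the library's code; passing through input that contains no pairs is an assumption.

```python
import re

# One "(a,b)" pair; whitespace around either element is tolerated.
_PAIR = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def normalise_steps(v: object) -> str:
    """Rewrite "(S,1)(1,E)" as "S-1,1-E"; leave other input unchanged."""
    text = str(v).strip()
    pairs = _PAIR.findall(text)
    if not pairs:
        return text  # assumption: treated as already normalised
    return ",".join(f"{start}-{end}" for start, end in pairs)


steps = normalise_steps("(S,1)(1,E)")
```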
WorkflowResult
Bases: BaseModel
Outcome of running a single workflow step sequence.
Attributes:
| Name | Type | Description |
|---|---|---|
| workflow_name | str | Name of the workflow. |
| passed | bool | Final boolean state after all steps. |
| fix | str | Remediation text if the workflow failed. |
Source code in markdown_validator/domain/models.py
Pure comparison operator functions.
Every public function in this module is a strategy — a Callable[[str, str], bool]
that takes a result string and an expected value string, and returns
True if the assertion is satisfied.
Design pattern: Strategy — operators are independent functions with a
uniform signature. Adding a new operator requires no changes to the caller;
it is simply registered in :data:OPERATOR_REGISTRY.
None of these functions perform I/O, logging, or raise exceptions on normal
evaluation. Invalid inputs return False.
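The Strategy arrangement described above can be sketched as a dict of functions sharing one signature. The operator tokens used as keys (`"=="`, `"starts"`, `"ends"`) and the `check` helper are hypothetical; the real contents of OPERATOR_REGISTRY are not listed on this page.

```python
from typing import Callable

Operator = Callable[[str, str], bool]


def op_equal(result: str, value: str) -> bool:
    return result.strip() == value.strip()


def op_starts_with(result: str, value: str) -> bool:
    return result.startswith(value)


# Hypothetical tokens: the real registry keys are not documented here.
OPERATOR_REGISTRY: dict[str, Operator] = {
    "==": op_equal,
    "starts": op_starts_with,
}


def check(token: str, result: str, value: str) -> bool:
    """Caller side: look the strategy up by token and apply it."""
    return OPERATOR_REGISTRY[token](result, value)


# Registering a new strategy touches only the registry, not `check`.
OPERATOR_REGISTRY["ends"] = lambda result, value: result.endswith(value)
```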
op_contains(result: str, value: str) -> bool
Return True if value appears inside result (case-insensitive).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual string extracted from the document. | required |
| value | str | Substring to search for. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
op_date(result: str, operator: str, value: str) -> bool
Compare a date string against another date or an offset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Date string from the document metadata. | required |
| operator | str | One of | required |
| value | str | Either | required |
Returns:
| Type | Description |
|---|---|
| bool | Boolean result of the date comparison. |
Source code in markdown_validator/domain/operators.py
op_ends_with(result: str, value: str) -> bool
Return True if result ends with value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual string extracted from the document. | required |
| value | str | Expected suffix. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
op_equal(result: str, value: str) -> bool
Return True if result and value are equal (stripped).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual string extracted from the document. | required |
| value | str | Expected value to compare against. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
op_greater(result: str, value: str) -> bool
Return True if numeric result is greater than numeric value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual value (will be cast to | required |
| value | str | Expected threshold (will be cast to | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
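A sketch of a numeric comparison that follows the module-level rule that invalid inputs return False. The target type of the cast is cut off in the table above, so `float` is an assumption here.

```python
def op_greater(result: str, value: str) -> bool:
    """Compare numerically; non-numeric input yields False, not an exception."""
    try:
        return float(result) > float(value)  # assumption: cast to float
    except (TypeError, ValueError):
        return False


numeric = op_greater("10", "9")  # numeric comparison: 10 > 9
lexical = "10" > "9"  # plain string comparison gives the opposite answer
```

The cast matters because comparing the raw strings would order "10" before "9".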
op_length(result: str, value: str) -> bool
Return True if len(result) is less than value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | String whose length is measured. | required |
| value | str | Maximum allowed length (exclusive), as a string integer. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
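The exclusive upper bound is easy to get wrong, so here is a small sketch of the documented behaviour. Returning False for a non-integer `value` follows the module-level statement about invalid inputs; the body itself is illustrative, not the library's code.

```python
def op_length(result: str, value: str) -> bool:
    """True when len(result) is strictly less than the integer in `value`."""
    try:
        limit = int(value)
    except (TypeError, ValueError):
        return False
    return len(result) < limit


under_limit = op_length("abc", "4")  # 3 < 4
at_limit = op_length("abcd", "4")  # 4 < 4 is false: the bound is exclusive
```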
op_less(result: str, value: str) -> bool
Return True if numeric result is less than numeric value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual value (will be cast to | required |
| value | str | Expected threshold (will be cast to | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
op_not_equal(result: str, value: str) -> bool
Return True if result and value are not equal (stripped).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual string extracted from the document. | required |
| value | str | Expected value to compare against. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
op_regex(result: str, value: str) -> bool
Return True if result matches the regex pattern in value.
Uses Python :mod:re with re.DOTALL.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | String to search within. | required |
| value | str | Regular expression pattern (Python syntax). | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
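The effect of re.DOTALL shows up when the pattern has to span several lines of rendered HTML. The sketch below assumes the operator uses `re.search` (the parameter description says the string is searched within) and that an invalid pattern returns False.

```python
import re


def op_regex(result: str, value: str) -> bool:
    """Search `result` for `value`, with "." also matching newlines."""
    try:
        return re.search(value, result, re.DOTALL) is not None
    except re.error:
        return False  # invalid pattern: treated as a failed assertion


html = "<h1>Title</h1>\n<p>Body</p>"
spans_lines = op_regex(html, r"<h1>.*</p>")  # needs DOTALL to cross the newline
without_dotall = re.search(r"<h1>.*</p>", html) is not None
```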
op_starts_with(result: str, value: str) -> bool
Return True if result starts with value.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| result | str | Actual string extracted from the document. | required |
| value | str | Expected prefix. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/operators.py
Rule evaluation engine.
:func:evaluate_rule is the single entry point for applying a
:class:~markdown_validator.domain.models.RuleModel to a
:class:~markdown_validator.domain.models.ParsedDocument.
It dispatches to:
- :mod:markdown_validator.domain.operators for string/numeric/regex comparisons.
- :mod:markdown_validator.domain.pos for part-of-speech and sentence-count checks.
The function is pure with respect to I/O — it never reads files or logs at the INFO level; it emits DEBUG messages only.
Raises:
| Type | Description |
|---|---|
| ValueError | If the rule's flag or operation is unrecognised. |
evaluate_header_value_list(rule: RuleModel, doc: ParsedDocument) -> bool
Evaluate a header rule where the expected value may be a CSV list.
Each value in the comma-separated rule.value must independently pass the assertion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rule | RuleModel | Header rule with potentially comma-separated | required |
| doc | ParsedDocument | Parsed document to check. | required |
Returns:
| Type | Description |
|---|---|
| bool | |
Source code in markdown_validator/domain/evaluator.py
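The "every value must pass" semantics can be sketched with a helper that splits the expected value on commas and applies one operator to each piece. `all_values_pass` and the `contains` operator below are hypothetical names; the real function takes a RuleModel and a ParsedDocument instead of raw strings.

```python
from typing import Callable


def all_values_pass(
    actual: str, expected_csv: str, op: Callable[[str, str], bool]
) -> bool:
    """True only if `op` holds for every comma-separated expected value."""
    return all(op(actual, value.strip()) for value in expected_csv.split(","))


def contains(result: str, value: str) -> bool:
    # Case-insensitive substring test, in the spirit of op_contains.
    return value.lower() in result.lower()


keywords = "python, testing, docs"  # made-up front-matter value
both_present = all_values_pass(keywords, "python,docs", contains)
one_missing = all_values_pass(keywords, "python,rust", contains)
```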
evaluate_rule(rule: RuleModel, doc: ParsedDocument) -> ValidationResult
Apply rule to doc and return a :class:ValidationResult.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rule | RuleModel | The rule to evaluate. | required |
| doc | ParsedDocument | The parsed document to evaluate the rule against. | required |
Returns:
| Type | Description |
|---|---|
| ValidationResult | A frozen :class: |
Source code in markdown_validator/domain/evaluator.py
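A heavily reduced sketch of the dispatch for a header rule follows. It assumes that `query` names a front-matter key and that `"=="` is a valid operator token; neither is confirmed on this page. `Rule`, `Result`, and `OPERATORS` are stand-ins for RuleModel, ValidationResult, and the operator registry.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:  # reduced stand-in for RuleModel
    id: int
    query: str
    operation: str
    value: str


@dataclass(frozen=True)
class Result:  # reduced stand-in for ValidationResult
    rule_id: int
    passed: bool
    expected_value: str
    actual_value: str


OPERATORS: dict[str, Callable[[str, str], bool]] = {
    "==": lambda result, value: result.strip() == value.strip(),  # hypothetical token
}


def evaluate_rule(rule: Rule, metadata: dict[str, str]) -> Result:
    """Look up the operator, apply it to the queried value, wrap the outcome."""
    try:
        op = OPERATORS[rule.operation]
    except KeyError:
        raise ValueError(f"unrecognised operation: {rule.operation!r}") from None
    actual = metadata.get(rule.query, "")  # assumption: a missing key reads as ""
    return Result(rule.id, op(actual, rule.value), rule.value, actual)


outcome = evaluate_rule(Rule(1, "ms.topic", "==", "concept"), {"ms.topic": "concept"})
```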
Part-of-speech analysis utilities.
Wraps NLTK tokenisation and POS tagging behind a narrow, pure interface. All functions accept plain text strings and return plain text strings or integers; no I/O or side-effects.
NLTK data (punkt_tab, averaged_perceptron_tagger_eng) must be
downloaded before first use::
    import nltk
    nltk.download("punkt_tab")
    nltk.download("averaged_perceptron_tagger_eng")
sentence_count(text: str) -> int
Return the number of sentences in text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Plain text to analyse. | required |
Returns:
| Type | Description |
|---|---|
| int | Number of sentences detected by NLTK's sentence tokeniser. |
Source code in markdown_validator/domain/pos.py
word_pos_at(text: str, index: int) -> str
Return the Penn Treebank POS tag for the word at index (1-based).
The entire text is tokenised as a single corpus before indexing, so index counts across all tokens in order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| text | str | Plain text to analyse. | required |
| index | int | 1-based position of the word whose POS tag is requested. | required |
Returns:
| Type | Description |
|---|---|
| str | POS tag string, e.g. |
Source code in markdown_validator/domain/pos.py
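The 1-based indexing can be illustrated without NLTK by working on a list of tokens that has already been tagged. The `tag_at` helper and the hand-written tags are hypothetical, and returning an empty string for an out-of-range index is an assumption, since the real behaviour is not documented here.

```python
def tag_at(tagged: list[tuple[str, str]], index: int) -> str:
    """Return the tag of the token at a 1-based position."""
    if not 1 <= index <= len(tagged):
        return ""  # assumption: out-of-range handling is not documented
    return tagged[index - 1][1]  # shift to Python's 0-based indexing


# Hand-written (token, Penn Treebank tag) pairs standing in for tagger output.
tagged = [("Configure", "VB"), ("the", "DT"), ("server", "NN")]
first_tag = tag_at(tagged, 1)
last_tag = tag_at(tagged, 3)
```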