Infrastructure API
Markdown document parser.
Reads a .md file from disk and returns a :class:~markdown_validator.domain.models.ParsedDocument.
This module is the only place in the codebase that touches the filesystem for
reading source documents.
Pipeline
- Read the raw file bytes (UTF-8).
- Split the YAML front-matter block from the body using PyYAML.
- Convert the body Markdown to HTML using the markdown library.
- Return an immutable :class:~markdown_validator.domain.models.ParsedDocument.
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the given path does not exist. |
| ValueError | If the YAML front-matter block is missing or malformed. |
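The front-matter split described in the pipeline can be sketched with the standard library alone. This is a minimal illustration, not the module's actual code: `split_front_matter` is a hypothetical helper name, and the real parser would presumably pass the extracted block to PyYAML (for example `yaml.safe_load`) and the body to the markdown library.

```python
from __future__ import annotations


def split_front_matter(text: str) -> tuple[str, str]:
    """Return (front_matter, body) for text that opens with a --- block.

    Hypothetical helper for illustration only. Raises ValueError when the
    opening or closing delimiter is missing.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("document does not start with a '---' front-matter block")
    # Scan for the closing delimiter, skipping the opening line.
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front_matter = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return front_matter, body
    raise ValueError("front-matter block is not closed with '---'")


sample = "---\ntitle: My Article\nms.topic: tutorial\n---\n# Body starts here\n"
front_matter, body = split_front_matter(sample)
```

Raising `ValueError` for both the missing and the unterminated block mirrors the error contract listed in the table above.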
find_markdown_files(directory: str | Path) -> list[Path]
Recursively discover all .md files under directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directory | str \| Path | Root directory to search. | required |
Returns:
| Type | Description |
|---|---|
| list[Path] | Sorted list of :class: |
Raises:
| Type | Description |
|---|---|
| NotADirectoryError | If directory is not a directory. |
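A function with this contract could be written roughly as follows. This is a sketch under assumptions, not the code in parser.py: the name `find_markdown_files_sketch` is made up for illustration, and the real function may filter or sort differently.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def find_markdown_files_sketch(directory: str | Path) -> list[Path]:
    """Recursively collect .md files under directory, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # rglob walks all subdirectories; is_file() skips directories named *.md.
    return sorted(path for path in root.rglob("*.md") if path.is_file())


# Build a small throwaway tree to exercise the sketch.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs" / "nested").mkdir(parents=True)
    (root / "b.md").write_text("# b", encoding="utf-8")
    (root / "docs" / "a.md").write_text("# a", encoding="utf-8")
    (root / "docs" / "nested" / "c.md").write_text("# c", encoding="utf-8")
    (root / "notes.txt").write_text("ignored", encoding="utf-8")
    found = [p.relative_to(root).as_posix() for p in find_markdown_files_sketch(root)]
```

Sorting the result keeps scan order deterministic across platforms, which makes reports easier to compare between runs.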
Source code in markdown_validator/infrastructure/parser.py
parse_document(filepath: str | Path) -> ParsedDocument
Parse a Markdown file and return an immutable :class:ParsedDocument.
The file must begin with a YAML front-matter block delimited by ---
lines, for example::
    ---
    title: My Article
    ms.topic: tutorial
    ---
    # Body starts here
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | str \| Path | Path to the | required |
Returns:
| Type | Description |
|---|---|
| ParsedDocument | Parsed and validated document representation. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If filepath does not exist. |
| ValueError | If front-matter is absent or cannot be parsed as YAML. |
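To illustrate what an immutable return value means for callers, here is one possible shape for such an object. Everything in it is an assumption: the class name `ParsedDocumentSketch` and the fields `filepath`, `metadata`, and `html` are hypothetical, and the real ParsedDocument may be built differently (for example with a validation library).

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path
from types import MappingProxyType
from typing import Any, Mapping


@dataclass(frozen=True)
class ParsedDocumentSketch:
    """Hypothetical stand-in for ParsedDocument; field names are assumptions."""

    filepath: Path
    metadata: Mapping[str, Any]  # parsed front-matter keys and values
    html: str                    # body rendered to HTML


doc = ParsedDocumentSketch(
    filepath=Path("article.md"),
    # MappingProxyType gives a read-only view, so the metadata cannot be edited either.
    metadata=MappingProxyType({"title": "My Article", "ms.topic": "tutorial"}),
    html="<h1>Body starts here</h1>",
)

try:
    doc.html = "<p>changed</p>"  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

Because the object cannot change after parsing, validation rules can share one instance without defensive copies.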
Source code in markdown_validator/infrastructure/parser.py
Rule-set loader — Repository pattern.
Loads a :class:~markdown_validator.domain.models.RuleSetModel from a JSON
file. This is the only place in the codebase that reads rule-set files from
disk.
Design pattern: Repository — the :class:RuleSetRepository class
separates the service layer from the filesystem. Tests can substitute
an in-memory rule set without touching the loader.
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If the JSON file does not exist. |
| ValueError | If the JSON does not conform to the rule-set schema. |
RuleSetRepository
Loads and validates rule-set JSON files.
Usage::
    repo = RuleSetRepository()
    rule_set = repo.load("path/to/rules.json")
Design pattern: Repository — decouples the scanner service from direct file I/O, making the scanner independently testable.
Source code in markdown_validator/infrastructure/loader.py
load(filepath: str | Path) -> RuleSetModel
Load and validate a rule-set from a JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filepath | str \| Path | Path to the | required |
Returns:
| Type | Description |
|---|---|
| RuleSetModel | A validated, immutable :class: |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If filepath does not exist. |
| ValueError | If the JSON is invalid or fails schema validation. |
Source code in markdown_validator/infrastructure/loader.py
load_from_dict(data: dict) -> RuleSetModel
Load and validate a rule-set from an already-parsed dictionary.
Useful in tests to construct rule sets without filesystem access.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | dict | Dictionary conforming to the rule-set schema. | required |
Returns:
| Type | Description |
|---|---|
| RuleSetModel | A validated, immutable :class: |
Raises:
| Type | Description |
|---|---|
| ValueError | If data fails schema validation. |
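The relationship between the two loading methods can be sketched as below: the file-based method handles I/O and JSON decoding, then hands the result to the dictionary-based method for validation. This is an illustrative sketch only. `RuleSetRepositorySketch` and `RuleSetSketch` are hypothetical names, and the single `rules` key is an assumed schema, since the real rule-set schema is not shown here.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class RuleSetSketch:
    """Stand-in for RuleSetModel; the single 'rules' field is an assumption."""

    rules: tuple


class RuleSetRepositorySketch:
    """Hypothetical repository: load() does I/O, load_from_dict() validates."""

    def load(self, filepath: str | Path) -> RuleSetSketch:
        path = Path(filepath)
        if not path.is_file():
            raise FileNotFoundError(f"rule-set file not found: {path}")
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            # Translate decoder errors into the documented ValueError.
            raise ValueError(f"invalid JSON in {path}: {exc}") from exc
        return self.load_from_dict(data)

    def load_from_dict(self, data: dict) -> RuleSetSketch:
        # Assumed schema check: an object with a list under 'rules'.
        if not isinstance(data, dict) or not isinstance(data.get("rules"), list):
            raise ValueError("rule set must be an object with a 'rules' list")
        return RuleSetSketch(rules=tuple(data["rules"]))


repo = RuleSetRepositorySketch()
in_memory = repo.load_from_dict({"rules": [{"id": "R1"}]})

with tempfile.TemporaryDirectory() as tmp:
    rules_file = Path(tmp) / "rules.json"
    rules_file.write_text(json.dumps({"rules": [{"id": "R1"}]}), encoding="utf-8")
    from_disk = repo.load(rules_file)
```

Routing both entry points through the same validation step means a rule set built in a test behaves the same as one read from disk.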
Source code in markdown_validator/infrastructure/loader.py
Scan result reporter.
Writes :class:~markdown_validator.domain.models.ScanReport objects to disk
in JSON or CSV format. This is the only place in the codebase that writes
output files.
Raises:
| Type | Description |
|---|---|
| OSError | If the destination directory cannot be created or the file cannot be written. |
write_csv_report(report: ScanReport, output_path: str | Path) -> Path
Write a :class:ScanReport as a flat CSV file.
Each row represents one :class:~markdown_validator.domain.models.ValidationResult.
The parent directory is created if it does not already exist.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| report | ScanReport | Validated scan report to serialise. | required |
| output_path | str \| Path | Destination | required |
Returns:
| Type | Description |
|---|---|
| Path | Resolved path of the written file. |
Raises:
| Type | Description |
|---|---|
| OSError | On filesystem errors. |
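The one-row-per-result layout and the create-parent-directory behaviour can be sketched with the standard csv module. The names here are hypothetical: `ResultSketch` stands in for ValidationResult, its fields (`file`, `rule_id`, `passed`) are assumptions, and the real column set may differ.

```python
from __future__ import annotations

import csv
import tempfile
from dataclasses import asdict, dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class ResultSketch:
    """Stand-in for ValidationResult; all field names are assumptions."""

    file: str
    rule_id: str
    passed: bool


def write_csv_sketch(results: list[ResultSketch], output_path: str | Path) -> Path:
    path = Path(output_path)
    # Create missing parent directories, as the documented function does.
    path.parent.mkdir(parents=True, exist_ok=True)
    columns = [f.name for f in fields(ResultSketch)]
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        for result in results:
            writer.writerow(asdict(result))
    return path.resolve()


with tempfile.TemporaryDirectory() as tmp:
    written = write_csv_sketch(
        [ResultSketch("a.md", "R1", True), ResultSketch("b.md", "R2", False)],
        Path(tmp) / "reports" / "scan.csv",
    )
    with written.open(newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
```

Any `OSError` raised by `mkdir` or `open` propagates unchanged, which matches the error contract in the table above.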
Source code in markdown_validator/infrastructure/reporter.py
write_json_report(report: ScanReport, output_path: str | Path) -> Path
Serialise a :class:ScanReport to a JSON file.
The parent directory is created if it does not already exist.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| report | ScanReport | Validated scan report to serialise. | required |
| output_path | str \| Path | Destination | required |
Returns:
| Type | Description |
|---|---|
| Path | Resolved path of the written file. |
Raises:
| Type | Description |
|---|---|
| OSError | On filesystem errors. |
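A JSON writer with the same contract might look like the sketch below. It is not the code in reporter.py: `ReportSketch` is a hypothetical stand-in for ScanReport, and its fields (`scanned_files`, `failures`) are assumptions chosen only to make the example runnable.

```python
from __future__ import annotations

import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass(frozen=True)
class ReportSketch:
    """Stand-in for ScanReport; field names are assumptions."""

    scanned_files: int
    failures: tuple[str, ...]


def write_json_sketch(report: ReportSketch, output_path: str | Path) -> Path:
    path = Path(output_path)
    # Create missing parent directories before writing.
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(asdict(report), indent=2), encoding="utf-8")
    return path.resolve()


with tempfile.TemporaryDirectory() as tmp:
    written = write_json_sketch(
        ReportSketch(scanned_files=2, failures=("b.md",)),
        Path(tmp) / "out" / "report.json",
    )
    # Read the file back to confirm it round-trips through JSON.
    loaded = json.loads(written.read_text(encoding="utf-8"))
```

Returning the resolved path gives callers an absolute location they can log or pass on, regardless of the working directory.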
Source code in markdown_validator/infrastructure/reporter.py