Software Design¶
Novel Testbed is designed to treat a novel as a system that performs work. Each scene, exposition block, or transition is assumed to exist for one reason: to change the reader's state in some measurable way.
The architecture enforces this by separating structure, meaning, and evaluation into distinct, testable stages. Segmentation normalizes raw prose into explicit narrative joints. Parsing turns those joints into stable structural objects. Contracts make narrative intent explicit. Inference can automate that intent, but never replaces it. Assessment rules verify whether declared change actually occurred.
An emergent property of this design is that a novel becomes observable as a dynamic system: pressure curves appear, dead zones become visible, repetition surfaces, and structural dishonesty is exposed. The system does not judge artistry. It makes narrative movement legible.
Every stage is deterministic, inspectable, and falsifiable — so that "this scene matters" becomes a claim that can be tested rather than a feeling that must be defended.
Architecture¶
At a high level, the system is a pipeline with two entry paths and one mandatory normalization stage:
flowchart LR
A[Markdown Novel] --> S[Segmenter]
S --> B[Parser]
B --> C[Novel Model]
C --> D1[Contract Generator]
D1 --> E1[Blank Contract YAML]
E1 --> F[Assessment Engine]
C --> D2[LLM Inferencer]
D2 --> E2[Inferred Contract YAML]
E2 --> F
F --> G[PASS / WARN / FAIL Report]
Pipeline stages¶
- Segmenter — Guarantees structurally valid Markdown:
- Adds chapter headers if missing
- Adds module headers (
## Scene,## Exposition,## Transition) - Ensures correct chapter → module ordering
- Guarantees idempotency on already-structured input
-
Two implementations:
ModuleSegmenter(deterministic) andLLMSegmenter(semantic, OpenAI-backed) -
Parser — Consumes only annotated Markdown and identifies:
- Chapters
- Modules with their types
- Text anchors (start / end previews)
-
Stable positional module IDs (
M001,M002, ...) -
Contract Generator — Converts parsed modules into a YAML contract:
- Reader
pre_state - Reader
post_state -
Intended narrative change (
expected_changes) -
LLM Inferencer — Operates on parsed modules and fills the same contract automatically, chaining
post_state→pre_stateacross modules. -
Assessment Engine — Applies rule objects to detect:
- Missing change
- Contradictory change
- Inert modules
- Structural dishonesty
This mirrors a standard compiler:
normalize → parse → specify → validate → diagnose
Design Patterns¶
Strategy Pattern (Segmentation + Parsing)¶
Both segmentation and parsing use the Strategy pattern to allow alternative implementations without changing the pipeline.
classDiagram
class ModuleSegmenter {
+segment_markdown(text, title) str
}
class LLMSegmenter {
+segment_markdown(text, title) str
}
ModuleSegmenter <|-- LLMSegmenter
class NovelParser {
+parse(text, title) Novel
}
class CommonMarkNovelParser {
+parse(text, title) Novel
}
NovelParser <|-- CommonMarkNovelParser
Template Method (Inference)¶
ContractInferencer defines the interface; OpenAIContractInferencer
provides the concrete implementation. New backends (Anthropic, NLP, local)
can be plugged in without touching the assessment layer.
classDiagram
class ContractInferencer {
+infer(modules, novel_title) List[ModuleContract]
}
class OpenAIContractInferencer {
+infer(modules, novel_title) List[ModuleContract]
}
ContractInferencer <|-- OpenAIContractInferencer
Protocol (Assessment Rules)¶
Assessment rules use a structural Protocol so any object implementing
evaluate(contract) → Finding | None qualifies as a rule — no inheritance
required.
classDiagram
class Rule {
+name: str
+evaluate(contract) Optional[Finding]
}
class UnspecifiedStateRule {
+evaluate(contract) Optional[Finding]
}
class MissingExpectedChangeRule {
+evaluate(contract) Optional[Finding]
}
class NoChangeRule {
+evaluate(contract) Optional[Finding]
}
Rule <|.. UnspecifiedStateRule
Rule <|.. MissingExpectedChangeRule
Rule <|.. NoChangeRule
Dependency Injection (LLM clients)¶
Both OpenAIContractInferencer and LLMSegmenter accept their LLM client
via the constructor. This keeps them testable without network calls:
# Production
client = OpenAILLMClient(config=LLMClientConfig(model="gpt-4.1-mini"))
inferencer = OpenAIContractInferencer(client=client)
# Tests
stub = StubClient(outputs=[...])
inferencer = OpenAIContractInferencer(client=stub)
Layering¶
Layers are strictly ordered. Lower layers must not import from higher layers.
CLI (entry point)
└── Segmentation (no imports from inference, parser, contracts)
└── Parser (no imports from inference, contracts)
└── Inference (imports from parser models only)
└── Contracts (imports from models only)
└── Models (no imports from any application layer)
└── Utils (no imports from application layers)
LLMSegmenter uses a local import to obtain OpenAILLMClient at
instantiation time, avoiding a module-level dependency on the inference layer.
Narrative Contract Pipeline¶
Author-declared workflow¶
sequenceDiagram
participant Author
participant Segmenter
participant Parser
participant ContractGen
participant Assessor
Author->>Segmenter: Raw or structured Markdown
Segmenter->>Parser: Annotated Markdown
Parser->>ContractGen: Parsed Modules
ContractGen->>Author: Blank Contract YAML
Author->>Assessor: Filled Contract YAML
Assessor->>Author: PASS/WARN/FAIL Report
LLM-inferred workflow¶
sequenceDiagram
participant Author
participant Segmenter
participant Parser
participant Inferencer
participant Assessor
Author->>Segmenter: Raw Markdown
Segmenter->>Parser: Annotated Markdown
Parser->>Inferencer: Parsed Modules
Inferencer->>Author: Inferred Contract YAML
Author->>Assessor: Review and Assess
Assessor->>Author: PASS/WARN/FAIL Report
Source Provenance¶
The parse command embeds a source block in every contract YAML:
source:
original_path: /path/to/novel.md
copied_path: /path/to/output/source.md
sha256: a3f2c...
generated_at: 2026-03-01T12:00:00+00:00
This SHA-256 fingerprint binds the contract to the exact version of the source Markdown that generated it. Drift detection can be implemented by comparing the stored hash against a re-hash of the current file.
Test Coverage¶
| Area | Test files | Count |
|---|---|---|
| Models | tests/core/test_models.py |
6 |
| CLI | tests/core/test_cli.py |
11 |
| Logging | tests/core/test_logging_config.py |
3 |
| Parser (unit) | tests/parser/test_commonmark_parser.py |
7 |
| Parser (integration) | tests/parser/test_parser.py |
1 |
| Parser (abstract) | tests/parser/test_parser_base.py |
3 |
| Segmentation (deterministic) | tests/segmentation/test_segmentation_unit.py |
5 |
| Segmentation (invariants) | tests/segmentation/test_segmentation_invarients.py |
4 |
| Segmentation (LLM stubbed) | tests/segmentation/test_llm_segmenter.py |
7 |
| Contracts | tests/contracts/test_contract.py |
4 |
| Contracts YAML | tests/contracts/test_contract_yaml.py |
1 |
| Rules | tests/contracts/test_rules.py |
7 |
| Assessor | tests/contracts/test_assessor.py |
5 |
| Inference (auto) | tests/inference/test_auto_contract.py |
3 |
| Inference (LLM client) | tests/inference/test_llm_client_stubbed.py |
3 |
| Inference (inferencer) | tests/inference/test_llm_inferencer.py |
1 |
| Prompts | tests/inference/test_prompts.py |
7 |
| Types | tests/inference/test_types.py |
10 |
| Fingerprinting | tests/utils/test_source_fingerprint.py |
8 |
| Integration | tests/integration/test_pipeline.py |
1 |
| Total | 97 |
All 97 tests pass against the current codebase.