API: Segmentation Layer

The segmentation layer is the structural front-end of Novel Testbed. Its job is to take whatever prose you wrote and make it executable. That means inserting explicit chapter and module boundaries so the parser and inference system have something real to work with.

It does not analyze meaning. It does not judge quality. It only creates joints.

Markdown segmentation utilities.

This module is responsible for discovering and inserting structural boundaries into raw prose Markdown so that downstream parsers can operate.

This is the first phase of the narrative compiler:

segment → parse → infer → assess

The base :class:ModuleSegmenter is deterministic and conservative. It guarantees:

  • A top-level chapter header
  • At least one module header
  • Correct chapter → module ordering
  • Idempotency when input is already well-formed

:class:LLMSegmenter extends this with optional LLM-based semantic segmentation. It depends on :class:~novel_testbed.inference.llm_client.OpenAILLMClient for the API call but does not depend on the inference layer's contract logic.

LLMSegmenter

Bases: ModuleSegmenter

LLM-powered semantic segmenter.

Uses an object with a complete(prompt: str) -> str interface — typically :class:~novel_testbed.inference.llm_client.OpenAILLMClient — to insert:

  • Chapter titles
  • Scene boundaries
  • Exposition blocks
  • Transition modules

This replaces naive segmentation with semantic awareness. The LLM returns plain annotated Markdown (not JSON), so the complete method is used rather than infer_json.

If no client is supplied, a default :class:~novel_testbed.inference.llm_client.OpenAILLMClient is constructed automatically (requires OPENAI_API_KEY in the environment).
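Because the client is duck-typed, any object exposing complete(prompt: str) -> str can stand in for OpenAILLMClient — useful for tests or offline runs. A minimal stub (the class name here is hypothetical):

```python
class StubClient:
    """Hypothetical stand-in for OpenAILLMClient: any object with complete() works."""

    def complete(self, prompt: str) -> str:
        # Return pre-annotated Markdown instead of calling an API.
        return "# Stub Novel\n## Scene 1\nStubbed prose.\n"

# Usage (assuming novel_testbed is importable):
#   segmenter = LLMSegmenter(client=StubClient())
#   segmenter.segment_markdown("raw prose", "Stub Novel")
```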

__init__

__init__(client=None)

Initialize the LLM segmenter.

:param client: An object with a complete(prompt: str) -> str method. If None, a default :class:~novel_testbed.inference.llm_client.OpenAILLMClient is created automatically.
:raises RuntimeError: If client is None and OPENAI_API_KEY is not set in the environment.

segment_markdown

segment_markdown(text, title)

Use an LLM to infer module boundaries and return fully annotated Markdown.

Sends a structured prompt requesting # Chapter and ## Scene / Exposition / Transition headers be inserted at semantically appropriate locations.

:param text: Raw Markdown prose.
:param title: Novel title.
:return: Structurally valid annotated Markdown ending with a trailing newline.
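The real prompt wording is internal to the package; a hypothetical builder along these lines illustrates the request shape described above:

```python
def build_segmentation_prompt(text: str, title: str) -> str:
    """Hypothetical prompt builder; the package's actual prompt wording differs."""
    return (
        "Insert '# Chapter' and '## Scene' / '## Exposition' / '## Transition' headers\n"
        "at semantically appropriate locations. Return plain annotated Markdown only.\n\n"
        f"Title: {title}\n\n{text}"
    )
```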

ModuleSegmenter

Deterministic Markdown segmenter.

Guarantees a structurally valid document:

  • One top-level chapter heading: # Title
  • At least one module heading: ## Scene ...
  • Chapter must appear before any module
  • Idempotent for already-correct Markdown

segment_markdown

segment_markdown(text, title)

Segment raw Markdown into structurally annotated Markdown.

Structural invariants enforced:

  • Chapter must come before any module
  • At least one chapter exists
  • At least one module exists

:param text: Raw or partially annotated Markdown.
:param title: Novel title used for the synthetic chapter heading.
:return: Valid annotated Markdown ending with a trailing newline.


Core classes

ModuleSegmenter

class ModuleSegmenter:
    def segment_markdown(self, text: str, title: str) -> str:
        ...

Deterministic and conservative. Guarantees:

  • A chapter header: # <title>
  • At least one module: ## Scene 1
  • Correct ordering (chapter before modules)
  • Idempotent for already-structured Markdown
  • Always returns a trailing newline

LLMSegmenter

class LLMSegmenter(ModuleSegmenter):
    def __init__(self, client=None) -> None:
        ...
    def segment_markdown(self, text: str, title: str) -> str:
        ...

Uses an OpenAI-backed client to infer semantically appropriate boundaries. Accepts any object with a complete(prompt: str) -> str interface. If no client is provided, creates an OpenAILLMClient automatically.


Behavior

Input                                        Output
-----                                        ------
Raw prose                                    Markdown with # and ## headers inserted
Already-structured Markdown                  Returned unchanged (normalized)
Inverted structure (module before chapter)   Rebuilt in correct order
Empty input                                  Minimal valid structure

Segmentation guarantees that downstream systems always receive valid structural input. The parser never has to guess. The inferencer never has to hallucinate boundaries.


Design note

LLMSegmenter uses a local import to obtain OpenAILLMClient at instantiation time. This keeps the segmentation layer independent of the inference layer at module load time, preserving strict layering.

LLMSegmenter overrides segment_markdown from ModuleSegmenter rather than extending it (it does not call super().segment_markdown()). The LLM fully replaces the deterministic logic for this call.
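Both design notes can be sketched together — the deferred import and the full override. This is a simplified illustration, not the package's code, and the prompt text is a placeholder:

```python
class ModuleSegmenter:
    """Simplified deterministic base (illustration only)."""

    def segment_markdown(self, text: str, title: str) -> str:
        return (f"# {title}\n## Scene 1\n{text}").rstrip("\n") + "\n"


class LLMSegmenter(ModuleSegmenter):
    def __init__(self, client=None):
        if client is None:
            # Deferred import: the inference layer is loaded only when a
            # default client is actually needed, preserving strict layering.
            from novel_testbed.inference.llm_client import OpenAILLMClient
            client = OpenAILLMClient()
        self.client = client

    def segment_markdown(self, text: str, title: str) -> str:
        # Full override: the deterministic base logic is not consulted.
        prompt = f"Title: {title}\n\n{text}"  # placeholder; real prompt wording differs
        return self.client.complete(prompt).rstrip("\n") + "\n"
```

Supplying any complete()-bearing object at construction time means the inference layer is never imported at all.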


Role in the pipeline

Segmentation is the guaranteed first phase:

Markdown → Segment → Parse → Infer → Assess

If segmentation fails, nothing downstream is trustworthy.

This is not decoration. It is compilation.