MultiLevel Pattern Miner

MPLM discovers repeated structural patterns in text documents at six hierarchical levels, from phrase up to document, then turns the mined patterns into compiled specs, validation bundles, and review dashboards.

Level Hierarchy

flowchart TD
    D["DOCUMENT (6)"]
    S["SECTION (5)"]
    C["CHUNK (4)"]
    P["PARAGRAPH (3)"]
    L["LINE / Sentence (2)"]
    PH["PHRASE (1)"]
    D --> S --> C --> P --> L --> PH
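
The numbering in the diagram maps naturally onto an ordered enumeration. The following is a minimal sketch, assuming a hypothetical `Level` enum and `child_of` helper. The names and values are taken from the diagram, but MPLM's own representation of levels may be named or structured differently.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Hypothetical enum mirroring the diagram; not necessarily MPLM's own type."""

    PHRASE = 1
    LINE = 2  # the diagram labels this level "LINE / Sentence"
    PARAGRAPH = 3
    CHUNK = 4
    SECTION = 5
    DOCUMENT = 6


def child_of(level: Level) -> Optional[Level]:
    """Return the level directly below `level`, or None at the bottom."""
    return Level(level - 1) if level > Level.PHRASE else None


# Walk the hierarchy top-down, as the flowchart does.
for level in sorted(Level, reverse=True):
    print(int(level), level.name)
```

Using an integer-backed enum keeps comparisons such as `Level.SECTION > Level.CHUNK` meaningful, which matches the top-down ordering the diagram implies.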

Pipeline

flowchart LR
    F["📄 Source Text"] --> IT["io.load_text()"]
    IT --> RP["run_pipeline()"]
    RP --> NP["Normalizer.parse()"]
    NP --> BD["BoundaryDetector.detect()"]
    BD --> PM["PatternMiner.mine()"]
    PM --> PL["📊 patterns.yml"]
    PL --> CS["compile-spec"]
    CS --> SPEC["🧭 compiled-pattern-spec.yml"]
    SPEC --> VI["validate-index"]
    VI --> BUNDLE["📦 validation-index.json / .md"]
    BUNDLE --> DASH["🖥️ validation-dashboard.html"]
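
The first half of the diagram is a chain in which each stage consumes the previous stage's output. The sketch below illustrates only that chaining idea. The three stage functions and `run_stages` are deliberately simplistic stand-ins invented for this example; they are not the behavior of `Normalizer.parse()`, `BoundaryDetector.detect()`, or `PatternMiner.mine()`.

```python
from collections import Counter

# Stand-in stages for illustration only; MPLM's real stages are not shown here.


def normalize(text):
    """Split text into lines and strip trailing whitespace."""
    return [line.rstrip() for line in text.splitlines()]


def detect_boundaries(lines):
    """Group consecutive non-blank lines into paragraphs."""
    paragraphs, current = [], []
    for line in lines:
        if line:
            current.append(line)
        elif current:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


def mine(paragraphs, min_support=2):
    """Count paragraphs by line count; keep shapes seen at least min_support times."""
    counts = Counter(len(p) for p in paragraphs)
    return {shape: n for shape, n in counts.items() if n >= min_support}


def run_stages(value, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        value = stage(value)
    return value


sample = "a\nb\n\nc\nd\n\ne\n"
result = run_stages(sample, [normalize, detect_boundaries, mine])
print(result)  # prints {2: 2}: two paragraphs share the two-line shape
```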

Install

pip install -e ".[dev]"

Or with full NLP support:

pip install -e ".[dev,nlp]"

Quick Start

Mine patterns from a single Markdown or plain-text file:

mplm mine examples/example1.md --out patterns.yml --min-support 1

Compile the mined library into an LLM-ready bridge spec:

mplm compile-spec patterns.yml --out compiled-pattern-spec.yml

Validate a folder of drafts against one compiled template:

mplm validate-index compiled-pattern-spec.yml examples/dashboard-drafts \
  --template-id tmpl-l3-s-s-s \
  --out validation-index.json

Render the dashboard from the JSON bundle:

mplm render-dashboard validation-index.json --out validation-dashboard.html

Or render directly from the compiled spec plus drafts:

mplm render-dashboard \
  --spec-path compiled-pattern-spec.yml \
  --drafts-root examples/dashboard-drafts \
  --template-id tmpl-l3-s-s-s \
  --out validation-dashboard.direct.html

Preview the parsed AST without mining:

mplm preview examples/example1.md

View pipeline log messages:

mplm --log-level INFO mine examples/example1.md --out patterns.yml
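
The mine, compile, validate, and render commands above can also be driven from one script. This is a minimal sketch using only the subcommands and flags shown in this section. `build_commands` and `run_all` are hypothetical helper names, and the sketch assumes `mplm` is on `PATH` and exits with a non-zero status on failure.

```python
import subprocess


def build_commands(source, drafts_root, template_id):
    """Return the Quick Start commands as argument lists, in run order."""
    return [
        ["mplm", "mine", source, "--out", "patterns.yml", "--min-support", "1"],
        ["mplm", "compile-spec", "patterns.yml",
         "--out", "compiled-pattern-spec.yml"],
        ["mplm", "validate-index", "compiled-pattern-spec.yml", drafts_root,
         "--template-id", template_id, "--out", "validation-index.json"],
        ["mplm", "render-dashboard", "validation-index.json",
         "--out", "validation-dashboard.html"],
    ]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)


commands = build_commands(
    "examples/example1.md", "examples/dashboard-drafts", "tmpl-l3-s-s-s"
)
for cmd in commands:
    print(" ".join(cmd))
```

As written, the script only prints the commands. Calling `run_all(commands)` would execute them. Passing argument lists rather than a single shell string avoids quoting problems when paths contain spaces.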

What the Repo Produces

  • patterns.yml: mined structural evidence
  • compiled-pattern-spec.yml: bridge schema for templates, prompts, and validation
  • validation-index.json or .md: aggregate batch validation bundle
  • validation-dashboard.html: static review dashboard with filtering and CSV export
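
After a full run it can be handy to confirm that every artifact listed above exists. The sketch below assumes the default file names from this README were passed to `--out` and that all outputs sit in one directory. `missing_artifacts` is a hypothetical helper, not part of MPLM.

```python
from pathlib import Path

# File names taken from the list above. The validation bundle may be
# written as either .json or .md, so either one satisfies that entry.
EXPECTED = [
    ("patterns.yml",),
    ("compiled-pattern-spec.yml",),
    ("validation-index.json", "validation-index.md"),
    ("validation-dashboard.html",),
]


def missing_artifacts(out_dir):
    """Return the primary name of each artifact with no accepted file in out_dir."""
    root = Path(out_dir)
    return [
        names[0]
        for names in EXPECTED
        if not any((root / name).is_file() for name in names)
    ]


print(missing_artifacts("."))
```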

Validation and Dashboards

Use the dedicated guide for the current review workflow:

Python API

from mplm import run_pipeline
from mplm.io import load_text, save_library

text = load_text("examples/example1.md")
lib = run_pipeline(text, source="example1.md", min_support=2)
save_library(lib, "patterns.yml")

for p in lib.patterns:
    print(p.id, p.structure["signature"], p.support)
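
The loop above prints every pattern. To look at only the best-supported ones, the results can be filtered and sorted first. This sketch assumes that items in `lib.patterns` expose `id`, `structure`, and a numeric `support`, as the loop suggests. `FakePattern` and `top_patterns` are hypothetical names used so the example runs without MPLM installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakePattern:
    """Stand-in exposing only the attributes the loop above reads."""

    id: str
    support: int
    structure: dict = field(default_factory=dict)


def top_patterns(patterns, min_support=2, limit=5):
    """Return up to `limit` patterns with support >= min_support, highest first."""
    kept = [p for p in patterns if p.support >= min_support]
    return sorted(kept, key=lambda p: p.support, reverse=True)[:limit]


demo = [
    FakePattern("p1", 3, {"signature": "A"}),
    FakePattern("p2", 1, {"signature": "B"}),
    FakePattern("p3", 5, {"signature": "C"}),
]
for p in top_patterns(demo):
    print(p.id, p.structure["signature"], p.support)
```

With a real library, `top_patterns(lib.patterns)` would replace the `demo` list, provided the assumptions above hold.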

Run Tests

pytest                        # all tests + coverage report
pytest tests/unit/ -v         # unit tests only
pytest tests/integration/ -v  # integration tests only

Architecture

See DESIGN.md for the full pipeline diagram, the stage contracts, and the rationale behind the design patterns used.