MultiLevel Pattern Miner
MPLM discovers repeated structural patterns in text documents across six hierarchical levels, then turns those mined patterns into compiled specs, validation bundles, and review dashboards.
Level Hierarchy
flowchart TD
D["DOCUMENT (6)"]
S["SECTION (5)"]
C["CHUNK (4)"]
P["PARAGRAPH (3)"]
L["LINE / Sentence (2)"]
PH["PHRASE (1)"]
D --> S --> C --> P --> L --> PH
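For readers who want to reason about the hierarchy in code, here is a minimal sketch of the six levels as an ordered enum. The `Level` class and the `parent` helper are hypothetical names used for illustration only; they are not taken from the MPLM package.

```python
from enum import IntEnum
from typing import Optional


class Level(IntEnum):
    """Hypothetical mirror of the six levels in the diagram (not MPLM's API)."""
    PHRASE = 1
    LINE = 2        # "LINE / Sentence" in the diagram
    PARAGRAPH = 3
    CHUNK = 4
    SECTION = 5
    DOCUMENT = 6


def parent(level: Level) -> Optional[Level]:
    """Return the level that contains `level`, or None for DOCUMENT."""
    if level is Level.DOCUMENT:
        return None
    return Level(level + 1)


# Walk from the smallest unit up to the whole document.
current: Optional[Level] = Level.PHRASE
while current is not None:
    print(int(current), current.name)
    current = parent(current)
```

Using an `IntEnum` keeps the numeric ordering from the diagram, so comparisons such as `Level.PARAGRAPH < Level.SECTION` work directly.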
Pipeline
flowchart LR
F["📄 Source Text"] --> IT["io.load_text()"]
IT --> RP["run_pipeline()"]
RP --> NP["Normalizer.parse()"]
NP --> BD["BoundaryDetector.detect()"]
BD --> PM["PatternMiner.mine()"]
PM --> PL["📊 patterns.yml"]
PL --> CS["compile-spec"]
CS --> SPEC["🧭 compiled-pattern-spec.yml"]
SPEC --> VI["validate-index"]
VI --> BUNDLE["📦 validation-index.json / .md"]
BUNDLE --> DASH["🖥️ validation-dashboard.html"]
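The first three stages of the diagram (normalize, detect boundaries, mine) can be illustrated with a toy version. Everything below is a stand-in written for this README: the function names, the blank-line paragraph rule, and the "lines per paragraph" signature are assumptions, not how `Normalizer`, `BoundaryDetector`, or `PatternMiner` actually work.

```python
from collections import Counter


def normalize(text):
    """Stand-in for normalization: unify newlines, strip trailing spaces."""
    lines = [ln.rstrip() for ln in text.replace("\r\n", "\n").split("\n")]
    return "\n".join(lines).strip()


def detect_paragraphs(text):
    """Stand-in for boundary detection: blank lines separate paragraphs."""
    paragraphs, current = [], []
    for line in text.split("\n"):
        if line:
            current.append(line)
        elif current:
            paragraphs.append(current)
            current = []
    if current:
        paragraphs.append(current)
    return paragraphs


def mine(paragraphs, min_support=1):
    """Stand-in for mining: the toy signature is the paragraph's line count.

    Only signatures seen at least `min_support` times are kept.
    """
    counts = Counter(len(p) for p in paragraphs)
    return {sig: n for sig, n in counts.items() if n >= min_support}


def toy_pipeline(text, min_support=1):
    """Chain the three stand-in stages in the order shown in the diagram."""
    return mine(detect_paragraphs(normalize(text)), min_support)


sample = "a\nb\n\nc\nd\n\ne\n"
print(toy_pipeline(sample, min_support=2))
```

The sample has two two-line paragraphs and one one-line paragraph, so raising `min_support` to 2 drops the one-line signature.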
Install
pip install -e ".[dev]"
Or with full NLP support:
pip install -e ".[dev,nlp]"
Quick Start
Mine patterns from a single Markdown or plain-text file:
mplm mine examples/example1.md --out patterns.yml --min-support 1
Compile the mined library into an LLM-ready bridge spec:
mplm compile-spec patterns.yml --out compiled-pattern-spec.yml
Validate a folder of drafts against one compiled template:
mplm validate-index compiled-pattern-spec.yml examples/dashboard-drafts \
--template-id tmpl-l3-s-s-s \
--out validation-index.json
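To make the idea of a batch validation bundle concrete, the sketch below checks every `.md` draft in a folder against one rule and collects the results into a JSON-serializable dictionary. The rule (an expected paragraph count) and the bundle keys (`total`, `passed`, `results`) are invented for illustration; the real `validation-index.json` schema and the checks behind `validate-index` are not described here and may differ.

```python
import json
import re
from pathlib import Path


def count_paragraphs(text):
    """Count blocks of text separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip())
    return len([b for b in blocks if b.strip()])


def validate_folder(root, expected_paragraphs):
    """Apply one hypothetical rule to each draft and aggregate the results."""
    results = []
    for path in sorted(Path(root).glob("*.md")):
        found = count_paragraphs(path.read_text(encoding="utf-8"))
        results.append({
            "draft": path.name,
            "paragraphs": found,
            "passed": found == expected_paragraphs,
        })
    return {
        "total": len(results),
        "passed": sum(r["passed"] for r in results),
        "results": results,
    }


def save_bundle(bundle, out_path):
    """Write the aggregate bundle as indented JSON."""
    Path(out_path).write_text(json.dumps(bundle, indent=2), encoding="utf-8")
```

A call such as `save_bundle(validate_folder("examples/dashboard-drafts", 3), "toy-index.json")` would write one entry per draft; the output file name here is arbitrary.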
Render the dashboard from the JSON bundle:
mplm render-dashboard validation-index.json --out validation-dashboard.html
Or render directly from the compiled spec plus drafts:
mplm render-dashboard \
--spec-path compiled-pattern-spec.yml \
--drafts-root examples/dashboard-drafts \
--template-id tmpl-l3-s-s-s \
--out validation-dashboard.direct.html
Preview the parsed AST without mining:
mplm preview examples/example1.md
View pipeline log messages:
mplm --log-level INFO mine examples/example1.md --out patterns.yml
What the Repo Produces
patterns.yml: mined structural evidence
compiled-pattern-spec.yml: bridge schema for templates, prompts, and validation
validation-index.json or .md: aggregate batch validation bundle
validation-dashboard.html: static review dashboard with filtering and CSV export
Validation and Dashboards
Use the dedicated guide for the current review workflow:
Python API
from mplm import run_pipeline
from mplm.io import load_text, save_library
text = load_text("examples/example1.md")
lib = run_pipeline(text, source="example1.md", min_support=2)
save_library(lib, "patterns.yml")
for p in lib.patterns:
print(p.id, p.structure["signature"], p.support)
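Once a library is loaded, a common next step is ranking patterns by support. The helper below is a small sketch that relies only on the three attributes used in the loop above (`id`, `structure["signature"]`, `support`). `top_patterns` is a hypothetical name, and the demo objects are made-up stand-ins so the snippet runs without MPLM installed.

```python
from types import SimpleNamespace


def top_patterns(patterns, n=5):
    """Return the n patterns with the highest support, ties broken by id."""
    return sorted(patterns, key=lambda p: (-p.support, p.id))[:n]


# Stand-in objects with invented values; real patterns come from run_pipeline().
demo = [
    SimpleNamespace(id="p1", structure={"signature": "S-S"}, support=2),
    SimpleNamespace(id="p2", structure={"signature": "S-S-S"}, support=5),
    SimpleNamespace(id="p3", structure={"signature": "S"}, support=2),
]

for p in top_patterns(demo, n=2):
    print(p.id, p.structure["signature"], p.support)
```

With a real library, the same call would be `top_patterns(lib.patterns, n=10)`, assuming `lib.patterns` is an iterable of objects with those attributes.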
Run Tests
pytest # all tests + coverage report
pytest tests/unit/ -v # unit tests only
pytest tests/integration/ -v # integration tests only
Architecture
See DESIGN.md for the full pipeline diagram, stage contracts, and design pattern justification.