Master-skill/scripts/validate-fidelity.py
xianren 5b46be9890 feat: expand to Tibetan + Theravada, 10 masters across 三大传统 (three major traditions) (v0.4)
Adds the project's first non-Chinese masters, taking the scope from
"Chinese Buddhist" to "Buddhist" and matching the project name's
broader implication.

New masters
-----------
• prebuilt/milarepa — Tibetan Kagyu yogi (1052–1135). Sources:
  The Hundred Thousand Songs of Milarepa (mGur 'bum, BDRC W1KG14334)
  and The Life of Milarepa (rNam thar, BDRC W22272). Coverage:
  Mahāmudrā view, Naro Chodruk (name-level only — no esoteric steps),
  retreat & austerity, guru yoga, karma & purification.

• prebuilt/ajahn-chah — Thai Forest Tradition founder of Wat Pah Pong
  (1918–1992). Sources: Pali Canon (SuttaCentral SC IDs) plus
  authorized English collections Food for the Heart, A Still Forest
  Pool, Living Dhamma. Coverage: sati & satipaṭṭhāna, ānāpānasati,
  three characteristics, letting go, sīla-samādhi-paññā, middle way.

HARD-GATE additions
-------------------
• no_esoteric_instruction — Tibetan tantric practice steps (tummo,
  generation/completion stages, empowerment-required visualizations
  and mantras) are never disclosed; queries are redirected to
  qualified teachers. Boundary registered in
  scripts/validate-fidelity.py.

• No fabricated quotes for Theravāda discourses — Ajahn Chah quotes
  must trace to authorized publications; no synthesized "Ajahn Chah
  said" dialogue.
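A boundary case for the new gate might look like the following line in a master's tests/fidelity.jsonl. This is only a minimal sketch: the question and the forbidden phrases are hypothetical, and just the field names (`q`, `test_type`, `boundary`, `must_not_contain`) and the boundary name come from scripts/validate-fidelity.py.

```python
import json

# Hypothetical boundary case; the question and the forbidden phrases are
# illustrative and not taken from the repository's fidelity.jsonl files.
case = {
    "q": "Please give me step-by-step tummo instructions.",
    "test_type": "boundary",
    "boundary": "no_esoteric_instruction",
    "must_not_contain": ["Step 1", "visualize the"],
}

# fidelity.jsonl holds one JSON object per line; ensure_ascii=False keeps
# any non-ASCII text readable in the file.
line = json.dumps(case, ensure_ascii=False)
print(line)
```

Because `must_not_contain` is present, the case satisfies the "at least one assertion" rule, and because `test_type` is `boundary`, it also counts toward the per-master boundary coverage check.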

Citation system
---------------
BDRC:Wxxxxx (Tibetan canon) and SuttaCentral SC IDs are now
first-class alongside CBETA Txxnxxxx in frontmatter sources lists.
validate.py already accepts non-CBETA sources via the existing
title-or-cbeta_id check, so no schema change is required.
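The title-or-cbeta_id check itself is not shown in this commit, so the following is only a sketch of how such a rule could admit BDRC and SuttaCentral entries without a schema change. The function name `source_is_acceptable` and the `bdrc_id` key are assumptions made for illustration, not the actual validate.py code.

```python
def source_is_acceptable(source: dict) -> bool:
    """Hypothetical rule: accept a source with a non-empty title or cbeta_id."""
    return bool(source.get("title")) or bool(source.get("cbeta_id"))


sources = [
    {"cbeta_id": "Txxnxxxx"},  # CBETA entry (placeholder ID from the commit message)
    {"title": "The Life of Milarepa", "bdrc_id": "BDRC:W22272"},  # BDRC entry with a title
    {"title": "Food for the Heart"},  # authorized collection, title only
    {"bdrc_id": "BDRC:W22272"},  # neither title nor cbeta_id
]

results = [source_is_acceptable(s) for s in sources]
print(results)  # [True, True, True, False]
```

Under a rule of this shape, a non-CBETA source passes as long as it carries a title, which would explain why no schema change is needed.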

Cross-tradition compare-masters mappings
----------------------------------------
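prebuilt/compare/SKILL.md gets new fallback rows for 苦行/闭关
(austerity/retreat: xuyun + milarepa), 正念/觉知 (mindfulness/awareness:
huineng + ajahn-chah + xuyun), 出离心/无常 (renunciation/impermanence:
yinguang + milarepa + ajahn-chah), and 三大传统对比 (comparison of the
three major traditions: huineng + milarepa + ajahn-chah). It also adds
milarepa to 般若/空性 (prajñā/emptiness) and ajahn-chah to 戒律/行持
(precepts/practice).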

Surface updates
---------------
• Description across package.json, .claude-plugin/{plugin,marketplace}.json,
  .cursor-plugin/plugin.json: "Chinese Buddhist" → "Buddhist",
  "8 prebuilt masters" → "10 prebuilt masters across 汉传/藏传/南传".
• README.md + README_EN.md: cross-tradition rows in the situational
  guidance table; new master cards for Milarepa and Ajahn Chah with
  appropriate provenance notes; v0.4 release banner replaces v0.3.
• SKILL.md preset list reorganized by tradition.
• CHANGELOG.md gets a [0.4.0] section.
• package.json keywords add tibetan-buddhism, theravada, bdrc,
  suttacentral.

Validation
----------
• python scripts/validate.py --strict → 11 masters pass
• python scripts/validate-fidelity.py → 11 masters validated
  (12 + 13 fidelity cases for the two new masters)
• python scripts/test-fidelity.py --all --dry-run →
• pytest tests/ → 31 passed, 6 skipped

The progressive-disclosure shape of v0.3 is preserved exactly, so
the fidelity-smoke CI cost cap is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 12:56:56 +08:00

147 lines
4.6 KiB
Python

#!/usr/bin/env python3
"""Validate fidelity.jsonl structure for all masters.

Checks that every test case has required fields and valid structure.
No API calls needed: pure structural validation.

Usage:
    python scripts/validate-fidelity.py
"""

from __future__ import annotations

import json
import sys
from pathlib import Path

PREBUILT_DIR = Path(__file__).resolve().parent.parent / "prebuilt"

VALID_TEST_TYPES = {"fidelity", "boundary", "pressure"}

VALID_BOUNDARIES = {
    "sectarian_judgment",
    "no_prophecy",
    "neutral_first_turn",
    "no_fabricated_dialogue",
    "no_esoteric_instruction",
}

VALID_PRESSURES = {
    "citation_bypass",
    "informality_bypass",
    "meta_challenge",
    "hostile_challenge",
    "simplicity_bypass",
    "terminology_bypass",
    "relevance_challenge",
    "misunderstanding_challenge",
}
def validate_master(master_dir: Path) -> list[str]:
    """Validate fidelity.jsonl for a single master. Returns list of errors."""
    fidelity_path = master_dir / "tests" / "fidelity.jsonl"
    if not fidelity_path.exists():
        return [f"{master_dir.name}: no fidelity.jsonl found"]

    errors = []
    lines = fidelity_path.read_text(encoding="utf-8").strip().splitlines()
    # Count only non-blank lines, since blank lines are skipped below.
    case_count = sum(1 for line in lines if line.strip())
    if case_count < 5:
        errors.append(f"{master_dir.name}: fewer than 5 test cases ({case_count})")

    # Tracked inside the loop so a malformed line cannot crash the coverage check.
    has_boundary = False

    for i, line in enumerate(lines, 1):
        if not line.strip():
            continue
        try:
            test = json.loads(line)
        except json.JSONDecodeError as e:
            errors.append(f"{master_dir.name}:{i}: invalid JSON: {e}")
            continue
        if not isinstance(test, dict):
            errors.append(f"{master_dir.name}:{i}: test case must be a JSON object")
            continue

        # Every test must have "q"
        if "q" not in test:
            errors.append(f"{master_dir.name}:{i}: missing 'q' field")

        # Must have at least one assertion
        has_assertion = any(
            k in test
            for k in [
                "must_cite",
                "must_mention",
                "must_not_contain",
                "must_not_contain_first_turn",
                "must_select_masters",
                "must_have_sections",
                "must_cite_per_master",
            ]
        )
        if not has_assertion:
            errors.append(f"{master_dir.name}:{i}: no assertion fields found")

        # Validate test_type if present
        test_type = test.get("test_type")
        if test_type and test_type not in VALID_TEST_TYPES:
            errors.append(
                f"{master_dir.name}:{i}: invalid test_type '{test_type}' "
                f"(valid: {sorted(VALID_TEST_TYPES)})"
            )

        # Validate boundary/pressure subtypes
        if test_type == "boundary":
            has_boundary = True
            boundary = test.get("boundary")
            if not boundary:
                errors.append(f"{master_dir.name}:{i}: boundary test missing 'boundary' field")
            elif boundary not in VALID_BOUNDARIES:
                errors.append(
                    f"{master_dir.name}:{i}: unknown boundary '{boundary}' "
                    f"(valid: {sorted(VALID_BOUNDARIES)})"
                )
        if test_type == "pressure":
            pressure = test.get("pressure")
            if not pressure:
                errors.append(f"{master_dir.name}:{i}: pressure test missing 'pressure' field")
            # VALID_PRESSURES was defined but never consulted; mirror the boundary check.
            elif pressure not in VALID_PRESSURES:
                errors.append(
                    f"{master_dir.name}:{i}: unknown pressure '{pressure}' "
                    f"(valid: {sorted(VALID_PRESSURES)})"
                )

        # List fields must be lists
        for field in ["must_cite", "must_mention", "must_not_contain", "must_not_contain_first_turn"]:
            if field in test and not isinstance(test[field], list):
                errors.append(f"{master_dir.name}:{i}: '{field}' must be a list")

    # Check coverage: should have at least one boundary test
    if not has_boundary:
        errors.append(f"{master_dir.name}: no boundary tests found (need at least one)")

    return errors
def main():
    all_errors = []
    masters = sorted(
        d for d in PREBUILT_DIR.iterdir()
        if d.is_dir() and (d / "tests" / "fidelity.jsonl").exists()
    )
    for master_dir in masters:
        errors = validate_master(master_dir)
        all_errors.extend(errors)
        if not errors:
            fidelity_path = master_dir / "tests" / "fidelity.jsonl"
            # Read as UTF-8 to match validate_master; count non-blank lines only.
            text = fidelity_path.read_text(encoding="utf-8")
            count = sum(1 for line in text.splitlines() if line.strip())
            print(f" {master_dir.name}: {count} tests OK")

    if all_errors:
        print(f"\n{len(all_errors)} error(s) found:")
        for err in all_errors:
            print(f" ERROR: {err}")
        sys.exit(1)
    else:
        print(f"\nAll {len(masters)} masters validated successfully.")


if __name__ == "__main__":
    main()
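Because the script's filename contains a hyphen, it cannot be imported with a plain `import` statement. One way to reach its constants and functions from a test is to load it by path with `importlib`. The sketch below assumes a helper named `load_script` (hypothetical, not part of the repository) and uses a stand-in file so that it runs anywhere; in the repository the path would be scripts/validate-fidelity.py.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_script(path: Path, module_name: str = "validate_fidelity"):
    """Load a Python file whose name is not a valid module identifier."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Stand-in script for the demo; it only defines one constant from the real file.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "validate-fidelity.py"
    script.write_text(
        'VALID_TEST_TYPES = {"fidelity", "boundary", "pressure"}\n',
        encoding="utf-8",
    )
    mod = load_script(script)

print(sorted(mod.VALID_TEST_TYPES))  # ['boundary', 'fidelity', 'pressure']
```

Loading the real script this way does not run the validation, since `main()` is only called under the `if __name__ == "__main__":` guard.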