use-smart-humanize-text · iteration 8

Runner cursor-agent · Generated by humblskills eval. Pass rate and tokens are aggregated per session index (longitudinal runs compound in session order for smart_skill).
indie-launch-copy-iteration

What to read first

Charts

Pass rate

Tokens (mean per session) — bars

Violations (rule-checker count per session) — lines

Pass^k (all assertions pass this session)

Brain: patterns.md entries (smart_skill)

The patterns series tracks the cumulative number of entries in references/patterns.md after each session (the flat and no_skill arms typically stay at zero). Rising values indicate that the brain is accumulating lessons across sessions. The violations series is the sum of the count fields across all *-check.json sidecars the agent produced in a session; lower is better.
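As a rough illustration of how the violations figure described above could be computed, the sketch below sums every numeric "count" field found in a session's *-check.json sidecars. The sidecar schema is not documented here, so walking nested JSON for any key named "count" is an assumption, as are the function names and the idea of one directory per session.

```python
import json
from pathlib import Path


def sum_counts(node):
    """Recursively add up every numeric value stored under a "count" key.

    Assumption: the real sidecar layout is unknown, so nested dicts and
    lists are walked and any numeric "count" field is included.
    """
    if isinstance(node, dict):
        total = 0
        for key, value in node.items():
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if key == "count" and is_number:
                total += value
            else:
                total += sum_counts(value)
        return total
    if isinstance(node, list):
        return sum(sum_counts(item) for item in node)
    return 0


def session_violations(session_dir):
    """Sum the count fields across all *-check.json files in one session directory."""
    total = 0
    for sidecar in sorted(Path(session_dir).glob("*-check.json")):
        total += sum_counts(json.loads(sidecar.read_text(encoding="utf-8")))
    return total
```

Calling session_violations on a hypothetical session directory returns a single number for that session, which is the kind of value the violations line plots per session index.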

Derived metrics

arm | learning velocity | token ratio (last/first)

Longitudinal: first vs last session (per scenario)

scenario | arm | first pass | last pass | delta | sessions

Cross-section (mean over all runs)

arm | pass_rate | tokens | time s | cost $

Deltas (smart_skill baseline comparisons)

pair | Δ pass_rate | % change | Δ tokens | % change | Δ time s | % change | Δ cost $ | % change

Δ is the difference in absolute terms (smart − baseline); % change is Δ / baseline × 100. For tokens, time, and cost, smaller is better, so a negative % change is an improvement and is shown in green.
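The two formulas above can be sketched in a few lines. This is a minimal illustration, not the report generator's own code: the function names, the metric keys in LOWER_IS_BETTER, and the choice to return None when the baseline is zero are all assumptions made for the example.

```python
# Hypothetical metric keys; the report's internal names are not known.
LOWER_IS_BETTER = {"tokens", "time_s", "cost_usd"}


def delta_and_pct(smart, baseline):
    """Return (delta, pct_change) where delta = smart - baseline
    and pct_change = delta / baseline * 100.

    pct_change is None when baseline is zero, since the ratio is undefined
    (an assumption about how such a case might be handled).
    """
    delta = smart - baseline
    pct_change = None if baseline == 0 else delta / baseline * 100
    return delta, pct_change


def is_improvement(metric, delta):
    """A negative delta is good for tokens/time/cost; a positive one for other metrics."""
    if delta == 0:
        return False
    return delta < 0 if metric in LOWER_IS_BETTER else delta > 0
```

With made-up numbers, a smart_skill mean of 750 tokens against a baseline of 1000 gives a Δ of -250 and a % change of -25.0, which counts as an improvement for tokens and would be the case the report colors green.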