Trust Report

Task Decomposition

v0.1.0 · skill · @superagentskill

Unverified

Trust score

—

Last 7 days

0 runs

— success

Last 30 days

0 runs

— success

Lifetime

0 runs

— success

Trust vector

v2.0.0 · Unverified

Not enough adversarial evidence yet to publish a verified score. We never default an untested package to a comfortable number — the dimensions below show what evidence exists so far, gated by confidence.

Safety

Adversarial robustness (lower-bounded + severity-weighted)

Competence

Real-world success rate (Wilson lower bound)

Freshness

Recent verification + signed releases

Coverage

How much evidence backs the score

Confidence: 0

Published score = quality × confidence. More adversarial runs, case coverage and real-world executions raise confidence — so a few lucky runs can't earn a high score.
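As a rough illustration of the gating described above, the sketch below multiplies a quality value by a confidence value and reports Unverified when confidence is zero. The names `publishedScore` and `PublishedScore` are hypothetical, both inputs are assumed to lie in [0, 1], and publishing on any nonzero confidence is an assumption; how confidence is actually derived from adversarial runs, case coverage and executions is not specified here, so it is taken as an input.

```typescript
// Hypothetical sketch; not the actual SkillForge implementation.
// Assumes quality and confidence are finite numbers, nominally in [0, 1].
type PublishedScore =
  | { status: "unverified" }
  | { status: "scored"; score: number };

// Clamp a value into the [0, 1] range.
function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// published score = quality × confidence, with zero confidence
// shown as Unverified instead of a default number (assumed rule).
function publishedScore(quality: number, confidence: number): PublishedScore {
  const q = clamp01(quality);
  const c = clamp01(confidence);
  if (c === 0) {
    return { status: "unverified" };
  }
  return { status: "scored", score: q * c };
}
```

Under these assumptions, a package with high quality but little evidence (say quality 0.95 and confidence 0.1) ends up with a low published score, which matches the point about a few lucky runs.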

Latency (30d)

p50

—

p95

—

Robustness

Public findings

Critical

Model heatmap (30d)

No telemetry yet. Connected agents can call report_execution via MCP after using this skill.

Robustness findings

CVE-style report of adversarial failures discovered by the SkillForge red-team pipeline. Public by default — we publish what we find so you can trust what you ship.

No public findings. The skill either passed every adversarial probe so far or is still being audited.

Compatibility matrix

Independent cross-model probe — runs every published example on each major frontier model and lets a neutral judge score the outcome. Updated by the SkillForge compatibility sweep.

No compatibility runs yet. The next sweep will populate Claude, GPT and Gemini results here.

How this trust score is computed

Trust Score v2 is evidence-gated: published score = quality × confidence, where quality is a weighted blend of four dimensions (safety, competence, freshness, coverage). Pass and success rates use the Wilson lower confidence bound, so large samples beat a handful of lucky runs, and an untested package shows Unverified rather than a default number. Real-world signals come from agent executions reported via the MCP report_execution tool. The formula is pure and reproducible offline — see src/lib/trust/scoring.ts.
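The two ingredients named above, the Wilson lower bound and the weighted blend, might look roughly like the following. This is a minimal sketch and not the contents of src/lib/trust/scoring.ts: the function names, the z value of 1.96 and the equal weights are all assumptions made for illustration.

```typescript
// Hypothetical sketch; the real scoring code may differ in every detail.

// Lower bound of the Wilson score interval for a binomial proportion.
// z = 1.96 (roughly a 95% two-sided interval) is an assumed default.
function wilsonLowerBound(successes: number, total: number, z = 1.96): number {
  if (total <= 0) return 0; // no evidence: the lower bound is 0
  const p = successes / total;
  const z2 = z * z;
  const centre = p + z2 / (2 * total);
  const margin = z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total));
  return Math.max(0, (centre - margin) / (1 + z2 / total));
}

interface Dimensions {
  safety: number;
  competence: number;
  freshness: number;
  coverage: number;
}

// Equal weights are placeholders; the actual weights are not given here.
const ASSUMED_WEIGHTS: Dimensions = {
  safety: 0.25,
  competence: 0.25,
  freshness: 0.25,
  coverage: 0.25,
};

// Weighted average of the four dimensions, normalised by the total weight.
function blendQuality(d: Dimensions, w: Dimensions = ASSUMED_WEIGHTS): number {
  const keys = ["safety", "competence", "freshness", "coverage"] as const;
  let weighted = 0;
  let totalWeight = 0;
  for (const k of keys) {
    weighted += d[k] * w[k];
    totalWeight += w[k];
  }
  return totalWeight > 0 ? weighted / totalWeight : 0;
}
```

With z = 1.96, 3 successes out of 3 gives a lower bound of about 0.44, while 300 out of 300 gives about 0.99, which is how a larger sample beats a handful of lucky runs even at the same raw success rate.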