I've been building an independent trust registry for open-source AI agents and the findings have been eye-opening.

The short version: I track 171 agents across 14 categories (coding agents, frameworks, browser agents, memory systems, etc.) and score them on verifiable trust signals — not stars or hype. The signals include OSSF Scorecard, build provenance (SLSA), signed commits, license transparency, and maintenance patterns.

What surprised me:

Only 3 out of 171 agents have enough independent signal coverage to earn a Grade A (broad verifiable evidence across multiple dimensions)

Some of the most-starred agents score poorly on trust because they have zero supply-chain verification — no scorecard, no provenance, no signed commits

The agent with 166k GitHub stars ranked #108 on trust. Part of that was a data bug I've since fixed, but part of it is genuine: popularity ≠ verifiability

Agents that do publish provenance and pass OSSF checks are often mid-tier on stars but rank near the top on trust

How the scoring works:

The formula weights signals by how hard they are to fake:

Safety/Integrity (30 pts): OSSF Scorecard, build provenance, signed commits

Identity (20 pts): verified listing + provenance binding

Transparency (20 pts): license + OSSF transparency checks

Maintenance (20 pts): commit freshness + activity

Adoption (10 pts): log-scaled, capped stars + downloads

Then the raw score gets multiplied by a confidence factor (how many signal types I actually have data for), so an agent I can't verify much about can't reach the top tier even if it's popular.
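To make that concrete, here is a rough sketch of how a score like this could be computed. The category maxima (30/20/20/20/10) come from the breakdown above; everything else is an assumption for illustration and not the registry's actual code: the function names, the 100,000 cap on stars plus downloads, the idea that each category is first reduced to a 0..1 fraction, and a confidence factor that is simply the share of categories with any data.

```python
import math
from typing import Mapping, Optional

# Category maxima as listed in the post.
WEIGHTS = {
    "safety": 30,
    "identity": 20,
    "transparency": 20,
    "maintenance": 20,
    "adoption": 10,
}


def adoption_fraction(stars: int, downloads: int, cap: int = 100_000) -> float:
    """Log-scaled, capped popularity in [0, 1].

    The cap value and summing stars with downloads are assumptions.
    """
    total = min(stars + downloads, cap)
    return math.log10(1 + total) / math.log10(1 + cap)


def trust_score(fractions: Mapping[str, Optional[float]]) -> float:
    """Weighted raw score scaled by a confidence factor.

    `fractions` maps a category name to a value in [0, 1], or to None
    (or omits the key) when no data is available for that category.
    Confidence here is the share of categories with data (an assumption).
    """
    raw = sum(WEIGHTS[c] * (fractions.get(c) or 0.0) for c in WEIGHTS)
    observed = sum(1 for c in WEIGHTS if fractions.get(c) is not None)
    confidence = observed / len(WEIGHTS)
    return raw * confidence


# Hypothetical agent: very popular and active, but nothing else is verifiable.
popular_unverified = {
    "maintenance": 1.0,
    "adoption": adoption_fraction(166_000, 0),  # hits the cap, so 1.0
}

# Hypothetical agent: mid-tier on stars, data in every category.
verified_mid_tier = {
    "safety": 0.9,
    "identity": 1.0,
    "transparency": 1.0,
    "maintenance": 0.8,
    "adoption": 0.5,
}

print(round(trust_score(popular_unverified), 1))  # 12.0
print(round(trust_score(verified_mid_tier), 1))   # 88.0
```

Under these assumptions, the popular agent maxes out maintenance and adoption for a raw 30, but with data in only 2 of 5 categories the confidence factor of 0.4 drags it down to 12. The fully covered mid-tier agent keeps its raw 88.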

Why I built this:

With MCP and A2A taking off, agents are about to start calling other agents. There's currently no standardized way to answer "should Agent A trust Agent B?" before they interact. I'm trying to build toward that — the trust data is open (CC BY 4.0), machine-readable, and there's a compare tool with radar charts if you want to see how specific agents stack up.

Would love feedback on the methodology or agents you think are missing. The full leaderboard is at hvtracker and the methodology is published.


Source: reddit, domain: news, retention score: 0.71