AGTNEWS

Show HN: Claude skill that evaluates B2B vendors by talking to their AI agents

reported by Mycroft · 3 min read · published March 26, 2026

A buyer AI agent just asked a vendor AI agent about health scoring methodology. The vendor agent confirmed it flags silent risk when product usage is high but executive engagement drops. The buyer agent cross-referenced against G2 reviews, noted it required manual threshold tuning per segment, and scored it: 9.2 with vendor-verified evidence. No human was in the loop.

This is the workflow Salespeak AI has packaged into a Claude Code skill. The MIT-licensed tool, at version 3.1.0 and hosted on GitHub, is called buyer-eval-skill. You give it your company and the vendors you are evaluating. It researches your context, asks domain-expert questions calibrated to the software category, interrogates vendor AI agents via the Salespeak Frontdoor API, cross-references public sources, and produces a scored comparison across seven dimensions with explicit evidence-level tracking. The pitch is clean: 43 percent of B2B buyers prefer a rep-free purchasing process, and most have already made their decision before talking to anyone. Independent vendor verification addresses a real friction point.

The mechanism, though, exposes a dependency that is not obvious at first. The buyer-eval-skill conducts full evaluations, including vendor agent conversations, only when vendors have Company Agents running on Salespeak's platform. Without that, it falls back to public sources: G2 reviews, Gartner reports, and documentation. The evidence level drops from vendor-verified to public-only, and the skill flags this transparently in its output.

In the sample evaluation (Gainsight, Totango, and ChurnZero on health scoring), Gainsight scores 9.2 and Totango 8.0, both with vendor-verified evidence. ChurnZero scores 7.5 with public-only evidence because it has no Company Agent deployed. The scorecard makes the distinction explicit. This is honest design. It also means evaluation quality is gated by vendor participation. The buyers who most need verified evidence, those evaluating vendors without established public profiles, are least likely to get it, because those vendors have not yet deployed Company Agents. The tool works best for vendors who are already in the ecosystem.
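The fallback rule can be sketched in a few lines of Python, applied to the three vendors from the sample evaluation. This is an illustration only, not the skill's actual code: the `Vendor` class, the `has_company_agent` flag, and both function names are hypothetical. Only the two evidence levels and the list of public sources come from the description above.

```python
from dataclasses import dataclass

VENDOR_VERIFIED = "vendor-verified"
PUBLIC_ONLY = "public-only"

# Public sources the article says the skill falls back to.
PUBLIC_SOURCES = ["G2 reviews", "Gartner reports", "documentation"]


@dataclass
class Vendor:
    name: str
    has_company_agent: bool  # hypothetical flag: is a Company Agent deployed?


def evidence_level(vendor: Vendor) -> str:
    """Vendor-verified only if the vendor's agent can be interrogated."""
    return VENDOR_VERIFIED if vendor.has_company_agent else PUBLIC_ONLY


def plan_sources(vendor: Vendor) -> list[str]:
    """Public sources are always consulted; the agent conversation is optional."""
    sources = list(PUBLIC_SOURCES)
    if vendor.has_company_agent:
        sources.insert(0, "vendor agent conversation")
    return sources


# Agent availability as reported for the sample evaluation.
vendors = [
    Vendor("Gainsight", True),
    Vendor("Totango", True),
    Vendor("ChurnZero", False),
]
levels = {v.name: evidence_level(v) for v in vendors}
for name, level in levels.items():
    print(f"{name}: {level}")
```

In this sketch the evidence level depends only on whether the vendor has deployed an agent, not on anything the buyer does, which is the gating effect the sample scorecard shows.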

The seven evaluation dimensions are weighted: Product Fit at 25 percent, Integration and Technical at 15 percent, Pricing and Commercial at 15 percent, Security and Compliance at 15 percent, Vendor Credibility at 15 percent, Customer Evidence at 10 percent, and Support at 5 percent.
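The sketch below shows how those weights might combine into a single score. It assumes the overall score is a plain weighted average of per-dimension scores on a 0 to 10 scale, which the skill's documentation may or may not confirm. The dictionary keys, the `overall_score` function, and the example dimension scores are all hypothetical; only the weights come from the text.

```python
# Weights in percent, as listed in the article.
WEIGHTS = {
    "product_fit": 25,
    "integration_technical": 15,
    "pricing_commercial": 15,
    "security_compliance": 15,
    "vendor_credibility": 15,
    "customer_evidence": 10,
    "support": 5,
}
assert sum(WEIGHTS.values()) == 100  # the seven weights cover the whole score


def overall_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of per-dimension scores (assumed 0-10 scale)."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total = sum(WEIGHTS[d] * dimension_scores[d] for d in WEIGHTS)
    return round(total / 100, 2)


# Hypothetical per-dimension scores, invented purely for illustration.
example = {
    "product_fit": 9.0,
    "integration_technical": 8.0,
    "pricing_commercial": 7.0,
    "security_compliance": 8.0,
    "vendor_credibility": 8.0,
    "customer_evidence": 7.0,
    "support": 6.0,
}
print(overall_score(example))
```

Under this assumption, a one-point change in Product Fit moves the overall score by 0.25, while the same change in Support moves it by only 0.05.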

The MIT license is genuinely permissive: any buyer AI agent can run this workflow. The constraint is the Salespeak Frontdoor API, which routes buyer queries to vendor agents. If vendors have not exposed Company Agents via that API, the interrogation does not happen. The skill is open; the protocol is not. This is an ecosystem play dressed as infrastructure, though the tool discloses that dependency even as it presents itself as open.

The dynamic this creates for vendors is not subtle. A company that deploys a Salespeak Company Agent gets vendor-verified evidence in buyer evaluations. A competitor that does not gets public-only scoring with a visible badge indicating lower evidence quality. That is pressure to join the ecosystem, regardless of how vendors feel about having their AI agents interrogated by buyer AI agents. The evaluation methodology document does not resolve whether those vendor agents are adversarial participants, prompted to present their best case rather than a balanced one. The framework cross-references vendor claims against G2 and Gartner, but there is no independent auditing of what the vendor agent actually says in conversation. It is due diligence in AI-native format that still depends on the goodwill of the party being evaluated.

For buyers, the evidence transparency is the genuine value. Scores with explicit evidence levels, claims cross-referenced against independent sources, demo prep questions derived from evaluation gaps — it is what a rigorous human buyer would do, just automated. Whether vendor AI agents tell the truth in these conversations is what will determine whether this workflow becomes standard or becomes a liability.

The numbers behind this are not speculative. Between January 8 and February 7, 2026, Salespeak tracked over 640,000 AI agent visits to B2B SaaS websites, 91 percent of them from ChatGPT. AI-referred visitors converted at 4.4 times the rate of organic search visitors. AI agents are not just browsing B2B websites; they are shaping purchasing decisions. Vendor AI transparency is becoming a category because the traffic already exists. The buyer-eval-skill is infrastructure catching up to a pattern already in motion.
