Claim Verification · Medical

A hard gate between your AI and ungrounded output

LLMs confidently cite papers that contradict their own claims. Skippy Ground verifies every medical claim before it reaches your users — returning a calibrated verdict, source citations, and explicit contradictions in a single API call.

Request a Demo · API Reference →

POST /v1/ground/verify · 200 OK

Claim

Simvastatin combined with clarithromycin significantly increases plasma statin levels via CYP3A4 inhibition.

SUPPORTED · cross-validated

0.91

T1 · DailyMed · 0.94

T2 · DrugBank · 0.89

T2 · DDInter · 0.87

source_count: 3 · conflicts: [] · retraction_flagged: false

4 verdicts

SUPPORTED · CONTRADICTED · CONTESTED · NOT_COVERED

4 tiers

T1–T4 source weighting — tier_weight: 1.0 → 0.45

16+ domains

DDI, PGx, PA, RWE, Trials, Coding, and more

SHA-256

Hash-chained audit on every verification call

Live Demo

Try Skippy Ground

Claim verification, AI output grounding, and prior authorization evaluation — same API shape as production.

POST /v1/ground/verify · application/json · simulation

SUPPORTED · cross-validated

0.91

Evidence Chain — 3 sources

T1 · DailyMed · 0.94

Clarithromycin is a potent CYP3A4 inhibitor and markedly increases simvastatin AUC; concomitant use is contraindicated.

T2 · DrugBank · 0.89

Simvastatin is a major CYP3A4 substrate; strong inhibitors increase myopathy and rhabdomyolysis risk.

T2 · DDInter · 0.87

Macrolide CYP3A4 inhibition produces an 8–12× increase in simvastatin plasma levels.

Simulated output representative of real API responses. ECE-calibrated confidence: 0.85 confidence = 85% empirical accuracy. Every call generates a SHA-256 hash-chained audit record.

What It Does

Six endpoints. One API.

Claim verification, output grounding, prior authorization, audit retrieval, safety signal evaluation, and retraction checking — all callable from a single integration.

POST /v1/ground/verify

Claim verification

Verifies an individual medical claim against the evidence base. Returns a four-way verdict (SUPPORTED / CONTRADICTED / CONTESTED / NOT_COVERED), calibrated confidence score, full evidence chain with source tier labels, and conflict detail when sources disagree.
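As a rough illustration of what a call to this endpoint might look like from Python, the sketch below builds the request without sending it and summarizes a response shaped like the demo panel above. Only the path /v1/ground/verify and the displayed values come from this page; the base URL, the Bearer auth header, the "claim" request field, and the response key names are assumptions made for the example.

```python
import json
import urllib.request

# Hypothetical base URL; only the path /v1/ground/verify appears on this page.
BASE_URL = "https://api.example.com"


def build_verify_request(claim: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/ground/verify."""
    # The "claim" field name is an assumption, not a documented schema.
    body = json.dumps({"claim": claim}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/ground/verify",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
    )


def summarize(response: dict) -> str:
    """Render a one-line summary of a response shaped like SAMPLE below."""
    sources = ", ".join(
        f"{s['tier']} {s['name']} {s['confidence']:.2f}" for s in response["sources"]
    )
    return f"{response['verdict']} ({response['confidence']:.2f}) from {sources}"


# Values mirror the demo panel; the key names are assumed for illustration.
SAMPLE = {
    "verdict": "SUPPORTED",
    "confidence": 0.91,
    "sources": [
        {"tier": "T1", "name": "DailyMed", "confidence": 0.94},
        {"tier": "T2", "name": "DrugBank", "confidence": 0.89},
        {"tier": "T2", "name": "DDInter", "confidence": 0.87},
    ],
    "source_count": 3,
    "conflicts": [],
    "retraction_flagged": False,
}
```

Sending the request would be a matter of passing it to urllib.request.urlopen and decoding the JSON body; that step is left out here so the sketch runs without network access or credentials.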

POST /v1/ground/score-output

AI output grounding

Accepts a block of AI-generated text and scores each extractable sentence independently. Returns per-sentence verdict, grounding score, source attribution, and an aggregate grounded_fraction — so you know exactly which claims in an LLM response are supported and which are hallucinated.
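One plausible reading of grounded_fraction is the share of scored sentences whose verdict is SUPPORTED. The sketch below computes it that way on the client side. The SentenceResult type and that definition are assumptions for illustration; the page does not state how the service computes the aggregate.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentenceResult:
    """Hypothetical per-sentence result; field names are assumed."""
    text: str
    verdict: str  # SUPPORTED, CONTRADICTED, CONTESTED, or NOT_COVERED


def grounded_fraction(results: List[SentenceResult]) -> float:
    """Share of sentences with a SUPPORTED verdict (assumed definition)."""
    if not results:
        return 0.0
    supported = sum(1 for r in results if r.verdict == "SUPPORTED")
    return supported / len(results)


def ungrounded_sentences(results: List[SentenceResult]) -> List[str]:
    """Sentences a caller may want to strip or flag before display."""
    return [r.text for r in results if r.verdict != "SUPPORTED"]
```

Keeping the list of non-supported sentences alongside the fraction lets a caller remove or annotate specific sentences instead of rejecting the whole response.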

POST /v1/auth/evaluate

Prior authorization evaluation

Evaluates a drug-disease pair for prior authorization eligibility against FDA-approved indications, CPIC guidelines, and payer NCD/LCD coverage policies. Returns APPROVE / DENY / HUMAN_REVIEW / INSUFFICIENT_EVIDENCE with a decision_id and SHA-256 snapshot_id for appeals.

GET /v1/auth/audit/{decision_id}

Audit record retrieval

Retrieves the full immutable audit record for any prior authorization decision: input claim, verdict, evidence chain, snapshot_id, and timestamp. Records are hash-chained for tamper evidence and retained for 7 years. Appeals rationale is available via /v1/auth/appeals-rationale.
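To make "hash-chained" concrete, here is a minimal sketch of the general technique: each record's SHA-256 hash covers its own payload plus the previous record's hash, so altering any earlier record invalidates every hash after it. The record layout, the genesis value, and the JSON canonicalization are assumptions for the example and are not the service's actual audit format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS_HASH = "0" * 64  # assumed starting value for the first record


def record_hash(payload: Dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_record(chain: List[Dict], payload: Dict) -> None:
    """Append a record that links to the hash of the last record in the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"payload": payload, "prev_hash": prev_hash, "hash": record_hash(payload, prev_hash)}
    )


def verify_chain(chain: List[Dict]) -> bool:
    """Recompute every hash in order; any edit to an earlier record breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["payload"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

A verifier that holds only the most recent hash can detect tampering anywhere in the history by replaying the chain from the first record.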

POST /v1/rwe/safety-signal

Real-world safety signal

Evaluates whether a proposed drug-event association has post-market safety signal evidence. Draws on FAERS, SIDER, and ONSIDES. Returns signal strength with source citations — used for pharmacovigilance grounding and RWE claim verification.

POST /v1/ground/retraction-check

Retraction check

Flags AI output that cites retracted or expression-of-concern literature. Queries the Retraction Watch database and CrossRef. Returns a retraction_flagged_count and per-citation status — so retracted evidence never silently reaches clinical decisions.

Verdict System

Four verdicts. No silent ambiguity.

Every verification call returns one of four verdicts — with explicit justification. Contested findings are never quietly dropped. Ungrounded claims are never passed through.

SUPPORTED

Multiple tier-weighted evidence sources agree on the claim. Confidence exceeds the calibrated threshold. Cross-validation confirms source convergence — no material contradictions detected.

CONTRADICTED

The claim directly conflicts with high-confidence evidence. At least one T1 or T2 source returns an opposing finding with higher confidence than the supporting evidence. Confidence in the original claim falls below threshold.

CONTESTED

Sources disagree without clear resolution. A confidence penalty (−0.10 to −0.20) is applied. Both sides are surfaced explicitly — with their sources and relative weights. Ambiguity is never silently dropped.

NOT_COVERED

The evidence base holds no findings that meet the threshold for this claim. Reason codes: NO_BELIEFS (claim is outside current domain coverage), LOW_CONFIDENCE (findings exist but fall below threshold), or DOMAIN_NOT_SUPPORTED.
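On the client side, the four verdicts above could be modeled as an enum with a strict gate in front of user-facing output. This is a sketch under assumptions: the policy of passing only SUPPORTED text, the UngroundedClaimError type, and the gate function are hypothetical choices for the example, not part of the API described here.

```python
from enum import Enum


class Verdict(str, Enum):
    SUPPORTED = "SUPPORTED"
    CONTRADICTED = "CONTRADICTED"
    CONTESTED = "CONTESTED"
    NOT_COVERED = "NOT_COVERED"


class UngroundedClaimError(Exception):
    """Raised when a claim must not be shown to users (hypothetical type)."""

    def __init__(self, verdict: Verdict, reason_code: str = ""):
        super().__init__(f"claim blocked: {verdict.value} {reason_code}".strip())
        self.verdict = verdict
        self.reason_code = reason_code


def gate(text: str, verdict: Verdict, reason_code: str = "") -> str:
    """Return the text only when the verdict is SUPPORTED; otherwise raise.

    Blocking CONTESTED and NOT_COVERED is an assumed policy; a caller could
    instead route those verdicts to human review.
    """
    if verdict is Verdict.SUPPORTED:
        return text
    raise UngroundedClaimError(verdict, reason_code)
```

Raising instead of returning a flag means a caller cannot forward a blocked claim by forgetting to check a return value.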

Evidence Architecture

Four source tiers.

Confidence is tier-weighted — not source-counted. A single T1 regulatory label contributes more signal than multiple T4 spontaneous reports. Weights are calibrated against held-out data across 143 medical sources.

T1 · weight 1.0

Regulatory & Clinical Guidelines

FDA DailyMed · CPIC · NCD / LCD (CMS) · PMDA

Authoritative regulatory labels and evidence-based clinical guidelines. Highest weight. Contradictions from T1 sources drive CONTRADICTED verdicts.

T2 · weight 0.85

Curated Biomedical Databases

DrugBank · DrugCentral · PharmGKB

Manually curated small-molecule and pharmacogenomics databases with expert curation. High weight — structural and mechanistic depth that regulatory labels lack.

T3 · weight 0.65

Peer-Reviewed Literature

Cochrane Reviews · PubMed indexed studies

Systematic reviews and primary literature. Moderate weight — broad coverage but variable quality. Retraction-checked before inclusion in evidence chains.

T4 · weight 0.45

Post-Market Surveillance

FAERS · SIDER · ONSIDES

Spontaneous adverse event and real-world safety signal databases. Lower weight due to reporting bias — informative for signal detection, not mechanism verification.
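To show how tier weighting differs from simple source counting, the sketch below combines per-source confidences as a weighted mean using the four tier weights listed above. The weighted-mean formula is an assumption chosen for illustration. The page does not describe the actual aggregation, which also involves cross-validation and calibration, so the output should not be expected to match the displayed scores.

```python
from typing import List, Tuple

# Tier weights as listed on this page.
TIER_WEIGHTS = {"T1": 1.0, "T2": 0.85, "T3": 0.65, "T4": 0.45}


def tier_weighted_confidence(sources: List[Tuple[str, float]]) -> float:
    """Weighted mean of (tier, confidence) pairs; an illustrative formula only."""
    if not sources:
        raise ValueError("at least one source is required")
    total_weight = sum(TIER_WEIGHTS[tier] for tier, _ in sources)
    weighted_sum = sum(TIER_WEIGHTS[tier] * conf for tier, conf in sources)
    return weighted_sum / total_weight


# The three sources from the demo panel: DailyMed (T1), DrugBank and DDInter (T2).
demo_sources = [("T1", 0.94), ("T2", 0.89), ("T2", 0.87)]
demo_score = tier_weighted_confidence(demo_sources)  # roughly 0.90 under this formula
```

Under this formula, one T1 source at 0.90 combined with two T4 sources at 0.30 yields about 0.62, compared with 0.50 for an unweighted mean, which illustrates how a single regulatory label can outweigh several lower-tier reports.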

Coverage

Sixteen medical domains.

Ground verifies claims across the core domains of clinical AI — drug interactions, pharmacogenomics, prior authorization, and beyond.

Drug-Drug Interactions

~1.2M catalogued pairs via DDInter + OpenFDA

Pharmacogenomics

CPIC-grounded gene-drug pairs across 11 pharmacogenes

Prior Authorization

FDA-approved indications + CMS NCD/LCD coverage

Safety Signals (RWE)

FAERS adverse event associations with vigiRank scoring

Drug Repurposing

Off-label use evidence via DrugCentral + Open Targets

Clinical Trials

ClinicalTrials.gov endpoint and eligibility claims

Pharmacovigilance

Signal detection + Naranjo causality scoring

Diagnosis Coding

ICD-10 medical necessity claims against NCD/LCD

The hallucination problem in clinical AI

LLMs cite papers that don't exist. They confidently endorse drug combinations that are contraindicated. They recommend doses outside FDA-approved ranges and justify them with plausible-sounding mechanistic detail.

These are not edge cases. In a clinical context, an unverified claim about a contraindication, a wrong dose, or a hallucinated drug interaction can cause direct patient harm. Skippy Ground is a hard gate — not a soft warning. Ungrounded output is rejected at the API level before it reaches your users.

Calibrated confidence, not a float from a model

“0.85 confidence means 85% of claims with this confidence score are empirically accurate — not that the model was 85% certain when it generated the response.”

Ground confidence scores are calibrated against held-out evaluation data. ECE = 0.09 on the 707-item Cochrane benchmark (Gate 1 confirmed passing). The score is a probability estimate derived from evidence quantity, source tier weights, and cross-validation — not a model logit. A model can be highly confident in a wrong answer. Ground is confident only when evidence agrees.
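For readers unfamiliar with the metric, expected calibration error (ECE) is commonly computed by binning predictions by confidence and taking the sample-weighted average gap between each bin's mean confidence and its observed accuracy. The sketch below implements that standard equal-width binned form. The bin count of 10 is an assumption, and nothing here is specific to how Ground evaluates its own benchmark.

```python
from typing import List


def expected_calibration_error(
    confidences: List[float], correct: List[bool], n_bins: int = 10
) -> float:
    """Equal-width binned ECE: sum over bins of (bin share) * |mean confidence - accuracy|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one prediction is required")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece
```

A set of claims all scored 0.85, of which 85% are correct, gives an ECE of essentially zero, which is the behavior the quoted definition describes. If only half of a set of claims scored 0.90 were correct, the ECE would be 0.40.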

Who It's For

Teams deploying clinical AI

EHR Vendors

Ground AI-generated clinical summaries, medication recommendations, and diagnostic suggestions before they reach clinicians.

Clinical AI Developers

Add a hard verification step to any LLM output that touches clinical claims — drug interactions, dosing, indications, contraindications.

Health Plans & Payers

Evidence-backed prior authorization decisions with SHA-256 audit records ready for member appeals and regulatory review.

Life Sciences

Ground AI-generated medical content for regulatory submissions, publications, and compendia against the same evidence base used in clinical decisions.

Regulatory

Regulatory-grade by design

Every verification request produces an immutable SHA-256 hash-chained audit record — input claim, verdict, evidence chain, source versions, and timestamp — suitable for regulatory submission without post-processing.

Prior authorization decisions carry a decision_id and snapshot_id traceable to the exact evidence versions used at decision time. Appeals rationale is retrievable via /v1/auth/appeals-rationale. 7-year audit log retention meets FDA 21 CFR Part 11 electronic records requirements.

HIPAA-ready with BAA available. AES-256 encryption at rest, TLS 1.3 in transit, VPC isolation. ECE-calibrated confidence (ECE = 0.09, Gate 1 confirmed) meets FDA SaMD AI/ML guidance requirements.

HIPAA Ready · BAA Available · FDA 21 CFR Part 11 · FHIR R4 · CDS Hooks · SOC 2 Type II Certified

Other Skippy medical products

Skippy DDI · Drug Interactions

Drug-drug interactions cause 30% of adverse drug events. DDI checks all N×(N-1)/2 pairs in one call — explains the CYP mechanism, scores panel risk, and returns verified alternatives.

Learn more →
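The N×(N-1)/2 figure quoted for Skippy DDI is the number of unordered pairs in a medication list of N drugs. A short sketch of enumerating those pairs with the standard library follows; the function name and the deduplication step are illustrative choices and not the product's interface.

```python
from itertools import combinations
from typing import List, Tuple


def interaction_pairs(drugs: List[str]) -> List[Tuple[str, str]]:
    """All unordered pairs of distinct drugs: N*(N-1)/2 pairs for N unique names."""
    unique = sorted(set(drugs))  # drop duplicates so no drug is paired with itself
    return list(combinations(unique, 2))


panel = ["simvastatin", "clarithromycin", "warfarin", "amiodarone"]
pairs = interaction_pairs(panel)  # 4 drugs give 4*3/2 = 6 pairs
```

The count grows quadratically, so a 10-drug panel already produces 45 pairs to check.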

Skippy Pharmacogenomics · Pharmacogenomics

CPIC-grounded dosing recommendations for 11 pharmacogenes. Ground verifies PGx claims; PGx verifies the genotype-dose relationship that Ground is checking.

Learn more →

Skippy Pharmacovigilance · Safety Surveillance

Post-market safety signal detection with vigiRank composite scoring and Naranjo causality assessment. Ground uses the same FAERS/SIDER data to verify RWE claims.

Learn more →

Clinical Decision Support · Not a Substitute for Clinical Judgment. Skippy Ground is an evidence-grounded verification tool designed to support clinical AI systems. Verification results, verdicts, and prior authorization evaluations are intended to support — not replace — the judgment of qualified clinicians, pharmacists, and payers. All findings should be evaluated in the context of individual patient history, comorbidities, and local clinical guidelines.

See Ground in your workflow

We work with EHR vendors, health systems, clinical AI developers, and payers. Let's talk about your verification problem.

Request a Demo