Posted on

May 7, 2026

2026 ONC Health IT Certification: AI Note Provenance A Playbook for CMIOs & IT Directors

Posted on

May 14, 2026

2026 ONC Health IT Certification: AI Note Provenance — The Clinical Operations Playbook

TL;DR: ONC's HTI-1 rule (2026 certification surveillance) requires EHRs to expose the "lineage" of AI-generated clinical notes. CMS's signature guidance (MLN905364, July 2025) says clinicians using AI scribes need only sign to authenticate—but it never addresses how to prove which sentences were machine-generated, which were clinician-verified, or how to trace content back to source audio. Scribing.io closes that gap with sentence-level FHIR Provenance tagging: every AI-generated sentence receives a stable UUID, a fragment anchor, a SHA-256 hash linking it to source audio, and structured Device metadata—delivering the granular "Note Lineage" visibility that HTI-1 mandates and that payer audits now demand.

Table of Contents

  • 1. What HTI-1's 2026 Certification Surveillance Actually Requires for AI Note Lineage

  • 2. Sentence-Level FHIR Provenance — The Architecture Most Vendors Miss

  • 3. Clinical Logic Masterclass — Handling Unverified Copy-Forward in Anticoagulation Management

  • 4. Technical Reference: ICD-10 Documentation Standards for Stroke and Anticoagulation Lineage

  • 5. Copy-Forward Differential Analysis — The 6-Year Audit Lookback Architecture

  • 6. CMIO Implementation Checklist: 90-Day Deployment Timeline

  • 7. The HTI-1 Note Lineage Viewer — What Your ONC Testing Lab Expects

1. What HTI-1's 2026 Certification Surveillance Actually Requires for AI Note Lineage

The 21st Century Cures Act's Health Data, Technology, and Interoperability rule (HTI-1), finalized December 2023 with enforcement phased through 2026, introduced Decision Support Intervention (DSI) transparency requirements and expanded the definition of "predictive" tools subject to certification criteria. Scribing.io was engineered from its first commit against these criteria—not retrofitted after publication. For CMIOs running certification gap analyses right now, the practical obligations break down into three categories:

  1. Transparency of AI involvement — Certified EHRs must indicate when content is generated or suggested by an AI/ML system. This is not optional labeling; it is a certification condition subject to ONC-ACB surveillance.

  2. Source attribution — The system must identify the algorithm, model version, and training data epoch that produced the output. ONC's DSI certification criteria (§ 170.315(b)(11)) explicitly require "source attributes" that survive export.

  3. Risk management and ongoing surveillance — Developers must demonstrate audit trails that survive version upgrades and support retrospective review, including the ability to reconstruct which model version generated which content in which encounter at any point during a 6-year lookback window.

CMS's MLN905364 fact sheet (July 2025) acknowledges AI scribes for the first time, but its guidance is exactly one sentence: "If you use a scribe, including artificial intelligence technology, sign the entry to authenticate the documents and the care you provided or ordered. You don't need to document who or what transcribed the entry."

That sentence is doing enormous damage in the field. CMIOs are reading it as a compliance safe harbor. It is not. Signing authenticates the note as a whole; it does not satisfy HTI-1's requirement that AI-generated content be independently identifiable, traceable, and auditable at the element level. The gap between CMS signature policy and ONC certification criteria is where documentation risk concentrates—and where Scribing.io's architecture operates. For the regulatory landscape at the state level, our California AI Laws analysis maps how SB-1120 and AB-3030 compound these federal requirements.

CMS Signature Requirements vs. ONC HTI-1 Certification: Gap Analysis

| Dimension | CMS MLN905364 (2025) | ONC HTI-1 (2026 Surveillance) | Gap for CMIOs |
| --- | --- | --- | --- |
| Scope of authentication | Whole document (signature = acceptance) | Element-level AI identification | No sentence-level attribution in CMS guidance |
| AI identification | "You don't need to document who or what transcribed" | DSI transparency: model, version, data source required | Direct conflict; signing alone is insufficient for certification |
| Audit trail | Attestation statements, signature logs (created at any time) | Immutable provenance chain, hash-verified, export-ready | Attestation is retrospective; HTI-1 requires prospective metadata |
| Tamper evidence | "Protections against modification" (generic) | Cryptographic hash, content-addressable identifiers | No hash specification in CMS policy |
| Copy-forward handling | Not addressed | Implied by "ongoing surveillance" of AI outputs across encounters | Total blind spot in CMS guidance |

The AMA's Principles for Augmented Intelligence reinforce this distinction: Principle 3 ("Transparency and education of users") states that clinicians must be able to understand AI's contribution to clinical outputs. A whole-document signature, by definition, obscures that contribution. ONC's HTI-1 framework operationalizes the AMA principle into testable certification criteria—criteria your EHR must pass.

2. Sentence-Level FHIR Provenance — The Architecture Most Vendors Miss

Most EHR vendors implementing HTI-1 compliance stop at resource-level authorship: a single Provenance resource pointing at the entire Composition or DocumentReference. This satisfies a literal reading of FHIR R4's Provenance specification but fails the spirit of HTI-1's transparency mandate—and critically, fails under payer audit when a single contested sentence determines medical necessity for a 99214.

For context on how Scribing.io handles the security layer underpinning this architecture, see our Safety & Privacy Guide.

Scribing.io's Sentence-Level Provenance Model

Every machine-generated sentence in a Scribing.io note receives five discrete metadata artifacts:

  1. A stable UUID — Persists across revisions, exports, and system migrations. No reassignment on edit; the original UUID survives as a tombstone if the sentence is deleted, maintaining audit continuity.

  2. A fragment anchor — Rendered as Composition/{id}#sent-{uuid}, enabling direct deep-linking to the exact sentence in any FHIR-conformant viewer. This is the mechanism that allows an auditor to click a payer query straight to the contested sentence.

  3. A dedicated Provenance resource containing:

    • Provenance.target → the fragment reference (Composition/{id}#sent-{uuid})

    • Provenance.agent.who → Device/scribingio-LLM with extensions for model build, dataset epoch, and software version

    • Provenance.entity.what → ni:///sha-256;{hash} (RFC 6920 named-information URI) referencing the source audio Media resource and the specific transcript chunk

    • Provenance.recorded → ISO 8601 timestamp of generation

    • Provenance.activity → coded value from a controlled vocabulary: ai-generated, ai-suggested, ai-generated-unverified, clinician-edited, or clinician-originated

  4. DocumentReference.content.attachment.hash — SHA-256 of the final signed document, linking the rendered output to its provenance chain.

  5. EHI Export Bundle inclusion — All Provenance resources travel with the note during Electronic Health Information (EHI) exports per ONC's EHI Export requirements, supporting 6-year audit lookback and copy-forward differential analysis.

FHIR Provenance Resource Example

{
  "resourceType": "Provenance",
  "id": "prov-sent-7f3a2c91",
  "target": [
    { "reference": "Composition/note-20260115#sent-7f3a2c91" }
  ],
  "recorded": "2026-01-15T09:42:18Z",
  "activity": {
    "coding": [{
      "system": "http://scribing.io/fhir/activity-type",
      "code": "ai-generated",
      "display": "Machine-generated sentence"
    }]
  },
  "agent": [{
    "who": { "reference": "Device/scribingio-LLM" },
    "extension": [{
      "url": "http://scribing.io/fhir/ext/model-build",
      "valueString": "cardio-note-v4.2.1-epoch-20260108"
    }]
  }],
  "entity": [{
    "what": {
      "identifier": {
        "system": "urn:ietf:params:ni",
        "value": "ni:///sha-256;a3f2b8c1d4e5f6..."
      }
    },
    "role": "source"
  }]
}
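
A resource shaped like the one above could be assembled programmatically. The following is a minimal sketch in Python, not Scribing.io's actual implementation: the helper names `ni_uri` and `build_sentence_provenance` are hypothetical, the code-system and extension URLs are simply copied from the example, and the digest is encoded in the RFC 6920 form (`ni:///sha-256;` followed by unpadded base64url), which is an assumption about the intended encoding.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def ni_uri(data: bytes) -> str:
    """Return an RFC 6920 'ni' URI for the SHA-256 digest of `data`."""
    digest = hashlib.sha256(data).digest()
    # RFC 6920 uses base64url with the trailing '=' padding removed.
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{b64}"


def build_sentence_provenance(composition_id, sentence_uuid, audio_chunk,
                              activity="ai-generated",
                              model_build="cardio-note-v4.2.1-epoch-20260108",
                              recorded=None):
    """Build a Provenance dict targeting one sentence fragment (hypothetical helper)."""
    recorded = (recorded or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "resourceType": "Provenance",
        "id": f"prov-sent-{sentence_uuid}",
        "target": [
            {"reference": f"Composition/{composition_id}#sent-{sentence_uuid}"}
        ],
        "recorded": recorded.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "activity": {"coding": [{
            "system": "http://scribing.io/fhir/activity-type",
            "code": activity,
        }]},
        "agent": [{
            "who": {"reference": "Device/scribingio-LLM"},
            "extension": [{
                "url": "http://scribing.io/fhir/ext/model-build",
                "valueString": model_build,
            }],
        }],
        "entity": [{
            "role": "source",
            "what": {"identifier": {
                "system": "urn:ietf:params:ni",
                "value": ni_uri(audio_chunk),  # hash of the source audio chunk
            }},
        }],
    }


# The UUID is generated once per sentence and then reused across revisions.
sentence_id = str(uuid.uuid4())
prov = build_sentence_provenance("note-20260115", sentence_id, b"raw audio bytes")
print(prov["target"][0]["reference"])
```

Keeping the hash computation in its own function means the same routine can be reused later to re-verify stored audio against the recorded identifier.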

Why Resource-Level Provenance Fails the Audit

Consider what a CERT auditor sees when they pull a 99214 note documented with resource-level-only Provenance: "Author: Dr. Smith. AI involvement: Yes." That tells the auditor nothing about which content was AI-generated, whether it was verified, or what source data informed it. Contrast this with Scribing.io's output, where the auditor can query the specific ROS sentence, see it tagged as ai-generated-unverified or clinician-edited, and verify the audio hash. The difference is the difference between a denial and a clean pass.
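
The auditor's query described above amounts to a lookup by fragment reference. A rough sketch, assuming Provenance resources are available as plain dictionaries; `provenance_for_sentence` and `activity_code` are hypothetical helper names, and the sample records are invented for illustration:

```python
def provenance_for_sentence(resources, composition_id, sentence_uuid):
    """Return Provenance resources whose target is the given sentence fragment."""
    wanted = f"Composition/{composition_id}#sent-{sentence_uuid}"
    return [
        r for r in resources
        if r.get("resourceType") == "Provenance"
        and any(t.get("reference") == wanted for t in r.get("target", []))
    ]


def activity_code(provenance):
    """Return the first activity code on a Provenance resource, or None."""
    codings = provenance.get("activity", {}).get("coding", [])
    return codings[0]["code"] if codings else None


# Illustrative records: one resource-level, one sentence-level.
records = [
    {"resourceType": "Provenance",
     "target": [{"reference": "Composition/note-20260115"}]},
    {"resourceType": "Provenance",
     "target": [{"reference": "Composition/note-20260115#sent-7f3a2c91"}],
     "activity": {"coding": [{"code": "ai-generated-unverified"}]}},
]

hits = provenance_for_sentence(records, "note-20260115", "7f3a2c91")
print([activity_code(p) for p in hits])
```

With resource-level Provenance only, the same lookup returns nothing for the contested sentence, which is the gap the paragraph describes.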

3. Clinical Logic Masterclass — Handling Unverified Copy-Forward in Anticoagulation Management

Scenario: A cardiology clinic manages a patient on warfarin after AF ablation. An AI draft carries forward "No bleeding" into the ROS. The clinician signs without noticing. A GI bleed occurs two days later and the payer's audit challenges medical necessity and documentation integrity for the prior 99214.

Step-by-Step: The Failure Mode Without Sentence-Level Provenance

  1. AI scribe generates draft note for the follow-up visit. The model detects "No bleeding" in the prior encounter's ROS and carries it forward into the current draft—a standard copy-forward behavior in most ambient AI systems.

  2. The signed note shows "Author: Dr. Smith" on the entire document. No metadata distinguishes clinician-originated sentences from AI-generated ones.

  3. The sentence "No bleeding" has no link to source audio—because the patient never said it during this encounter. The AI inherited it from the prior visit.

  4. Two days later, the patient presents to the ED with melena. The gastroenterologist documents a significant upper GI bleed. A JAMA Internal Medicine study (2024) documented that copy-forward errors in anticoagulated patients contribute to delayed recognition of bleeding events in 8–12% of cases reviewed.

  5. The payer pulls the 99214 note. Under review, the clinic cannot prove whether the clinician actively assessed bleeding status or passively accepted an AI suggestion. The payer challenges medical necessity: if bleeding was truly "absent," why was the patient hemorrhaging 48 hours later?

  6. Result: Claim denial, potential fraud referral under the False Claims Act, malpractice exposure.

Step-by-Step: How Scribing.io Prevents This

Scribing.io Sentence-Level Provenance Workflow: Warfarin Copy-Forward Scenario

| Step | System Behavior | FHIR Artifact Generated |
| --- | --- | --- |
| 1. AI generates draft | "No bleeding" sentence created with activity: ai-generated. System recognizes the source is the prior encounter, not current audio. | Provenance/prov-sent-{uuid} with entity.what pointing to prior-encounter Composition, not current Media resource |
| 2. Source audio check | System searches current encounter audio for any confirming verbal segment matching bleeding assessment keywords. | No matching Media chunk found; entity.what hash field remains empty for current encounter |
| 3. Attestation gate fires | Sentence flagged as "unverified: no confirming audio in current encounter" in clinician review UI. Visual indicator: amber highlight with source tag "Carried forward from visit 2026-01-08." | Provenance.activity updated to ai-generated-unverified |
| 4. Sign-off blocked | Clinician cannot attest note until every ai-generated-unverified sentence is either verbally confirmed (triggering new audio hash) or manually removed/edited. | Workflow enforcement via Task resource with status: ready, intent: order, blocking note finalization |
| 5. Clinical resolution | Clinician asks patient about bleeding. Patient reports intermittent epigastric discomfort. Clinician edits sentence to: "Patient reports intermittent epigastric discomfort; denies frank bleeding but notes dark stools intermittently." | New Provenance created with activity: clinician-edited, fresh SHA-256 audio hash linked to the 00:14:32–00:15:18 segment of current encounter recording |
| 6. Downstream clinical action | Clinician orders stat INR, stool guaiac, and GI consult based on the newly elicited history. | Orders linked to the updated ROS sentence via supportingInfo references |
| 7. Payer audit (post-event) | Auditor queries sentence-level Provenance for the ROS section. Finds clinician-verified content with audio timestamp correlation. The 99214 medical decision-making is substantiated by documented assessment of a new symptom. | Full lineage exported in EHI Bundle with all Provenance resources, audio hashes, and Task completion records |

Outcome: The clinic averts the denial and demonstrates defensible lineage during the payer review. The 99214 is substantiated because the signed ROS reflects real-time clinical assessment, not passively inherited AI content. More critically, the patient received appropriate clinical follow-up because the attestation gate forced the clinician to actually ask about bleeding—converting a documentation workflow into a patient safety mechanism.
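
The gate logic in steps 2 through 5 can be sketched in a few lines. This is an illustrative model only, under the assumption that each sentence carries its activity code and an optional hash of confirming audio from the current encounter; the `Sentence` class, the function names, and the placeholder hash strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sentence:
    uuid: str
    text: str
    activity: str                             # e.g. "ai-generated", "clinician-edited"
    current_audio_hash: Optional[str] = None  # hash of confirming audio, if any


def flag_unverified(sentences: List[Sentence]) -> List[Sentence]:
    """Mark AI-generated sentences with no confirming audio in this encounter."""
    for s in sentences:
        if s.activity == "ai-generated" and s.current_audio_hash is None:
            s.activity = "ai-generated-unverified"
    return sentences


def blocking_sentences(sentences: List[Sentence]) -> List[Sentence]:
    """Sentences that must be confirmed, edited, or removed before sign-off."""
    return [s for s in sentences if s.activity == "ai-generated-unverified"]


def can_sign(sentences: List[Sentence]) -> bool:
    return not blocking_sentences(sentences)


note = flag_unverified([
    Sentence("s1", "Patient on warfarin after AF ablation.", "ai-generated",
             "ni:///sha-256;PLACEHOLDER-1"),        # placeholder, not a real hash
    Sentence("s2", "No bleeding.", "ai-generated"),  # carried forward, no audio
])
print(can_sign(note))  # False: s2 blocks attestation

# The clinician asks the question, then edits the sentence; new audio is hashed.
note[1].text = "Denies frank bleeding but notes dark stools intermittently."
note[1].activity = "clinician-edited"
note[1].current_audio_hash = "ni:///sha-256;PLACEHOLDER-2"
print(can_sign(note))  # True
```

The design point is that sign-off is a function of sentence state rather than a separate checkbox, so the gate cannot be bypassed without changing the provenance record itself.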

Why This Matters for Anticoagulation Documentation Specifically

Patients on warfarin carry elevated risk for bleeding events that demand assessed, not assumed, documentation. The AHA/ACC 2023 Atrial Fibrillation Guidelines explicitly recommend structured bleeding assessment at every anticoagulation follow-up. A falsely documented "No bleeding" may:

  • Delay INR rechecks beyond the safe interval

  • Prevent appropriate GI referral when subclinical bleeding is present

  • Create liability exposure when adverse events occur within the documentation window—exactly the scenario described above

  • Distort HAS-BLED score calculations in registries that pull from structured EHR data

Scribing.io's attestation gate converts a passive signing workflow into an active verification event, aligned with the clinical principle that anticoagulation status must be assessed at every encounter. For the latest on how HIPAA enforcement intersects with these AI documentation workflows, see our HIPAA 2026 Update.

4. Technical Reference: ICD-10 Documentation Standards for Stroke and Anticoagulation Lineage

HTI-1's provenance requirements intersect directly with ICD-10 specificity mandates. When AI-generated notes carry forward vague diagnoses or uncorroborated clinical statements, the coding downstream inherits that imprecision—creating audit risk and quality measure distortion. This section maps the provenance-to-coding pipeline for two high-frequency, high-risk code families in the cardiology/neurology intersection.

Focus Codes

ICD-10 Codes Requiring Provenance-Aware Documentation

| ICD-10 Code | Description | Documentation Requirement | Provenance Implication |
| --- | --- | --- | --- |
| I63.9 | Cerebral infarction, unspecified | Requires specification of vessel, laterality, and mechanism when known; "unspecified" triggers auditor scrutiny under ICD-10-CM Official Guidelines Section I.A.9 | If AI drafts "stroke" without qualifying detail, sentence-level Provenance must show the source is AI-generated and flag for clinician specification before coding extraction |
| Z79.01 | Long term (current) use of anticoagulants | Must reflect active medication reconciliation, not historical carry-forward; requires documentation of specific anticoagulant and indication | Provenance.entity.what must link to current-encounter medication reconciliation or verbal confirmation, not prior-visit data; copy-forward of Z79.01 without current-visit confirmation risks RADV recoupment |

The Specificity–Provenance Nexus

Published analyses from the National Institutes of Health (2023) indicate that "unspecified" stroke codes (I63.9) appear in 15–25% of inpatient stroke documentation, often because initial AI-generated problem lists carry forward ED impressions without updating to final imaging-confirmed diagnoses. Under HTI-1, each instance of I63.9 in a note must be traceable to its source:

  • Was it AI-generated from an imported CCD? The Provenance resource will show entity.role: source pointing to the CCD DocumentReference.

  • Was it copied from a prior note? The Provenance chain reveals the originating encounter and flags the absence of current-encounter confirmation.

  • Was it clinician-affirmed after MRI review? The activity: clinician-edited tag with a timestamp correlating to the imaging result availability confirms active clinical judgment.

Scribing.io's fragment-level Provenance enables coders and auditors to trace why a code is unspecified—differentiating between genuinely unavailable information (clinically appropriate I63.9) and an AI draft that persisted an imprecise term the clinician never updated (audit-vulnerable I63.9). This distinction directly supports Risk Adjustment Data Validation (RADV) audit defense, where the ability to prove a diagnosis was clinician-affirmed—not AI-hallucinated—determines whether recoupment stands or falls.

How Scribing.io Drives Maximum Specificity

  1. Pre-attestation specificity check: When the AI draft contains an "unspecified" code-eligible term (e.g., "stroke" without vessel/laterality), the system flags the sentence with activity: ai-generated-unverified and presents the clinician with structured prompts: "Specify vessel? Laterality? Mechanism confirmed by imaging?"

  2. Coding-layer integration: The sentence-level Provenance travels downstream to the coding engine. If a sentence tagged ai-generated-unverified is the sole support for a diagnosis code, the coder receives an alert: "Source sentence not clinician-verified; query physician before assigning specific code."

  3. HCC recalculation trigger: For Risk Adjustment-eligible conditions, Scribing.io tracks whether the supporting documentation was clinician-originated or AI-generated. AI-generated-only support for an HCC-mapped code generates a compliance flag visible to the coding integrity team before claim submission.
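
The coder alert and compliance flag in items 2 and 3 can be approximated as a check over the activity codes of each code's supporting sentences. The sketch below is an assumption-laden illustration: `coding_flags`, the flag labels, and the set of HCC-mapped codes passed in are hypothetical, and the example set is not a statement about actual HCC mappings.

```python
AI_ONLY = {"ai-generated", "ai-suggested", "ai-generated-unverified"}


def coding_flags(code_support, hcc_codes=frozenset()):
    """Map each diagnosis code to a flag based on its supporting sentences.

    code_support: dict of code -> list of Provenance.activity values for the
    sentences that support that code.
    """
    flags = {}
    for code, activities in code_support.items():
        if not activities:
            continue
        if all(a == "ai-generated-unverified" for a in activities):
            # Sole support is unverified AI text: query the physician.
            flags[code] = "query-physician"
        elif code in hcc_codes and all(a in AI_ONLY for a in activities):
            # HCC-mapped code supported only by AI-generated text.
            flags[code] = "compliance-review"
    return flags


support = {
    "I63.9": ["ai-generated-unverified"],
    "Z79.01": ["clinician-edited", "ai-generated"],
}
print(coding_flags(support))
# {'I63.9': 'query-physician'}

# Illustrative only: treating I63.9 as HCC-mapped for the sake of the example.
print(coding_flags({"I63.9": ["ai-generated"]}, hcc_codes={"I63.9"}))
# {'I63.9': 'compliance-review'}
```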

5. Copy-Forward Differential Analysis — The 6-Year Audit Lookback Architecture

The most significant gap in CMS's signature guidance is its silence on copy-forward. MLN905364 addresses missing signatures and attestation timing but never acknowledges that modern AI scribes can propagate content across encounters without explicit clinician action—creating what documentation integrity researchers call "ghost documentation": clinically stale content that persists in the medical record because no system flagged it for re-evaluation.

A 2020 JAMA study found that up to 54% of progress note text was duplicated from prior encounters, with clinicians unable to identify which content was original in 35% of tested scenarios. AI scribes amplify this problem because they can carry forward with higher semantic fidelity—rephrasing prior content just enough that simple text-matching deduplication fails.
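
To see why exact text matching misses rephrased carry-forward, consider a toy comparison. Token-set overlap (Jaccard similarity) is used here only as a crude stand-in for semantic similarity; the function names and the 0.5 threshold are hypothetical choices for illustration.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-set overlap between two sentences, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_probable_carry_forward(current, prior, threshold=0.5):
    """True when text differs literally but overlaps heavily in vocabulary."""
    return current != prior and jaccard(current, prior) >= threshold


prior = "Patient denies any bleeding."
current = "The patient denies bleeding."

print(current == prior)                            # False: exact match fails
print(is_probable_carry_forward(current, prior))   # True: 3 of 5 tokens shared
```

A production system would likely need something stronger than token overlap, since paraphrases can share few surface tokens.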

Scribing.io's Copy-Forward Differential Engine

Every Scribing.io note maintains a revision-aware provenance chain that supports differential analysis across the full encounter history:

Copy-Forward Differential Architecture

| Component | Function | Audit Value |
| --- | --- | --- |
| Sentence UUID persistence | When a sentence is carried forward, it retains its original UUID and receives a new Provenance resource with entity.role: revision pointing back to the originating encounter's Provenance | Auditor can trace any sentence to its first appearance in the record, regardless of how many encounters it has traversed |
| Staleness scoring | Each carried-forward sentence receives a staleness score based on: (a) time since last clinician verification, (b) number of intervening encounters without modification, (c) clinical domain risk weight (anticoagulation sentences score higher) | Sentences exceeding the staleness threshold trigger mandatory re-verification before attestation |
| Encounter-to-encounter diff view | The clinician review UI renders a visual diff showing which sentences are new (green), carried-forward-unmodified (amber), carried-forward-and-edited (blue), and removed (strikethrough) | Provides the exact "what changed" view that OIG auditors request during RADV and CERT reviews |
| EHI Export Bundle packaging | All Provenance resources, including historical revision chains, are packaged into EHI Export-compliant FHIR Bundles | Supports the full 6-year Medicare audit lookback window without requiring the auditor to have access to the originating EHR system |

This architecture transforms copy-forward from an invisible liability into a visible, auditable, and clinically useful feature. The clinician sees exactly what was carried forward and must make an active decision about each flagged sentence. The auditor sees exactly when content was last verified and by whom. The compliance team sees encounter-over-encounter trends in copy-forward rates by provider, department, and diagnosis category.
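
Two of the components above, staleness scoring and the encounter-to-encounter diff, lend themselves to a short sketch. The scoring formula, weights, and threshold below are hypothetical (the article names the three inputs but not how they are combined), and the diff relies on the stated assumption that carried-forward sentences keep their UUID.

```python
from dataclasses import dataclass

# Hypothetical values; a real deployment would tune these per specialty.
DOMAIN_WEIGHTS = {"anticoagulation": 2.0, "default": 1.0}
STALENESS_THRESHOLD = 2.0


@dataclass
class CarriedSentence:
    uuid: str
    days_since_verified: int
    unmodified_encounters: int
    domain: str = "default"


def staleness_score(s: CarriedSentence) -> float:
    """Combine age, untouched encounters, and domain risk into one score."""
    weight = DOMAIN_WEIGHTS.get(s.domain, DOMAIN_WEIGHTS["default"])
    return weight * (s.days_since_verified / 30 + s.unmodified_encounters)


def needs_reverification(s: CarriedSentence) -> bool:
    return staleness_score(s) > STALENESS_THRESHOLD


def diff_encounters(prior: dict, current: dict) -> dict:
    """Classify sentences by UUID; both arguments map UUID -> sentence text."""
    result = {}
    for uid, text in current.items():
        if uid not in prior:
            result[uid] = "new"
        elif prior[uid] == text:
            result[uid] = "carried-forward-unmodified"
        else:
            result[uid] = "carried-forward-edited"
    for uid in prior:
        if uid not in current:
            result[uid] = "removed"
    return result


print(needs_reverification(CarriedSentence("u1", 7, 1, "anticoagulation")))  # True
print(needs_reverification(CarriedSentence("u9", 7, 1)))                     # False

prior = {"u1": "No bleeding.", "u2": "INR at goal.", "u3": "Denies chest pain."}
current = {"u1": "No bleeding.", "u2": "INR above goal today.",
           "u4": "Reports dark stools."}
print(diff_encounters(prior, current))
```

In this toy setup the same seven-day-old sentence passes in a default domain but trips the threshold when weighted as anticoagulation content, which mirrors the domain-risk idea in the table.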

6. CMIO Implementation Checklist: 90-Day Deployment Timeline

Deploying sentence-level Provenance is not a feature toggle. It requires coordination across clinical informatics, compliance, revenue cycle, and IT infrastructure teams. The following 90-day timeline reflects Scribing.io's standard enterprise deployment sequence:

90-Day Sentence-Level Provenance Deployment

| Phase | Days | Activities | Responsible Team |
| --- | --- | --- | --- |
| Phase 1: Gap Assessment | 1–15 | Audit current EHR Provenance implementation (resource-level vs. element-level); map HTI-1 DSI criteria against existing certification; identify copy-forward hotspots by specialty | CMIO + Clinical Informatics |
| Phase 2: Technical Integration | 16–45 | Deploy Scribing.io FHIR Provenance module; configure sentence UUID generation and fragment anchoring; integrate SHA-256 audio hashing pipeline; validate EHI Export Bundle formatting with ONC testing tools | IT Infrastructure + Scribing.io Engineering |
| Phase 3: Clinical Workflow Calibration | 46–70 | Configure attestation gate thresholds by specialty (e.g., anticoagulation notes require zero unverified carry-forward; dermatology notes allow 72-hour staleness); train clinicians on diff view and amber/green/blue sentence indicators; pilot with 3 high-volume providers | CMIO + Department Chiefs |
| Phase 4: Compliance Validation | 71–85 | Run simulated payer audit against pilot notes; verify sentence-level Provenance resolves previously identified copy-forward denials; confirm RADV response bundles contain complete lineage chains | Compliance + Revenue Cycle |
| Phase 5: Full Deployment | 86–90 | Enterprise rollout; activate real-time copy-forward analytics dashboard; schedule 30-day post-deployment review | All teams |

7. The HTI-1 Note Lineage Viewer — What Your ONC Testing Lab Expects

ONC-ACB surveillance under HTI-1 will require certified health IT modules to demonstrate AI content lineage—not merely assert it. This means your ONC testing lab will request exportable evidence that every AI-generated sentence is independently identifiable, source-linked, and hash-verifiable.

See our 2026 HTI-1 Note Lineage viewer: sentence-level FHIR Provenance with SHA-256 audio hashes and EHI Export-ready bundles you can hand to your ONC testing lab.

The Scribing.io Note Lineage Viewer renders three layers of provenance visibility for any given note:

  1. Sentence-level activity overlay: Each sentence is color-coded by its Provenance.activity value. AI-generated sentences display a machine icon; clinician-edited sentences display an edit icon with diff tooltip; clinician-originated sentences display a microphone icon with audio timestamp.

  2. Hash verification panel: Clicking any sentence opens its Provenance resource, including the SHA-256 audio hash. The viewer performs real-time hash verification against the stored Media resource, displaying a green checkmark (match) or red alert (mismatch/absent).

  3. Export bundle generator: One-click export produces an EHI-compliant FHIR Bundle containing the Composition, all Provenance resources, the DocumentReference with attachment hash, and the Device resource describing the AI model. This bundle is the artifact your ONC testing lab will ingest during surveillance review.
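
The hash check and export steps can be illustrated with a short sketch. It assumes the audio hash is stored as an RFC 6920 `ni:///sha-256;` identifier in `Provenance.entity.what.identifier.value` and that the export is a FHIR Bundle of type `collection`; both are assumptions for illustration, and the function names are hypothetical rather than part of any Scribing.io API.

```python
import base64
import hashlib
import hmac


def ni_uri(data: bytes) -> str:
    """RFC 6920 'ni' URI for the SHA-256 digest of `data`."""
    digest = hashlib.sha256(data).digest()
    return "ni:///sha-256;" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_audio_hash(provenance: dict, media_bytes) -> str:
    """Return 'match', 'mismatch', or 'absent' for a sentence's audio hash."""
    stored = None
    for entity in provenance.get("entity", []):
        value = entity.get("what", {}).get("identifier", {}).get("value")
        if value and value.startswith("ni:///sha-256;"):
            stored = value
            break
    if stored is None or media_bytes is None:
        return "absent"
    # Constant-time comparison of the stored and recomputed identifiers.
    return "match" if hmac.compare_digest(stored, ni_uri(media_bytes)) else "mismatch"


def export_bundle(composition, provenances, document_reference, device) -> dict:
    """Package the note and its lineage into one Bundle (type is an assumption)."""
    resources = [composition, *provenances, document_reference, device]
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": r} for r in resources],
    }


audio = b"raw audio chunk"
prov = {
    "resourceType": "Provenance",
    "entity": [{"role": "source",
                "what": {"identifier": {"value": ni_uri(audio)}}}],
}
print(verify_audio_hash(prov, audio))            # match
print(verify_audio_hash(prov, b"tampered"))      # mismatch
print(verify_audio_hash({"entity": []}, audio))  # absent
```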

What This Means for Your Next ONC-ACB Surveillance Cycle

If your EHR vendor cannot produce sentence-level AI identification, source attribution with cryptographic verification, and export-ready provenance bundles, your certification is at risk. HTI-1's DSI criteria are not aspirational—they are conditions of certification maintenance. The question for every CMIO is not whether to implement AI note provenance, but whether your current vendor's implementation will survive the surveillance review that is already on your ONC-ACB's calendar.

Scribing.io was built to make that question trivially answerable: every sentence tagged, every source hashed, every lineage exportable. That is what defensible AI documentation looks like in 2026.


