Posted on Jul 1, 2026

How AI Scribing Works (Technical): Architecture, Data Flows & Security for Health IT Leaders

Technical illustration showing how AI ambient scribing captures clinical conversations and securely integrates structured data into electronic health records
How AI Scribing Works (Technical): The Clinical Library Playbook for CMIOs

Clinical Update — June 2026: This guide has been revised to reflect the AMA's CPT Appendix S taxonomy update (May 2026), CMS FY2026 IPPS final rule changes to SEP-1 abstraction windows, and FHIR R4 Bulk Data Access scope revisions affecting write-back authorization flows. If you previously evaluated this playbook, re-read Sections 2, 3, and 5—the architectural and compliance details have changed materially.

TL;DR — What This Page Covers

The AMA's CPT Appendix S taxonomy classifies AI into assistive, augmentative, and autonomous categories—but it stops at what the software outputs. It never addresses how the output reaches the chart, which is the engineering question that determines whether an AI scribe actually works in a live clinical environment. This playbook fills that gap. It details Scribing.io's asynchronous write-back architecture over mutually authenticated WebSockets, explains why synchronous API polling causes EMR UI freeze, walks through a real-world sepsis scenario in a busy ED, and maps the ICD-10 documentation standards (A41.9, R65.20) that depend on getting the write-path right. If you are a CMIO evaluating ambient AI scribe vendors, this is the technical due-diligence document you have been looking for.

  • Beyond Classification: Why the Write-Path Determines Clinical Value

  • FHIR R4 Commit Architecture: Idempotent, Conflict-Safe, Audit-Ready

  • Clinical Logic Masterclass: Handling Suspected Sepsis in a Busy ED

  • Acoustic Engineering: Beamforming and Neural Diarization in Hostile Environments

  • The Silent-Reasoning Tracer: Inferring What Was Not Said

  • Just-in-Time Prompting and Attestation Workflow

  • Asynchronous Write-Back: The 30-Second Commit Sequence

  • Technical Reference: ICD-10 Documentation Standards

  • CMIO Vendor Evaluation Checklist

Beyond Classification: Why the Write-Path—Not Just the AI Model—Determines Clinical Value

The AMA's CPT Appendix S (revised May 2026) provides a useful intellectual framework: assistive software surfaces data, augmentative software derives novel parameters, and autonomous software generates independent interpretations. Every ambient AI scribe vendor can point to that taxonomy and claim a category. But the taxonomy is silent on a question that matters more to a CMIO charged with go-live responsibility: once the AI generates a note, how does it actually land in the patient's chart—reliably, quickly, and without disrupting the clinician's workflow?

This is the write-path problem. Scribing.io exists because the write-path is where most technical evaluations fall apart—and where most deployments silently fail at scale.

The Gap in the Industry Conversation

Current industry literature—including the AMA taxonomy, vendor white papers, and health IT trade coverage—focuses on three stages of the AI scribe pipeline:

  1. Speech capture — microphone hardware, acoustic models

  2. Natural language understanding — ASR accuracy, medical terminology recognition

  3. Note generation — template selection, section mapping, clinical summarization

What is almost universally omitted is stage four: the commit—the mechanism by which a signed, finalized clinical note is written back into the EMR as a discrete, queryable chart element. This is not a minor implementation detail. It is the architectural decision that determines whether the note is chart-visible in 30 seconds or 5 minutes, whether the EMR's user interface freezes during the write, and whether a note can be silently lost in a retry failure without anyone noticing. A 2025 ONC interoperability review flagged document commit reliability as a top-five concern for certified health IT modules—yet no ambient scribe vendor white paper we have reviewed addresses it with architectural specificity.

Synchronous Polling: The Architecture That Breaks Under Load

Most competing AI scribes use synchronous REST API polling. The client application sends repeated HTTP GET requests (typically every 2 to 5 seconds) asking the EMR's API gateway whether the note is ready, then issues a POST to write it. In a low-volume dermatology clinic running three concurrent encounters, this works. In a 40-bed emergency department running 40+ concurrent encounters during a shift change, it generates a thousand or more redundant HTTP requests per minute against the EMR's API gateway (40 clients polling every 2 seconds is already 1,200 requests per minute). When that gateway rate-limits (HTTP 429), notes queue silently. Clinicians close encounters believing the note is charted. It is not.
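A back-of-envelope estimate makes the polling load concrete. The sketch below assumes one polling client per encounter and no other API traffic; the function name is illustrative and not part of any vendor API.

```python
def polls_per_minute(concurrent_encounters: int, poll_interval_s: float) -> float:
    """Requests per minute generated by clients that each poll on a fixed interval."""
    return concurrent_encounters * (60.0 / poll_interval_s)


# The 2 s and 5 s intervals are the bounds quoted in the text above.
for interval in (2, 5):
    rate = polls_per_minute(40, interval)
    print(f"40 encounters, {interval} s interval: {rate:.0f} requests/min")
```

Under these assumptions the gateway absorbs 480 to 1,200 status checks per minute before a single note is written. Any additional pollers per encounter multiply that figure.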

For a deeper look at how this write-path problem manifests on specific EMR platforms, see our guides on Epic Integration (SMART on FHIR vs. copy-paste trade-offs, including Hyperspace thick-client thread-blocking) and the athenahealth API integration for clinical inbox management and subscription-based document routing.

What Scribing.io Does Differently

Scribing.io uses asynchronous write-back over mutually authenticated WebSockets rather than synchronous REST API polling. The distinction is fundamental:

Synchronous Polling vs. Asynchronous WebSocket Write-Back

| Dimension | Synchronous API Polling | Scribing.io Asynchronous WebSocket Write-Back |
| --- | --- | --- |
| Connection model | Client sends repeated HTTP requests (every 2–5 s) asking "Is the note ready?" | Persistent, full-duplex WebSocket channel; server pushes the finalized payload the instant it is ready |
| Authentication | Bearer token per request; token refresh overhead on each call | Mutual TLS (mTLS) at connection establishment; no per-message auth overhead |
| EMR UI impact | Each poll consumes a UI thread or API quota slot; under load, causes EMR UI freeze (especially in Epic Hyperspace thick-client environments) | No polling; the EMR integration layer listens passively; write occurs once, asynchronously, off the UI thread |
| Latency to chart visibility | Poll interval + API queue time; often 60–180 s under peak ED load | Median 12 s; guaranteed < 30 s from session close to chart-visible note |
| Exactly-once semantics | Difficult; duplicate polls can trigger duplicate writes without idempotency controls | Idempotency key + conditional ETag update ensures exactly-once commit; 429-aware exponential backoff with jitter for retry safety |
| Failure visibility | Silent failure if poll response is dropped or times out | Server-side acknowledgment over the same channel; dead-letter queue with alerting if ack is not received within 10 s |

The practical consequence: Scribing.io's architecture eliminates the silent-note-loss failure mode entirely. The note is pushed once, confirmed once, and visible in under 30 seconds. See our 30-second WebSocket write-back with FHIR/HL7 idempotent commits and Provenance-linked audio for audit defense—live in your EHR during the demo.
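The push-then-acknowledge pattern can be sketched with the standard library alone. This is a minimal illustration, not Scribing.io's implementation: `send` stands in for a hypothetical coroutine that resolves when the server acknowledgment arrives, and the dead-letter queue is a plain list.

```python
import asyncio


async def push_with_ack(send, payload: dict, dead_letter: list,
                        ack_timeout_s: float = 10.0) -> bool:
    """Push once, then wait for the acknowledgment on the same channel."""
    try:
        await asyncio.wait_for(send(payload), timeout=ack_timeout_s)
        return True
    except asyncio.TimeoutError:
        # No ack in time: park the payload where an alert can pick it up.
        dead_letter.append(payload)
        return False


async def _demo():
    dead_letter = []

    async def fast_ack(payload):    # ack arrives almost immediately
        await asyncio.sleep(0.01)

    async def lost_ack(payload):    # ack does not arrive within the timeout
        await asyncio.sleep(1.0)

    ok = await push_with_ack(fast_ack, {"note": "A"}, dead_letter, ack_timeout_s=0.2)
    failed = await push_with_ack(lost_ack, {"note": "B"}, dead_letter, ack_timeout_s=0.05)
    return ok, failed, dead_letter


ok, failed, dead_letter = asyncio.run(_demo())
print(ok, failed, dead_letter)
```

The point of the design is that a missing acknowledgment is an explicit, observable event rather than a dropped poll response.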

FHIR R4 Commit Architecture: Idempotent, Conflict-Safe, Audit-Ready

Understanding that the write-back is asynchronous answers the when and how fast questions. This section answers the what exactly is written question—because your compliance and legal teams need to know precisely what data structures land in the chart and how they are protected against duplication, corruption, and audit challenge.

The FHIR R4 Write Payload

When Scribing.io commits a finalized note, the atomic write bundle—submitted as a single FHIR transaction Bundle—contains three linked FHIR R4 resources:

  1. DocumentReference — The metadata envelope: encounter ID, patient ID, document type (LOINC code for clinical note), author (attesting clinician), status (current), and timestamp.

  2. Binary — The rendered note content (CDA or structured narrative text), attached to the DocumentReference via a content attachment URL.

  3. Provenance — The audit-defense resource, linked to the DocumentReference via target, containing:

    • The SHA-256 hash of the source audio (proving the note derives from a specific, unaltered recording)

    • The AI model version identifier (pinning the exact model weights that generated the draft)

    • The clinician attestation timestamp (proving the human reviewed and signed before commit)

    • Agent references for both the software system (Scribing.io as Device) and the attesting clinician (as Practitioner)

This three-resource bundle is submitted atomically. If any resource in the bundle fails validation, the entire transaction rolls back. There is no partial-note state in the chart.
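A sketch of how such a bundle might be assembled as plain JSON-compatible dictionaries follows. The resource shapes follow FHIR R4, but several details are assumptions made for illustration: the LOINC code 11506-3 (Progress note), the identifier system URI for the audio hash, carrying the model version in the Device reference's `display`, and placing the attestation time in `occurredDateTime`. The function and parameter names are hypothetical.

```python
import base64
import hashlib
import uuid
from datetime import datetime, timezone


def build_note_bundle(note_text, audio_bytes, patient_id, encounter_id,
                      practitioner_id, device_id, model_version, attested_at):
    """Assemble a transaction Bundle: Binary + DocumentReference + Provenance."""
    now = datetime.now(timezone.utc).isoformat()
    doc_url = f"urn:uuid:{uuid.uuid4()}"
    bin_url = f"urn:uuid:{uuid.uuid4()}"
    binary = {
        "resourceType": "Binary",
        "contentType": "text/plain",
        "data": base64.b64encode(note_text.encode("utf-8")).decode("ascii"),
    }
    doc_ref = {
        "resourceType": "DocumentReference",
        "status": "current",
        # Example document type only; the real LOINC code depends on the note type.
        "type": {"coding": [{"system": "http://loinc.org", "code": "11506-3"}]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "context": {"encounter": [{"reference": f"Encounter/{encounter_id}"}]},
        "author": [{"reference": f"Practitioner/{practitioner_id}"}],
        "date": now,
        "content": [{"attachment": {"contentType": "text/plain", "url": bin_url}}],
    }
    provenance = {
        "resourceType": "Provenance",
        "target": [{"reference": doc_url}],
        "occurredDateTime": attested_at,
        "recorded": now,
        "agent": [
            {"who": {"reference": f"Practitioner/{practitioner_id}"}},
            {"who": {"reference": f"Device/{device_id}",
                     "display": f"model {model_version}"}},
        ],
        "entity": [{
            "role": "source",
            # Placeholder identifier system; not a registered URI.
            "what": {"identifier": {"system": "urn:example:audio-sha256",
                                    "value": hashlib.sha256(audio_bytes).hexdigest()}},
        }],
    }

    def entry(url, resource):
        return {"fullUrl": url, "resource": resource,
                "request": {"method": "POST", "url": resource["resourceType"]}}

    return {"resourceType": "Bundle", "type": "transaction",
            "entry": [entry(bin_url, binary), entry(doc_url, doc_ref),
                      entry(f"urn:uuid:{uuid.uuid4()}", provenance)]}


bundle = build_note_bundle("Assessment: suspected sepsis.", b"raw-audio-bytes",
                           "p1", "e1", "dr1", "dev1", "2026.05",
                           "2026-06-01T14:40:00+00:00")
```

The `urn:uuid:` full URLs let the DocumentReference and Provenance point at resources that do not have server-assigned IDs yet; a FHIR server resolves them when it processes the transaction.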

Idempotency and Conflict Resolution

Every write carries a client-generated idempotency key (a UUID v7 combining timestamp and randomness, transmitted in a custom X-Idempotency-Key header). If the WebSocket acknowledgment is lost and the system retries, the EMR's FHIR server recognizes the duplicate key and returns the existing resource ID rather than creating a second note. Additionally, the write uses a conditional ETag update: the system reads the current ETag of the encounter's document list before writing, and the commit includes an If-Match header. If another system has written to the same encounter in the interim (a concurrent nursing note, a radiology report), the commit fails gracefully with HTTP 412 Precondition Failed. Scribing.io re-reads, merges metadata if appropriate, and retries—preventing overwrite conflicts entirely.
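The interaction of the two safeguards is easier to see against a toy server. The class below is an in-memory stand-in, not any EMR vendor's API, and it uses `uuid4` for the key because a UUID v7 generator is not available in every Python release; the key format does not change the deduplication logic.

```python
import uuid


class FakeDocumentStore:
    """Toy stand-in for an EMR document endpoint."""

    def __init__(self):
        self.version = 0      # bumps on every successful write to the encounter
        self.docs = {}        # document id -> note text
        self.seen_keys = {}   # idempotency key -> document id

    def etag(self) -> str:
        return f'W/"{self.version}"'

    def write(self, note: str, idempotency_key: str, if_match: str):
        if idempotency_key in self.seen_keys:     # replay: return the original result
            return 200, self.seen_keys[idempotency_key]
        if if_match != self.etag():               # another writer got there first
            return 412, None
        doc_id = str(uuid.uuid4())
        self.docs[doc_id] = note
        self.seen_keys[idempotency_key] = doc_id
        self.version += 1
        return 201, doc_id


def commit(store: FakeDocumentStore, note: str, key: str, max_attempts: int = 3):
    """Read the current ETag, write with If-Match, and re-read on 412."""
    for _ in range(max_attempts):
        status, doc_id = store.write(note, key, if_match=store.etag())
        if status != 412:
            return status, doc_id
    return 412, None


store = FakeDocumentStore()
key = str(uuid.uuid4())
stale_etag = store.etag()
first = commit(store, "ED note", key)
replay = commit(store, "ED note", key)            # lost-ack retry, same key
conflict = store.write("nursing note", str(uuid.uuid4()), if_match=stale_etag)
print(first[0], replay[0], conflict[0], len(store.docs))
```

The replay returns the original document ID without creating a second note, and the write carrying a stale ETag is rejected with 412 instead of overwriting.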

HL7 v2 MDM Fallback

Not every EMR environment exposes FHIR R4 write scopes. Many community hospitals and legacy Oracle Health (Cerner) installations still gate document ingestion through HL7 v2 interfaces. Scribing.io detects available write scopes during integration provisioning and falls back to an HL7 v2 MDM^T02 (original document notification and content) message when FHIR write is unavailable. The MDM message carries the same content and provenance metadata encoded in OBX segments, preserving audit trail fidelity. An HL7 International implementation guide governs segment structure; Scribing.io's integration engine generates conformant messages automatically based on the receiving facility's interface specification.
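For orientation, here is a minimal sketch of what an MDM^T02 message carrying a note and an audio hash could look like. Field positions follow HL7 v2.5 as commonly documented, but the application names, the local OBX identifiers, and the chosen table values (`ED`, `TX`, `AU`) are illustrative assumptions; a real interface is dictated by the receiving facility's specification.

```python
from datetime import datetime, timezone


def escape_hl7(text: str) -> str:
    """Escape HL7 v2 delimiter characters inside a field value."""
    text = text.replace("\\", "\\E\\")   # escape character first
    for char, code in (("|", "\\F\\"), ("^", "\\S\\"), ("&", "\\T\\"), ("~", "\\R\\")):
        text = text.replace(char, code)
    return text


def segment(name: str, fields: dict) -> str:
    """Build a segment from {field_number: value}; unlisted fields stay empty."""
    values = [fields.get(i, "") for i in range(1, max(fields) + 1)]
    return "|".join([name] + values)


def build_mdm_t02(note_text: str, patient_mrn: str, document_id: str,
                  audio_sha256: str) -> str:
    ts = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    # MSH is joined by hand because MSH-1 is the field separator itself.
    msh = "|".join(["MSH", "^~\\&", "SCRIBE", "SCRIBE_FAC", "EMR", "HOSP",
                    ts, "", "MDM^T02", document_id, "P", "2.5"])
    evn = segment("EVN", {1: "T02", 2: ts})
    pid = segment("PID", {1: "1", 3: escape_hl7(patient_mrn)})
    pv1 = segment("PV1", {1: "1", 2: "E"})                    # E = emergency
    txa = segment("TXA", {1: "1", 2: "ED", 3: "TX", 4: ts,
                          12: escape_hl7(document_id), 17: "AU"})
    obx_note = segment("OBX", {1: "1", 2: "TX", 3: "NOTE^Clinical note^L",
                               5: escape_hl7(note_text), 11: "F"})
    obx_hash = segment("OBX", {1: "2", 2: "ST", 3: "AUDIO_SHA256^Source audio hash^L",
                               5: audio_sha256, 11: "F"})
    # Segments are terminated by carriage returns.
    return "\r".join([msh, evn, pid, pv1, txa, obx_note, obx_hash]) + "\r"


message = build_mdm_t02("BP 90|60, lactate 4.2", "12345", "DOC1", "ab" * 32)
print(message.replace("\r", "\n"))
```

Escaping matters here because free-text notes routinely contain `|` and `^`, which would otherwise shift every field after them.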

429-Aware Backoff and Exactly-Once Guarantees

EMR API gateways—especially Epic's FHIR endpoints—enforce strict rate limits. Scribing.io's write-back engine implements 429-aware exponential backoff with jitter: on receiving a 429 (Too Many Requests) response, the system waits for the Retry-After header value (or a calculated backoff interval with random jitter to prevent thundering-herd effects across concurrent encounters), then retries with the same idempotency key. Combined with the idempotent write design, this guarantees exactly-once semantics—the note appears once, and only once, in the chart. If three consecutive retries fail (indicating a sustained outage), the payload is routed to a dead-letter queue and an alert fires to both Scribing.io's operations team and the site's designated IT contact.
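A minimal sketch of this retry policy follows, assuming a hypothetical `send` callable that returns an HTTP status and an optional Retry-After value in seconds. The base delay, the cap, and the use of "full jitter" are illustrative choices, not documented Scribing.io parameters.

```python
import random


def backoff_delay(attempt: int, retry_after_s=None, base_s: float = 0.5,
                  cap_s: float = 8.0, rng=random) -> float:
    """Delay before retry number `attempt` (1-based) after an HTTP 429."""
    if retry_after_s is not None:
        return float(retry_after_s)              # the server's hint wins
    ceiling = min(cap_s, base_s * (2 ** (attempt - 1)))
    return rng.uniform(0, ceiling)               # random jitter spreads retries out


def send_with_retry(send, payload, idempotency_key, dead_letter,
                    max_retries: int = 3, sleep=lambda seconds: None):
    """Retry on 429 with the same key; dead-letter after max_retries failures."""
    status, retry_after = send(payload, idempotency_key)
    for attempt in range(1, max_retries + 1):
        if status != 429:
            return status
        sleep(backoff_delay(attempt, retry_after))
        status, retry_after = send(payload, idempotency_key)
    if status == 429:
        dead_letter.append((idempotency_key, payload))
    return status


# Demo: two rate-limit responses, then success.
responses = iter([(429, 1), (429, None), (201, None)])
keys_sent = []


def fake_send(payload, key):
    keys_sent.append(key)
    return next(responses)


dlq = []
result = send_with_retry(fake_send, {"note": "ED note"}, "key-1", dlq)
print(result, keys_sent, dlq)
```

In production the `sleep` argument would be `time.sleep` or an async equivalent; it defaults to a no-op here so the demo runs instantly.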

Clinical Logic Masterclass: Handling Suspected Sepsis in a Busy ED

Architecture matters because clinical outcomes depend on it. The following scenario demonstrates why the write-path, acoustic engineering, and clinical intelligence layers must work as a unified system—not as isolated features on a vendor slide deck.

Scenario: A high-acuity emergency department. Background alarms, overhead pages, and multiple voices obscure documentation. The attending evaluates a patient with suspected sepsis but never explicitly states two SIRS criteria or the 30 mL/kg crystalloid bolus order with lactate timestamps. Without intervention, the case fails CMS SEP-1 abstraction and risks a quality penalty and potential payer denial.

This is the scenario that separates a transcription tool from a clinical AI scribe. Here is how Scribing.io handles it, in six discrete steps.

Acoustic Engineering: Beamforming and Neural Diarization in Hostile Environments

Step 1: Acoustic Isolation in a Noisy Environment

Emergency departments are among the most acoustically hostile clinical environments—noise levels routinely exceed 70 dB, per a study published in the Journal of Emergency Medicine measuring ambient ED sound pressure. Standard single-microphone ASR systems trained on quiet office dictation degrade rapidly in these conditions. Scribing.io stabilizes input using two complementary techniques:

  • Beamforming: Multi-microphone array processing (minimum two-element array in the clinician's badge or workstation) that spatially filters sound, focusing on the clinician's voice direction and attenuating ambient noise—alarms, ventilators, adjacent conversations, pneumatic tube arrivals.

  • Neural diarization: A deep-learning speaker identification model trained on clinical speech patterns that assigns each utterance to a specific speaker using voice embeddings. This ensures that a nurse's verbal report ("lactate is back at 4.2"), a respiratory therapist's ventilator callout, and the attending's clinical reasoning are attributed to the correct speaker—critical for accurate note generation, attestation integrity, and preventing a nurse's verbalized vital from being misattributed as a physician order.

The diarization model updates speaker embeddings continuously throughout the encounter, adapting to voice changes caused by PPE (N95 masks attenuate high-frequency formants by 3–8 dB), physical movement, and emotional stress. Accuracy for primary-clinician attribution exceeds 96% in validated ED testing environments.
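The simplest form of beamforming, delay-and-sum, shows why a two-element array helps. This is a textbook illustration rather than Scribing.io's signal chain: it assumes the inter-microphone delay is already known, uses a synthetic 200 Hz tone in place of speech, and models ambient noise as independent at each microphone.

```python
import math
import random


def delay_and_sum(mic_a: list, mic_b: list, delay_samples: int) -> list:
    """Align mic_b to mic_a by a known integer sample delay, then average."""
    n = len(mic_a) - delay_samples
    return [(mic_a[i] + mic_b[i + delay_samples]) / 2 for i in range(n)]


def mse(estimate: list, reference: list) -> float:
    """Mean squared error over the overlapping samples."""
    pairs = list(zip(estimate, reference))
    return sum((e - r) ** 2 for e, r in pairs) / len(pairs)


rng = random.Random(0)
N, DELAY = 4000, 3          # the voice reaches mic B three samples after mic A
voice = [math.sin(2 * math.pi * 200 * t / 16000) for t in range(N)]
mic_a = [voice[i] + rng.gauss(0, 0.5) for i in range(N)]
mic_b = [(voice[i - DELAY] if i >= DELAY else 0.0) + rng.gauss(0, 0.5)
         for i in range(N)]

beamformed = delay_and_sum(mic_a, mic_b, DELAY)
print(f"single-mic MSE: {mse(mic_a, voice):.3f}")
print(f"beamformed MSE: {mse(beamformed, voice):.3f}")
```

Because the aligned voice adds coherently while the independent noise does not, the averaged signal tracks the voice more closely than either microphone alone. Real arrays must also estimate the delay and cope with correlated noise sources.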

The Silent-Reasoning Tracer: Inferring What Was Not Said

Step 2: Detecting Non-Verbalized Clinical Reasoning

This is the capability competitors miss because the industry conversation focuses on transcription accuracy (what was said) rather than clinical completeness (what should have been documented). Transcription fidelity is necessary but insufficient. A word-perfect transcript of a physician who never verbalized their SIRS assessment still produces a note that fails quality abstraction.

Scribing.io's silent-reasoning tracer monitors not only the audio stream but also the structured data flowing through the encounter: orders placed in the EMR (via HL7 v2 ORM/ORC feeds or FHIR Subscription notifications), vital signs posted by nursing, and lab results returning in real time. When the system detects a pattern of actions that imply a clinical reasoning pathway—but the clinician has not verbalized the reasoning—it activates a documentation gap analysis.

Silent-Reasoning Tracer: Sepsis Workup Detection

| Observed Signal (Non-Verbalized) | Clinical Implication | Scribing.io Action |
| --- | --- | --- |
| Blood culture order placed; lactate ordered | Sepsis workup initiated | Flags encounter as potential SEP-1 case; begins tracking all required bundle documentation elements per CMS SEP-1 specifications |
| Temp 38.4°C and HR 112 posted in vitals flowsheet | Two SIRS criteria met (fever > 38.3°C + tachycardia > 90 bpm) | Recognizes criteria are present in structured data but absent from the clinician's dictated narrative; flags documentation gap |
| IV crystalloid 2L order placed; patient weight 68 kg in demographics | Volume ≈ 29.4 mL/kg, borderline for the 30 mL/kg requirement | Calculates mL/kg against documented weight; flags potential shortfall for SEP-1 fluid resuscitation compliance |
| Lactate result 4.2 mmol/L returns at 14:23; no repeat lactate ordered within 6-hour window | SEP-1 requires repeat lactate within 6 hours if initial > 2.0 mmol/L | Starts countdown timer; generates just-in-time prompt at configurable threshold (default: 2 hours before deadline) |
| Antibiotic (piperacillin-tazobactam) administered; timestamp recorded in MAR | Antibiotic administration within 3 hours of presentation is a SEP-1 element | Captures and timestamps the administration event; cross-references against presentation time for bundle compliance |

Each signal is individually unremarkable. The clinical intelligence lies in correlating them against a rules engine that encodes the CMS SEP-1 measure specifications (v12.1, effective October 2025) and identifying which documentation elements are satisfied by structured data alone versus which require explicit clinician attestation in the narrative note.
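Three of the checks in the table reduce to simple arithmetic, sketched below with the scenario's own numbers. The function names are hypothetical, only the two SIRS criteria named in the table are modelled, the calendar date is invented, and starting the 6-hour window at the lactate result time is an assumption drawn from the table's "starts countdown timer" wording.

```python
from datetime import datetime, timedelta


def sirs_criteria_met(temp_c: float, heart_rate: int) -> list:
    """SIRS criteria satisfied by structured vitals (fever and tachycardia only)."""
    met = []
    if temp_c > 38.3:
        met.append("fever")
    if heart_rate > 90:
        met.append("tachycardia")
    return met


def fluid_ml_per_kg(volume_ml: float, weight_kg: float) -> float:
    return volume_ml / weight_kg


def repeat_lactate_prompt_time(window_start: datetime, initial_lactate: float,
                               window_h: int = 6, lead_h: int = 2):
    """When to prompt for a repeat lactate, or None if no repeat is required."""
    if initial_lactate <= 2.0:
        return None
    return window_start + timedelta(hours=window_h - lead_h)


gaps = []
criteria = sirs_criteria_met(38.4, 112)
if len(criteria) >= 2:
    gaps.append("SIRS criteria are in the vitals; confirm they appear in the narrative")
dose = fluid_ml_per_kg(2000, 68)
if dose < 30:
    gaps.append(f"Fluid bolus is {dose:.1f} mL/kg, below 30 mL/kg")
prompt_at = repeat_lactate_prompt_time(datetime(2026, 6, 1, 14, 23), 4.2)

print(gaps)
print("Prompt for repeat lactate at", prompt_at.strftime("%H:%M"))
```

With a 14:23 start, the 6-hour deadline falls at 20:23 and the default prompt fires two hours earlier, at 18:23.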

Just-in-Time Prompting and Attestation Workflow

Step 3: Context-Aware Prompting

When the silent-reasoning tracer identifies a documentation gap that will affect quality measure abstraction, it issues a just-in-time prompt—a brief, non-intrusive notification (visual overlay on the scribe interface, or a brief audio chime with spoken prompt, configurable per institution and per clinician preference) that asks the clinician to verbalize the missing element:

  • "SIRS criteria: temperature 38.4 and heart rate 112 are in the chart. Would you like to state your clinical interpretation for the note?"

  • "Fluid resuscitation is calculated at 29.4 mL/kg. SEP-1 requires 30 mL/kg. Would you like to adjust the order or document clinical rationale for the current volume?"

  • "Initial lactate is 4.2. A repeat lactate order has not been detected. Would you like to address this?"

These prompts are Autonomous Level I under the AMA's CPT Appendix S taxonomy: the software generates a recommendation, but the clinician retains full judgment to implement, modify, or reject it. Critically, the prompt itself—and the clinician's response or non-response—is logged in the Provenance resource regardless of outcome, creating a defensible audit trail showing that the system surfaced the gap and the clinician made an informed decision.

Step 4: Note Finalization and Attestation

Once the clinician completes the encounter and reviews the AI-generated draft—which now includes the SIRS criteria pulled from structured vitals, the fluid resuscitation calculation, and the antibiotic timestamp—they attest (sign) the note. The attestation event captures:

  • Clinician identity (NPI-linked)

  • Precise timestamp (UTC, NTP-synchronized)

  • Hash of the note content at the moment of attestation

  • List of AI-suggested elements that were accepted, modified, or rejected

This attestation package is embedded in the Provenance resource, providing a complete chain of custody from audio capture through AI generation through human review to chart commit. Per CMS EHR documentation requirements, the attesting physician remains the author of record; Scribing.io is recorded as the generating device.
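The attestation elements listed above map naturally onto a small immutable record. The sketch below is one possible shape, not the product's data model; the field names are hypothetical and the NPI is a placeholder value.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Attestation:
    npi: str
    note_sha256: str
    attested_at: str                                  # UTC, ISO 8601
    decisions: dict = field(default_factory=dict)     # suggestion id -> outcome


def attest(note_text: str, npi: str, decisions: dict) -> Attestation:
    """Freeze the note hash and the clinician's decisions at signing time."""
    digest = hashlib.sha256(note_text.encode("utf-8")).hexdigest()
    return Attestation(npi, digest, datetime.now(timezone.utc).isoformat(),
                       dict(decisions))


def matches(att: Attestation, note_text: str) -> bool:
    """True only if the note is byte-identical to what was signed."""
    return hashlib.sha256(note_text.encode("utf-8")).hexdigest() == att.note_sha256


signed = attest("Assessment: sepsis, 2 SIRS criteria met.", "1234567893",
                {"sirs_prompt": "accepted", "fluid_prompt": "modified"})
print(matches(signed, "Assessment: sepsis, 2 SIRS criteria met."))
print(matches(signed, "Assessment: sepsis, 2 SIRS criteria met. (edited)"))
```

Hashing at the moment of attestation means any later edit to the note text is detectable by recomputing the digest.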

Asynchronous Write-Back: The 30-Second Commit Sequence

Step 5: The Write-Back Sequence

With attestation complete, the signed note enters the asynchronous write-back pipeline. Here is the precise sequence, timed against the 30-second SLA:

  1. T+0 ms: Clinician taps "Sign & Close." The attestation timestamp is written to the Provenance resource. The FHIR transaction Bundle (DocumentReference + Binary + Provenance) is assembled.

  2. T+200 ms: The Bundle is serialized and transmitted over the pre-established mTLS WebSocket channel to Scribing.io's integration gateway.

  3. T+500 ms: The integration gateway validates the Bundle against the target EMR's FHIR capability statement (or HL7 v2 interface specification), applies the idempotency key, and transmits the write request to the EMR's FHIR server (or HL7 v2 interface engine).

  4. T+1–8 s: The EMR processes the transaction. Write latency varies by EMR vendor and instance load: Epic FHIR endpoints average 2–4 s; athenahealth averages 1–3 s; Oracle Health HL7 v2 interfaces average 3–6 s.

  5. T+8–12 s: The EMR returns a success response (FHIR Bundle response with resource IDs, or HL7 v2 ACK). The integration gateway forwards the acknowledgment back over the WebSocket to the client.

  6. T+12 s (median): The note is chart-visible. The clinician's Scribing.io interface displays a green confirmation badge. If the acknowledgment is not received within 10 seconds of the write attempt, the dead-letter queue is engaged and the retry sequence (with idempotency key) begins.

Guaranteed ceiling: 30 seconds. In 18 months of production operation across 14 health systems, the 99th-percentile write-back latency is 22 seconds. The median is 12 seconds.
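A site that wants to check these figures against its own commit logs could summarize latencies as follows. This is a sketch: the sample values are invented for illustration and are not Scribing.io production data.

```python
import statistics


def latency_summary(samples_s: list, ceiling_s: float = 30.0) -> dict:
    """Median, 99th percentile, and number of commits above the ceiling."""
    cuts = statistics.quantiles(samples_s, n=100, method="inclusive")
    return {
        "median_s": statistics.median(samples_s),
        "p99_s": cuts[98],                      # 99th of the 99 cut points
        "breaches": sum(1 for s in samples_s if s > ceiling_s),
    }


# Hypothetical write-back latencies in seconds.
samples = [8, 10, 12, 12, 14, 22, 31]
summary = latency_summary(samples)
print(summary)
```

Tracking the breach count alongside the percentiles matters because a ceiling is a per-commit promise, while a percentile can look healthy even when individual notes exceed it.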

Step 6: Post-Commit Validation

After the write is acknowledged, Scribing.io performs a read-back verification: a single GET request to the EMR's FHIR server (or a query to the HL7 v2 interface) to confirm that the DocumentReference exists, its status is current, and its content hash matches the committed Binary. If the read-back fails (indicating a write that was acknowledged but not persisted—a rare but documented failure mode in distributed EMR architectures), the system re-commits from the dead-letter queue. This belt-and-suspenders approach ensures that no signed note is ever silently lost.
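The read-back check can be sketched as a status test plus a hash comparison. The `fetch_document` callable and the dictionary shape it returns are hypothetical stand-ins for a FHIR read, used here so the example runs without a server.

```python
import base64
import hashlib


def verify_readback(fetch_document, doc_id: str, committed_note: bytes) -> bool:
    """Confirm the stored document exists, is current, and matches what was sent."""
    doc = fetch_document(doc_id)
    if doc is None or doc.get("status") != "current":
        return False
    stored = base64.b64decode(doc.get("data", ""))
    return hashlib.sha256(stored).digest() == hashlib.sha256(committed_note).digest()


note = b"Assessment: suspected sepsis."
fake_emr = {
    "doc-1": {"status": "current", "data": base64.b64encode(note).decode("ascii")},
    "doc-2": {"status": "current", "data": base64.b64encode(b"truncated").decode("ascii")},
}

persisted = verify_readback(fake_emr.get, "doc-1", note)
corrupted = verify_readback(fake_emr.get, "doc-2", note)
missing = verify_readback(fake_emr.get, "doc-3", note)
print(persisted, corrupted, missing)
```

Any `False` result would trigger the re-commit path described above, reusing the original idempotency key so the retry cannot create a duplicate.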

Technical Reference: ICD-10 Documentation Standards

The write-path architecture is not an abstract engineering exercise. It exists to deliver clinically complete, properly coded documentation into the chart in time for real-time CDI review and accurate claims submission. Sepsis documentation is among the most denial-prone categories in inpatient coding, and specificity failures are the primary cause.

Sepsis Coding: From Generic to Specific

When a clinician documents "sepsis" without specifying the organism or severity, coders default to A41.9 Sepsis, unspecified organism. This code is accurate but non-specific, and it is a red flag for payer audits and CDI queries. CMS and commercial payers increasingly require documentation that supports maximum specificity—organism identification when culture results are available, severity differentiation, and organ dysfunction documentation.

When the clinical picture includes organ dysfunction, the additional code is R65.20 Severe sepsis without septic shock, sequenced after the underlying infection code (A41.9 when the organism is unspecified). This is correct when organ dysfunction is present but the patient does not meet septic shock criteria (persistent hypotension requiring vasopressors + lactate > 2 mmol/L after fluid resuscitation). However, if the clinician's note does not explicitly link the organ dysfunction to the sepsis diagnosis, coders cannot assign R65.20, and the case is coded as simple A41.9, resulting in a lower DRG weight and potential underpayment.

How Scribing.io Drives Specificity

Scribing.io addresses this through three mechanisms:

  1. Culture-aware organism prompting: When blood culture results return with an identified organism (e.g., E. coli, S. aureus), the silent-reasoning tracer prompts the clinician to state the organism in their assessment. This enables the coder to assign a more specific code (e.g., A41.51 for E. coli sepsis) rather than defaulting to A41.9.

  2. Organ dysfunction linkage: When the system detects evidence of organ dysfunction in structured data—acute kidney injury (creatinine rise > 0.3 mg/dL), acute respiratory failure (new supplemental O₂ requirement), or coagulopathy (INR > 1.5)—it prompts the clinician to document whether the dysfunction is attributable to the sepsis. This explicit linkage is what allows R65.20 assignment and appropriate DRG capture.

  3. Septic shock criteria validation: If vasopressors are ordered and lactate remains > 2 mmol/L after 30 mL/kg fluid resuscitation, the system prompts for explicit "septic shock" documentation, supporting R65.21 assignment. Without this prompt, many clinicians document "on pressors" without the formal diagnosis, and the case is coded as severe sepsis (R65.20) rather than septic shock (R65.21)—a significant DRG weight difference.

The result: documentation that supports the highest defensible specificity at the point of care, rather than retrospective CDI queries that delay coding, increase query volume, and frustrate clinicians. Per a JAMA Health Forum analysis of CDI program ROI, shifting sepsis documentation specificity upstream to the point of care reduces CDI query rates by 30–40% for sepsis-related encounters and accelerates final coding by an average of 1.8 days.
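The dependency between what is documented and what can be coded can be expressed as a small decision function. This is an illustration of the logic described in this section, not a coding tool: it covers only the codes named here, and the function and parameter names are hypothetical.

```python
# Only the organism mapping named in the text; a real table is far larger.
ORGANISM_CODES = {"E. coli": "A41.51"}


def sepsis_codes(organism=None, dysfunction_linked_to_sepsis=False,
                 septic_shock_documented=False) -> list:
    """Codes supportable from what the clinician explicitly documented."""
    if organism is None:
        codes = ["A41.9"]                        # sepsis, unspecified organism
    elif organism in ORGANISM_CODES:
        codes = [ORGANISM_CODES[organism]]
    else:
        raise ValueError(f"organism not mapped in this sketch: {organism}")
    if septic_shock_documented:
        codes.append("R65.21")                   # severe sepsis with septic shock
    elif dysfunction_linked_to_sepsis:
        codes.append("R65.20")                   # severe sepsis without septic shock
    return codes


print(sepsis_codes())
print(sepsis_codes("E. coli", dysfunction_linked_to_sepsis=True))
print(sepsis_codes(septic_shock_documented=True))
```

Every branch keys off an explicit documentation flag, which is why a prompt that elicits the missing statement changes the codes a coder can assign.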

CMIO Vendor Evaluation Checklist

If you are evaluating ambient AI scribe vendors, use this checklist. Every item maps to a failure mode we have observed in production deployments of competing systems.

CMIO Technical Due-Diligence Checklist for AI Scribe Vendors

| Category | Question to Ask the Vendor | Red Flag Answer | Scribing.io Answer |
| --- | --- | --- | --- |
| Write-path architecture | Is your EMR write-back synchronous or asynchronous? | "We use REST API calls" (synchronous polling) | Asynchronous write-back over mutually authenticated (mTLS) WebSockets |
| Latency SLA | What is your guaranteed time from session close to chart-visible note? | "Usually a few minutes" or no SLA offered | < 30 seconds guaranteed; median 12 seconds |
| Idempotency | How do you prevent duplicate notes on retry? | "We check for duplicates after the fact" | Client-generated idempotency key (UUID v7) + conditional ETag; exactly-once semantics |
| Failure handling | What happens if the EMR write fails? | "The clinician can re-submit" or no answer | Dead-letter queue with automated retry, alerting, and read-back verification |
| Audit trail | Can you produce a chain of custody from audio to chart for a specific encounter? | "We store the transcript" | FHIR Provenance resource with audio SHA-256, model version, attestation timestamp, and agent references |
| Acoustic robustness | How does your system perform in a noisy ED with multiple speakers? | "We recommend a quiet room" or "We use a standard microphone" | Beamforming + neural diarization; > 96% primary-clinician attribution in validated ED environments |
| Clinical intelligence | Does your system detect documentation gaps for quality measures? | "We transcribe what the doctor says" | Silent-reasoning tracer monitors orders/vitals/labs; just-in-time prompts for SEP-1, HEART score, stroke alert, and 40+ other measure-specific gaps |
| Coding specificity | How do you support ICD-10 specificity at the point of care? | "That's a coding department function" | Culture-aware organism prompting, organ dysfunction linkage prompts, septic shock criteria validation, all at point of care |
| HL7 v2 fallback | What if our EMR does not support FHIR R4 write scopes? | "We require FHIR" or "We use copy-paste" | Automatic fallback to HL7 v2 MDM^T02 with equivalent provenance metadata in OBX segments |
| Rate-limit resilience | How do you handle EMR API rate limiting (HTTP 429)? | No answer or "We haven't encountered that" | 429-aware exponential backoff with jitter; idempotent retry; no duplicate notes, no silent failures |

Every row in this table represents a production failure we have either experienced firsthand during competitive displacement engagements or documented during CMIO-led vendor bake-offs. The write-path is not a feature. It is the foundation. If it fails, nothing else matters—not the AI model's accuracy, not the template library, not the NLP benchmarks.

Still not sure? Book a free discovery call now.

Frequently asked questions

Answers to your questions

Can we get started today?

Can I edit or review notes before they go into my EHR?

Does Scribing.io work with telehealth and video visits?

Is Scribing.io HIPAA compliant?

Is patient data used to train your AI models?


Clinical Precision.
Zero Documentation Debt

Finish Your Charts - Go Home on Time.
