Posted on
Jun 16, 2026
What Does an AI Medical Scribe Do? A Guide for Residency Program Directors
Clinical Update — June 2026: This guide has been revised to reflect the 2023 AMA E/M guideline revisions now in full enforcement across all Medicare Administrative Contractors, updated CMS post-pay audit extrapolation methodologies effective Q1 2026, and Scribing.io's expanded pharmacovigilance knowledge base (v4.2) covering 340+ IMT-qualifying medication–lab pairings. If you read a prior version of this playbook, the clinical logic walkthrough in Section 3 has been substantially rewritten with new FHIR R4 Provenance binding details and HL7 v2 augmentation pathways for Epic November 2025 and Oracle Health (Cerner) Millennium 2026.1 endpoints.
What Does an AI Medical Scribe Do? The Clinical Library Playbook for CMIOs
TL;DR — What Does an AI Medical Scribe Do Beyond Transcription?
Most AI medical scribes transcribe and summarize. That's table stakes. The audit-critical question CMIOs should ask is whether their AI scribe performs Clinical Synthesis—capturing implied clinical intent (the "Layer 2" of every visit) and structuring it into the Medical Decision Making (MDM) framework required for E/M level defense. This playbook explains how Scribing.io cross-walks RxNorm medication data with LOINC lab codes to auto-detect intensive monitoring for toxicity (IMT) contexts, converts them into defensible MDM risk statements backed by FHIR Provenance, and prevents the revenue clawbacks that plague rheumatology, oncology, cardiology, and primary care practices nationwide. If your AI scribe can't prove MDM risk at audit, it isn't a scribe—it's a liability.
Table of Contents
What Does an AI Medical Scribe Actually Do? Defining the Three Layers of Clinical Documentation AI
Beyond Transcription: How Clinical Synthesis Captures Implied Intent and Proves MDM Risk
Scribing.io Clinical Logic: Rheumatology Encounter — Preventing a $22,750 Clawback
Technical Reference: ICD-10 Documentation Standards
FHIR Provenance Architecture: How Audit Trails Are Built
Competitor Gap Analysis: Layer 1 vs. Layer 2 vs. Layer 3
Implementation Checklist for CMIOs
Book a Demo: See Real-Time MDM Synthesis
What Does an AI Medical Scribe Actually Do? Defining the Three Layers of Clinical Documentation AI
The phrase "AI medical scribe" has become a catch-all for any ambient listening tool that converts a patient–physician conversation into a clinical note. The AMA's own coverage of ambient AI documentation—and virtually every competitor in this space—frames the value proposition around three outcomes: reduced burnout, time savings, and improved patient engagement. Those outcomes are real. But they describe only Layer 1 of what an AI medical scribe should do.
Scribing.io was built on a different premise: the primary failure mode of clinical documentation is not speed—it is the gap between what the physician does and what the note proves. Every encounter contains clinical reasoning that never reaches the microphone. That unspoken reasoning is where revenue lives and where audits strike.
Here is the framework CMIOs should use to evaluate any AI scribe platform:
| Layer | Function | What Most AI Scribes Deliver | What Scribing.io Delivers |
|---|---|---|---|
| Layer 1: Transcription & Summarization | Converts speech to text; organizes into SOAP/note format | ✅ Yes: this is the baseline capability | ✅ Yes: with speaker diarization, medical ontology normalization, and specialty-aware templates |
| Layer 2: Clinical Synthesis | Captures implied clinical intent; maps actions to MDM elements (number/complexity of problems, data reviewed, risk of management) | ❌ Not addressed: notes read as prose narratives without MDM-aligned structure | ✅ Yes: cross-walks RxNorm + LOINC to detect IMT contexts, surfaces prompts for clinician confirmation, writes structured MDM risk lines |
| Layer 3: Audit-Defensible Provenance | Binds every MDM assertion to its source evidence with machine-readable provenance chains | ❌ Not addressed: no FHIR Provenance, no traceability from note assertion to source lab/order/external record | ✅ Yes: FHIR R4 Provenance resources link MDM risk statements to MedicationRequest, Observation (lab values), and DocumentReference (external notes); HL7 v2 ORU/ORM augmentation where EHR APIs lack order "reason" fields |
The competitive landscape in 2026—including tools from Nabla, DeepScribe, Abridge, and Nuance DAX—remains concentrated at Layer 1. Some platforms have begun generating "suggested" billing codes. But code suggestion without defensible MDM documentation is worse than no suggestion at all: it creates a false sense of security that collapses at post-pay review. A 2024 JAMA Health Forum analysis of AI-assisted documentation found that note completeness improved but that MDM specificity—the element auditors actually scrutinize—remained physician-dependent. That dependency is the gap.
The question isn't whether an AI scribe can reduce after-hours documentation time (current clinical benchmarks indicate 1.5–3 hours per week in perceived savings across implementations). The question is whether it can protect the revenue it helps generate.
For a detailed look at how these layers apply in Family Medicine workflows—where the volume of chronic disease management encounters makes IMT detection particularly high-yield—see our specialty-specific analysis.
Beyond Transcription: How Clinical Synthesis Captures Implied Intent and Proves MDM Risk
This is the insight the industry has missed.
When a rheumatologist reviews a patient's liver function tests during a methotrexate follow-up, orders repeat labs, adjusts the monitoring interval, and moves on to the next topic, they have performed intensive monitoring for toxicity. That action carries specific weight under the 2023 AMA E/M guidelines: it qualifies as "drug therapy requiring intensive monitoring for toxicity" under the MDM Risk table, which can support a high-risk classification—the difference between a 99214 and a 99215, between approximately $130 and $185 per encounter (current Medicare Physician Fee Schedule national averages), and between $0 and a five-figure clawback across a panel.
But physicians almost never narrate their monitoring rationale out loud. They don't say, "I am now performing intensive monitoring for hepatotoxicity related to methotrexate." They say, "Labs look okay, let's recheck in four weeks." The clinical intent is implied. A human scribe might not catch it. A Layer 1 AI scribe will transcribe the words and move on. This is the "Layer 2" problem—and it is where Scribing.io's Clinical Synthesis engine operates.
The Six-Step Clinical Synthesis Pipeline
RxNorm Cross-Walk: The system identifies active medications from the patient's medication list (pulled via FHIR MedicationRequest resources or HL7 v2 RDE messages). Methotrexate (RxNorm CUI: 6851) is flagged as a drug with known toxicity monitoring requirements per the NIH/NLM methotrexate monitoring guidelines.
LOINC Lab Mapping: The system queries recent and pending lab orders. It identifies ALT (LOINC 1742-6) and AST (LOINC 1920-8) results and orders, mapping them to the hepatotoxicity monitoring protocol for methotrexate.
Cadence Inference: By analyzing historical order intervals (e.g., LFTs ordered every 4 weeks over the past 6 months), the system infers a monitoring cadence consistent with IMT—not incidental lab ordering.
Clinician Prompt (5 seconds): Rather than auto-generating documentation without physician input—which would raise compliance concerns under CMS documentation integrity standards—Scribing.io surfaces a confirmation prompt: "Confirm IMT for hepatotoxicity? q4w LFTs; hold if AST/ALT >3× ULN; last ALT LOINC 1742-6 = 88 U/L (05/14)."
Structured MDM Risk Statement: On confirmation, the system inserts an audit-aligned statement into the note with specific parameters, thresholds, and evidence citations.
FHIR Provenance Binding: The MDM risk statement is linked via FHIR R4 Provenance to the source MedicationRequest (methotrexate), Observation resources (ALT and AST values with dates), and ServiceRequest (the repeat lab order). If the EHR's R4 API omits the order "reason" field—a known gap in some Epic November 2025 and Oracle Health Millennium 2026.1 endpoints—Scribing.io augments the provenance chain via HL7 v2 ORU/ORM message feeds to preserve the source-of-truth link.
This is the audit defense layer that no competitor has built. The result: implied clinical intent is converted into defensible MDM, not prose.
The same Clinical Synthesis logic applies across specialties. In cardiology, the engine detects amiodarone ↔ TSH/LFT/ECG (QTc interval) monitoring contexts. In psychiatry, it maps clozapine ↔ ANC (LOINC 751-8, the absolute neutrophil count) per FDA REMS requirements. In oncology, it handles dozens of chemotherapy agents with overlapping toxicity panels. The pharmacovigilance knowledge base (v4.2, June 2026) covers the 340+ medication–lab pairings most frequently associated with IMT-qualifying monitoring.
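The cross-walk and cadence-inference steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Scribing.io's implementation: the `IMT_PAIRINGS` table, the function names, and the `max_gap_days` and `min_orders` thresholds are all hypothetical, and only the methotrexate RxNorm CUI and the ALT/AST LOINC codes come from the text above.

```python
from datetime import date, timedelta
from statistics import median

# Hypothetical knowledge-base slice: RxNorm CUI -> LOINC codes monitored for toxicity.
IMT_PAIRINGS = {
    "6851": {"name": "methotrexate", "loincs": {"1742-6", "1920-8"}},  # ALT, AST
}


def infer_cadence_days(order_dates):
    """Median gap in days between consecutive order dates, or None with fewer than two."""
    ordered = sorted(set(order_dates))
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return median(gaps) if gaps else None


def detect_imt_candidates(active_rxcuis, lab_orders, max_gap_days=35, min_orders=3):
    """Return drugs whose paired labs recur often enough to look like monitoring.

    lab_orders is a list of (loinc_code, order_date) tuples. The thresholds are
    illustrative defaults, not clinical or vendor-specified values.
    """
    candidates = []
    for rxcui in active_rxcuis:
        pairing = IMT_PAIRINGS.get(rxcui)
        if pairing is None:
            continue
        dates = {d for loinc, d in lab_orders if loinc in pairing["loincs"]}
        cadence = infer_cadence_days(dates)
        if len(dates) >= min_orders and cadence is not None and cadence <= max_gap_days:
            candidates.append(
                {"drug": pairing["name"], "rxcui": rxcui, "cadence_days": cadence}
            )
    return candidates


# Six ALT/AST order dates spaced 28 days apart (q4w).
start = date(2026, 1, 1)
orders = [
    (code, start + timedelta(days=28 * i))
    for i in range(6)
    for code in ("1742-6", "1920-8")
]
print(detect_imt_candidates(["6851"], orders))
```

A candidate returned here would only trigger the clinician prompt; nothing is written to the note until the physician confirms.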
Scribing.io Clinical Logic: Rheumatology Encounter — Preventing a $22,750 Clawback
Scenario: Rheumatology clinic, one-party consent state. A 61-year-old with seropositive RA on methotrexate presents with fatigue and mild transaminitis. The physician reviews two prior LFTs and orders repeat labs but does not state that this is intensive monitoring for toxicity. A post-pay review downcodes ten encounters from 99215 to 99214.
Without Scribing.io: The Anatomy of a Clawback
The physician's note, generated by a conventional AI scribe, reads:
"Patient reports ongoing fatigue. Reviewed recent labs—LFTs mildly elevated. Will recheck in 4 weeks. Continue current medications."
This note is clinically accurate. It is also audit-indefensible at the 99215 level. The post-pay auditor applies the 2023 AMA MDM framework:
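Number and Complexity of Problems: One chronic illness with exacerbation, progression, or side effects of treatment (RA on methotrexate with fatigue and transaminitis) qualifies as Moderate ✅. A single stable chronic illness on its own would reach only Low.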
Amount and/or Complexity of Data: Review of prior labs — could qualify as Moderate or High depending on documentation of independent interpretation and source attribution
Risk of Complications and/or Morbidity or Mortality of Patient Management: The note says "continue current medications." There is no documentation of intensive monitoring for toxicity. The auditor classifies risk as Moderate, not High.
Two of three MDM elements must reach the billed level. With Risk stuck at Moderate, the encounter supports 99214, not 99215. Across ten encounters, the clawback is approximately $55 × 10 = $550 in direct repayment per audit sample. But CMS and commercial payers apply statistical extrapolation to the full claim population. For a busy rheumatology practice with similar documentation patterns across 350 methotrexate patients seen quarterly, the annualized exposure—accounting for extrapolation multipliers—lands in the range of $15,000–$22,750 for a single provider. Add in compliance counsel fees, Corrective Action Plan costs, and the operational drag of appeals, and the true cost exceeds the dollar figure.
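The two-of-three rule and the direct repayment arithmetic can be expressed as a short sketch. The function and dictionary names below are illustrative, the data element's own labels (limited, moderate, extensive) are normalized to the shared level names, and the dollar amounts are the approximate per-encounter figures quoted earlier in this guide.

```python
# Ordered MDM levels, lowest to highest. The data element's "limited/extensive"
# labels are normalized to these names for comparison.
LEVELS = ["straightforward", "low", "moderate", "high"]

# Established-patient office visit codes keyed by overall MDM level.
ESTABLISHED_CPT = {
    "straightforward": "99212",
    "low": "99213",
    "moderate": "99214",
    "high": "99215",
}


def overall_mdm(problems, data, risk):
    """Apply the two-of-three rule: the highest level that at least two elements reach."""
    ranks = sorted(LEVELS.index(level) for level in (problems, data, risk))
    return LEVELS[ranks[1]]  # the middle rank is met or exceeded by two elements


# The auditor's reading of the conventional note: no element reaches High.
audited = overall_mdm(problems="moderate", data="moderate", risk="moderate")
print(ESTABLISHED_CPT[audited])  # 99214

# Direct repayment on the ten-encounter sample.
differential = 185 - 130  # approximate 99215 vs. 99214 payment, in dollars
print(differential * 10)  # 550
```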
With Scribing.io Enabled: The 5-Second Intervention
The encounter proceeds identically. The physician says the same words. But Scribing.io's Clinical Synthesis engine has already completed its cross-walk:
| Step | Scribing.io Action | Time | Data Source |
|---|---|---|---|
| 1. Med Detection | Methotrexate (RxNorm 6851) identified on active medication list | Pre-visit (background) | FHIR MedicationRequest / HL7 v2 RDE |
| 2. Lab Cross-Walk | ALT (LOINC 1742-6) = 88 U/L (05/14); AST (LOINC 1920-8) = 62 U/L (05/14); prior ALT = 45 U/L (04/16) | Pre-visit (background) | FHIR Observation / HL7 v2 ORU |
| 3. Cadence Inference | LFTs ordered q4w × 6 months; consistent with IMT protocol | Pre-visit (background) | FHIR ServiceRequest history |
| 4. Ambient Trigger | Physician mentions "recheck in 4 weeks" during encounter, confirming ongoing monitoring | Real-time (during visit) | Ambient speech recognition |
| 5. Clinician Prompt | "Confirm IMT for hepatotoxicity? q4w LFTs; hold if AST/ALT >3× ULN; last ALT LOINC 1742-6 = 88 U/L (05/14)." | 5 seconds | Prompt displayed on physician's device |
| 6. MDM Risk Line | Structured statement inserted into note with drug name, monitoring parameters, thresholds, and evidence | Instantaneous on tap | Auto-generated, physician-confirmed |
| 7. Provenance Binding | FHIR Provenance resource links MDM statement → MedicationRequest (methotrexate) + Observation (ALT 88, AST 62) + ServiceRequest (repeat LFT order) | Instantaneous | FHIR R4 Provenance; HL7 v2 ORM augmentation if needed |
The resulting note now contains:
"Patient reports ongoing fatigue. Reviewed prior hepatic function panels (ALT 45 U/L on 04/16, ALT 88 U/L on 05/14 — LOINC 1742-6; AST 62 U/L on 05/14 — LOINC 1920-8). Trend shows rising transaminases within monitoring parameters. Management includes prescription drug therapy requiring intensive monitoring for toxicity: methotrexate 15 mg/week for seropositive rheumatoid arthritis (M05.79) with serial hepatic function monitoring (ALT/AST q4w). Hold parameters established: ALT or AST >3× upper limit of normal. Repeat LFT ordered for 4 weeks. This management decision supports high-risk MDM per 2023 AMA E/M guidelines, Table of Risk."
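When the post-pay auditor reviews this note, the Risk column is High. Because two of the three MDM elements must reach High for 99215, and problem complexity here is Moderate, the claim stands when the data element also reaches High; the note's itemized prior results, with dates and LOINC source attribution, give the auditor a documented basis for crediting that review. Multiply this across 350 patients, and the protected revenue is not theoretical: it is the difference between a practice that sustains its operations and one that hemorrhages on appeals.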
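As a rough sketch of the prompt-then-insert step, the code below assembles the confirmation prompt from structured monitoring parameters and emits a risk line only after confirmation. The `ImtContext` class, its field names, and both functions are hypothetical; the wording of the prompt follows the example in the table above.

```python
from dataclasses import dataclass


@dataclass
class ImtContext:
    """Hypothetical container for the monitoring parameters behind one prompt."""
    toxicity: str
    panel: str
    cadence: str
    hold_rule: str
    last_lab_name: str
    last_lab_loinc: str
    last_lab_value: str
    last_lab_date: str


def confirmation_prompt(ctx):
    """Build the one-line prompt shown to the clinician."""
    return (
        f"Confirm IMT for {ctx.toxicity}? {ctx.cadence} {ctx.panel}; "
        f"hold if {ctx.hold_rule}; last {ctx.last_lab_name} "
        f"LOINC {ctx.last_lab_loinc} = {ctx.last_lab_value} ({ctx.last_lab_date})."
    )


def mdm_risk_line(ctx, drug, indication, confirmed):
    """Return the risk statement only when the clinician confirmed; otherwise None."""
    if not confirmed:
        return None
    return (
        "Management includes prescription drug therapy requiring intensive "
        f"monitoring for toxicity: {drug} for {indication} with serial monitoring "
        f"({ctx.panel} {ctx.cadence}). Hold parameters: {ctx.hold_rule}."
    )


ctx = ImtContext(
    toxicity="hepatotoxicity",
    panel="LFTs",
    cadence="q4w",
    hold_rule="AST/ALT >3× ULN",
    last_lab_name="ALT",
    last_lab_loinc="1742-6",
    last_lab_value="88 U/L",
    last_lab_date="05/14",
)
print(confirmation_prompt(ctx))
print(mdm_risk_line(ctx, "methotrexate 15 mg/week", "seropositive RA (M05.79)", confirmed=True))
```

Gating the statement on `confirmed` mirrors the design choice described earlier: the system proposes, and the physician's tap is what puts the line in the note.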
Technical Reference: ICD-10 Documentation Standards
Accurate ICD-10 coding is the second axis of audit defense (the first being MDM documentation). Scribing.io's Clinical Synthesis engine doesn't just generate notes—it ensures that the ICD-10 codes submitted with each encounter reach maximum specificity, which is the standard CMS requires to avoid denials and the standard commercial payers increasingly enforce via pre-pay edits.
How Scribing.io Ensures Maximum ICD-10 Specificity
Consider the rheumatology scenario above. The diagnosis is seropositive rheumatoid arthritis, coded in this example as M05.79 (Rheumatoid arthritis with rheumatoid factor of multiple sites without organ or systems involvement); when a single site and laterality are documented, the corresponding site-specific code in the M05.7- subcategory applies instead. But the encounter also involves long-term methotrexate use and the monitoring context that supports it. Scribing.io auto-suggests the supporting codes that many practices under-report:
Z79.899 (Other long term (current) drug therapy): For methotrexate, Z79.899 is the correct supporting code; Z79.01 (Long term (current) use of anticoagulants) is its counterpart for anticoagulant therapy. Z79.899 signals to payers that the patient is on a long-term medication regimen requiring ongoing management, which contextualizes the medical necessity of serial lab monitoring. When Z79.899 is absent from the claim, payers' automated pre-pay edits may flag the repeated LFT orders as potentially unnecessary, triggering Additional Documentation Requests (ADRs) that consume staff time even when the claim is ultimately paid.
R74.01 (Elevation of levels of liver transaminase levels) — Documents the clinical finding that prompted the monitoring review. Without this code, the "mild transaminitis" noted in the physician's narrative has no structured representation on the claim.
T45.1X5A or the appropriate T-code for adverse effect — When transaminase elevation is attributed to methotrexate, this code links the finding to the drug, closing the causal chain that auditors look for.
Scribing.io surfaces these codes as suggestions during note finalization, mapped directly from the Clinical Synthesis output. The physician confirms or modifies. The codes are then written to the EHR's billing module via the same FHIR or HL7 interface used for note delivery, ensuring that the ICD-10 codes on the claim match the documentation in the note. This is a consistency requirement that the CMS ICD-10 documentation standards make explicit and that auditors verify by comparing the CMS-1500 or 837P claim with the encounter note.
Common Denial Patterns Scribing.io Prevents
| Denial Trigger | Root Cause | Scribing.io Prevention Mechanism |
|---|---|---|
| Unspecified code submitted (e.g., M06.9 instead of M05.79) | AI scribe transcribes "RA" without pulling seropositivity from problem list | Clinical Synthesis cross-references active problem list for highest-specificity code; prompts if documentation supports a more specific code |
| Missing Z-code for long-term drug therapy | Coder or auto-coder doesn't link active medication list to supporting codes | RxNorm → Z-code mapping auto-suggests Z79.899 when long-term DMARDs or immunosuppressants are detected, and Z79.01 for anticoagulants |
| Lab order flagged as "not medically necessary" | No ICD-10 code on the lab order links to a condition requiring monitoring | When IMT is confirmed, Scribing.io pre-populates the lab order's diagnosis pointer with the primary condition code + R74.01 if transaminitis is documented |
| Claim-note mismatch on laterality or specificity | Note says "right knee RA" but claim submits M05.70 (unspecified site) | Anatomical entity recognition from ambient audio maps site and laterality to the matching site-specific code (e.g., M05.761 for the right knee); flags mismatches before note finalization |
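The first two prevention mechanisms in the table amount to lookups against the problem list and the medication list. The sketch below shows one way that could work; the lookup tables and the `review_claim_codes` function are hypothetical stand-ins for much larger maintained code sets, and the output is a set of suggestions for clinician review rather than an automatic claim change.

```python
# Hypothetical lookup tables for illustration; real code sets are far larger.
SPECIFIC_PREFIX_FOR = {"M06.9": "M05."}      # unspecified RA -> seropositive RA category
LONG_TERM_DRUG_CODES = {"6851": "Z79.899"}   # methotrexate RxCUI -> Z-code named in this guide
FINDING_CODES = {"transaminitis": "R74.01"}


def review_claim_codes(claim_codes, problem_list_codes, active_rxcuis, findings):
    """Return proposed replacements and additions for clinician review."""
    replace, add = {}, []

    # Flag unspecified codes when the problem list holds a more specific one.
    for code in claim_codes:
        prefix = SPECIFIC_PREFIX_FOR.get(code)
        matches = [c for c in problem_list_codes if prefix and c.startswith(prefix)]
        if matches:
            replace[code] = matches[0]

    # Suggest supporting codes implied by active drugs and documented findings.
    candidates = [LONG_TERM_DRUG_CODES.get(rxcui) for rxcui in active_rxcuis]
    candidates += [FINDING_CODES.get(finding) for finding in findings]
    for code in candidates:
        if code and code not in claim_codes and code not in add:
            add.append(code)

    return {"replace": replace, "add": add}


suggestions = review_claim_codes(
    claim_codes=["M06.9"],
    problem_list_codes=["M05.79"],
    active_rxcuis=["6851"],
    findings=["transaminitis"],
)
print(suggestions)
# {'replace': {'M06.9': 'M05.79'}, 'add': ['Z79.899', 'R74.01']}
```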
FHIR Provenance Architecture: How Audit Trails Are Built
Audit defense is not just about what the note says—it is about whether you can prove where the information came from. The FHIR R4 Provenance resource exists precisely for this purpose: it creates a machine-readable record of who generated a piece of data, what source it came from, and when it was recorded.
Scribing.io generates a Provenance resource for every MDM assertion that relies on external data (lab values, medication lists, external records). The structure follows this pattern:
Target: The DocumentReference containing the encounter note
Agent: The physician (as confirming clinician) and the Scribing.io system (as assembling agent), with distinct role codes per FHIR Provenance Agent Type
Entity (source): The specific FHIR Observation (lab result), MedicationRequest (active prescription), or ServiceRequest (lab order) that the MDM assertion references, each with its own resource ID and timestamp
Recorded: The timestamp of the physician's confirmation tap
This architecture means that when an auditor requests evidence for a high-risk MDM assertion—"Where is the proof that this physician reviewed serial LFTs for methotrexate hepatotoxicity monitoring?"—the answer is not "read the note." The answer is a computable, timestamped chain from MDM statement → lab values → medication → order history, each link independently verifiable in the EHR.
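To make the structure concrete, here is a minimal sketch of a FHIR R4 Provenance resource assembled as a plain Python dictionary. The element names (`target`, `recorded`, `agent`, `entity`) and the `verifier`/`assembler` participant codes come from the FHIR R4 specification; the `build_provenance` helper, every resource ID, and the timestamp are made up for illustration, and the exact role codes Scribing.io assigns are an assumption.

```python
import json

PARTICIPANT_TYPE = "http://terminology.hl7.org/CodeSystem/provenance-participant-type"


def build_provenance(note_id, clinician_id, device_id, source_refs, confirmed_at):
    """Assemble a Provenance resource linking a note to its source resources."""

    def agent(code, reference):
        return {
            "type": {"coding": [{"system": PARTICIPANT_TYPE, "code": code}]},
            "who": {"reference": reference},
        }

    return {
        "resourceType": "Provenance",
        "target": [{"reference": f"DocumentReference/{note_id}"}],
        "recorded": confirmed_at,  # time of the physician's confirmation
        "agent": [
            agent("verifier", f"Practitioner/{clinician_id}"),  # confirming clinician
            agent("assembler", f"Device/{device_id}"),          # assembling system
        ],
        "entity": [{"role": "source", "what": {"reference": ref}} for ref in source_refs],
    }


# All identifiers below are hypothetical placeholders.
provenance = build_provenance(
    note_id="note-123",
    clinician_id="dr-456",
    device_id="scribe-engine-1",
    source_refs=[
        "MedicationRequest/mtx-001",
        "Observation/alt-0514",
        "Observation/ast-0514",
        "ServiceRequest/lft-repeat-001",
    ],
    confirmed_at="2026-05-14T10:42:07-05:00",
)
print(json.dumps(provenance, indent=2))
```

Each `entity.what` reference can be resolved independently in the EHR, which is what makes the chain verifiable link by link.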
For EHR endpoints that do not expose a complete FHIR R4 API (specifically, those that lack the ServiceRequest "reasonCode" or "reasonReference" element, which is common in pre-2026 Epic FHIR facades and some Oracle Health Millennium configurations), Scribing.io falls back to HL7 v2 ORU (lab result) and ORM (order) message feeds. These messages carry the OBR-31 (Reason for Study) and ORC-16 (Order Control Code Reason) fields, which capture the clinical indication for the order and which Scribing.io parses and links into the Provenance chain. The dual-pathway approach (FHIR primary, HL7 v2 fallback) ensures provenance integrity regardless of EHR configuration maturity.
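The fallback extraction can be illustrated with a small parser that needs no HL7 library. This is a sketch under simplifying assumptions: it handles only pipe-delimited segments with default delimiters, ignores field repetitions and escape sequences, and runs against a fabricated sample message whose application names and identifiers are placeholders.

```python
def parse_coded_element(raw):
    """Split an HL7 v2 coded element (identifier^text^coding system) into a dict."""
    parts = (raw.split("^") + ["", "", ""])[:3]
    return {"code": parts[0], "text": parts[1], "system": parts[2]}


def order_reasons(message):
    """Collect populated ORC-16 and OBR-31 values from an HL7 v2 message."""
    reasons = {}
    for segment in filter(None, message.replace("\n", "\r").split("\r")):
        fields = segment.split("|")
        # For non-MSH segments, fields[n] corresponds to field n (e.g. OBR-31).
        if fields[0] == "ORC" and len(fields) > 16 and fields[16]:
            reasons["ORC-16"] = parse_coded_element(fields[16])
        elif fields[0] == "OBR" and len(fields) > 31 and fields[31]:
            reasons["OBR-31"] = parse_coded_element(fields[31])
    return reasons


def make_segment(name, values, size):
    """Build a segment with `size` fields, filling only the given positions."""
    fields = [name] + [""] * size
    for index, value in values.items():
        fields[index] = value
    return "|".join(fields)


# Fabricated sample order message; all identifiers are placeholders.
sample = "\r".join([
    "MSH|^~\\&|EHR|CLINIC|SCRIBE|VENDOR|202605141030||ORM^O01|MSG0001|P|2.5.1",
    make_segment("ORC", {1: "NW"}, 16),
    make_segment(
        "OBR",
        {1: "1", 4: "1742-6^ALT^LN", 31: "Z79.899^Other long term (current) drug therapy^I10"},
        31,
    ),
])
print(order_reasons(sample))
```

In this sample ORC-16 is left empty, so only the OBR-31 indication is returned; a production parser would also need to handle repeating values and vendor-specific variations.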
Competitor Gap Analysis: Layer 1 vs. Layer 2 vs. Layer 3
CMIOs evaluating ambient AI documentation tools should test for specific capabilities, not marketing claims. This matrix reflects publicly documented feature sets as of June 2026:
| Capability | Scribing.io | Nuance DAX Copilot | Abridge | DeepScribe | Nabla |
|---|---|---|---|---|---|
| Ambient speech → SOAP note | ✅ | ✅ | ✅ | ✅ | ✅ |
| Speaker diarization | ✅ | ✅ | ✅ | ✅ | ✅ |
| Specialty-aware templates | ✅ (42 specialties) | ✅ (select specialties) | ✅ (select specialties) | ✅ (select specialties) | Partial |
| RxNorm ↔ LOINC IMT cross-walk | ✅ (340+ drug–lab pairs) | ❌ | ❌ | ❌ | ❌ |
| Cadence inference from historical orders | ✅ | ❌ | ❌ | ❌ | ❌ |
| Clinician-confirmed MDM risk line insertion | ✅ | ❌ | ❌ | ❌ | ❌ |
| FHIR R4 Provenance binding for MDM assertions | ✅ | ❌ | ❌ | ❌ | ❌ |
| HL7 v2 ORU/ORM fallback for provenance | ✅ | N/A | N/A | N/A | N/A |
| ICD-10 specificity enforcement with Z-code mapping | ✅ | Partial (code suggestion only) | Partial | Partial | ❌ |
The pattern is consistent: competitors deliver Layer 1 (transcription and summarization) with growing competence. Some are adding code suggestion features. None have built the Layer 2 (Clinical Synthesis) and Layer 3 (Provenance) infrastructure that turns documentation into audit defense. The gap isn't cosmetic—it is structural, requiring pharmacovigilance knowledge bases, interoperability with EHR order systems, and FHIR Provenance resource generation that cannot be retrofitted onto a transcription-first architecture.
Implementation Checklist for CMIOs
Before selecting or renewing an AI scribe contract, CMIOs should validate the following capabilities through live demonstration—not slide decks:
Run the methotrexate test. Load a test patient with active methotrexate, three months of q4w LFT orders, and a most recent ALT above 60 U/L. Conduct a simulated encounter where the physician says only "labs look okay, recheck in four weeks." Verify that the system detects IMT context, prompts for confirmation, and generates a structured MDM risk line. If it doesn't, the system is Layer 1 only.
Inspect the Provenance resource. After the test encounter, request the FHIR Provenance resource generated for the MDM risk assertion. Verify that it links to the specific Observation (lab values), MedicationRequest (methotrexate), and ServiceRequest (repeat order). If the vendor cannot produce a Provenance resource, their audit defense claim is marketing.
Test the HL7 v2 fallback. If your EHR runs on a FHIR R4 facade that omits ServiceRequest.reasonCode (common in Epic pre-November 2025 builds), verify that the vendor can parse HL7 v2 ORM messages to capture order indications. Request a sample ORM message showing OBR-31 extraction.
Verify ICD-10 specificity enforcement. Submit the test encounter with an intentionally unspecified code (e.g., M06.9). Verify that the system flags the specificity gap and suggests M05.79 + Z79.899 based on the clinical context.
Request a post-pay audit simulation. Ask the vendor to walk through how their documentation would respond to a post-pay review requesting evidence for 99215 billing. Specifically test whether the MDM Risk column can be defended without relying on the physician's memory of the encounter.
Validate consent handling. Confirm the system's ambient recording initiation complies with your state's consent framework (one-party vs. two-party) and that consent documentation is captured in the EHR audit log.
Benchmark against your denial rate. Request a 90-day pilot with a defined panel of high-IMT-prevalence patients (rheumatology, oncology, cardiology). Measure pre/post denial rates for 99215 claims and LFT/CBC lab order medical necessity denials.
Book a Demo: See Real-Time MDM Synthesis
The gap between what physicians do and what their notes prove is measurable, preventable, and—with the right infrastructure—closable in under 30 seconds per encounter.
Book a demo with Scribing.io to see real-time MDM Synthesis with an RxNorm–LOINC IMT detector and EHR-bound FHIR Provenance/HL7 trail that turns implied intent into audit-ready MDM in under 30 seconds. We'll run the methotrexate scenario live against your EHR environment, show you the Provenance resource chain, and benchmark projected revenue protection for your specific specialty mix and payer profile.
Your AI scribe should do more than write notes. It should defend them.



