Agentic AI


If you're evaluating agentic AI vendors, you need to understand how the architecture actually works.

This page explains the reference architecture for healthcare agentic AI systems. Five layers. Each serves a specific purpose. All five required.

The Five-Layer Architecture

Each layer addresses a specific requirement healthcare organizations have. Remove any layer, and the system fails compliance, safety, or operational requirements.

Layer 1

Orchestration

Multi-agent coordination and planning

Layer 2

Knowledge

Retrieval-augmented generation over your data; MCP, A2A, and function calling to your data sources

Layer 3

Integration

Tool calling and system connectivity

Layer 4

Governance

Human oversight, guardrails, compliance

Layer 5

Observability

Monitoring, cost tracking, quality assurance

Layer 1 – Orchestration:
How Agents Plan and Execute

The Problem

Healthcare workflows are complex. A prior authorization requires retrieving patient data, analyzing payer policies, generating documentation, getting human approval, submitting to portals, and tracking status. You need a system that can plan this sequence and coordinate execution.

The Solution: Multi-Agent Orchestration

Instead of one massive AI trying to handle everything, the orchestration layer uses specialized agents coordinated by a supervisor agent that plans overall workflow, routes tasks to specialists, and handles exceptions.

Specialized Sub-Agents:

  • Patient Data Agent → Retrieves and synthesizes clinical information

  • Policy Agent → Looks up payer-specific requirements

  • Documentation Agent → Generates medical necessity letters

  • Submission Agent → Interfaces with payer portals

  • Monitoring Agent → Tracks status and handles responses
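As a sketch, the supervisor pattern can be as simple as a routing table mapping workflow steps to specialist agents. The step and agent names below are hypothetical, chosen to mirror the list above, not an actual implementation:

```python
# Illustrative supervisor-agent routing (hypothetical step and agent names).
# The supervisor plans the sequence, routes each step to a specialist,
# and escalates anything it cannot route.

ROUTING_TABLE = {
    "retrieve_chart": "patient_data_agent",
    "lookup_policy": "policy_agent",
    "draft_letter": "documentation_agent",
    "submit": "submission_agent",
    "track_status": "monitoring_agent",
}

def supervise(workflow_steps):
    """Plan a workflow: pair each step with the agent responsible for it."""
    plan = []
    for step in workflow_steps:
        agent = ROUTING_TABLE.get(step)
        if agent is None:
            # Exception handling: unknown steps go to a human, not a guess.
            plan.append((step, "escalate_to_human"))
        else:
            plan.append((step, agent))
    return plan

prior_auth = ["retrieve_chart", "lookup_policy", "draft_letter",
              "submit", "track_status"]
plan = supervise(prior_auth)
```

The modularity claim below follows directly from this shape: swapping the Policy Agent changes one table entry and one specialist, not the supervisor.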

Why This Matters:

Single-agent systems become brittle as complexity increases. When one agent tries to handle chart review, policy interpretation, documentation generation, and portal submission, performance degrades and failures are hard to diagnose.


Multi-agent architecture gives you modularity. When payer policies change, you update the Policy Agent. When documentation requirements change, you update the Documentation Agent. You don't rebuild the entire system.

Layer 2 – Knowledge:
How Agents Access Your Data

The Problem

AI models trained on generic medical data don't know your patient's history, your payer contracts, your institutional protocols, or your historical prior authorization patterns. Even worse, they can't interact with your systems to retrieve records, check eligibility, or trigger workflows. You need agents grounded in your actual data and equipped to act on it.

The Solution: Retrieval-Augmented Generation (RAG) + Function Calling

This layer gives agents two essential capabilities:


Retrieval-Augmented Generation (RAG) connects agents to your enterprise knowledge in real-time. When an agent needs information, it retrieves it from your systems before generating responses.


Function and Tool Calling enables agents to take action: query databases, check real-time eligibility, call APIs, trigger notifications, or write back to systems.


Together, these capabilities transform agents from text generators into systems that reason over your data and execute workflows.
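A minimal sketch of the two capabilities side by side, with stubbed data sources standing in for a real EHR API and retrieval index (all names and data here are illustrative):

```python
# Sketch: RAG (read) + function calling (act), with stubbed backends.

# Stand-in for a retrieval index over enterprise documents.
KNOWLEDGE_BASE = {
    "payer_policy": "Step therapy: >=3 months conventional DMARD "
                    "with inadequate response.",
    "ehr:pt-123": "Methotrexate 15mg weekly x 6 months; DAS28 5.8.",
}

# Stand-in for callable tools exposed to the agent.
TOOLS = {
    "check_eligibility": lambda patient_id: {"patient_id": patient_id,
                                             "eligible": True},
}

def retrieve(query):
    """RAG step: ground the agent in actual data before it generates."""
    return KNOWLEDGE_BASE.get(query, "")

def run_agent(patient_id):
    policy = retrieve("payer_policy")                 # read: payer rules
    chart = retrieve(f"ehr:{patient_id}")             # read: patient record
    status = TOOLS["check_eligibility"](patient_id)   # act: tool call
    return {"grounded_on": [policy, chart], "eligibility": status}
```

Note the division of labor: `retrieve` supplies grounding text, the tool call produces a side effect or live answer; the agent's output cites both.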

Your Knowledge Sources:

  • EHR Data → Patient records, clinical notes, labs, imaging, medications

  • Payer Policies → Medical necessity criteria, coverage requirements

  • Institutional Protocols → Your formularies, care pathways, documentation standards

  • Historical Data → Prior successful authorizations, denial patterns, appeal strategies

Why This Matters:

Healthcare AI without RAG is guessing. Healthcare AI without function calling is trapped in read-only mode.


RAG ensures agents work with your actual data. When payer policies change, agents access updated policies immediately. When clinical guidelines evolve, agents reference current standards. No retraining required.


Function calling ensures agents can act on that data. Check eligibility in real-time. Route documents to the right reviewer. Trigger alerts when thresholds are crossed. Execute workflows end-to-end.


Together, they make agents useful in production healthcare environments.

Example: Prior Authorization for Specialty Medication

Agent processes prior authorization request for patient with rheumatoid arthritis requiring Humira.

Without RAG + Function Calling:

Agent generates:

"Patient has rheumatoid arthritis and requires Humira for disease management. This medication is medically necessary for treatment."

Letter submitted → Denied 5 days later for missing step therapy documentation → Staff manually searches records → Resubmitted after 3 additional days.

With RAG + Function Calling:

Agent calls formulary function → Discovers step therapy requirement (methotrexate trial required)

Agent retrieves from EHR via RAG:

  • Methotrexate 15mg weekly for 6 months (documented trial)

  • Treatment failure: DAS28 score >5.1, persistent joint swelling

  • Current disease activity: 8 tender joints, ESR 45 mm/hr

Agent retrieves payer policy via RAG → Confirms requirement met (≥3 months conventional DMARD with inadequate response)

Agent generates: "Patient completed 6-month methotrexate trial (15mg weekly, Jan-June 2024) with inadequate response. DAS28 remains elevated at 5.8 with 8 tender joints, 6 swollen joints, and ESR 45 mm/hr. Meets payer step therapy requirement for biologic initiation."

Agent calls review function → Routes to rheumatology nurse for approval before submission

All requirements addressed upfront. Submitted correctly on first attempt.
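The flow above can be sketched in code. Everything here is stubbed with the values from the example (hypothetical function names, not a production implementation):

```python
# Sketch of the prior-auth flow: formulary check -> EHR retrieval ->
# policy check -> route to human reviewer. Stubbed data from the example.

def formulary_check(drug):
    # Function call: discovers the step therapy requirement.
    return {"step_therapy_required": True, "required_trial": "methotrexate"}

def ehr_retrieve(patient_id):
    # RAG over the EHR: documented trial and disease activity.
    return {"trial_months": 6, "das28": 5.8, "esr_mm_hr": 45}

def meets_step_therapy(record):
    # Payer policy: >=3 months conventional DMARD, inadequate response
    # (DAS28 > 5.1 indicates continued high disease activity).
    return record["trial_months"] >= 3 and record["das28"] > 5.1

def process_prior_auth(patient_id, drug):
    requirement = formulary_check(drug)
    record = ehr_retrieve(patient_id)
    if requirement["step_therapy_required"] and not meets_step_therapy(record):
        return {"action": "escalate", "reason": "step therapy not documented"}
    # Human-in-the-loop: route to the reviewer before submission.
    return {"action": "route_to_reviewer", "evidence": record}
```

The key point is the ordering: requirements are checked against retrieved evidence before anything is generated or submitted, which is what prevents the first-pass denial.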


Layer 3 – Integration:
How Agents Connect to Your Systems

The Problem

Your workflows span EHRs, payer portals, lab systems, imaging platforms, scheduling software. Agents need to read from and write to these systems. Manual integration for each system and each workflow is expensive and brittle.

The Solution: Tool Calling & Model Context Protocol (MCP)

Agents execute actions by calling tools—functions that interact with your systems. MCP standardizes how tools are defined, secured, and invoked.

Tool Examples:
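As a hypothetical sketch, using the JSON-Schema-style tool definitions common to function-calling APIs (illustrative names, not a specific vendor's MCP server):

```python
# Illustrative tool declarations: defined once, discovered by name.
# Schemas follow the JSON-Schema object style used by function-calling APIs.

TOOLS = [
    {
        "name": "check_eligibility",
        "description": "Real-time insurance eligibility check for a patient.",
        "parameters": {
            "type": "object",
            "properties": {
                "patient_id": {"type": "string"},
                "payer_id": {"type": "string"},
            },
            "required": ["patient_id", "payer_id"],
        },
    },
    {
        "name": "submit_prior_auth",
        "description": "Submit a completed prior-auth packet to a payer portal.",
        "parameters": {
            "type": "object",
            "properties": {"packet_id": {"type": "string"}},
            "required": ["packet_id"],
        },
    },
]

def find_tool(name):
    """Dynamic discovery: agents look tools up instead of hard-coding calls."""
    return next((t for t in TOOLS if t["name"] == name), None)
```

Because agents resolve tools by name and schema, adding a new tool means adding one entry to the registry, with no changes to the agents themselves.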

Why This Matters:

Healthcare has dozens of disconnected systems. Tool calling eliminates manual system-hopping and data entry. MCP makes integration maintainable as your systems evolve.


Without MCP, you build custom integrations for every tool. With MCP, you define tools once using a standard protocol. Agents discover them dynamically. Adding new tools doesn't require re-engineering agents.

Layer 4 – Governance:
How You Maintain Control

The Problem

Healthcare AI systems access sensitive patient data, influence clinical workflows, and generate documentation affecting care decisions. You need oversight mechanisms, safety controls, and audit trails meeting regulatory requirements.

The Solution: Governance Layer with Four Components

Component 1: Human-in-the-Loop (HITL)

Clinical and operational staff review and approve agent actions at defined decision points. When an agent completes documentation or plans a significant action, staff see the output, source citations, and reasoning chain. They can approve, edit and approve, reject, or escalate. Every decision is logged with user ID and timestamp.
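A minimal sketch of such a decision record, with a hypothetical schema built from the fields the text names (decision, user ID, timestamp):

```python
# Sketch of a HITL decision record (hypothetical schema).
from datetime import datetime, timezone

# The four decisions available to reviewing staff.
ALLOWED_DECISIONS = {"approve", "edit_and_approve", "reject", "escalate"}

def log_review(user_id, agent_output_id, decision, edits=None):
    """Record one human review decision with user ID and UTC timestamp."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    return {
        "user_id": user_id,
        "agent_output_id": agent_output_id,
        "decision": decision,
        "edits": edits,  # populated only for edit_and_approve
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```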

Component 2: Guardrails

Technical controls prevent agents from taking out-of-scope actions. Agents are authorized only for defined tasks (prior auth documentation, not clinical diagnosis). Data access is limited to minimum necessary. Tool invocations are restricted by role-based permissions. Clinical recommendations are filtered against institutional formularies, contraindications are checked, and off-label use is flagged for physician review. All generated content must reference retrieved source data—no hallucinated clinical facts, and payer criteria citations must match actual policy text.

Component 3: Audit Trails

Comprehensive logging of all agent activities for compliance and quality improvement. Every action is logged: what patient data was retrieved and when, which tools were called with what parameters, agent reasoning chains, staff reviews and approvals, and system errors or escalations. Audit trails use tamper-proof storage with 6-7 year retention periods, are searchable and exportable for investigations, and provide real-time access for compliance teams.

Component 4: Continuous Evaluation

Ongoing quality assurance monitors agent accuracy, clinical appropriateness, and operational performance. Automated evaluations compare outputs to ground truth, check policy compliance, and analyze consistency across similar cases. Physicians review random samples (5-10% of outputs) for clinical appropriateness and documentation quality. Operational metrics track approval rates, edit rates, escalation rates, and time savings. Quality thresholds trigger alerts: accuracy must stay above 95%, approval rate above 90%, clinical appropriateness above 4/5—or the system pauses for review.
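The threshold logic can be sketched directly from the values above (accuracy above 95%, approval rate above 90%, clinical appropriateness above 4/5):

```python
# Sketch of quality-threshold alerting; floors taken from the text.
THRESHOLDS = {
    "accuracy": 0.95,
    "approval_rate": 0.90,
    "clinical_appropriateness": 4.0,  # on a 5-point scale
}

def should_pause(metrics):
    """Return the list of breached thresholds; any breach pauses the system."""
    return [name for name, floor in THRESHOLDS.items()
            if metrics.get(name, 0) < floor]
```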


Why This Matters:

Healthcare organizations are accountable for AI system actions under HIPAA, Joint Commission standards, and medical practice laws. Governance isn't optional. It's how you deploy AI with confidence in safety, compliance, and institutional control.

Layer 5 – Observability:
How You Monitor and Optimize

The Problem

AI systems in production consume resources (compute, API calls, staff time) and produce outputs of varying quality. Without visibility, you can't measure ROI, identify issues, or optimize performance. You're flying blind.

The Solution: Observability Platform

Real-time monitoring of agent behavior, costs, quality, and operational impact with dashboards, alerts, and analytics.
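As an illustration, a per-workflow cost roll-up might look like this. The prices are hypothetical, chosen only so the example lands in the $1-2 per prior authorization range cited later on this page:

```python
# Sketch of per-workflow cost tracking (hypothetical unit prices).

def workflow_cost(llm_tokens, tool_calls=0,
                  price_per_1k_tokens=0.01, price_per_tool_call=0.02):
    """Roll up LLM and tool-call spend for one workflow run, in dollars."""
    return llm_tokens / 1000 * price_per_1k_tokens \
        + tool_calls * price_per_tool_call

# One prior-auth run: 80k tokens across agents, 12 tool invocations.
cost = workflow_cost(80_000, tool_calls=12)
```

Aggregating this per workflow type is what makes the ROI conversation with leadership concrete rather than anecdotal.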

Why This Matters:

You can't justify AI investment to leadership without clear ROI data. You can't optimize performance without understanding bottlenecks. You can't catch quality issues before they affect patients without monitoring.


Observability transforms AI from a black box into a managed operational system with clear metrics, continuous improvement, and data-driven decision-making.

Why Scalefresh's Reference Architecture

We built this architecture specifically for healthcare operational workflows after working with multiple health systems. Scalefresh’s reference architecture provides a repeatable foundation for deploying agentic AI safely across healthcare workflows—without locking organizations into brittle, one-off implementations.

Healthcare-Specific Design:

  • Multi-agent orchestration handles clinical workflow complexity

  • RAG over EHR data prevents hallucination of clinical facts

  • Tool calling integrates with Epic, Cerner, MEDITECH

  • HITL checkpoints designed with clinical staff input

  • Audit trails meet HIPAA and Joint Commission requirements

Production-Ready:

  • Deployed at hospitals processing hundreds of workflows daily

  • Cost optimization (typical $1-2 per prior authorization)

  • Performance monitoring with real-time alerts

  • Continuous evaluation with clinical expert review

Transparent Implementation:

  • Architecture documentation provided before deployment

  • Source code review available for security teams

  • Clear data flow diagrams showing what goes where

  • Cost modeling with actual usage data
