AI UX Patterns

The craft of AI UX is managing unpredictability. LLM responses have variable latency, variable quality, and variable length. Every pattern here exists to make that unpredictability feel smooth to users.

Streaming UI

SSE (Server-Sent Events) remains the production standard for token streaming. Prefer it over WebSockets: the browser's EventSource client reconnects automatically, the stream runs over standard HTTP, and the infrastructure is simpler.
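
To make the wire format concrete, here is a minimal sketch of client-side SSE parsing. It assumes the server sends standard `data:` lines with a blank line between events; the helper name `parseSseBuffer` and the result shape are illustrative and not taken from any SDK.

```typescript
// Result of parsing whatever text has arrived so far.
interface SseParseResult {
  events: string[];  // complete `data:` payloads, in arrival order
  remainder: string; // trailing partial event; prepend it to the next chunk
}

// Hypothetical helper: split an SSE text buffer into complete events.
// An event ends at a blank line; multi-line `data:` fields are joined with "\n".
function parseSseBuffer(buffer: string): SseParseResult {
  const normalized = buffer.replace(/\r\n/g, "\n");
  const blocks = normalized.split("\n\n");
  // The last block has not been terminated by a blank line yet, so hold it back.
  const remainder = blocks.pop() ?? "";
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines = block
      .split("\n")
      .filter((line) => line.startsWith("data:")) // skips comments and other fields
      .map((line) => line.slice(5).replace(/^ /, ""));
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return { events, remainder };
}
```

The caller keeps `remainder` between network reads, so an event split across two chunks is only emitted once it is complete.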

Rendering Rules

| Rule | Why |
| --- | --- |
| Buffer incomplete markdown before rendering | Partial code fences and half-formed links break the UI |
| Container grows without shifting surrounding elements | Prevents CLS (Cumulative Layout Shift) |
| Stop button prominent during streaming | Users need control over runaway responses |
| Throttle rendering during fast delivery | Prevents excessive re-renders (Vercel AI SDK: experimental_throttle) |
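
The first rule can be sketched as a function that splits the streamed text into a part that is safe to render and a part to hold back. This is a simplified illustration: `splitRenderable` is a hypothetical name, and it only handles unclosed code fences and one trailing unfinished link, not the full markdown grammar.

```typescript
interface RenderSplit {
  safe: string;    // can be handed to the markdown renderer now
  pending: string; // held back until more tokens arrive
}

// Three backticks, built at runtime so this example nests cleanly in a fence.
const FENCE = "`".repeat(3);

// Matches a tail that starts a link but has not finished it yet:
// "[text", "[text]", or "[text](partial-url".
const UNFINISHED_LINK = /^\[[^\]\n]*(\](\([^)\n]*)?)?$/;

function splitRenderable(streamed: string): RenderSplit {
  // An odd number of fence markers means the last code fence is still open.
  const fenceCount = streamed.split(FENCE).length - 1;
  if (fenceCount % 2 === 1) {
    const cut = streamed.lastIndexOf(FENCE);
    return { safe: streamed.slice(0, cut), pending: streamed.slice(cut) };
  }
  // Otherwise hold back only a link that is still being typed at the very end.
  const linkStart = streamed.lastIndexOf("[");
  if (linkStart !== -1 && UNFINISHED_LINK.test(streamed.slice(linkStart))) {
    return { safe: streamed.slice(0, linkStart), pending: streamed.slice(linkStart) };
  }
  return { safe: streamed, pending: "" };
}
```

When the stream ends, render `safe + pending` unconditionally so nothing held back is lost.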

Accessibility for Streaming

  • aria-live="polite" on response container
  • aria-atomic="false" so screen readers announce only new tokens
  • Debounce announcements to every few seconds (not every token)
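
One way to batch announcements is sketched below. `AnnouncementBuffer` is a hypothetical class, and the clock is passed in explicitly so the logic stays testable; in a real page the returned text would be written into the aria-live container described above.

```typescript
// Collects streamed tokens and releases them to the live region at most
// once per interval, instead of once per token.
class AnnouncementBuffer {
  private pending = "";
  private lastFlushMs: number;
  private readonly intervalMs: number;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastFlushMs = startMs;
  }

  // Add a token. Returns text to announce, or null if it is too soon.
  push(token: string, nowMs: number): string | null {
    this.pending += token;
    if (nowMs - this.lastFlushMs < this.intervalMs) return null;
    return this.flush(nowMs);
  }

  // Force out whatever is buffered, e.g. when the stream finishes.
  flush(nowMs: number): string | null {
    if (this.pending === "") return null;
    const text = this.pending;
    this.pending = "";
    this.lastFlushMs = nowMs;
    return text;
  }
}
```

Call `flush` once when streaming ends so the final tokens are announced even if the interval has not elapsed.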

Framework Reference

Vercel AI SDK useChat manages submitted, streaming, ready, and error states out of the box. This is the most mature React abstraction for streaming AI UI.
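
The four states map naturally onto UI controls. The sketch below does not call the SDK; it only uses the four status names listed above, and the `ChatControls` shape and the specific mapping are assumptions for illustration.

```typescript
// Status names as listed above; everything else here is illustrative.
type ChatStatus = "submitted" | "streaming" | "ready" | "error";

interface ChatControls {
  showSkeleton: boolean;    // request sent, no tokens yet
  showStopButton: boolean;  // user can cancel the in-flight response
  inputEnabled: boolean;    // user may send a new message
  showInlineRetry: boolean; // retry button next to the failed response
}

function controlsFor(status: ChatStatus): ChatControls {
  switch (status) {
    case "submitted":
      return { showSkeleton: true, showStopButton: true, inputEnabled: false, showInlineRetry: false };
    case "streaming":
      return { showSkeleton: false, showStopButton: true, inputEnabled: false, showInlineRetry: false };
    case "error":
      return { showSkeleton: false, showStopButton: false, inputEnabled: true, showInlineRetry: true };
    case "ready":
      return { showSkeleton: false, showStopButton: false, inputEnabled: true, showInlineRetry: false };
  }
}
```

Deriving every control from one status value keeps the stop button, skeleton, and retry affordance from drifting out of sync.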

Loading States

| Latency | Pattern | Example |
| --- | --- | --- |
| <2s | Skeleton shimmer (3-5 lines, decreasing widths) | Standard AI response panel |
| 2-5s | Skeleton + subtle pulse animation | Longer generation tasks |
| 5-15s | Phase indicator (“Searching… Analyzing… Generating…”) | RAG + generation pipeline |
| >15s | Progress bar with elapsed time + cancel button | Batch processing, agent tasks |

Key insight: Skeleton screens outperform spinners for AI because they communicate “active processing” rather than “waiting.”
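
The latency bands above can be encoded as a small selector. This is a sketch: `loadingPatternFor` and the pattern identifiers are hypothetical, and because the bands overlap at their edges, treating exactly 5s and 15s as part of the lower band is an assumption.

```typescript
type LoadingPattern =
  | "skeleton"
  | "skeleton-pulse"
  | "phase-indicator"
  | "progress-with-cancel";

// Pick a loading pattern from the expected (or so far elapsed) latency.
function loadingPatternFor(seconds: number): LoadingPattern {
  if (seconds < 2) return "skeleton";
  if (seconds <= 5) return "skeleton-pulse";   // boundary choice is an assumption
  if (seconds <= 15) return "phase-indicator"; // boundary choice is an assumption
  return "progress-with-cancel";
}
```

Calling it with elapsed time on a timer lets a single request escalate from skeleton to phase indicator to progress bar as it runs long.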

Error Handling

The Rule: Errors at Point of Action

Never use global toasts for AI errors. Show the error where the action happened, with a retry button inline.

| Error Type | User-Facing Pattern |
| --- | --- |
| LLM timeout | “Taking longer than expected. [Retry] or [Try simpler question]” |
| Rate limit | “Busy right now. Try again in ~30 seconds.” (show countdown) |
| Model error | “Something went wrong. [Retry with different approach]” |
| Quality concern | Inline disclaimer: “AI-generated — verify important details” |
| AI unavailable | Show manual alternatives with clear CTA |
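
A sketch of how the failure rows might be turned into inline UI data follows. The error kinds, the `InlineError` shape, and the "Continue manually" label are illustrative assumptions; the message copy is taken from the table. The quality-concern row is a disclaimer rather than a failure, so it is left out here.

```typescript
type AiErrorKind = "timeout" | "rate_limit" | "model_error" | "unavailable";

interface InlineError {
  message: string;   // shown next to the action that failed
  actions: string[]; // button labels rendered inline with the message
}

function inlineErrorFor(kind: AiErrorKind, retryAfterSeconds: number = 30): InlineError {
  switch (kind) {
    case "timeout":
      return { message: "Taking longer than expected.", actions: ["Retry", "Try simpler question"] };
    case "rate_limit":
      // No button: the UI shows a countdown from retryAfterSeconds instead.
      return { message: "Busy right now. Try again in ~" + retryAfterSeconds + " seconds.", actions: [] };
    case "model_error":
      return { message: "Something went wrong.", actions: ["Retry with different approach"] };
    case "unavailable":
      // "Continue manually" is a placeholder label for the manual alternative.
      return { message: "AI is unavailable right now.", actions: ["Continue manually"] };
  }
}
```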

Fallback Strategy

Primary model (Sonnet) fails
  → Retry once
  → Fall back to cheaper model (Haiku) with quality disclaimer
  → Fall back to cached/template response
  → Show "AI unavailable" with manual alternative
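
The chain above can be sketched as one async function. The model calls are injected as plain functions, so nothing here depends on a specific provider SDK; `generateWithFallback`, the `FallbackResult` shape, and the disclaimer wording are assumptions for illustration.

```typescript
type Generate = (prompt: string) => Promise<string>;

interface FallbackResult {
  text: string;
  source: "primary" | "fallback-model" | "cached" | "unavailable";
  disclaimer?: string;
}

async function generateWithFallback(
  prompt: string,
  primary: Generate,
  cheaper: Generate,
  cached: (prompt: string) => string | undefined,
): Promise<FallbackResult> {
  // Step 1 and 2: primary model, then one retry.
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      return { text: await primary(prompt), source: "primary" };
    } catch {
      // Swallow and either retry or fall through to the next tier.
    }
  }
  // Step 3: cheaper model, flagged with a quality disclaimer.
  try {
    const text = await cheaper(prompt);
    return { text, source: "fallback-model", disclaimer: "Generated by a fallback model. Verify important details." };
  } catch {
    // Fall through to the cache.
  }
  // Step 4: cached or template response, if one exists for this prompt.
  const hit = cached(prompt);
  if (hit !== undefined) return { text: hit, source: "cached" };
  // Step 5: nothing worked; the UI should offer the manual alternative.
  return { text: "AI unavailable", source: "unavailable" };
}
```

Returning `source` alongside the text lets the UI decide which disclaimer or manual CTA to show without inspecting error objects.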

Confidence and Disclaimers

Don’t show numerical confidence scores to end users. ChatGPT, Claude.ai, and Notion AI all avoid this — numbers create false precision and confuse non-technical users.

Do use:

  • Inline disclaimers: “AI-generated — verify important details”
  • Source citations: “Based on [Document X, page 3]”
  • Uncertainty language: “I’m not sure about this, but…”

Feedback Mechanisms

The production standard:

| Mechanism | Purpose | Implementation |
| --- | --- | --- |
| Thumbs up/down | Primary quality signal | On every AI response |
| Regenerate | Let user retry without changing input | Button below response |
| Edit response | User corrects AI output | Inline edit mode |
| Copy | Utility | One-click copy button |
| Report | Safety/quality escalation | Separate from thumbs down |

Track whether users accepted, modified, or rejected AI suggestions — this is more informative than thumbs up/down alone. GitHub Copilot and Intercom Fin use this as their primary quality signal.
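
A minimal sketch of that tracking is shown below. It assumes the app can observe the text the user ended up keeping, with `null` meaning the suggestion was dismissed; `classifyOutcome` and `acceptanceRate` are hypothetical names, and treating whitespace-only differences as "accepted" is a simplifying assumption.

```typescript
type SuggestionOutcome = "accepted" | "modified" | "rejected";

// Compare what the AI suggested with what the user kept.
function classifyOutcome(suggestion: string, finalText: string | null): SuggestionOutcome {
  if (finalText === null) return "rejected"; // user dismissed the suggestion
  const normalize = (s: string): string => s.trim().replace(/\s+/g, " ");
  return normalize(suggestion) === normalize(finalText) ? "accepted" : "modified";
}

// Share of suggestions kept unchanged; returns 0 for an empty list.
function acceptanceRate(outcomes: SuggestionOutcome[]): number {
  if (outcomes.length === 0) return 0;
  const accepted = outcomes.filter((o) => o === "accepted").length;
  return accepted / outcomes.length;
}
```

Logging the three outcomes separately keeps "modified" visible as its own signal: the suggestion was useful enough to keep but not good enough to ship unchanged.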

Conversation UX Decisions

| Decision | Pattern | When |
| --- | --- | --- |
| Show context usage | Progress bar or “X% context used” | Power users, long conversations |
| “New chat” button | Prominent, always visible | When context staleness is a risk |
| Pin/unpin messages | Let users mark important context | Long multi-turn sessions |
| Auto-summarize history | Background summarization on context pressure | Long conversations (see Claude Code’s 5 compression strategies) |
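
The context-usage indicator and the summarization trigger can share one calculation. In this sketch, `contextUsage` is a hypothetical helper, token counts are assumed to come from whatever tokenizer the app already uses, and the 80% summarization threshold is an illustrative default rather than a recommendation from the text.

```typescript
interface ContextUsage {
  percentUsed: number;       // 0 to 100, rounded
  label: string;             // text for the indicator
  shouldSummarize: boolean;  // true once usage crosses the threshold
}

function contextUsage(usedTokens: number, windowTokens: number, summarizeAt: number = 0.8): ContextUsage {
  if (windowTokens <= 0) throw new RangeError("windowTokens must be positive");
  // Clamp so over-budget conversations still render as 100%.
  const ratio = Math.min(1, Math.max(0, usedTokens / windowTokens));
  const percentUsed = Math.round(ratio * 100);
  return {
    percentUsed,
    label: percentUsed + "% context used",
    shouldSummarize: ratio >= summarizeAt,
  };
}
```

For example, `contextUsage(50000, 200000)` yields a 25% label with no summarization, while 170000 of 200000 tokens crosses the default threshold.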

Real Product References

| Product | Key UX Decision |
| --- | --- |
| ChatGPT | Streaming + markdown rendering, model selector, regenerate button |
| Claude.ai | Artifact panel (code/docs side-by-side), extended thinking indicator |
| Notion AI | Inline AI in existing document, not a separate chat panel |
| Cursor | Tab completion (copilot) + side panel (chat) + background agent |
| Intercom Fin | AI resolves, shows sources, hands off to human seamlessly |
