AI UX Patterns

The craft of AI UX is managing unpredictability. LLM responses have variable latency, variable quality, and variable length. Every pattern here exists to make that unpredictability feel smooth to users.

Streaming UI

SSE (Server-Sent Events) remains the production standard for token streaming. Prefer it over WebSockets: the browser's EventSource client reconnects automatically, SSE runs over standard HTTP, and the infrastructure is simpler.
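When the stream is read with fetch (for example because the request is a POST, which EventSource does not support), the client has to split the SSE framing itself. The following is a minimal sketch of that parsing step, not a complete SSE client: the class name `SseParser` is hypothetical, it only extracts `data:` fields, and it ignores `event:`, `id:`, and `retry:` as well as bare carriage-return line endings.

```typescript
// Minimal incremental parser for Server-Sent Events framing (assumed sketch).
// Feed it raw text chunks as they arrive; it returns the data payload of
// every event that has been completed so far.
class SseParser {
  private buffer = "";

  push(chunk: string): string[] {
    // Normalize CRLF after concatenation so a "\r\n" split across chunks is handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const events: string[] = [];

    // A blank line terminates an event.
    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);

      // Collect the data fields; one optional leading space is stripped.
      const dataLines = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""));

      if (dataLines.length > 0) {
        events.push(dataLines.join("\n"));
      }
      boundary = this.buffer.indexOf("\n\n");
    }
    return events;
  }
}
```

Each string returned by `push` is one complete event payload, so a token that arrives split across two network chunks is only surfaced once its terminating blank line has been received.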

Rendering Rules

Rule	Why
Buffer incomplete markdown before rendering	Partial code fences and half-formed links break the UI
Container grows without shifting surrounding elements	Prevent CLS (Cumulative Layout Shift)
Stop button prominent during streaming	Users need control over runaway responses
Throttle rendering during fast delivery	Prevent excessive re-renders (Vercel AI SDK: `experimental_throttle`)
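The first rule in the table can be sketched as a function that splits the streamed text into a part that is safe to render and a tail that is held back. This is an illustrative heuristic under simple assumptions, not a markdown parser: `splitRenderable`, `RenderSplit`, and `PARTIAL_LINK` are hypothetical names, only triple-backtick fences and `[text](url)` links are considered, and the caller is assumed to render the pending tail as-is once the stream ends.

```typescript
interface RenderSplit {
  ready: string;   // safe to hand to the markdown renderer now
  pending: string; // held back until more tokens arrive
}

// Matches a link that is still being typed: "[tex", "[text]", "[text](", "[text](http".
const PARTIAL_LINK = /^\[[^\]\n]*(\](\([^)\n]*)?)?$/;

function splitRenderable(streamed: string): RenderSplit {
  const fence = "`".repeat(3);

  // Count code fences; an odd count means the last fence is still open.
  let fenceCount = 0;
  let lastFence = -1;
  for (
    let i = streamed.indexOf(fence);
    i !== -1;
    i = streamed.indexOf(fence, i + fence.length)
  ) {
    fenceCount += 1;
    lastFence = i;
  }
  let cut = fenceCount % 2 === 1 ? lastFence : streamed.length;

  // Within the renderable part, hold back a half-formed link at the very end.
  const head = streamed.slice(0, cut);
  const linkStart = head.lastIndexOf("[");
  if (linkStart !== -1 && PARTIAL_LINK.test(head.slice(linkStart))) {
    cut = linkStart;
  }

  return { ready: streamed.slice(0, cut), pending: streamed.slice(cut) };
}
```

Calling this on the full accumulated text after each chunk keeps the rendered output stable: text only moves from `pending` to `ready` once its fence or link has closed.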

Accessibility for Streaming

aria-live="polite" on response container
aria-atomic="false" so screen readers announce only new tokens
Debounce announcements to every few seconds (not every token)
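One way to debounce announcements is to collect tokens in a buffer and release them in batches, writing each batch into the live region instead of every token. The sketch below shows only the batching logic; `AnnouncementBuffer` is a hypothetical name, the clock is passed in explicitly to keep the example deterministic, and the 3000 ms interval in the usage note is an assumed value for "every few seconds".

```typescript
// Batches streamed tokens so a screen reader hears one update every
// `intervalMs` milliseconds instead of one update per token (assumed sketch).
class AnnouncementBuffer {
  private pending = "";
  private lastFlushMs: number;
  private readonly intervalMs: number;

  constructor(intervalMs: number, startMs: number) {
    this.intervalMs = intervalMs;
    this.lastFlushMs = startMs;
  }

  // Returns the text to announce, or null if it is too early to announce.
  add(token: string, nowMs: number): string | null {
    this.pending += token;
    if (nowMs - this.lastFlushMs < this.intervalMs) {
      return null;
    }
    return this.flush(nowMs);
  }

  // Call when the stream ends so the final tokens are not lost.
  flush(nowMs: number): string | null {
    if (this.pending === "") {
      return null;
    }
    const text = this.pending;
    this.pending = "";
    this.lastFlushMs = nowMs;
    return text;
  }
}
```

In a browser, a caller might construct `new AnnouncementBuffer(3000, Date.now())`, pass `Date.now()` with each token, and append any non-null result to the element carrying aria-live="polite".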

Framework Reference

Vercel AI SDK useChat manages submitted, streaming, ready, and error states out of the box. This is the most mature React abstraction for streaming AI UI.
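The four states map naturally onto which controls are visible. The sketch below does not call the SDK; it only takes the four status names listed above as a string union and returns a hypothetical `ChatControls` object. The specific choices (for example, disabling input while a response is in flight) are assumptions for illustration.

```typescript
type ChatStatus = "submitted" | "streaming" | "ready" | "error";

interface ChatControls {
  showSkeleton: boolean;    // placeholder before the first token arrives
  showStopButton: boolean;  // lets the user cancel a request in flight
  inputEnabled: boolean;    // whether a new message can be sent
  showInlineRetry: boolean; // retry button next to the failed response
}

function controlsFor(status: ChatStatus): ChatControls {
  switch (status) {
    case "submitted":
      // Request sent, nothing received yet.
      return { showSkeleton: true, showStopButton: true, inputEnabled: false, showInlineRetry: false };
    case "streaming":
      // Tokens are arriving; keep the stop button prominent.
      return { showSkeleton: false, showStopButton: true, inputEnabled: false, showInlineRetry: false };
    case "ready":
      return { showSkeleton: false, showStopButton: false, inputEnabled: true, showInlineRetry: false };
    case "error":
      // Show the retry where the action happened, not in a global toast.
      return { showSkeleton: false, showStopButton: false, inputEnabled: true, showInlineRetry: true };
  }
}
```

Keeping this mapping in one pure function makes it easy to test each state without rendering any components.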

Loading States

Latency	Pattern	Example
<2s	Skeleton shimmer (3-5 lines, decreasing widths)	Standard AI response panel
2-5s	Skeleton + subtle pulse animation	Longer generation tasks
5-15s	Phase indicator (“Searching… Analyzing… Generating…”)	RAG + generation pipeline
>15s	Progress bar with elapsed time + cancel button	Batch processing, agent tasks
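The table can be encoded as a small lookup keyed on expected latency. This is a sketch: `loadingPatternFor` and the pattern names are hypothetical, and because the table's ranges share their endpoints, the code assumes that exactly 5 seconds counts as a phase indicator and exactly 15 seconds still does.

```typescript
type LoadingPattern =
  | "skeleton"
  | "skeleton-pulse"
  | "phase-indicator"
  | "progress-with-cancel";

// Picks a loading pattern from the expected latency in seconds.
function loadingPatternFor(expectedSeconds: number): LoadingPattern {
  if (expectedSeconds < 2) return "skeleton";
  if (expectedSeconds < 5) return "skeleton-pulse";
  if (expectedSeconds <= 15) return "phase-indicator";
  return "progress-with-cancel";
}
```

The expected latency would typically come from past measurements for the same kind of request, since the actual latency is not known when the loading state first appears.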

Key insight: Skeleton screens outperform spinners for AI because they communicate “active processing” rather than “waiting.”

Error Handling

The Rule: Errors at Point of Action

Never use global toasts for AI errors. Show the error where the action happened, with a retry button inline.

Error Type	User-Facing Pattern
LLM timeout	“Taking longer than expected. [Retry] or [Try simpler question]”
Rate limit	“Busy right now. Try again in ~30 seconds.” (show countdown)
Model error	“Something went wrong. [Retry with different approach]”
Quality concern	Inline disclaimer: “AI-generated — verify important details”
AI unavailable	Show manual alternatives with clear CTA
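A typed mapping from error kind to inline message keeps these patterns consistent across the product. The sketch below covers the four failure rows of the table (the quality disclaimer is not an error, so it is left out). The type names and the `retryAfterSeconds` field are hypothetical, and the "unavailable" copy is placeholder wording, since the table only specifies the pattern for that row.

```typescript
type AiError =
  | { kind: "timeout" }
  | { kind: "rate_limit"; retryAfterSeconds: number }
  | { kind: "model_error" }
  | { kind: "unavailable" };

interface InlineError {
  message: string;   // shown next to the failed action
  actions: string[]; // labels of the inline buttons
}

function inlineErrorFor(error: AiError): InlineError {
  switch (error.kind) {
    case "timeout":
      return { message: "Taking longer than expected.", actions: ["Retry", "Try simpler question"] };
    case "rate_limit":
      // The caller re-renders this every second to produce the countdown.
      return {
        message: `Busy right now. Try again in ~${Math.ceil(error.retryAfterSeconds)} seconds.`,
        actions: [],
      };
    case "model_error":
      return { message: "Something went wrong.", actions: ["Retry with different approach"] };
    case "unavailable":
      // Placeholder copy: point the user at the manual path.
      return { message: "AI is unavailable right now.", actions: ["Continue manually"] };
  }
}
```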

Fallback Strategy

Primary model (Sonnet) fails
  → Retry once
  → Fall back to cheaper model (Haiku) with quality disclaimer
  → Fall back to cached/template response
  → Show "AI unavailable" with manual alternative
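The chain above can be sketched as a loop over attempts. Everything named here is an assumption for illustration: `answerWithFallback`, the `Attempt` type, the `source` labels, and the disclaimer strings are hypothetical, and the model calls are passed in as functions so the sketch does not depend on any particular provider SDK.

```typescript
type Attempt = () => Promise<string>;

interface FallbackResult {
  text: string;
  source: "primary" | "primary-retry" | "fallback-model" | "cached" | "unavailable";
  disclaimer: string | null; // shown inline with the response when set
}

async function answerWithFallback(
  callPrimary: Attempt,
  callFallbackModel: Attempt,
  lookupCached: () => string | null,
): Promise<FallbackResult> {
  const modelSteps: Array<{
    source: FallbackResult["source"];
    run: Attempt;
    disclaimer: string | null;
  }> = [
    { source: "primary", run: callPrimary, disclaimer: null },
    { source: "primary-retry", run: callPrimary, disclaimer: null }, // retry once
    {
      source: "fallback-model",
      run: callFallbackModel,
      disclaimer: "AI-generated by a fallback model. Verify important details.",
    },
  ];

  for (const step of modelSteps) {
    try {
      return { text: await step.run(), source: step.source, disclaimer: step.disclaimer };
    } catch {
      // This step failed; continue with the next one in the chain.
    }
  }

  const cachedText = lookupCached();
  if (cachedText !== null) {
    return { text: cachedText, source: "cached", disclaimer: "Showing a saved response." };
  }

  // Nothing worked: the UI should show "AI unavailable" with a manual alternative.
  return { text: "", source: "unavailable", disclaimer: null };
}
```

Returning the `source` alongside the text lets the UI decide which disclaimer or manual alternative to show without inspecting the error itself.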

Confidence and Disclaimers

Don’t show numerical confidence scores to end users. ChatGPT, Claude.ai, and Notion AI all avoid this — numbers create false precision and confuse non-technical users.

Do use:

Inline disclaimers: “AI-generated — verify important details”
Source citations: “Based on [Document X, page 3]”
Uncertainty language: “I’m not sure about this, but…”

Feedback Mechanisms

The production standard:

Mechanism	Purpose	Implementation
Thumbs up/down	Primary quality signal	On every AI response
Regenerate	Let user retry without changing input	Button below response
Edit response	User corrects AI output	Inline edit mode
Copy	Utility	One-click copy button
Report	Safety/quality escalation	Separate from thumbs down

Track whether users accepted, modified, or rejected AI suggestions — this is more informative than thumbs up/down alone. GitHub Copilot and Intercom Fin use this as their primary quality signal.
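A minimal sketch of that tracking: classify each suggestion by comparing what the AI proposed with what the user kept, then aggregate the outcomes into rates. The function and type names are hypothetical, and the classification rule (exact string equality means "accepted", `null` means the suggestion was discarded) is a simplifying assumption.

```typescript
type SuggestionOutcome = "accepted" | "modified" | "rejected";

// finalText is what the user kept, or null if they discarded the suggestion.
function classifyOutcome(suggested: string, finalText: string | null): SuggestionOutcome {
  if (finalText === null) return "rejected";
  return finalText === suggested ? "accepted" : "modified";
}

interface OutcomeSummary {
  total: number;
  acceptedRate: number;
  modifiedRate: number;
  rejectedRate: number;
}

function summarizeOutcomes(outcomes: SuggestionOutcome[]): OutcomeSummary {
  const counts = { accepted: 0, modified: 0, rejected: 0 };
  for (const outcome of outcomes) {
    counts[outcome] += 1;
  }
  const total = outcomes.length;
  // Avoid dividing by zero when nothing has been tracked yet.
  const rate = (count: number): number => (total === 0 ? 0 : count / total);
  return {
    total,
    acceptedRate: rate(counts.accepted),
    modifiedRate: rate(counts.modified),
    rejectedRate: rate(counts.rejected),
  };
}
```

A high modified rate paired with a low rejected rate suggests the output is useful but imprecise, which a thumbs up/down count alone would not reveal.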

Conversation UX Decisions

Decision	Pattern	When
Show context usage	Progress bar or “X% context used”	Power users, long conversations
“New chat” button	Prominent, always visible	When context staleness is a risk
Pin/unpin messages	Let users mark important context	Long multi-turn sessions
Auto-summarize history	Background summarization on context pressure	Long conversations (see Claude Code’s 5 compression strategies)
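The context-usage indicator and the summarization trigger can share one calculation. In this sketch, `contextUsage` is a hypothetical helper, token counts are assumed to be supplied by the caller, and the 80% default threshold for "context pressure" is an assumed value rather than a recommendation from any product listed here.

```typescript
interface ContextUsage {
  percent: number;          // 0 to 100, rounded
  label: string;            // text for the indicator
  shouldSummarize: boolean; // true once usage reaches the threshold
}

function contextUsage(
  usedTokens: number,
  limitTokens: number,
  summarizeAtPercent: number = 80, // assumed threshold
): ContextUsage {
  if (limitTokens <= 0) {
    throw new RangeError("limitTokens must be positive");
  }
  // Clamp so an over-full context still displays as 100%.
  const percent = Math.min(100, Math.max(0, Math.round((usedTokens / limitTokens) * 100)));
  return {
    percent,
    label: `${percent}% context used`,
    shouldSummarize: percent >= summarizeAtPercent,
  };
}
```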

Real Product References

Product	Key UX Decision
ChatGPT	Streaming + markdown rendering, model selector, regenerate button
Claude.ai	Artifact panel (code/docs side-by-side), extended thinking indicator
Notion AI	Inline AI in existing document, not a separate chat panel
Cursor	Tab completion (copilot) + side panel (chat) + background agent
Intercom Fin	AI resolves, shows sources, hands off to human seamlessly
