React Suspense and Streaming SSR Internals: How `renderToPipeableStream` Splits and Sends HTML Chunks, and How the Client Reassembles Them
I once inherited a dashboard with 3-second page response times. At first I assumed the APIs were just slow, so I started with caching — but the real problem was elsewhere. renderToString was holding the response hostage until the entire React tree was rendered in memory on the server. Even when the fast sections were already ready, one 3-second API call was blocking everything. Watching users give up on a blank screen pushed me to take streaming SSR seriously.
If you're using the Next.js App Router, you're already building on top of the solution to this problem. renderToPipeableStream and server-side support for <Suspense>, both introduced in React 18, are at the core of it, though honestly, I first dismissed it as "just an API for showing loading spinners." Looking under the hood, I found that React uses a surprisingly sophisticated approach: keeping an HTTP stream open and appending HTML to it in pieces.
This post examines where <Suspense> boundaries cut the HTML, how the pieces are sent, and how they are stitched back together on the client into a complete page. The key insight: everything happens over a single open HTTP response, with no new network requests. Understanding this principle makes it much clearer where to place Suspense boundaries and where to catch errors. I'll assume you already know the basics of <Suspense>.
Core Concepts
Traditional SSR vs. Streaming SSR
The problem with the old renderToString approach is straightforward. The server renders the entire React tree in memory, then sends the HTML string to the client. If there's one slow data source, all the fast content has to wait along with it.
[ Traditional SSR ]
Server: [===== Data Loading + Full Render =====] → Send HTML → Client Receives
                        ↑
                One 3-second API
                blocks everything

Streaming SSR reverses this order. It sends what's ready first, and appends the rest later while keeping the stream open.
[ Streaming SSR ]
Server: [Render Shell] → Send immediately → [Data loading...] → Send additional chunks
Client: Receive HTML → Begin parsing & rendering → Receive additional chunks → Update DOM

Term: Shell — The static content region outside <Suspense> boundaries. It's the HTML skeleton that can be rendered immediately without data, and it becomes the first chunk of the stream.
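To put rough numbers on the two diagrams, here is a back-of-the-envelope model. It is only a sketch: it assumes all data sources are fetched in parallel and that rendering cost is a single fixed number, and `PageSection`, `blockingFirstByteMs`, and `streamingArrivals` are names I made up for illustration, not React APIs.

```typescript
// Hypothetical model: each section of the page depends on one data source.
interface PageSection {
  name: string;
  latencyMs: number; // how long its data takes to load
}

interface Arrival {
  name: string;
  atMs: number; // when that piece reaches the client
}

// renderToString-style SSR: nothing is sent until every data source has resolved.
// Assumes parallel fetches, so the slowest source sets the pace.
function blockingFirstByteMs(renderMs: number, sections: PageSection[]): number {
  const slowest = sections.reduce((max, s) => Math.max(max, s.latencyMs), 0);
  return slowest + renderMs;
}

// Streaming SSR: the shell goes out as soon as it is rendered,
// and each section follows whenever its own data is ready.
function streamingArrivals(renderMs: number, sections: PageSection[]): Arrival[] {
  const arrivals: Arrival[] = [{ name: "shell", atMs: renderMs }];
  for (const s of sections) {
    arrivals.push({ name: s.name, atMs: Math.max(renderMs, s.latencyMs) });
  }
  return arrivals.sort((a, b) => a.atMs - b.atMs);
}

const demoSections: PageSection[] = [
  { name: "recentPosts", latencyMs: 200 },
  { name: "recommendations", latencyMs: 3000 },
];

const blockingMs = blockingFirstByteMs(50, demoSections); // 3050
const arrivals = streamingArrivals(50, demoSections); // shell at 50, then 200, then 3000
```

The slow API still takes 3 seconds in both cases. What changes is that in the streaming model the first bytes no longer wait for it.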
The Chunk-Splitting Mechanism of renderToPipeableStream
<Suspense> boundaries act as "seams" where the stream forks. Here's the server's internal behavior, step by step:
Client                             Server
|                                 |
|<-- ① Shell HTML + skeleton -----|  Sent immediately when pipe(res) is called in onShellReady
|                                 |
|                                 |  ② Async data loading (server-side)
|                                 |
|<-- ③ Actual content HTML chunk--|  Appended to the same stream when data is ready
|<-- ④ <script> for DOM swap -----|  Swaps fallback for actual content

This felt surprising at first, but if you look at the HTML arriving on the client, you can actually see text being continuously appended within a single response. Open the Network tab in Chrome DevTools, select the HTML response, and leave the "Response" panel open. You'll watch the content grow in real time without the response closing.
<!-- ① Arrives first: Shell + fallback (skeleton) -->
<!DOCTYPE html>
<html>
  <body>
    <header>...</header>
    <!--$?--><template id="B:0"></template>
    <div><!-- Skeleton UI --></div><!--/$-->
  </body>
</html>
<!-- ③④ Arrives later over the same TCP connection -->
<div hidden id="S:0">
  <article>Actual blog post list...</article>
</div>
<script>$RC("B:0","S:0")</script>

$RC is a function React injects globally that replaces the fallback at id="B:0" with the hidden content at id="S:0". This isn't a new fetch or a WebSocket — it's text continuously appended to the same HTTP response stream that was opened at the start.
Why does this matter? The HTTP status code lives in the response headers, and headers are sent the moment streaming begins (when the Shell is sent). That means once streaming has started, you cannot change the status code. If an error occurs after the Shell, you have to deliver that error UI inside a 200 response. This is covered in detail under caveats.
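The markup shown earlier can be modeled as a plain function that returns the text chunks in the order they are appended to the response. This is a simplified sketch of the wire format, not React's implementation: `BoundarySpec` and `buildStreamChunks` are hypothetical names, and the real stream also inlines the definition of $RC before its first use, which the model leaves out.

```typescript
// Hypothetical description of one <Suspense> boundary for this model.
interface BoundarySpec {
  fallbackHtml: string;
  contentHtml: string;
  readyAtMs: number; // when the boundary's data resolves on the server
}

// Returns the text chunks in the order they would be appended to the response.
function buildStreamChunks(shellHtml: string, boundaries: BoundarySpec[]): string[] {
  // First chunk: the shell, with a placeholder and fallback for every pending boundary.
  const placeholders = boundaries
    .map((b, i) => `<!--$?--><template id="B:${i}"></template>${b.fallbackHtml}<!--/$-->`)
    .join("");
  const chunks = [shellHtml + placeholders];

  // Later chunks: hidden content plus the swap script, in the order data becomes ready.
  const byReadiness = boundaries
    .map((spec, index) => ({ spec, index }))
    .sort((x, y) => x.spec.readyAtMs - y.spec.readyAtMs);
  for (const { spec, index } of byReadiness) {
    chunks.push(
      `<div hidden id="S:${index}">${spec.contentHtml}</div>` +
        `<script>$RC("B:${index}","S:${index}")</script>`
    );
  }
  return chunks;
}

const demoChunks = buildStreamChunks("<header>...</header>", [
  { fallbackHtml: "<div>Posts skeleton</div>", contentHtml: "<article>Posts</article>", readyAtMs: 3000 },
  { fallbackHtml: "<div>Tips skeleton</div>", contentHtml: "<aside>Tips</aside>", readyAtMs: 200 },
]);
// demoChunks[0] is the shell; demoChunks[1] targets B:1 because its data was ready first.
```

Note that the chunk order follows data readiness, not document order. The IDs are what let the client put each piece in the right place.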
Deep Dive: Selective Hydration and Client Reassembly
This is worth exploring after you understand the basic flow. When HTML chunks arrive, React doesn't simply stop at updating the DOM — the hydration process has also evolved to match streaming.
| Concept | Description |
|---|---|
| Progressive Hydration | Each <Suspense> chunk hydrates independently as it arrives, without blocking the entire page |
| Selective Hydration | Boundaries the user interacts with (for example, by clicking) are moved to the front of the hydration queue |
| Event Preservation | Events like clicks that occur before hydration are queued and replayed after hydration completes |
What makes Selective Hydration particularly impressive is that if a user clicks elsewhere while a heavy component is being hydrated, React immediately reorders the hydration sequence on the fly. User interaction becomes a "hydration priority signal."
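As a mental model, the three rows of the table can be sketched as a small priority queue. This is a conceptual toy, not React's scheduler (which is considerably more involved); `HydrationQueue`, `PendingInteraction`, and the boundary names are all made up for illustration.

```typescript
// Conceptual model only: names and structure are illustrative, not React internals.
interface PendingInteraction {
  boundaryId: string;
  type: string; // e.g. "click"
}

class HydrationQueue {
  private order: string[] = []; // boundaries waiting to hydrate, highest priority first
  private pending: PendingInteraction[] = []; // interactions captured before hydration
  replayed: PendingInteraction[] = []; // interactions replayed after hydration

  // A boundary's HTML chunk has arrived: queue it for hydration.
  enqueue(boundaryId: string): void {
    if (this.order.indexOf(boundaryId) === -1) this.order.push(boundaryId);
  }

  // The user interacted with a boundary that has not hydrated yet:
  // remember the interaction and move that boundary to the front.
  interact(boundaryId: string, type: string): void {
    const at = this.order.indexOf(boundaryId);
    if (at === -1) return; // already hydrated or unknown: nothing to reorder
    this.pending.push({ boundaryId, type });
    this.order.splice(at, 1);
    this.order.unshift(boundaryId);
  }

  // Hydrate the highest-priority boundary, then replay its queued interactions.
  hydrateNext(): string | undefined {
    const id = this.order.shift();
    if (id === undefined) return undefined;
    const mine = this.pending.filter((p) => p.boundaryId === id);
    this.pending = this.pending.filter((p) => p.boundaryId !== id);
    this.replayed = this.replayed.concat(mine);
    return id;
  }
}

const demoQueue = new HydrationQueue();
demoQueue.enqueue("comments");
demoQueue.enqueue("sidebar");
demoQueue.interact("sidebar", "click");
const firstHydrated = demoQueue.hydrateNext(); // "sidebar" jumps ahead of "comments"
```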
Deep Dive: React Flight Protocol (If You're Using RSC)
When using RSC (React Server Components), what gets streamed is not HTML but a separate format. You can inspect it in the Network tab by opening a stream with a text/x-component content type.
# Flight format example (actual stream text)
0:["$","div",null,{"children":["$","ul",null,...]}]
1:I["./ClientComponent",[...]]
2:{"id":1,"title":"Hello World"}

Each row is streamed in the form ID:data, where the data may begin with a one-letter type tag (the I in row 1 marks a client module reference), and the client resolves the rows as chunk references to rebuild the React tree. In December 2025, a security vulnerability (CVE-2025-55182) was discovered in the Flight payload deserialization process and subsequently patched. If you're using RSC, check the CVE details page for the affected React version range and the patch version, and update accordingly.
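To make the row structure in the example above concrete, here is a minimal reader for rows of that shape. Treat it as a sketch under assumptions: it only handles rows whose payload is JSON, it assumes row IDs are hexadecimal, and `FlightRow` and `parseFlightRow` are hypothetical names. Real Flight streams contain other row kinds that this does not attempt to handle, and the `...` elisions from the example are replaced with complete values so the rows parse.

```typescript
interface FlightRow {
  id: number;
  tag: string | null; // e.g. "I" for a client module reference; null for plain model data
  payload: unknown;
}

function parseFlightRow(line: string): FlightRow {
  const colon = line.indexOf(":");
  const idText = line.slice(0, Math.max(colon, 0));
  // Assumption: row IDs are written in hexadecimal.
  if (colon <= 0 || !/^[0-9a-f]+$/i.test(idText)) {
    throw new Error(`Not a Flight row: ${line}`);
  }
  let rest = line.slice(colon + 1);
  let tag: string | null = null;
  // A JSON value never starts with an uppercase letter, so a leading A-Z must be a type tag.
  if (/^[A-Z]/.test(rest)) {
    tag = rest.charAt(0);
    rest = rest.slice(1);
  }
  return { id: parseInt(idText, 16), tag, payload: JSON.parse(rest) };
}

const demoRows = [
  '0:["$","div",null,{"children":"Hello"}]',
  '1:I["./ClientComponent",[]]',
  '2:{"id":1,"title":"Hello World"}',
].map((line) => parseFlightRow(line));
```

This is for reading and debugging what you see in the Network tab. It is not a substitute for React's own client, which is the only thing that should deserialize Flight payloads in production.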
Practical Application
Example 1: Next.js App Router — Automatic Suspense Boundary with loading.tsx
This is the easiest entry point. A single file creates a Suspense boundary automatically. It seems almost too simple — I was skeptical at first too — but it works. The cost-to-benefit ratio here is the best of any approach.
// app/dashboard/loading.tsx
// Simply having this file wraps the entire page.tsx in Suspense
export default function Loading() {
  return <DashboardSkeleton />;
}

// app/dashboard/page.tsx
export default async function Page() {
  // Even if this fetch takes 3 seconds, the Shell has already been sent to the client
  const data = await fetchSlowData();
  return <Dashboard data={data} />;
}

Even with a 3-second API, users see the layout and skeleton immediately.
Example 2: Component-Level Suspense Boundaries — Independent Streaming
Especially useful for dashboards with multiple independent data sources. I initially wrapped an entire page in a single <Suspense> and found the benefit was cut in half — splitting each section was the key.
LCP (Largest Contentful Paint) elements like hero images or main titles must be placed outside <Suspense>, i.e., in the Shell. When designing skeleton UX, it's tempting to wrap everything in Suspense to show skeletons during loading — but if LCP elements end up inside, their appearance is delayed and Core Web Vitals scores suffer.
// app/dashboard/page.tsx
import { Suspense } from 'react';

export default function Page() {
  return (
    <>
      {/* Included in Shell — sent immediately without data */}
      <Header />
      <Navigation />
      <HeroSection /> {/* LCP element — placed outside Suspense */}
      {/* Each streams independently — faster ones appear first */}
      <Suspense fallback={<RecentPostsSkeleton />}>
        <RecentPosts />
      </Suspense>
      <Suspense fallback={<RecommendationsSkeleton />}>
        <Recommendations />
      </Suspense>
    </>
  );
}

| Component | When Sent | Reason |
|---|---|---|
| <Header />, <Navigation /> | Immediately (Shell) | No data required |
| <HeroSection /> | Immediately (Shell) | Protects LCP element |
| <RecentPosts /> | After data is ready | Inside Suspense |
| <Recommendations /> | After data is ready (independently) | Separate Suspense boundary |
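The difference between one big boundary and one boundary per section comes down to simple arithmetic. Here is a rough sketch, assuming the sections fetch in parallel. `DataSection` and the two functions are illustrative names, and the latencies are invented.

```typescript
interface DataSection {
  name: string;
  latencyMs: number;
}

// One <Suspense> around several sections: the boundary resolves only when
// every child is ready, so all of them appear at the slowest one's pace.
function revealTimesSingleBoundary(sections: DataSection[]): Record<string, number> {
  const slowest = sections.reduce((max, s) => Math.max(max, s.latencyMs), 0);
  const result: Record<string, number> = {};
  for (const s of sections) result[s.name] = slowest;
  return result;
}

// One <Suspense> per section: each appears as soon as its own data is ready.
function revealTimesSeparateBoundaries(sections: DataSection[]): Record<string, number> {
  const result: Record<string, number> = {};
  for (const s of sections) result[s.name] = s.latencyMs;
  return result;
}

const demoData: DataSection[] = [
  { name: "RecentPosts", latencyMs: 300 },
  { name: "Recommendations", latencyMs: 3000 },
];

const single = revealTimesSingleBoundary(demoData); // both appear at 3000
const separate = revealTimesSeparateBoundaries(demoData); // 300 and 3000
```

With a single boundary, the fast section inherits the slow section's latency. That is the "benefit cut in half" effect I ran into.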
Example 3: Direct Setup with Express + renderToPipeableStream
When setting this up without a framework, the key is the onShellReady callback timing and the error handling branches. It can look complex with so many callbacks, but knowing when each one is called makes it quite clear.
import express from 'express';
import { renderToPipeableStream } from 'react-dom/server';
import App from './App'; // assumed path to the root component

const app = express();

app.get('/', (req, res) => {
  let didError = false;
  const { pipe, abort } = renderToPipeableStream(<App />, {
    bootstrapScripts: ['/main.js'],
    onShellReady() {
      // Headers cannot be changed after pipe() is called
      // All header configuration must be completed at this point
      // didError is true if onError already fired while the Shell was rendering
      res.statusCode = didError ? 500 : 200;
      res.setHeader('Content-Type', 'text/html');
      pipe(res);
    },
    onShellError(err) {
      // The Shell itself failed — no response has been sent yet, so a 500 is possible
      res.statusCode = 500;
      res.send('<h1>Something went wrong</h1>');
    },
    onError(err) {
      // Fires for every server error. Before the Shell is sent this turns the status into 500;
      // after the Shell, the 200 is already in progress and all we can do is log
      didError = true;
      console.error(err);
    },
  });
  // Without this, server memory will slowly fill up
  setTimeout(() => abort(), 10000);
});

The distinction between onShellReady and onShellError is important. If Shell rendering itself fails (onShellError), no response has been sent yet, so the status code can still be set to 500. onError, by contrast, fires for every server error. If it fires before the Shell is sent, the didError flag turns the status into 500 inside onShellReady. If it fires for a chunk after the Shell, a 200 response is already in progress and the status code can no longer be changed.
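The callback timing is easier to reason about as a tiny state machine. This sketch replays a sequence of callback events and reports which status code ends up on the wire. It mirrors the didError pattern from the example, but `RenderEvent`, `ResponseOutcome`, and `resolveStatus` are hypothetical names for illustration only.

```typescript
type RenderEvent = "error" | "shellReady" | "shellError";

interface ResponseOutcome {
  statusCode: number | null; // null while nothing has been sent
  lateErrors: number; // errors that arrived after the status was committed
}

function resolveStatus(events: RenderEvent[]): ResponseOutcome {
  let didError = false;
  let statusCode: number | null = null;
  let lateErrors = 0;
  for (const e of events) {
    if (statusCode !== null) {
      // Headers are already sent: the status can no longer change.
      if (e === "error") lateErrors++;
      continue;
    }
    if (e === "error") {
      didError = true; // onError before the shell has been sent
    } else if (e === "shellReady") {
      statusCode = didError ? 500 : 200; // onShellReady commits the status
    } else {
      statusCode = 500; // onShellError: nothing sent yet, so 500 is still possible
    }
  }
  return { statusCode, lateErrors };
}

const happyPath = resolveStatus(["shellReady"]); // 200, no late errors
const earlyError = resolveStatus(["error", "shellReady"]); // 500
const lateError = resolveStatus(["shellReady", "error"]); // stays 200, one late error
const shellFailed = resolveStatus(["error", "shellError"]); // 500
```

The `lateError` case is the one that needs an in-band error UI, because the 200 cannot be taken back.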
Example 4: Handling Streaming Errors with Error Boundary
In streaming SSR, if an error occurs after the Shell, the HTTP status code is already 200. React leaves the Suspense fallback in the HTML for that boundary and retries rendering it on the client. If the retry also fails, the nearest client-side Error Boundary catches the error and displays its fallback UI. In practice, Suspense and Error Boundary always go together. If you focus only on server-side error handling and omit the client Error Boundary, an error after the Shell has nothing to catch it and propagates up the tree uncaught.
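The recovery path for a single boundary can be written down as a small decision function. This is my simplified reading of the flow, not React code: `BoundaryOutcome` and `resolveBoundary` are made-up names, and the two booleans stand in for "did this boundary render on the server" and "did the client retry succeed".

```typescript
type BoundaryOutcome = "server-html" | "client-rendered" | "error-boundary-fallback";

function resolveBoundary(serverRenderOk: boolean, clientRetryOk: boolean): BoundaryOutcome {
  // Normal case: the streamed HTML replaces the fallback and is hydrated as usual.
  if (serverRenderOk) return "server-html";
  // The server gave up on this boundary: the Suspense fallback stays in the HTML
  // and the client retries rendering once the JavaScript has loaded.
  return clientRetryOk ? "client-rendered" : "error-boundary-fallback";
}

const transientFailure = resolveBoundary(false, true); // recovered on the client
const persistentFailure = resolveBoundary(false, false); // Error Boundary fallback is shown
```

Only the last outcome needs an Error Boundary, but it is also the one users will actually notice if the boundary is missing.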
// components/ErrorBoundary.tsx
'use client';
import { Component, ReactNode } from 'react';

interface Props {
  fallback: ReactNode;
  children: ReactNode;
}

interface State {
  hasError: boolean;
}

class ErrorBoundary extends Component<Props, State> {
  state: State = { hasError: false };

  // Switch to the fallback UI once any descendant throws during rendering
  static getDerivedStateFromError(): State {
    return { hasError: true };
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback;
    }
    return this.props.children;
  }
}

export default ErrorBoundary;

// app/dashboard/page.tsx
import { Suspense } from 'react';
import ErrorBoundary from '@/components/ErrorBoundary';

export default function Page() {
  return (
    <ErrorBoundary fallback={<p>Failed to load data.</p>}>
      <Suspense fallback={<PostsSkeleton />}>
        <RecentPosts />
      </Suspense>
    </ErrorBoundary>
  );
}

Example 5: Integrating Suspense with the use() Hook in React 19
Starting with React 19, you can use the use() Hook to connect a Promise directly to Suspense. The most natural pattern is fetching data with async/await in a Server Component and consuming it with use() in a Client Component. Using use() in a Server Component is less intuitive than async/await, so keeping the client boundary clearly defined is important.
// app/users/[id]/page.tsx (Server Component)
import { Suspense } from 'react';
import UserProfile from '@/components/UserProfile';

interface User {
  id: string;
  name: string;
}

async function fetchUserData(id: string): Promise<User> {
  // Server-side fetch needs an absolute URL; API_BASE_URL is a hypothetical env variable
  const res = await fetch(`${process.env.API_BASE_URL}/api/users/${id}`);
  return res.json();
}

// Note: in Next.js 15 and later, params is itself a Promise and has to be awaited first
export default function Page({ params }: { params: { id: string } }) {
  // Create a Promise in the Server Component and pass it to the client
  const userPromise = fetchUserData(params.id);
  return (
    <Suspense fallback={<UserSkeleton />}>
      <UserProfile userPromise={userPromise} />
    </Suspense>
  );
}

// components/UserProfile.tsx (Client Component)
'use client';
import { use } from 'react';

interface User {
  id: string;
  name: string;
}

function UserProfile({ userPromise }: { userPromise: Promise<User> }) {
  // If the Promise is pending, it triggers the nearest Suspense
  const user = use(userPromise);
  return <div>{user.name}</div>;
}

export default UserProfile;

Pros and Cons
Advantages
| Item | Description |
|---|---|
| Improved TTFB | Shell is sent immediately without waiting for data, reducing first-byte latency |
| Better FCP | Browser begins parsing and rendering as soon as it receives the Shell HTML |
| Independent Loading | Slow data sources don't block fast content regions |
| Selective Hydration | Regions with user interaction are hydrated first, improving TTI |
| SEO Compatible | Crawlers can progressively read streaming HTML |
| Backpressure Handling | When the network is congested, the renderer automatically throttles speed, reducing server memory waste |
Term: Backpressure — A Node.js Streams mechanism that automatically regulates production speed when a server generates data faster than a client can process it. In streaming SSR, this translates to a benefit where the server slows its rendering pace when the client network is slow, preventing unnecessary memory consumption.
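The contract behind backpressure is small enough to simulate. The sketch below is a toy stand-in for a slow client connection, not the Node.js stream API itself: `SlowSink` and `produce` are hypothetical names, and the drain step is simulated synchronously so the example stays self-contained. It only mirrors the rule that write() returning false means "stop until drained".

```typescript
// Toy sink standing in for a slow client connection.
class SlowSink {
  private buffered = 0;
  private highWaterMark: number;

  constructor(highWaterMark: number) {
    this.highWaterMark = highWaterMark;
  }

  // Mirrors the contract of a writable stream's write(): false means "stop until drained".
  write(chunk: string): boolean {
    this.buffered += chunk.length;
    return this.buffered < this.highWaterMark;
  }

  // The client caught up: the buffer empties and the producer may resume.
  drain(): void {
    this.buffered = 0;
  }
}

// Producer that respects backpressure: it pauses whenever write() returns false.
function produce(chunks: string[], sink: SlowSink): { written: number; pauses: number } {
  let written = 0;
  let pauses = 0;
  for (const chunk of chunks) {
    const canContinue = sink.write(chunk);
    written++;
    if (!canContinue) {
      pauses++; // a real producer would wait for the "drain" event here
      sink.drain(); // simulated synchronously for the sketch
    }
  }
  return { written, pauses };
}

const demoResult = produce(["aaaa", "bbbb", "cccc", "dddd"], new SlowSink(8));
// Four chunks written, with a pause after every second chunk.
```

The point of the pause is that the producer never holds more than roughly one high-water mark of unsent data per connection, which is what keeps memory bounded when clients are slow.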
Disadvantages and Caveats
| Item | Description | Mitigation |
|---|---|---|
| HTTP Status Code Limitation | Status code cannot be changed after the Shell is sent | Handle onShellError and onError separately; handle post-Shell errors with a client Error Boundary |
| CSS-in-JS Conflict | Runtime libraries like styled-components can cause FOUC | Use CSS Modules or zero-runtime CSS (e.g., Linaria) |
| Suspense Boundary Design | Too many boundaries cause frequent reflows; too few reduce streaming benefit | Recommended to define boundaries per data source |
| Skeleton Size Mismatch | Layout reflow occurs when fallback is replaced | Design skeletons with dimensions similar to actual content |
| Server Memory | Memory usage increases with high numbers of concurrent streams | Stream timeout (abort()) configuration is essential |
| Implementation Complexity | More considerations compared to renderToString: error handling, header timing, etc. | Use framework abstractions like Next.js App Router when possible |
Term: FOUC (Flash of Unstyled Content) — A phenomenon where unstyled HTML is momentarily visible before styles are applied. Can occur when runtime CSS-generation libraries conflict with streaming.
Most Common Mistakes in Practice
- Placing LCP elements inside <Suspense>. LCP targets like hero images and main titles must be included in the Shell. When focused on designing skeleton UX, it's easy to wrap everything in Suspense, but if LCP elements end up inside, Core Web Vitals scores take a significant hit.
- Calling res.setHeader() after the Shell has been sent. Headers cannot be changed after pipe(res). All header configuration, including Content-Type, must be done inside the onShellReady callback, before the pipe() call. It's fairly common to miss the header timing when adding error handling later.
- Not setting a stream timeout. If a data fetch never settles, the stream stays open indefinitely and keeps consuming server memory. This can silently cause server slowdowns, so don't omit the single line setTimeout(() => abort(), 10000).
Closing Thoughts
renderToPipeableStream and <Suspense> shift the paradigm from "build everything, then send" to "send what's ready first," delivering real improvements across TTFB, FCP, and TTI metrics. When I applied this structure to the 3-second dashboard introduced at the start, the time to first visible screen dropped to the hundreds of milliseconds range. The data still arrives after 3 seconds — but during those 3 seconds, users see a layout and skeleton instead of a blank screen.
Here are 3 steps you can try right now.
- Add loading.tsx to a slow page in your Next.js App Router project. Simply creating loading.tsx and returning a skeleton component activates streaming SSR. Open the Response panel for the HTML response in the Chrome DevTools Network tab and refresh the page — you can watch the response grow in real time.
- Separate components with different data sources into their own <Suspense> boundaries. Once you see faster sections appearing before slower ones without waiting, you'll immediately get an intuition for where boundaries should go.
- Visualize hydration timing in the React DevTools Profiler tab. Observe the actual order in which Selective Hydration proceeds, and try placing <Suspense> boundaries deeper for lower-priority components to reduce initial hydration cost.