i’m building a small AI chat ui in react (next.js app router, react 18) that streams tokens from my backend (openai-style stream). basic flow:
- user types a prompt
- i push the user message into `messages` state
- then i call my `/api/chat` endpoint
- the endpoint returns a stream of tokens
- on each chunk i update state to append the partial assistant message
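for context, the endpoint is roughly this shape (heavily simplified sketch, not my real code; the real handler proxies an openai-style upstream stream, here i just fake a few tokens so the shape is clear):

```ts
// app/api/chat/route.ts (simplified sketch, real handler proxies openai)
export async function POST(req: Request) {
  const { messages, input } = await req.json(); // real code forwards these upstream
  void messages; void input; // unused in this fake version

  const encoder = new TextEncoder();
  const stream = new ReadableStream<Uint8Array>({
    async start(controller) {
      // fake a token stream
      for (const token of ["hel", "lo ", "world"]) {
        controller.enqueue(encoder.encode(token));
        await new Promise((r) => setTimeout(r, 50));
      }
      controller.close();
    },
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```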
it “kind of” works locally, but when i click fast / send multiple prompts, or in a production build, the ui goes crazy:

- some assistant messages get completely replaced by the last chunk
- sometimes old chunks vanish and i only see the final one
- sometimes a previous conversation suddenly re-appears
- in strict mode it’s even worse (looks like the double-render is exposing something)
i know about react 18 concurrent rendering / strict mode double-invoking, stale closures, etc. my gut says i’m closing over `messages` inside the async function and then using it in `setMessages([...messages, ...])` after new renders have already happened. but i’m not 100% sure what the idiomatic pattern is here for streaming ai tokens:
- should i be using a `useReducer` instead of `useState` for the messages array? (rough sketch of what i mean right after this list)
- should i hold the current assistant message in a `ref` and only commit it to state every X ms? (sketch at the bottom of the post)
- do i need to cancel the previous stream with an `AbortController` when the user sends a new prompt, to avoid race conditions? (also sketched at the bottom)
- what’s the cleanest way to handle this in react 18 so that streaming is stable even under strict mode?
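to make the first question concrete, here’s the direction i was imagining for the `useReducer` version (untested sketch; the action names and shapes are made up, and `Message` is the same type as in my component below):

```tsx
// sketch: a reducer so every update is pure and computed from the latest
// state, no matter how stale the closure that dispatches it is
import { useReducer } from "react";

type Message = {
  id: string;
  role: "user" | "assistant";
  content: string;
};

type Action =
  | { type: "add"; message: Message }
  | { type: "append-chunk"; id: string; chunk: string };

function messagesReducer(state: Message[], action: Action): Message[] {
  switch (action.type) {
    case "add":
      return [...state, action.message];
    case "append-chunk":
      // target the assistant message by id instead of spreading a captured array
      return state.map((m) =>
        m.id === action.id ? { ...m, content: m.content + action.chunk } : m
      );
  }
}

// usage inside the component (instead of useState):
// const [messages, dispatch] = useReducer(messagesReducer, []);
// dispatch({ type: "add", message: userMsg });
// dispatch({ type: "append-chunk", id: assistantId, chunk });
```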
here’s a simplified version of what i’m doing right now (this is the broken one). where exactly is the bug, and how would you structure this properly for streaming ai responses?

```tsx
"use client"; // app router: this component uses hooks, so it has to be a client component

import { useState } from "react";

type Message = {
  id: string;
  role: "user" | "assistant";
  content: string;
};

export default function Chat() {
  const [messages, setMessages] = useState<Message[]>([]);
  const [loading, setLoading] = useState(false);

  const handleSend = async (userInput: string) => {
    if (!userInput.trim()) return;

    // push user message
    const userMsg: Message = {
      id: crypto.randomUUID(),
      role: "user",
      content: userInput,
    };
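    // ❓ `messages` below is whatever this render's closure captured (my stale-closure suspect #1)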
    setMessages([...messages, userMsg]);
    setLoading(true);

    try {
      const res = await fetch("/api/chat", {
        method: "POST",
        body: JSON.stringify({
          messages: messages, // send whole history
          input: userInput,
        }),
        headers: { "Content-Type": "application/json" },
      });

      const reader = res.body?.getReader();
      let assistantMsg: Message = {
        id: crypto.randomUUID(),
        role: "assistant",
        content: "",
      };
      if (!reader) {
        setLoading(false);
        return;
      }

      // stream chunks
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
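        // (side note i spotted while posting: a fresh TextDecoder per chunk can split multi-byte chars, though probably not the main bug?)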
        const chunk = new TextDecoder().decode(value || new Uint8Array());
        assistantMsg.content += chunk;
        // ❌ this is where things go wrong when sending multiple messages fast
        // messages here is not the latest one and strict mode makes it worse
        setMessages([
          ...messages,
          userMsg,
          assistantMsg, // keeps getting overwritten / duplicated
        ]);
      }
    } catch (e) {
      console.error(e);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div>
      {/* imagine there is an input that calls handleSend */}
      {messages.map((m) => (
        <div key={m.id}>
          <b>{m.role}:</b> {m.content}
        </div>
      ))}
      {loading && <div>Thinking…</div>}
    </div>
  );
}
```
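for completeness, here’s the rough shape i was imagining for the cancellation question: abort the in-flight request before starting a new one. sketch only (`useStreamCancel` and `abortRef` are names i made up, and the caller would still need to catch the `AbortError` that an aborted `fetch`/`read()` throws):

```tsx
// sketch: cancel the previous stream before starting a new one
import { useRef } from "react";

function useStreamCancel() {
  const abortRef = useRef<AbortController | null>(null);

  return async (url: string, body: unknown, onChunk: (chunk: string) => void) => {
    // kill the previous request so its late chunks can't clobber state
    abortRef.current?.abort();
    const controller = new AbortController();
    abortRef.current = controller;

    const res = await fetch(url, {
      method: "POST",
      body: JSON.stringify(body),
      headers: { "Content-Type": "application/json" },
      signal: controller.signal, // aborting rejects fetch/read with AbortError
    });

    const reader = res.body?.getReader();
    if (!reader) return;
    const decoder = new TextDecoder(); // one decoder, reused with { stream: true }

    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      onChunk(decoder.decode(value, { stream: true }));
    }
  };
}
```

and this is what i meant by holding the partial message in a ref and only committing every X ms (again just a sketch, the 50ms is arbitrary):

```tsx
// sketch: buffer tokens in a ref, flush to state at most every ~50ms
import { useRef } from "react";

function useBufferedAppend(commit: (text: string) => void) {
  const bufferRef = useRef("");
  const timerRef = useRef<ReturnType<typeof setTimeout> | null>(null);

  return (chunk: string) => {
    bufferRef.current += chunk;
    if (timerRef.current) return; // a flush is already scheduled
    timerRef.current = setTimeout(() => {
      timerRef.current = null;
      const text = bufferRef.current;
      bufferRef.current = "";
      commit(text); // e.g. dispatch({ type: "append-chunk", id, chunk: text })
    }, 50);
  };
}
```

my main worry with both: are they actually safe under strict mode / concurrent rendering, or am i just papering over the stale-closure bug in the component above?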