
Building a Personal AI Chatbot UI in Next.js + FastAPI (Part 4)
Shipping the chatbot meant picking where to put it (new site vs. portfolio), how to draw the UI (template vs. custom), and where to run the OpenAI calls (Next.js API routes vs. a separate Python service). This is Dev Log 4 of the OmarAI series. I ended up putting the UI in my existing portfolio and splitting the backend into FastAPI. Final URL: theomar.me/omar-ai.
TL;DR
- Host it in the portfolio site, not a new domain — less context-switching for visitors.
- Skip drop-in React chatbot templates with version conflicts; steal the CSS, write the wiring.
- Split the backend into FastAPI — Python is faster to iterate for AI calls.
- Stream replies over server-sent events so the UI feels snappy.
Question 1 — separate site or portfolio site?
Two options:
- New site dedicated to OmarAI. Clean, product-like, easy to share.
- Embed it in my existing portfolio alongside Projects, Blogs, Resume.
I went with #2. The portfolio already has everything about me in one place — adding an actual Omar chatbot to the mix is a natural extension. Someone reading my project list can immediately ask the bot a clarifying question. That loop is the product.
Question 2 — template or custom UI?
I started by looking for a prebuilt chatbot frame. A Subframe template for AI chat looked like it'd save me half a day. When I tried to drop it into my Next.js project, the build immediately failed:
- Different React major version than my site.
- Different Tailwind config.
- A handful of runtime deps I didn't have and didn't want.
Classic dependency conflict. When a template is 80% styling and 20% wiring, you're better off copying the styling and writing the wiring by hand. So I did.
Question 3 — backend in Next.js API routes, or a separate service?
First I tried Next.js API routes. The pure Node SDK call is straightforward. But the moment I wanted streaming, a clean system-prompt builder, and good local debugging, the iteration speed in TypeScript lagged behind what I'd do in Python. I'm much more experienced in Python for AI work, and small differences in DX compound over dozens of iterations.
So I split the backend into FastAPI:
```
Browser ──→ Next.js (Vercel): static portfolio + /omar-ai page
                 │
                 ▼  fetch POST /chat
            FastAPI service (Python)
                 │
                 ▼
            OpenAI API (fine-tuned gpt-4o-*)
```
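Since the Next.js frontend and the FastAPI service live on different origins, the browser's `fetch` to `/chat` only works if the backend enables CORS. A minimal sketch — the allowed origin and app wiring here are illustrative, not lifted from the actual service:

```python
# Minimal FastAPI app with CORS enabled for the portfolio's origin.
# The origin below is an assumption -- use whatever domain serves the frontend.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://theomar.me"],  # the Next.js frontend's origin
    allow_methods=["POST"],                # /chat only needs POST
    allow_headers=["Content-Type"],
)
```

Without this, the cross-origin POST from the Vercel-hosted page would be blocked by the browser before it ever reaches FastAPI.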
The minimal frontend → backend contract
```ts
// frontend
const res = await fetch(`${API}/chat`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ messages }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let assistant = "";
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  assistant += decoder.decode(value, { stream: true });
  setCurrent(assistant); // re-render the UI incrementally
}
```
```python
# backend (sketch)
@app.post("/chat")
async def chat(body: ChatBody):
    def stream():
        for delta in openai_stream(
            model=FINE_TUNED_MODEL,
            messages=[SYSTEM_PROMPT_WITH_JSON_MEMORY, *body.messages],
            stream=True,
        ):
            yield delta

    return StreamingResponse(stream(), media_type="text/plain")
```
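The sketch above references a `ChatBody` model that isn't shown. One plausible shape, assuming the `{ messages: [{role, content}, ...] }` contract the frontend sends — the field types here are my guess, not lifted from the actual service:

```python
# Pydantic models matching the { messages: [{role, content}, ...] } request body.
# Field names and the role literal are assumptions based on the contract in the post.
from typing import Literal

from pydantic import BaseModel


class ChatMessage(BaseModel):
    role: Literal["user", "assistant"]
    content: str


class ChatBody(BaseModel):
    messages: list[ChatMessage]
```

FastAPI validates the incoming JSON against this model automatically, so malformed requests never reach the OpenAI call.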
Server-sent events keep the UI feeling responsive even when the model takes a few seconds to produce a full reply.
Rendering the reply
Replies come back as markdown. I use marked on the frontend to render them — same library I use to render blog content. This keeps the chat replies styled consistently with the rest of the site.
Lessons from the UI work
| Decision | Why it held up |
|---|---|
| Chatbot inside the portfolio | Existing traffic shortcut |
| Custom UI instead of a template | No dep conflicts, fully themeable |
| FastAPI over Next.js API routes | Faster Python iteration for AI code |
| Server-sent events streaming | Snappy UX, trivially supported by browsers |
What's next
The site was live, the UI was working, and people started using it. Part 5 is what I learned from the feedback — length of replies, persona consistency, and a side quest into Llama / MPT / Falcon fine-tuning.
Key takeaways
- The right domain for a personal chatbot is the personal site.
- Drop-in React templates don't drop in across dependency trees. Copy CSS, not package.json.
- Split by language strength: FastAPI in Python for AI work, Next.js for the portfolio, fetch in between.
References
- OpenAI — Fine-tuning API
- FastAPI — StreamingResponse docs
- Try OmarAI live
- Series: Part 1 · Part 2 · Part 3 · Part 4 (you are here) · Part 5
Frequently Asked Questions
Why not host the whole chatbot in Next.js?
I started there. Next.js API routes are great for simple endpoints, but anything involving OpenAI SDK quirks, streaming tokens, tool orchestration, or debugging the fine-tuned-model call was much faster for me in Python. I'm more experienced in Python for AI work, and the iteration speed in FastAPI was noticeably higher.
Why did a Subframe chatbot template not work in an existing Next.js project?
Drop-in React chatbot templates assume a specific version tree of React, Next.js, and UI libraries. Pulling the template into a project with different versions of those dependencies broke the build — classic dependency-conflict territory. When a template is 80% styling and 20% wiring, you're better off stealing the CSS and writing the wiring yourself.
How is the backend deployed?
The FastAPI service runs on a small VM behind HTTPS. The Next.js frontend calls it over fetch with the user message; the backend handles the OpenAI call (system prompt, JSON memory, fine-tuned model) and streams the reply back as server-sent events to keep the UI responsive.
Can you show a minimal version of the frontend-backend contract?
Client POSTs { messages: [{role, content}, ...] } to /chat. Server responds with a server-sent events stream of text deltas followed by a 'done' event. The frontend buffers deltas into the current assistant message and renders with marked for markdown. That's the whole protocol.
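To make that protocol concrete, here's a sketch of the delta-buffering step, written in Python for illustration (the real frontend does this in TypeScript). The exact event framing is an assumption — the post only specifies text deltas followed by a 'done' event:

```python
# Hypothetical parser for the SSE stream described above: each delta arrives
# as "data: <text>\n\n" and the server ends with "event: done".
# The wire framing is an assumption -- the post doesn't pin it down.
def sse_deltas(raw: str) -> list[str]:
    """Extract text deltas from a raw SSE payload, stopping at the 'done' event."""
    deltas = []
    for block in raw.split("\n\n"):
        lines = [ln for ln in block.split("\n") if ln]
        if "event: done" in lines:
            break
        for ln in lines:
            if ln.startswith("data: "):
                deltas.append(ln[len("data: "):])
    return deltas


stream = "data: Hel\n\ndata: lo!\n\nevent: done\n\n"
print("".join(sse_deltas(stream)))  # → Hello!
```

The frontend does the same thing incrementally: append each delta to the current assistant message and re-render, rather than waiting for the full reply.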
Is the chatbot available on the portfolio?
Yes — the live version is at theomar.me/omar-ai. It carries the fine-tuned persona from Parts 1–2, the JSON memory from Part 3, and the UI from this post. Post-launch tuning is covered in Part 5.