Digital Dan
About this project
What this is
Digital Dan is an interactive AI portfolio agent. Instead of a static CV, recruiters and hiring managers can ask the agent questions about my background, projects, and the kind of role I'm looking for, and get answers in something close to my voice. Optionally, those answers can be played back as audio, using a Professional Voice Clone of me trained through ElevenLabs.
The agent runs on Claude. It only knows what I have written about myself: a corpus of around six thousand words covering my work and background. It will not invent facts about my experience, commit to anything on my behalf, or comment on third parties.
Why I built it
Two reasons.
First, it is a portfolio piece. I make the claim, in my CV and elsewhere, that I can build with AI tools. The strongest evidence for that claim is something the person evaluating me can interact with directly. A live system is harder to fake than a bullet point.
Second, it works as a recruiting funnel. A CV asks someone to commit time to reading it and following up. An interactive agent lowers that friction. A recruiter can spend two minutes interrogating it from their phone and walk away with a clearer sense of whether I am worth a conversation. The agent works while I am asleep.
Stack
- Next.js App Router on Vercel for the frontend and API routes
- TypeScript throughout, strict mode
- The Anthropic Claude API for generating responses, with my corpus loaded as the system prompt on each request
- The ElevenLabs Text-to-Speech API for voice output, using a Professional Voice Clone trained on roughly thirty minutes of my own recorded audio
- Inter and JetBrains Mono via next/font/google
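As a rough illustration of the Claude item in the list above, loading the corpus as the system prompt on each request might look like the sketch below. It builds a request for the Anthropic Messages API by hand. The function name `buildClaudeRequest`, the `max_tokens` value, and the idea of passing the model ID in as a parameter are assumptions for illustration; the real project may well use the official SDK instead.

```typescript
// Hypothetical types and helper, not taken from the project's code.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Builds a streaming Messages API request with the whole corpus as the system prompt.
function buildClaudeRequest(
  corpus: string,
  history: ChatMessage[],
  apiKey: string,
  model: string, // assumption: the caller supplies whichever Claude model ID it uses
): HttpRequest {
  return {
    url: "https://api.anthropic.com/v1/messages",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
    },
    body: JSON.stringify({
      model,
      max_tokens: 1024, // assumed cap on reply length
      system: corpus, // the full corpus, sent again on every request
      messages: history,
      stream: true, // ask for a token-by-token response
    }),
  };
}
```

The returned object can be handed to `fetch` inside an API route, so the API key stays on the server and never reaches the browser.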
Decisions I made and why
A small portfolio project like this could be over-engineered in dozens of directions. I tried to avoid that. The decisions below were deliberate, not defaults.
Text input only. Voice output optional.
I considered a full voice loop where the user speaks and the agent speaks back, and decided against it. Text input keeps the system simple, works on any device with no microphone permission, and makes the agent usable in meetings or other quiet contexts. Voice is output only, generated per message when the user clicks a button. People can read faster than they can listen, so for most interactions text is the better default anyway.
Streaming text, on-demand voice.
Claude's responses stream token by token, so the user sees text appear as it is generated rather than waiting for the full reply. Voice is the opposite. It is not streamed and not auto-played. Each agent message gets a play button, and audio is only generated when someone clicks it. The default interaction stays fast and quiet. Voice is available whenever someone wants to hear it.
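The on-demand half of this can be sketched as a small wrapper that only calls the text-to-speech service when a play button is clicked. The names `createAudioLoader` and `synthesize` are hypothetical, and reusing the audio on repeat clicks is an assumption of this sketch, not something the site is known to do.

```typescript
// Hypothetical signature for whatever function actually produces the audio bytes.
type Synthesize = (text: string) => Promise<Uint8Array>;

// Returns a loader that generates audio lazily, once per message.
function createAudioLoader(synthesize: Synthesize) {
  // Assumption: keep the in-flight or finished result so a second click is free.
  const cache = new Map<string, Promise<Uint8Array>>();

  return function getAudio(messageId: string, text: string): Promise<Uint8Array> {
    let pending = cache.get(messageId);
    if (pending === undefined) {
      // Nothing is generated until the first click on this message.
      pending = synthesize(text);
      cache.set(messageId, pending);
      // If generation fails, forget the entry so a later click can retry.
      pending.catch(() => {
        cache.delete(messageId);
      });
    }
    return pending;
  };
}
```

A play button's click handler would call `getAudio(message.id, message.text)` and play the result. Messages nobody clicks never cost a text-to-speech request.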
No retrieval-augmented generation. The full corpus loads into the system prompt.
The corpus is small. Roughly eight thousand tokens. Claude has a context window of two hundred thousand tokens. RAG would mean adding a vector database, an embedding pipeline, retrieval logic, and several new failure modes, for marginal benefit at this scale.
Loading the full corpus into context also tends to outperform RAG on small corpora, because the model sees the whole picture and can reason across it instead of working only from whichever chunks a retriever judged relevant. If the corpus ever grows past around a hundred thousand tokens, the right move is to migrate to a vector database with chunk-level retrieval. Until then, the simpler approach wins.
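The arithmetic behind this decision can be written down as a small budget check. The context window and migration threshold come from the figures above. The ratio of four tokens to every three words is a common rule of thumb for English text and is an assumption here, as are all the function names.

```typescript
// Figures quoted in the text above.
const CONTEXT_WINDOW_TOKENS = 200000;
const RAG_THRESHOLD_TOKENS = 100000;

// Rough estimate, assuming about 4 tokens for every 3 English words.
function estimateTokens(wordCount: number): number {
  return Math.ceil((wordCount * 4) / 3);
}

// Fraction of the context window that the corpus occupies.
function contextShare(corpusTokens: number): number {
  return corpusTokens / CONTEXT_WINDOW_TOKENS;
}

type Strategy = "full-context" | "chunk-retrieval";

// Stay with the full corpus in the prompt until it passes the threshold.
function chooseStrategy(corpusTokens: number): Strategy {
  return corpusTokens > RAG_THRESHOLD_TOKENS ? "chunk-retrieval" : "full-context";
}
```

Under these assumptions, a corpus of six thousand words comes out at about eight thousand tokens, which is 4% of the context window and well below the point where retrieval would start to pay for itself.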
ElevenLabs TTS API rather than ElevenLabs Agents.
ElevenLabs has a managed conversational AI product that handles the full voice loop end to end. It is the right tool for a real-time voice agent. Since the input here is text only, all I need from ElevenLabs is text-to-speech, which is a direct API call with my voice ID and a string of text. Less infrastructure, less cost, and full control over the user interface.
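For a sense of how small that direct call is, here is a sketch that builds a request for the ElevenLabs text-to-speech endpoint from a voice ID and a string of text. The helper name `buildTtsRequest` is hypothetical, and the default `model_id` is an assumption; the site may use a different model or extra voice settings.

```typescript
// Hypothetical request shape, kept free of framework types so it runs anywhere.
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds a text-to-speech request for one message.
function buildTtsRequest(
  voiceId: string,
  text: string,
  apiKey: string,
  modelId: string = "eleven_multilingual_v2", // assumed default model
): TtsRequest {
  return {
    // The voice ID is part of the path, so it is URL-encoded defensively.
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      accept: "audio/mpeg",
      "xi-api-key": apiKey,
    },
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

Passing the result to `fetch` from a server-side route returns the audio bytes, which the frontend can then play for the message that was clicked.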
Honest framing
I am a commercial operator with eight years in life sciences, and I have become genuinely hands-on with AI tools over the last couple of years. I am not a trained software engineer, and I will not pretend to be one.
The system architecture, the prompt design, the corpus, the product decisions, and the choice of what to build and what to leave out are mine. The code itself was written with the help of Claude and other AI coding tools, with me directing, reviewing, and debugging.
I take the view that this distinction matters less than it used to. What matters is whether the system works, whether the decisions behind it are sound, and whether the person doing the work understands what they have built well enough to defend it. I think I clear all three bars on this one.
What I'd build differently
If I were to properly productionise this for someone else (say, a colleague who wanted the same thing for themselves), I would add per-user data, an admin interface for editing the source material through a UI, and a streaming voice loop instead of the current per-message playback. None of that is on the agenda for the version on this site, which is a portfolio piece, not a product.
Get in touch
If you have read this far, the easiest next step is to message me on LinkedIn or send me an email. Both links are in the footer below.