About
I'm Harrison Brown. I build AI features that don't fall over.
Six years writing production code, the last three working almost entirely on LLM-powered systems. Before AI, I shipped backend systems and data pipelines for fintech and e-commerce, which is why I tend to care more about reliability and evals than benchmarks.
My approach
The exciting part of AI is the model. The dependable part is everything around it: evals, retrieval, observability, cost budgets, and a deploy story that doesn't make you nervous. I lead with the boring stuff because that's what lets the interesting stuff actually ship.
What I won't do
I won't take on unbounded consulting engagements, ship a chatbot that's clearly the wrong solution, or charge a retainer whose value I can't quantify. If what you actually need is a better SQL query, I'll tell you.
What I love
Workflow automation that gives someone their afternoon back. Eval suites that catch a regression before a customer does. The moment a clean prompt replaces 200 lines of brittle glue.
Toolbox
Skills & tools
Provider-agnostic by default. The right answer depends on your data and budget.
LLM & Agents
- OpenAI
- Anthropic
- xAI / Grok
- Google Gemini
- Mistral
- Groq
- Tool use & function calling
- Multi-step agents
Retrieval & Data
- Pinecone
- pgvector
- Weaviate
- Hybrid search
- Embedding evaluation
- Chunking strategies
- Document parsing
Evals & Observability
- LLM-as-judge
- Golden datasets
- Langfuse
- OpenTelemetry tracing
- Cost & latency budgeting
- Drift detection
Engineering
- TypeScript / Node
- Python / FastAPI
- Next.js / Vercel
- Postgres
- Temporal / queues
- AWS / Cloudflare Workers
Let's build
Have an AI feature that needs to ship without falling over?
Tell me what you're trying to automate. I'll tell you whether it's a 2-week build, a 2-month build, or honestly not a good fit.