Prompt Engineering
Designing reliable instructions for LLMs to ensure consistent, accurate business outcomes.
I help businesses ship reliable AI features: prompt engineering, agent workflows, RAG, and automations that actually move metrics.
30+ AI features shipped
6 yrs engineering
3 days median first ship
100% production-grounded
Services
I focus on the AI features that move metrics — not demos. These are the engagements I take most often.
Designing reliable instructions for LLMs to ensure consistent, accurate business outcomes.
Connecting disparate tools and LLMs into seamless, end-to-end automated processes.
Building secure chatbots that converse with your proprietary data to boost team efficiency.
Integrating state-of-the-art AI models into your existing software infrastructure securely.
Selected work
A few representative engagements across finance, e-commerce, and operations.
A mid-market finance team was rekeying invoices from PDFs into NetSuite — eight FTEs, daily, with a 3% error rate.
An extraction pipeline using GPT-4 Vision + structured outputs, validated against a deterministic rules engine, with a human-in-the-loop UI for low-confidence rows.
90% reduction in processing time
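For readers who want to see the shape of that validation step, here is a minimal sketch of how extracted rows can be checked against deterministic rules and routed to human review. It assumes the extraction step returns a per-row `confidence` score; the field names, the two rules, and the 0.85 threshold are all hypothetical, not the client's actual schema.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class InvoiceRow:
    """One extracted invoice; all field names are hypothetical."""
    invoice_id: str
    subtotal: Decimal
    tax: Decimal
    total: Decimal
    confidence: float  # assumed to come from the extraction step, in [0, 1]


def rule_violations(row: InvoiceRow) -> list[str]:
    """Deterministic checks that do not depend on the model."""
    problems = []
    if row.subtotal < 0 or row.tax < 0:
        problems.append("negative amount")
    if row.subtotal + row.tax != row.total:
        problems.append("subtotal + tax does not equal total")
    return problems


def route(row: InvoiceRow, min_confidence: float = 0.85) -> str:
    """Return 'auto' to post the row, or 'review' to queue it for a human."""
    if rule_violations(row) or row.confidence < min_confidence:
        return "review"
    return "auto"
```

The point of the design is that a row is posted automatically only when both the rules and the confidence check agree; anything else lands in the review queue.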
A DTC brand was paying for tier-1 support tickets that were mostly answered by their own help center, just buried under bad search.
A context-aware RAG chatbot trained on product docs, returns/shipping policy, and order data — with grounded answers and citations, plus a graceful handoff to humans.
45% deflection of support tickets
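The "grounded answers with a graceful handoff" idea can be sketched in a few lines. This is an illustration under stated assumptions, not the production system: plain word overlap stands in for embedding similarity, and the `Passage` type, the 0.5 threshold, and the two-passage limit are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical document identifier, reused as the citation
    text: str


def overlap_score(question: str, passage: Passage) -> float:
    """Fraction of question words found in the passage (stand-in for embeddings)."""
    q_words = set(question.lower().split())
    p_words = set(passage.text.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


def answer_or_handoff(question: str, corpus: list[Passage], min_score: float = 0.5) -> dict:
    """Return grounded context with citations, or hand off when retrieval is weak."""
    scored = sorted(
        ((overlap_score(question, p), p) for p in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )
    hits = [(score, p) for score, p in scored[:2] if score >= min_score]
    if not hits:
        # Nothing relevant enough was retrieved: do not guess, escalate to a human.
        return {"action": "handoff", "citations": []}
    return {
        "action": "answer",
        "context": [p.text for _, p in hits],
        "citations": [p.source for _, p in hits],
    }
```

Only the retrieved `context` would be passed to the model, so every answer can point back to the sources listed in `citations`.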
Engineering, ops, and customer success each had their own wikis. Nobody could find anything; people pinged each other instead of searching.
A semantic search engine across all three corpora with permission-aware retrieval, plus a Slack bot that answers natural-language questions with citations.
3h saved/week per employee
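Permission-aware retrieval mostly comes down to filtering before ranking, so a document the user cannot open never reaches the model. A minimal sketch, assuming each document carries a hypothetical `allowed_groups` set copied from its source wiki, with word overlap standing in for semantic similarity:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # hypothetical ACL copied from the source wiki


def visible_docs(docs: list[Doc], user_groups: set[str]) -> list[Doc]:
    """Keep only documents that share at least one group with the user."""
    return [d for d in docs if d.allowed_groups & user_groups]


def search(query: str, docs: list[Doc], user_groups: set[str], top_k: int = 3) -> list[str]:
    """Filter by permission first, then rank the remaining documents."""
    q_words = set(query.lower().split())

    def overlap(doc: Doc) -> int:
        return len(q_words & set(doc.text.lower().split()))

    ranked = sorted(visible_docs(docs, user_groups), key=overlap, reverse=True)
    # Drop documents with no overlap at all so irrelevant pages are never returned.
    return [d.doc_id for d in ranked[:top_k] if overlap(d) > 0]
```

Filtering after ranking would also work, but filtering first means restricted text is never scored, logged, or cached on behalf of the wrong user.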
How it works
The exciting part of AI is the model. The dependable part is the process around it.
We map the workflow you actually want changed and pick the smallest, highest-leverage AI surface to ship first.
I draft prompts, schemas, and a thin architecture — with explicit acceptance criteria and a clear cost/latency budget.
Production code from day one — typed, tested, observable, and wired into the tools your team already uses.
Eval suites and red-team prompts catch regressions before they ship. We measure quality, not vibes.
Vercel, AWS, or your stack — secrets, logs, and rollouts handled. Zero-downtime cutover from your old workflow.
Once it's live, the data is the moat. I tune prompts, retrieval, and routing against real traffic, not synthetic demos.
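To make the eval step concrete, here is a minimal sketch of a regression gate: score a candidate prompt or model against a fixed set of cases and block the release if quality drops below the current baseline. The substring check, the case format, and the 0.02 tolerance are illustrative assumptions; real suites usually use richer graders.

```python
from typing import Callable

# Each case pairs an input with a substring the output must contain.
# Both the case format and the tolerance below are illustrative assumptions.
EvalCase = tuple[str, str]


def pass_rate(generate: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Share of cases whose output contains the expected substring."""
    if not cases:
        raise ValueError("eval suite is empty")
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in generate(prompt).lower()
    )
    return passed / len(cases)


def gate(candidate_rate: float, baseline_rate: float, tolerance: float = 0.02) -> bool:
    """Allow a release only if quality has not dropped by more than the tolerance."""
    return candidate_rate >= baseline_rate - tolerance
```

In CI, `generate` would wrap the real model call, and a `False` from `gate` would fail the build before the change reaches users.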
Toolbox
Provider-agnostic by default. The right answer depends on your data and budget.
LLM & Agents
Retrieval & Data
Evals & Observability
Engineering
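"Provider-agnostic" in practice usually means the application code talks to a small interface and each vendor sits behind an adapter. A minimal sketch, where `ChatModel`, `EchoModel`, and `summarize` are hypothetical names used only for illustration:

```python
from typing import Protocol


class ChatModel(Protocol):
    """Hypothetical interface that any provider adapter would implement."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in adapter for tests; a real adapter would call a provider SDK here."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


def summarize(model: ChatModel, text: str) -> str:
    """Application code depends on the interface, not on a specific vendor."""
    return model.complete(f"Summarize in one sentence: {text}")
```

Swapping providers then means writing one new adapter class, and the same eval suite can be run against each adapter to compare quality, cost, and latency.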
Trust
“Shipped our first real AI feature in three weeks. It hasn't been the lead story in a postmortem once — that's a first for us.”
Maya R.
VP Engineering · Series B fintech
“He treated prompts like code. Eval suites, regressions, the whole thing. Our quality stopped being a vibe.”
Daniel K.
Head of Product · DTC e-commerce
“We were ready to hire two ops people. Instead we hired him for a month and the work just… disappeared.”
Priya S.
COO · Healthtech startup
Engagement options
From a 90-minute strategy call to a monthly optimization retainer. Honest pricing — no platinum-bundle nonsense.
Strategy Session
$750
one-time
A 90-minute working session for teams who know AI fits somewhere but want a senior second opinion before they spend.
MVP Build
$8k–$25k
per engagement
A focused 2–6 week build that takes a single workflow from idea to production — typed, evaluated, deployed.
AI Optimization Retainer
$3.5k+
per month
Once it's live, the data is the moat. I keep your AI features improving against real traffic instead of decaying.
Let's build
Tell me what you're trying to automate. I'll come back with whether it's a 2-week build, a 2-month build, or honestly not a good fit.