Your AI pipeline is failing in production. I fix that.

Your RAG system hallucinates. Your agent crashes on edge cases. Your extraction pipeline returns garbage JSON three times a day.

You built a demo that impressed the board. Then you shipped it. Now you spend your mornings reading error logs instead of building features.

I used to think these problems were unique. Every company describes them differently. But the root cause is the same. Somewhere between the prototype and production, someone skipped the engineering.

Prompt engineering alone will not save you. No-code tools will not save you. Another wrapper around GPT-4 will not save you.

Every week that pipeline stays broken, your team burns engineering hours on symptoms instead of shipping the product that justifies the AI investment.

What saves you is someone who has diagnosed these failures before and knows which layer is lying to you.

Why me

  • Co-created the MLOps platform at Qwak, a startup acquired for $230M. I built infrastructure that deployed ML models to production at scale.
  • ML systems handling 50 billion requests per month. That is my current day job. I work at the scale where small mistakes cost real money.
  • O’Reilly co-author and 500+ technical articles. I wrote the chapter on testing in 97 Things Every Data Engineer Should Know. I have been publishing on production AI/ML since 2017.

What this looks like in practice

I diagnosed a failing no-code AI pipeline and fixed it in less than one working day.

The result: 95% extraction accuracy with 100% correctness on structured fields. Running locally on Mistral 7B. No API costs. No cloud dependency.

The pipeline was supposed to extract structured data from documents. Instead it was crashing. Hallucinations, JSON mismatches, parsing errors. The extraction accuracy was unknown because the system never ran long enough to measure it.

Read the full case study

What I offer

AI Pipeline Diagnostic - $1,500

A 90-minute recorded session where I analyze your failing AI system. RAG, agents, extraction pipelines, structured output systems.

You walk away with a 1-page architecture recommendation. Specific fixes. No vague advice.

If you decide to proceed to implementation, the diagnostic fee is credited toward the project. The $1,500 is not an extra cost.

Implementation Sprint - scoped after diagnostic

Hands-on fix of your production AI system. I do the work.

1-2 weeks. Fixed scope. Fixed price. Typical engagements range from $5,000 to $15,000.

No retainers. No open-ended consulting. I come in, fix the problem, and leave you with a system that works.

Tell me what is breaking

I take on 2-3 diagnostic clients per month.

Apply for a Diagnostic

Not ready yet?

Read my 500+ technical articles first. Or subscribe to my newsletter where I write about building AI systems that survive contact with real users.

When something breaks badly enough, you know where to find me.