Bartosz Mikulski - AI Consulting for fintech teams
Stop AI hallucinations in production with 14-day RAG Hardening Sprint
Within two weeks your team goes from “we think it’s fine” to a monitored, production RAG service with a clearly measured drop in hallucinations (using a KPI we set together after the baseline audit).
Day | What Happens | Details |
---|---|---|
1–3 | Baseline & KPI Lock | We inspect code, data, and queries. Agree on a percentage reduction or error threshold that fits your reality. No guesswork promises. |
4–10 | Collaborative Build | Pair-programming on AWS or on-prem. Retrieval logic, prompt tuning, evaluation harness. |
11–14 | Stress-Test & Transfer | Load tests, docs, hand-off call. Plus a 90-day Slack hotline for follow-ups. |
Investment: $12,000 per 14-day sprint ($6,000 upfront, $6,000 on hitting the KPI)
If we don’t hit the agreed KPI by Day 14, I continue working free of charge for up to four more weeks, and you keep the remaining 50% of the fee.
When we hit the KPI, you pay the remaining 50%.
Featured AI projects I have built:
- Retrieval Augmented Generation for Customer Support: Implemented a semantic search solution based on a vector database. Performed data analysis, including clustering and topic modeling, to identify the types of past support requests. Used GenAI to draft suggested solutions to new customer support requests.
- AI-based reporting solution: Worked on an AI-based solution for analyzing online reviews and comparing the performance of different branches of the same company.
What people say about me:
- Martyna Urbanek-Trzeciak (Product Manager - Data Engineering):
I worked with Bartosz while he was a member of the Data Engineering team at Fandom. He is very professional and open to sharing his knowledge with his teammates and beyond. His approach was always very data-driven, and his deep knowledge of the Data Engineering area made him a very valuable partner in discussions.
- Mariusz Kuriata (Senior Manager of Engineering - Head of Ops):
It was my pleasure to work with Bartosz. Bartosz is a dedicated and experienced Data Engineer who showed a range of skills and readiness to help. I appreciated that I could count on Bartosz to lead sophisticated technical projects. Highly recommend!
- Workshop participant:
I'm extremely impressed with Bartosz's expertise and experience. We covered all assignments, addressing various details, scenarios, and potential errors. Every question we asked was answered thoroughly. The workshop format of the sessions and small group activities were particularly enjoyable. We had opportunities to apply our new knowledge practically. The trainer remained accessible whenever questions arose. If any uncertainties emerged, the facilitator explained everything with patience.
Frequently Asked Questions:
- How much of our engineering team's time will this require?
Minimal. I expect your tech lead to spend ~2‑3 h/week on reviews and decisions, a backend/ML engineer ~2‑4 h/week on interface and infra hooks, security/compliance ~1 h/week on sign‑offs, and product/support ~2‑3 h/week on acceptance, with async communication and the option to run everything through a single liaison if you're resource‑constrained. The one exception is the final handoff, when I will need your team for the knowledge-transfer session.
- How will your solution integrate with our existing tech stack and architecture?
I add two thin layers around your current AI flows: retrieval/grounding that connects to your existing data sources or vector store, and validation/guardrails that enforce citations, "I‑don't‑know" fallbacks, and policy checks before responses reach users. Both are deployed as a sidecar service, in‑app library, or API behind your gateway, wired into your observability (OpenTelemetry → Datadog/Prometheus), feature flags, CI/CD, and secrets/KMS across AWS/GCP/Azure or on‑prem.
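To make the validation/guardrail layer concrete, here is a minimal sketch. The `Draft` shape, thresholds, and fallback text are illustrative assumptions, not a fixed interface; your pipeline's actual types and scores will differ:

```python
from dataclasses import dataclass, field

@dataclass
class Draft:
    answer: str
    citations: list[str] = field(default_factory=list)  # source-document IDs the model claims to have used
    confidence: float = 0.0                             # retrieval/answer score from your pipeline

FALLBACK = "I don't know - I couldn't find a grounded answer for that."

def validate(draft: Draft, min_citations: int = 1, min_confidence: float = 0.6) -> str:
    """Enforce citations and an 'I don't know' fallback before a response ships."""
    if len(draft.citations) < min_citations:
        return FALLBACK  # no grounding evidence: refuse rather than guess
    if draft.confidence < min_confidence:
        return FALLBACK  # low retrieval score: refuse rather than guess
    return draft.answer  # grounded and confident: let it through
```

The same function works whether it runs inside a sidecar service or as an in-app library call; the point is that every response passes through it before reaching a user.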
- Will the solution work with the AI models and platforms we currently use (e.g. our existing LLM provider or infrastructure), or will we need to adopt new technologies?
Yes. I support OpenAI/Azure OpenAI, Anthropic, Google, Cohere, Mistral, and self‑hosted models; an adapter layer insulates you from vendor changes, and I can add policy‑based model routing (cost/latency/quality) with offline evals so upgrades don't regress quality.
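A minimal sketch of what such an adapter layer can look like. The interface, the policy names, and the stand-in client are assumptions for illustration; real adapters would wrap your vendor SDKs behind the same surface:

```python
from typing import Protocol

class LLMClient(Protocol):
    """The only surface the rest of the system is allowed to depend on."""
    def complete(self, prompt: str) -> str: ...

class EchoClient:
    """Stand-in provider so this sketch runs without any vendor SDK installed."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def route(clients: dict[str, LLMClient], policy: str) -> LLMClient:
    """Policy-based routing: map a policy name ('cheap', 'best', ...) to a provider."""
    return clients[policy]

# Swapping vendors means changing this dict, not the calling code.
clients: dict[str, LLMClient] = {"cheap": EchoClient(), "best": EchoClient()}
reply = route(clients, "cheap").complete("hello")
```

Because callers only see `LLMClient`, replacing a provider (or re-pointing a policy after an offline eval) is a one-line change in the registry.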
- After the engagement, will our team be able to maintain and extend the solution on their own?
That's the point. I design myself out with a clean repo, IaC (e.g., Terraform), runbooks, an automated evaluation suite (unit/scenario/regression) with dashboards, plus knowledge-transfer sessions (engineering + ops) and the 90‑day Slack hotline. I can offer a light, cancel‑anytime retainer for model/vendor updates if you want it.
- We deal with sensitive financial data. How do you ensure data security and regulatory compliance (e.g. privacy laws) during the project?
I treat security as a hard constraint:
  - Data minimization and PII redaction at ingress.
  - In‑VPC inference and vector stores, with no data leaving your region without explicit approval.
  - BYOK + KMS at rest and TLS in transit.
  - SSO/SAML/OIDC with least‑privilege RBAC and tamper‑proof audit logs.
  - Vendor governance with DPAs and zero‑retention LLM endpoints, or on‑prem options.
  - Alignment with GDPR (DPIA, records of processing), EU AI Act readiness (risk class, provenance, human oversight), and PCI‑DSS adjacency for payment flows.
1-Day RAG Reliability Workshop – $4,000
For teams that prefer to learn before they buy.
Module | What your engineers take away |
---|---|
Morning – RAG Foundations | Live walkthrough of retrieval pitfalls, demo of a failing vs. fixed pipeline. |
Midday – Hands-On Lab | Each participant builds a small RAG endpoint with evaluation tests they keep. |
Afternoon – Team Code Clinic | We review your repo in real time, flag top risk areas, and outline next steps. |
Take-Home Pack | Slide deck, lab repo, evaluation notebook, and a 30-day Slack Q&A channel. |
Up to 6 participants per workshop.
If the team says they learned nothing they can apply next week, shred the invoice (no charge).
Apply the full $4,000 toward the 14-Day RAG Hardening Sprint if you upgrade within 30 days.