Service · freelance AI engineer

Building AI agents. Production — not demo.

I build AI agent systems that actually run — end-to-end. LLM orchestration, tool-calling, RAG, and the integration into your existing stack. Solo, from Eindhoven, available 2-3 days a week for clients in the Netherlands and the EU.

Book a call →
See ECHO as a reference →

What I help with

The projects where I add the most value:

Standing up an agent from scratch. LLM orchestration · tool-calling · memory layer · the whole architecture. Not a chatbot that only talks, but an agent that gets tasks done and picks up where you left off yesterday.
Improving an existing LLM integration. Routing between models, lower costs, lower latency, higher reliability. Often with multi-tier fallback and local models for the cheap queries.
Setting up RAG on your own documents. Embeddings, chunking, retrieval strategy, evaluation. No generic "load and chat" — configured for your corpus and your questions.
Claude API integration. Tool-use, MCP servers, streaming, prompt caching. For teams that want to go from a prototype to production without running into the usual pitfalls.
Workflow automation with agents. SDR agents, support flows, invoice processing, content pipelines. With the right guardrails and monitoring so it doesn't quietly break.
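As an illustration of the multi-tier fallback mentioned above, here is a minimal sketch: try the cheap local tier first and fall through to a hosted model when it fails. The names `Tier`, `complete_with_fallback`, `local_model` and `hosted_model` are hypothetical stand-ins, not a real client API; in practice each `call` would wrap an actual Ollama or Claude request.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tier:
    name: str
    call: Callable[[str], str]  # hypothetical: takes a prompt, returns text


def complete_with_fallback(prompt: str, tiers: list[Tier]) -> tuple[str, str]:
    """Try each tier in order; return (tier_name, answer) from the first success."""
    errors = []
    for tier in tiers:
        try:
            return tier.name, tier.call(prompt)
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{tier.name}: {exc}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


# Stand-ins for real model clients (assumptions for this sketch only).
def local_model(prompt: str) -> str:
    if len(prompt) > 40:
        raise ValueError("prompt too long for the local tier")
    return f"[local] {prompt}"


def hosted_model(prompt: str) -> str:
    return f"[hosted] {prompt}"


TIERS = [Tier("local", local_model), Tier("hosted", hosted_model)]
```

Calling `complete_with_fallback("short question", TIERS)` is answered by the local tier; a prompt longer than 40 characters makes the local stand-in raise, so the hosted tier answers instead. The ordering of `TIERS` is the routing policy, which keeps it easy to inspect and change.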

The stack I reach for

Stack follows the problem, but this is what I touch most in practice for agent work:

LLM

Anthropic Claude (Sonnet · Haiku · Opus). Local Ollama (Qwen 2.5, Llama 3.2) for cheap routes and offline fallback.

Backend

Python and FastAPI. Async I/O · tool registry · streaming. Rust where latency matters.

Tooling

Tool-calling · MCP · the agentskills.io pattern (OpenClaw-compatible). Multi-tier fallback for production reliability.

Memory

Obsidian vault as a context repository (Letta-style), ADD-only extraction, optional Qdrant for vector recall.

Infra

Supabase EU · Cloudflare R2 · Sentry EU · Stripe. Vercel or a self-hosted VPS with Coolify when it fits.

Not

No vendor lock-in without a reason. No LangChain spaghetti. No agent framework when 100 lines of Python will do.
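To make the "100 lines of Python" point concrete, here is a rough sketch of a tool registry and dispatcher using only the standard library. The JSON shape (`name` plus `arguments`) and the names `TOOLS`, `tool`, `add` and `dispatch` are assumptions for illustration; a real integration would map the provider's tool-call format onto something like this.

```python
import json
from typing import Any, Callable

# Registry mapping tool names to plain Python callables.
TOOLS: dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Decorator that registers a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    """Example tool: add two numbers."""
    return a + b


def dispatch(call_json: str) -> dict:
    """Run one tool call given as JSON; always return a structured result."""
    call = json.loads(call_json)
    name = call.get("name")
    fn = TOOLS.get(name)
    if fn is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": fn(**call.get("arguments", {}))}
    except TypeError as exc:  # wrong or missing arguments from the model
        return {"ok": False, "error": str(exc)}
```

For example, `dispatch('{"name": "add", "arguments": {"a": 2, "b": 3}}')` returns a result dict with `ok` set to true, while an unknown tool name or bad arguments come back as an error dict rather than an exception, so the error can be fed back to the model as text.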

Proof — ECHO

I don't only build for clients; I also build for myself. ECHO is my own agent orchestrator, and it runs on my desk every day. It is voice-first, routes between local Ollama and Claude, keeps its own memory in an Obsidian vault, and shows a live HUD with system stats. The voice layer builds on ten years of audio experience.

What ECHO proves for client work: I know where agents break in production, and how to prevent it. The architecture choices you make early (memory layer, routing, tool registry, fallback strategy) decide whether you're still happy with the codebase six months from now — or whether you start over.

→ Read about ECHO

Who this works best for

SaaS companies that want to get an AI feature into production without hiring a permanent team. One or two sprints with me are often enough to get it working.
SMEs with internal automation. Agents for support, sales development, invoice processing, content flows. Work that otherwise stays on the shelf because there's no team for it.
Solo founders and small teams that want the AI layer under their product, but would rather not figure out the whole LLM economy themselves.

How it works

A short email or message. What the problem is, a rough shape of what you're after, your time horizon. One paragraph is enough.
A 30-minute call. If it clicks, we scope it. If it doesn't, I'll tell you that too. I'll show ECHO live if you want.
One paid week first. For longer engagements: one week of work to check the rhythm before we go further.

Ready to start?

Rate, availability and the contact form are on the hire page. I work remotely from Eindhoven for clients in the Netherlands and the EU, 2-3 days a week.

→ To /hire (rate + contact)