Agentic AI and Voice AI for Dutch businesses
Agentic AI is more than a chatbot: it is software that plans, executes and monitors tasks itself. I build agentic AI and voice AI agents that take over predictable work: voice reception, document extraction, email routing, RAG on your own knowledge base, multi-step workflows. With monitoring, human escalation and audit trails. Start small, scale what works.
Agentic AI is not magic. It is software that understands language, which is why it can take steps previously only humans could. Pick up a phone and book an appointment. Read an invoice and post it to the correct ledger. Read an inbound email, look up the correct answer and reply. Do research through a web portal and summarise the results. That is what DataDream builds. Not demos, not impressive showcases without follow-up, but agents that run in production and handle work every day.
The difference between agentic AI and a classic chatbot is autonomy. A chatbot answers a question based on a script or knowledge base. An agentic AI system has a goal, can decide which tools and steps are needed to reach it, and keeps working until it is done, even when input deviates from a fixed template. The difference from classic RPA lies in language understanding and judgment: agents can interpret what is asked, while scripted bots only follow pre-defined rules. For processes with variation that is the difference between "works sometimes" and "works every day". For the RPA-vs-agentic trade-off see RPA.
Voice AI is a specific branch of agentic AI: software that understands spoken language, formulates an answer and speaks back with a natural-sounding voice. A voice AI agent can pick up the phone for you, schedule an appointment, answer first-line customer questions or run an intake. For customer service that wants to be reachable outside office hours, for receptions that always get the same standard questions, or for sales intakes with a fixed script, voice AI is often cheaper and faster than an extra hire. Tools I use: ElevenLabs, OpenAI Voice, Vapi, Retell. Logging is on by default, so you can review exactly what was said.
The approach is engineer-pragmatic. First decide which task is actually suited to an agentic AI approach and which is better served by classic automation or a human. Then build a defined pilot, put it in production with a limited user group and activate monitoring. Measure how often it decides correctly, how often it escalates, how often something goes wrong and what the fallback is. Only then scale up. A Quickscan upfront helps pick the right use case before you start building. Sector-specific applications, such as invoice processing for accountants, customer service bots, document review for lawyers or reception bots for tourism, have their own page; the broader automation roadmap for the Netherlands sits separately.
Three levels of AI autonomy
Not every task needs full autonomy. Pick the right rung for each use case, and scale up once the numbers are stable.
Assistant
Human chooses, AI executes
Your team member is at the wheel. AI speeds up the work they would already do: drafting an email, summarising a document, writing notes after a customer call. No autonomy, just speed.
example
Sales engineer asks Claude to write a first draft of a quote based on the transcript of the discovery call.
tools
Claude, ChatGPT, Gemini, Microsoft Copilot, Notion AI
Workflow
Rules decide, AI executes
The steps are defined upfront. AI fills in the steps where judgment or language understanding is required. If-then logic with AI built in where strict scripts fall short.
example
Inbound invoice is automatically read, fields extracted, classified and booked into Exact. Uncertain cases get a doubt flag.
tools
n8n + Claude API, Make + GPT, Zapier + AI, custom Python
Autonomous
Goal is set, AI plans
You set a goal. The agent decides which tools and steps are needed to reach it. Human-in-the-loop for uncertain or sensitive decisions, audit trail for everything.
example
Voice agent picks up the phone, classifies the conversation, handles first-line questions or escalates to a human via Slack.
tools
LangGraph, Vapi, Retell, custom agentic loops
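The autonomous rung can be sketched as a small loop: the agent looks at what it knows so far, picks the next step, records it, and stops when the goal is reached or a step budget runs out. This is a minimal illustration, not a production design. The names classify_call, run_voice_agent and AgentRun are hypothetical, the keyword rules stand in for a model call, and the "escalate" step stands in for a real handoff such as a Slack message.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    goal: str
    audit_trail: list = field(default_factory=list)  # (step, detail) tuples
    outcome: str = "unfinished"


def classify_call(transcript: str) -> str:
    """Hypothetical stand-in for a model call that labels the conversation."""
    text = transcript.lower()
    if "complaint" in text:
        return "complaint"
    if "opening hours" in text:
        return "first_line"
    return "unknown"


def run_voice_agent(transcript: str, max_steps: int = 5) -> AgentRun:
    run = AgentRun(goal="handle inbound call")
    label = None
    for _ in range(max_steps):
        if label is None:
            # Step 1: work out what kind of conversation this is.
            label = classify_call(transcript)
            run.audit_trail.append(("classify", label))
        elif label == "first_line":
            # Standard question: answer it and finish.
            run.audit_trail.append(("answer", "sent standard answer"))
            run.outcome = "answered"
            break
        else:
            # Anything else (complaint, unknown) goes to a human.
            run.audit_trail.append(("escalate", "handed to human"))
            run.outcome = "escalated"
            break
    return run


run = run_voice_agent("Hi, what are your opening hours?")
print(run.outcome, run.audit_trail)
```

The step budget (max_steps) is a design choice worth keeping even in a toy version: an agent that cannot finish should stop and report "unfinished" rather than loop indefinitely.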
What I put into production
Voice AI agents (phone and voice reception)
Voice AI agents that pick up the phone for first-line questions, schedule appointments, transfer to the right person or send a note to the department. Works for reception, a practice, a hotel or a customer service team that wants to be reachable after hours.
Built with ElevenLabs or OpenAI Voice for natural voice, integrated with your calendar, CRM or phone system. Unknown questions go cleanly to a human. For customer service flows see AI customer service.
Document extraction and classification
Invoices, passports, contracts, policies, BSN forms, delivery notes. Documents currently read manually, classified and entered into a system. At volume this costs speed or accuracy.
An agent reads the document, extracts the right fields, classifies and posts to accounting or DMS. Uncertain cases go to a human with a doubt flag. For accounting see AI for accountants; for legal work see AI for lawyers.
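The doubt flag can be illustrated with a simple rule: any extracted field whose confidence falls below a threshold sends the whole document to human review. This is a sketch under assumptions. ExtractedField, needs_review, the 0.9 threshold and the sample invoice values are all made up for illustration, and how a real extraction step reports confidence depends on the model and tooling used.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as reported by the extraction step


def needs_review(fields: list[ExtractedField], threshold: float = 0.9) -> list[str]:
    """Return the names of fields whose confidence falls below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


# Made-up example invoice; values are placeholders, not real data.
invoice = [
    ExtractedField("supplier", "Example B.V.", 0.98),
    ExtractedField("total", "1210.00", 0.95),
    ExtractedField("vat_number", "(partly unreadable)", 0.62),
]

doubtful = needs_review(invoice)
status = "human_review" if doubtful else "auto_post"
print(status, doubtful)
```

Flagging per field rather than per document means the reviewer sees exactly which value to check instead of re-reading the whole invoice.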
Email and chat routing with escalation
Inbound email or chat queries currently read and forwarded by a human. Many are standard (status questions, billing issues, opening hours) but still take time away from work that really matters.
An agent reads the inbound message, fetches the answer from your knowledge base or CRM, sends a direct reply or routes to the right department. On doubt or complaint: escalate via Slack, Teams or a ticket.
Multi-step workflow agents
Workflows that tie multiple systems together and require choices along the way that simple if-then logic cannot capture. A new lead that needs qualifying, enriching and assigning to the right account manager, for example.
With n8n, Make or LangGraph I build the workflow, with Claude, GPT or Gemini as decision step where judgment is needed. Every step is loggable and testable. For pure RPA see RPA.
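What "every step is loggable and testable" can look like in custom Python is sketched here for the lead example. Each step is a plain function that can be tested on its own, and the runner writes one log entry per step. The functions qualify, enrich, assign and run_workflow, the employee thresholds and the owner names are hypothetical. In a real build the qualify step would be the model decision and enrich would call a CRM or data provider.

```python
import json
from datetime import datetime, timezone


def qualify(lead: dict) -> dict:
    # Stand-in for the model decision step; a fixed rule keeps the sketch runnable.
    return {**lead, "qualified": lead.get("employees", 0) >= 10}


def enrich(lead: dict) -> dict:
    # Stand-in for a CRM or data-provider lookup.
    segment = "mid-market" if lead.get("employees", 0) >= 50 else "small"
    return {**lead, "segment": segment}


def assign(lead: dict) -> dict:
    # Hypothetical owner mapping; unqualified leads get no owner.
    owners = {"mid-market": "account_manager_a", "small": "account_manager_b"}
    owner = owners[lead["segment"]] if lead["qualified"] else None
    return {**lead, "owner": owner}


def run_workflow(lead: dict, steps, log: list) -> dict:
    """Run the steps in order and append one log entry per step."""
    for step in steps:
        lead = step(lead)
        log.append({
            "step": step.__name__,
            "at": datetime.now(timezone.utc).isoformat(),
            "result": lead,
        })
    return lead


log = []
result = run_workflow({"company": "Example B.V.", "employees": 60},
                      [qualify, enrich, assign], log)
print(json.dumps(log, indent=2))
```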
On-premise and RAG on your own knowledge base
Companies with sensitive data or compliance requirements often cannot send documents to a cloud AI. At the same time there is huge value in an agent that knows your own manuals, contracts or wiki.
RAG systems on your own infrastructure or in an EU-only environment you control (available on request). Vector database (pgvector, Weaviate, Pinecone), open or commercial model. Audit trails are AI Act compliant; see AI Act.
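The retrieval half of RAG can be shown in a few lines: turn the question and each document into a vector, rank documents by similarity, and pass only the best match to the model as its source. This is a toy sketch. The embed function here counts words instead of calling an embedding model, the dictionary stands in for a vector database, and the two documents are made-up examples.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the ids of the top_k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents,
                    key=lambda doc_id: cosine(q, embed(documents[doc_id])),
                    reverse=True)
    return ranked[:top_k]


# Made-up knowledge base entries for illustration.
docs = {
    "returns": "customers can return products within 30 days",
    "hours": "the office is open on weekdays from nine to five",
}

sources = retrieve("when is the office open", docs)
prompt = f"Answer using only this source: {docs[sources[0]]}"
print(sources, prompt)
```

Keeping the returned document ids alongside the answer is what later makes the retrieval sources available for an audit trail.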
The stack I use
No vendor lock-in. Per use case I pick what fits your situation, your integrations and your compliance requirements.
// Language models
- Claude (Anthropic)
- GPT (OpenAI)
- Gemini (Google)
- Mistral, Llama, Qwen (open)
// Voice
- ElevenLabs
- Vapi
- Retell
- OpenAI Voice
// Workflow
- n8n
- Make
- LangGraph
- Custom Python
// RAG / vector
- pgvector
- Weaviate
- Pinecone
- Qdrant
// Integrations
- HubSpot, Salesforce, Teamleader
- Exact, Twinfield, Yuki
- WhatsApp Business, Intercom
- SharePoint, Drive, Dropbox
// Telephony
- Twilio
- Aircall
- RingCentral
- SIP trunk integrations
What it delivers
- Agentic AI agents actually running in production, not demos or showcases
- Voice AI agents with natural voice (ElevenLabs, OpenAI Voice, Vapi, Retell)
- Human-in-the-loop built in via Slack or Teams by default
- Audit trails for every decision, AI Act compliant
- Monitoring dashboard with success, escalation and error rates
- Integrations with existing CRM, accounting, telephony and email
- On-premise or EU-only deployment available on request
- RAG on your own knowledge base, no external training data
- Multi-step workflows with n8n, Make, LangGraph or custom Python
- Start small with a defined pilot, scale what works
Frequently asked questions
What is agentic AI?
Agentic AI is software that does not just answer questions like a chatbot, but autonomously plans, executes and monitors tasks. An agentic AI system has a goal, can decide which steps are needed to reach it, and can use tools (call an API, read a document, send an email) to execute those steps. The difference from classic generative AI is autonomy: an agent keeps working until the goal is achieved, instead of reacting per prompt. The difference from classic RPA is language understanding: an agent can interpret what is being asked, even when the input deviates from a fixed template.
What is an AI agent and how does it differ from a chatbot?
An AI agent is software that works with language and takes decisions or executes actions on your behalf. A chatbot only answers questions based on a script or FAQ. An agent is broader: it reads an inbound email, looks up the information needed itself, takes a decision (reply, route, escalate), and logs what it did. We build agents that run in production and handle real work every day, with monitoring and escalation to a human if confidence is too low. A chatbot is one possible application of an agent, but far from the only one.
What is voice AI and when do you use it?
Voice AI is software that understands spoken language, formulates an answer and speaks back with a natural-sounding voice. A voice AI agent can therefore pick up the phone for you, schedule an appointment, answer first-line customer questions or run an intake. For customer service that wants to be reachable outside office hours, for receptions that always get the same standard questions, or for sales intakes with a fixed script, voice AI is often cheaper and faster than an extra hire. Tools we use: ElevenLabs, OpenAI Voice, Vapi, Retell. Logging is on by default, so you can hear exactly what was said.
How do I know if an AI agent is reliable enough for production?
Reliability does not come from the model name, it comes from the design. We build agents with clear boundaries: they only get access to the tools and data they need, they operate within a defined task, and they have explicit instructions on what to do when uncertain. Before going to production we test every agent on a set of realistic scenarios, including edge cases and adversarial use. We measure success rate, hallucination rate and escalation rate. You get a dashboard that shows each week what the agent did, what went well and what did not. Only when those numbers are stable do we scale volume. Small and correct first, then scale.
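The rates mentioned above are simple shares of labelled outcomes. A minimal sketch, assuming each evaluated run has already been given one label (the label set and the weekly_rates name are illustrative, and the sample counts are made up):

```python
from collections import Counter


def weekly_rates(outcomes: list[str]) -> dict[str, float]:
    """Share of each outcome label among all evaluated runs."""
    counts = Counter(outcomes)
    total = len(outcomes)
    labels = ("success", "escalated", "hallucination", "error")
    return {label: (counts[label] / total if total else 0.0) for label in labels}


# Made-up week: ten runs, eight correct, one escalated, one hallucination.
outcomes = ["success"] * 8 + ["escalated", "hallucination"]
rates = weekly_rates(outcomes)
print(rates)
```

The hard part in practice is the labelling, not the arithmetic: someone or something has to decide which runs count as a hallucination before the rate means anything.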
What if the agent makes a mistake or misinterprets something?
We think about that upfront, not after the fact. Every agent has a fail-safe: when confidence drops below a threshold, it escalates to a human via Slack, Teams or email. For real errors (wrong answers, hallucinations, failed API calls) the incident is logged with full context (input, prompt, model output, steps taken), so you can review and adjust. We also build in a correction loop by default: users can give feedback and that feedback improves prompts and retrieval over time. An agent that never makes mistakes does not exist; an agent that makes mistakes visible and learns from them does.
How does escalation to a human actually work?
Human-in-the-loop is the rule for us, not the exception. For every agent we explicitly define when a human must step in: low confidence, sensitive decisions (finance, legal, complaints), unknown input patterns, or simply when a customer asks. Escalation goes through the channel your team already uses: Slack, Teams, a ticketing system or email. The team member receives full context, the agent's proposal, and can approve, adjust or take over with a single click. You decide how strict the thresholds are. A new agent runs stricter than an agent that has proven itself.
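The escalation triggers listed above can be written down as explicit rules, which also makes the "stricter for a new agent" idea concrete. This is a sketch under assumptions: the Decision fields, the topic set, the reason names and the two threshold values are illustrative, not fixed settings.

```python
from dataclasses import dataclass

# Topics that always go to a human, regardless of confidence.
SENSITIVE_TOPICS = {"finance", "legal", "complaint"}


@dataclass
class Decision:
    topic: str
    confidence: float  # 0.0 to 1.0
    customer_asked_for_human: bool = False
    known_pattern: bool = True


def escalation_reasons(decision: Decision, min_confidence: float) -> list[str]:
    """Return every reason this decision must go to a human (empty list: none)."""
    reasons = []
    if decision.confidence < min_confidence:
        reasons.append("low_confidence")
    if decision.topic in SENSITIVE_TOPICS:
        reasons.append("sensitive_topic")
    if not decision.known_pattern:
        reasons.append("unknown_input")
    if decision.customer_asked_for_human:
        reasons.append("customer_request")
    return reasons


# Hypothetical thresholds: a new agent needs more confidence to act alone.
NEW_AGENT, PROVEN_AGENT = 0.95, 0.80

d = Decision(topic="opening_hours", confidence=0.9)
print(escalation_reasons(d, NEW_AGENT))
print(escalation_reasons(d, PROVEN_AGENT))
```

Returning the reasons rather than a bare yes or no means the team member who receives the escalation can see why it was raised.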
What happens with our sensitive data?
For truly sensitive data we build on-premise or in an EU-only cloud environment that you control. Nothing leaves your infrastructure. For less sensitive use cases we work with providers (Anthropic, OpenAI, Google) that contractually guarantee inputs are not stored or used for training. We document per use case where the data goes, how long it is retained and who has access. For clients with strict GDPR requirements or sector-specific rules (healthcare, legal, financial) on-premise is often the best route. Vector databases like pgvector or Weaviate can run locally, as can open models like Llama or Mistral. You have the choice.
Can you integrate with our existing systems?
In most cases yes. We work daily with CRMs (HubSpot, Salesforce, Pipedrive, Teamleader), accounting systems (Exact, Twinfield, Yuki), email (Outlook, Gmail), document platforms (SharePoint, Drive, Dropbox), telephony (RingCentral, Twilio, Aircall) and chat platforms (WhatsApp Business, Intercom). If a system has an API we integrate directly. If it does not, we work through Zapier, Make, n8n or as a last resort browser automation. We always start with a short technical check so we know upfront whether an integration can be robust, or whether a workaround is needed. No surprises mid-project.
How does AI Act compliance work for agents?
The AI Act imposes requirements on logging, transparency and human oversight, especially for agents that affect people (customers, employees, applicants). We therefore build audit trails by default: every agent decision is recorded with input, output, model version, timestamp and any human approval. For agents that may fall into a high-risk category (e.g. recruitment, credit scoring or medical advice) we make logging stricter still: the prompt version and retrieval sources are also retained. We also ensure transparency to end users: they know they are talking to an agent and how to escalate to a human. You get a compliance file per agent. See AI Act for the broader context.
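The fields named above map directly onto a record structure. A minimal sketch of one audit entry, written as a JSON line: the AuditRecord class, its field names and the sample values are illustrative assumptions, not a prescribed AI Act format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    agent_input: str
    agent_output: str
    model_version: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    approved_by: Optional[str] = None  # set when a human approved the decision
    # Only filled for agents that may fall into a high-risk category.
    prompt_version: Optional[str] = None
    retrieval_sources: list = field(default_factory=list)


def to_log_line(record: AuditRecord) -> str:
    """Serialise one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


# Placeholder values for illustration only.
record = AuditRecord(
    agent_input="Can I move my appointment to Friday?",
    agent_output="Rescheduled to Friday 10:00.",
    model_version="example-model-v1",
    prompt_version="prompt-v3",
    retrieval_sources=["calendar_policy"],
)
line = to_log_line(record)
print(line)
```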
How do I start small without a months-long project?
By picking a use case that is well-defined and where the pain actually sits. Not "we want AI agents", but "our reception gets 200 booking requests per week and that costs an hour a day". Such a use case is concrete enough to put into a pilot in production with a limited user group. Then measure what it delivers in time or quality, adjust, and expand. An AI Quickscan upfront helps choose the right use case. No platform projects without a first delivery. Having something in production that works, small but real, is far more valuable than a large roadmap without a concrete outcome.
One defined use case. Working in production first, then scale.
No platform project without a first delivery. One task, one agent, in production with monitoring. Scale up only when the numbers show it works.