Kokó AI Platform — Own AI Customer Service Ecosystem
Why did we build our own AI customer service platform when the market already has a dozen chatbots? Because none of them were Hungarian-native, GDPR-compliant, and cost-effective with sub-3-second response times. This is the road from zero to production.
The problem
Hungarian SMEs and enterprises kept coming to us with AI chatbot requests. The market solutions (Intercom, Drift, ChatBot.com) shared three problems:
- Language quality: Hungarian output was choppy, sometimes outright wrong.
- GDPR risk: Customer data stored on US servers — potential fines.
- Cost: A single active widget cost 50-150k HUF/month in API calls for small businesses, hundreds of thousands for enterprises.
Customers wanted no compromises — we needed a solution. That's how Kokó was born.
The solution — Kokó architecture
Kokó is a Python-based microservices architecture built around the Gemini 2.0 Flash AI engine. The key innovation is our custom Gemini Context Caching layer: recurring knowledge-base content is cached, so only the customer-specific part of each request is sent to Gemini.
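The idea behind such a layer can be sketched in a few lines. This is an illustrative model only, not Kokó's actual code: the `ContextCache` class, the handle format, and the TTL value are all assumptions, and a production version would create the cache through Gemini's own context-caching API rather than a local dict.

```python
import hashlib
import time


class ContextCache:
    """Minimal sketch of a context-cache layer: the static knowledge base
    is keyed by a content hash, so repeat requests reuse a cached context
    handle and only the short customer turn travels with each call."""

    def __init__(self, ttl_seconds: int = 3600):
        self.ttl = ttl_seconds
        self._entries = {}  # content hash -> (cache handle, expiry time)

    def _key(self, knowledge_base: str) -> str:
        return hashlib.sha256(knowledge_base.encode("utf-8")).hexdigest()

    def get_or_create(self, knowledge_base: str):
        key = self._key(knowledge_base)
        entry = self._entries.get(key)
        now = time.time()
        if entry and entry[1] > now:
            return entry[0], True            # cache hit: reuse the handle
        handle = f"cache/{key[:12]}"         # placeholder for a real cache id
        self._entries[key] = (handle, now + self.ttl)
        return handle, False                 # cache miss: context uploaded once


def build_request(cache: ContextCache, knowledge_base: str, user_message: str):
    """Assemble a request that references the cached context instead of
    resending the full knowledge base."""
    handle, hit = cache.get_or_create(knowledge_base)
    return {"cached_context": handle, "contents": user_message, "cache_hit": hit}


cache = ContextCache()
kb = "FAQ: shipping takes 2-3 days. Returns accepted within 14 days. ..."
r1 = build_request(cache, kb, "How long does shipping take?")
r2 = build_request(cache, kb, "Can I return my order?")
print(r1["cache_hit"], r2["cache_hit"])  # False True
```

The first request pays the full context-upload cost; every later question against the same knowledge base reuses the handle, which is where the API savings come from.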
Main components
- Chat widget (embeddable JS) — on Chatwoot base
- Email integration — auto-response generator for incoming mail
- Discord channel integration — direct customer service on Discord
- Scheduling system — over Google Calendar API
- Knowledge base manager — admin UI for FAQ and knowledge updates
- Analytics dashboard — conversation categories, success rates, satisfaction
Development timeline
Week 1 — Discovery
BMAD Business Discovery: precise KPIs, architecture, data-flow planning
Weeks 2-3 — Core chat engine
Gemini integration, Chatwoot widget, base conversation flow
Week 4 — Context Caching
The most critical part: cache layer development and benchmarking
Week 5 — Integrations
Email, Discord, Google Calendar, scheduling UI
Week 6 — Admin + Analytics
Knowledge manager interface, dashboard, config tools
Weeks 7-8 — Tuning + pilot
Go-live at 2 clients, data-driven optimization
Challenge: the "What's my order status?" problem
During the first pilot at a webshop client, Kokó surfaced a surprising bug: when a customer asked "what's my order status?", the bot would reply "being delivered" at random, without ever actually checking the order.
Root cause: Gemini hallucinated because function calling for order data wasn't implemented yet. We fixed it by adding a function-calling layer (tool use): when Gemini requests order data, the backend queries the actual order and feeds the result back to the model.
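The fix boils down to a dispatch layer like the one below. It is a simplified sketch, not Kokó's implementation: `ORDERS`, `get_order_status`, and the order IDs are hypothetical stand-ins for the webshop backend, and the "model turn" is simulated rather than coming from a real Gemini response.

```python
import json

# Hypothetical order store standing in for the webshop backend.
ORDERS = {"A-1042": {"status": "shipped", "eta": "2025-06-12"}}


def get_order_status(order_id: str) -> dict:
    """Tool implementation: look up the real order instead of letting
    the model guess an answer."""
    order = ORDERS.get(order_id)
    if order is None:
        return {"error": f"unknown order {order_id}"}
    return order


# Registry mapping tool names (as declared to the model) to backend functions.
TOOLS = {"get_order_status": get_order_status}


def handle_tool_call(call: dict) -> str:
    """Dispatch a model-issued tool call to the backend and return the
    result as JSON, ready to be fed back into the conversation."""
    fn = TOOLS[call["name"]]
    result = fn(**call["args"])
    return json.dumps(result)


# Simulated model turn: instead of answering directly, the model asks for
# the get_order_status tool with the order id it extracted from the chat.
model_call = {"name": "get_order_status", "args": {"order_id": "A-1042"}}
print(handle_tool_call(model_call))
```

The key property is that the status string in the final answer can only come from the backend lookup, never from the model's imagination.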
Result — in numbers
- 75% API cost reduction — at typical 500 questions/day per customer, the monthly Gemini bill drops from ~45k HUF to 11k HUF
- <3 second average response time — thanks to the cache
- 24/7 availability — zero human intervention
- 67% automation rate — 2/3 of questions handled by the bot, 1/3 escalated to humans
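The headline savings figure can be sanity-checked with back-of-the-envelope arithmetic. The inputs are the numbers quoted above (500 questions/day, ~45k vs ~11k HUF/month); everything else is simple division.

```python
# Sanity check of the cost-reduction claim using the figures quoted above.
questions_per_day = 500
baseline_huf_per_month = 45_000   # estimated Gemini bill without the cache
cached_huf_per_month = 11_000     # bill with the context-cache layer

monthly_questions = questions_per_day * 30
reduction = 1 - cached_huf_per_month / baseline_huf_per_month
per_question_huf = cached_huf_per_month / monthly_questions

print(f"cost reduction:    {reduction:.1%}")
print(f"per-question cost: {per_question_huf:.2f} HUF")
```

This works out to roughly 75% savings and well under 1 HUF per answered question at the cached rate.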
What this delivered to our clients
Clients using Kokó typically:
- Break even on the platform within 1-2 months
- Free up 1-3 FTEs, moving customer-service time to higher-value work
- See customer satisfaction (NPS) improve by ~15 points thanks to 24/7 availability
Lessons for MyForge Labs
Kokó was our first major BMAD-method project. What we learned:
- The BMAD Architect agent flagged the cache-layer need in design. Without it the product would cost 3× more today.
- The QA agent's systematic pre-mortem caught four critical bugs early (e.g. the "order status" hallucination we already suspected at the architecture stage).
- Own product = more freedom, but more internal priority conflicts too. The Product Owner agent helped a lot with the "what NOT to build now" question.
Kokó currently runs at 5 active clients, handling 150,000+ messages per month. Target for end of 2026: 15-20 clients.
Want to pilot Kokó at your company?
2-week pilot, free — if it works, keep it; if not, walk away with no obligation.