Own product · AI integration

Kokó AI Platform — Own AI Customer Service Ecosystem

Why build our own AI customer service platform when the market already offers a dozen chatbots? Because none of them was Hungarian-native, GDPR-compliant, and cost-effective at a sub-3-second response time. This is the road from zero to production.

75%
API cost saved
<3s
Response time
24/7
Availability
10+
Microservices
Python · Gemini 2.0 Flash · Chatwoot · Docker · PostgreSQL · Redis · Google Calendar API · Discord.py

The problem

Hungarian SMEs and enterprises kept coming to us with AI chatbot requests. The market solutions (Intercom, Drift, ChatBot.com) shared three problems:

  1. Language quality: Hungarian output was choppy, sometimes outright wrong.
  2. GDPR risk: Customer data stored on US servers — potential fines.
  3. Cost: A single active widget cost 50-150k HUF/month in API calls for small businesses, hundreds of thousands for enterprises.

Customers wanted no compromises — we needed a solution. That's how Kokó was born.

The solution — Kokó architecture

Kokó is a Python-based microservices architecture built around a Gemini 2.0 Flash AI engine. The key innovation is our custom Gemini Context Caching layer: the repeatable knowledge-base elements are cached, and only the customer-specific parts are sent to Gemini.

This cache layer delivered the 75% cost reduction. A typical customer question's token footprint is 3,100 tokens: 3,000 for the knowledge base plus 100 for the question itself. Roughly 2,500 of those 3,000 knowledge-base tokens repeat from question to question, so we cache them for 30 days. Only about 600 tokens (500 knowledge-base + 100 question) go to Gemini fresh per request.
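The arithmetic behind the saving can be sketched in a few lines. The token counts come from the numbers above; how cached tokens are actually discounted depends on Gemini's pricing, so the snippet only computes the cache hit share, not the final bill:

```python
# Token arithmetic behind the cache layer (counts from the case study above).

kb_tokens = 3000        # knowledge-base context needed for every question
question_tokens = 100   # the customer's actual question
cached_tokens = 2500    # repeatable KB portion, cached with a 30-day TTL

total_tokens = kb_tokens + question_tokens                     # 3100 per request
fresh_tokens = (kb_tokens - cached_tokens) + question_tokens   # 600 sent fresh

cached_share = cached_tokens / total_tokens
print(f"served from cache: {cached_share:.0%}")        # → served from cache: 81%
print(f"sent fresh: {fresh_tokens} tokens per request")
```

With roughly 81% of input tokens served from cache, and cached tokens billed at a fraction of the full rate (minus cache-storage overhead), the net saving lands around the reported 75%.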

Main components

Development timeline

Week 1 — Discovery

BMAD Business Discovery: precise KPIs, architecture, data-flow planning

Weeks 2-3 — Core chat engine

Gemini integration, Chatwoot widget, base conversation flow

Week 4 — Context Caching

The most critical part: cache layer development and benchmarking

Week 5 — Integrations

Email, Discord, Google Calendar, scheduling UI

Week 6 — Admin + Analytics

Knowledge manager interface, dashboard, config tools

Weeks 7-8 — Tuning + pilot

Go-live at 2 clients, data-driven optimization

Challenge: the "What's my order status?" problem

During the first pilot at a webshop client, Kokó surfaced a surprising bug: when a customer asked "what's my order status?", the bot would randomly reply "being delivered", even though it had never actually checked the order.

Root cause: function calling for order data wasn't implemented, so Gemini hallucinated an answer. We fixed this by adding a function-calling (tool-use) layer: when Gemini requests it, the backend queries the actual order.

Lesson: however good an LLM is, without deterministic data access it will hallucinate. Tool/Function Calling is not optional — it's critical.
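The fix can be sketched as a thin, deterministic tool layer. The names here (`get_order_status`, the in-memory `ORDERS` table) are illustrative, not the production schema; the point is that the model emits a structured tool call and the backend, not the LLM, resolves the answer:

```python
# Sketch of a deterministic tool layer: the LLM never answers order
# questions from its own weights -- it emits a structured tool call,
# the backend resolves it, and the result goes back into the reply.

ORDERS = {"HU-1042": "shipped", "HU-1043": "processing"}  # stand-in for the webshop DB

def get_order_status(order_id: str) -> str:
    """Deterministic backend lookup -- the single source of truth."""
    return ORDERS.get(order_id, "unknown order")

TOOLS = {"get_order_status": get_order_status}

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to the real backend function."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["args"])

# Given the tool schema, the model emits something like this,
# and the grounded result replaces the hallucinated guess:
call = {"name": "get_order_status", "args": {"order_id": "HU-1042"}}
print(dispatch(call))  # → shipped
```

An unknown order ID falls through to "unknown order" rather than a guess, which is exactly the deterministic behavior the lesson above calls for.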

Result — in numbers

What this delivered to our clients

Clients using Kokó typically:

"Since Kokó launched, the support team finally works on complaints, not simple questions. We're simply using human energy in better places now." — customer service lead, fintech partner (name anonymized under NDA)

Lessons for MyForge Labs

Kokó was our first major BMAD-method project. What we learned:

  1. The BMAD Architect agent flagged the cache-layer need in design. Without it the product would cost 3× more today.
  2. The QA agent's systematic pre-mortem caught 4 critical bugs early (e.g. the "order status" hallucination, which we had already suspected at the architecture stage).
  3. Own product = more freedom, but more internal priority conflicts too. The Product Owner agent helped a lot with the "what NOT to build now" question.

Kokó currently runs at 5 active clients, handling 150,000+ messages per month. Target for end of 2026: 15-20 clients.

Want to pilot Kokó at your company?

2-week pilot, free — if it works, keep it; if not, walk away with no obligation.

Related pages