Automated FAQ Extraction (Knowledge Governance System)

Situation

At Viator, the “industry standard” approach of clustering chat logs and summarizing answers failed in production. A pilot revealed that models would hallucinate questions to fit positive sentiment (e.g., claiming food was included because a chat mentioned pets). Because customers are “chaotic browsers” who often chat about one product while viewing another, 30% of support data was contaminated. This led to staleness, inconsistency, and poor coverage for long-tail products.

Task

Reframe from a simple prompting problem to a system design problem. Build a “Knowledge Lifecycle” system that keeps FAQs true over time and at scale. Targets: 90% reduction in hallucinations and 30% reduction in informational support tickets.

Action

Pillar 1: Self-Adaptive RAG (Five-Stage Pipeline)

Query Analysis: Classify incoming query by intent type (policy, product-specific, general)
Source Selection: Route to appropriate knowledge base (policies, product data, chat history)
Retrieval: Hybrid semantic + lexical search with MMR diversification
Source Authority Weighting: Score sources by tier - Gold (official policies, weight=1.0), Silver (structured product data, weight=0.7), Bronze (chat logs, weight=0.3)
Self-Critique: Retrieved context evaluated for relevance and sufficiency; re-retrieve if confidence <0.8

Pillar 2: GraphRAG Knowledge Structure

Ontology Schema:

Entity Types: Product, Supplier, Policy, FAQ, Attribute
Relationship Types: HAS_POLICY, SUPERSEDES, INHERITS_FROM, APPLIES_TO
Schema Evolution: New entity/relationship types require PR review; backward-compatible migrations only

Truth Propagation Example:
When “Weather Policy” node updates to allow cancellations for >30C heat, all linked Product nodes’ FAQs are flagged for regeneration. Graph traversal ensures no orphaned FAQs.

Pillar 3: Agentic Validation (Coordination Architecture)

Agent	Trigger	Action	Output	Monitor
Detect	Hourly cron + webhook on data change	Detect price/policy/API drift	Change event to queue	Change event received
Validate	Change event received	Check FAQ accuracy against current data	Validation score + delta	Validation score + delta
Correct	Validation score <0.9 or delta >threshold	Trigger FAQ regeneration	New FAQ candidate	New FAQ candidate
Govern	New FAQ candidate	Compare to previous; enforce quality gates	Publish or human review	Quality gate outcomes

Coordination: Redis Streams for event queue; agents are stateless workers consuming from streams. Idempotency keys prevent duplicate processing. Circuit breaker prevents cascade failures (max 100 regenerations/hour).

Results

Metric	Baseline	System	Method
Hallucination Rate	23%	2.1%	Manual audit of 500 FAQs
Ticket Deflection	—	34%	A/B test: FAQ-enabled vs. control
Product Coverage (Day 1)	12%	81%	Products with >=3 FAQs
FAQ Freshness	45 days avg	< 24 hours	Time from source change to FAQ update
LLM Latency (serving)	2.1s	0ms	Cache hit rate 99.2%

Sibling Transfer Validation

Similarity threshold: Products must share >80% attribute overlap
Validation: Transferred FAQ passed through Validate agent with source product context
Confidence scoring: Transferred FAQs marked with confidence; <0.85 triggers human review
Audit: 10% random sample of transfers reviewed weekly; current accuracy: 91%

System Design & Architecture

Offline Generation / Online Serving Split

Offline: FAQ generation runs async; no latency impact on user experience
Online: FAQs served from Redis cache; cache miss falls back to Elasticsearch
Consistency: Write-through cache; FAQ updates atomic with cache invalidation

Graph Update Consistency

Transaction Model: Graph updates wrapped in Neo4j transactions; rollback on partial failure
Batch Processing: Downstream regenerations queued and processed in batches of 50
Progress Tracking: Each regeneration job tracked; incomplete jobs resumed on worker restart
Consistency Check: Daily job verifies all FAQs are consistent with current graph state

Risks & Mitigations

Risk	Impact	Mitigation	Monitoring
Cascading Regeneration	Policy change triggers thousands of updates	Circuit breaker (100/hr max); priority queue for high-traffic products	Regeneration queue depth alerts
False Positive Floods	Minor price changes trigger unnecessary work	Dampening layer: ignore changes <5% or within normal variance	False positive rate tracking
Regression on Correction	New FAQ worse than stale original	Critic score comparison; auto-rollback if score drops >10%	Before/after score distribution
Entity Resolution	Sibling transfer fails on entity mismatch	Fuzzy matching + manual entity linking for top suppliers	Entity resolution accuracy dashboard
Graph Staleness	Reality changes faster than graph updates	Agentic monitoring loop; webhook integration with data sources	Graph freshness SLA monitoring