Key Contributions
- Designed hybrid retrieval - BM25 + vector search with Reciprocal Rank Fusion, intent and treatment-stage boosting
- Built multi-layer guardrails: input safety, output validation, dialog monitoring, medical safety, faithfulness checking
- Introduced ChromaDB and built the ingestion pipeline from Notion CSV exports
- Designed two-tier response caching (L1 in-memory + L2 Redis) - reduced repeat query latency from ~5s to under 10ms
- Integrated Langfuse for full request tracing with per-stage spans and confidence scoring
- Implemented circuit breaker pattern on guardrails for resilience
- Built comprehensive test suite across unit, integration, and API layers
Production system serving real patients at an IVF clinic. Evaluation test cases across multiple datasets for regression testing.






