Designing an LLM Serving Architecture: Batching, Caching & Autoscaling
Parvesh Sandila / November 17, 2025