Scaling a Vector Search Pipeline: Sharding and Latency Optimization