Document Ingestion Patterns: PDFs, HTML, Audio, Logs