Intelligent document processing (IDP) is the practice of extracting structured data from semi-structured documents (invoices, claims, contracts, MSAs, KYC forms) using vision-capable language models that read context, not just characters. Traditional OCR (Tesseract, baseline AWS Textract) converts pixels to text without knowing which text is a vendor name and which is an invoice total; IDP instead outputs structured JSON keyed to the business fields the downstream system expects. Unlike rule-based extraction (regex over text-only OCR output), IDP handles layout variation, table extraction, multi-page joins, and language switching without rule rewrites. Common stacks pair a vision-capable model such as GPT-5 or Claude Sonnet 4.6 with Unstructured.io or Azure Document Intelligence for layout preprocessing, attach a confidence score to each extracted field, route fields below 0.85 confidence to a human reviewer, and persist every extraction with the hash of its source document for audit.
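The last two steps of that stack, confidence routing and audit persistence, can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `ExtractedField`, `route_fields`, and `build_audit_record` are hypothetical, SHA-256 is an assumed choice of hash (the text only says "source document hash"), and the confidence values are assumed to arrive already normalized to the 0 to 1 range from whatever model or layout service produced them.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

# Threshold taken from the text: fields below it go to a human reviewer.
REVIEW_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    """One business field returned by the extraction step (hypothetical shape)."""
    name: str
    value: str
    confidence: float  # assumed to be normalized to [0, 1]


def source_hash(document_bytes: bytes) -> str:
    """Hex digest of the raw document bytes; SHA-256 is an assumption."""
    return hashlib.sha256(document_bytes).hexdigest()


def route_fields(fields, threshold=REVIEW_THRESHOLD):
    """Split fields into (accepted, needs_review) by per-field confidence."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


def build_audit_record(document_bytes, fields, threshold=REVIEW_THRESHOLD):
    """Bundle the routing result with the source hash so it can be persisted."""
    accepted, needs_review = route_fields(fields, threshold)
    return {
        "source_sha256": source_hash(document_bytes),
        "threshold": threshold,
        "accepted": [asdict(f) for f in accepted],
        "needs_review": [asdict(f) for f in needs_review],
    }


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample_fields = [
        ExtractedField("vendor_name", "Acme Corp", 0.97),
        ExtractedField("invoice_total", "1200.00", 0.62),
    ]
    record = build_audit_record(b"%PDF-1.7 sample bytes", sample_fields)
    print(json.dumps(record, indent=2))
```

In this sketch a field sitting exactly at 0.85 is accepted, since the text routes only fields below the threshold to review. Routing per field rather than per document means a reviewer sees only the uncertain values, while the stored hash ties every accepted or corrected value back to the exact file it came from.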