AI Agent for Document Processing & Invoice Automation
Your accounts payable team processes hundreds of invoices a month. Most of that work is mind-numbing data entry. Here's how to build an AI agent that handles extraction, validation, and booking — while humans handle exceptions.
The Problem Nobody Talks About
Every company has a dirty secret: somewhere in the building, someone is manually typing invoice numbers into a spreadsheet. They're squinting at PDF scans, copy-pasting amounts, cross-referencing purchase orders, and praying they don't transpose a digit.
The average accounts payable clerk processes 5-12 invoices per hour. That's not because they're slow — it's because every invoice is different. Different layouts. Different languages. Handwritten notes. Blurry scans. Credit notes that look nothing like invoices.
Traditional OCR "solutions" promised to fix this a decade ago. They didn't. Template-based OCR breaks every time a supplier changes their invoice layout. Rule-based extraction fails on edge cases. And "AI-powered" SaaS tools charge $2-5 per document while still requiring human review on 30%+ of invoices.
Here's the thing: LLMs are embarrassingly good at reading documents. Not because of OCR — because they understand context. They know that the number next to "Total incl. BTW" is the total amount. They know that "Factuurnummer" is Dutch for invoice number. They handle messy layouts, mixed languages, and weird formatting without templates.
You can build a document processing agent for $15-40/month that outperforms most enterprise solutions. Let me show you how.
Architecture: Three Layers
📥 INTAKE LAYER (collection + conversion)
├── Email attachment watcher (IMAP/Gmail API)
├── Shared drive scanner (Google Drive / SharePoint)
├── Upload portal (web form / Slack bot)
└── PDF → image conversion (for scans)
🧠 INTELLIGENCE LAYER (extraction + validation)
├── Vision LLM for scanned documents
├── Text LLM for digital PDFs
├── Structured output (JSON schema)
├── PO matching & validation rules
└── Anomaly detection (duplicate, amount mismatch)
📊 ACTION LAYER (booking + routing)
├── Accounting system push (Exact, Twinfield, Xero, QBO)
├── Approval workflow (Slack/email for exceptions)
├── Document archive (tagged, searchable)
└── Dashboard & reporting
The key insight: you don't need one monolithic system. You need three independent layers that talk to each other through structured data. If your intake changes (new email provider), the intelligence layer doesn't care. If you switch accounting systems, only the action layer changes.
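One way to make that contract concrete is a pair of plain data structures that travel between layers. This is a minimal sketch, not a prescribed design: the class names `QueuedDocument` and `ExtractedInvoice` are hypothetical, and the fields are a subset of the extraction schema used later in this article.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class QueuedDocument:
    """What the intake layer hands to the intelligence layer."""
    path: str
    source: str        # 'email', 'drive', or 'upload'
    received_at: str   # ISO 8601 timestamp

@dataclass
class ExtractedInvoice:
    """What the intelligence layer hands to the action layer."""
    invoice_number: str
    supplier_name: str
    total_amount: float
    currency: str = 'EUR'
    issues: list = field(default_factory=list)

doc = QueuedDocument(path='queue/1_acme.pdf', source='email',
                     received_at='2025-01-15T09:30:00')
invoice = ExtractedInvoice(invoice_number='2025-001',
                           supplier_name='Acme BV', total_amount=121.0)

# Each layer only ever sees serialized data, never another layer's internals
payload = json.dumps(asdict(invoice))
```

Because the action layer only consumes `payload`, swapping the accounting system should not require touching intake or extraction code.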
Step 1: Document Intake
Most invoices arrive in one of three ways: email attachments, shared drive uploads, or physical mail (scanned). Your agent needs to watch all three.
Email Watcher
The simplest approach: a cron job that checks a dedicated inbox every 5 minutes.
import email
import imaplib
import os
from pathlib import Path

def check_invoice_inbox():
    # Assumption: the app password is supplied via an environment variable
    app_password = os.environ['INVOICE_INBOX_APP_PASSWORD']
    mail = imaplib.IMAP4_SSL('imap.gmail.com')
    mail.login('invoices@yourcompany.com', app_password)
    mail.select('INBOX')
    # Search for unprocessed emails
    _, messages = mail.search(None, 'UNSEEN')
    for msg_id in messages[0].split():
        _, msg_data = mail.fetch(msg_id, '(RFC822)')
        msg = email.message_from_bytes(msg_data[0][1])
        for part in msg.walk():
            if part.get_content_type() == 'application/pdf':
                # Attachments can lack a filename; also strip any path components
                filename = Path(part.get_filename() or 'attachment.pdf').name
                pdf_bytes = part.get_payload(decode=True)
                # Save to processing queue (msg_id is bytes, so decode it first)
                path = Path(f'queue/{msg_id.decode()}_{filename}')
                path.write_bytes(pdf_bytes)
                # Add metadata (save_metadata is a helper defined elsewhere)
                save_metadata(path, {
                    'from': msg['From'],
                    'date': msg['Date'],
                    'subject': msg['Subject'],
                    'source': 'email'
                })
    mail.logout()
Set up a dedicated email address like invoices@yourcompany.com and tell suppliers to send there. This keeps your agent's intake clean and avoids processing random PDFs from marketing emails.
Drive Scanner
For companies that use a shared "Invoices" folder on Google Drive or SharePoint:
from googleapiclient.discovery import build

def scan_drive_folder(folder_id):
    # creds: Google API credentials, assumed to be loaded elsewhere
    service = build('drive', 'v3', credentials=creds)
    # Only the first fragment is an f-string, so the literal { } of the
    # appProperties clause is not parsed as a placeholder. The Drive query
    # syntax needs both a key and a value in that clause.
    results = service.files().list(
        q=(f"'{folder_id}' in parents and mimeType='application/pdf' "
           "and not appProperties has { key='processed' and value='queued' }"),
        fields='files(id, name, createdTime)'
    ).execute()
    for file in results.get('files', []):
        # Download PDF
        content = service.files().get_media(fileId=file['id']).execute()
        # Queue for processing
        queue_document(content, file['name'], source='drive')
        # Mark as picked up
        service.files().update(
            fileId=file['id'],
            body={'appProperties': {'processed': 'queued'}}
        ).execute()
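Both intake snippets call helpers that the article does not define: `save_metadata` and `queue_document`. A minimal sketch of what they might look like, assuming a local `queue/` folder and JSON sidecar files; the sidecar naming and the timestamp prefix are illustrative choices, not the original author's implementation.

```python
import json
import time
from pathlib import Path

QUEUE_DIR = Path('queue')  # assumed location, matching the snippets above

def save_metadata(pdf_path, metadata):
    """Write metadata next to the PDF as <name>.pdf.json."""
    sidecar = Path(str(pdf_path) + '.json')
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar

def queue_document(content, name, source, queue_dir=QUEUE_DIR):
    """Store raw PDF bytes in the queue and record where they came from."""
    queue_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the file name, and prefix a timestamp to reduce name collisions
    safe_name = Path(name).name
    path = queue_dir / f'{int(time.time() * 1000)}_{safe_name}'
    path.write_bytes(content)
    save_metadata(path, {'source': source, 'original_name': name})
    return path
```

The sidecar ends in `.json`, so a later `glob('*.pdf')` over the queue picks up only the documents themselves.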
Step 2: Intelligent Extraction
This is where AI agents destroy traditional OCR. Instead of template matching, you send the document to an LLM and ask it to extract structured data.
For Digital PDFs (text-based)
import anthropic
import json
from pypdf import PdfReader

client = anthropic.Anthropic()

def extract_invoice_data(pdf_path):
    # Extract text from PDF
    reader = PdfReader(pdf_path)
    text = '\n'.join(page.extract_text() for page in reader.pages)
    response = client.messages.create(
        model='claude-sonnet-4-20250514',
        max_tokens=2000,
        messages=[{
            'role': 'user',
            'content': f"""Extract invoice data from this document. Return valid JSON only.
Document text:
{text}
Required fields:
{{
  "invoice_number": "string",
  "invoice_date": "YYYY-MM-DD",
  "due_date": "YYYY-MM-DD or null",
  "supplier_name": "string",
  "supplier_vat": "string or null",
  "supplier_iban": "string or null",
  "currency": "EUR/USD/GBP",
  "line_items": [
    {{
      "description": "string",
      "quantity": number,
      "unit_price": number,
      "vat_rate": number,
      "line_total": number
    }}
  ],
  "subtotal": number,
  "vat_amount": number,
  "total_amount": number,
  "po_reference": "string or null",
  "payment_reference": "string or null",
  "notes": "string or null",
  "confidence": "high/medium/low"
}}"""
        }]
    )
    return json.loads(response.content[0].text)
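The prompt asks for "valid JSON only", but models sometimes wrap the answer in a Markdown fence or add a sentence around it, and a bare `json.loads` then raises. A minimal sketch of a more forgiving parser; `parse_json_response` is a hypothetical helper, not part of the Anthropic SDK.

```python
import json

FENCE = '`' * 3  # a Markdown code fence marker

def parse_json_response(raw_text):
    """Parse model output as JSON, tolerating Markdown fences and stray prose."""
    text = raw_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with optional language tag)...
        text = text.split('\n', 1)[1] if '\n' in text else ''
        # ...and everything from the closing fence onwards
        text = text.rsplit(FENCE, 1)[0]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to the outermost {...} span in the text
        start, end = text.find('{'), text.rfind('}')
        if start == -1 or end <= start:
            raise
        return json.loads(text[start:end + 1])
```

In the extraction function, `parse_json_response(response.content[0].text)` could stand in for the final `json.loads` call. Output that still fails to parse is a good candidate for the exception path rather than a silent retry.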
For Scanned Documents (image-based)
import base64
import io
from pdf2image import convert_from_path

def extract_scanned_invoice(pdf_path):
    # Convert PDF pages to images
    images = convert_from_path(pdf_path, dpi=200)
    # Encode first page (most invoices are single-page)
    img_buffer = io.BytesIO()
    images[0].save(img_buffer, format='PNG')
    img_b64 = base64.standard_b64encode(img_buffer.getvalue()).decode()
    response = client.messages.create(
        model='claude-sonnet-4-20250514',
        max_tokens=2000,
        messages=[{
            'role': 'user',
            'content': [
                {
                    'type': 'image',
                    'source': {
                        'type': 'base64',
                        'media_type': 'image/png',
                        'data': img_b64
                    }
                },
                {
                    'type': 'text',
                    'text': 'Extract all invoice data from this scanned document. Return valid JSON with: invoice_number, invoice_date, due_date, supplier_name, supplier_vat, line_items (description, quantity, unit_price, vat_rate, line_total), subtotal, vat_amount, total_amount, po_reference, confidence.'
                }
            ]
        }]
    )
    return json.loads(response.content[0].text)
Vision API calls cost more than text. A typical invoice image uses ~1,500 input tokens. At Claude Sonnet pricing, that's about $0.005 per invoice in input tokens; the JSON the model returns is billed on top as output tokens. At 500 invoices/month, you're looking at roughly $2.50 for image input, still on the order of 100x cheaper than enterprise OCR platforms.
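Since the vision path costs more, it makes sense to use it only when a PDF has no usable text layer. A minimal sketch of that routing decision, assuming you already have one extracted string per page (for example from pypdf's `page.extract_text()`); the function names and the 50-character threshold are hypothetical and should be tuned on your own documents.

```python
def needs_vision(page_texts, min_chars_per_page=50):
    """Return True when the embedded text layer is too thin to trust.

    page_texts: one string per page, e.g. from pypdf's page.extract_text().
    min_chars_per_page: hypothetical threshold; tune it on your own invoices.
    """
    if not page_texts:
        return True
    total_chars = sum(len((text or '').strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page

def choose_extractor(page_texts):
    """Name which of this article's extraction functions to call."""
    if needs_vision(page_texts):
        return 'extract_scanned_invoice'
    return 'extract_invoice_data'
```

A scanned PDF usually yields empty or near-empty page text, so it falls through to the image-based function, while digital PDFs stay on the cheaper text path.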
Step 3: Validation & PO Matching
Extraction is only half the battle. The real value is validation — catching errors before they hit your books.
def validate_invoice(extracted_data, po_database):
    # db is assumed to be a module-level database handle defined elsewhere
    issues = []

    # 1. Math check: do line items add up?
    calculated_subtotal = sum(
        item['line_total'] for item in extracted_data['line_items']
    )
    if abs(calculated_subtotal - extracted_data['subtotal']) > 0.01:
        issues.append({
            'type': 'math_error',
            'severity': 'high',
            'detail': f"Line items sum to {calculated_subtotal}, subtotal says {extracted_data['subtotal']}"
        })

    # 2. VAT check: correct rates?
    for item in extracted_data['line_items']:
        if item['vat_rate'] not in [0, 9, 21]:  # NL rates
            issues.append({
                'type': 'vat_rate',
                'severity': 'medium',
                'detail': f"Unusual VAT rate: {item['vat_rate']}% on {item['description']}"
            })

    # 3. Duplicate check
    existing = db.query(
        "SELECT * FROM invoices WHERE invoice_number = ? AND supplier_name = ?",
        [extracted_data['invoice_number'], extracted_data['supplier_name']]
    )
    if existing:
        issues.append({
            'type': 'duplicate',
            'severity': 'critical',
            'detail': f"Invoice {extracted_data['invoice_number']} already processed on {existing[0]['processed_date']}"
        })

    # 4. PO matching
    if extracted_data.get('po_reference'):
        po = po_database.get(extracted_data['po_reference'])
        if po:
            if abs(extracted_data['total_amount'] - po['amount']) > po['amount'] * 0.05:
                issues.append({
                    'type': 'po_mismatch',
                    'severity': 'high',
                    'detail': f"Invoice total {extracted_data['total_amount']} differs from PO amount {po['amount']} by more than 5%"
                })
        else:
            issues.append({
                'type': 'po_not_found',
                'severity': 'medium',
                'detail': f"PO reference {extracted_data['po_reference']} not found in system"
            })

    # 5. Supplier verification
    known_supplier = db.query(
        "SELECT * FROM suppliers WHERE vat_number = ?",
        [extracted_data.get('supplier_vat')]
    )
    if not known_supplier:
        issues.append({
            'type': 'unknown_supplier',
            'severity': 'medium',
            'detail': f"Supplier {extracted_data['supplier_name']} not in approved supplier list"
        })

    return {
        'valid': len([i for i in issues if i['severity'] in ['high', 'critical']]) == 0,
        'issues': issues,
        'auto_bookable': len(issues) == 0
    }
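The math check above compares floats, which carry tiny binary rounding errors. Those rarely matter, but they can tip a difference that sits right at the one-cent tolerance. A minimal sketch of the same check using `decimal.Decimal`; the function name `lines_match_subtotal` is hypothetical.

```python
from decimal import Decimal

def lines_match_subtotal(line_totals, subtotal, tolerance=Decimal('0.01')):
    """Check that line totals add up to the subtotal within a tolerance.

    Values go through str() so that 0.1 becomes Decimal('0.1') rather than
    the binary float approximation of 0.1.
    """
    calculated = sum((Decimal(str(t)) for t in line_totals), Decimal('0'))
    return abs(calculated - Decimal(str(subtotal))) <= tolerance
```

The tolerance mirrors the `> 0.01` threshold in `validate_invoice`: a difference of up to one cent passes, anything larger should be raised as a `math_error` issue.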
Step 4: Accounting System Integration
Once validated, the agent pushes data to your accounting system. Here's the pattern for common Dutch/EU systems:
Exact Online (Netherlands)
import requests

def book_to_exact(invoice_data, exact_config):
    headers = {
        'Authorization': f'Bearer {exact_config["token"]}',
        'Content-Type': 'application/json'
    }
    # Create purchase invoice
    payload = {
        'Journal': exact_config['purchase_journal'],
        'InvoiceNumber': invoice_data['invoice_number'],
        'InvoiceDate': invoice_data['invoice_date'],
        'Supplier': resolve_supplier_id(invoice_data['supplier_name']),
        'Currency': invoice_data['currency'],
        'Description': f"Invoice {invoice_data['invoice_number']} - {invoice_data['supplier_name']}",
        'PurchaseInvoiceLines': [
            {
                # Assumption: the chart of accounts is carried in exact_config,
                # since map_gl_account (below) takes it as its second argument
                'GLAccount': map_gl_account(item['description'], exact_config['chart_of_accounts']),
                'Description': item['description'],
                'AmountFC': item['line_total'],
                'VATCode': map_vat_code(item['vat_rate']),
                'Quantity': item['quantity'],
                'UnitPrice': item['unit_price']
            }
            for item in invoice_data['line_items']
        ]
    }
    response = requests.post(
        f'{exact_config["base_url"]}/api/v1/{exact_config["division"]}/purchaseinvoice/PurchaseInvoices',
        headers=headers,
        json=payload
    )
    # Surface HTTP errors instead of parsing an error body as a booking result
    response.raise_for_status()
    return response.json()
GL Account Mapping with AI
One of the trickiest parts: mapping invoice line items to the right general ledger account. Instead of maintaining a massive rules table, let the LLM do it:
def map_gl_account(description, chart_of_accounts):
    response = client.messages.create(
        # A small, fast model is enough for classification; substitute
        # whichever Haiku-class model ID is current for your account
        model='claude-3-5-haiku-20241022',
        max_tokens=100,
        system=f"""You are an accounting classifier. Given a chart of accounts and an invoice line item description, return the most appropriate GL account code. Return ONLY the account code, nothing else.
Chart of accounts:
{json.dumps(chart_of_accounts)}""",
        messages=[{
            'role': 'user',
            'content': f'Classify: "{description}"'
        }]
    )
    return response.content[0].text.strip()
Every time a human corrects a GL mapping, log it. After 50-100 corrections, you have a solid set of few-shot examples (and the beginnings of a fine-tuning dataset, should you ever need one). Add the corrections to the system prompt as examples, and accuracy jumps from ~85% to 95%+ within weeks.
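The feedback loop can be as simple as rebuilding the system prompt from the correction log. A minimal sketch, assuming corrections are stored as dicts with `description` and `gl_account` keys; the function name, the key names, and the 50-example cap are all hypothetical.

```python
import json

def build_gl_system_prompt(chart_of_accounts, corrections, max_examples=50):
    """Build the classifier system prompt with human corrections as examples.

    corrections: list of dicts like {'description': ..., 'gl_account': ...},
    oldest first. Only the most recent max_examples are included so the
    prompt stays small.
    """
    lines = [
        'You are an accounting classifier. Given a chart of accounts and an '
        'invoice line item description, return the most appropriate GL account '
        'code. Return ONLY the account code, nothing else.',
        '',
        'Chart of accounts:',
        json.dumps(chart_of_accounts),
    ]
    recent = corrections[-max_examples:]
    if recent:
        lines += ['', 'Examples confirmed by a human reviewer:']
        lines += [f'"{c["description"]}" -> {c["gl_account"]}' for c in recent]
    return '\n'.join(lines)
```

The returned string can be passed as the `system` argument in `map_gl_account`, so every new correction takes effect on the next classification call.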
Step 5: Exception Handling & Approval Workflow
Not every invoice should be auto-booked. Your agent needs to know when to ask for help.
def process_invoice(pdf_path):
    # Extract (text path; scanned PDFs go through extract_scanned_invoice instead)
    data = extract_invoice_data(pdf_path)
    # Validate
    validation = validate_invoice(data, po_database)
    if validation['auto_bookable']:
        # Green path: auto-book
        # book_to_accounting is assumed to wrap your accounting push, e.g. book_to_exact
        result = book_to_accounting(data)
        notify_slack(f"✅ Auto-booked: {data['invoice_number']} from {data['supplier_name']} - €{data['total_amount']}")
        archive_document(pdf_path, data, status='booked')
    elif validation['valid']:
        # Yellow path: bookable but has minor issues
        send_approval_request(data, validation['issues'])
        archive_document(pdf_path, data, status='pending_approval')
    else:
        # Red path: needs human review
        send_exception_alert(data, validation['issues'])
        archive_document(pdf_path, data, status='exception')

def send_approval_request(data, issues):
    """Send Slack message with approve/reject buttons"""
    blocks = [
        {
            'type': 'header',
            'text': {'type': 'plain_text', 'text': '🟡 Invoice needs approval'}
        },
        {
            'type': 'section',
            'fields': [
                {'type': 'mrkdwn', 'text': f"*Supplier:* {data['supplier_name']}"},
                {'type': 'mrkdwn', 'text': f"*Amount:* €{data['total_amount']:,.2f}"},
                {'type': 'mrkdwn', 'text': f"*Invoice #:* {data['invoice_number']}"},
                {'type': 'mrkdwn', 'text': f"*Issues:* {len(issues)}"}
            ]
        },
        {
            'type': 'section',
            'text': {'type': 'mrkdwn', 'text': '\n'.join(f"⚠️ {i['detail']}" for i in issues)}
        },
        {
            'type': 'actions',
            'elements': [
                {'type': 'button', 'text': {'type': 'plain_text', 'text': '✅ Approve & Book'}, 'action_id': 'approve_invoice', 'style': 'primary'},
                {'type': 'button', 'text': {'type': 'plain_text', 'text': '❌ Reject'}, 'action_id': 'reject_invoice', 'style': 'danger'},
                {'type': 'button', 'text': {'type': 'plain_text', 'text': '📄 View PDF'}, 'action_id': 'view_invoice'}
            ]
        }
    ]
    # slack is assumed to be a slack_sdk WebClient; text is the notification fallback
    slack.chat_postMessage(
        channel='#finance',
        text=f"Invoice {data['invoice_number']} needs approval",
        blocks=blocks
    )
Step 6: The Full Pipeline (Cron Loop)
# invoice_agent.py: long-running process that runs the pipeline every 5 minutes
import logging
import time
from pathlib import Path

import schedule

def run_pipeline():
    # 1. Check intake sources
    check_invoice_inbox()
    scan_drive_folder(INVOICES_FOLDER_ID)  # folder ID assumed to be configured elsewhere
    # 2. Process queue
    queue_dir = Path('queue/')
    for pdf in sorted(queue_dir.glob('*.pdf')):
        try:
            process_invoice(pdf)
            pdf.rename(f'processed/{pdf.name}')
        except Exception as e:
            logging.error(f"Failed to process {pdf.name}: {e}")
            pdf.rename(f'failed/{pdf.name}')
            notify_slack(f"❌ Failed: {pdf.name}: {str(e)[:200]}")
    # 3. Check for pending approvals > 24h
    check_stale_approvals()

schedule.every(5).minutes.do(run_pipeline)
# 4. Daily summary at 18:00, scheduled on its own so it fires once a day
#    instead of on every 5-minute run during that hour
schedule.every().day.at("18:00").do(send_daily_summary)

while True:
    schedule.run_pending()
    time.sleep(60)
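The pipeline calls `check_stale_approvals()` without showing it. The core of such a check is a filter over pending requests. A minimal sketch, assuming each pending approval is a dict with a `requested_at` datetime; the function name `find_stale_approvals` and the record shape are hypothetical.

```python
from datetime import datetime, timedelta

def find_stale_approvals(pending, now=None, max_age=timedelta(hours=24)):
    """Return pending approvals that have waited longer than max_age.

    pending: list of dicts with 'invoice_number' and 'requested_at' (datetime).
    now: injectable clock, which keeps the function easy to test.
    """
    now = now or datetime.now()
    return [p for p in pending if now - p['requested_at'] > max_age]
```

A `check_stale_approvals` implementation could load pending requests from wherever approvals are stored, pass them through this filter, and send a reminder for each result.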
Cost Breakdown: AI Agent vs. Manual vs. Enterprise OCR
| | Manual Processing | Enterprise OCR (ABBYY, Kofax) | AI Agent |
|---|---|---|---|
| Setup cost | €0 | €5,000-50,000 | €0 (build yourself) |
| Monthly (500 invoices) | €2,500-4,000 (labor) | €500-2,500 | €15-40 |
| Accuracy | 96-99% (human error) | 80-92% (template-dependent) | 94-98% (improves over time) |
| New supplier handling | Immediate | Template creation needed | Immediate (zero-shot) |
| Processing time | 5-12 min/invoice | 10-30 sec/invoice | 5-15 sec/invoice |
| Multi-language | Depends on staff | Extra cost per language | Built-in (100+ languages) |
The math is simple: an AI agent processing 500 invoices/month costs less than €0.08 per invoice. Enterprise OCR charges €1-5 per document. Manual processing costs €5-8 per invoice in labor.
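Those per-invoice figures follow directly from the monthly totals in the table. A quick check of the arithmetic, using the table's volume of 500 invoices/month:

```python
def cost_per_invoice(monthly_cost, invoices_per_month):
    """Monthly cost spread evenly over the monthly volume."""
    return monthly_cost / invoices_per_month

VOLUME = 500

# (low, high) monthly cost in EUR, taken from the table above
monthly_costs = {
    'manual': (2500, 4000),
    'enterprise_ocr': (500, 2500),
    'ai_agent': (15, 40),
}

per_invoice = {
    name: (cost_per_invoice(low, VOLUME), cost_per_invoice(high, VOLUME))
    for name, (low, high) in monthly_costs.items()
}
# per_invoice['ai_agent'] is (0.03, 0.08): between 3 and 8 euro cents per invoice
```

The per-invoice cost of the agent scales with volume, so it is worth rerunning this with your own invoice count and API bill.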
5 Mistakes That'll Wreck Your Invoice Agent
1. Skipping the validation layer
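Extraction without validation is a liability. Number formats are locale-dependent: "€1.234,56" in Dutch notation and "€1,234.56" in English notation are the same amount, and an LLM that assumes the wrong convention can return 1.234 where 1234.56 was meant. Always validate math, always check for duplicates, always verify amounts against POs.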
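One defensive option is to have the model return amounts as the raw strings printed on the invoice and normalize them in code. A minimal sketch of such a normalizer; `parse_amount` is a hypothetical helper, and the heuristic it uses is an assumption that will misjudge some inputs (noted in the docstring).

```python
from decimal import Decimal
import re

def parse_amount(raw):
    """Parse '€1.234,56', '1,234.56', '1234,56' or '1234.56' into a Decimal.

    Heuristic: whichever of '.' or ',' appears last is the decimal separator,
    provided one or two digits follow it. Otherwise every separator is treated
    as a thousands separator, so an ambiguous '1.234' becomes 1234.
    """
    # Keep only digits, separators, and a minus sign
    cleaned = re.sub(r'[^\d.,-]', '', raw)
    last_sep = max(cleaned.rfind('.'), cleaned.rfind(','))
    if last_sep != -1 and len(cleaned) - last_sep - 1 in (1, 2):
        integer_part = re.sub(r'[.,]', '', cleaned[:last_sep])
        return Decimal(f'{integer_part}.{cleaned[last_sep + 1:]}')
    return Decimal(re.sub(r'[.,]', '', cleaned))
```

The math check then acts as a second line of defense: if a separator was misread, the line items will almost never add up to the subtotal.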
2. Auto-booking everything from day one
Start with 100% human review. Then move to auto-booking for low-risk invoices (known suppliers, matched POs, amounts under €500). Gradually increase the threshold as confidence builds. Never go from 0 to full automation overnight.
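That staged rollout can be expressed as a single policy function with a threshold you raise over time. A minimal sketch, assuming the `validation` dict returned by `validate_invoice` and a set of approved supplier VAT numbers; `can_auto_book` and its parameters are hypothetical names.

```python
def can_auto_book(invoice, validation, known_suppliers, max_amount=500):
    """Low-risk rule from the text: known supplier, matched PO, amount under a limit.

    known_suppliers: set of supplier VAT numbers approved for auto-booking.
    max_amount: EUR threshold; raise it gradually as confidence builds.
    """
    return (
        validation['auto_bookable']                         # no validation issues at all
        and invoice.get('supplier_vat') in known_suppliers  # supplier is on the allow-list
        and bool(invoice.get('po_reference'))               # a PO was referenced and matched
        and invoice['total_amount'] < max_amount
    )
```

Starting with an empty `known_suppliers` set gives you 100% human review; adding suppliers and raising `max_amount` widens the green path one step at a time.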
3. Not handling credit notes
Credit notes look like invoices but have negative amounts. Your agent needs to recognize them and process them differently — usually as a reversal against the original invoice. If you forget this, your books will be wrong by the end of month one.
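A simple first pass is to detect credit notes from the sign of the total or from keywords, and to normalize the amount's sign before booking. A minimal sketch under stated assumptions: the keyword list is illustrative, and `document_title` is a hypothetical field that is not part of the extraction schema shown earlier.

```python
def classify_document(extracted):
    """Return 'credit_note' or 'invoice' based on sign and keywords.

    The keyword list is an assumption; extend it for your suppliers' languages.
    """
    keywords = ('credit note', 'creditnota', 'credit memo', 'gutschrift')
    text = ' '.join(
        str(extracted.get(key) or '') for key in ('notes', 'document_title')
    ).lower()
    if (extracted.get('total_amount') or 0) < 0 or any(k in text for k in keywords):
        return 'credit_note'
    return 'invoice'

def signed_total(extracted):
    """Credit notes reduce what you owe, so give them a negative amount."""
    amount = abs(extracted['total_amount'])
    return -amount if classify_document(extracted) == 'credit_note' else amount
```

Linking the credit note to the original invoice it reverses still needs a lookup on the referenced invoice number, which this sketch leaves out.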
4. Ignoring document archival
Dutch law requires 7-year document retention. Your agent should archive every processed document with its extracted data, validation results, and booking reference. Make it searchable. If the Belastingdienst asks, you need to find invoice #2024-1847 from Supplier X in under 30 seconds.
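A small indexed table is enough to make the archive searchable by supplier and invoice number. A minimal sketch using SQLite from the standard library; the table layout is an assumption, and this `archive_document` takes a connection as its first argument, so it is a variant of the helper called in `process_invoice`, not the same function.

```python
import json
import sqlite3

def open_archive(db_path=':memory:'):
    """Open (or create) the archive index. Use a file path in production."""
    conn = sqlite3.connect(db_path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS archive (
            invoice_number TEXT,
            supplier_name  TEXT,
            pdf_path       TEXT,
            status         TEXT,
            booking_ref    TEXT,
            extracted_json TEXT,
            archived_at    TEXT DEFAULT CURRENT_TIMESTAMP
        )""")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_lookup "
                 "ON archive (supplier_name, invoice_number)")
    return conn

def archive_document(conn, pdf_path, data, status, booking_ref=None):
    """Store the document location together with everything extracted from it."""
    conn.execute(
        "INSERT INTO archive (invoice_number, supplier_name, pdf_path, status, "
        "booking_ref, extracted_json) VALUES (?, ?, ?, ?, ?, ?)",
        (data['invoice_number'], data['supplier_name'], str(pdf_path),
         status, booking_ref, json.dumps(data)))
    conn.commit()

def find_invoice(conn, supplier_name, invoice_number):
    """Look up archived copies of one invoice from one supplier."""
    return conn.execute(
        "SELECT pdf_path, status, booking_ref FROM archive "
        "WHERE supplier_name = ? AND invoice_number = ?",
        (supplier_name, invoice_number)).fetchall()
```

With the index in place, a lookup like `find_invoice(conn, 'Supplier X', '2024-1847')` stays fast even after years of accumulated documents.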
5. Hardcoding GL account mappings
Don't build a 500-line if/else chain for GL account classification. Use the LLM with your chart of accounts as context. It handles edge cases, new categories, and ambiguous descriptions far better than rules. Plus it maintains itself.
Real-World Example: 400 Invoices/Month → 92% Auto-Booked
A logistics company processing 400 invoices/month from 60+ suppliers built this exact pipeline. Here's what happened:
- Week 1-2: Ran in "shadow mode" — extracted data but didn't book. Compared results against manual processing. Accuracy: 89%.
- Week 3-4: Enabled auto-booking for top 10 suppliers (known formats, matched POs). Human review for the rest. Auto-book rate: 35%.
- Month 2: Added GL mapping feedback loop. Expanded auto-booking to all suppliers with matched POs. Auto-book rate: 68%.
- Month 3: Refined validation rules based on false positives. Added credit note handling. Auto-book rate: 85%.
- Month 6: Stable at 92% auto-booking. Human reviews ~32 invoices/month (was 400). AP clerk now handles exceptions + supplier negotiations instead of data entry.
Total cost: $35/month (Claude API) + $10/month (hosting). Time saved: 60+ hours/month. ROI: Paid for itself in 3 days.
Getting Started: Minimum Viable Agent
Don't build everything at once. Here's your week-1 setup:
- Day 1: Set up email watcher for your invoice inbox. Queue PDFs in a local folder.
- Day 2: Build the extraction function. Test with 20 real invoices. Measure accuracy.
- Day 3: Add validation (math check, duplicate check). Log results.
- Day 4: Build the Slack/email notification for exceptions. No auto-booking yet.
- Day 5: Run in shadow mode. Compare AI extraction vs. manual for every invoice.
After week 1, you'll know exactly how accurate your agent is with your specific invoices. That data tells you whether to proceed to auto-booking or refine the extraction first.
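Measuring shadow-mode accuracy comes down to comparing each extracted field with what the clerk entered. A minimal sketch, assuming both sides are lists of dicts in the same order; `field_accuracy` is a hypothetical helper.

```python
def field_accuracy(ai_records, manual_records, fields):
    """Share of records where the AI value equals the manual value, per field.

    Records are paired by position, so both lists must be the same length.
    """
    if len(ai_records) != len(manual_records):
        raise ValueError('record lists must be the same length')
    total = len(ai_records)
    pairs = list(zip(ai_records, manual_records))
    return {
        name: (sum(ai.get(name) == manual.get(name) for ai, manual in pairs) / total
               if total else 0.0)
        for name in fields
    }
```

Per-field numbers are more useful than a single score: an agent that nails invoice numbers but struggles with totals needs a different fix than one that misreads supplier names.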
"The best document processing system is the one your AP team trusts. Build that trust with transparency — show them every extraction, every validation, every decision the agent makes. They'll go from skeptical to advocates in two weeks."