A production AI agent that reads inbound B2B purchase orders — in any format a customer sends — and turns them into validated, structured orders. It matches products, prices the order, and writes it to the system, escalating to a human the moment it isn't sure.
Distributors take orders the way their customers want to send them. That means an inbox full of forwarded emails, scanned POs, spreadsheets, and EDI files — each one read and retyped into the ERP by hand. It's slow, and the mistakes are expensive.
The same order shows up as a forwarded email, a scanned PDF, an Excel sheet, or an EDI 850. No two layouts match.
Someone reads each order and keys line items into the ERP — slow, and a single transposed quantity becomes a wrong shipment.
An unrecognized SKU or an off-catalog price is exactly where a silent mistake hurts most — and where naive automation fails quietly.
To let software touch real orders, an operator needs to see what it extracted, why it decided what it did, and where it wasn't sure.
Four stages, fully autonomous on the happy path — with a human pulled in exactly when the agent decides it needs one.
A customer emails a purchase order — in the body, or as a PDF, spreadsheet, or EDI attachment. The Laravel backend pulls it in over the Gmail API and queues it for the agent.
A fresh agent spins up per request and picks the right extractor for the format — pulling the customer, PO number, and every line item out of unstructured text, scanned pages, or X12 EDI.
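As a rough illustration of the "pick the right extractor" step, here is a minimal format-detection sketch. The `InboundDocument` shape, the `detectFormat` name, and the heuristics are assumptions made for this example, not the production logic; the only external fact relied on is that an X12 interchange begins with an `ISA` segment.

```typescript
// Hypothetical sketch: names and heuristics are illustrative only.
type OrderFormat = "edi" | "pdf" | "excel" | "csv" | "email";

interface InboundDocument {
  filename?: string; // attachment name, if the order arrived as an attachment
  mimeType?: string; // MIME type reported for the attachment
  text?: string;     // decoded body or attachment text, when available
}

function detectFormat(doc: InboundDocument): OrderFormat {
  const file = (doc.filename ?? "").toLowerCase();
  const mime = (doc.mimeType ?? "").toLowerCase();

  // An X12 interchange starts with "ISA" followed by its element separator.
  if (/^\s*ISA[^A-Za-z0-9]/.test(doc.text ?? "")) return "edi";
  if (mime === "application/pdf" || file.endsWith(".pdf")) return "pdf";
  if (file.endsWith(".xlsx") || file.endsWith(".xls")) return "excel";
  if (mime === "text/csv" || file.endsWith(".csv")) return "csv";

  // Anything else is treated as free-form email text.
  return "email";
}
```

A dispatcher like this only chooses the extractor; whether a PDF then needs text extraction or vision is a separate decision made downstream.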
Tools match each line to a real catalog product (SKU, UPC, or fuzzy description), find the customer, apply pricing and per-customer quantity rules, and validate the whole order.
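The matching order described above (exact SKU, then UPC, then fuzzy description) can be sketched as a simple cascade. Everything here is an assumption for illustration: the type names, the `matchLine` function, the token-overlap (Jaccard) similarity, and the `0.5` threshold all stand in for whatever the real matcher uses.

```typescript
// Hypothetical sketch of an exact-then-fuzzy product match.
interface CatalogProduct { sku: string; upc: string; description: string }
interface PoLine { sku?: string; upc?: string; description?: string }

type ProductMatch =
  | { kind: "sku" | "upc"; product: CatalogProduct }
  | { kind: "fuzzy"; product: CatalogProduct; score: number }
  | { kind: "none" };

// Lowercased, de-duplicated alphanumeric tokens.
function tokenize(s: string): string[] {
  const seen: string[] = [];
  for (const t of s.toLowerCase().split(/[^a-z0-9]+/)) {
    if (t && seen.indexOf(t) === -1) seen.push(t);
  }
  return seen;
}

// Jaccard similarity over tokens: shared / union, in [0, 1].
function similarity(a: string, b: string): number {
  const ta = tokenize(a);
  const tb = tokenize(b);
  const shared = ta.filter((t) => tb.indexOf(t) !== -1).length;
  const union = ta.length + tb.length - shared;
  return union === 0 ? 0 : shared / union;
}

function matchLine(line: PoLine, catalog: CatalogProduct[], minScore = 0.5): ProductMatch {
  const sku = line.sku?.trim().toLowerCase();
  if (sku) {
    const hit = catalog.find((p) => p.sku.toLowerCase() === sku);
    if (hit) return { kind: "sku", product: hit };
  }
  const upc = line.upc?.replace(/\D/g, "");
  if (upc) {
    const hit = catalog.find((p) => p.upc === upc);
    if (hit) return { kind: "upc", product: hit };
  }
  // Fall back to the best description match at or above the threshold.
  let best: { product: CatalogProduct; score: number } | null = null;
  for (const product of catalog) {
    const score = similarity(line.description ?? "", product.description);
    if (score >= minScore && (best === null || score > best.score)) {
      best = { product, score };
    }
  }
  return best ? { kind: "fuzzy", product: best.product, score: best.score } : { kind: "none" };
}
```

Returning the match kind and score, rather than just a product, is what lets a later step treat a weak fuzzy match differently from an exact SKU hit.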
Confident orders are written straight to the system and pushed to the ERP. Anything ambiguous — an unknown product, a price mismatch, a cancellation — becomes a typed escalation for a human to review.
Built on Mastra with Claude as the model. Each order gets its own agent that works in a loop — up to forty steps — calling real tools and reacting to what they return, instead of trying to one-shot the answer. A Sonnet → Haiku fallback and ephemeral prompt caching keep it resilient and cheap.
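To make the loop concrete, here is a framework-free sketch of a bounded tool-calling loop with a model fallback. It is not Mastra's API: `ModelFn`, `AgentTool`, `runAgent`, and the transcript format are hypothetical names invented for this example, and the models are plain async functions standing in for real model calls.

```typescript
// Hypothetical sketch of a step-capped agent loop; not Mastra's actual API.
type ModelTurn =
  | { type: "tool_call"; tool: string; args: unknown }
  | { type: "final"; summary: string };

type ModelFn = (transcript: string[]) => Promise<ModelTurn>;
type AgentTool = (args: unknown) => Promise<string>;

type AgentResult =
  | { status: "done"; summary: string; steps: number }
  | { status: "step_limit"; steps: number };

// Try the primary model; if the call throws, retry once on the fallback.
async function nextTurn(primary: ModelFn, fallback: ModelFn, transcript: string[]): Promise<ModelTurn> {
  try {
    return await primary(transcript);
  } catch {
    return fallback(transcript);
  }
}

async function runAgent(
  primary: ModelFn,
  fallback: ModelFn,
  tools: Record<string, AgentTool>,
  maxSteps = 40,
): Promise<AgentResult> {
  const transcript: string[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const turn = await nextTurn(primary, fallback, transcript);
    if (turn.type === "final") {
      return { status: "done", summary: turn.summary, steps: step };
    }
    // Run the requested tool and feed its result back to the model.
    const tool = tools[turn.tool];
    const result = tool ? await tool(turn.args) : `error: unknown tool ${turn.tool}`;
    transcript.push(`${turn.tool} -> ${result}`);
  }
  // The cap guarantees termination even if the model never finishes.
  return { status: "step_limit", steps: maxSteps };
}
```

The step cap is the important part of the design: a run that hits the limit returns a distinct status, so the caller can hand it to a human instead of accepting a half-finished order.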
Every real action — extract, match, price, validate, write, escalate — is a typed tool with a Zod schema. The model never touches the database directly; it composes these tools, and each one is bound to the tenant making the request.
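A small sketch of what "typed tool, bound to the tenant" can look like. To stay dependency-free it swaps the Zod schema for a hand-written `parse` function, which is a deliberate substitution; `CatalogClient`, `BoundTool`, `makeSkuLookupTool`, and `callTool` are hypothetical names, not the project's real interfaces.

```typescript
// Hypothetical catalog client; the real typed API client is not shown here.
interface CatalogClient {
  findBySku(tenantId: string, sku: string): { sku: string; unitPrice: number } | undefined;
}

interface BoundTool<I, O> {
  name: string;
  parse(raw: unknown): I; // stands in for a Zod schema's parse()
  execute(input: I): O;
}

// The tenant ID is captured in the closure, so the model can never supply it.
function makeSkuLookupTool(
  tenantId: string,
  client: CatalogClient,
): BoundTool<{ sku: string }, { found: boolean; unitPrice?: number }> {
  return {
    name: "lookup_sku",
    parse(raw) {
      if (typeof raw !== "object" || raw === null) throw new Error("expected an object");
      const sku = (raw as Record<string, unknown>).sku;
      if (typeof sku !== "string" || sku.length === 0) {
        throw new Error("sku must be a non-empty string");
      }
      return { sku };
    },
    execute({ sku }) {
      const hit = client.findBySku(tenantId, sku);
      return hit ? { found: true, unitPrice: hit.unitPrice } : { found: false };
    },
  };
}

// Validate model-supplied arguments before the tool body ever runs.
function callTool<I, O>(tool: BoundTool<I, O>, rawArgs: unknown): O {
  return tool.execute(tool.parse(rawArgs));
}
```

Because the tool is constructed per request with the tenant already bound, a malformed or adversarial argument can fail validation but cannot redirect the lookup to another tenant's data.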
EDI · PDF (text + vision) · CSV · Excel · email · voice
SKU match · fuzzy match · catalog search · status
match by domain · search · details · order history
create · add line items · validate · confirm · cancel
calculate · validate against catalog · price sheets
raise typed escalations · log every decision
parse X12 · generate 810 invoice · fetch content
The agent runs as a Hono service that the Laravel app calls over an authenticated endpoint when an order arrives. It reads and writes catalog, customer, and order data through a typed API client, and talks to a dedicated EDI service for X12 parsing and 810 invoice generation. I rewrote the whole thing from an earlier Python version into this typed, tool-first TypeScript design.
Automating order entry is easy until it's wrong. The work that made this safe to run on real orders is all about transparency and knowing where the line is.
Instead of guessing, the agent raises one of a dozen typed escalations — unknown product, price mismatch, customer not found, cancellation requested — each with a priority. Edge cases land in a human's queue instead of becoming bad orders.
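One way to model typed escalations is a discriminated union, sketched below for the four kinds named above. The field names, the priority levels, the `checkPrice` tolerance, and the choice of `"high"` for a price mismatch are all assumptions for illustration; the real set has about a dozen kinds with their own payloads.

```typescript
// Hypothetical sketch: field names and priorities are illustrative only.
type Priority = "low" | "medium" | "high";

type Escalation =
  | { type: "unknown_product"; priority: Priority; lineNumber: number; rawText: string }
  | { type: "price_mismatch"; priority: Priority; lineNumber: number; quoted: number; catalog: number }
  | { type: "customer_not_found"; priority: Priority; senderDomain: string }
  | { type: "cancellation_requested"; priority: Priority; poNumber: string };

// Compare the PO price with the catalog price; escalate instead of guessing.
function checkPrice(lineNumber: number, quoted: number, catalog: number, tolerance = 0.005): Escalation | null {
  if (Math.abs(quoted - catalog) <= tolerance) return null;
  return { type: "price_mismatch", priority: "high", lineNumber, quoted, catalog };
}

// The switch is exhaustive, so adding a new escalation type without
// handling it here becomes a compile error under strict settings.
function describeEscalation(e: Escalation): string {
  switch (e.type) {
    case "unknown_product":
      return `Line ${e.lineNumber}: no catalog match for "${e.rawText}"`;
    case "price_mismatch":
      return `Line ${e.lineNumber}: PO price ${e.quoted.toFixed(2)} vs catalog ${e.catalog.toFixed(2)}`;
    case "customer_not_found":
      return `No customer found for domain ${e.senderDomain}`;
    case "cancellation_requested":
      return `Cancellation requested for PO ${e.poNumber}`;
  }
}
```

Typing the payload per escalation kind means the review queue always receives the fields it needs to render that kind, rather than a free-text reason.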
Extracted values carry source spans — character offsets in text, and pixel bounding boxes on PDFs — so the review UI can highlight exactly where on the original document each number came from.
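A minimal sketch of how a value can carry its source span. The `SourceSpan` and `Extracted` shapes, the `qty` regex, and the `[[...]]` highlight markers are assumptions made for this example; the real extractors and review UI are not shown here.

```typescript
// Hypothetical sketch: shapes and the regex are illustrative only.
type SourceSpan =
  | { kind: "text"; start: number; end: number } // character offsets, end exclusive
  | { kind: "pdf"; page: number; x: number; y: number; width: number; height: number }; // pixels

interface Extracted<T> {
  value: T;
  span: SourceSpan;
}

// Pull a quantity out of free text and remember where the digits were.
function extractQuantity(text: string): Extracted<number> | null {
  const m = /qty[:\s]+(\d+)/i.exec(text);
  if (!m) return null;
  // The captured digits sit at the end of the overall match.
  const start = m.index + m[0].length - m[1].length;
  return { value: Number(m[1]), span: { kind: "text", start, end: start + m[1].length } };
}

// Wrap the spanned characters in markers, as a review UI might highlight them.
function highlight(text: string, span: SourceSpan): string {
  if (span.kind !== "text") return text; // PDF boxes are drawn on the page image instead
  return text.slice(0, span.start) + "[[" + text.slice(span.start, span.end) + "]]" + text.slice(span.end);
}
```

For example, calling `highlight` with the span returned for `"Item AB-100 qty: 25 each"` marks just the `25`, which is the behavior a reviewer needs to verify a number against the original document.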
Customer match, product match, price validation, order creation — every step the agent takes is logged as a typed decision, so an operator can audit the whole chain of reasoning after the fact.
Each request builds its own tool set bound to the tenant. Side effects (created order IDs, escalations, decisions) accumulate in per-request capture buckets, and the response is assembled from those buckets, so concurrent orders share no global state.
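The per-request isolation can be sketched with a factory that creates the tools and their capture buckets together. `createRequestContext`, `CaptureBuckets`, and the two toy tools are hypothetical; the point is only that every bucket lives inside the closure for one request.

```typescript
// Hypothetical sketch: a fresh set of buckets and tools per request.
interface CaptureBuckets {
  orderIds: string[];
  escalations: string[];
  decisions: string[];
}

function createRequestContext(tenantId: string) {
  // Created inside the factory, so no other request can see these arrays.
  const capture: CaptureBuckets = { orderIds: [], escalations: [], decisions: [] };
  let nextId = 1;

  const tools = {
    createOrder(poNumber: string): string {
      const id = `${tenantId}-order-${nextId++}`;
      capture.orderIds.push(id);
      capture.decisions.push(`order_created:${poNumber}`);
      return id;
    },
    escalate(reason: string): void {
      capture.escalations.push(reason);
      capture.decisions.push(`escalated:${reason}`);
    },
  };

  // The response is assembled from whatever the tools recorded.
  const buildResponse = () => ({ tenantId, ...capture });
  return { tools, buildResponse };
}
```

Two contexts created for two concurrent orders never observe each other's orders or escalations, because nothing is stored at module scope.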
Built and validated against real customer documents from live B2B suppliers. This page sticks to describing what the system actually does, rather than quoting an accuracy figure I can't stand behind out of context.