Case Study · AI Agent

OrderSync Order Agent
Email to order, on its own

A production AI agent that reads inbound B2B purchase orders — in any format a customer sends — and turns them into validated, structured orders. It matches products, prices the order, and writes it to the system, escalating to a human the moment it isn't sure.

AI agent · B2B order automation · Human-in-the-loop · Production
Role: Designed & built the agent end to end
Stack: TypeScript · Mastra · Claude · Hono · Zod
Surface: Email, EDI, PDF, CSV, Excel & voice intake
Focus: Accuracy · auditability · graceful escalation

The problem

B2B orders arrive as a mess, and humans key them in.

Distributors take orders the way their customers want to send them. That means an inbox full of forwarded emails, scanned POs, spreadsheets, and EDI files — each one read and retyped into the ERP by hand. It's slow, and the mistakes are expensive.

Every customer sends it differently

The same order shows up as a forwarded email, a scanned PDF, an Excel sheet, or an EDI 850. No two layouts match.

Ops teams retype it by hand

Someone reads each order and keys line items into the ERP — slow, and a single transposed quantity becomes a wrong shipment.

The hard cases are the costly ones

An unrecognized SKU or an off-catalog price is exactly where a silent mistake hurts most — and where naive automation fails quietly.

Trust requires a paper trail

To let software touch real orders, an operator needs to see what it extracted, why it decided what it did, and where it wasn't sure.

How it works

Inbox in, structured order out.

Four stages, fully autonomous on the happy path — with a human pulled in exactly when the agent decides it needs one.

01 · Order lands in the inbox

A customer emails a purchase order — in the body, or as a PDF, spreadsheet, or EDI attachment. The Laravel backend pulls it in over the Gmail API and queues it for the agent.

02 · The agent reads it

A fresh agent spins up per request and picks the right extractor for the format — pulling the customer, PO number, and every line item out of unstructured text, scanned pages, or X12 EDI.
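
To make the format routing concrete, here is a minimal sketch of how an extractor could be chosen. The names (`Format`, `Inbound`, `detectFormat`) and the sniffing rules are my illustration for this write-up, not the production code, and voice intake is left out for brevity.

```typescript
// Hypothetical sketch: choose an extractor from the attachment's name and content.
type Format = "edi" | "pdf" | "csv" | "excel" | "email";

interface Inbound {
  filename?: string; // absent when the order is in the email body
  content: string;   // raw text, or a text preview of a binary file
}

function detectFormat(doc: Inbound): Format {
  const name = (doc.filename ?? "").toLowerCase();
  // An X12 interchange opens with an ISA segment followed by a separator
  // character, whatever the file happens to be called.
  if (/^\s*ISA[^A-Za-z0-9]/.test(doc.content)) return "edi";
  if (name.endsWith(".pdf")) return "pdf";
  if (name.endsWith(".csv")) return "csv";
  if (name.endsWith(".xlsx") || name.endsWith(".xls")) return "excel";
  return "email"; // fall back to reading the message body
}
```

Checking content before the file extension matters because EDI often arrives in files with generic names such as `.txt` or `.dat`.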

03 · It resolves the order

Tools match each line to a real catalog product (SKU, UPC, or fuzzy description), find the customer, apply pricing and per-customer quantity rules, and validate the whole order.
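
The matching order described above (exact SKU, then UPC, then fuzzy description) can be sketched as a simple cascade. Everything here is an assumption for illustration: the `Product` shape, the `matchLine` name, the 0.5 threshold, and the token-overlap similarity, which stands in for whatever fuzzy matcher a real catalog search would use.

```typescript
interface Product { sku: string; upc: string; description: string; }

type Match =
  | { kind: "sku" | "upc"; product: Product }
  | { kind: "fuzzy"; product: Product; score: number }
  | { kind: "none" };

// Token-overlap (Jaccard) similarity in the range 0..1.
function similarity(a: string, b: string): number {
  const ta = Array.from(new Set(a.toLowerCase().split(/\W+/).filter(Boolean)));
  const tb = Array.from(new Set(b.toLowerCase().split(/\W+/).filter(Boolean)));
  const shared = ta.filter((t) => tb.indexOf(t) !== -1).length;
  const union = ta.length + tb.length - shared;
  return union === 0 ? 0 : shared / union;
}

function matchLine(text: string, catalog: Product[], minScore = 0.5): Match {
  const needle = text.trim().toLowerCase();
  // Exact identifiers win outright.
  const bySku = catalog.find((p) => p.sku.toLowerCase() === needle);
  if (bySku) return { kind: "sku", product: bySku };
  const byUpc = catalog.find((p) => p.upc === needle);
  if (byUpc) return { kind: "upc", product: byUpc };
  // Otherwise keep the best-scoring description, if it clears the threshold.
  let best: { product: Product; score: number } | undefined;
  for (const product of catalog) {
    const score = similarity(text, product.description);
    if (!best || score > best.score) best = { product, score };
  }
  if (best && best.score >= minScore) {
    return { kind: "fuzzy", product: best.product, score: best.score };
  }
  return { kind: "none" }; // the caller would escalate rather than guess
}
```

Returning the match kind alongside the product lets later steps treat a fuzzy hit with more suspicion than an exact SKU hit.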

04 · It acts, or asks

Confident orders are written straight to the system and pushed to the ERP. Anything ambiguous — an unknown product, a price mismatch, a cancellation — becomes a typed escalation for a human to review.

How it's built

A tool-using agent, not a prompt.

01 · The loop

Reason, call a tool, read the result, repeat.

Built on Mastra with Claude as the model. Each order gets its own agent that works in a loop — up to forty steps — calling real tools and reacting to what they return, instead of trying to one-shot the answer. A Sonnet → Haiku fallback and ephemeral prompt caching keep it resilient and cheap.
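
The shape of that loop, stripped of any framework, looks roughly like the sketch below. This is not Mastra's API: `Turn`, `Model`, `Tools`, and `runAgent` are hypothetical names, the models are plain functions, and throwing on an exhausted budget is my simplification.

```typescript
// Hypothetical shapes: a model turn either requests a tool call or finishes.
type Turn =
  | { type: "tool"; name: string; input: unknown }
  | { type: "done"; summary: string };
type Model = (history: string[]) => Promise<Turn>;
type Tools = Record<string, (input: unknown) => Promise<string>>;

async function runAgent(
  primary: Model,
  fallback: Model,
  tools: Tools,
  maxSteps = 40,
): Promise<string> {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // Try the primary model; if it rejects, retry the same step on the fallback.
    const turn = await primary(history).catch(() => fallback(history));
    if (turn.type === "done") return turn.summary;
    const tool = tools[turn.name];
    // Unknown tools are reported back to the model instead of crashing the run.
    const result = tool ? await tool(turn.input) : `error: no tool ${turn.name}`;
    history.push(`${turn.name} -> ${result}`);
  }
  throw new Error(`step budget of ${maxSteps} exhausted`);
}
```

The step budget is what keeps a confused run bounded: the loop ends with a result or a hard stop, never an open-ended conversation.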

02 · The hands

Thirty-five tools across the whole order lifecycle.

Every real action — extract, match, price, validate, write, escalate — is a typed tool with a Zod schema. The model never touches the database directly; it composes these tools, and each one is bound to the tenant making the request.
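
A stripped-down sketch of the "typed tool bound to a tenant" idea follows. To keep it dependency-free, the hand-written `parsePriceInput` plays the role a Zod schema fills in the real service; the `Tool` interface, `makePriceTool`, `callTool`, and the in-memory price table are all hypothetical.

```typescript
// Hypothetical tool shape; `parse` stands in for a Zod schema's parse step.
interface Tool<I, O> {
  name: string;
  parse: (raw: unknown) => I; // throws on invalid input
  run: (input: I) => O;
}

interface PriceInput { sku: string; quantity: number; }

function parsePriceInput(raw: unknown): PriceInput {
  const r = raw as Partial<PriceInput> | null;
  if (!r || typeof r.sku !== "string" || r.sku === "") {
    throw new Error("sku is required");
  }
  if (typeof r.quantity !== "number" || !Number.isInteger(r.quantity) || r.quantity <= 0) {
    throw new Error("quantity must be a positive integer");
  }
  return { sku: r.sku, quantity: r.quantity };
}

// The tenant is fixed when the tool is built, so the model cannot pick or spoof it.
// Prices are integer cents, keyed by tenant and then SKU (illustrative data shape).
function makePriceTool(
  tenantId: string,
  prices: Record<string, Record<string, number>>,
): Tool<PriceInput, { tenantId: string; totalCents: number }> {
  return {
    name: "calculatePrice",
    parse: parsePriceInput,
    run: ({ sku, quantity }) => {
      const unit = prices[tenantId]?.[sku];
      if (unit === undefined) throw new Error(`unknown sku ${sku}`);
      return { tenantId, totalCents: unit * quantity };
    },
  };
}

// Validate first, then execute: the only path from model output to an action.
function callTool<I, O>(tool: Tool<I, O>, raw: unknown): O {
  return tool.run(tool.parse(raw));
}
```

Because the tenant lives in the closure and not in the tool's input, nothing the model emits can redirect a call to another customer's data.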

Extraction (7): EDI · PDF (text + vision) · CSV · Excel · email · voice
Products (5): SKU match · fuzzy match · catalog search · status
Customers (4): match by domain · search · details · order history
Orders (10): create · add line items · validate · confirm · cancel
Pricing (3): calculate · validate against catalog · price sheets
Escalation (2): raise typed escalations · log every decision
EDI (3): parse X12 · generate 810 invoice · fetch content

03 · The wiring

A microservice behind the product.

The agent runs as a Hono service that the Laravel app calls over an authenticated endpoint when an order arrives. It reads and writes catalog, customer, and order data through a typed API client, and talks to a dedicated EDI service for X12 parsing and 810 invoice generation. I rewrote the whole thing from an earlier Python version into this typed, tool-first TypeScript design.

The part I'm proudest of

It knows when not to trust itself.

Automating order entry is easy until it's wrong. The work that made this safe to run on real orders is all about transparency and knowing where the line is.

Confidence-gated escalation

Instead of guessing, the agent raises one of a dozen typed escalations — unknown product, price mismatch, customer not found, cancellation requested — each with a priority. Edge cases land in a human's queue instead of becoming bad orders.
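
In TypeScript terms, "typed escalation" means a discriminated union, sketched below for the four kinds named above. The field names, the three priority levels, and the helper functions are illustrative assumptions; the real set has a dozen kinds.

```typescript
type Priority = "low" | "medium" | "high";

// Four of the escalation kinds; payload fields are illustrative.
type Escalation =
  | { kind: "unknown_product"; priority: Priority; lineText: string }
  | { kind: "price_mismatch"; priority: Priority; sku: string; quotedCents: number; catalogCents: number }
  | { kind: "customer_not_found"; priority: Priority; senderEmail: string }
  | { kind: "cancellation_requested"; priority: Priority; poNumber: string };

// The compiler forces every kind to be handled: adding a new kind without
// a case here makes the `never` assignment fail to type-check.
function summarize(e: Escalation): string {
  switch (e.kind) {
    case "unknown_product":
      return `No catalog match for "${e.lineText}"`;
    case "price_mismatch":
      return `${e.sku}: PO says ${e.quotedCents}, catalog says ${e.catalogCents}`;
    case "customer_not_found":
      return `No customer found for ${e.senderEmail}`;
    case "cancellation_requested":
      return `Cancellation requested for PO ${e.poNumber}`;
    default: {
      const unreachable: never = e;
      return unreachable;
    }
  }
}

const rank: Record<Priority, number> = { high: 0, medium: 1, low: 2 };

// Highest priority first, without mutating the caller's array.
function reviewQueue(items: Escalation[]): Escalation[] {
  return items.slice().sort((a, b) => rank[a.priority] - rank[b.priority]);
}
```

Each kind carries only the data a reviewer needs for that case, which keeps the review queue readable compared with a free-text "something went wrong" flag.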

Every field traces back to the source

Extracted values carry source spans — character offsets in text, and pixel bounding boxes on PDFs — so the review UI can highlight exactly where on the original document each number came from.
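
A minimal sketch of what a source span could look like, assuming a hypothetical `SourceSpan` union and an `Extracted<T>` wrapper. The `[[ ]]` markers stand in for whatever highlight element the review UI actually renders.

```typescript
type SourceSpan =
  // Character offsets into the source text, end exclusive.
  | { kind: "text"; start: number; end: number }
  // Pixel bounding box on a rendered PDF page.
  | { kind: "box"; page: number; x: number; y: number; width: number; height: number };

interface Extracted<T> {
  value: T;
  source: SourceSpan;
}

// Wrap the span in markers, as a review UI might wrap it in a highlight element.
function highlight(text: string, field: Extracted<unknown>): string {
  const s = field.source;
  if (s.kind !== "text") return text; // boxes are drawn over the page image instead
  return text.slice(0, s.start) + "[[" + text.slice(s.start, s.end) + "]]" + text.slice(s.end);
}
```

Storing the span next to the value, not in a separate log, means a reviewer can always jump from any number on the order to its place in the original document.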

A decision log, not a black box

Customer match, product match, price validation, order creation — every step the agent takes is logged as a typed decision, so an operator can audit the whole chain of reasoning after the fact.
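
The decision log can be pictured as an append-only list of typed entries. The sketch below covers the four steps named above; the payload fields, the `DecisionLog` class, and its method names are assumptions made for illustration.

```typescript
type Decision =
  | { step: "customer_match"; customerId: string; matchedBy: "domain" | "search" }
  | { step: "product_match"; line: number; sku: string; matchedBy: "sku" | "upc" | "fuzzy" }
  | { step: "price_validation"; line: number; ok: boolean }
  | { step: "order_creation"; orderId: string };

interface LoggedDecision {
  seq: number; // 1-based position in the run
  at: string;  // ISO 8601 timestamp
  decision: Decision;
}

class DecisionLog {
  private entries: LoggedDecision[] = [];

  // Append-only: entries are never edited or removed after the fact.
  record(decision: Decision): void {
    this.entries.push({
      seq: this.entries.length + 1,
      at: new Date().toISOString(),
      decision,
    });
  }

  all(): LoggedDecision[] {
    return this.entries.slice(); // copy, so callers cannot rewrite history
  }

  forStep(step: Decision["step"]): LoggedDecision[] {
    return this.entries.filter((e) => e.decision.step === step);
  }
}
```

An operator auditing an order reads the entries in `seq` order; filtering by step answers narrower questions such as "how was each line matched?".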

Per-request tools that capture their effects

Each request builds its own tool set bound to the tenant; side effects (created order IDs, escalations, decisions) accumulate in capture buckets the response is assembled from — no shared global state between concurrent orders.
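
The capture-bucket pattern is easiest to see in code. In this sketch, `buildRequestTools`, `Capture`, and `buildResponse` are hypothetical names, the buckets hold plain strings, and `createOrder` fabricates an ID where the real tool would call the order API.

```typescript
interface Capture {
  orderIds: string[];
  escalations: string[];
  decisions: string[];
}

function buildRequestTools(tenantId: string) {
  // Fresh buckets per request: nothing here is module-level,
  // so two orders processed concurrently cannot mix their effects.
  const capture: Capture = { orderIds: [], escalations: [], decisions: [] };
  let next = 1;

  const tools = {
    createOrder(poNumber: string): string {
      const id = `${tenantId}-order-${next++}`; // placeholder for a real API write
      capture.orderIds.push(id);
      capture.decisions.push(`created ${id} for PO ${poNumber}`);
      return id;
    },
    escalate(reason: string): void {
      capture.escalations.push(reason);
    },
  };

  return { tools, capture };
}

// The response is assembled from what the tools recorded,
// not from what the model claims it did.
function buildResponse(capture: Capture) {
  return {
    status: capture.escalations.length > 0 ? "needs_review" : "completed",
    orderIds: capture.orderIds,
    escalations: capture.escalations,
    decisions: capture.decisions,
  };
}
```

Assembling the response from captured effects means the reported outcome always reflects the tool calls that actually ran.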

By the numbers

35 tools the agent can call
6 input formats, one pipeline
12 typed escalation kinds
40-step reasoning budget
8 extraction engines evaluated
16K-token budget per order

Built and validated against real customer documents from live B2B suppliers. I've kept the numbers here to what the system actually does rather than quoting an accuracy figure I can't stand behind out of context.

What I'd do next
  • Surface a confidence score per field in the review UI to triage escalations faster.
  • Expand the golden eval set and gate deploys on extraction accuracy in CI.
  • Auto-generate EDI 855 purchase order acknowledgements back to trading partners on confirm.
TypeScript · Mastra · Claude (Anthropic) · Vercel AI SDK · Hono · Zod · Laravel API