Work public

invoice-parse-agent

Invoice parsing agent with a bonded-warehouse-manifest dashboard. PDFs in, structured JSON out, every field with provenance.

  • agents
  • parsing
  • evals
  • bun
  • hono
  • dashboard
Repo
github.com/mj-deving/invoice-parse-agent
Published
2026-05-26

What it is

A parsing agent for invoices that returns structured JSON with a confidence score and provenance for every field. The dashboard renders as a bonded-warehouse manifest: rows of records, audit log per row, a parse button that runs the actual extraction against your file.

The problem it solves

Invoice extraction is a solved problem in vendor marketing and a hard problem in practice. Real invoices arrive as scanned PDFs, in five layouts, in three languages, with handwritten annotations and stamps. A vendor demo handles the first ten; the eleventh starts costing real engineering hours.

This is an opinionated take at how the eval surface should look so that “is the parser working?” becomes a question you can answer in five seconds at the dashboard.

Design decisions worth defending

  1. Bonded-warehouse manifest aesthetic. The dashboard makes parsing legible as warehousing: records, lots, audit lines. The aesthetic is borrowed from customs forms, which is what this kind of dashboard actually is.
  2. Per-field provenance, not whole-document confidence. A whole-document score hides which field is rotten. Per-field provenance shows the exact chunk and the parser tier that produced each value.
  3. Eval endpoint shipping next to the parser. The dashboard exposes /eval directly. If parser quality regresses, the eval surface catches it in the same UI the parser ships from.

Stack

Bun + Hono backend. Server-rendered HTML dashboard (the parser is a backend product, not a SPA). Element IDs and fetch URLs preserved across redesigns so QA replay scripts keep working.