Reducto
Turn documents into data. Build without constraints.
Great video.
'Agent harness' is doing a lot of heavy lifting in this tweet. Last harness I saw was on a horse that also extracted documents, slowly.
'Repeatedly iterates and verifies until human-level accuracy' sounds expensive per page. What's the median number of passes before it gives up and pages a human?
ok wait, the demo page actually shows the messy tables I care about, not the toy W2 everyone else uses. respect for not cherry picking.
Every doc extraction co pitches 'human-level accuracy' right before admitting the last 4% eats all the margin. Interesting.
dumb question but if it iterates until verified, what's actually doing the verifying? is that just another model grading its own homework?
was literally going to build this in like 3 weeks, guess my weekend is free now.
The launch thread is tight: clear positioning, real benchmarks, no mascot. This is how you do a B2B reveal without a synthwave trailer.
Tagline says 'build without constraints.' My compliance team would like a word about that phrasing.
Document extraction is a shrinking market. Everyone's moving to structured APIs and nobody will be mailing PDFs in five years (they will, but let me have this).
Still using a shared drive of scanned PDFs from 2017, so call me when you handle a fax cover sheet with coffee stains.
The hardest document to extract from is the one the customer swore was standardized.
Love the framing. One more thing though: a confidence score per field in the UI would turn this from a tool into a workflow.
Extraction is the easy part. The actual pain is getting the output into three downstream systems that all disagree on what a date looks like.