AI & ML · AI Insuretech

FurtherAI

Trusted AI for Insurance Professionals

Domain Specific AI for Insurance (a16z, YC)

San Francisco, CA · 582 followers

About

Eval Studio is a test bench that lets insurance teams check whether their AI workflows still hold up when a model, prompt, or pipeline changes. Users load a set of real submissions, define the criteria for a correct output (whether coverage limits were captured, whether exclusions were caught, whether extracted values fall within acceptable thresholds), then run the workflow and get an accuracy score plus a view of the specific cases that broke. After swapping a model or editing a prompt, teams rerun the eval and compare versions side by side to see what improved and what regressed before any of it reaches production.
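The loop described above (load cases, define pass criteria, score a run, then compare two versions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not FurtherAI's API or implementation: `Case`, `passes`, `run_eval`, `accuracy`, `compare`, the field names, and the 1% tolerance are all hypothetical, and the two toy workflows stand in for a real model or prompt version.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Output = Dict[str, object]
Workflow = Callable[[str], Output]  # hypothetical: submission text -> extracted fields


@dataclass
class Case:
    case_id: str
    submission: str   # stand-in for a real submission document
    expected: Output  # reviewer-approved values for this case


def passes(case: Case, output: Output, tolerance: float = 0.01) -> bool:
    """Numeric fields must fall within a relative tolerance; others must match exactly."""
    for key, want in case.expected.items():
        got = output.get(key)
        if isinstance(want, (int, float)) and not isinstance(want, bool):
            if not isinstance(got, (int, float)) or abs(got - want) > tolerance * abs(want):
                return False
        elif got != want:
            return False
    return True


def run_eval(workflow: Workflow, cases: List[Case]) -> Dict[str, bool]:
    """Run the workflow on every case and record pass/fail per case id."""
    return {c.case_id: passes(c, workflow(c.submission)) for c in cases}


def accuracy(results: Dict[str, bool]) -> float:
    return sum(results.values()) / len(results) if results else 0.0


def compare(before: Dict[str, bool], after: Dict[str, bool]) -> Dict[str, List[str]]:
    """List the cases that flipped between two runs over the same case set."""
    return {
        "improved": sorted(k for k in before if not before[k] and after[k]),
        "regressed": sorted(k for k in before if before[k] and not after[k]),
    }


cases = [
    Case("A", "sub A", {"coverage_limit": 1_000_000, "exclusions": ["flood"]}),
    Case("B", "sub B", {"coverage_limit": 500_000, "exclusions": []}),
]


def workflow_v1(submission: str) -> Output:
    # Toy stand-in for the old version: misses the flood exclusion on case A.
    table = {
        "sub A": {"coverage_limit": 1_000_000, "exclusions": []},
        "sub B": {"coverage_limit": 500_000, "exclusions": []},
    }
    return table[submission]


def workflow_v2(submission: str) -> Output:
    # Toy stand-in for the new version: fixes case A but drops a zero on case B.
    table = {
        "sub A": {"coverage_limit": 1_000_000, "exclusions": ["flood"]},
        "sub B": {"coverage_limit": 50_000, "exclusions": []},
    }
    return table[submission]


v1 = run_eval(workflow_v1, cases)
v2 = run_eval(workflow_v2, cases)
diff = compare(v1, v2)
print(accuracy(v1), accuracy(v2), diff)
```

In this toy run both versions score the same overall accuracy, yet the comparison shows one case improved and another regressed, which is the kind of silent change a single headline score would hide.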

The launch is aimed at insurance carriers, brokers, and MGAs already running AI against submissions, policy checks, and claims, where a silent regression on hundreds of documents can do real damage. Eval Studio sits inside FurtherAI's broader product, which covers submissions, policy checks, claims, and compliance and includes an AI Assistant, workflows, a human-in-the-loop interface, and integrations. It gives teams a way to adopt new model releases instead of staying frozen on an older one out of fear of switching.

FurtherAI is based in San Francisco and was co-founded by Aman Gour and Sashank Gondala. Aman is a second-time founder who previously scaled a workflow automation company past $1M ARR and worked as a Product Manager at Microsoft; Sashank is a former language modeling scientist at Apple. The company is backed by Andreessen Horowitz, Nexus Venture Partners, and Y Combinator, and Eval Studio arrives as the team builds out its library of insurance-specific workflows for production use.

Tags

<500K · Product launch · Explainer · Series A · B2B · Global · US · Vertical AI · Founder-led

Comments (14)

Priya Vembu · 6/11/2026

Every claims team I've talked to has a haunted spreadsheet tracking which prompts broke after the last model update. A test bench is overdue.

Tomi Adebayo · 6/11/2026

ok wait, you're telling me insurance teams were just YOLO-deploying new models into production and praying? hot take: that explains a lot about my last claim.

Lena Voss · 6/11/2026

The cold open on that launch video did more work in 6 seconds than most decks do in 20 slides. Whoever cut that earned their coffee.

Marcus Q. · 6/11/2026

Insurance and AI walk into a bar, the bar asks for three forms of ID and a notarized affidavit. Genuinely curious how you handle adjuster-specific edge cases though.

Derek Halloran · 6/11/2026

Before my procurement team even watches the demo: SSO, SOC2 Type II, data residency options? Carriers will ghost you on slide one without it.

Clara Mwangi · 6/11/2026

Question for the team: where does the test data sit during evaluation, and can carriers keep it inside their own VPC? My GC will ask before I do.

Finn O'Connell · 6/11/2026

Tagline rewrite, on the house: 'AI your underwriters won't have to babysit.' Yours is fine but mine fits on a hoodie.

Anjali Raman · 6/11/2026

Been watching this team since the seed deck. Eval Studio is the wedge I kept telling other GPs about, glad it's finally public.

Karol Wisniewski · 6/11/2026

Insurance AI feels crowded and the TAM for eval tooling inside a vertical is narrower than the pitch suggests. Convince me otherwise.

Hana Sato · 6/11/2026

Does Eval Studio expose an API for triggering runs from CI? Webhooks on regression detection would save me from refreshing a dashboard at 2am.

Dr. Obi Nwankwo · 6/11/2026

Curious whether you're benchmarking on held-out policyholder data or synthetic distributions. The eval methodology matters more than the UI here.

Maya Eskandari · 6/11/2026

If you're hiring forward-deployed engineers with insurance domain chops, I have a former actuary turned ML eng who would eat this role for breakfast.

Vikram Joshi · 6/11/2026

Third insuretech AI launch I've seen this quarter and the only one that picked a real wedge instead of a chatbot. Pacing of the rollout is solid.

Beni Okafor · 6/11/2026

Hear me out: eval runs as on-chain attestations so carriers can prove model regression history to regulators. I'll see myself out.