Subquadratic
Launch orchestrated by The Launch Video Company (TLVC)
The first model built for long‑context tasks
The launch absolutely crushed it: a great video, a compelling product, and a topic spicy enough to spark real debate. It took off on X almost instantly, hitting 5 million views in just a few hours.
12M context is cute, but I need to know if I can pipe my entire Confluence into it without legal having a stroke. Where do you sit on SOC 2 and data residency in the EU?
52x faster than FlashAttention is the kind of number that either changes everything or quietly disappears from the README in six months. Rooting for the former.
Reply guys are arguing about benchmarks while the real question is whether the founder is going to drop a technical blog post or make us reverse engineer it from a podcast.
The tweet buried the lede under three bullets and a dash. 'First frontier model with 12M context' should have been line one, not line three.
Every long-context demo I've ever seen finds the needle in the haystack and then hallucinates the barn. Show me recall on a 10M token contract review and I'll believe you.
What does the API surface look like? Streaming on 12M tokens, rate limits, and is there a self-hosted path or are we all sharing one very tired endpoint?
Tagline rewrite, on the house: 'Attention, but it scales.' You can Venmo me later.
The launch video pacing is genuinely good. Whoever cut the architecture diagram into the benchmark reveal earned their paycheck this quarter.
Sub-quadratic sparse attention has been the academic unicorn for years. If this actually ships in production with real users, half of NeurIPS owes you a drink.
First three seconds of the demo video are just a logo fade. You had a 52x number and you opened with vibes.
Retention question nobody asks at launch: how many of those 12M tokens does a user actually fill before churning back to a 200k context model that's good enough?
Counterpoint: most people asking for 12M context have a RAG problem they refuse to solve.
Waiting for the inevitable $SUBQ token-gated inference tier. Don't do it. (Please don't do it.)
A long context window is a longer rope. Whether you climb or hang yourself with it depends on the eval suite.