AI & ML · Proptech

Build AI

learning from humans

founder of @buildpbc
sf - sz - blr
42K followers
TLVC Rating

Sick video with great hook and animations.


About

Egocentric-1M is an open dataset of roughly one million hours of first-person video captured from factory workers wearing Build AI's custom head-mounted glasses, published on Hugging Face under an Apache 2.0 license for free commercial use. It is aimed at robotics teams and foundation model researchers training manipulation policies, visuomotor models, and embodied agents that need dense human hand and eye data from real work rather than staged lab demonstrations or third-person scene footage.

The launch matters because it caps a rapid scaling arc from the same team. Build AI released Egocentric-10K in November 2025, Egocentric-100K in December, and now Egocentric-1M, which it describes as the largest egocentric dataset ever released, dwarfing every prior dataset combined. The earlier 100K release covered over 100,000 hours from 14,228 factory workers across Southeast Asia and 10.8 billion frames, captured in real production environments including assembly lines, sorting, packaging, and machining, and the 1M tier extends that approach to a scale that targets pretraining of general-purpose robot foundation models.

Build AI is led by founder and CEO Eddy Xu. By April 2025, Xu had dropped out of Columbia to launch Build AI, which defines its mission as collecting human data to accelerate the deployment of general-purpose robots, and the company runs collection operations across SF, Shenzhen, and Bangalore. Framed by Xu as a step toward "the internet for physical AI," this release puts a corpus previously locked inside large labs into the hands of any team building robots that learn from how people actually do work.
Tags
<500K · Cinematic · Product launch · Series A · B2B · Global · AI-generated · US · Vertical AI
Comments (12)
Priya Raghavan · 4/28/2026

egocentric-1M is a fun name until you realize half my coworkers already qualify. what's the licensing situation for the footage?

Tomas Lindqvist · 4/28/2026

the cut from the warehouse POV to the kitchen pour at 0:14 actually slaps. whoever edited this earned their paycheck.

Okwuchi Eze · 4/28/2026

curious how you handled the long-tail of weird first-person motion (looking down at phone, sneezing, etc). that's where ego datasets usually fall apart.

Dharmendra P. · 4/28/2026

every robot trained on this is going to flinch when it sees a doorframe. respectfully.

Mira Voss · 4/28/2026

internet for physical AI is a strong frame. the second half of that tweet is doing more work than most seed decks I see.

Lucia Marchetti · 4/28/2026

got a perception eng who literally wrote her thesis on egocentric video. you hiring or do I keep her warm?

Beatriz Camargo · 4/28/2026

datasets at this scale are a beautiful capex story until storage and annotation bills show up. how are you thinking about unit economics on licensing this out?

Kenji Watanabe · 4/28/2026

we explored an internal first-person corpus at my last company in 2019 and killed it because legal screamed. interested how you got past that.

rafa · 4/28/2026

this might genuinely be one of the last big data drops a human team ships before agents start curating these themselves. screenshotting.

Noor Khalid · 4/28/2026

building adjacent in household robotics and honestly relieved someone is doing the data layer so the rest of us can focus. lightly jealous though.

Samir D. · 4/28/2026

the thumbnail is doing 40% of the lift here. clean type, moody POV shot, no cringe founder face. learn from this people.

Tara Olusegun · 4/28/2026

naive question but if the dataset is human POV, how do you bridge to robot embodiments with totally different cameras and heights? feels like a gap.