Tech companies desperately want to film you doing chores
Startups are paying people for the real-world data needed to train their robots.
This week, an AI training startup called Shift said it would clean New Yorkers’ homes for free. It plans to expand into other cities as well, including London, and looking around my flat, I get the appeal.
But there’s a catch. There’s always a catch.
In exchange for the cleaning, Shift wants footage of its cleaners at work: scrubbing dishes, wiping counters, dusting tables, mopping floors. It wants everything: video of all the boring domestic labor we’d happily outsource if we could, the same labor that robotics companies are racing to teach machines so they can sell us something to do it for us.
That’s harder than it sounds. Unlike chatbots, image generators, and other AI tools that have exploded in recent years, robots have to deal with the physical world. That means understanding space, motion, force, friction, weird shapes and materials, awkward lighting, and everything else that humans (and other organics) tend to grasp instinctively. It’s why things that are generally easy for us, like folding clothes, picking up an apple, or pouring a glass of water, have proven so maddening for roboticists to codify.
Teaching machines to do those things takes data. Lots of it. Text, images, and videos could be easily scraped from the internet at an industrial scale. And they were, often without compensating the people who made them. The physical world is harder to scrape, and harder still to scrape quietly without paying for it. This means access to high-quality data is a massive bottleneck for companies developing physical AI. Supplying it is a lucrative opportunity, so companies like Shift are getting creative.
They’re not alone. In India, recent reporting revealed that home services platform Pronto has been using clients’ homes as a source of AI training footage for chores like cooking, cleaning, and laundry. Pronto says it only records footage if customers explicitly opt in (it’s not clear what customers get in return, other than a copy of the footage), but the practice still set off a wave of backlash in the market, with rival startups insisting they have never recorded inside homes to train AI and have no plans to do so.
Other startups are focused on trying to scale data collection. Silicon Valley-based Human Archive, for example, hopes to partner with companies like Pronto and have gig workers record their activities using not-so-stylish camera caps. The hats collect footage from the wearer’s point of view, exactly the kind of “egocentric” or first-person data robotics companies need to teach machines how people navigate physical space. Shift, meanwhile, also taps consumers directly, and claims to have paid tens of thousands of people across 15 countries to record their activities through its app.
Some companies are skipping useful work altogether. Instead, workers are paid to complete the exact same physical tasks again and again while cameras and sensors capture every movement. Such staged data farms are designed to turn rote physical activity (folding towels, picking up cups, carrying boxes) into AI training material valuable enough to justify paying people to create it.
And some data is generated by robots already out in the world. Despite the hype, true automation is still a long way away (hence the need for all this data), but companies are keen to ship products anyway. They’ll use data from customers’ homes to improve the product. Many companies rely on remote workers to step in when the robots inevitably get stuck. They’ll use that data too.
Of course, the act of trading data for something of value is not new. Companies have been offering discounts, convenience, and free services in exchange for access to your data for years, from loyalty cards and cookies to dashcams, insurance apps monitoring how people drive, and that heinous smart TV that’s always showing ads.
What’s new is the kind of data companies are willing to pay for. For now, that means maybe letting a human in a snazzy hat clean your home for free so that, eventually, a company can sell you a robot to do it instead.