Gemini’s new AI agent is about as good as Google’s demo
Gemini Spark is impressive, but it’s not worth the cost just yet.
Google’s new “24/7” AI agent, Gemini Spark, can be shockingly good at doing things on your behalf. But I’m not sure it’s worth the financial cost and potential privacy tradeoffs.
The company gave me access to Spark last week. Google advertises Spark as an AI agent that can take on tasks, even ones with multiple steps, and work on them in the background, allowing you to put your phone down or walk away from your computer. It also advertises at the very top of the Spark website that it’s “always under your direction,” that “you choose to turn it on,” and that “it’s designed to check with you before taking major actions.” Given the mounting skepticism toward AI, it’s very much “my ‘not involved in rogue AI’ T-shirt has people asking questions already answered by my shirt.”
I didn’t know where to start, so I took a page from my colleague Antonio’s book: I decided to use Spark to tackle tasks like what Google demonstrated onstage at I/O. Would it work as well in my home office as it did on the big stage?
At I/O, Google VP Josh Woodward showed off a few different examples. The first was asking Spark to draft an email to a team at Google, compile everything about the Gemini Live launches and “wins from last week,” and use a special AI skill to make the email sound like him. Google asking Google to do things for Google should be the easiest lift in the world, so I tried to push it further.
I asked Gemini to draft an email to my wife that compiles our average monthly grocery spending in 2026. I figured this test would tell me a few things: Could Spark figure out who my wife was (without me giving Spark her name), could it determine where our budget spreadsheet is in Drive (which does not have “budget” in the file name), and could it actually draft an email in Gmail?
When I got the result from Spark shortly after, I really said: “Wow, that’s actually nuts.” Spark found my wife’s email address, pulled the right information from our 2026 budget spreadsheet, grabbed the monthly grocery totals including the incomplete data from May (which still wasn’t over when I ran the test), averaged the totals, and put it all in a draft email in my Gmail. The text of the email addressed my wife by her first name, even though her email address does not contain her first name. It even included a sign-off that we use just for each other.
In his next example, Woodward asked for some help planning a block party. I’m not planning a block party, but I asked Spark for help using the same questions he asked. It didn’t go well. It created a table of friends and family as a “highly realistic reference for who is bringing what,” drafted an email in my Gmail mentioning a shared sign-up sheet that doesn’t exist, and created an ugly deck with slides detailing information about city permits.
To push Spark, I asked it to create that missing sign-up sheet and add a link to the email that was already drafted. Spark took a few minutes to figure it out, but the task did work: it created a spreadsheet, went back to the draft email text, and dropped in the link.
Woodward’s last demo was arguably the most impressive. He talked at Spark to ask it to do a bunch of things: make his meetings with CEO Sundar Pichai hot pink on his calendar, write a note to a new neighbor to invite him to his block party, and create a document to help with to-dos for his kids for the end of the school year. For my own version, I asked it to make a calendar event each month ahead of my wife’s birthday and make it hot pink, draft an email to my family about sending them the first episode of the latest season of Taskmaster, and create a document with the top things my wife and I need to know about getting our toddler ready for preschool.
I started this request at 3:35PM PT on Friday. During I/O, Woodward made a bit of a show about putting his phone down and promising to check the results later in the keynote, which he did. I didn’t have to wait nearly that long: after I addressed one hiccup (Spark wanted to access my contacts, which I declined), my task was done about four minutes later.
Once again, I was a little floored by the results, though they were imperfect:
My Google Calendar now has events from 9–10AM on the correct day of each month leading up to my wife’s birthday. The events are in what Google calls “flamingo,” which isn’t exactly “hot pink,” but close enough.
Spark grabbed the emails of my immediate family and put them in a draft email. (Strangely, it didn’t include my wife’s.) The text of the email got the name of the first episode of the latest season of Taskmaster correct, but linked to a trailer instead of the actual episode. The email also included the term “loool,” which is something I write in casual written conversation.
Spark made a Google Doc in my Drive with a preschool preparation checklist. However, it’s only available to me; I asked Spark if it could give access to my wife, but it said it isn’t currently able to do that.
Spark could be a powerful tool. But there are a few caveats I should mention. As with all AI tools, you still have to check its output to make sure it’s accurate, and the stakes could be higher when it’s pulling from personal information to prepare things you share with people you know. Although Google pitches Spark as something that can operate on its own, I found myself constantly watching it or checking the notifications it sent to my phone.