Reflecting on a year of Claude Code
When we first released Claude Code, it was like a little video, and I remember posting it to Slack, and there were like two people who reacted like they were excited.
我们第一次发布 Claude Code 的时候,就是一个小视频,我记得当时把它发到 Slack 上,只有两个人给了那种……就是有人很兴奋的反应。
I thought it was really cool, especially for my very easy engineering tasks.
我觉得真的挺酷的,特别是对我那些很简单的工程任务来说。
It was quite good at it.
它做得相当不错。
That's like a really nice way to say that it wasn't really good.
这是很委婉的说法,意思就是它当时其实没那么厉害。
I can't believe it's only been a year since we first launched Claude Code.
真不敢相信我们发布 Claude Code 才不过一年。
It's hard to remember what that was like.
很难回忆起那时候是什么感觉了。
Like, it's so different from what we're doing today.
跟我们现在做的东西真的差太多了。
Like, now I just have armies of agents that are doing stuff. I'm prompting one agent, or I have an agent that's prompting agents that are prompting agents, and it's like a tree of thousands of agents.
现在我随手就能调动一大堆 agent 在干活,我给一个 agent 发提示词,或者我有一个 agent 在给 agent 发提示词,然后那些 agent 又在给 agent 发提示词,就像一棵有几千个 agent 的树。
But I think the most important idea when working on this stuff is: every single time Claude makes a mistake, I don't tell Claude to do it differently.
但我觉得,做这类事情最重要的一个想法就是:每次 Claude 犯了错,我不会告诉 Claude 下次要怎么做不同。
I tell it to write it to the CLAUDE.md, or to make a skill or something, to do it differently.
我会让它把这件事写进 CLAUDE.md,或者做成一个 skill,或者用别的方式来改变这个行为。
And if you can do this, then Claude can just run forever.
如果你能做到这一点,Claude 就可以一直跑下去。
And I think the other thing that we kind of realized is that verification is really important.
我想我们也意识到了另一件事:验证真的很重要。
Like we didn't realize that.
我们之前没意识到这一点。
I hear this come up a lot with developers and enterprises that we meet with.
我跟很多开发者和企业打交道,经常听到这个话题。
Um, what are your tips for making Claude Code really good at verification?
那你有什么建议,能让 Claude Code 真的把验证做好?
I sort of feel like this is the thing that everyone misunderstands, because whenever we talk about verification, people are thinking unit tests, or they're thinking lint or type check.
我觉得这件事大家都搞错了,因为每次聊到验证,大家想到的是单元测试、lint,或者类型检查之类的。
These are the things that are obviously really easy to automate and these are the things that were already automated.
这些东西显然很容易自动化,而且本来就已经是自动化的了。
But actually when we talk about verification for agents, it's something slightly different.
但我们说的 agent 验证其实有点不一样。
It's like can the agent run the thing?
是说 agent 能不能真的把那个东西跑起来?
It takes a little bit of mental work to figure out exactly how you do this, because it's often not straightforward.
这需要花一点脑力去琢磨,具体怎么做,因为往往没有那么直接。
And I think that's one of the challenges.
我觉得这就是挑战之一。
I remember with Opus 4, Claude tested itself. We just hooked it up to Opus 4 and I was like, "Claude, build a feature and then test yourself in bash," and it opened a little claude CLI and tested its own feature, and I was just like, "Whoa, it's crazy."
我记得用 Opus 4 的时候,Claude 测试了自己,我们就把它连上 Opus 4,我说,Claude,做一个功能,然后在 bash 里测试自己,它就打开了一个小的 Claude CLI,然后测试了自己的功能,我当时就想,哇,太疯狂了。
Like, now we're so used to it.
现在我们已经习以为常了。
Like now, you know, we have these loops going for the iOS simulator and the Android simulator and computers for desktop.
现在你知道,我们有针对 iOS 模拟器、Android 模拟器,还有桌面端电脑的那些 loop 在跑。
Like, it's not surprising, but back then that was crazy.
现在觉得没什么大不了的,但当时真的挺震撼的。
How are you doing it?
你是怎么做的?
So, I've been mainly hacking on the desktop app these days, and one of the engineers on the team actually added this desktop development skill that teaches Claude how to run the local desktop app. I've been having it use it, and it still runs into issues or bugs with the staging environment sometimes.
我最近主要在捣鼓桌面应用,团队里有个工程师加了一个桌面开发 skill,教 Claude 如何把本地桌面应用跑起来,我一直在让它用这个,不过有时候还是会遇到 staging 环境的问题或者 bug。
And so what I have it do in those cases is have it read Slack and understand, "Hey, is staging down right now, or has someone else already hit this?"
所以遇到这种情况,我让它去读 Slack,搞清楚 staging 现在是不是挂了,或者有没有别人已经踩过这个坑。
Um and then when it debugs the whole issue I tell it to update the desktop development skill.
然后等它把整个问题调试完,我让它更新桌面开发 skill。
What the skill does is Claude actually spins up a local desktop app and it uses computer use to click around it.
这个 skill 的作用是 Claude 会在本地启动一个桌面应用,然后用 computer use 功能在里面点来点去。
And so when I add a new UX, it clicks around to invoke the new UX.
所以当我加了新的 UX 时,它会点进去触发那个新的 UX。
It also tests edge cases, and when there's an issue it fixes it.
它也会测试边界情况,遇到问题就修复。
It verifies it and rechecks.
然后验证一下,再重新检查。
This is like honestly one of my favorite things about this team is everyone codes.
说真的,这是我最喜欢这个团队的地方之一,就是每个人都在写代码。
I have never been on a team where, like, my PM would code, and it's crazy, and, like, your code is really good.
我从来没在哪个团队见过产品经理也在写代码,而且你的代码写得还很好,真的挺疯狂的。
Like
就是
here's your noise.
对你来说是噪音,但对我来说是惊喜。
But I also just feel like it's becoming easier, because essentially Claude writes the code.
不过我也觉得这件事变得越来越容易,因为说白了就是 Claude 在写代码。
And so what matters a little more is, what's the idea that you have? I feel like if you're a person that has the product context and the business context, and you're thinking about the design and the user, you're just going to come up with better ideas.
所以更重要的是你有什么想法,我觉得如果你是一个有产品背景、有商业视野,又在思考设计和用户的人,你就会想出更好的点子。
It's kind of like all the roles are merging.
感觉各种角色都在融合。
I remember seeing PRs from Megan or designers, and I was just horrified at the beginning.
我记得第一次看到 Megan 或者设计师提 PR,当时真的吓了一跳。
I was like, "Oh my god, why is Megan putting up PRs?"
我当时在想,天哪,Megan 为什么在提 PR?
And then she was like, "Yeah, yeah, I'm just fixing the button."
然后她说,对啊,我就是在修一下那个按钮。
And I was like, "Okay, all right.
我说,好吧,行吧,
Well, the code looks good, so maybe it's fine."
代码看着也没问题,那就没关系了。
And I feel like now it's just totally normal.
现在我觉得这已经完全是正常操作了。
Yeah.
是啊。
We see this across all the enterprises we talk with.
我们跟各大企业聊,都能看到这个趋势。
Like, the engineers adopt Claude Code first,
工程师先开始用 Claude Code,
and then the adjacent roles look over their shoulder and they're like, "Whoa, this thing is very powerful.
然后旁边那些相关岗位的人偷偷看了一眼,说这玩意儿好厉害,
Let me try it out."
我也来试试。
And we found, it's crazy.
然后我们发现,真的很神奇,
We found that our designers are more productive making prototypes and making changes directly in the app instead of pinging an engineer.
我们发现设计师直接在应用里做原型、直接改东西,比找工程师来做效率更高。
PMs are making changes in the app.
产品经理在应用里直接改东西。
Like, our finance team runs in Claude Code.
我们的财务团队在 Claude Code 里跑分析。
They do their projections there.
他们在里面做财务预测。
Um, data science, like, if you talk with our data scientists, it's so cool.
数据科学那边也一样,你去跟我们的数据科学家聊,太酷了。
It's just like everyone has Claude Code up on their screens.
就是每个人的屏幕上都开着 Claude Code。
Yeah.
是啊。
Um, I feel like it's remarkably versatile for different roles.
我觉得它对不同岗位的适应性真的很强。
What do you feel like are the use cases nowadays that are pushing the limit?
你觉得现在有哪些使用场景是真的在突破极限?
One that I'm super excited about is routines.
我特别兴奋的一个是 routines。
There's one engineer on our team who launched voice mode across all of our products, and he has this routine set up that just listens for every ticket that comes in, every GitHub issue, every bug report about voice mode, and his Claude just picks it up, proactively puts up a fix, and then pings the PR to him.
我们团队里有一个工程师负责在我们所有产品上线了 voice mode,他设置了一个 routine,专门监听每一个跟 voice mode 相关的 GitHub issue 和 bug 报告,他的 Claude 就会自动接单,主动提一个修复的 PR,然后 ping 他。
And when he got that working for voice mode, he thought, "Okay, we're getting a lot of other feedback that isn't being responded to."
等他把 voice mode 的那套弄好之后,他想,还有很多其他反馈没人回。
So he also set up a routine to listen for that.
于是他又为那些反馈设置了一个 routine 来监听。
So, I shipped this small feature, and there was an edge case in it that I didn't see.
我发了一个小功能,里面有个我没发现的边界情况。
And so someone filed a bug for it, and I was going to get to the bug that night.
有人提了一个 bug,我打算当晚去修。
And as my Claude was working, it said, "Wait a second, another Claude has already fixed this."
结果我的 Claude 在工作的时候说,等等,另一个 Claude 已经把这个修好了。
And I was like, "How's this possible?"
我就想,这怎么可能?
Like, I've never talked to him about this feature before.
我从来没跟他讲过这个功能。
And so I pinged him and I was like, "How did you fix this so quickly?"
于是我 ping 了他,问,你是怎么修得这么快的?
And he said he has another routine that just looks for bug reports that haven't been responded to in 5 hours and puts up a fix and he merges the ones that are easy to verify.
他说他有另一个 routine,专门找那些 5 小时内没有人响应的 bug 报告,然后提一个修复,容易验证的他就直接合并了。
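As a rough illustration of the rule he describes (find bug reports with no response after 5 hours), here is a minimal Python sketch. It is not the actual routine, whose implementation isn't shown in this conversation. The `BugReport` record, its field names, and the `unanswered_reports` helper are all hypothetical, made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class BugReport:
    # Hypothetical record; the field names are assumptions for illustration.
    title: str
    filed_at: datetime
    first_response_at: Optional[datetime] = None


def unanswered_reports(
    reports: List[BugReport],
    now: datetime,
    threshold: timedelta = timedelta(hours=5),
) -> List[BugReport]:
    """Return reports that have no response and have been open for at least `threshold`."""
    return [
        r
        for r in reports
        if r.first_response_at is None and now - r.filed_at >= threshold
    ]


# Fixed timestamp so the example is deterministic.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
reports = [
    BugReport("edge case in new feature", now - timedelta(hours=6)),
    BugReport("typo in settings", now - timedelta(hours=1)),
    BugReport(
        "crash on launch",
        now - timedelta(hours=8),
        first_response_at=now - timedelta(hours=7),
    ),
]

stale = unanswered_reports(reports, now)
print([r.title for r in stale])  # ['edge case in new feature']
```

In the setup described above, each report selected this way would then be handed to an agent that puts up a fix, with a person merging only the fixes that are easy to verify.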
Claude tells me this all the time now,
Claude 现在经常告诉我这种事,
that someone else has already fixed it.
说别人的 Claude 已经修好了。
There's always another person's Claude that's working on it.
总是有另一个人的 Claude 在处理这件事。
It's like
就是
Yeah, that's been one of the changes.
对,这是变化之一。
I feel like a while ago we were trying to figure out how to use routines.
我觉得我们很久之前一直在想怎么用 routines。
And I feel like the Agent SDK was this first idea that we could use Claude Code programmatically, but at the beginning it just wasn't obvious: how do we use it?
就像 Agent SDK 最初是我们想到可以用程序化方式调用 Claude Code 的一个点子,但一开始真的不明显该怎么用,
What do we use it for?
用来做什么?
And I think routines are the first really obvious application.
我觉得 routines 才是第一个真正显而易见的应用场景。
And, I don't know, it just does all the code review.
而且它就这么把所有代码审查都做了。
It babysits every PR.
它帮你盯着每一个 PR。
You know, remember back in the day you used to actually have to like respond to code review comments.
记得以前你还得真的去回复代码审查的评论吗?
You used to have to like fix CI.
以前还得手动修 CI。
You used to have to rebase.
以前还得手动 rebase。
Yeah.
是啊。
Like, I haven't done that in a long time.
我已经很久没做这些了。
Yeah.
是啊。
When you're in the CLI and you're synchronously working with Claude, what are your go-to features?
在 CLI 里同步跟 Claude 协作的时候,你最常用的是什么功能?
Okay.
好,
What they used to be is plan mode.
以前是计划模式。
I don't use that anymore.
我现在不用那个了。
I use
我用
What do you use instead?
那你现在用什么?
Auto mode.
自动模式。
Auto mode.
自动模式。
It's the best.
最好用了。