ポッドキャストに戻る Claude

はじめてのManaged Agentをリリースする

All right. さて。 Hello everyone. 皆さん、こんにちは。 It's great to see you all here today for our session on shipping your first manage agent. 本日は「初めてのManaged Agentを本番にシップする」セッションにお集まりいただき、大変嬉しく思います。 Let's go ahead and get started. では、さっそく始めましょう。 My name is Isabella He. 私はIsabella Heと申します。 I'm a member of technical staff at Anthropic on the Applied AI team. AnthropicのApplied AIチームのメンバーです。 The Applied AI team at Anthropic sits at the intersection of products, research, and our customers, which means that I get to contribute internally to products at Anthropic like Claude code and our Claude harnesses, as well as work externally with our customers that are building on top of Claude and on top of our harnesses. AnthropicのApplied AIチームは、プロダクト・リサーチ・お客様の三つが交わる位置に存在しており、私はClaude CodeやClaudeハーネスといったAnthropicの社内プロダクトへの貢献に加え、Claudeおよびその各種ハーネスの上に構築しているお客様との外部連携も担当しています。 So, my goal today is to get you all hands-on with actually building on top of manage agents, understanding how the harness works under the hood, and getting you ready to actually ship your first incident response management. 本日の目標は、皆さんにManaged Agentsの上で実際に手を動かしてもらい、ハーネスが内部でどのように動作するかを理解し、最初のインシデント対応管理エージェントを実際にシップできる状態にすることです。 So, the quick overview of today's agenda. 本日のアジェンダを簡単にご紹介します。 We're going to cover first a quick refresher of Claude manage agents. まず、Claude Managed Agentsの簡単なおさらいから始めます。 I want to talk you through a little bit about how this harness works under the hood and what makes it so special. このハーネスが内部でどのように動作し、何がそれほど特別なのかを少しご説明したいと思います。 Our team put a lot of thought into the architectural design of Claude manage agents to make sure that it runs ready and reliably for production-ready agents. 私たちのチームはClaude Managed Agentsのアーキテクチャ設計に多くの思慮を重ね、本番対応エージェントとして確実かつ信頼性高く動作するよう設計しました。 So, I want to talk you through a little bit of how that works so that then when we transition to the second portion here, which is the hands-on workshop, you'll actually understand what each of the primitives you're building actually mean for your agents under the hood. その仕組みについて少しご説明することで、後半のハンズオンワークショップに移った際に、皆さんが実際に構築するプリミティブがエージェントにとって内部でどんな意味を持つのかを理解していただけるようにしたいと思います。 So, for the majority of today's session, I want you all to actually have your laptops open, building alongside me, actually working inside of a repository, and getting you ready to actually spin up a working incident response agent. 本日のセッションの大部分では、皆さんにラップトップを開いたまま私と一緒に構築していただき、実際のリポジトリの中で作業し、動作するインシデント対応エージェントを立ち上げるところまで到達できる状態にしたいと思います。 Lastly, we'll talk a little bit about beyond the basics. 最後に、基本の先にあるものについても少しお話しします。 Today's session is the first session of a couple of other ones that will build on top of this on Claude manage agents. 本日のセッションは、Claude Managed Agentsをテーマとした一連のセッションの最初のものです。 Specifically, right after this one, I think there's another session on dreaming, which is one of my favorite new features with Claude manage agents for self-improving agents and memory built into the harness. 特に、このセッションの直後には「ドリーミング」に関するセッションが予定されており、これは私がClaude Managed Agentsの中で最も気に入っている新機能の一つで、ハーネスに組み込まれた自己改善エージェントとメモリに関するものです。 So, encourage everyone to dive in a little bit deeper into what else is in the box after we set you all up for success today with a quick introduction. 本日の簡単な入門で成功への土台を整えた後は、ぜひもっと深く掘り下げて、まだ箱の中に何が入っているかを探求していただければと思います。 So, let's first touch a little bit about how we got here with Claude manage agents. では、まずClaude Managed Agentsがどのようにして今日に至ったかについて触れてみましょう。 When we first released the very first Claude back in 2023, we released a messages API alongside access to Claude. 2023年に最初のClaudeをリリースした際、私たちはClaudeへのアクセスとともにMessages APIも同時にリリースしました。 This provided raw model access to all Claude models. これにより、すべてのClaudeモデルへの生のモデルアクセスが提供されました。 This became the very first way that people could programmatically build on top of Claude and essentially gave a way for people to access tokens in and tokens out via our Claude models. これがClaudeの上でプログラム的に構築できる最初の手段となり、Claudeモデルを通じてトークンの入出力にアクセスする方法を提供しました。 This also meant that for everyone building on top of Claude models, they had to implement all the various primitives themselves. これはまた、Claudeモデルの上で構築する全員が、さまざまなプリミティブをすべて自分自身で実装しなければならないことも意味していました。 Things like context management, the actual agent loop, compaction, etc. コンテキスト管理、実際のエージェントループ、コンパクションなどです。 All the primitives that come alongside making the agent work. エージェントを動作させるために必要なプリミティブのすべてです。 When models were less intelligent back in the early days of let's say 2023, some of these primitives were much simpler because agents could simply do less. 2023年初期のように、モデルがまだあまり賢くなかった頃は、エージェントができることがシンプルだったため、これらのプリミティブの多くはずっと単純なものでした。 But, as we evolved into now with higher model intelligence and as agents are able to take on more complex tasks and actually take actions within environments and come to actually do entire tasks for humans, the primitives that come alongside context management and managing an agent's ability to execute API calls and tool calls becomes much more complex. しかし現在のように、モデルの知性が高まり、エージェントが複雑なタスクをこなし、環境の中で実際にアクションを取り、人間のためにタスク全体を完遂できるようになると、コンテキスト管理とエージェントのAPI呼び出しやツール呼び出しの実行能力の管理に伴うプリミティブははるかに複雑になってきます。 So, that's when we moved to the agent SDK, which became a harness that allows you to programmatically call Claude code, one of our agents at Anthropic. そこで私たちはAgent SDKに移行しました。これはAnthropicのエージェントの一つであるClaude Codeをプログラム的に呼び出せるハーネスです。 So, Claude code is something that an agent has access to a computer and takes actions within file system. Claude Codeとは、コンピューターにアクセスし、ファイルシステムの中でアクションを取るエージェントです。 So, the agent SDK became a way for you to make Claude much more powerful by leveraging the power of Claude code within a harness. つまりAgent SDKは、ハーネス内でClaude Codeの力を活用することで、Claudeをはるかに強力にする手段となりました。 The main thing here though is that with the agent SDK, developers still had to manage hosting and scaling on their own and making sure that the agent SDK would be safe to run within their containers. ただしAgent SDKの場合、開発者はホスティングとスケーリングを自分で管理し、Agent SDKが自社のコンテナ内で安全に動作することを確保する必要がありました。 That's only then evolved into Claude managed agents, which is the first harness to be able to handle scaling and production ready components for you by Anthropic, providing things like a purpose-built harness, sandboxing, observability, tool runtime, all within a managed infrastructure system. それがさらに進化して Claude Managed Agentsとなりました。これはAnthropicが提供する、スケーリングと本番対応コンポーネントを代わりに処理してくれる最初のハーネスであり、専用設計のハーネス、サンドボックス、オブザーバビリティ、ツールランタイムをすべてマネージドインフラシステムの中に提供します。 This means that developers can focus on task and agent configuration, custom tool logic, the things that actually matter for bringing domain expertise and customizability to your agents, where you're handing off the rest of all the primitives and core compute and primitives of essentially managing the basics of agent running to Anthropic. これにより開発者は、タスクとエージェント設定、カスタムツールロジックなど、ドメインの専門知識とカスタマイズ性を自分のエージェントにもたらす上で本当に重要なことに集中でき、エージェント稼働の基本を管理するためのプリミティブとコアコンピューティングの残りはAnthropicに委ねることができます。 So, that brings me to managed agents as the fastest way to build production ready agents on Claude. これが、Claudeの上で本番対応エージェントを最速で構築するManaged Agentsです。 We've seen people build 10 to 15 times faster to production with Claude managed agents by leveraging our purpose-built harness. Claude Managed Agentsを活用することで、専用設計のハーネスを使って本番環境までの構築速度が10倍から15倍に向上したという事例を数多く目撃しています。 Part of the reason why we built Claude managed agents is because is because harnesses should evolve alongside your agents. 私たちがClaude Managed Agentsを構築した理由の一つは、ハーネスはエージェントと共に進化するべきだからです。 For example, back when we were building ourselves on top of models like Sonnet 4.5, we noticed that Sonnet 4.5 emitted a particular behavior called context anxiety. 例えば、私たちがSonnet 4.5のようなモデルの上で自ら構築していた頃、Sonnet 4.5が「コンテキストアンザイエティ」と呼ばれる特定の挙動を示すことに気づきました。 This meant that with Sonnet 4.5, Claude started wrapping up tasks early even when it still had room to spare in its context window. これはSonnet 4.5において、コンテキストウィンドウにまだ余裕があるにもかかわらず、Claudeがタスクを早期に終わらせようとすることを意味していました。 To manage that in our harness, we then added some mitigations to combat against this early stopping behavior. 私たちはハーネス内でこれを管理するために、この早期停止挙動に対抗する緩和策を追加しました。 But, when Opus 4.5 then came out, we actually saw this behavior go away, making all that work we had done inside of the harness essentially obsolete because Claude had evolved beyond that behavior that we had built into the harness to manage. しかしOpus 4.5が登場すると、この挙動が実際に消えていることが分かりました。Claudeがその挙動を超えて進化していたため、ハーネスの中に組み込んでいた対策がすべて事実上無用となってしまったのです。 So, the takeaway there is that it's a lot of work to maintain harnesses and make sure that they actually evolve alongside your agents, which is why with Claude managed agents, we want to make it really easy for Claude and Anthropic to handle all the complexities that come with compaction, caching, things like context anxiety, all these various primitives that come with actually making agent production ready and getting the most out of Claude. そこから得られる教訓は、ハーネスを維持管理し、エージェントとともに確実に進化させ続けることはかなりの労力が必要だということです。だからこそClaude Managed Agentsでは、コンパクション、キャッシング、コンテキストアンザイエティといったエージェントを本番対応にしClaudeを最大限活用するためのさまざまなプリミティブの複雑さすべてを、ClaudeとAnthropicが処理できるよう努めています。 So again, you can focus on the tasks, tools, and things that actually matter for building agents on Claude. つまり、皆さんはClaudeの上でエージェントを構築する上で本当に重要なタスク・ツール・事項に集中できるというわけです。 So three primary resources go into building on Claude managed agents. Claude Managed Agentsの構築には、主に三つのリソースが必要です。 First is the agent's endpoint, which is the persona and capabilities. 一つ目はエージェントのエンドポイントで、これはペルソナと機能に相当します。 This is the core system prompt that powers your agent. これがエージェントを動かすコアシステムプロンプトです。 Essentially here, you're defining the model, the MCP servers, the skills, the various components that your agent can actually leverage when it's able to run in that agent loop. ここでは基本的に、エージェントがそのエージェントループ内で実際に活用できるモデル、MCPサーバー、スキル、各種コンポーネントを定義します。 The next is the environments. 次は環境です。 You can think of this as the hands of the agent, where the previous one is the brain of the agent where the agent is thinking through what to execute, and then it's using an environment to actually have a space and a container to actually take action on your behalf. これはエージェントの「手」と考えることができます。先ほどの「脳」となるエージェントが何を実行するかを考え、次に環境を使って実際に空間とコンテナを確保し、あなたの代わりにアクションを取ります。 Sessions are next the way to tie together agents and environments. セッションは次に、エージェントと環境を結びつける手段です。 A single session has a spun up on an agent instance within an environment. 一つのセッションは、環境内のエージェントインスタンス上でスピンアップされます。 So you can connect the two together and actually stream events back to your user and start to take action on behalf of your humans as part of a Claude powered agent. この二つを接続し、ユーザーにイベントをストリームバックし、Claude搭載エージェントの一部として人間のために実際のアクションを取り始めることができます。 A key thing here, as I alluded to briefly before, Claude managed agent has the agent loop run server side. 先ほど少し触れましたが、重要な点として、Claude Managed Agentsはエージェントループをサーバーサイドで動作させます。 This means that a lot of the complexities that come with managing hosting and scaling are abstracted away. これにより、ホスティングとスケーリングの管理に伴う複雑さの多くが抽象化されます。 And when you close your laptop or you hit hard refresh on your agent that you're building on Claude managed agents, everything is maintained and you don't have to worry about durability, reliability, all these various aspects that usually come to bite you when you're trying to turn your agent from a prototype into production. ラップトップを閉じたり、Claude Managed Agentsの上で構築しているエージェントをハードリフレッシュしても、すべてが維持されており、耐久性・信頼性・プロトタイプを本番環境に転換しようとしたときに通常問題となるさまざまな側面を気にする必要がありません。 And lastly here, before we dive into the hands-on portion, is I want to talk you through a key design decision that went into Claude managed agents. そして最後に、ハンズオンに入る前に、Claude Managed Agentsを設計する上での重要な設計上の意思決定についてご説明したいと思います。 Previously, with a lot of agent harnesses, we saw the agent loop coupled tightly with tool execution. 以前の多くのエージェントハーネスでは、エージェントループがツール実行と密結合していました。 This design pattern made sense and still makes sense for some agents because you want to give the agent powerful abilities to actually take action within the environment. このデザインパターンは一部のエージェントにとって理にかなっており、今でも理にかなっています。なぜならエージェントに環境内で実際にアクションを取る強力な能力を持たせたいからです。 For instance, with Claude Code, we want the agent to be able to access various files on your computer, take action within a file system, and therefore it makes sense for the agent to have access to all those tools spun up on every container. 例えばClaude Codeでは、エージェントがコンピューター上のさまざまなファイルにアクセスし、ファイルシステム内でアクションを取れるようにしたいため、すべてのコンテナ上でスピンアップされるすべてのツールにエージェントがアクセスできるのは理にかなっています。 But, we also realized there are some constraints for this, especially with some agents where you essentially want to be able to decouple the hands from the brains of the agents. しかし私たちはまた、特にエージェントの「手」と「脳」を切り離したい場合など、一部のエージェントにはこれに対する制約が存在することにも気づきました。 For instance, credentials and uh credentials and security became a huge concern. 例えば、クレデンシャルとセキュリティが大きな懸念事項となってきました。 With the ability to have the agent access your file system, you can actually add very distinct sandboxing by decoupling these two components, where the agent is no longer able to access the actual credentials without encryption by decoupling the hands from the sandbox of the agent. エージェントがファイルシステムにアクセスできるようにすることで、エージェントの「手」とサンドボックスを切り離すことによって、エージェントが暗号化なしに実際のクレデンシャルにアクセスできないようにし、非常に明確なサンドボックスを追加できます。 The other aspect here is actually you can see huge benefits by doing these decoupling on things like time to first token and latency. もう一つの側面は、この分離によってタイム・トゥ・ファースト・トークンやレイテンシなど、大きなメリットが得られることです。 Previously, with the agent loop into execution in the same box, you had to spin up containers for every single session that you're spinning up in the agent, which contributed to additional latency from a time to first time to first token perspective. 以前は、エージェントループと実行が同じコンテナに入っていたため、スピンアップするセッションごとにコンテナを起動する必要があり、タイム・トゥ・ファースト・トークンの観点から追加のレイテンシが発生していました。 But, with this now decoupled, our teams actually saw reductions in time to first token along the lines of over 90% reduction in TTFT for our P95 metrics on latency. しかし今は分離されたことで、私たちのチームはレイテンシのP95メトリクスにおけるTTFTで90%超の削減を実際に確認しました。 So, here you can start to see the power of this design decision coming through from the perspective of safety, reliability, latency, and everything else that you care about when it comes to building production-ready agents. つまり、安全性・信頼性・レイテンシ、そして本番対応エージェントを構築する上で重要なその他のあらゆる観点から、この設計上の意思決定の力が現れてくるのです。 All right, so now it's time for the exciting part of today's session, which is where I want you all to open up your laptops and go to this URL here to actually clone a repository, and let's start to actually feel the magic of everything that I just talked through. さて、本日のセッションで最も楽しい部分に入る時間となりました。皆さんにラップトップを開いてこちらのURLにアクセスしていただき、リポジトリをクローンして、今お話ししたすべての魔法を実際に感じていただきたいと思います。 So, I'm going to give everyone a second to just go over to that URL there and just spin up the repository that we have ready for you. では、皆さんが先ほどのURLにアクセスして用意されたリポジトリをスピンアップするのに少し時間をとりましょう。 All right, so here are some additional commands that I want you all to run to make sure this is all set up on your computers. さて、コンピューターへのセットアップを確認するために、皆さんに実行していただきたい追加コマンドをご紹介します。 So, the first step many of you might have done already, but just take that repository, hit the URL, get clone it, and then I want you to CD into the specific repository for the session, which is ship your first manage agent. 最初のステップはすでに完了している方も多いと思いますが、リポジトリを取得し、URLにアクセスしてクローンし、本セッション用のリポジトリ「ship your first manage agent」の中にCDします。 And then, if you're on Mac, you'll see those two commands on the side, the Python and the source. Macをお使いの方は、サイドにPythonとsourceの2つのコマンドが表示されます。 Um, there's a command there for Windows as well. Windowsのコマンドもあります。 And you'll just do the rest there where you want to install the requirements, copy over the environment key into your .env file. 次にrequirementsのインストール、.envファイルへの環境キーのコピーなど残りの作業を行います。 Um, here you'll put in the Anthropic API key that hopefully all of you also received from the QR code for free credits earlier. ここで、先ほどQRコードから無料クレジットとして受け取られた方も多いであろうAnthropicのAPI keyを入力します。 And lastly, we'll just run the app. 最後にアプリを実行します。 All right, let's go ahead and dive in, but as I mentioned before, let me just show everyone where these instructions are. では進みましょう。先ほどお伝えしましたが、この手順がどこにあるか全員にお見せします。 If you go into the repository in the link and then go to ship your first manage agents, you scroll down on the read me, you'll see all the setup instructions here. リンクのリポジトリに移動し、「ship your first manage agents」を開いてREADMEをスクロールダウンすると、すべてのセットアップ手順が表示されます。 So, feel free to do this, um, as we go along or even in your own time later today and continue playing around with it, but as I mentioned before, everything will be also shown on the screen to follow along with. 作業を進めながらでも、または後ほど今日中に自分のペースで行っていただいて構いません。すべてフォローできるよう画面にも表示しますのでご安心ください。 So, do not worry if you did not have time to fully get it set up on your laptop. ラップトップでのセットアップが間に合わなかった方もご心配なく。 Without further ado, let's go ahead and dive in. それでは、さっそく始めましょう。 So, once you run streamlit run app.py, you should be able to see a URL that looks like this and a page that looks like this. streamlit run app.pyを実行すると、このようなURLとこのようなページが表示されるはずです。 What we're doing here is we're going to be simulating an agent, um, interaction here where we have an incident that's going to come up. ここではインシデントが発生するシナリオでエージェントのインタラクションをシミュレートします。 A lot of you who might be software engineers in the room will be intimately familiar with the pain that comes alongside incident response. ソフトウェアエンジニアの方々は、インシデント対応に伴う苦労を身にしみて理解されていることでしょう。 If you are software engineer, you might be woken up at, let's say, 3:00 a.m. in the morning, 2:00 a.m. in the morning when you're out around on on vacation as you're on call, and this is usually a very painful portion of a software engineer's life, uh, because when you're on call, it means that if a server goes down or a service goes down, you have to be immediately the one there to respond and tackle the incident. ソフトウェアエンジニアであれば、休暇中にオンコール中で夜中の3時や2時に叩き起こされた経験があるかもしれません。サーバーやサービスがダウンすれば、すぐに対応して問題を解決しなければならない担当者になるわけですから、オンコールというのはソフトウェアエンジニア生活の中でも特に辛い部分です。 Usually for a human, this means diving into metrics and logs and deployments. 通常、人間にとってこれはメトリクスやログやデプロイメントを掘り下げることを意味します。 You can actually investigate what's going on. 実際に何が起きているのかを調査できます。 And so, what we're going to do is we're going to now have an agent run on Claude manage agents to do all this for us. そこで今回は、Claude Managed Agentsの上で動作するエージェントがこれをすべて代行するようにします。 So, that when we get woken up by 3:00 a.m., we can hand it off to an agent, or maybe we don't even get woken up at all if Claude is able to do everything for us. 夜中の3時に叩き起こされたときにエージェントに引き継ぐか、Claudeがすべてやってくれれば起こされることすらなくなるかもしれません。 Okay. では。 So, let's now go ahead and dive into the code here. コードに飛び込みましょう。 What we're going to open up here is we have the agent.py file on the left and the agent complete on the right. 左側にagent.pyファイル、右側にagent completeを開きます。 If you want to challenge yourself, you can of course try to implement everything yourself here or with Claude. もちろん、すべてを自分でまたはClaudeと一緒に実装してみたい方は挑戦していただいて構いません。 Um but, what we're going to do just for simplicity's sake is just copy over various elements from the completed file onto the incomplete file one by one. ただ今回はシンプルさのために、完成ファイルからさまざまな要素を未完成ファイルに一つずつコピーしていきます。 So, you can see how these primitives compose our agent one piece at a time. これによって、これらのプリミティブがエージェントをどのように一つずつ構成しているかを確認できます。 So, let's go ahead and start off with this very first part, which is the agent. では最初のパート、エージェントから始めましょう。 We mentioned before that the agent is the one that defines the persona and the capabilities of the agent here. 先ほど述べたように、エージェントはここでのエージェントのペルソナと機能を定義するものです。 So, that's model, the system prompts, and the tools in our case for our agent here. 私たちのエージェントの場合、モデル、システムプロンプト、ツールがそれに相当します。 So, let me go ahead and copy over what we see there on the screen. 画面に表示されているものをコピーしましょう。 And you can see here that we're defining the SRE agent. ここでSREエージェントを定義しているのが確認できます。 We're going to use Claude Opus 4.7 here. Claude Opus 4.7を使用します。 And I've preconfigured a system prompt and tools for the agent. エージェント用にシステムプロンプトとツールをあらかじめ設定してあります。