
Build a production-ready agent with Claude Managed Agents

Hi everybody. How's everybody doing? Okay, cool. Raise your hands if you've heard the phrase Claude Managed Agents get brought up like 50 times today. Okay. And keep your hand raised if you have any idea what Claude Managed Agents even is. Okay, not a lot of hands. Cool. And raise your hands if you're excited to learn about Claude Managed Agents today. Okay, there we go. I like that.

Hopefully this will be a little bit more of a technical deep dive. Myself and a couple of my peers talked earlier today at a very high level. We showed you some slides and talked about the primitives. We'll do a little bit of that here too, but really what I want to make sure we do is have our laptops out and start coding. I have a little starter repo for us to go through together, and my hope is that by the end of the session you will be able to get started building on Claude Managed Agents today. Does that sound good to everybody? Cool.

Quick agenda. We'll do a quick overview of what Claude Managed Agents is, going over a lot of the same discussion points from earlier today. Then we'll spend some time actually building something relatively from scratch, something you could deploy to production today if you wanted to. And then we'll wrap up the session with some more advanced topics around Claude Managed Agents and the composable things you can do with it.

So, Claude Managed Agents at a high level is just a set of API endpoints that we've developed and released. You can go use them with any API key today. They give you access to a scale-ready, production-ready agent and all of the primitives around it, which you can build your own products on top of. You can pick and choose whatever primitives you need, ditch the rest, and build whatever product experience you need on top.
So we took care of a lot of things, like giving Claude access to a computer; giving it access to credential vaults if you want to inject things like MCP authentication for your own end users, say if you need access to their Linear MCP (or, more accurately, Claude needs access to it); the tool-calling harness; and the retries and error recovery that might happen in production. There's also a bunch of really nice primitives around memory, context management, and multi-agent that make it really easy to skate the exponential and just benefit from each new model family as it comes out. And the really nice thing with Claude Managed Agents, which we'll show a little later today, is that we built a lot of really nice observability views into our developer console, which you can go to today to live debug what these agents are up to. We'll show a little bit of that as well.

Cool. So, the main building blocks for Claude Managed Agents start with the agent itself, which you can think of as a template. You want your agent to have a certain system prompt. You want it to have access to certain skills. Maybe you want to define which tools that agent has access to. Some agents you want to have access to the bash tool and web search. Some agents you really want to make sure don't have access to the web, because you don't want them to get prompt injected or whatever. You can really pick and choose which tools they have access to, and you can also choose which MCP servers they get connected to, which is also really nice. Another really nice aspect of setting up these agents is that you can define certain permission controls on a per-tool basis.
So you can decide that something like the file read tool can just auto-execute, whereas something like executing bash or calling your database's MCP server requires explicit approval from your own end users.

Once you configure that agent, you can also configure an environment. These environments define the template for how the sandboxes that Claude has access to behave. You can define whether or not you want those containers to have network access. Maybe you want to pre-install certain packages from npm or pip. You can define all of that at the environment level. And today, since we announced self-hosted environments and sandboxes, you can also bring your own sandboxes that don't run on Anthropic infrastructure. You can use something like Cloudflare, Modal, or Vercel out of the box, or even your own sandboxing fleet.

Once you configure those two things, you can get started talking to that agent, right? You have an ID for your agent and an ID for your environment. That's all you need to start a session. You can think of a session as an ongoing conversation between you and Claude. It's the equivalent of going to claude.ai and clicking new, or entering Claude Code from your terminal. Once you do that, you can start submitting events from your own end users, something like them asking for something to happen. Maybe that end user is asking for a PR to get put up. All of that happens on the session layer. When you create these sessions, you can also include certain GitHub repositories that you want Claude to have access to, or certain files that you want preloaded into the container itself. All of those will be downloaded and provisioned on your behalf.

And then finally there are the events, right?
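To make the first three building blocks concrete, here is a minimal sketch of an agent template with per-tool permissions, an environment template, and a session that ties the two IDs together. It uses plain Python dataclasses only. Every class, field, tool name, and ID below is hypothetical and chosen for illustration; none of it is the actual Claude Managed Agents API, which the talk does not spell out here.

```python
from dataclasses import dataclass, field
from enum import Enum


class Permission(Enum):
    AUTO = "auto"  # tool call runs without asking anyone
    ASK = "ask"    # tool call waits for explicit end-user approval


@dataclass(frozen=True)
class AgentTemplate:
    """Hypothetical agent definition: prompt, tools, MCP servers."""
    system_prompt: str
    tools: dict[str, Permission]
    mcp_servers: tuple[str, ...] = ()

    def needs_approval(self, tool: str) -> bool:
        if tool not in self.tools:
            raise KeyError(f"agent has no access to tool {tool!r}")
        return self.tools[tool] is Permission.ASK


@dataclass(frozen=True)
class EnvironmentTemplate:
    """Hypothetical sandbox definition: networking and preinstalled packages."""
    network_access: bool = False
    pip_packages: tuple[str, ...] = ()
    npm_packages: tuple[str, ...] = ()


@dataclass
class Session:
    """Hypothetical session: one agent ID plus one environment ID."""
    agent_id: str
    environment_id: str
    repositories: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)


# File reads auto-execute; bash requires approval. No web search tool at all.
agent = AgentTemplate(
    system_prompt="You are a coding agent.",
    tools={"file_read": Permission.AUTO, "bash": Permission.ASK},
)
environment = EnvironmentTemplate(network_access=False, pip_packages=("pytest",))
session = Session(agent_id="agent_123", environment_id="env_456",
                  repositories=["example-org/example-repo"])
```

The point of the sketch is the separation: the agent and environment are reusable templates, while the session is the per-conversation object that references both and carries the repositories and files to provision.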
These sessions are just ongoing event streams where you submit messages from your client and we return responses as Claude processes them and does whatever it needs to be doing. That's effectively the four major aspects of Claude Managed Agents. There are a couple of other things along the sides, but those are the main primitives that you have to worry about.

So, diving a little deeper into the events that we have. Like I mentioned, there are user events. These are things that you might want to submit from your own application. Maybe your user said something, or maybe your platform just auto-submitted some sort of user event. These can include images, documents, text, whatever you need. You can also send interrupts from your own product, your own platform, and use those to cut Claude off if it was going off on a bad tangent or doing something wrong and you wanted to steer it. So you can use that to prevent it from taking any sort of dangerous action. You can also submit tool results for custom tools that you define and execute in your own platform, or confirmations for human-in-the-loop controls on any server-executed tools.

And then you can also define outcomes, which allow you to pass us either a file or a blob of text that you wrote. It's almost like a spec: Claude will do a first pass at implementing it, or doing the thing that the spec defines, and then it will enter a mode where it iterates and checks its work against that rubric over and over again until it finds that it actually satisfies the rubric. We found that to be really powerful. It was a really great way to set up your agents for success.

Next we have agent events.
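The outcome behavior described above (a first pass, then repeated checking against a rubric until it is satisfied) can be illustrated with a small loop. This is a toy sketch under stated assumptions, not how the service implements outcomes: the function names `run_outcome`, `unmet`, and `revise`, the `max_rounds` cap, and the substring-based rubric are all hypothetical.

```python
from typing import Callable


def run_outcome(
    first_pass: str,
    unmet: Callable[[str], list[str]],
    revise: Callable[[str, list[str]], str],
    max_rounds: int = 5,
) -> tuple[str, int]:
    """Revise a draft until `unmet` reports no failing rubric items.

    Returns the final draft and the number of revision rounds used.
    Raises RuntimeError if the rubric is still unmet after `max_rounds`.
    """
    draft = first_pass
    for rounds in range(max_rounds + 1):
        failing = unmet(draft)
        if not failing:
            return draft, rounds
        if rounds == max_rounds:
            break
        draft = revise(draft, failing)
    raise RuntimeError(f"rubric still unmet after {max_rounds} rounds: {failing}")


# Toy rubric: the result must mention both of these items.
rubric = ["tests", "docs"]


def check(draft: str) -> list[str]:
    # Rubric items that the draft does not yet mention.
    return [item for item in rubric if item not in draft]


def fix_one(draft: str, failing: list[str]) -> str:
    # Address a single failing item per round.
    return f"{draft} {failing[0]}"


final, rounds_used = run_outcome("code", check, fix_one)
```

In this toy run the first pass fails both rubric items, each revision fixes one, and the loop stops as soon as the check comes back empty. The cap on rounds is a design choice of the sketch so that an unsatisfiable rubric fails loudly instead of looping forever.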
Agent events are just anything that Claude is doing. These are things like messages that Claude is sending back to the user; compaction, for when Claude needs to compact its context window because it got too large; and the tools that it's executing on your behalf, whether those are MCP tools or the default agent tools that we've defined. There are also multi-agent coordination events for when Claude decides to spawn other agents to help it do its work. We'll show a little bit of what that looks like during the live demo.

Next we have session events. Session events are just lifecycle events that let you know when the status of the session is changing. Maybe it entered a retry loop. Maybe there's some sort of error. Maybe the session needs to idle for whatever reason, or it needed to terminate. So there's a bunch of state transitions and error events. And then there's also an event for outcome processing.
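As a rough illustration of the three event families (user, agent, session), the sketch below classifies events from a stream by category and tracks the most recent session lifecycle event. The dotted type strings such as `agent.tool_use` and `session.status_idle`, and the dictionary shape of an event, are assumptions made up for this example; the talk does not give the real event names or payloads.

```python
from collections import Counter

KNOWN_CATEGORIES = {"user", "agent", "session"}


def split_type(event_type: str) -> tuple[str, str]:
    """Split a hypothetical 'category.detail' type string into its parts."""
    category, _, detail = event_type.partition(".")
    if category not in KNOWN_CATEGORIES:
        raise ValueError(f"unknown event category in {event_type!r}")
    return category, detail


def summarize(events: list[dict]) -> dict:
    """Count events per category and remember the latest session event."""
    counts: Counter = Counter()
    last_session_status = None
    for event in events:
        category, detail = split_type(event["type"])
        counts[category] += 1
        if category == "session":
            last_session_status = detail
    return {"counts": dict(counts), "last_session_status": last_session_status}


# A made-up stream: the user asks for a PR, the agent works, the session idles.
stream = [
    {"type": "user.message", "text": "Please put up a PR for this fix."},
    {"type": "agent.tool_use", "tool": "bash"},
    {"type": "agent.message", "text": "The PR is up."},
    {"type": "session.status_idle"},
]
summary = summarize(stream)
```

A client built this way can route user events from its own application, render agent events as they arrive, and use session events to drive UI state such as "retrying" or "idle".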