Terug naar podcasts Claude

Je eerste Managed Agent in productie

All right. Goed. Hello everyone. Hallo iedereen. It's great to see you all here today for our session on shipping your first manage agent. Fijn om jullie hier vandaag te zien voor onze sessie over het bouwen van je eerste Managed Agent. Let's go ahead and get started. Laten we beginnen. My name is Isabella He. Mijn naam is Isabella He. I'm a member of technical staff at Anthropic on the Applied AI team. Ik ben member of technical staff bij Anthropic, in het Applied AI-team. The Applied AI team at Anthropic sits at the intersection of products, research, and our customers, which means that I get to contribute internally to products at Anthropic like Claude code and our Claude harnesses, as well as work externally with our customers that are building on top of Claude and on top of our harnesses. Het Applied AI-team bij Anthropic bevindt zich op het snijpunt van producten, onderzoek en onze klanten. Dat betekent dat ik intern kan bijdragen aan producten zoals Claude Code en onze Claude-harnesses, en extern samenwerk met klanten die bouwen op Claude en op onze harnesses. So, my goal today is to get you all hands-on with actually building on top of manage agents, understanding how the harness works under the hood, and getting you ready to actually ship your first incident response management. Mijn doel vandaag is jullie hands-on aan de slag te laten gaan met Managed Agents, te begrijpen hoe de harness werkt onder de motorkap, en jullie klaar te stomen om je eerste incident-response-agent te shippen. So, the quick overview of today's agenda. Dus, een kort overzicht van de agenda voor vandaag. We're going to cover first a quick refresher of Claude manage agents. We beginnen met een korte opfrissing van Claude Managed Agents. I want to talk you through a little bit about how this harness works under the hood and what makes it so special. Ik wil je meenemen door hoe deze harness onder de motorkap werkt en wat hem zo bijzonder maakt. Our team put a lot of thought into the architectural design of Claude manage agents to make sure that it runs ready and reliably for production-ready agents. Ons team heeft veel nagedacht over het architectureel ontwerp van Claude Managed Agents, zodat het betrouwbaar en klaar is voor productie-agenten. So, I want to talk you through a little bit of how that works so that then when we transition to the second portion here, which is the hands-on workshop, you'll actually understand what each of the primitives you're building actually mean for your agents under the hood. Ik wil je dus een beetje meenemen in hoe dat werkt, zodat je bij de overgang naar het tweede deel, de hands-on workshop, echt begrijpt wat elk van de primitieven die je bouwt onder de motorkap betekent voor je agenten. So, for the majority of today's session, I want you all to actually have your laptops open, building alongside me, actually working inside of a repository, and getting you ready to actually spin up a working incident response agent. Voor het grootste deel van de sessie wil ik dat je je laptop open hebt, samen met mij bouwt in een repository, en klaar bent om een werkende incident-response-agent op te starten. Lastly, we'll talk a little bit about beyond the basics. Ten slotte bespreken we kort wat er verder mogelijk is. Today's session is the first session of a couple of other ones that will build on top of this on Claude manage agents. De sessie van vandaag is de eerste van een reeks sessies die voortbouwen op Claude Managed Agents. Specifically, right after this one, I think there's another session on dreaming, which is one of my favorite new features with Claude manage agents for self-improving agents and memory built into the harness. Direct hierna is er een sessie over dreaming, een van mijn favoriete nieuwe features van Claude Managed Agents, voor zelfverbeterende agenten en geheugen ingebouwd in de harness. So, encourage everyone to dive in a little bit deeper into what else is in the box after we set you all up for success today with a quick introduction. Dus moedig ik iedereen aan om daarna wat dieper in te duiken in wat er verder mogelijk is, nadat we je vandaag met een korte introductie op weg hebben geholpen. So, let's first touch a little bit about how we got here with Claude manage agents. Laten we eerst kort stilstaan bij hoe we hier gekomen zijn met Claude Managed Agents. When we first released the very first Claude back in 2023, we released a messages API alongside access to Claude. Toen we in 2023 de allereerste Claude uitbrachten, hebben we ook een Messages API vrijgegeven samen met toegang tot Claude. This provided raw model access to all Claude models. Dat gaf directe modeltoegang tot alle Claude-modellen. This became the very first way that people could programmatically build on top of Claude and essentially gave a way for people to access tokens in and tokens out via our Claude models. Dit werd de eerste manier waarop mensen programmatisch konden bouwen op Claude en essentieel tokens in en tokens uit konden halen via onze Claude-modellen. This also meant that for everyone building on top of Claude models, they had to implement all the various primitives themselves. Dat betekende ook dat iedereen die op Claude-modellen bouwde, zelf alle primitieven moest implementeren. Things like context management, the actual agent loop, compaction, etc. Zaken als contextbeheer, de eigenlijke agent-loop, compactie, enzovoort. All the primitives that come alongside making the agent work. Alle primitieven die nodig zijn om een agent te laten werken. When models were less intelligent back in the early days of let's say 2023, some of these primitives were much simpler because agents could simply do less. Toen modellen in de begindagen van 2023 nog minder intelligent waren, waren sommige primitieven veel eenvoudiger, omdat agenten simpelweg minder konden doen. But, as we evolved into now with higher model intelligence and as agents are able to take on more complex tasks and actually take actions within environments and come to actually do entire tasks for humans, the primitives that come alongside context management and managing an agent's ability to execute API calls and tool calls becomes much more complex. Maar naarmate we nu meer modelintelligentie hebben en agenten complexere taken op zich kunnen nemen, acties kunnen uitvoeren in omgevingen en hele taken voor mensen kunnen voltooien, wordt het beheer van contextmanagement, API-aanroepen en tool-calls steeds complexer. So, that's when we moved to the agent SDK, which became a harness that allows you to programmatically call Claude code, one of our agents at Anthropic. Daarom zijn we overgestapt naar de Agent SDK, een harness waarmee je programmatisch Claude Code kunt aanroepen, een van onze agenten bij Anthropic. So, Claude code is something that an agent has access to a computer and takes actions within file system. Claude Code is een agent die toegang heeft tot een computer en acties uitvoert binnen het bestandssysteem. So, the agent SDK became a way for you to make Claude much more powerful by leveraging the power of Claude code within a harness. De Agent SDK werd zo een manier om Claude veel krachtiger te maken door gebruik te maken van de kracht van Claude Code binnen een harness. The main thing here though is that with the agent SDK, developers still had to manage hosting and scaling on their own and making sure that the agent SDK would be safe to run within their containers. Het belangrijkste punt hier is echter dat ontwikkelaars met de Agent SDK nog steeds zelf hosting en schaling moesten beheren en ervoor moesten zorgen dat de Agent SDK veilig kon draaien in hun containers. That's only then evolved into Claude managed agents, which is the first harness to be able to handle scaling and production ready components for you by Anthropic, providing things like a purpose-built harness, sandboxing, observability, tool runtime, all within a managed infrastructure system. Dat is vervolgens geëvolueerd naar Claude Managed Agents, de eerste harness die schaling en productie-ready componenten voor je beheert vanuit Anthropic, met dingen als een doelbewuste harness, sandboxing, observability, een tool runtime, allemaal binnen een beheerde infrastructuur. This means that developers can focus on task and agent configuration, custom tool logic, the things that actually matter for bringing domain expertise and customizability to your agents, where you're handing off the rest of all the primitives and core compute and primitives of essentially managing the basics of agent running to Anthropic. Dit betekent dat ontwikkelaars zich kunnen richten op taak- en agentconfiguratie, aangepaste toollogica, de dingen die er echt toe doen om domeinexpertise en maatwerk in je agenten te brengen, terwijl je de rest van de primitieven en de kerninfrastructuur aan Anthropic overdraagt. So, that brings me to managed agents as the fastest way to build production ready agents on Claude. Dat brengt me bij Managed Agents als de snelste manier om productie-ready agenten op Claude te bouwen. We've seen people build 10 to 15 times faster to production with Claude managed agents by leveraging our purpose-built harness. We hebben gezien dat mensen met Claude Managed Agents 10 tot 15 keer sneller naar productie gaan door gebruik te maken van onze doelbewuste harness. Part of the reason why we built Claude managed agents is because is because harnesses should evolve alongside your agents. Een deel van de reden waarom we Claude Managed Agents hebben gebouwd, is dat harnesses mee moeten evolueren met je agenten. For example, back when we were building ourselves on top of models like Sonnet 4.5, we noticed that Sonnet 4.5 emitted a particular behavior called context anxiety. Toen we zelf bouwden op modellen zoals Sonnet 4.5, merkten we dat Sonnet 4.5 een bepaald gedrag vertoonde dat we context anxiety noemden. This meant that with Sonnet 4.5, Claude started wrapping up tasks early even when it still had room to spare in its context window. Dit betekende dat Claude met Sonnet 4.5 taken vroegtijdig afsloot, zelfs als er nog ruimte was in het contextvenster. To manage that in our harness, we then added some mitigations to combat against this early stopping behavior. Om dat in onze harness te beheren, voegden we mitigaties toe om dit vroege-stop-gedrag tegen te gaan. But, when Opus 4.5 then came out, we actually saw this behavior go away, making all that work we had done inside of the harness essentially obsolete because Claude had evolved beyond that behavior that we had built into the harness to manage. Maar toen Opus 4.5 uitkwam, zagen we dit gedrag verdwijnen, waardoor al het werk dat we in de harness hadden gedaan grotendeels overbodig werd, omdat Claude voorbij dat gedrag was gegroeid. So, the takeaway there is that it's a lot of work to maintain harnesses and make sure that they actually evolve alongside your agents, which is why with Claude managed agents, we want to make it really easy for Claude and Anthropic to handle all the complexities that come with compaction, caching, things like context anxiety, all these various primitives that come with actually making agent production ready and getting the most out of Claude. De conclusie is dus dat het veel werk kost om harnesses te onderhouden en ervoor te zorgen dat ze mee evolueren met je agenten, daarom willen we het met Claude Managed Agents eenvoudig maken voor Anthropic om alle complexiteiten van compactie, caching, context anxiety en andere primitieven te beheren, zodat Claude productie-ready is. So again, you can focus on the tasks, tools, and things that actually matter for building agents on Claude. Zodat je je kunt richten op de taken, tools en dingen die er echt toe doen bij het bouwen van agenten op Claude. So three primary resources go into building on Claude managed agents. Er zijn drie primaire resources voor het bouwen op Claude Managed Agents. First is the agent's endpoint, which is the persona and capabilities. De eerste is het agent-endpoint, dat de persona en de mogelijkheden definieert. This is the core system prompt that powers your agent. Dit is de kern-systemprompt die je agent aanstuurt. Essentially here, you're defining the model, the MCP servers, the skills, the various components that your agent can actually leverage when it's able to run in that agent loop. Hier definieer je het model, de MCP-servers, de skills en de verschillende componenten die je agent kan gebruiken in de agent-loop. The next is the environments. De tweede is de omgeving. You can think of this as the hands of the agent, where the previous one is the brain of the agent where the agent is thinking through what to execute, and then it's using an environment to actually have a space and a container to actually take action on your behalf. Zie dit als de handen van de agent: de vorige resource is de hersenen, waar de agent nadenkt over wat hij moet uitvoeren, en de omgeving is de ruimte en container waarbinnen hij namens jou actie onderneemt. Sessions are next the way to tie together agents and environments. Sessies zijn de volgende manier om agenten en omgevingen aan elkaar te koppelen. A single session has a spun up on an agent instance within an environment. Een sessie wordt opgestart op een agent-instantie binnen een omgeving. So you can connect the two together and actually stream events back to your user and start to take action on behalf of your humans as part of a Claude powered agent. Zo verbind je de twee en kun je events streamen naar je gebruiker en beginnen met het uitvoeren van acties namens mensen als onderdeel van een Claude-aangedreven agent. A key thing here, as I alluded to briefly before, Claude managed agent has the agent loop run server side. Een belangrijk punt hier, zoals ik eerder kort aanstipte, is dat de agent-loop bij Claude Managed Agents server-side draait. This means that a lot of the complexities that come with managing hosting and scaling are abstracted away. Dit betekent dat veel complexiteiten rondom hosting en schaling worden weggeabstraheerd. And when you close your laptop or you hit hard refresh on your agent that you're building on Claude managed agents, everything is maintained and you don't have to worry about durability, reliability, all these various aspects that usually come to bite you when you're trying to turn your agent from a prototype into production. En als je je laptop sluit of de pagina hard ververst terwijl je bouwt op Claude Managed Agents, blijft alles behouden en hoef je je niet bezig te houden met duurzaamheid, betrouwbaarheid en andere aspecten die je anders achtervolgen als je een agent van prototype naar productie brengt. And lastly here, before we dive into the hands-on portion, is I want to talk you through a key design decision that went into Claude managed agents. Tot slot wil ik, voordat we naar het hands-on gedeelte gaan, een belangrijke ontwerpbeslissing bespreken die in Claude Managed Agents is verwerkt. Previously, with a lot of agent harnesses, we saw the agent loop coupled tightly with tool execution. Bij veel eerdere agent-harnesses zagen we de agent-loop nauw gekoppeld aan de uitvoering van tools. This design pattern made sense and still makes sense for some agents because you want to give the agent powerful abilities to actually take action within the environment. Dit ontwerppatroon had en heeft nog steeds zin voor sommige agenten, omdat je de agent krachtige mogelijkheden wilt geven om daadwerkelijk actie te ondernemen in de omgeving. For instance, with Claude Code, we want the agent to be able to access various files on your computer, take action within a file system, and therefore it makes sense for the agent to have access to all those tools spun up on every container. Met Claude Code willen we bijvoorbeeld dat de agent toegang heeft tot bestanden op je computer en acties kan uitvoeren binnen het bestandssysteem, dus is het logisch dat de agent toegang heeft tot al die tools in elke container. But, we also realized there are some constraints for this, especially with some agents where you essentially want to be able to decouple the hands from the brains of the agents. Maar we realiseerden ons ook dat er beperkingen zijn, vooral bij agenten waarbij je de handen van de hersenen wilt ontkoppelen. For instance, credentials and uh credentials and security became a huge concern. Credentials en beveiliging werden bijvoorbeeld een grote zorg. With the ability to have the agent access your file system, you can actually add very distinct sandboxing by decoupling these two components, where the agent is no longer able to access the actual credentials without encryption by decoupling the hands from the sandbox of the agent. Door de agent toegang te geven tot je bestandssysteem kun je zeer nauwkeurige sandboxing toepassen door deze twee componenten te ontkoppelen, zodat de agent geen toegang meer heeft tot de eigenlijke credentials zonder encryptie doordat de handen van de sandbox zijn gescheiden. The other aspect here is actually you can see huge benefits by doing these decoupling on things like time to first token and latency. Een ander voordeel van deze ontkoppeling zijn de enorme verbeteringen die je kunt zien op het gebied van time to first token en latentie. Previously, with the agent loop into execution in the same box, you had to spin up containers for every single session that you're spinning up in the agent, which contributed to additional latency from a time to first time to first token perspective. Voorheen, met de agent-loop en uitvoering in dezelfde container, moest je voor elke sessie die je opstarte containers inprovisioneren, wat bijdroeg aan extra latentie vanuit het oogpunt van time to first token. But, with this now decoupled, our teams actually saw reductions in time to first token along the lines of over 90% reduction in TTFT for our P95 metrics on latency. Maar nu dit ontkoppeld is, zagen onze teams reducties in time to first token van meer dan 90% in onze P95-latentie-metrics. So, here you can start to see the power of this design decision coming through from the perspective of safety, reliability, latency, and everything else that you care about when it comes to building production-ready agents. Hier begin je de kracht van deze ontwerpbeslissing te zien, vanuit het perspectief van veiligheid, betrouwbaarheid, latentie en alles waar je om geeft bij het bouwen van productie-ready agenten. All right, so now it's time for the exciting part of today's session, which is where I want you all to open up your laptops and go to this URL here to actually clone a repository, and let's start to actually feel the magic of everything that I just talked through. Goed, nu is het tijd voor het spannende deel van de sessie: ik wil dat je je laptop opent en naar deze URL gaat om een repository te clonen, zodat we de magie kunnen ervaren van alles wat ik net heb besproken. So, I'm going to give everyone a second to just go over to that URL there and just spin up the repository that we have ready for you. Ik geef iedereen even de tijd om naar die URL te gaan en de repository op te starten die we voor jullie hebben klaargezet. All right, so here are some additional commands that I want you all to run to make sure this is all set up on your computers. Oké, hier zijn een paar extra commando's die ik wil dat jullie allemaal uitvoeren om alles goed in te stellen op jullie computers. So, the first step many of you might have done already, but just take that repository, hit the URL, get clone it, and then I want you to CD into the specific repository for the session, which is ship your first manage agent. De eerste stap hebben velen van jullie misschien al gedaan: kloon de repository via de URL en ga dan naar de specifieke map voor deze sessie, genaamd ship your first manage agent. And then, if you're on Mac, you'll see those two commands on the side, the Python and the source. Als je op Mac werkt, zie je die twee commando's aan de zijkant, de Python- en de source-opdracht. Um, there's a command there for Windows as well. Er is ook een commando voor Windows. And you'll just do the rest there where you want to install the requirements, copy over the environment key into your .env file. Daarna installeer je de requirements en kopieer je de omgevingssleutel naar je .env-bestand. Um, here you'll put in the Anthropic API key that hopefully all of you also received from the QR code for free credits earlier. Hier voer je de Anthropic API-sleutel in die jullie hopelijk allemaal hebben ontvangen via de QR-code voor gratis credits eerder. And lastly, we'll just run the app. En tot slot voeren we gewoon de app uit. All right, let's go ahead and dive in, but as I mentioned before, let me just show everyone where these instructions are. Goed, laten we erin duiken, maar zoals ik eerder al zei, laat me iedereen even laten zien waar deze instructies staan. If you go into the repository in the link and then go to ship your first manage agents, you scroll down on the read me, you'll see all the setup instructions here. Als je naar de repository in de link gaat en dan naar ship your first manage agents, en je scrolt naar beneden in de readme, zie je hier alle setup-instructies. So, feel free to do this, um, as we go along or even in your own time later today and continue playing around with it, but as I mentioned before, everything will be also shown on the screen to follow along with. Voel je vrij om dit te doen terwijl we doorgaan of later vandaag in je eigen tijd, en blijf er mee spelen, maar zoals ik al zei wordt alles ook op het scherm getoond om mee te volgen. So, do not worry if you did not have time to fully get it set up on your laptop. Maak je geen zorgen als je niet de tijd had om het volledig in te stellen op je laptop. Without further ado, let's go ahead and dive in. Zonder verder uitstel, laten we erin duiken. So, once you run streamlit run app.py, you should be able to see a URL that looks like this and a page that looks like this. Zodra je streamlit run app.py uitvoert, zou je een URL en een pagina zoals deze moeten zien. What we're doing here is we're going to be simulating an agent, um, interaction here where we have an incident that's going to come up. Wat we hier doen is een agent-interactie simuleren waarbij er een incident opduikt. A lot of you who might be software engineers in the room will be intimately familiar with the pain that comes alongside incident response. Velen van jullie die software engineer zijn, zullen maar al te goed bekend zijn met de pijn van incident response. If you are software engineer, you might be woken up at, let's say, 3:00 a.m. in the morning, 2:00 a.m. in the morning when you're out around on on vacation as you're on call, and this is usually a very painful portion of a software engineer's life, uh, because when you're on call, it means that if a server goes down or a service goes down, you have to be immediately the one there to respond and tackle the incident. Als je software engineer bent, word je misschien om 3 uur 's nachts of 2 uur 's nachts gewekt als je on call bent terwijl je op vakantie bent, en dat is meestal een zeer pijnlijk onderdeel van het leven van een software engineer, want als je on call bent en een server of service uitvalt, moet je onmiddellijk reageren en het incident aanpakken. Usually for a human, this means diving into metrics and logs and deployments. Voor een mens betekent dit normaal gesproken het doornemen van metrics, logs en deployments. You can actually investigate what's going on. Je onderzoekt wat er aan de hand is. And so, what we're going to do is we're going to now have an agent run on Claude manage agents to do all this for us. En dus gaan we nu een agent laten draaien op Claude Managed Agents om dit allemaal voor ons te doen. So, that when we get woken up by 3:00 a.m., we can hand it off to an agent, or maybe we don't even get woken up at all if Claude is able to do everything for us. Zodat wanneer we om 3 uur 's nachts worden gewekt, we het kunnen overdragen aan een agent, of misschien worden we helemaal niet meer gewekt als Claude alles voor ons kan doen. Okay. Oké. So, let's now go ahead and dive into the code here. Laten we nu de code induiken. What we're going to open up here is we have the agent.py file on the left and the agent complete on the right. Wat we hier gaan openen is het bestand agent.py aan de linkerkant en de agent complete aan de rechterkant. If you want to challenge yourself, you can of course try to implement everything yourself here or with Claude. Als je jezelf wilt uitdagen, kun je natuurlijk alles zelf implementeren, al dan niet met Claude. Um but, what we're going to do just for simplicity's sake is just copy over various elements from the completed file onto the incomplete file one by one. Maar voor de eenvoud gaan we gewoon stukje voor stukje elementen van het voltooide bestand naar het onvoltooide bestand kopiëren. So, you can see how these primitives compose our agent one piece at a time. Zo zie je hoe deze primitieven onze agent stap voor stap opbouwen. So, let's go ahead and start off with this very first part, which is the agent. Laten we beginnen met het allereerste onderdeel: de agent. We mentioned before that the agent is the one that defines the persona and the capabilities of the agent here. We noemden eerder dat de agent de persona en de mogelijkheden van de agent definieert. So, that's model, the system prompts, and the tools in our case for our agent here. Dat zijn het model, de systeemprompts en de tools voor onze agent hier. So, let me go ahead and copy over what we see there on the screen. Laat me kopiëren wat we op het scherm zien. And you can see here that we're defining the SRE agent. Je kunt hier zien dat we de SRE-agent definiëren. We're going to use Claude Opus 4.7 here. We gaan hier Claude Opus 4.7 gebruiken. And I've preconfigured a system prompt and tools for the agent. En ik heb een systeemprompt en tools voor de agent geconfigureerd.