No Priors: AI, Machine Learning, Tech, & Startups

Building an AI guardian for the enterprise with Onyx Security CEO Maxim Bar Kogan

As you're exponentially doing more things with the AIs, you're going to start having really bad actions happen. And we've seen some of that happen lately, with agents accidentally publishing code and tokens that they weren't supposed to. Enterprises are definitely starting to realize that that risk has grown exponentially and that they don't have any way to stop the adoption. They just now have to do something to reduce the chance of these agent actions being illegitimate or incorrect. But we're allowed to look at a lot of historical data on how these agents have behaved. But enterprises today are not willing to have Anthropic or OpenAI keep that historical data, because they know these are very data-focused companies that will want to train on that data.
Hi listeners, welcome back to No Priors. Today I'm here with Maxim Bar Kogan, the co-founder and CEO of Onyx Security, an Israel-based startup of researchers, mathematicians, and engineers building agents to watch the AI agents.
We talk about specialized model training, Mythos, alignment research, and the Israeli ecosystem in security and now AI. Welcome, Maxim. Thanks so much for doing this.
Thank you. Pleasure to be here.
Everyone is much more concerned about security and the impact of AI on security than they were, certainly, a few months ago. The consensus risk story two years ago, when you started the company, was basically DLP for chatbots: what are employees putting into ChatGPT? Now we clearly have something that is not quite panic, but close to market-wide panic. How did you decide to bet on agent actions when you started?
Look, I think for us the pivotal point was AutoGPT. I think AutoGPT kind of let everyone's imagination, including ours, run wild, because it was a...
Can you remind listeners what that was?
Sure.
So, AutoGPT, and I'm sorry I don't know the guy behind it, but I'm a huge, huge fan. They created the first, as far as I know, really autonomous agent running on LLMs. So an agent that would let the LLM not generate text but decide what to do, then give that agent API access, a tool, to do that thing, and then do that in a loop. So in theory it could let agents do very complicated things, anything a person could do on a computer. Now, granted, it didn't work that well. It was too early. The models were not good enough. GPT-4 was not good enough. But I think it did give everyone a glimpse into the future: what if the models were good enough, and then, using basically that same structure, we could have very capable agents doing stuff for us? In many ways, Claude Code today is not dissimilar to AutoGPT back then. I think they were a bit early, again, before the models were ready, but the concept was right. And the thought stuck with me. I was very AGI-pilled even back then.
So I was thinking, oh my god, models are going to be way smarter than us when that happens. How do we oversee these very smart agents that are smarter than us? They're very capable. How are we going to feel at ease about them doing stuff for us, especially when they start managing really important stuff? One day they're managing your water supply and your electricity, your power grid, right? How do you control them? That was the thought I was kind of obsessed with. I was also too early. At the time, enterprises were not using any agents. There were hardly any agents out there, and when I talked with a lot of security buddies at the time, they were like, oh dude, you're way too early, this is not something that's going to happen.
That was my question. I said, is anyone going to do this before you run out of money?
And I think there was a good chance that I would have run out of money before, because I think you were right. There was an element of chance here, but then the market did happen. We suddenly had reasoning models that could do long-horizon tasks. We had Claude Code, which became really the first widely used autonomous agent, and then we had Cowork and Claude Code, and I think we're starting to see now these types of agents that are very autonomous, even though everyone was afraid to build them. So everyone started building these low-code platforms that were much more limited, much more based on connectors. Those platforms ended up being quite limited, so we didn't get the productivity gains from them. But then we started getting the crazy benefits from these very unleashed agents that could do everything and had far fewer controls baked into them, and even very large enterprises decided they were going to adopt them.
You know, Anthropic's revenue is coming from enterprises that are paying for Claude Code to do a lot of the work that developers used to do. So that was a bit about how we started, and we were definitely in luck that very autonomous agents appeared before it was too late.
So can you describe a little bit, because I think it's both close to impossible and very useful in this period of AI to think about what deployment is right now and what's changing about capability: what's the one-liner on what the Onyx product does today, and how do you think about the long-term vision?
Onyx really does two things. Number one is we train models and build agents that can oversee other agents. And the goal of that is to say, okay, we need someone to be able to tell that all of these actions that are now happening by these AIs that we're adopting are legitimate, because the number of these actions is growing exponentially.
And so things that we thought might be useful in the past, like a human in the loop, now that you're going to have 100x, 1,000x, a million x of these actions, that's not going to work. And then we take that capability and productize it in a product that we call the control plane, or the secure control plane, where we come to the enterprise and say, hey, let's find all of your AIs and autonomous agents and hook them up to Onyx, to this system where we can oversee what your AIs are doing, so that you don't run into the risk that, as you're exponentially doing more things with the AIs, you're going to start having really bad actions happen. And we've seen some of that happen lately, with downtime that was caused by an AI just doing the wrong thing, agents accidentally publishing code and tokens that they weren't supposed to, and so on.
So enterprises are definitely starting to realize that that risk is growing exponentially and that they don't have any way to stop the adoption. So they just now have to do something to reduce the chance of these agent actions being illegitimate or incorrect.
Yeah, I think one of the core reasons the foundation model labs are going after code is, obviously, that it is very powerful in general and can do, in theory, all the things software can over time. The flip side of that is that it can do all the things software can, right? And so I am, joyously, already in the camp of having been over-permissive with my agents, such that one deleted data permanently and caused rework. So I'm like, oh, okay, I think I see, I need some guardian spirits around it. Given your deployments today and talking to large enterprises, what is the state of deployment right now?
How much do you see that's within these more scoped, studio-like platforms versus free-roaming coding agents? How much are you actually seeing in large enterprises in different sectors?
Yeah, so I think right now, in our typical enterprise, we break it down into three categories. There are various SaaS platforms that are typically more low-code, where people build agents in a drag-and-drop way, and they're not really autonomous agents, right? I would think of them more as automations. Then there are first-party agents that people are building in their cloud, potentially because it's an application they want inside the company, or even an agentic product they're planning to release to customers. And then the third category is very autonomous coding agents and assistants. Of these categories, I would say roughly, at this point, over 50% is the autonomous coding agents and assistants in the average enterprise.
Then probably 45% is those low-code automations. And the last 2% are really the first-party ones that they're building themselves, because obviously it's much harder to build effective agents, and it's much easier to adopt agents off the shelf or build them with low-code. That's what we're seeing, and we're noticing that the autonomous ones are also the fastest-growing category. It used to be only developers, and we would see Claude Code growing like wildfire in our customer base, and now we're seeing Claude Cowork growing even faster. We're starting to see, to our own surprise actually, people adopting OpenClaw as a legitimate, sanctioned tool in the company because the CEO is very driven to adopt AI. So I think that today autonomous agents are by far the fastest-growing category, and today they typically come without any controls.
So enterprises already buy, let's say, a hundred billion dollars of security today. They have lots of different protections in the endpoint, network, cloud, and identity domains. What's relevant here for securing agents, or is none of it? How do you think about the existing protection set?
Security is always a space where you have some overlap between different tooling, and you have the concept of defense in depth as well. So you want to have defenses at different levels of your technology stack to solve the problem. That said, I think in this space a lot of enterprises are kind of helpless. I'll take the identity approach as an example.
Traditionally, if we have a software system that's running in our company, our first and most important control will be to limit what permissions it has, right? Because then, no matter what, even if it goes wrong, even if it's compromised, it typically can't do stuff that it wasn't originally allowed to do. But with these autonomous AIs, with these assistants, with these coding agents, we kind of want them to have our permissions, because we want to tell Claude Code to do something, or Claude Cowork to do something, and then go have lunch, and come back and see that it's done. And we want to give it so many diverse tasks as well that we kind of can't find the right set of permissions. So suddenly our identity security software is not very useful. Then think about endpoint security, or API security. If we tell our Claude Code that we want to recreate a database, and it should delete it and recreate it, that's great. That's going to save our DevOps team and our platform teams a lot of time.
It's a great benefit of Claude Code. But if Claude Code is working on an unrelated task and suddenly thinks that maybe the right thing to do is to delete our database and recreate it, maybe we don't want that to happen. And unfortunately our endpoint providers or API security tools don't know what Claude was thinking. Why is it doing what it's doing? Right? So a lot of these existing tools don't have the context to understand what these very flexible, unpredictable systems are doing. If you're not building some kind of controls that are built for these systems, then you're either going to end up limiting them a lot, making them much less useful to the enterprise, or you're going to miss a lot of pretty dangerous things that they might be doing.
As somebody who has worked in security for a long time, my first, very traditional instinct on a problem like this is that it sounds like a problem for a proxy with a policy engine.
We make some rules, we make the rules smarter. Why doesn't that work, or did you try it?
There are a few things. I mean, a proxy is an integration method, I would say. So there are some AI systems where you would want to integrate with a proxy if that's the easiest way to do it. But number one, there are a lot of systems where that's just not viable technically, because AI today runs in the cloud, on someone else's infrastructure, on your endpoint, and a proxy is just not always an option.