Dwarkesh Patel

Terence Tao – How the world's best mathematician uses AI

Today, I'm chatting with Terence Tao, who needs no introduction. Terence, I want to begin by having you retell the story of how Kepler discovered the laws of planetary motion, because I think this will be a great jumping-off point to talk about AI for math.
I've always had an amateur interest in astronomy. I've loved stories of how the early astronomers worked out the nature of the universe. Kepler was building on the work of Copernicus, who was himself building on the work of Aristarchus. Copernicus very famously proposed the heliocentric model: instead of the planets and the Sun going around the Earth, the Sun was at the center of the solar system and the other planets were going around the Sun. Copernicus proposed that the orbits of the planets were perfect circles. His theory fit the observations that the Greeks, the Arabs, and the Indians had worked out over centuries. Kepler learned about these theories in his studies, and he made the observation that the ratios of the sizes of the orbits that Copernicus predicted seemed to have some geometric meaning.
He started proposing that if you take the orbit of the Earth and you enclose it in a cube, the outer sphere that encloses the cube almost perfectly matched the orbit of Mars, and so forth. There were six planets known at the time and five gaps between them, and there were five perfect Platonic solids: the cube, the tetrahedron, icosahedron, octahedron, and dodecahedron. So he had this theory, which he thought was absolutely beautiful, that you could inscribe these Platonic solids between the spheres of the planets. It seemed to fit, and it seemed to him that God's design of the planets was matching this mathematical perfection of the Platonic solids. He needed data to confirm this theory. At the time, there was only one really high-quality dataset in existence. Tycho Brahe, this very wealthy, eccentric Danish astronomer, had managed to convince the Danish government to fund this extremely expensive observatory.
In fact, it was an entire island where he had taken decades of observations of all the planets, like Mars and Jupiter, at least every night for which the weather was clear, with the naked eye. He was the last of the naked-eye astronomers. He had all this data which Kepler could use to confirm his theory. Kepler started working with Tycho, but Tycho was very jealous of the data. He only gave him little bits of it at a time. Kepler eventually just stole the data. He copied it and had to have a fight with Brahe's descendants. He did get the data, and then he worked out, to his disappointment, that his beautiful theory didn't quite work. The data was off from his Platonic solid theory by 10% or something. He tried all kinds of fudges, moving the circles around, and it didn't quite work. But he worked on this problem for years and years, and eventually, he figured out how to use the data to work out the actual orbits of the planets.
That was an incredibly clever, genius amount of data analysis. And then he worked out that the orbits were actually ellipses, not circles, which was shocking for him. So he worked out the two laws of planetary motion: the ellipses, and also that equal areas are swept out in equal times. Then ten years later, after collecting a lot of data (the furthest planets like Saturn and Jupiter were the hardest for him to work out), he finally worked out this third law, that the time it takes for a planet to complete its orbit was proportional to some power of the distance to the Sun. These are Kepler's three famous laws of planetary motion. He had no explanation for them. It was all driven by experiment, and it took Newton a century later to give a theory that explained all three laws at once.
The take I want to try on you is that Kepler was a high-temperature LLM.
Newton comes up with this explanation of why the three laws of planetary motion must be true. Of course, the way that Kepler discovers the laws of planetary motion, or figures out the relative orbits of the different planets, is as you say a work of genius. But through his career, he's just trying random relationships. In fact, the book in which he writes down the third law of planetary motion is The Harmonics of the World, and the law is an aside in it. It's just a book about how all these different planets have these different harmonies. And the reason there's so much famine and misery on Earth is because the Earth is mi-fa-mi, that's the note of Earth. It's all this random astrology, but in there is the cube-square law, which tells you what relationship the period has to a planet's distance from the Sun. As you were detailing, if you add that to Newton's F=ma and the equation for centripetal acceleration, you get the inverse-square law. And so Newton works that out.
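As a worked sketch of the step just mentioned, assume a circular orbit of radius r and period T, a planet of mass m, and a constant k in Kepler's third law. These symbols and the circular-orbit simplification are assumptions introduced here for illustration; they are not from the conversation.

```latex
\begin{align*}
T^2 &= k\,r^3 && \text{(Kepler's third law)}\\
a &= \frac{v^2}{r} = \frac{(2\pi r/T)^2}{r} = \frac{4\pi^2 r}{T^2} && \text{(centripetal acceleration)}\\
F &= m a = \frac{4\pi^2 m\,r}{k\,r^3} = \frac{4\pi^2 m}{k}\cdot\frac{1}{r^2} && \text{(Newton's second law)}
\end{align*}
```

Under these assumptions the force falls off as 1/r^2. Real orbits are ellipses, so this is only the simplest case of the argument.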
But the reason I think this is an interesting story is that I feel LLMs can do the kind of thing of trying random relationships for twenty years, some of which make no sense, as long as there's a verifiable data bank like Brahe's dataset. "Ok, I'm going to try out random things about musical notes, Platonic objects, or different geometries, I have this bias that there's some important thing about the geometry of these orbits." Then one thing works. As long as you can verify it, these empirical regularities can then drive actual deep scientific progress.
Traditionally, when we talk about the history of science, idea generation has always been the prestige part of science. A scientific problem comes with many steps. You have to identify a problem, and then you have to identify a good, fruitful problem to work on. Then you need to collect data, figure out a strategy to analyze the data, and make a hypothesis.
At this point, you need to propose a good hypothesis, and then you need to validate. Then you need to write things up and explain. There are a dozen different components. The ones we celebrate are these eureka genius moments of idea generation. Kepler certainly had to cycle through many ideas, several of which didn't work. I bet there were many that he didn't even publish at all because they just didn't fit. That's an important part of the process, trying all kinds of random things and seeing if they worked. But as you say, it has to be matched by an equal amount of verification, otherwise it's slop. We celebrate Kepler, but we should also celebrate Brahe for his assiduous data collection, which was ten times more precise than any previous observation. That extra decimal point of accuracy was essential for Kepler to get his results. He was using Euclidean geometry and the most advanced mathematics he could use at the time to match his models with the data.
All aspects had to be in play: the data, the theory, and the hypothesis generation. I'm not sure nowadays that hypothesis generation is the bottleneck anymore. Science has changed in the century since. Classically, the two big paradigms for science were theory and experiment. Then in the 20th century, numerical simulation came along, so you can do computer simulations to test theories. Finally, in the late 20th century, we had big data. We had the era of data analysis. A lot of new progress is actually driven now by analyzing massive datasets first. You collect large datasets and then draw patterns from them to deduce thoughts. This is a little bit different from how science used to work, where you make a few observations or have one out-of-the-blue idea, and then collect data to test your idea. That's the classic scientific method. Now it's almost reversed. You collect big data first, and then you try to get hypotheses from it.
Kepler was maybe one of the first early data scientists, but even he didn't start with Tycho's dataset and then analyze it. He had some preconceived theories first. It seems like this is less and less the way we make progress, just because the data is so much more massive and useful.
Oh, interesting. I feel like the 20th-century science that you're describing actually very well describes what happened with Kepler. He did have these ideas (1595 and '96 is where he comes up with the polygons and then the Platonic objects theory), but they were wrong. Then a few years later, he gets Brahe's data, and it's only after twenty years of trying random things that he gets this empirical regularity. It actually feels a bit closer to Brahe's data being analogous to some massive data bank of simulations, and now that you've got the data, you can keep trying random things. If it wasn't for that, Kepler would be out there just writing books about harmonics and Platonic objects, and there would be nothing to actually verify against.
The data was extremely important. The distinction I was trying to make was that traditionally, you make a hypothesis and then you test it against data. But now with machine learning, data analysis, and statistics, you can start with data and through statistics work out laws that were not present before. Kepler's third law is a little bit like this, except that instead of having the thousand data points that Brahe had, Kepler had six data points. For every planet, he knew the length of the orbit and the distance to the Sun. There were five or six data points, and he did what we would now call regression. He fit a curve to these six data points and got a square-cube law, which was amazing. But he was quite lucky that these six data points gave him the right conclusion. That's not enough data to be really reliable.
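To make the six-point regression concrete, here is a minimal sketch of a log-log least-squares fit. The planetary values are approximate modern figures, not Kepler's or Brahe's numbers, and the name `fit_power_law` is hypothetical, chosen only for this illustration.

```python
import math

# Approximate MODERN values (an assumption for illustration, not Kepler's data):
# planet -> (mean distance from the Sun in AU, orbital period in years)
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(points):
    """Least-squares fit of log(T) = slope * log(a) + intercept.

    A power law T = c * a**p becomes a straight line in log-log space,
    so the fitted slope estimates the exponent p.
    """
    xs = [math.log(a) for a, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_power_law(list(PLANETS.values()))
print(f"fitted exponent: {slope:.3f}")
```

With these inputs the fitted exponent comes out close to 3/2, which is the square-cube relation: the period squared scales as the distance cubed. With only six points, the fit says little about how reliable the pattern is, which is the caveat raised above.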
There was a later astronomer, Johann Bode, who took the same data (the distances to the planets) and, inspired by Kepler, predicted that the distances to the planets formed a shifted geometric progression. He also fit a curve, except there was one point missing. There was a big gap between Mars and Jupiter. His law predicted that there was a missing planet. It was kind of a crank theory, except when Uranus was discovered by Herschel, the distance to Uranus fit exactly this pattern. Then Ceres was discovered in the asteroid belt, and it also fit the pattern. People got really excited that Bode had discovered this amazing new law of nature.
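The "shifted geometric progression" can be sketched numerically. The rule is usually written as a = 0.4 + 0.3 * 2^k astronomical units, with Mercury assigned the bare 0.4 term. The function name `bode_distances` is hypothetical, and the observed distances are rounded modern values added here for comparison; neither comes from the conversation.

```python
# Rounded modern mean distances from the Sun in AU (assumed for comparison only).
BODIES = [
    ("Mercury", 0.39),
    ("Venus", 0.72),
    ("Earth", 1.00),
    ("Mars", 1.52),
    ("Ceres", 2.77),
    ("Jupiter", 5.20),
    ("Saturn", 9.54),
    ("Uranus", 19.19),
]


def bode_distances(count):
    """Return the first `count` terms of the shifted geometric progression.

    The first term is the shift alone (0.4 AU); each later term adds
    0.3 * 2**k for k = 0, 1, 2, ...
    """
    return [0.4] + [0.4 + 0.3 * 2**k for k in range(count - 1)]


predicted = bode_distances(len(BODIES))
for (name, observed), guess in zip(BODIES, predicted):
    error_pct = 100 * (guess - observed) / observed
    print(f"{name:8s} predicted {guess:5.1f} AU, observed {observed:5.2f} AU ({error_pct:+.1f}%)")
```

For these eight bodies the predictions land within several percent of the observed values, including the slot between Mars and Jupiter where Ceres sits.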