Dwarkesh Patel
Terence Tao: How the World's Top Mathematician Uses AI
Today, I'm chatting with Terence Tao, who needs no introduction.
Terence, I want to begin by having you retell the story of how Kepler discovered the laws of planetary motion, because I think this will be a great jumping-off point to talk about AI for math.
I've always had an amateur interest in astronomy.
I've loved stories of how the early astronomers worked out the nature of the universe.
Kepler was building on the work of Copernicus, who was himself building on the work of Aristarchus.
Copernicus very famously proposed the heliocentric model, that instead of the planets and the Sun going around the Earth, the Sun was at the center of the solar system and the other planets were going around the Sun.
Copernicus proposed that the orbits of the planets were perfect circles.
His theory fit the observations that the Greeks, the Arabs, and the Indians had worked out over centuries.
Kepler learned about these theories in his studies, and he observed that the ratios of the sizes of the orbits that Copernicus predicted seemed to have some geometric meaning.
He started proposing that if you take the orbit of the Earth and enclose it in a cube, the outer sphere that encloses the cube almost perfectly matches the orbit of Mars, and so forth.
There were six planets known at the time and five gaps between them, and there were five perfect Platonic solids: the cube, the tetrahedron, the icosahedron, the octahedron, and the dodecahedron.
So he had this theory, which he thought was absolutely beautiful, that you could inscribe these Platonic solids between the spheres of the planets.
It seemed to fit, and it seemed to him that God's design of the planets was matching this mathematical perfection of the Platonic solids.
He needed data to confirm this theory.
At the time, there was only one really high-quality dataset in existence.
Tycho Brahe, this very wealthy, eccentric Danish astronomer, had managed to convince the Danish government to fund this extremely expensive observatory.
In fact, it was an entire island, where he spent decades taking naked-eye observations of all the planets, like Mars and Jupiter, on every night when the weather was clear.
He was the last of the naked-eye astronomers.
He had all this data which Kepler could use to confirm his theory.
Kepler started working with Tycho, but Tycho guarded the data jealously.
He only gave him little bits of it at a time.
Kepler eventually just stole the data.
He copied it and then had to fight with Brahe's descendants over it.
He did get the data, and then he worked out, to his disappointment, that his beautiful theory didn't quite work.
The data was off from his Platonic solid theory by 10% or something.
He tried all kinds of fudges, moving the circles around, and it didn't quite work.
But he worked on this problem for years and years, and eventually he figured out how to use the data to work out the actual orbits of the planets.
That was an incredibly clever piece of data analysis, a work of genius.
And then he worked out that the orbits were actually ellipses, not circles, which was shocking for him.
So he worked out the first two laws of planetary motion: the orbits are ellipses, and equal areas are swept out in equal times.
Then ten years later, after collecting a lot of data (the furthest planets, like Saturn and Jupiter, were the hardest for him to work out), he finally worked out his third law: the time it takes a planet to complete its orbit is proportional to some power of its distance from the Sun.
These are Kepler's three famous laws of planetary motion.
He had no explanation for them.
It was all driven by experiment, and it took Newton, most of a century later, to give a theory that explained all three laws at once.
The take I want to try on you is that Kepler was a high-temperature LLM.
Newton comes up with this explanation of why the three laws of planetary motion must be true.
Of course, the way that Kepler discovers the laws of planetary motion, or figures out the relative orbits of the different planets, is, as you say, a work of genius.
But through his career, he's just trying random relationships.
In fact, the third law of planetary motion appears as an aside in The Harmony of the World (Harmonices Mundi), which is a book about how all these different planets have different harmonies.
And the reason there's so much famine and misery on Earth, he says, is that the Earth is mi-fa-mi; that's the note of Earth.
It's all this random astrology, but in there is the square-cube law, which tells you how a planet's period relates to its distance from the Sun.
As you were detailing, if you add that to Newton's F=ma and the equation for centripetal acceleration, you get the inverse-square law.
And so Newton works that out.
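The step from the third law to the inverse-square law can be sketched in a few lines. This is a simplified version that assumes a circular orbit of radius r and period T (real orbits are ellipses), and the constant k in the third law is a symbol introduced here for illustration:

```latex
\begin{align*}
a &= \frac{v^2}{r}, \qquad v = \frac{2\pi r}{T}
  && \text{(centripetal acceleration, orbital speed)} \\
a &= \frac{4\pi^2 r}{T^2} \\
T^2 &= k\,r^3 && \text{(Kepler's third law)} \\
a &= \frac{4\pi^2}{k\,r^2} \\
F &= ma = \frac{4\pi^2 m}{k\,r^2} \;\propto\; \frac{1}{r^2}
\end{align*}
```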
But the reason I think this is an interesting story is that I feel LLMs can do this kind of thing: trying random relationships for twenty years, some of which make no sense, as long as there's a verifiable data bank like Brahe's dataset.
"OK, I'm going to try out random things about musical notes, Platonic solids, or different geometries. I have this bias that there's some important thing about the geometry of these orbits."
Then one thing works.
As long as you can verify it, these empirical regularities can then drive actual deep scientific progress.
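The generate-and-verify loop described here can be illustrated with a toy search. This is only a sketch under stated assumptions: the planetary values are modern approximations rather than Brahe's measurements, and the function names, the range of powers tried, and the 1% tolerance are all choices made for this example. It guesses every relationship of the form T^a proportional to r^b for small integers and keeps only the guesses the data does not reject.

```python
from itertools import product

# Modern approximate values, assumed for illustration (not Brahe's figures):
# name -> (mean distance from the Sun in AU, orbital period in years)
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def spread(a, b):
    """Relative spread of T**a / r**b across the planets.

    A value near 0 means the ratio is (almost) the same for every planet,
    i.e. the guess "T**a is proportional to r**b" survives the data.
    """
    ratios = [period**a / dist**b for dist, period in PLANETS.values()]
    return (max(ratios) - min(ratios)) / min(ratios)


def search(max_power=4, tolerance=0.01):
    """Try every small-integer guess (a, b) and keep the ones that fit."""
    survivors = []
    for a, b in product(range(1, max_power + 1), repeat=2):
        if spread(a, b) < tolerance:
            survivors.append((a, b))
    return survivors


print(search())  # [(2, 3)]: only T**2 proportional to r**3 survives
```

Fifteen of the sixteen guesses are rejected. The one survivor is the third law, and it is the check against the data, not the guessing, that singles it out.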
Traditionally, when we talk about the history of science, idea generation has always been the prestige part of science.
A scientific problem comes with many steps.
You have to identify a problem, and then you have to identify a good, fruitful problem to work on.
Then you need to collect data, figure out a strategy to analyze the data, and make a hypothesis.
At this point, you need to propose a good hypothesis, and then you need to validate it.
Then you need to write things up and explain.
There are a dozen different components.
The ones we celebrate are these eureka genius moments of idea generation.
Kepler certainly had to cycle through many ideas, several of which didn't work.
I bet there were many that he didn't even publish at all because they just didn't fit.
That's an important part of the process, trying all kinds of random things and seeing if they worked.
But as you say, it has to be matched by an equal amount of verification, otherwise it's slop.
We celebrate Kepler, but we should also celebrate Brahe for his assiduous data collection, which was ten times more precise than any previous observation.
That extra decimal place of accuracy was essential for Kepler to get his results.
He was using Euclidean geometry and the most advanced mathematics he could use at the time to match his models with the data.
All aspects had to be in play: the data, the theory, and the hypothesis generation.
I'm not sure nowadays that hypothesis generation is the bottleneck anymore.
Science has changed in the century since.
Classically, the two big paradigms for science were theory and experiment.
Then in the 20th century, numerical simulation came along, so you can do computer simulations to test theories.
Finally, in the late 20th century, we had big data.
We had the era of data analysis.
A lot of new progress is actually driven now by analyzing massive datasets first.
You collect large datasets and then draw out patterns from them to form hypotheses.
This is a little bit different from how science used to work, where you make a few observations or have one out-of-the-blue idea, and then collect data to test your idea.
That's the classic scientific method.
Now it's almost reversed.
You collect big data first, and then you try to get hypotheses from it.
Kepler was maybe one of the first data scientists, but even he didn't start with Tycho's dataset and then analyze it.
He had some preconceived theories first.
It seems like starting from a preconceived theory is less and less the way we make progress, just because the data is so much more massive and useful.
Oh, interesting.
I feel like the 20th-century science that you're describing actually very well describes what happened with Kepler.
He did have these ideas (1595 and '96 is when he comes up with the polygons and then the Platonic solids theory), but they were wrong.
Then a few years later, he gets Brahe's data, and it's only after twenty years of trying random things that he gets this empirical regularity.
It actually feels closer to Brahe's data being analogous to some massive data bank of simulations: now that you've got the data, you can keep trying random things.
If it wasn't for that, Kepler would be out there just writing books about harmonics and Platonic solids, and there would be nothing to actually verify against.
The data was extremely important.
The distinction I was trying to make was that traditionally, you make a hypothesis and then you test it against data.
But now with machine learning, data analysis, and statistics, you can start with data and through statistics work out laws that were not apparent before.
Kepler's third law is a little bit like this, except that instead of having the thousand data points that Brahe had, Kepler had six data points.
For every planet, he knew the orbital period and the distance to the Sun.
There were five or six data points, and he did what we would now call regression.
He fit a curve to these six data points and got a square-cube law, which was amazing.
But he was quite lucky that these six data points gave him the right conclusion.
That's not enough data to be really reliable.
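In modern terms, the regression amounts to fitting a straight line through six points on a log-log plot. The sketch below uses modern approximate distances and periods, which is an assumption since Kepler's own figures differed slightly, and a hand-rolled least-squares fit; the names `DATA` and `fit_power_law` are made up for this example.

```python
import math

# Modern approximate values, assumed for illustration (not Kepler's figures):
# (mean distance from the Sun in AU, orbital period in years)
DATA = [
    (0.387, 0.241),    # Mercury
    (0.723, 0.615),    # Venus
    (1.000, 1.000),    # Earth
    (1.524, 1.881),    # Mars
    (5.203, 11.862),   # Jupiter
    (9.537, 29.457),   # Saturn
]


def fit_power_law(points):
    """Fit T = c * r**p by ordinary least squares on (log r, log T).

    Returns (p, c): the exponent and the prefactor.
    """
    xs = [math.log(r) for r, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


exponent, prefactor = fit_power_law(DATA)
print(f"T = {prefactor:.3f} * r^{exponent:.3f}")
```

The fitted exponent comes out very close to 3/2, which is the square-cube law (T squared proportional to r cubed). With only six points, though, a clean fit like this is weak evidence on its own, which is the point made above.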
There was a later astronomer, Johann Bode, who took the same data, the distances from the planets to the Sun, and, inspired by Kepler, predicted that these distances formed a shifted geometric progression.
He also fit a curve, except there was one point missing.
There was a big gap between Mars and Jupiter.
His law predicted that there was a missing planet.
It was kind of a crank theory, except that when Uranus was discovered by Herschel, the distance to Uranus fit exactly this pattern.
Then Ceres was discovered in the asteroid belt, and it also fit the pattern.
People got really excited that Bode had discovered this amazing new law of nature.
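To see what a "shifted geometric progression" looks like in numbers, here is a small sketch. It assumes the form in which the rule is usually stated today, a = 0.4 + 0.3 * 2^k astronomical units with Mercury treated as the special case a = 0.4; the transcript does not give the formula, so this form, the index assignments, and the modern observed distances are assumptions of the example.

```python
# Titius-Bode rule in its usual modern statement (an assumption here):
#   a = 0.4 + 0.3 * 2**k  astronomical units,
# with Mercury as the special case a = 0.4 (marked by k = None).
# Observed values are modern approximate mean distances in AU.
BODIES = [
    ("Mercury", None, 0.387),
    ("Venus", 0, 0.723),
    ("Earth", 1, 1.000),
    ("Mars", 2, 1.524),
    ("Ceres", 3, 2.767),    # the "missing planet" slot between Mars and Jupiter
    ("Jupiter", 4, 5.203),
    ("Saturn", 5, 9.537),
    ("Uranus", 6, 19.19),
]


def bode_distance(k):
    """Predicted distance in AU for index k (None means the Mercury case)."""
    return 0.4 if k is None else 0.4 + 0.3 * 2**k


def compare():
    """Return (name, predicted, observed, relative error) for each body."""
    rows = []
    for name, k, observed in BODIES:
        predicted = bode_distance(k)
        rel_error = abs(predicted - observed) / observed
        rows.append((name, predicted, observed, rel_error))
    return rows


for name, predicted, observed, rel_error in compare():
    print(f"{name:8s} predicted {predicted:5.2f} AU  "
          f"observed {observed:5.2f} AU  error {rel_error:5.1%}")
```

Under these assumptions, every body in the list lands within about 5% of the predicted distance, including the k = 3 slot filled by Ceres and the k = 6 slot filled by Uranus, which is why the pattern looked so convincing at the time.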