Back in 2001 I was the math consultant for "A Beautiful Mind". One spends a lot of time waiting on a film set. Eventually one wonders why.
The majority of wait time was the cinematographer lighting each scene. I imagined a workflow where secondary digital cameras captured 3D information, and all lighting took place in post production. Film productions hemorrhage money by the second; this would be a massive cost saving.
I described this idea to a venture capitalist friend, who concluded one already needed to be a player to pull this off. I mentioned this to an acquaintance at Pixar (a logical player) and they went silent.
Still, we don't shoot movies this way. Not there yet...
Yes. I've been using it today with Zed (a mind-blowing editor, by the way).
One must use an API key to work through Zed, but my Max subscription can be used with Claude Code as an external agent via Zed ACP. And there's some integration; it's a better experience than Claude Code in a terminal next to file viewing in an editor.
One of my side projects has been to recover a K&R C computer algebra system from the 1980s and port it to modern 64-bit C. I'd have eight tabs open at a time, each assigned files from a task server, to make passes at 60 or so files. This nearly worked; I'm paused till I can have an agent with a context window that can look at all the code at once. Or I'll attempt a fresh translation based on what I learned.
With a $200 monthly Max subscription, I would regularly stall after completing significant work, but this workflow was feasible. I tried my API key for an hour once; it taught me to laugh at the $200 as quite a deal.
I agree that Opus 4.5 is the only reasonable use of my time. We wouldn't hire some guy off the fryer line to be our CTO; coding needs best effort.
Nevertheless, I thought my setup was involved, but if Boris considers his to be vanilla ice cream then I'm drinking skim milk.
Mathematicians get enamored with particular ways of looking at things, and fall into believing this is gospel. I should know: I am one, and I fight this tendency at every turn.
On one hand, "rational" and "algebraic" are far more pervasive concepts than mathematicians are ever taught to believe. The key here is formal power series in non-commuting variables, as pioneered by Marcel-Paul Schützenberger. "Rational" corresponds to finite state machines, and "Algebraic" corresponds to pushdown automata, the context-free grammars that describe most programming languages.
On the other hand, "Concrete Mathematics" by Donald Knuth, Oren Patashnik, and Ronald Graham (I never met Oren) popularizes another way to organize numbers: The "endpoints" of the positive reals are 0/1 and 1/0. Subdivide this interval (any such interval) by taking the "center" of a/b and c/d to be the mediant (a+c)/(b+d). Here, the first center is 1/1 = 1. Iterate. Given any number, its coordinates in this system are the sequence of L, R symbols that locate it in successive subdivisions.
Any computer scientist should be chomping at the bit here: What is the complexity of the L, R sequence that locates a given number?
From this perspective, the number "e" (the base of the natural logarithm) is one of the simpler numbers known, not lost in the unwashed multitude of "transcendental" numbers.
Most mathematicians don't know this. The idea generalizes to barycentric subdivision in any dimension, but the real line is already interesting.
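To make the L, R coordinates concrete, here is a minimal Python sketch of the subdivision described above. The function names `stern_brocot_path` and `e_approx` are my own, and "e" is stood in for by a rational partial sum of its series, which is only an approximation (accurate far beyond the first twenty symbols, but still an assumption of this sketch).

```python
from fractions import Fraction
from math import factorial


def stern_brocot_path(x, max_steps=64):
    """Return the L/R string locating x > 0 between 0/1 and 1/0.

    x should be an exact Fraction (or int). The walk stops early if x
    is hit exactly, otherwise after max_steps subdivisions.
    """
    a, b = 0, 1  # left endpoint a/b
    c, d = 1, 0  # right endpoint c/d, standing in for infinity
    path = []
    for _ in range(max_steps):
        m_num, m_den = a + c, b + d  # the "center" (a+c)/(b+d)
        if x * m_den == m_num:       # x equals the center: located exactly
            break
        if x * m_den < m_num:        # x lies left of the center
            path.append("L")
            c, d = m_num, m_den
        else:                        # x lies right of the center
            path.append("R")
            a, b = m_num, m_den
    return "".join(path)


def e_approx(terms=30):
    """Rational approximation of e: the sum of 1/k! for k < terms."""
    return sum(Fraction(1, factorial(k)) for k in range(terms))


if __name__ == "__main__":
    print(stern_brocot_path(Fraction(2, 5)))  # a rational stops after finitely many steps
    print(stern_brocot_path(e_approx(), 40))  # run lengths 2,1,2,1,1,4,1,1,6,1,1,8,...
```

The run lengths in the second line are what make "e" look simple from this point of view: they follow a pattern one could generate with a very small program.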
Nice! There is a Japanese feel to the lead graphic, their prevalence of cartoon imagery, that one might not recognize without having traveled in Japan.
Is the design debate public? I'd imagine it would make great reading.
In grad school around 1980 I took a cab home from a midnight showing of the reggae film "The Harder They Come". The cab driver asked me out of the blue, "Is it true you can't tell the difference between +i and -i?"
Cambridge, MA but still ... unexpected.
If someone hands you a blank board representing the complex numbers, and offers to tell you either the sum or the product of any two places you put your fingers, you can work out most of the board rather quickly. There remains which way to flip the board, which way is up? +i and -i both square to -1.
This symmetry is the camel's nose under the tent of Galois theory, described in 1831 by Évariste Galois before he died in a duel at age twenty. This is one of the most amazing confluences of ideas in mathematics. It for example explains why we have the quadratic formula, and formulas solving degree 3 and 4 polynomials, but no general formula for degree 5. The symmetry of the complex plane is a toggle switch which corresponds to a square root. The symmetries of degree 3 and 4 polynomials are more involved, but can all be again translated to various square roots, cube roots... Degree 5 can exhibit an alien group of symmetries that defies such a translation.
The Greeks couldn't trisect an angle using a ruler and compass. Turns out the quantity they needed exists, but couldn't be described in their notation.
Integrating a bell curve from statistics doesn't have a closed form in the notation we study in calculus, but the function exists. Statisticians just said "oh, that function" and gave it a new name.
Roots of a degree 5 polynomial exist, but again can't be described in the primitive notation of square roots, cube roots... One needs to make peace with the new "simple group" that Galois found.
This is arguably the most mind blowing thing one learns in an undergraduate math education.
The original post is written by AI so I will read it briefly, but your comment is fascinating. I got through undergrad math by brute force memorization and taking the C. Or sometimes the C-. The underlying concepts were never really clear to me. I did take a good online calculus class later that helped.
However, I have questions:
"Turns out the quantity they needed exists, but couldn't be described in their notation" What is this about? Sounds interesting.
"Statisticians just said "oh, that function" and gave it a new name." What is this?
I never understood there is a relationship between quadratic equations and some kind of underlying mathematic geometric symmetry. Is there a good intro to this? I only memorized how to solve them.
And the existential question. Is there a good way to teach this stuff?
> I never understood there is a relationship between quadratic equations and some kind of underlying mathematic geometric symmetry.
In a polynomial equation, the coefficients can be written as symmetric functions of the roots: https://en.wikipedia.org/wiki/Vieta%27s_formulas - symmetric means it doesn't matter how you label the roots, because it would not make sense if you could say "r1 is 3, r2 is 7" and get a different set of coefficients compared to "r1 is 7, r2 is 3".
Since the coefficients are symmetric functions of the roots, that means that you can't write the roots as a function of the coefficients - there's no way to break that symmetry. This is where root extraction comes in - it's not a function. A function has to return 1 answer for a given input, but root extraction gives you n answers for the nth root of a given input. So that's how we're able to "choose" roots - consider the expression (r1 - r2) for a quadratic equation. That's not symmetric (the answer depends on which one we label as r1 and which we label as r2), so we can't write that expression as a function of the coefficients. But what about (r1 - r2)^2? That expression IS symmetric - you get the same answer regardless of how you label the roots. If we expand that out we get r1^2 - 2r1r2 + r2^2, which is symmetric, which means we can write it as a function of the coefficients. So we've come up with an expression whose square root depends on the way we've labeled the roots (using Vieta's formulas you can show it's b^2-4c for x^2 + bx + c, which you might recognize from the quadratic formula).
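A small numeric check of the argument above, as a sketch in Python. It assumes a monic quadratic x^2 + bx + c, and the helper names are made up for illustration.

```python
import cmath


def coefficients_from_roots(r1, r2):
    """Vieta: x^2 + b*x + c has b = -(r1 + r2) and c = r1 * r2."""
    return -(r1 + r2), r1 * r2


def roots_from_coefficients(b, c):
    """Recover the roots by taking a square root of the discriminant.

    The square root has two values, +s and -s; picking one of them is
    exactly the choice of which root gets labeled r1.
    """
    s = cmath.sqrt(b * b - 4 * c)
    return (-b + s) / 2, (-b - s) / 2


if __name__ == "__main__":
    # Relabeling the roots does not change the coefficients.
    print(coefficients_from_roots(3, 7), coefficients_from_roots(7, 3))
    b, c = coefficients_from_roots(3, 7)
    # (r1 - r2)^2 is symmetric and equals b^2 - 4c.
    print((3 - 7) ** 2, b * b - 4 * c)
    print(roots_from_coefficients(b, c))
```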
Galois theory is used to show that root extraction can only break certain types of symmetries, and that fifth degree polynomials can exhibit root symmetries that are not breakable by radicals.
The Greeks' "notation" was a diagram full of points labeled by letters (Α, Β, Γ, ...) with various lines connecting them and a list of steps to do with an unmarked ruler and a compass, some of which added new points. But those tools alone can't be used to describe cube roots of arbitrary numbers (or, equivalently, trisections of arbitrary angles).
> What is this [statisticians' function]?
The integral of the bell curve (normal distribution) is called its cumulative distribution function (CDF). The CDF of the normal distribution is closely related to a special function called the "error function" erf(x).
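For the standard relationship, the normal CDF is Phi(x) = (1 + erf(x / sqrt(2))) / 2. A minimal Python sketch using the standard library's `math.erf`; the function name `normal_cdf` and its `mu`/`sigma` parameters are my own choices.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, written in terms of erf."""
    z = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


if __name__ == "__main__":
    print(normal_cdf(0.0))                     # half the area lies below the mean
    print(normal_cdf(1.0) - normal_cdf(-1.0))  # area within one standard deviation
```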
> Is there a good intro to [the symmetry interpretation of the quadratic formula]?
What is so unbelievably frustrating about math education is that these interesting questions are not even hinted at until far, far down the line (and before people make the assumption I was educated outside the US).
I avoided math like the plague until my PhD program. Real analysis was a program requirement so I had to quickly teach myself calculus and get up to speed—and I found I really, really liked it. These high level questions are just so interesting and beyond the rote calculation I thought math was.
I hope I can give my daughter a glimpse of the interesting parts before the school system manages to kill her interest altogether (and I would welcome tips to that end if anyone has them).
Geometric proofs are really accessible. You don't need any algebra to prove Pythagoras' theorem, or that the sum of the inner angles of a triangle is 180 degrees, for example. Compass and straight-edge construction of simple figures is also fun.
> However, I have questions: "Turns out the quantity they needed exists, but couldn't be described in their notation" What is this about? Sounds interesting.
There are hierarchies of numbers (quantities) in mathematics, just as there are hierarchies of patterns (formal languages) in computer science, based on how difficult these objects are to describe. The most widely accepted hierarchy is actually the same in math and CS: rational, algebraic, transcendental.
In math, a rational number is one that can be described by dividing two integers. In CS, a rational pattern is one that can be described by a regular expression (regex). This is still "division": Even when we can't do 1-x or 1/x, we can recognize the pattern 1/(1-x) = 1 + x + x^2 + x^3... as "zero or more occurrences of x", written in a regex as x*.
In math, an algebraic number is one that can be found as a root of a polynomial with integer coefficients. The square root of 2 is the poster child, solving x^2 - 2 = 0, and "baby's first proof" in mathematics is showing that this is not a fraction of two integers.
In CS, an algebraic pattern is one that can be described using a stack machine. Correctly nested parentheses (()(())) is the poster child here; we throw plates on a stack to keep track of how deep we are. The grammars of most programming languages are algebraic: If the square root of math is like nested parentheses, then roots of higher degree polynomials are like more complicated nested expressions such as "if then else" statements. One needs lots of colors of plates, but same idea.
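The two CS poster children above can be sketched in a few lines of Python. This is only an illustration: `matches_x_star` and `is_balanced` are names I made up, and the stack here holds a single color of plate.

```python
import re


def matches_x_star(s):
    """Rational pattern: zero or more occurrences of x, i.e. the regex x*."""
    return re.fullmatch(r"x*", s) is not None


def is_balanced(s):
    """Algebraic pattern: correctly nested parentheses, checked with a stack."""
    stack = []
    for ch in s:
        if ch == "(":
            stack.append(ch)   # throw a plate on the stack
        elif ch == ")":
            if not stack:      # a close with nothing open
                return False
            stack.pop()        # take a plate off
        else:
            return False       # only parentheses are allowed here
    return not stack           # every plate must be gone at the end


if __name__ == "__main__":
    print(matches_x_star("xxx"), matches_x_star("xyx"))
    print(is_balanced("(()(()))"), is_balanced("(()"))
```

No regular expression in the textbook sense can replace `is_balanced`, because the nesting depth is unbounded; that is the gap between the rational and algebraic levels.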
In math, everything else (e, Pi, ...) is called transcendental. CS has more grades of eggs, but same idea.
One way to organize this is to take a number x and look at all expressions combining powers of x. If x^3 = 2, or more generally if x is the root of any polynomial, then the list of powers wraps around on itself, and one is looking at a finite dimensional space of expressions. If x is transcendental, then the space of expressions is infinite.
So where were the Greeks in all this? Figuring out where two lines meet is linear algebra, but figuring out where a line meets a circle uses the quadratic formula, square roots. It turns out that their methods could reach some but not all algebraic numbers. They knew how to repeatedly double the dimension of the space of expressions they were looking at, but for example they couldn't triple this space. The cube root of 2 is one of the simplest numbers beyond their reach. And "squaring the circle"? Yup, Pi is transcendental. Way out of their reach.
When you have a hammer you see nails. When you have a circle you see doubling.
You can: the equation x^2 = x holds for 1, but not for -1, so you can separate them. There is no way to write an equation without mentioning i (excluding cheating with Im, which again can't be defined without knowing i) that holds for i, but not -i.
Rather than incrementing each counter by one, dither the counters to reduce cache conflicts? So what if the dequeue becomes a bit fuzzy. Make the queue a bit longer, so everyone survives at least as long as they would have survived before.
Or simply use a prime length queue, and change what +1 means, so one's stride is longer than the cache conflict concern. Any stride will generate the full cyclic group, for a prime.
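A quick Python sketch of the stride idea, just to check the group-theory claim; it says nothing about actual cache behavior, and `visit_order` is a hypothetical helper, not anything from the article. The stride has to be nonzero modulo the length.

```python
def visit_order(length, stride):
    """Slots visited by starting at 0 and repeatedly adding stride mod length."""
    slot = 0
    order = []
    for _ in range(length):
        order.append(slot)
        slot = (slot + stride) % length
    return order


if __name__ == "__main__":
    # Prime length: every nonzero stride visits all 13 slots before repeating.
    print(visit_order(13, 5))
    # Composite length: stride 8 in a queue of 12 keeps landing on 3 slots.
    print(sorted(set(visit_order(12, 8))))
```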
My understanding is that there is no global ordering anyway: allocation of a node has a total order, but actually writing data to a node can be arbitrarily delayed. At this point might as well use separate queues for each writer.
edit: the author claims the algorithm is linearizable, so I'll keep reading the article.
edit2: Right, writer acquires ticket, when done attempts to set the done flag on node. Reader acquires ticket, attempts to consume the node. On failure the writer will retry. Even if you have one producer and one consumer, might you not, theoretically, end up in a scenario where the writer never makes progress? I guess as technically no node was produced, this is still technically lock-free, but still...
... oh, I should have read it further, the single writer scenario (even in the dynamic case) is special cased! It will still produce the node.
I once toured a dairy farm that had been a pioneer test site for Lasix. Like all good hippies, everyone I knew shunned additives. This farmer claimed that Lasix wasn't a cheat because it only worked on really healthy cows. Best practices, and then add Lasix.
I nearly dropped out of Harvard's mathematics PhD program. Sticking around and finishing a thesis was the hardest thing I've ever done. It didn't take smarts. It took being the kind of person who doesn't die on a mountain.
There's a legendary Philadelphia cook who does pop-up meals, and keeps talking about the restaurant he plans to open. Professional chefs roll their eyes; being a good cook is a small part of the enterprise of engineering a successful restaurant.
(These are three stool legs. Neurodivergents have an advantage using AI. A stool is more stable when its legs are further apart. AI is an association engine. Humans find my sense of analogy tedious, but spreading out analogies defines more accurate planes in AI's association space. One doesn't simply "tell AI what to do".)
Learning how to use AI effectively was the hardest thing I've done recently, many brutal months of experiment, test projects with a dozen languages. One maintains several levels of planning, as if a corporate CTO. One tears apart all code in many iterations of code review. Just as a genius manager makes best use of flawed human talent, one learns to make best use of flawed AI talent.
My guess is that programmers who write bad code with AI were already writing bad code before AI.
When I was a senior at Swarthmore College, Herb Wilf came over from U Penn to teach a course in combinatorial algorithms. I was encouraged to attend.
He claimed that choosing a subset of k integers at random from {1..n} should have a log in its complexity, because one needs to sort to detect duplicates. I realized that if one divided [1..n] into k bins, one could detect duplicates within each bin, for a linear algorithm. I chose bubble sort because the average occupancy was 1, so bubble sort gave the best constant.
I described this algorithm to him around 5pm, end of his office hours as he was facing horrendous traffic home. I looked like George Harrison post-Beatles, and probably smelled of pot smoke. Understandably, he didn't recognize a future mathematician.
Around 10pm the dorm hall phone rang, one of my professors relaying an apology for brushing me off. He got it, and credited me with many ideas in the next edition of his book.
Of course, I eventually found all of this and more in Knuth's books. I was disillusioned, imagining that adults read everything. Later I came to understand that this was unrealistic.
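For readers who want to see the binning idea from this story, here is a rough Python reconstruction. It is my own sketch from the description above, not the code from the story or from Wilf's book: the name `sample_sorted` is invented, the per-bin sort uses Python's built-in sort as a stand-in for bubble sort, and the expected running time is only roughly linear when k is well below n (as k approaches n, rejected duplicates pile up).

```python
import random


def sample_sorted(n, k, rng=random):
    """Choose k distinct integers from 1..n and return them in sorted order.

    Values are dropped into k bins by magnitude, so a duplicate can only
    collide inside its own bin, and average bin occupancy is 1.
    """
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    if k == 0:
        return []
    bins = [[] for _ in range(k)]
    chosen = 0
    while chosen < k:
        x = rng.randint(1, n)
        bucket = bins[(x - 1) * k // n]  # bin index 0..k-1, increasing with x
        if x in bucket:                  # duplicate check scans one short bin
            continue
        bucket.append(x)
        chosen += 1
    result = []
    for bucket in bins:
        bucket.sort()                    # tiny lists; stand-in for bubble sort
        result.extend(bucket)            # bins are already in increasing order
    return result


if __name__ == "__main__":
    print(sample_sorted(100, 10, random.Random(0)))
```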
Love anecdotes like this! But admittedly I feel a bit lost, so please forgive my ignorance when I ask: why does choosing a subset of k integers at random require deduplication? My naive intuition is that sampling without replacement can be done in linear time (hash table to track chosen elements?). I’m probably not understanding the problem formulation here.
your random number function might return the same number multiple times? So to choose k random but unique numbers you may have to call the random number function more than k times?
Of course my intuition would be that you can do a random shuffle and then take the first k, which is O(N). So I might be misunderstanding.
1977. And I didn't know what a hash table was, though I can't explain now why they didn't think of using a hash table. I was effectively using a dumb hash function.
Their 1978 second edition works in exactly the memory needed to store the answer, by simulating my algorithm in a first pass but only saving the occupancy counts.
Oh, and thanks (I guess). I really didn't expect to ever be reading FORTRAN code again. One learned to program at Swarthmore that year by punching cards, crashing our IBM 1130, and bringing the printout to my supervisor shift. I'd find the square brackets and explain how you'd overwritten your array. I even helped an economics professor Frederic Pryor (the grad student in the "Bridge of Spies" cold war spy swap) this way, when I made an ill-advised visit to the computer center on a Saturday night. Apparently I could still find square brackets.
I always book directly with hotels, airlines, and auto rental agencies. One gains privileges by cutting out the middleman.
For example, one can generally check out early. We had followed a hotel reservation from locals in Nagoya, and found ourselves in a stodgy "classic" hotel. We were able to pivot to possibly the nicest corner suite in the entire city, at a steep last minute discount.
I did get trapped once, not realizing that my Hiroshima hotel became nonrefundable several days before check-in. With a phone call they moved my reservation to a few days later as a courtesy. The web page then let me cancel.