Probability Is Not A Substitute For Reasoning

Several Rationalists objected to my recent “Against AGI Timelines” post, in which I argued that “the arguments advanced for particular timelines [to AGI]—long or short—are weak”. This disagreement is broader than predicting the arrival of one speculative technology. It illustrates a general point about when thinking in explicit probabilities is productive and illuminating vs when it’s misleading, confusing, or a big waste of effort.

These critics claim that a lack of good arguments is no obstacle to using AGI timelines, so long as those timelines are expressed as a probability distribution rather than a single date. See e.g. Scott Alexander on Reddit, Roko Mijic on Twitter, and multiple commenters on LessWrong.[1]

And yes, if you must have AGI timelines, then having a probability distribution is better than just saying “2033!” and calling it a day. But even then, your probability distribution is still crap and no one should use it for anything. Expressing yourself in terms of probabilities does not absolve you of the necessity of having reasons for things. These critics don’t claim to have good arguments for any particular AGI timeline. As far as I can tell, they agree with my post’s central claim: that there’s no solid reasoning behind any of the estimates that get thrown around.

You can use bad arguments to guess at a median date, and you will end up with noise and nonsense like “2033!”. Or you can use bad arguments to build up a probability distribution… and you will end up with noise and nonsense expressed in neat rows and figures. The output will never be better than the arguments that go into it![2]

As an aside, it seems wrong to insist that I engage with people’s AGI timelines as though they represent probability distributions, when for every person who has actually sat down and thought through their 5%/25%/50%/75%/95% thresholds and spot-checked this against their beliefs about particular date ranges and etc etc in order to produce a coherent distribution of probability mass, there are dozens of people who just espouse timelines like “2033!”.
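For anyone curious what that legwork actually involves, here is a minimal sketch in Python (using scipy, with invented threshold years that are not anyone’s real forecast): fit a distribution to elicited 5%/25%/50%/75%/95% thresholds, then spot-check what it implies about particular date ranges.

```python
# A sketch of the exercise described above: turn elicited 5%/25%/50%/75%/95%
# thresholds into one coherent distribution, then spot-check its implications.
# The threshold years are invented for illustration, not anyone's actual forecast.
import numpy as np
from scipy import stats, optimize

quantiles = np.array([0.05, 0.25, 0.50, 0.75, 0.95])
years = np.array([2028, 2035, 2045, 2070, 2120])  # hypothetical elicited answers
now = 2024

# Fit a lognormal over (year - now) whose quantiles best match the elicited years.
def loss(params):
    mu, log_sigma = params
    fitted = now + stats.lognorm.ppf(quantiles, s=np.exp(log_sigma), scale=np.exp(mu))
    return np.sum((fitted - years) ** 2)

res = optimize.minimize(loss, x0=[3.0, 0.0], method="Nelder-Mead")
mu, log_sigma = res.x
dist = stats.lognorm(s=np.exp(log_sigma), scale=np.exp(mu))

# Spot-check: does the implied mass on a particular range match what you believe?
p_2030s = dist.cdf(2040 - now) - dist.cdf(2030 - now)
print(f"Implied P(arrival during the 2030s) ~= {p_2030s:.0%}")
```

Doing this, and then checking the implied probabilities against your actual beliefs about particular ranges, is the difference between having a distribution and gesturing at one.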

Lots of people, Rationalists especially, want the epistemic credit for moves that they could conceivably make in principle but have not actually made. This is bullshit. Despite his objection above, even Alexander—who is a lot more rigorous than most—is still perfectly happy to use single-date timelines in his arguments, and to treat others’ probability distributions as interchangeable with their median dates:

“For example, last year Metaculus thought human-like AI would arrive in 2040, and superintelligence around 2043 … If you think [AGI arrives in] 2043, the people who work on this question (“alignment researchers”) have twenty years to learn to control AI.”

Elsewhere he repeats this conflation and also claims he discards the rest of the probability distribution [emphasis mine]:

“I should end up with a distribution somewhere in between my prior and this new evidence. But where?

I . . . don’t actually care? I think Metaculus says 2040-something, Grace says 2060-something, and Ajeya [Cotra] says 2050-something, so this is basically just the average thing I already believed. Probably each of those distributions has some kind of complicated shape, but who actually manages to keep the shape of their probability distribution in their head while reasoning? Not me.

Once you’ve established that you ignore the bulk of the probability distribution, you don’t get to fall back on it when critiqued. But if Alexander doesn’t actually have a probability distribution, then plausibly one of my other critics might, and Cotra certainly does. Some people do the real thing, so let’s end this aside about the many who gesture vaguely at “probability distributions” without putting in the legwork to use one. If the method actually works, then the few who follow through are the ones who matter, so let’s return to the main argument and see whether it does.

Does it work? Should we use their probability distributions to guide our actions, or put in the work to develop probability distributions of our own?

Suppose we ask an insurance company to give “death of Ben Landau-Taylor timelines”. They will be able to give their answer as a probability distribution, with strong reasons and actuarial tables in support of it. This can bear a lot of weight, and is therefore used as a guide to making consequential decisions—not just insurance pricing, but I’d also use this to evaluate e.g. whether I should go ahead with a risky surgery, and you bet your ass I’d “keep the shape of the probability distribution in my head while reasoning” for something like that. Or if we ask a physicist for “radioactive decay of a carbon-14 atom timelines”, they can give a probability distribution with even firmer justification, and so we can build very robust arguments on this foundation. This is what having a probability distribution looks like when people know things—which is rarer than I’d like, but great when you can get it.
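The carbon-14 case is worth spelling out, because it is the rare situation where the distribution really is pinned down: the waiting time until a single atom decays is exponentially distributed, determined entirely by the measured half-life of roughly 5,730 years. A short sketch of what that buys you:

```python
# The carbon-14 case from above: the waiting time until a single atom decays
# is exponentially distributed, pinned down by the measured half-life
# (roughly 5,730 years). Every number here follows from that one constant.
import math

HALF_LIFE_YEARS = 5730.0                      # measured half-life of carbon-14
decay_rate = math.log(2) / HALF_LIFE_YEARS    # lambda in the exponential model

def prob_decay_within(years: float) -> float:
    """Probability that a given C-14 atom decays within `years` from now."""
    return 1.0 - math.exp(-decay_rate * years)

print(f"P(decay within 100 years)  = {prob_decay_within(100):.4f}")   # ~0.012
print(f"P(decay within 5730 years) = {prob_decay_within(5730):.4f}")  # 0.5, by definition of half-life
```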

Suppose we ask a well-calibrated general or historian for “end of the Russia-Ukraine war timelines” as a probability distribution.[3] Most would answer based on their judgment and experience. A few might make a database of past wars and sample from that, or something. Whatever the approach, they’ll be able to give comprehensible reasons for their position, even if it won’t be as well-justified and widely-agreed-upon as an actuarial table. People like Ukrainian refugees or American arms manufacturers would do well to put some weight on a distribution like this, while maintaining substantial skepticism and uncertainty rather than taking the numbers 100% literally. This is what having a probability distribution looks like when people have informed plausible guesses, which is a very common situation.
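For the “database of past wars” approach, a minimal sketch of what that might look like, with invented durations standing in for a real dataset:

```python
# One way to do the "database of past wars" approach mentioned above:
# treat historical war durations as an empirical base rate and read off
# quantiles. The durations below are placeholders, not a real dataset.
import numpy as np

past_war_durations_years = np.array([0.1, 0.5, 1, 1, 2, 3, 4, 4, 6, 8, 9, 20])  # invented

elapsed = 2.0  # years the current war has already lasted (illustrative)
# Condition on the war having lasted this long already, then look at remaining time.
relevant = past_war_durations_years[past_war_durations_years > elapsed]
remaining = relevant - elapsed

for q in (25, 50, 75):
    print(f"{q}th percentile of remaining duration: {np.percentile(remaining, q):.1f} years")
```

The numbers such a base rate produces are only as good as the reference class behind them, which is exactly why this sits a rung below the actuarial table.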

Suppose we ask the world’s most renowned experts for timelines to peak global population. They can indeed give you a probability distribution, but the result won’t be very reliable at all—the world’s most celebrated experts have been getting this one embarrassingly wrong for two hundred years, from Thomas Malthus to Paul Ehrlich. Their successors today are producing timelines with probabilistic prediction intervals showing when they expect the growth of the world population to turn negative.[4] If this were done with care then it might arguably be worth putting some weight on the result, but no matter how well you do it, this would be a completely different type of object from a carbon-14 decay table, even if both can be expressed as probability distributions. The arguments just aren’t there.

The timing of breakthrough technologies like AGI is even less amenable to quantification than the peak of world population. A lot less. Again, the critics I’m addressing don’t actually dispute that we have no good arguments for this; the only people who argued with this point were advancing (bad) arguments for specific short timelines. The few people who have any probability distributions at all give reasons which are extremely weak at best, if not outright refutable, or sometimes even explicitly deny the need to have a justification.

This is not what having a probability distribution looks like when people know things! This is not what having a probability distribution looks like when people have informed plausible guesses! This is just noise! If you put weight on it then the ground will give way under your feet! Or worse, it might be quicksand, sticking you to an unjustified—but legible!—nonsense answer that’s easy to think about yet unconnected to evidence or reality.

The world is not obligated to give you a probability distribution which is better or more informative than a resigned shrug. Sometimes we have justified views, and when we do, sometimes probabilities are a good way of expressing those views and the strength of our justification. Sometimes we don’t have justified views and can’t get them. Which sucks! I hate it! But slapping unjustified numbers on raw ignorance does not actually make you less ignorant.


[1] While I am arguing against several individual Rationalists here, this is certainly not the position of all Rationalists. Others have agreed with my post. In 2021 ur-Rationalist Eliezer Yudkowsky wrote:

“I feel like you should probably have nearer-term bold predictions if your model [of AGI timelines] is supposedly so solid, so concentrated as a flow of uncertainty, that it’s coming up to you and whispering numbers like “2050” even as the median of a broad distribution. I mean, if you have a model that can actually, like, calculate stuff like that, and is actually bound to the world as a truth.

If you are an aspiring Bayesian, perhaps, you may try to reckon your uncertainty into the form of a probability distribution … But if you are a wise aspiring Bayesian, you will admit that whatever probabilities you are using, they are, in a sense, intuitive, and you just don’t expect them to be all that good.

I have refrained from trying to translate my brain’s native intuitions about this into probabilities, for fear that my verbalized probabilities will be stupider than my intuitions if I try to put weight on them.”

Separately, “Against AGI Timelines” got a couple of other Rationalist critics who do claim to have good arguments for short timelines. I’m not persuaded, but they are at least not making the particular mistake that I’m arguing against here.

[2] It’s not a priori impossible that there could ever be a good argument for a strong claim about AGI timelines. I’ve never found one and I’ve looked pretty damn hard, but there are lots of things that I don’t know. However, if you want to make strong claims—and “I think AGI will probably (>80%) come in the next 10 years” is definitely a strong claim—then you need to have strong reasons.

[3] The Good Judgment Project will sell you their probability distribution on the subject. If I were making big decisions about the war then I would probably buy it, and use it as one of many inputs into my thinking.

[4] I’m sure every Rationalist can explain at a glance why the UN’s 95% confidence range here is hot garbage. Consider this a parable about the dangers of applying probabilistic mathwashing to locally-popular weakly-justified assumptions.
