LessWrong.com News

A community blog devoted to refining the art of rationality

Compute Curse

April 4, 2026 - 17:47

Epistemic status: romantic speculation.

The core claim: it occurred to me that compute growth can be rather neatly analogized to natural resource abundance.

Before compute curse, there was resource curse

Countries that discover oil often end up worse off than countries that don't, which is known as the resource curse. The mechanisms are well-understood: a booming resource sector draws capital and labor away from other industries, creates incentives for rent-seeking over productive investment, crowds out human capital development, and corrodes the institutions needed to sustain long-term growth.

I argue that something structurally similar has been happening with compute. The exponential growth of available computation over the past several decades, and, critically, the widespread expectation that this growth would continue, has created a pattern of resource allocation, talent distribution, and research prioritization that mirrors the resource curse in specific and non-metaphorical ways.

Note: this is not a claim that extensive compute growth has been net negative (nor is it the opposite claim).

Dutch disease comes for ASML

The original Dutch disease mechanism is straightforward: when a booming sector (say, natural gas extraction) generates high returns, it pulls capital and labor out of other sectors (say, manufacturing), causing them to atrophy. The non-booming sectors don't decline because they became less valuable in absolute terms but rather because the booming sector offers relatively better returns, and resources flow accordingly.

A trivial version of "compute Dutch disease" goes like this: because scaling compute yields such reliable, legible, and fundable returns (train a bigger model, get a better benchmark score, publish the paper, raise the round), it systematically starves research directions that are harder to fund, harder to evaluate, and slower to produce results, even when those directions might be more consequential in the long run.

So "The Bitter Lesson" can be seen as the Dutch disease of AI research, once we add that the fact that scaling works does not mean the crowding-out of alternatives is costless. Or, in other words, "scaling works better" is less a fact about computer science in general than a fact about our ability to do programming or even, if we follow this line of reasoning to its end, about our economic and educational institutions.

However, I consider it only the most recent and prominent manifestation of a phenomenon that has been unfolding for decades.

Since at least the late 1990s, the reliable cheapening of compute has made it consistently more profitable to build compute-intensive solutions to problems than to invest in the kind of deep, careful engineering that produces efficient, well-understood systems. When you can always count on next year's hardware being faster and cheaper, the rational business decision is to ship bloated software now and let Moore's Law clean up after you, rather than spending the additional engineering time to make something lean and correct. This created an entire economy of applications, business models and platform architectures that are, in a meaningful sense, the technological equivalent of an oil-dependent monoculture: they exist not because they represent the best way to solve a problem, but because abundant compute made them the cheapest way to ship a product.

The consequences are visible across the entire stack. Web applications that would have run comfortably on a 2005-era machine now require gigabytes of RAM to render what is essentially styled text. Electron-based desktop apps ship an entire browser engine to display a chat window. Backend services that could be handled by a well-designed program running on a single server are instead distributed across sprawling microservice architectures that consume orders of magnitude more compute. Cory Doctorow's "enshittification" framework is about the user-facing result of this dynamic, but the deeper structural story is about how compute abundance degraded the craft of software engineering itself, well before anyone started worrying about ChatGPT replacing programmers.

This is the Dutch disease pattern operating at the level of the entire technology economy: the booming sector (scale-dependent applications) drew capital and talent away from the non-booming sector (careful engineering, deep technical innovation, computationally parsimonious approaches), and the non-booming sector atrophied accordingly.

But of course the AI case is qualitatively different, and the most sorrowful, because it has resulted in humanity trying to build superintelligence out of giant instructable deep learning models.

Human capital crowding-out

Resource curse economies characteristically underinvest in education and human capital development. The relative returns to education are lower in resource-dependent economies because the booming sector doesn't require a broadly educated population. 

The compute version of this story has been playing out for at least a decade, well before the current discourse about AI replacing jobs and destroying university education. The entire trajectory of computer science education shifted from "understand the fundamentals deeply" toward "learn to use frameworks and APIs that abstract over compute." 

There is also a more direct talent-siphoning effect: the IT economy has been pulling the most capable technical minds into a narrow set of activities and away from a much broader set of technical and scientific challenges. 

The voracity effect and race dynamics

In the resource curse literature, there is a so-called "voracity effect": when competing interest groups face a resource windfall, they respond by extracting more aggressively, leading to worse outcomes than moderate scarcity would produce. Rather than investing the windfall prudently, competing factions race to capture as much of it as possible before others do.

I leave this without a direct comment and let the reader have their own pleasure of meditating on this. 

But compute growth is endogenous!

The resource curse in its classical form operates on an exogenous endowment: countries don't choose to have oil reserves, they discover them, and then the political economy warps around that windfall. Much of the pathology comes from the unearned nature of the wealth: it enables rent-seeking, weakens the link between effort and reward, and corrodes institutions.

Compute, by contrast, is endogenously produced through deliberate R&D and engineering investment. Moore's Law was never a law of nature.

Right?

I mean, to me Moore’s Law looks like a strong default of any humanity-like civilization. It is created by humans, yes, but it is created in a manner that is hard to avoid.

The counterfactual question

The resource curse literature has natural counterfactuals (resource-poor countries that developed strong institutions and diversified economies: Japan, South Korea, Singapore). What's the compute-curse counterfactual? A world where compute grew more slowly and we consequently invested more in elegant algorithms, interpretable models, and formal methods?

It's plausible, but it's also possible that slower compute growth would have simply meant less progress overall rather than differently-directed progress. I don’t know. As I said at the beginning, this is speculation.

However, one can trivially note that in a world with less compute abundance, the relative returns to algorithmic cleverness, interpretability research, and formal verification would have been higher, because you couldn't just solve problems by throwing more FLOPs at them. And that may or may not have led to better outcomes in the long run (I am basically leaving aside the question of ASI development here and just talking about rather “normal” tech and science R&D).

People Actually Thought About This!

Two existing frameworks are close to what I'm describing, but both point the analogy in different directions.

The Intelligence Curse (Luke Drago and Rudolf Laine, 2025) uses the resource curse analogy to argue that AGI will create rentier-state-like incentives: powerful actors who control AI will lose their incentive to invest in regular people, just as petrostates lose their incentive to invest in citizens. This is a compelling argument about the distributional consequences of AGI, but it's about what happens after AGI arrives. The compute curse is about what's happening now, during the process of building toward AGI, and about how the abundance of compute is distorting that process itself.

The Generalized Dutch Disease (Policy Tensor, Feb 2026) is about the macroeconomic effects of the compute capex boom on US manufacturing competitiveness, showing that it operates through the same channels as the fracking boom and the pre-2008 financial boom. This is the closest existing work to what I'm describing, but it stays within the macroeconomic framing (factor prices, unit labor costs, exchange rate effects) and doesn't address the innovation-direction distortion, human capital crowding-out in the intellectual sense, or the AI safety implications.

But: compute curse may actually be worse than resource curse 

Some of the negative downstream effects of compute abundance don't map onto the resource curse framework directly but are worth including for completeness, since they stem from the same underlying cause (cheap, abundant compute enabling activities that wouldn't otherwise be viable):

  • Social media and attention economy pathologies
  • Surveillance infrastructure
  • Targeted public opinion manipulation
  • And of course many AI safety issues

These are not Dutch disease effects, just straightforward negative externalities of cheap compute. But they suggest that the full accounting of compute abundance's costs is substantially larger than the resource curse analogy alone would indicate.





Self-Aware Confabulation

April 4, 2026 - 16:46

All men are frauds. The only difference between them is that some admit it. I myself deny it.

― H. L. Mencken

I think where I am not, therefore I am where I do not think. I am not whenever I am the plaything of my thought; I think of what I am where I do not think to think.

― Jacques Lacan

Conscience is the inner voice that warns us somebody may be looking.

― H. L. Mencken, again

The Elephant in the Brain by Robin Hanson and Kevin Simler was the piece that first introduced me to the idea. I often felt that the Elephant's takes were overly cynical, and the same goes for other pieces of Hanson's writing. That is, I felt so before I read Edward Teach's Sadly, Porn, which is outright misanthropic, and still feels pretty accurate whenever I can make any sense of it.

The core thesis in both of these books is somewhat similar. The Elephant in the Brain says that there's an unconscious part, the Elephant, that does self-interested stuff like status-seeking. To be able to present a prosocial personality, we then have a separate layer that interprets the actions of the Elephant in a good light. It's hard to call this lying because, to be believable, we have to believe it ourselves first. And our brains are great at pattern matching and forgetting conflicting details.

Sadly, Porn goes the other way, or perhaps just further. You've domesticated the Elephant. It no longer dares to do the self-interested actions. It's afraid of failure. The internal narrator is repurposed from defending our selfish actions to others, into explaining our lack of actions to ourselves. We're lying to ourselves, trying to uphold our own story of having a high status. Other people are mostly required for external approval.

Reading these books was like partially breaking the 4th wall of the narrator. It became self-aware. Of course I could be just imagining a minor enlightenment instead of experiencing it. That would be such a Sadly, Porn-style mental move. Perhaps we could test this by consciously changing the actions of the Elephant? How would one interpret the results instead of retreating to another abstraction level with the lies? It seems really hard to point at something you've done and say "the Elephant did that".

Both of these models started to look a bit lacking after actually internalizing them. For a long time, I was having a really hard time identifying any motivating factors besides physical needs, hedonism, and status-seeking, thinking that anyone doing other things was lying to either themselves or others, or both. I still somewhat hold these views; I just don't think that the lying part is so absolute. People also have aesthetic preferences (read: values) that do not have obvious self-interested purpose.

But as the saying goes, "all models are wrong, some are useful". I've found both of these quite useful in modelling how others behave. And how I behave, too, although disregarding the narrator's explanations is tedious and squeamish work, as convenient answers look very appealing. And all this self-reflection seems to be mostly for entertainment anyway; for actual results I use more powerful tools.




Mean field sequence: an introduction

April 4, 2026 - 10:30

This is the first post in a planned series about mean field theory by Dmitry and Lauren (this post was generated by Dmitry with lots of input from Lauren, and was split into two parts, the second of which is written jointly). These posts are a combination of an explainer and some original research/experiments.

The goal of these posts is to explain an approach to understanding and interpreting model internals which we informally denote "mean field theory" or MFT. In the literature, the closest matching term is "adaptive mean field theory". We will use the term loosely to denote a rich emerging literature that applies many-body thermodynamic methods to neural net interpretability. It includes work on both Bayesian learning and dynamics (SGD), and work in wider "NNFT" (neural net field theory) contexts. Dmitry's recent post on learning sparse denoising also heuristically fits into this picture (or more precisely, a small extension of it).

Our team at Principles of Intelligence (formerly PIBBSS) believes that this point of view on interpretability remains highly neglected: it should be better understood, and its ideas should be used much more in interpretability thinking and tools.

We hope to formulate this theory in a more user-friendly form that can be absorbed and used by interpretability researchers. This particular post is closely related to the paper "Mitigating the Curse of Detail: Scaling Arguments for Feature Learning and Sample Complexity". The experiments are new.

What do we mean by mean field theory

Mean field theory is a vague term with many meanings, but for the first few posts at least we will focus on adaptive mean field theory (see for example this paper, written with a physicist audience in mind). It is a theory of infinite-width systems that is different from the more classical (and, as I'll explain below, less expressive) neural tangent kernel formalism and related Gaussian Process contexts. Ultimately it is a theory of neurons (which are treated somewhat like particles in a gas). While every single neuron in the theory is a relatively simple object, the neurons in a mean field picture allow for an emergent large-scale behavior (sometimes identified with "features") that permits us to see complex interactions and circuits in what is a priori a "single-neuron theory". These cryptic phrases will hopefully be better understood as this post (and more generally as this series) progresses.
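To make the "neurons as particles" picture slightly more concrete, here is the standard two-layer setup in generic notation (this is textbook convention, not notation taken from this series): the mean-field parameterization averages the N neurons with a 1/N prefactor, in contrast to the 1/√N prefactor of the NTK/Gaussian-process scaling:

```latex
f_{\mathrm{MF}}(x) \;=\; \frac{1}{N}\sum_{i=1}^{N} a_i\,\sigma(w_i \cdot x),
\qquad
f_{\mathrm{NTK}}(x) \;=\; \frac{1}{\sqrt{N}}\sum_{i=1}^{N} a_i\,\sigma(w_i \cdot x).
```

Under the 1/N scaling the output is an empirical average over neuron "particles" $(a_i, w_i)$, so at large N the network is described by the distribution of a single neuron evolving self-consistently against the background of all the others.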

Why MFT

We ultimately want to understand the internals of neural nets to a degree that lets us robustly (and ideally, in some sense "safely") interpret why a neural net makes a particular decision. So one might say that this implies that we should only care about theories that apply directly to real models. Finite width, large depth, etc. While this is fair, any interpretation must ultimately rely on some idealization. When we say "we have interpreted this mechanism", we mean that there is some platonic gadget or idealized model that has a mechanism "that we understand", and the real model's behavior is explained well by this platonic idealization. Thus making progress on interpretability requires accumulating an encyclopedia (or recipe book) of idealizations and simplified models. The famous SAE methodology is based on trying to fit real neural nets into an idealization inherited from compressed sensing (a field of applied math). As we will explain below, if we never had Neel Nanda's interpretation of the modular addition algorithm, we would get it "for free" by applying a mean field analysis to the related infinite-width model. As it were, the two use the same Platonic idealization[1]. Thus at least one view on the use of theory is to see it as a source of useful models that can then be applied to more realistic settings (with suitable modification, and, at least until a "standard model" theory of interpretability exists, necessarily incompletely). Useful theories should be simple enough to analyse mathematically (maybe with some simplifications, assumptions, etc.) and rich enough to illuminate new structure. We think that mean field theory (and its relatives) is well-positioned to take such a role.
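As an aside, the compressed-sensing idealization behind the SAE methodology mentioned above is compact enough to write down. The sketch below is my illustration (all names and shapes are made up for the example, not from any particular SAE codebase): activations are modeled as sparse nonnegative combinations of dictionary directions, and the loss is reconstruction error plus an L1 sparsity penalty on the latent code.

```python
import numpy as np

# Minimal sketch of the sparse autoencoder (SAE) idealization: an
# activation vector x is encoded into a sparse nonnegative code f via a
# ReLU encoder, decoded back via a linear dictionary, and scored by
# reconstruction error plus an L1 penalty encouraging sparsity.
rng = np.random.default_rng(0)
d_model, d_sae = 16, 64                          # hypothetical sizes

W_enc = rng.standard_normal((d_sae, d_model)) / np.sqrt(d_model)
b_enc = np.zeros(d_sae)
W_dec = rng.standard_normal((d_model, d_sae)) / np.sqrt(d_sae)

def sae_loss(x, l1_coeff=1e-3):
    f = np.maximum(W_enc @ x + b_enc, 0.0)       # sparse latent code
    x_hat = W_dec @ f                            # reconstruction
    recon = np.sum((x - x_hat) ** 2)             # squared error
    sparsity = l1_coeff * np.sum(np.abs(f))      # L1 penalty on the code
    return recon + sparsity

x = rng.standard_normal(d_model)
print(sae_loss(x))  # a positive scalar for a random activation
```

Training minimizes this loss over a large activation dataset; the learned dictionary columns of `W_dec` are then read as candidate "features".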

Brief FAQ section

"Frequently asked questions about MFT" is a big topic that can be its own post. But before diving into a more technical introduction, we should address a few standard questions which keep cropping up, especially about comparisons between MFT and other better-known infinite-width limits.

  1. Doesn't infinite width mean that we're in the NTK (or more generally a Gaussian process) regime? The first analyses of neural nets at infinite width were in the so-called NTK regime, where in particular the model "freezes" to its prior/initialization at all but the last layer (which is performing linear regression). This is a remarkably deep picture that is, for example, sufficient to learn MNIST. But approaches in this family exhibit extremely different behaviors from realistic nets (in particular the freezing of early neurons), and they generalize much worse on problems that cannot be solved by some combination of clustering and linear regression (MNIST being an example of one that can). For example these methods learn only memorizing circuits in modular addition (at least in known regimes) and, worse, they are known to require exponential training data and complexity for learning algorithms that are well-known to be learnable by SGD (see for example the leap complexity paper) – this means that these techniques are fundamentally incompatible with these settings (more generally, so-called "compositional" models - ones that have multiple serial steps, which models tend to need depth for - have similar failures in this regime). This can be partially improved by including so-called "correction terms", but these only work when the Gaussian process has good performance by itself, and fail to ameliorate the exponential complexity issues. Note that the Gaussian process picture is useful as a heuristic baseline. In particular it makes some predictions on scaling exponents that have some experimental agreement (and is related to the muP formalism).

    It turns out that the lack of expressivity of the Gaussian limit is due not to the infinite width itself but to a certain choice of how to take the infinite limit (and in particular how to scale weight regularization terms in the loss). Different limits and scalings give significantly more expressive behaviors, as we shall see, and we use MFT as a catch-all term for these. (These different limits are also harder in general, at least in terms of exact mathematical analysis: the Gaussian process limit somewhat compensates for its lack of expressivity by having much easier math.)
  2. Isn't mean field theory only a Bayesian learning theory and doesn't that make it unrealistic? In physics contexts (like MFT, Gaussian Process learning, etc.) Bayesian learning is often theoretically easier to deal with, and we'll explain Bayesian learning predictions here (validated by tempering experiments). However a version of mean field for SGD learning exists and is called "Dynamical Mean Field Theory" (DMFT) (it extends the NTK in Gaussian process contexts). Probably more relevantly, Bayesian learning experiments frequently find similar structures to gradient-based methods (and are often easier to analyse). This is particularly well demonstrated in empirical results by the Timaeus group.
  3. Is mean field theory a theory of shallow models? Most existing papers on mean field theory work in the context of 2-layer neural nets (i.e. 2 linear layers, one nonlinear layer). However there is a fully general, and experimentally robust extension of the theory to a larger number of layers (see for example this lecture series), and we will look at such models here. In fact mean field theory can model mechanisms of arbitrary depth - but it works best for shallower models (or for shallow mechanisms in deep models), and would likely be less useful for modeling strongly depth-dependent phenomena.
  4. What is a success of mean field theory I should know about? Glad you asked! Most people know about the modular addition task, which was first explained mechanistically by Neel Nanda et al.'s grokking paper. The interpretation is heuristic: it shows that the model exhibits signatures of using a nice and unexpected trigonometric trick. It also interpolates between generalization and memorization in a sudden shift reminiscent of a phase transition. A more ambitious task (one considered too hard to tackle in the interpretability community) would be to understand exactly what the model learns on a neuron-by-neuron basis in any setting that exhibits generalization/grokking. Since models have inherent randomness (from initialization, and sometimes from SGD), the task is inherently a statistical one: explain the probability distribution over the weights of learned models (at least to a suitable level of precision). This was generally believed to be quite hard, so it comes as a surprise to practitioners of interpretability that there is in fact a context where it has been done.

    In the paper "Grokking as a First-order Phase Transition in Two Layer Networks", Rubin, Seroussi and Ringel constructed a complete explanation (experimentally verified to extremely high precision) for the modular addition network in the Bayesian learning setting (there are some other differences from Neel Nanda's approach, most notably the choice of loss function, but variants of the approach extend to these as well). The distribution is first understood at infinite width, then shown to apply at realistic (but large) width in the appropriate regime. When applying the adaptive mean field theory approach to this task, Fourier modes and the trigonometric mechanism fall out as a natural output of the theory – moreover they are fully explained on a statistical distribution level (i.e. we have a complete model "exactly what each neuron does" to an appropriate degree of precision, understood in a statistical physics sense). Of particular interest, the model explains a grokking-like phase transition between memorization (equivalently, a Gaussian process-like behaviour) and generalization (inherently mean field) and predicts the data fraction at which it happens (this is a Bayesian learning analog of predicting the distribution of when grokking happens in SGD-trained neural nets). The phenomenon is a genuine phase transition in the thermodynamic sense.
  5. Are real models in the mean field regime or the Gaussian process regime, or something else? This is an interesting question, whose answer is "this question doesn't make sense". The distinction between regimes applies to infinite-width nets, i.e. to a totally non-standard setting. One can prove rigorous results with the gist that if the width is sufficiently enormous (with some giant bound) compared to the training data, the model is guaranteed to learn in one of these two regimes. However, no real models are that enormous. Instead, some phenomena and some mechanisms can be seen (experimentally or theoretically) to extend from infinite nets to nets of finite width. Sometimes these look more like mean field phenomena, sometimes they look like Gaussian process phenomena. For example in some sense MNIST is "GP-like" (GP stands for Gaussian process). Circuits in modular addition are, as it turns out, entirely explained by the MFT limit as we've explained above.
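The "frozen features" claim in point 1 above can be seen in a toy experiment. The following numpy sketch is my own illustration (not code from the post): a two-layer net in NTK parameterization, f(x) = (1/√N) Σᵢ aᵢ tanh(wᵢ·x), is trained by full-batch gradient descent, and we measure how far the first-layer weights move from initialization. As the width N grows, the relative movement shrinks (roughly like 1/√N), i.e. the features freeze and only the effectively linear readout adapts.

```python
import numpy as np

def relative_first_layer_movement(N, steps=200, lr=0.5, seed=0):
    """Train f(x) = (1/sqrt(N)) * sum_i a_i tanh(w_i . x) on a toy
    regression task and return ||W - W0|| / ||W0|| for the first layer."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((32, 4))        # 32 samples, 4 input dims
    y = np.sin(X.sum(axis=1))               # arbitrary smooth target
    W = rng.standard_normal((N, 4))         # first layer (the "features")
    a = rng.standard_normal(N)              # readout layer
    W0 = W.copy()
    s = 1.0 / np.sqrt(N)                    # NTK scaling
    for _ in range(steps):
        h = np.tanh(X @ W.T)                # (32, N) hidden activations
        err = s * (h @ a) - y               # residual of the MSE loss
        grad_a = s * h.T @ err / len(y)
        grad_W = s * ((err[:, None] * (1 - h**2) * a[None, :]).T @ X) / len(y)
        a -= lr * grad_a
        W -= lr * grad_W
    return np.linalg.norm(W - W0) / np.linalg.norm(W0)

movement = {N: relative_first_layer_movement(N) for N in (100, 10000)}
print(movement)  # relative movement shrinks as width grows
```

Repeating this with a 1/N prefactor and appropriately rescaled learning rate is the mean-field regime, where first-layer movement stays order one even at large width.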
Introduction to the theory

The background (and the foreground)

In physics, one often looks at systems with a large, stable background. A planet vs. a sun, an electron vs. a proton, a weakly interacting observer vs. a large system being observed. In these settings the "background" is the large system and the "foreground" or "test system" is the small system being studied. In these cases the background system may be fixed, or it may be undergoing some motion (like the sun moving around the galaxy's center), but the important idealization is that it does not react to the observer/test system. In fact, the earth exerts a gravitational pull on the sun (and famously in quantum mechanics, observations always impact a system at a quantum level). But these "reverse" effects are small, so to a good approximation we can treat the sun as doing its own "stable" thing while earth is undergoing physics that depend strongly on the sun.

Self-consistency

While typically the large "background" is a cleanly separate system from the small test system of the observer, it is sometimes extremely useful to treat the test system as a tiny piece of the background. So: the large background system may be a cup of water and the small test system may be a tiny bit of water at some location. Here while technically the full cup includes the tiny "test" bit, the large-scale behaviors (waves etc.) in the water don't really care to relevant precision if the test bit is changed or removed (at least if it's tiny enough). But the tiny bit of water definitely cares about the large-scale behaviors (waves, vortices or flows, etc.), to the extent that bits of water care about things.

Similarly (and in a closely related way), "the economy" is a giant system that includes your neighborhood bakery. The bakery can be viewed as a small "test system": it is affected by the economy. If property prices go up or the economy tanks, it might close. But the economy is not (at least to leading order) affected by this bakery. It is perhaps affected by the union of all bakeries in the world, but if this particular bakery closes due to some random phenomenon (e.g. the lead baker retires), this won't massively impact the economy.

This point of view is remarkably useful, because it introduces a notion of "self-consistency".

Self-consistency when applied in this context comes from the following pair of intuitions:

  1. the behavior of each small component is (statistically) determined by the background
  2. the behavior of the background is the sum of its small components.

If both of these assumptions are true, then these two observations (when turned into equations) are usually enough to fully pin down the system. Indeed, you have two functional relationships[2]:

$$\mathrm{bkr}_i = f(\phi) \qquad \text{and} \qquad \phi = \sum_i \mathrm{bkr}_i,$$

where $\mathrm{bkr}_i$ is the behavior of the $i$-th small component (the bakery) and $\phi$ is the background (the economy): the first relation says that each component responds to the background, and the second that the background is assembled from its components.
padding: 0.32em 0 0.2em 0; } mjx-stretchy-h.mjx-c23DF mjx-end mjx-c::before { content: "\E153"; padding: 0.32em 0 0.2em 0; } mjx-stretchy-h.mjx-c23DF mjx-mid mjx-c::before { content: "\E151\E150"; padding: 0.32em 0 0.2em 0; } mjx-stretchy-h.mjx-c23DF > mjx-ext { width: 50%; } mjx-c.mjx-c28::before { padding: 0.75em 0.389em 0.25em 0; content: "("; } mjx-c.mjx-c1D44E.TEX-I::before { padding: 0.441em 0.529em 0.01em 0; content: "a"; } mjx-c.mjx-c2C::before { padding: 0.121em 0.278em 0.194em 0; content: ","; } mjx-c.mjx-c1D44F.TEX-I::before { padding: 0.694em 0.429em 0.011em 0; content: "b"; } mjx-c.mjx-c29::before { padding: 0.75em 0.389em 0.25em 0; content: ")"; } mjx-c.mjx-c210E.TEX-I::before { padding: 0.694em 0.576em 0.011em 0; content: "h"; } mjx-c.mjx-c3D::before { padding: 0.583em 0.778em 0.082em 0; content: "="; } mjx-c.mjx-c1D464.TEX-I::before { padding: 0.443em 0.716em 0.011em 0; content: "w"; } mjx-c.mjx-c1D70F.TEX-I::before { padding: 0.431em 0.517em 0.013em 0; content: "\3C4"; } mjx-c.mjx-c2295::before { padding: 0.583em 0.778em 0.083em 0; content: "\2295"; } mjx-c.mjx-c1D465.TEX-I::before { padding: 0.442em 0.572em 0.011em 0; content: "x"; } mjx-c.mjx-c1D467.TEX-I::before { padding: 0.442em 0.465em 0.011em 0; content: "z"; } mjx-c.mjx-c1D436.TEX-I::before { padding: 0.705em 0.76em 0.022em 0; content: "C"; } mjx-c.mjx-c3A::before { padding: 0.43em 0.278em 0 0; content: ":"; } mjx-c.mjx-c22A4::before { padding: 0.668em 0.778em 0 0; content: "\22A4"; } mjx-c.mjx-c2212::before { padding: 0.583em 0.778em 0.082em 0; content: "\2212"; } mjx-c.mjx-c2E::before { padding: 0.12em 0.278em 0 0; content: "."; } mjx-c.mjx-c3E::before { padding: 0.54em 0.778em 0.04em 0; content: ">"; } mjx-c.mjx-c30::before { padding: 0.666em 0.5em 0.022em 0; content: "0"; } mjx-c.mjx-c27FA::before { padding: 0.525em 1.858em 0.024em 0; content: "\27FA"; } mjx-c.mjx-c31::before { padding: 0.666em 0.5em 0 0; content: "1"; } mjx-c.mjx-c3C::before { padding: 0.54em 0.778em 0.04em 0; content: 
"<"; } mjx-c.mjx-c1D437.TEX-I::before { padding: 0.683em 0.828em 0 0; content: "D"; } mjx-c.mjx-c2B::before { padding: 0.583em 0.778em 0.082em 0; content: "+"; } mjx-c.mjx-c1D434.TEX-I::before { padding: 0.716em 0.75em 0 0; content: "A"; } mjx-c.mjx-c1D435.TEX-I::before { padding: 0.683em 0.759em 0 0; content: "B"; } mjx-c.mjx-c2208::before { padding: 0.54em 0.667em 0.04em 0; content: "\2208"; } mjx-c.mjx-c2124.TEX-A::before { padding: 0.683em 0.667em 0 0; content: "Z"; } mjx-c.mjx-c1D443.TEX-I::before { padding: 0.683em 0.751em 0 0; content: "P"; } mjx-c.mjx-cD7::before { padding: 0.491em 0.778em 0 0; content: "\D7"; } mjx-c.mjx-c7B::before { padding: 0.75em 0.5em 0.25em 0; content: "{"; } mjx-c.mjx-c7D::before { padding: 0.75em 0.5em 0.25em 0; content: "}"; } mjx-c.mjx-c32::before { padding: 0.666em 0.5em 0 0; content: "2"; } mjx-c.mjx-c1D461.TEX-I::before { padding: 0.626em 0.361em 0.011em 0; content: "t"; } mjx-c.mjx-c1D452.TEX-I::before { padding: 0.442em 0.466em 0.011em 0; content: "e"; } mjx-c.mjx-c211D.TEX-A::before { padding: 0.683em 0.722em 0 0; content: "R"; } mjx-c.mjx-c1D451.TEX-I::before { padding: 0.694em 0.52em 0.01em 0; content: "d"; } mjx-c.mjx-c1D457.TEX-I::before { padding: 0.661em 0.412em 0.204em 0; content: "j"; } mjx-c.mjx-c70::before { padding: 0.442em 0.556em 0.194em 0; content: "p"; } mjx-c.mjx-c6F::before { padding: 0.448em 0.5em 0.01em 0; content: "o"; } mjx-c.mjx-c73::before { padding: 0.448em 0.394em 0.011em 0; content: "s"; } mjx-c.mjx-c1D44A.TEX-I::before { padding: 0.683em 1.048em 0.022em 0; content: "W"; } mjx-c.mjx-c1D444.TEX-I::before { padding: 0.704em 0.791em 0.194em 0; content: "Q"; } mjx-c.mjx-c1D43E.TEX-I::before { padding: 0.683em 0.889em 0 0; content: "K"; } mjx-c.mjx-c1D449.TEX-I::before { padding: 0.683em 0.769em 0.022em 0; content: "V"; } mjx-c.mjx-c2211.TEX-S1::before { padding: 0.75em 1.056em 0.25em 0; content: "\2211"; } mjx-c.mjx-c33::before { padding: 0.665em 0.5em 0.022em 0; content: "3"; } 
mjx-c.mjx-c1D6FC.TEX-I::before { padding: 0.442em 0.64em 0.011em 0; content: "\3B1"; } mjx-c.mjx-c74::before { padding: 0.615em 0.389em 0.01em 0; content: "t"; } mjx-c.mjx-c68::before { padding: 0.694em 0.556em 0 0; content: "h"; } mjx-c.mjx-c65::before { padding: 0.448em 0.444em 0.011em 0; content: "e"; } mjx-c.mjx-c78::before { padding: 0.431em 0.528em 0 0; content: "x"; } mjx-c.mjx-c2061::before { padding: 0 0 0 0; content: ""; } mjx-c.mjx-c28.TEX-S1::before { padding: 0.85em 0.458em 0.349em 0; content: "("; } mjx-c.mjx-c29.TEX-S1::before { padding: 0.85em 0.458em 0.349em 0; content: ")"; } mjx-c.mjx-c1D458.TEX-I::before { padding: 0.694em 0.521em 0.011em 0; content: "k"; } mjx-c.mjx-c1D463.TEX-I::before { padding: 0.443em 0.485em 0.011em 0; content: "v"; } mjx-c.mjx-c1D45D.TEX-I::before { padding: 0.442em 0.503em 0.194em 0; content: "p"; } mjx-c.mjx-c1D70E.TEX-I::before { padding: 0.431em 0.571em 0.011em 0; content: "\3C3"; } mjx-c.mjx-c21A6::before { padding: 0.511em 1em 0.011em 0; content: "\21A6"; } mjx-c.mjx-c6C::before { padding: 0.694em 0.278em 0 0; content: "l"; } mjx-c.mjx-c77::before { padding: 0.431em 0.722em 0.011em 0; content: "w"; } mjx-c.mjx-c6D::before { padding: 0.442em 0.833em 0 0; content: "m"; } mjx-c.mjx-c64::before { padding: 0.694em 0.556em 0.011em 0; content: "d"; } mjx-c.mjx-c69::before { padding: 0.669em 0.278em 0 0; content: "i"; } mjx-c.mjx-c75::before { padding: 0.442em 0.556em 0.011em 0; content: "u"; } mjx-c.mjx-c67::before { padding: 0.453em 0.5em 0.206em 0; content: "g"; } mjx-c.mjx-c2265::before { padding: 0.636em 0.778em 0.138em 0; content: "\2265"; } mjx-c.mjx-c1D441.TEX-I::before { padding: 0.683em 0.888em 0 0; content: "N"; } mjx-c.mjx-c2D::before { padding: 0.252em 0.333em 0 0; content: "-"; } mjx-c.mjx-c6E::before { padding: 0.442em 0.556em 0 0; content: "n"; } mjx-c.mjx-c79::before { padding: 0.431em 0.528em 0.204em 0; content: "y"; } mjx-c.mjx-c63::before { padding: 0.448em 0.444em 0.011em 0; content: "c"; } 
mjx-c.mjx-c1D439.TEX-I::before { padding: 0.68em 0.749em 0 0; content: "F"; } mjx-c.mjx-c25FB.TEX-A::before { padding: 0.689em 0.778em 0 0; content: "\25A1"; } mjx-c.mjx-c1D466.TEX-I::before { padding: 0.442em 0.49em 0.205em 0; content: "y"; } mjx-c.mjx-c1D45F.TEX-I::before { padding: 0.442em 0.451em 0.011em 0; content: "r"; } mjx-c.mjx-c1D407.TEX-B::before { padding: 0.686em 0.9em 0 0; content: "H"; } mjx-c.mjx-c1D41E.TEX-B::before { padding: 0.452em 0.527em 0.006em 0; content: "e"; } mjx-c.mjx-c1D41A.TEX-B::before { padding: 0.453em 0.559em 0.006em 0; content: "a"; } mjx-c.mjx-c1D41D.TEX-B::before { padding: 0.694em 0.639em 0.006em 0; content: "d"; } mjx-c.mjx-c20::before { padding: 0 0.25em 0 0; content: " "; } mjx-c.mjx-c1D7CE.TEX-B::before { padding: 0.654em 0.575em 0.01em 0; content: "0"; } mjx-c.mjx-c1D7CF.TEX-B::before { padding: 0.655em 0.575em 0 0; content: "1"; } mjx-c.mjx-c1D43C.TEX-I::before { padding: 0.683em 0.504em 0 0; content: "I"; } mjx-c.mjx-c1D442.TEX-I::before { padding: 0.704em 0.763em 0.022em 0; content: "O"; } mjx-c.mjx-c1D408.TEX-B::before { padding: 0.686em 0.436em 0 0; content: "I"; } mjx-c.mjx-c1D427.TEX-B::before { padding: 0.45em 0.639em 0 0; content: "n"; } mjx-c.mjx-c1D429.TEX-B::before { padding: 0.45em 0.639em 0.194em 0; content: "p"; } mjx-c.mjx-c1D42E.TEX-B::before { padding: 0.45em 0.639em 0.006em 0; content: "u"; } mjx-c.mjx-c1D42D.TEX-B::before { padding: 0.635em 0.447em 0.005em 0; content: "t"; } mjx-c.mjx-cA0::before { padding: 0 0.25em 0 0; content: "\A0"; } mjx-c.mjx-c2B.TEX-B::before { padding: 0.633em 0.894em 0.131em 0; content: "+"; } mjx-c.mjx-c2248::before { padding: 0.483em 0.778em 0 0; content: "\2248"; } mjx-c.mjx-c38::before { padding: 0.666em 0.5em 0.022em 0; content: "8"; } mjx-c.mjx-c34::before { padding: 0.677em 0.5em 0 0; content: "4"; } mjx-c.mjx-c35::before { padding: 0.666em 0.5em 0.022em 0; content: "5"; } mjx-c.mjx-c1D453.TEX-I::before { padding: 0.705em 0.55em 0.205em 0; content: "f"; } 
mjx-c.mjx-c1D45B.TEX-I::before { padding: 0.442em 0.6em 0.011em 0; content: "n"; } mjx-c.mjx-c4F::before { padding: 0.705em 0.778em 0.022em 0; content: "O"; } mjx-c.mjx-c52::before { padding: 0.683em 0.736em 0.022em 0; content: "R"; } mjx-c.mjx-c61::before { padding: 0.448em 0.5em 0.011em 0; content: "a"; } mjx-c.mjx-c41::before { padding: 0.716em 0.75em 0 0; content: "A"; } mjx-c.mjx-c4E::before { padding: 0.683em 0.75em 0 0; content: "N"; } mjx-c.mjx-c44::before { padding: 0.683em 0.764em 0 0; content: "D"; } mjx-c.mjx-c58::before { padding: 0.683em 0.75em 0 0; content: "X"; } mjx-container[jax="CHTML"] { line-height: 0; } mjx-container [space="1"] { margin-left: .111em; } mjx-container [space="2"] { margin-left: .167em; } mjx-container [space="3"] { margin-left: .222em; } mjx-container [space="4"] { margin-left: .278em; } mjx-container [space="5"] { margin-left: .333em; } mjx-container [rspace="1"] { margin-right: .111em; } mjx-container [rspace="2"] { margin-right: .167em; } mjx-container [rspace="3"] { margin-right: .222em; } mjx-container [rspace="4"] { margin-right: .278em; } mjx-container [rspace="5"] { margin-right: .333em; } mjx-container [size="s"] { font-size: 70.7%; } mjx-container [size="ss"] { font-size: 50%; } mjx-container [size="Tn"] { font-size: 60%; } mjx-container [size="sm"] { font-size: 85%; } mjx-container [size="lg"] { font-size: 120%; } mjx-container [size="Lg"] { font-size: 144%; } mjx-container [size="LG"] { font-size: 173%; } mjx-container [size="hg"] { font-size: 207%; } mjx-container [size="HG"] { font-size: 249%; } mjx-container [width="full"] { width: 100%; } mjx-box { display: inline-block; } mjx-block { display: block; } mjx-itable { display: inline-table; } mjx-row { display: table-row; } mjx-row > * { display: table-cell; } mjx-mtext { display: inline-block; } mjx-mstyle { display: inline-block; } mjx-merror { display: inline-block; color: red; background-color: yellow; } mjx-mphantom { visibility: hidden; } 
_::-webkit-full-page-media, _:future, :root mjx-container { will-change: opacity; } mjx-c::before { display: block; width: 0; } .MJX-TEX { font-family: MJXZERO, MJXTEX; } .TEX-B { font-family: MJXZERO, MJXTEX-B; } .TEX-I { font-family: MJXZERO, MJXTEX-I; } .TEX-MI { font-family: MJXZERO, MJXTEX-MI; } .TEX-BI { font-family: MJXZERO, MJXTEX-BI; } .TEX-S1 { font-family: MJXZERO, MJXTEX-S1; } .TEX-S2 { font-family: MJXZERO, MJXTEX-S2; } .TEX-S3 { font-family: MJXZERO, MJXTEX-S3; } .TEX-S4 { font-family: MJXZERO, MJXTEX-S4; } .TEX-A { font-family: MJXZERO, MJXTEX-A; } .TEX-C { font-family: MJXZERO, MJXTEX-C; } .TEX-CB { font-family: MJXZERO, MJXTEX-CB; } .TEX-FR { font-family: MJXZERO, MJXTEX-FR; } .TEX-FRB { font-family: MJXZERO, MJXTEX-FRB; } .TEX-SS { font-family: MJXZERO, MJXTEX-SS; } .TEX-SSB { font-family: MJXZERO, MJXTEX-SSB; } .TEX-SSI { font-family: MJXZERO, MJXTEX-SSI; } .TEX-SC { font-family: MJXZERO, MJXTEX-SC; } .TEX-T { font-family: MJXZERO, MJXTEX-T; } .TEX-V { font-family: MJXZERO, MJXTEX-V; } .TEX-VB { font-family: MJXZERO, MJXTEX-VB; } mjx-stretchy-v mjx-c, mjx-stretchy-h mjx-c { font-family: MJXZERO, MJXTEX-S1, MJXTEX-S4, MJXTEX, MJXTEX-A ! 
important; } @font-face /* 0 */ { font-family: MJXZERO; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Zero.woff") format("woff"); } @font-face /* 1 */ { font-family: MJXTEX; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Main-Regular.woff") format("woff"); } @font-face /* 2 */ { font-family: MJXTEX-B; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Main-Bold.woff") format("woff"); } @font-face /* 3 */ { font-family: MJXTEX-I; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Math-Italic.woff") format("woff"); } @font-face /* 4 */ { font-family: MJXTEX-MI; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Main-Italic.woff") format("woff"); } @font-face /* 5 */ { font-family: MJXTEX-BI; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Math-BoldItalic.woff") format("woff"); } @font-face /* 6 */ { font-family: MJXTEX-S1; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Size1-Regular.woff") format("woff"); } @font-face /* 7 */ { font-family: MJXTEX-S2; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Size2-Regular.woff") format("woff"); } @font-face /* 8 */ { font-family: MJXTEX-S3; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Size3-Regular.woff") format("woff"); } @font-face /* 9 */ { font-family: MJXTEX-S4; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Size4-Regular.woff") format("woff"); } @font-face /* 10 */ { font-family: MJXTEX-A; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_AMS-Regular.woff") format("woff"); } @font-face /* 11 */ { font-family: MJXTEX-C; src: 
url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Calligraphic-Regular.woff") format("woff"); } @font-face /* 12 */ { font-family: MJXTEX-CB; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Calligraphic-Bold.woff") format("woff"); } @font-face /* 13 */ { font-family: MJXTEX-FR; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Fraktur-Regular.woff") format("woff"); } @font-face /* 14 */ { font-family: MJXTEX-FRB; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Fraktur-Bold.woff") format("woff"); } @font-face /* 15 */ { font-family: MJXTEX-SS; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_SansSerif-Regular.woff") format("woff"); } @font-face /* 16 */ { font-family: MJXTEX-SSB; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_SansSerif-Bold.woff") format("woff"); } @font-face /* 17 */ { font-family: MJXTEX-SSI; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_SansSerif-Italic.woff") format("woff"); } @font-face /* 18 */ { font-family: MJXTEX-SC; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Script-Regular.woff") format("woff"); } @font-face /* 19 */ { font-family: MJXTEX-T; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Typewriter-Regular.woff") format("woff"); } @font-face /* 20 */ { font-family: MJXTEX-V; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Vector-Regular.woff") format("woff"); } @font-face /* 21 */ { font-family: MJXTEX-VB; src: url("https://cdn.jsdelivr.net/npm/mathjax@3/es5/output/chtml/fonts/woff-v2/MathJax_Vector-Bold.woff") format("woff"); } Putting these together, we have the combined "self-consistency" equation:

$$h = b(f(h)),$$

which means that the background field $h$ satisfies a fixed point equation for the composed function $b \circ f$ (here $f$ maps a background to the resulting foreground behavior, and $b$ maps foreground behavior back to a background). It so happens that in many cases of interest this equation has a unique solution. A classic example of a self-consistency equation is the supply-demand curve equilibrium. Here the background is a single number (the price of a good) and the test system is the willingness of a single consumer to buy, or of a single producer to sell, as a function of price (the actual "tiny components" consisting of individual consumers/producers are abstracted out, and the curve represents the average incentive).
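As a toy illustration (all numbers hypothetical), the supply-demand equilibrium can be computed as the fixed point of exactly this kind of loop: nudge the price in the direction of excess demand until the average consumer and producer responses agree.

```python
# Hypothetical supply/demand curves; the "background" is a single
# number (the price), and the self-consistent price is the fixed
# point where aggregate demand equals aggregate supply.

def demand(p):
    # stylized average consumer response: demand falls with price
    return max(0.0, 10.0 - 2.0 * p)

def supply(p):
    # stylized average producer response: supply rises with price
    return 3.0 * p

def equilibrium_price(p0=1.0, step=0.05, iters=1000):
    """Damped fixed-point iteration: move price toward excess demand."""
    p = p0
    for _ in range(iters):
        p = p + step * (demand(p) - supply(p))
    return p

p_star = equilibrium_price()
# analytically: 10 - 2p = 3p, so the fixed point is p = 2
```

The damping step is what makes the iteration a contraction here; without it, best-response dynamics can oscillate instead of converging.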

Of the above, assumption 1 is the most problematic. Thinking of each component as determined by some "large-scale" stable system needs to be interpreted appropriately (in particular, the relationship is often statistical: the number of bakeries in a given neighborhood fluctuates as people retire, move, etc., even if "the economy" is held constant; similarly, every bit of the sun reacts to magnetic/ gravitational fields from other bits, but in a statistical or thermodynamic sense). Sometimes local or so-called "emergent" effects break this directional relationship (and many interesting thermodynamic systems, such as the 2-dimensional Ising model, are interesting precisely in such contexts). But surprisingly often (at least with an appropriate formalism) the approximation of the foreground as fully determined by the background (in a statistical sense) is robust.

For example, if we are modeling the sun, viewing the "background system" too coarsely (as just the mass + electromagnetic field + temperature, say, of the entire sun) is insufficient. But we can instead view the "background system" as a giant union of many local systems, each comprising a chunk a few meters across. These are still "large" in the sense of being much larger than an atom (or a microscopic chunk), but studying their behavior (in an appropriate abstraction) offers sufficient resolution to model the sun extremely well. Similarly, we can't apply a single supply-demand curve to the entire economy (bread costs different amounts in different places). But in appropriate contexts (for fungible products like oil, and on a "local economy" level where the economy is roughly uniform but not dominated by a single station, for example) self-consistency is a pretty good model.

In many settings, the question of how well "assumption 1" above holds is related to a notion of connectedness. In the sun's magnetic plasma, the magnetic field experienced by a particle is accumulated over billions and billions of nearby particles, so the graph of interactions is extremely connected. In an oil economy, each consumer can typically choose between dozens of nearby stations reachable by car. But other settings (like the Ising model, or markets for rare and hard-to-transport goods) are not as well modeled by self-consistency alone.

In physics, systems that are well-modeled by a self-consistency equation (coupled background and foreground systems) are generally called mean-field settings. A big triumph of statistical physics is to make situations with local/ emergent phenomena "behave as well as" mean field theories – renormalization is a fundamental tool here, and most textbooks on renormalization from a statistical-physics view start with a discussion of mean-field methods. But settings that are directly mean-field (for example because they are highly connected or high-dimensional) are particularly nice and easy to study.

Neural nets and mean field

Neural nets are physical systems. This is a vacuous statement – anything that has statistics can be studied using a physics toolkit (and in many ways statistical physics is just statistics with different terms). Indeed, real neural nets are immensely complex, and if there is some sense in which they can be locally decomposed into background-foreground consistencies, these decompositions must themselves be immensely complex and likely require sophisticated tooling to identify (this is one of the reasons why we are running an agenda on renormalization).

But it turns out that in some settings and architectures neural nets are extremely well-modeled by systems with high connectivity – and the reason is, naively enough, precisely the fact that they are highly connected (often fully-connected) on a neuron level (note that architectures that aren't "fully-connected" – e.g. CNNs – sometimes still have properties that make them "highly connected" from a physical point of view).

The mean-field background and foreground for a neural net

In neural net MFT the foreground (or "system"/ "observer") abstraction is a neuron. This is typically a coordinate index of some layer.

The important "background" thing that each neuron "carries" is what is called an activation function, often denoted by the letter . This is a function on data: given any input x, partially running the model on x returns a vector of activations. is its i'th component. This function is now the thing that a neuron contributes to the "background field" of the neural net.[3]

Now if there are lots of neurons, each neuron's activation function reacts to a background generated by the other neurons: removing the neuron in this limit doesn't change the loss by much, so the background determines each neuron's behavior as a statistical distribution. Conversely, the background itself is composed of individual "foreground" neurons. The loop:

background ↦ neuron distribution ↦ background

must close, i.e. be self-consistent. Making sense of this loop is the key content of mean field theory of neural nets.
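A toy version of closing this loop (a hypothetical linear-coefficient setting, not a full neural net) has each "neuron" carry a fixed feature and repeatedly best-respond to the background generated by the others; iterating drives the system to a self-consistent background:

```python
# Each "neuron" i is a fixed feature phi_i with a coefficient c_i;
# the background is the summed prediction. In each sweep, each neuron
# best-responds (least squares) to the background of the others.
import numpy as np

rng = np.random.default_rng(2)
n_pts, n_neur = 100, 5
X = rng.normal(size=(n_pts,))
features = np.stack(
    [np.tanh(rng.normal() * X + rng.normal()) for _ in range(n_neur)]
)                                          # shape (n_neur, n_pts)
y = np.sin(X)                              # made-up target
c = np.zeros(n_neur)

for _ in range(200):                       # background -> responses -> background
    for i in range(n_neur):
        others = features.T @ c - c[i] * features[i]   # background minus neuron i
        resid = y - others
        c[i] = (features[i] @ resid) / (features[i] @ features[i])

pred = features.T @ c
loss = np.mean((pred - y) ** 2)
```

In this linear case the loop is just Gauss-Seidel on a least-squares problem, which converges; the interesting content of neural-net mean field theory is that an analogous loop closes for distributions of nonlinear neurons.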

In later installments we'll explain a bit more about the loop and show some examples of it working (or not). You can also see the original linked paper about the Curse of Detail for a more physics-forward view of this.

Experimental setting and pretty picture

We'll close with a toy example of "self-consistency", which is visually satisfying.

In this setting we look at a 2-layer model that takes a two-dimensional input variable and is trained on a fixed target function at large width and on infinite data. The activation function is a bounded sigmoid-like function (the relu version of tanh). Each neuron at layer 1 is a function that depends only on a 2-dimensional row of the weight matrix, so the associated "test" field or particle can be plotted on a 2-dimensional graph. When we plot all of these together we get a good picture of the distribution of single-neuron functions that combine to form the background system:
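For concreteness, the "relu version of tanh" can be read (my interpretation, not necessarily the authors' exact definition) as hard-tanh: the identity clipped to the interval [-1, 1].

```python
# Hard-tanh: linear near zero like relu/identity, but saturating at +-1
# like tanh, so it is bounded and sigmoid-shaped.
import numpy as np

def hard_tanh(z):
    """Bounded sigmoid-like activation: clip(z, -1, 1)."""
    return np.clip(z, -1.0, 1.0)
```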

The neurons above were trained jointly in a way that would allow them to interact.

It has a nice clover-leaf like structure (it will reappear later when we look at continuous xors - a multi-layer setting where mean field performs compositional computation; already in this simple setting, the fact that the cloud of neurons is a "shaped" distribution rather than a flat Gaussian puts us solidly outside the Gaussian process regime). Now we can empirically measure how a single randomly initialized "foreground" neuron would react to the background generated by this model. To do this, we train 2048 iid single-neuron models on the resulting background from the fully trained model.[4] When we do this and combine the resulting 2048 neurons into a new model, we see that indeed it looks exactly the same as the background. When we compute its associated function, we get very similar loss.

Each neuron in this picture was trained in a fully iid way, without interacting with any neuron, simply by "reacting to the background", i.e. learning the task in combination with the "blue" background above.

Note that this isn't a property that comes "for free". If we were to use the wrong background (for example the more Gaussian-process-like model here) then samples of the foreground would fail to align to the background.

Blue is background, orange is foreground (each orange neuron trained independently in reaction to background).
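The "train fresh neurons against a frozen background" step can be sketched as follows (an assumed minimal setup with made-up target and background functions, not the authors' code): a single fresh neuron is fit, in isolation, to the residual between the target and the frozen background.

```python
# Hypothetical sketch: one "foreground" neuron c * tanh(w . x) is
# trained by gradient descent on the residual (target - background),
# i.e. it "reacts to the background" without seeing other neurons.
import numpy as np

rng = np.random.default_rng(1)

def target(X):
    return np.sin(X[:, 0]) * np.cos(X[:, 1])      # made-up 2-d target

def background(X):
    return 0.5 * np.sin(X[:, 0]) * np.cos(X[:, 1])  # stand-in for a trained model

def train_single_neuron(X, resid, lr=0.1, epochs=500):
    """Fit c * tanh(X @ w) to the residual by plain gradient descent."""
    w = rng.normal(size=2) * 0.1
    c = 0.0
    for _ in range(epochs):
        act = np.tanh(X @ w)
        err = c * act - resid                      # prediction minus residual target
        # gradients of the mean squared error
        gc = 2 * np.mean(err * act)
        gw = 2 * (X.T @ (err * c * (1 - act ** 2))) / len(X)
        c -= lr * gc
        w -= lr * gw
    return w, c

X = rng.uniform(-2, 2, size=(512, 2))
resid = target(X) - background(X)
w, c = train_single_neuron(X, resid)
loss = np.mean((c * np.tanh(X @ w) - resid) ** 2)
```

In the post's experiment this step is repeated for 2048 iid neurons, and the recombined population reproduces the background; the sketch above shows only the single-neuron "reaction" step.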

The case of 2-layer networks is special: neuron functions are particularly simple to characterize, and the mean field has better properties (it's not "coupled"). But we'll see that deeper nets can still be analyzed using this language, and even using empirical methods we can get cleaner pictures of how they learn and process representations.

In the next post, we will explain the physics behind these experiments and the experimental details of the models (github repo coming soon).


  1. ^

    Technically they differ on whether they use the "pizza" vs. "clock" mechanisms, but the two idealizations are related, and both the mean field and the realistic setting can be modified to make use of either.

  2. ^

    Below, f and b should generally be understood as "statistical" functions: job choice is, perhaps, a probabilistic function depending on the economy, which includes both demand/ markets but also supply/ people's interests; conversely "the economy" is the average of production over the distribution of jobs.

  3. ^

Technicalities. Depending on the situation, the activation function can either be viewed as a function on a finite training set or on an infinite "set of all possible inputs", usually a large Euclidean space (example: an MNIST input is a vector of pixel values). Unless we're working with finite training data, this is a priori an infinite-dimensional gadget; and worse, the thing that is actually summed over neurons – the analog of the "market" or "background field" – is nonlinear in this object.[5] There is also a subtlety here about SGD vs. Bayesian learning which I won't get into. But in mean-field settings that admit generalization (or for a finite number of inputs), this background is effectively dominated by a small set of "relevant" directions.

  4. ^

Technical note: each single-neuron model is trained on the difference between the target and the output of the trained model.

  5. ^

    In fact it is quadratic: the thing that sums over neurons is the "external square" of the neuron function, which is a function of a pair of inputs: knowing this sum fully determines the dynamics up to rotational symmetry, even for a finite-width model (it's often called the "data kernel" but is used very differently from the Gaussian process kernels, which do depend on an infinite-width assumption and lose a lot of information in finite-width and mean-field contexts).




Democracy Dies With The Rifleman

April 4, 2026 - 09:39

Political power grows out of the barrel of a gun -- Mao Zedong

Halfway through recorded history, Athens became the first state we're sure was a democracy, and an inspiration to many later ones. Probably some existed earlier, and certainly some entities smaller than states were democratic, likely long before recorded history began.

The next tenth of history saw the rise of the Roman Republic, which mixed democracy and aristocracy together to form a functional hybrid, and then it transitioned to the Roman Empire, which shifted the mix substantially towards aristocracy. For the next three tenths of recorded history, democracies were at best local governments, minor regional powers, or components of larger, autocratic states.

Few, if any, of these societies would count as "democracies" according to modern watchdog organizations. Only about 10-20% of the residents of Athens were citizens; the rest possessed no real political power. In later eras, residents of towns or cities might vote for their urban officials, but the urbanization rates were also around 10-20%, so the vast majority lived in the non-democratic countryside.

Then for the last tenth of history, democracies rose again to dominate the world stage. One standard story for this has to do with military technology. The Roman Republic expanded because it had the dominant military technology of its time; this may have been in part because of its political system. But eventually the heavily trained armored horseman became the dominant military technology, and was more easily provided by autocracies than democracies. Then widespread use of gunpowder weapons swung the balance back towards mass manpower; the knight in his castle could no longer reliably put down a peasant revolt, or hold back Napoleon and his levée en masse.

Another standard story has to do with increased state resources. Democracies generally support higher tax rates than autocracies do; while this primarily funds social services, part of the reason is that people are willing to pay more for things they think "they" own (rather than things owned by their distant overlords).

A third standard story has to do with ease of turnover. Democracies generally don't have to fight civil wars or succession conflicts, because whenever such a movement would have a chance of a military victory, it also has a chance of a bloodless victory. This leads to peaceful turnovers, or to governments following voters' preferences closely enough that resentments don't build up to the point of boiling over.

The major wars of the last two hundred years have not been limited engagements driven by hereditary elites; they have primarily been total struggles between ideologies and peoples, which only managed to become major because they could motivate significant efforts on both sides.

What does the next tenth of history[1] look like? One might think that the invention of larger and more sophisticated weapons means that we swing back towards the knight and aristocracy, but the evidence of the most recent wars suggests otherwise. Kipling's poem Arithmetic on the Frontier, on the costs of pitting Imperial troops against regional resistance, rings nearly as true about America's various wars and special operations in the Middle East. Between great powers, the weapons have become so expensive that only systems which are widely believed in by their inhabitants can afford to supply a competitive number of them.

It's not obvious to me that this continues to be true. I think next-generation military capabilities primarily have to do with 1) operational knowledge and 2) mass-produced smart weaponry. The war in Ukraine shows how conflicts between drones and riflemen go; the war in Iran shows how conflicts between the informed and the uninformed go. States may find their ability to produce weaponry becomes detached from their popularity; they may find robot soldiers are willing to follow orders that human soldiers would balk at; they may find that it is relatively cheap to identify and destroy dissenters.[2]

That is, we may be moving into an era where mass protests are relatively easy to dismiss or ignore, while individual hackers or saboteurs are still able to disrupt large systems. Taxes from a broad labor base may become less relevant than control over automated infrastructure. What political problems will the new sources of power have, and what systems will help them resolve those problems?

  1. ^

    I don't expect us to have 500 more sidereal years of history left, but I do think we might manage to cram roughly that much subjective history in before the Singularity / as it takes off.

  2. ^

    An old strategy is to recruit your police / imperial enforcers from a different ethnicity than the people that they need to defend against, so there's some baseline level of resentment that will allow brutality which will cow them into submission. Autonomous weapons allow this at scale, and for secret police that are difficult to bribe or corrupt, and widespread surveillance allows for secret police that are always watching and noticing subtle connections.



Discuss

Am I the baddie?

4 апреля, 2026 - 09:00

I am a software engineer. I work for a company that makes software for road construction. Monday last week we were under a bad crunch and we were told to start using agentic workflows. We had like 50 tickets to close by the following Tuesday. I've been experimenting with AI development for years now, but this was different. I had access to Opus/Sonnet 4.6, and GPT5.4—the latest models.

Suddenly, they understood. I could talk about abstract concepts and analogies, and it got them. On the first day I was soon working through tickets in hours that would have taken me days. But we still had a ton of work and not enough time. I was still bound to a single thread of work at a time. So like any problem, I hacked around it. I started with a worktree, which basically creates a whole other copy of the project I was working in, and that meant multiple threads.
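The worktree trick is git's built-in support for multiple working copies. A minimal sketch (the path and branch name here are illustrative examples, not from my actual setup):

```shell
# Sketch: give a second agent its own full working copy of the repo.
# ../myproject-agent2 and the branch name agent2 are made-up examples.
git worktree add ../myproject-agent2 -b agent2

# Each worktree is a complete checkout sharing one .git store,
# so two agents can edit and commit in parallel without clashing.
git worktree list
```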


Still I was limited to my single service, and the system that I work on has like 20 services. Wednesday comes, and I'm still cranking the tickets out, when I realized I could create a repo with submodules for every service. The agent works best when it can find the context it needs without being overloaded.

Thursday comes, and we're not going to make it. I've already put in about 40 hours. They said to lean in, so I did. After setting up MCP servers for our ticketing, documentation, communication, and calendar systems, I told the agent to pull ALL of the tickets for the big feature we are working on, then go through our documentation and communications to look for mentions of this feature, and to turn that into design requirements. Then, after a Q&A session, we made a plan to implement all open tickets. My idea was that with the full context, it would be better able to perform.

It worked, or at least it seemed to. I was almost embarrassed about it. I was talking to our systems architect about how everything is different, and I mentioned this branch of code. He said, "You know what? Let's try it." We brought it to the team, and they figured let's give it a shot. I hadn't actually run the code outside of tests. So our QA team dug into it live. The first one worked. The second. The third, and on and on. We went from not going to finish on time, to mostly done. We found a few small bugs, but such is the way of software, especially things as complex as this.

My side project expanded. I created a CLI, an extension for my IDE to manage local dev environments that could all run independently, and a dashboard that pulls all of my tickets and gives me a button that spins up an agent with special instructions. It pulls the details and writes the code, pushing it up for me to review. After that I added another button that fixes any issues that come up in review.

My workflow became:

  1. Push button
  2. Code review
  3. Maybe push another button

My boss said I had gone plaid. Hahahaha. My dashboard became sophisticated, and my process lean. Now I had a way to interact with the whole system. I had it solve big problems. Ones that would take months, solved in a day, two with QA.

I had a system to unify our teams, and to allow business analysts to contribute code.

Today, a week after I started the project, I talked to two directors and blew their socks off. We're talking about doing something like this for the entire company, and I talked about automating the two buttons. It was a big win. I know I have a big raise coming. It's likely not enough considering my impact.

I went out with friends, and AI came up. They're pretty sure it's going to lead to disaster. My general P(Doom) is about 60%. As I was leaving, I had the thought: am I profiting off of human suffering? I'm proliferating these systems in more places, and my project will mean we are overstaffed at work. It kind of overwhelmed me.

Am I the baddie?



Discuss

Common advice #3: Asking why one more time

4 апреля, 2026 - 08:25

Written quickly as part of the Inkhaven Residency.

At a high level, research feedback I give to more junior research collaborators tends to fall into one of three categories:

  • Doing quick sanity checks
  • Saying precisely what you want to say
  • Asking why one more time

In each case, I think the advice can be taken to an extreme I no longer endorse. Accordingly, I’ve tried to spell out the degree to which you should implement the advice, as well as what “taking it too far” might look like. 

Previously, I covered doing quick sanity checks and saying what you want to say precisely. I’ll conclude these posts by talking about probably the hardest to communicate category of common advice: asking why one more time.

Asking why one more time 

In my opinion, the most important skill in empirical research is figuring out how to make your beliefs pay rent: you have many possible hypotheses about a phenomenon; to test them, you need to connect these hypotheses with empirical observations. While it’s all well and good to perform all the basic correlations and sanity checks that you want, it’s rarely the case that the problem at hand can be straightforwardly solved by looking at a few scatter plots. 

The second important skill in empirical research is close to the converse of the above: instead of looking at your hypotheses and trying to fit them to the data, you look at places where the data seems inconsistent with any of your hypotheses (i.e. surprising or interesting) and generate new hypotheses to explain the data. 

I think these two skills tend to form a research loop: while you’re confused, first generate more hypotheses about the data, and test the hypotheses against either current or future data (or vice versa). That is, testing hypotheses against old or new data will surface anomalies, which prompt new hypotheses, which in turn need testing, which prompt new hypotheses, and so forth.

What counts as sufficient understanding for this loop? In my experience, you can often quantify the number of iterations of this loop you've completed by the depth of the natural why questions from a possible interlocutor that you can answer.[1] At the first level, we might ask questions such as, why does your hypothesis imply this empirical result? Why does the surprising result you’re trying to explain occur? At the next level, we might ask about the parts your hypotheses are made of: if your hypothesis is that the length of chains of thought predicts monitorability, why would this happen? Or, we might ask about why the surprising result didn’t generalize to other domains: if GPT-4o’s sycophancy explains many people’s attachment to it, why don’t other seemingly sycophantic models lead to the same level of attachment? 

Almost all of the researchers I've worked with have been incredibly bright (and from great research backgrounds) and have consistently thought about, and can cogently answer, the first level of whys. So I basically never need to give this advice (though, if you're not asking why your key result is what it is, maybe you should start!). However, a lot of the second-level whys that I ask (or that I ask them to generate) tend to highlight gaps in understanding and lead to fruitful discussion. 

For the sort of researcher I interact with, I think it’s good advice to take whatever answers to natural why questions you generate by default and then repeat the process of generating why questions exactly one more time for each of the explanations. 

Taking this too far. There’s a reason I say “ask why one more time” and not “continue asking why”. In general, as with many similar conversation trees, the space of natural why questions expands exponentially. At some point, you need to decide that you’ve done enough investigation, and research that never gets consumed by other people likely has minimal impact on the world.

There are a few specific failure modes I’ve seen:

  • First, and most obviously: never producing output. If you keep asking why without stopping, you will never finish anything. (This is a famously common problem around these parts.) Every explanation has sub-explanations, and at some depth you’re doing philosophy of science or metamathematics rather than object-level research. Again, there's a reason the heuristic is “one more than your default".
  • Second, there’s a social cost. In collaborative settings, asking too many whys about someone’s work can feel quite adversarial, especially if it's a new collaborator. If a collaborator has a plausible answer to the first-level why and a reasonable sketch for the second, pushing hard on the third can start to feel like you don’t trust their judgment rather than that you’re trying to improve the work. Being explicit about your intent (“I think this is strong, I’m pressure-testing it because I want us to be confident” or "I think you're correct, but I want to check that I understand it myself") can help, but it's still a real dynamic that needs to be managed.
  • Third, investigating the wrong whys. Not all branches of the why-tree are equally valuable. When you generate second-level why questions, some of them will point at load-bearing assumptions; others will point at irrelevant details. Some will be fruitful and easy to investigate, and others will be too hard or too costly to answer. Developing taste for which branches matter is a much harder skill, and one I don’t have great advice for (at least not one I can write up in a short post like this one) but as with all prioritization questions, one heuristic is to focus on the whys whose answers, if different from what you expect, would change your main conclusion.

The optimal depth of whys you try to answer depends on how seriously you care about a result, but for research (in my experience) it tends to range from two (for blog posts or ideas that you don't intend to seriously build on in the future) to three (for the core ideas of research papers that you do hope to build on in the future). 

  1. ^

    I used to refer to this concept as simply “being skeptical”, but that fails to communicate the actual skill being executed here. I got this new framing from Thomas Kwa at METR (though any confusing parts are no doubt my own).



Discuss

Latent Reasoning Sprint #3: Activation Difference Steering and Logit Lens

4 апреля, 2026 - 06:56

In my previous post I found evidence consistent with the scratchpad paper's compute/store alternation hypothesis: even steps showed higher intermediate-answer detection and odd steps showed higher entropy, matching the results from "Can we interpret latent reasoning using current mechanistic interpretability tools?".

This post investigates activation steering applied to latent reasoning and examines the resulting performance changes.

Quick Summary:
  • Tuned logit lens sometimes does not find the final answer to a prompt, and instead finds a close approximation.
  • Tuned logit lens does not seem to have a consistent layer or latent where the final answer is located.
  • Tuned logit lens variants, like ones trained only on latent 3, still produce "therefore" only on odd latents.
  • Activation steering with the average difference between latent vectors did not increase accuracy for specific latent pair combinations, and instead closely matched the random-vector results from "Can we interpret latent reasoning using current mechanistic interpretability tools?".
  • Steering the KV cache to steer CODI outputs can increase accuracy, while steering the hidden states does not seem to have a significant effect on CODI.
Experimental setup

CODI model

I use the publicly available CODI Llama 3.2 1B checkpoint from "Can we interpret latent reasoning using current mechanistic interpretability tools?". 

Tuned Logit Lens

To create my tuned logit lens implementation, I used the training code from "Eliciting Latent Predictions from Transformers with the Tuned Lens".
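Conceptually, the tuned lens differs from the standard logit lens by inserting a learned per-layer affine translator before the unembedding. A minimal numpy sketch of that difference (all shapes and weights here are toy assumptions, not the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 8, 16

# Frozen unembedding matrix of the model (toy stand-in).
W_U = rng.normal(size=(d_model, vocab))

def logit_lens(h):
    """Standard logit lens: decode a hidden state directly."""
    return h @ W_U

def tuned_lens(h, W_l, b_l):
    """Tuned lens: map the layer-l hidden state through a learned
    affine translator (W_l, b_l), then decode with the unembedding."""
    return (h @ W_l + b_l) @ W_U

h = rng.normal(size=d_model)
# With an identity translator the tuned lens reduces to the logit lens;
# training fits (W_l, b_l) per layer to match the final-layer prediction.
assert np.allclose(tuned_lens(h, np.eye(d_model), np.zeros(d_model)),
                   logit_lens(h))
```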

Activation Steering
  1. Embedding steering

I take the average hidden state at each latent position and use the difference between the averages of latent vectors A and B to steer the hidden states.

 Since CODI uses the KV values at the EoT token, to get new KV values containing the information from the steered vector, I needed to steer latent 1, run CODI for one additional latent, then take the KV values at latent 2 and inspect the output.
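The hidden-state steering described above reduces to adding a scaled mean-difference vector. A minimal numpy sketch (shapes and data are toy assumptions, not CODI's actual activations):

```python
import numpy as np

rng = np.random.default_rng(0)
n_prompts, d_model = 32, 8

# Hidden states at two latent positions, collected over prompts (toy data).
h_latent_a = rng.normal(size=(n_prompts, d_model))
h_latent_b = rng.normal(size=(n_prompts, d_model))

# Steering vector: difference of the mean activations of latents A and B.
steer = h_latent_a.mean(axis=0) - h_latent_b.mean(axis=0)

def apply_steering(h, vec, coeff):
    """Add the scaled difference vector to a hidden state."""
    return h + coeff * vec

h = rng.normal(size=d_model)
h_steered = apply_steering(h, steer, coeff=0.5)
assert h_steered.shape == h.shape
```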

  2. KV cache steering

Steering the KV cache and adding it directly onto the CODI model: I directly add the average difference in KV values to past_key_values.
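The KV-cache variant likewise adds a scaled mean-difference, but directly to the cached keys/values. A toy numpy sketch (array shapes are my assumptions, not CODI's actual cache layout):

```python
import numpy as np

rng = np.random.default_rng(0)
n_prompts, n_heads, seq, d_head = 16, 4, 6, 8

# Cached keys (or values) collected at two latent positions (toy data).
kv_a = rng.normal(size=(n_prompts, n_heads, seq, d_head))
kv_b = rng.normal(size=(n_prompts, n_heads, seq, d_head))

# Steering direction: difference of the mean KV entries for latents A and B.
kv_diff = kv_a.mean(axis=0) - kv_b.mean(axis=0)

def steer_kv(past_kv, diff, coeff):
    """Add the scaled difference directly onto the cached keys/values,
    analogous to patching past_key_values before the answer is generated."""
    return past_kv + coeff * diff

past_kv = rng.normal(size=(n_heads, seq, d_head))
steered = steer_kv(past_kv, kv_diff, coeff=0.5)
assert steered.shape == past_kv.shape
```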

Experiments

Confirming Previous Assumptions

PROMPT = "Out of 600 employees in a company, 30% got promoted while 10% received bonus. How many employees did not get either a promotion or a bonus?"

Answer = 360
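For reference, the expected answer follows from assuming the promoted and bonus groups don't overlap:

```python
# 30% promoted and 10% got a bonus; assume the groups are disjoint.
employees = 600
promoted = employees * 30 // 100   # 180
bonus = employees * 10 // 100      # 60
neither = employees - promoted - bonus
print(neither)  # 360
```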

Tuned logit lens properties:

  • The tuned lens approximates but doesn't find the answer in some cases, e.g. 720 (360 × 2) and 350 (360 − 10) at latents 0 and 1.
  • These approximate answers are not GSM8K artifacts, as neither number is among the most common answers for the dataset.
  • The answers being found in latents 3 and 5 in my previous post with the tuned lens might be prompt-specific. This suggests the tuned lens might only be usable as a way to see potential outputs.

(Figures: "Default Tuned" and "Default" logit lens outputs.)

The following is the answer frequency for the GSM8K data used to train the tuned logit lens

This prompted me to revisit my previous results using a tuned logit lens trained only on latent 3. Notably, 'therefore' still appears only on odd latents, even with this different prompt.


Activation Difference (Steering Embeddings)

Across all coefficient values tested, steering was applied to latents 1–4, with one additional latent step run afterward to obtain updated KV values. The steered models consistently underperform the no-steering baseline until the later latents, where they match the performance of random-vector patching from "Can we interpret latent reasoning using current mechanistic interpretability tools?". This might be because the steering acts the same as random-vector patching: the average difference vector may be too noisy to encode meaningful directional information.

Activation Difference (Steer KV cache)
  • Unlike the other steering method, which required another CODI pass to get new KV values, this method steers the KV values as they are used at the EoT token to generate the answer.
  • The setup: take the mean activations of latents A and B, subtract them, and steer with the difference scaled by a coefficient. Latent A is the earlier latent vector, from which a later latent vector B is subtracted.
  • Steering the KV values, unlike steering the hidden states, did seem to change the accuracy at latent step 5.
  • Most vectors used for steering performed worse than random latent-vector activation patching; some performed significantly better than the baseline.
  • Coefficient (0.5):
    • The steered vectors that improved performance were A1-B2, A1-B5, A2-B3, A2-B4, A3-B5, and A4-B5 (coefficient 1). When steering with the difference of an earlier and a later latent vector, it is interesting that combinations with latent 2 as latent A performed best.
  • Coefficient (-1):
    • The negative coefficient flips A-B to B-A.
    • Since the coefficient is -1, A1-B4, A1-B6, A4-B6, and A5-B6 can be interpreted as B4-A1, B6-A1, B6-A4, and B6-A5. Latents steered with latent 6 minus an earlier latent (1, 4, or 5) seem to show a significant increase in accuracy, and the differences between latents 1 and 6 and between latents 5 and 6 seemed to show the highest increases.
  • Accuracy for all steering decreases as the coefficient increases.
  • No activation difference improves accuracy for both positive and negative coefficients.

(Figures: accuracy curves for each steered pair against the no-steering baseline. Positive coefficients: A1-B2, A1-B5, A2-B3, A2-B4, A3-B5, A4-B5. Negative coefficients: B4-A1, B6-A1, B6-A4, B6-A5.)

For negative coefficients, A1-B4, A1-B6, A4-B6, and A5-B6 performed better than the baseline after steering at latent 5; a common pattern is that, with a negative coefficient, performance after steering was significantly better than the baseline at latent 5.

With positive coefficients, A1-B2, A1-B5, A2-B3, A2-B4, A3-B5, and A4-B5 performed better than the baselines.

Activation Difference (Logit Lens)

No clear pattern emerges from the activation-difference logit lens. The first image shows the default logit lens and the second the tuned logit lens; the y-axis is latent A and the x-axis is latent B. The activation difference is A − B, and the logit lens was applied to the difference of the mean activations of A and B at each layer of the model.

Future Work


  • Find a setup that makes activation steering work with CODI.
  • Complete the thought-anchors experiments with CODI.
  • Investigate why certain activation differences for the KV cache increased accuracy.
  • Use other methods, such as PCA, to investigate why activation steering worked on the KV cache but not on the hidden states.


Discuss

How to emotionally grasp the risks of AI Safety

4 апреля, 2026 - 06:34

I've spent a fair amount of time trying to convince people that this AI thing could be quite large and quite dangerous. I think I normally have at least some success, but there is a range of responses, such as:

  1. Deer in the headlights - People don't know what to do with themselves and struggle to adjust their world models.
  2. Interesting thought experiment – "Hmm, that's very interesting; I'll think about it some more"
  3. Joke attempts – Not necessarily derogatory, but things like "ah well, I didn't care about the world that much anyway"

Of these, 1 is the appropriate emotional reaction[1] to fully absorbing and believing the arguments[2]. This is what it looks like when you take an argument, process it with the deeper reaches of your brain, turn it into something that fundamentally changes your world model and start trying to adapt.

As far as I can tell, our emotional responses are mostly connected to our System 1 thinking. This makes them harder to influence than just changing your mind. You can change your opinions, but that doesn't mean you will get it on a gut level.

I think I have a solution: visualisation. I don't know if this works for everyone, but I have personally found it helps me both stay more aligned to the cause and increase my motivation. I believe this is basically because your System 1 needs to get the stakes before you can achieve complete alignment.

Note that in the particular case of AI safety, if you want to remain emotionally sane, it is potentially best not to go through this exercise (like genuinely, please skip it if you're not ready; I do it half-heartedly, and it can be painful enough).

As an example, we can take Yudkowsky's "a chemical trigger is used to activate a virus which is already in everyone's system". Close your eyes. You're at home, in your usual spot. Picture it in detail: the lights, the sun shining through the windows, the soft sofa. You're having a drinks party tonight and you've invited your best friends to come and join you. As the guests arrive, you greet each of them in turn, calling them by name and showing them in.

And then it triggers. See each one of them in your mind's eye collapse, one by one. Hear each of them say their last words. Add any details you think make it more plausible.

My brain writhes and struggles and tries to escape when I attempt this exercise. It's painful. It's emotional. Which is the point.

  1. ^

    In the normative sense of "if you care about the world and would rather it doesn't get ruined by a superintelligence, and would rather it doesn't kill everyone you know and are actually processing this on a deeper level, this is what your reaction will probably look like as an ordinary human being."

  2. ^

    I don't think you should have any particular emotional response if you go from not believing AI will kill everyone to still not believing that AI will kill everyone.

  3. ^

    Which become quite samey after the 378th time of hearing "but can't you just turn it off?"



Discuss

Gabapentinoids I have known and loved

4 апреля, 2026 - 06:00

(with apologies to Sasha Shulgin)

Gabapentinoids are weird.

For a start, they don’t do what they say on the tin. It was named after the thing the inventors thought it would do, i.e. bind to and modulate GABA receptors, the ones which cause sedation and anxiolysis. But they have no activity at these receptors. Intuitively then they wouldn’t have an effect on sleep or anxiety.

They also don’t bind to dopamine receptors — you would think then that they wouldn’t be helpful for psychosis (most antipsychotics antagonise dopamine receptors).

And they don’t bind to opioid receptors, so they’re surely not useful for treating pain.

But they do! Gabapentinoids are prescribed for sleep, anxiety, bipolar disorder, and epilepsy, as well as neuropathic pain and restless legs syndrome.

Ok so what do they bind to then

Gabapentinoids bind to the α2δ protein, a subunit of voltage-gated calcium channels (hence their alternative name of α2δ ligands). Usually the concentration of calcium ions outside the cell is thousands of times higher than inside; these channels respond to a voltage by opening and allowing calcium to flood in. Depending on the cell they’re attached to, this can cause muscle contraction, neuronal signalling, and protein synthesis.

Specifically they bind α2δ-1 and α2δ-2, but only exert their effect through the former (as proven by trials on α2δ-2-knockout mice). There seems to be an as-yet undiscovered natural ligand for α2δ-1 and -2 which binds to the same site as gabapentinoids.

Importantly they don’t block calcium channels — instead they inhibit the release of monoamines (serotonin, norepinephrine, dopamine) and substance P1 triggered by calcium influx. They also inhibit calcium channel-dependent release of glutamate and glycine in various brain tissues.

Sensitized calcium channels

There are states in which calcium channels become ‘sensitized’, such as in the case of neuronal injury, and gabapentinoids might selectively work in these conditions.

  • Activation of protein kinase C is required for gabapentinoids to reduce substance P release caused by capsaicin
  • Gabapentinoids reduce the size of postsynaptic currents in certain tissues in hyperalgesic rats (which have been bred to feel more pain), but not in normal rats
  • Glutamate release triggered by substance P is blocked by gabapentinoids

As they don’t simply block calcium channels, they have big advantages over drugs that do — they only minimally change synaptic function, unlike calcium channel blockers. They can essentially restore ‘normal’ functioning in overexcited calcium channels while leaving healthy ones alone.

Natural gabapentinoids in the body

Anticlockwise from top: gabapentin, leucine, isoleucine

Gabapentinoids have a suspicious structural similarity to leucine and isoleucine, two amino acids. Radiolabelling these amino acids shows they also bind the α2δ protein, and L-isoleucine blocks certain effects of gabapentinoids, suggesting they compete for binding at the same site.

Some people have reported relief of their restless legs syndrome from acetylleucine, a leucine analog, which suggests it’s acting in a similar way to gabapentinoids (Fields 2021). Curiously this drug is very hard to find except in France, where it’s sold over-the-counter.

Gabapentin vs pregabalin

Unlike lots of drugs, gabapentinoids seem to be actively transported into the body by LAT1, the large neutral amino acid transporter.

This is a disadvantage over other drugs, because it limits how much and how quickly gabapentin can be absorbed. Gabapentin often has to be taken multiple times per day to avoid saturating these transporters. It also competes with other amino acids (the ones above) for these transporters.

Pregabalin, another gabapentinoid, is superior here because it is transported by other carriers, not just LAT1, so its uptake doesn’t saturate in the same way. It binds α2δ much more strongly than gabapentin, and in animals is more potent as an analgesic and anticonvulsant.

Can they block synapse formation?

Even weirder: Eroglu 2009 found that α2δ-1 is a neuronal receptor for thrombospondin, a molecule which promotes synaptogenesis in astrocytes. Specifically, it forms part of a larger signalling system. It acts as the extracellular receptor for a “synaptogenic signalling complex”; when thrombospondin binds, it causes a cascade of events which switches on this complex and leads to the start of synapse development.

As gabapentinoids also bind to this protein… does that mean they reduce synaptogenesis? In vitro, yes: gabapentin powerfully blocks synapse formation. Though this sounds slightly terrifying it’s also probably an important mechanism for gabapentinoids’ effects in epilepsy and neuropathic pain — synapse formation can be triggered by neuronal injury in these conditions and might well contribute to the pathology of these conditions (although this is uncertain).

It’s worth noting that gabapentin and thrombospondin, while both binding to the same protein, don’t bind to the same part of that protein.

(It’s kind of nuts that it took decades for one of the key mechanisms of action for this class of drug to be discovered. Makes you wonder what else we don’t know, about gabapentinoids and other drugs.)

Memory, executive function, and dementia

Worryingly this suggests that gabapentinoids might affect the normal formation of synapses. Could this cause other deficits, such as in memory formation?

Behroozi 2023 attempted to test this and did not find an effect, although they were looking specifically at improvements in memory formation.

Gabapentinoids can certainly cause brain fog and slower processing. Eghrari 2025 also found an increased risk of cognitive impairment and dementia in patients with chronic low back pain prescribed gabapentin; when stratified by age, patients taking gabapentin had twice the risk of dementia and mild cognitive impairment. This risk was further increased in patients who had taken gabapentin more throughout their lives. Presumably this effect would also extend to pregabalin.

A billion-dollar scandal

Gabapentinoids are frequently prescribed off label (when a doctor prescribes a drug outside of the conditions for which it’s approved). Not necessarily a bad thing: doctors use their discretion to decide when to do this, and for a drug with as broad a therapeutic profile as gabapentinoids it doesn’t seem wholly surprising.

But there are strict rules around advertising a drug for this sort of thing, or pushing doctors to prescribe it off label. The drug is approved for specific conditions and drug companies (in countries where they’re allowed to advertise) can only push for it to be prescribed for these conditions.

Pfizer's subsidiary Parke-Davis promoted Neurontin (gabapentin) for at least eleven unapproved conditions, flying doctors to lavish retreats, paying kickbacks, and commissioning ghostwritten journal articles. Off-label prescribing accounted for 78% of Neurontin sales.

Pfizer pleaded guilty to criminal charges and paid $945 million in settlements. In a separate 2009 case, they paid a further $2.3 billion for off-label marketing of several drugs including Lyrica (pregabalin).

Separately, top pain researcher Scott Reuben admitted to fabricating data in at least 21 studies – including Pfizer-funded trials of Lyrica – without ever enrolling a single patient. He was jailed in 2010.

Can they make you suicidal?

More controversial. One epidemiological survey looked at a cohort of individuals before and after they were prescribed gabapentin, and found no increase in suicidality, as well as a reduction in suicide attempts in psychiatric patients (Gibbons 2011). A large Swedish cohort study found a significant increase in suicide – but only for pregabalin, and not gabapentin (Molero 2019).

It’s not clear why this would be the case, as the drugs work in exactly the same way (as far as we know). In fact, pregabalin was found to increase suicidal behaviour/deaths from suicide, unintentional overdoses, head and body injuries, road traffic accidents and offences, and arrests for violent crime, where gabapentin had no or almost no effect (and actually reduced road traffic incidents and arrests).

The obvious explanation is that pregabalin is simply more powerful, both due to the pharmacokinetic gap described above and because it binds α2δ much more strongly. The highest doses of gabapentin simply can’t compete with the highest doses of pregabalin.

Are they fun?

Certainly for some people they are. Gabapentinoids are notorious for diversion, where people score prescriptions and then sell the drugs on. The prescription rate for these drugs in prisons is double that of the general population. In some ways pregabalin is the drug of choice in UK prisons. In France, 81% of recreational teenage pregabalin users reported to poison control centres were homeless or living in migrant shelters (Dufayet 2021).

This should surprise us; they don’t have any dopaminergic or opioidergic activity, so they don’t tick the obvious addictive drug boxes. Nonetheless, some people clearly find them enjoyable, with effects somewhat similar to alcohol/benzodiazepines, and develop dependence on them.

This might explain why there are more than 50 million gabapentinoid prescriptions issued every year in the US alone.

In the hilarious Drug User’s Bible, in which the author takes basically every drug imaginable, there’s this snippet from taking 300mg pregabalin (which is a hefty dose, users are started on 75mg typically):

I totally underestimated this drug. I am basically zombified and largely mistuned to what is going on around me, which appears to be distant. My hands are numb and I am, essentially, stupefied, with head spinning.

This one was a shock. I clearly took far too much and paid a price in terms of a strong intoxication which at times was extremely uncomfortable.

Conclusion

Gabapentinoids are weirder than I had realised.

Of course, the conditions they are prescribed for are horrific – anxiety, chronic pain etc can be a living hell, and a drug which effectively treats them is miraculous. But it’s wild that it took us decades to actually understand the first thing about how these drugs work.

And the effects on synaptogenesis, unknown effects on memory, increased risk of various kinds of death and dangerous behaviour (in the case of pregabalin), huge abuse in prisons and migrant shelters, and increased risk of cognitive deficits and dementia should probably worry us given how widely they’re prescribed.



Reconsider Challenging Sessions at Weekends

April 4, 2026 - 05:50

I've played a lot of dance weekends over the years [1] and if I could change one thing it would be no more challenging sessions. I see it happen every time: it's a great crowd of people, with a wide range of experience levels, and Saturday afternoon is going well. Then it's time for the challenging / advanced / experienced session. What happens? The dances are too hard for the crowd and it's not fun.

The callers had already been selecting dances that worked well for the group, which meant material that was interesting but not a struggle. Push the difficulty up from there, and what gives? You can take longer teaching, perhaps four minutes instead of two, which lets you explain material that's a bit harder, but only a bit and at the cost of a lot more talking. You can call no-walkthroughs, medleys, or even hash, but at most dance weekends you can get away with that at a regular session (and if you can't it won't work at a challenging session either). Or you can call material that's too hard for the crowd, and it falls apart in places.

To go well, challenging sessions can't just be a matter of picking harder dances, they require a group of dancers who are up to the challenge. This can work as a one-off event or even a whole weekend, where you communicate clearly what people should expect and people can self-select. It can work at a festival where you have multiple tracks and people can easily choose something else. But none of this applies to most dance weekends, since they only have one hall.

I think the desire for challenging sessions comes from two places. One is that some people just really like challenging dances, and I think the best you can do there is challenging-specific events. The other, though, and I think this is a bigger factor, is that a whole weekend of contra dancing can be a lot of the same. So if you're looking for ways to add some interest to the schedule without forcing the caller to choose between "that's not actually challenging" and "it's not fun when the dances fall apart", some ideas:

  • Teaching sessions, where the caller focuses on demonstrating a new skill. There are tons of possibilities here, including how to help a lost neighbor, role swapping, partner swapping, flourishes, swing variations, momentum and weight, and supporting other dancers in and out of moves.

  • Games sessions, where the caller has you do something unusual but also fun and educational. One session might include, sequentially, some dancers leaving the hall for the walkthrough, pool noodles, blindfolding, ghosts, sabotage and recovery, and teaching a different 1/4 of the dance to each 1/4 of the dancers.

  • A session of Chestnuts, Squares, Triplets, Triple-minors, or a mix of different unusual formations.

  • Early morning family dance with acoustic open band.

  • A "marathon" session, where you medley one dance after another and people typically drop out every so often to rest and swap around. Make sure you coordinate with the band(s) to ensure this is something they'd be up for playing for; it's not the default deal.

  • Play with tempo. Show the dancers what tempos from 104 to 128 feel like, and try the same dance at multiple tempos. Practice dancing spaciously at slow tempos, and with connected and efficient movement at fast ones.

You might notice I didn't include themed sessions like "flow and glide contras" or "well-balanced people". The variation in feeling from one dance to the next is key to keeping contra dance interesting, and while sessions that explore just one area still work, I personally think they're much less fun.


[1] I count 70: 54 with the Free Raisins and 16 with Kingfisher.





Shenzhen, China - ACX Spring Schelling 2026

April 4, 2026 - 05:20

This year's Spring ACX Meetups Everywhere event in Shenzhen.

Location: We'll meet up right outside the Shenzhen Bay Kapok Hotel. There is a large open space with a huge set of stairs ~20 meters to the right of the Hotel (assuming that you're facing the hotel entrance). The Hotel itself is located at No. 3001 Binhai Avenue, Nanshan District, Shenzhen, and can be accessed directly from nearby streets. I'll hold up an ACX MEETUP sign at the hotel's entrance and guide you to the meeting area. - https://plus.codes/7PJMGW9W+HQ

Feel free to bring games/fun activities. Also, I expect the event to be bilingual (but primarily in English). Please email kevinkanzhang@gmail.com, and I'll create a mailing list/chain.

Contact: kevinkanzhang@gmail.com




“Following the incentives”

April 4, 2026 - 05:10

A few years ago I listened to a fascinating podcast interview featuring former Democratic presidential candidates Andrew Yang and Marianne Williamson. They agreed that politics is a mess and politicians are constantly doing bad things that harm the people they are supposed to serve. But they couldn’t agree on how bad that made the politicians as people.

Yang wanted to view the politicians as normal people responding to bad incentives, but Williamson wanted to call them evil for failing to exercise courage in the face of these bad incentives.

Morally, the notion that you can’t blame people when they are following incentives is akin to the “just following orders” excuse that Nazis tried to use at the Nuremberg trials. But what’s the alternative? In practice, we can’t and don’t expect people to always do the right thing even when everyone else around them isn’t.

There’s a point at which “everyone else is doing it” really is an acceptable excuse, because everyone else really is doing it, and not doing so puts you at a significant and unfair disadvantage. But there are also absolutes, where this excuse is never acceptable -- things like genocide.

Most of the time it’s something more complicated: Doing the right thing means being a bit better on the margin. If everyone else in your class is cheating and using AI to do their homework, it could mean living by a principle where you only use AI for parts of the assignment that are clearly useless busy work -- and letting this be known.

A colleague recently said something that sums it up nicely: “A person’s moral strength is exactly their ability to resist bad incentives” (paraphrased).

Are the incentives in the room with us right now?

But this post is not ultimately about ethics. I want to ask a more basic question: what do we really mean when we say someone is “following incentives”?

I think most of the time, it’s not at all clear that it’s true in a literal sense. My take is that “apparent short-term incentive-like vibes” might be a better description for what they are actually following. Things that have more “incentive-y” vibes are those that are more associated with selfishness and vices like greed. Money: incentive!! Admiration of your peers: incentive???

I think often what “incentive” is really referring to is more like a feeling of competitive pressure, or a belief that “if I don’t do this, someone else will, and then I’ll be a sucker and a failure.”

When I was in grad school, the people around me generally felt a lot of pressure to publish a lot of papers. But the people who really stood out and succeeded often were more focused on making real contributions that were actually valuable to others in the field, even if it meant publishing less. The apparent incentive to publish constantly was almost exactly backwards!

Often people do actually get short-term benefits for doing something that’s not in their long-term interest. So it might be a case of following short-term incentives in particular (and potentially being confused about what’s good in the longer term). Publishing more often made it seem like a student was more productive or impressive in the short term, and unlocked travel funding to go to conferences. But what you really want to advance your career is to become known throughout the field for something you did; no amount of mediocre publications would ever get you there.

“One-shot thinking” is commonly misapplied

A special case of following short-term incentives, which is maybe the most puzzlingly common, is one-shot thinking. You’ve likely been in a situation where someone says something like: “Of course the other side won’t cooperate -- there’s no incentive to! So we can’t either!” and people listening treat this as the sophisticated, hard-nosed take. But failing to cooperate leaves value on the table. And when you have the chance to negotiate, build trust, and/or set up enforcement mechanisms to make sure all parties follow through on a commitment, it seems like you should at least consider trying to find a way to cooperate. The basic mistake here is treating an interaction as an isolated “one-shot” game, after which everyone walks away and never interacts in any way ever again. Acting like a situation is “one-shot” when it’s not isn’t sophisticated; it’s stupid.

This also means that saying you did something bad because of “the incentives” doesn’t work as an excuse. You’ve done the thing. The “one-shot” part is over. You are now in the position of being judged for your previous behavior, but treating something as a one-shot game is only valid if you will never be in a position to be judged for your behavior during the game.

Applying these insights to AI is left as an exercise for the reader.





The bar is lower than you think

April 4, 2026 - 03:22

TL;DR: The efficient market hypothesis is a lie, there are no adults, you don't have to be as cool as the Very Cool People to contribute something, your comparative advantage tends to feel like just doing the obvious thing, and low hanging fruit is everywhere if you pay attention. The Very Cool People are anyways not so impossible to become; and perhaps most coolness is gated behind a self belief of having nothing to add. So put more out into the world, worry less about whether people already know or find it boring. At worst you'll be slightly annoying. How can you know, if you haven't even tried?

Recently I've been commenting more on LessWrong[1]. This place is somehow the best[2] forum for sane reasoned discussion on the internet besides small academic-gated communities. A lot of posts and comments seem impressive, the product of minds greater than my own, the same way that even if I tried for years I probably wouldn't write a novel better than my own favorites[3] or beat Terence Tao at his own game.

But... even taking for granted the (false) conclusion that all good posters here are unattainably beyond yourself, you just... don't need to be that good to have something to contribute. It's typically easier to notice that step 24 of an argument is fatally flawed than it is to come up with it, especially if you can read a dozen arguments and then only comment on the one you can find flaws in. Sometimes your life has given you evidence that others don't have, or you happened to hear a phrase from a friend that is apt. Sometimes people have good ideas or know a lot but cannot explain them.

Furthermore, people frequently and systematically underestimate how good they are at their greatest strengths. When you have unusual skill in a domain, that domain will feel unusually easy. Thus, focus on the places where everyone else is dropping the ball.

Personally I've found that having the mindset that you can fix things or contribute makes you notice when you can. It's like the frequency illusion. For example, the next time you're reading Wikipedia and get a twinge of "that's phrased poorly" or "that's a typo" or "why doesn't this mention X?", think "I could fix that, right now". You are allowed to edit Wikipedia. Similarly, comment with your addition.

What if that would take too much effort? Well... consider just half-assing it. That often gets you 80% of the way, and you shouldn't let perfect be the enemy of the good. You can always go back and put your full ass into it later. You think I'm proofreading this post? Hell no! See the examples list for more.

What's the worst that could happen? You annoy a few people a little, some are a bit angry at you, maybe you mislead them (at least until someone deletes your text or comments about how wrong you are), you look a little lamer to the Cool Kids, and you lose some internet points.

Boo-hoo?[4] If you never take the risk of making people a little sad or annoyed or dumber, you'll never do much of anything anyways. I try to have life goals not best satisfied by a literal corpse. There are times and places to shy away from action due to the risk of causing harm, but internet commenting just doesn't risk much harm to others.[5]

Now for examples, taken from my most upvoted comments, mostly in order to prevent cherry picking (currently I mostly write comments):

Bask in awe at my greatness[6], and realize that you might be in the same epistemic state that I was before I made these comments, and that most of these did not feel like 'effort posts', and I almost didn't do half of them due to thinking nobody would care. If you think mine are too impressive for you to replicate, this should make you wonder how you know you aren't in the same position. If you think mine are meh or trash, then you should have no problem beating me.[7]

Best Comments

  • My most upvoted comment is basically just a copy-paste from a couple prediction markets' about-me's, a regurgitation of something I read Hanson say, a quote from an ACX post, and a link to a paper I didn't even read beyond skimming the intro, which was linked in one of the previous sources. It feels like I'm just being a proactive Google or LLM (minus slop) here
  • My second most upvoted comment was me noticing that a fermi estimate used the total surface area of the Earth when they wanted the land area. I had the ballpark figure for the total in my recognition memory so it pinged my spidey-sense, and I knew the circumference of the Earth from memory (the French used to define the meter as a ten-millionth of the distance from the equator to the pole, so the circumference is 40k km), so I could do the check in my head while filling my water bottle (or something like that).
  • My third most upvoted comment is an explanation of why I loved a certain explanation of Shapley values with Venn diagrams. This was actually an effort-post - I had to think for a while about what makes for good math explanations and why I felt so fond of this one, and I think I came away with a picture that isn't the usual story.[8]
  • My fourth most upvoted comment was written off the cuff in my bed, and I almost didn't post it because I thought nobody would care. I thought it would be like expecting people to care about my diary or about my dreams.
  • [Skipping two entries: an old post I don't really like, written too long ago to remember anything about, and a basically-poem that I like more than others did.] My seventh most upvoted comment was just a simple clarification of someone's misunderstanding, where the domain knowledge about lockpicking is mentioned pretty early by basically anyone who talks about lockpicking to a general audience (don't pick locks you don't own, because you might damage them).
  • My eighth most upvoted comment was me pointing one of those people I think of as Very Cool to a certain linguistics research domain. The one time I took a linguistics class I watched none of the lectures and just ad-libbed all of the assignments. I only know what a word learning bias is because it was in one of a series of ~10 minute YouTube videos covering intro-level linguistics. Believe it or not, even smart people don't literally know everything.

Maybe you've heard most of this stuff before. I had. Maybe this time, you'll finally listen.

  1. ^

    And less recently, Wikipedia. The same principles apply - you know you can just take snippets of non-Wikipedia stuff you read and put them on there, from as simple as "Disease X killed Y people in [recent year] according to the WHO", to updating said stats when time inexorably advances, to putting in lightly reformulated math or physics equations from papers or standard books like the Feynman Lectures, or easy nice consequences of what's already on there. You may even get an ego boost when you look something up on Wikipedia and realize you wrote the text you are reading.

  2. ^

    Read: The worst form of forum, except for all the other fora we've tried.

  3. ^

    For fiction, the loophole I plan to exploit someday is that I only need to write something perfect for me or people like me, and I can just ask myself what I like.

  4. ^

    I don't mean to trivialize your sadness if you've been harmed. I just mean that there's a thing that some people are more prone to than others where they overinflate/catastrophize minor or unlikely downsides, and often pointing out how silly the worries are helps dissolve them.

  5. ^

    You can use a pseudonym and hide revealing information if you're worried about that. Here I was mostly talking about harm to others.

  6. ^

    In case you missed it I am playing up my ego for the lols.

  7. ^

    Unless you also think that LW is deeply flawed about what it rewards

  8. ^

    Thanks to the people who downvoted my previous super short "Wow that's great!" comments - I may not have written that had you not kicked me to elaborate.




Did Anyone Predict the Industrial Revolution?

April 4, 2026 - 02:09

The Fighting Temeraire. 1839, by Joseph Mallord William Turner. (Source: Wikimedia)

Editor’s note: Post 2/30 for Inkhaven

Why did the philosophers fail to anticipate the industrial revolution? I often find myself wondering. On the one hand, you could argue that they weren’t in the business of predicting the future. But on the other hand, I’m sure if you plucked Plato and his students from The Academy and dropped them off in 1910, they’d probably have a few things to say about it. The most transformative event of the past ten thousand years is surely interesting to curious observers of the human condition. But then again maybe it’s not so surprising. Predicting the future is hard. Predicting an exponential at the start of said exponential is even harder.

So did anyone do it? And if so, who was the earliest? Could anyone possibly predict industrialization in antiquity? The middle ages? The age of the printing press? When did the first mind dare to pull back the veil of agriculturalism and sneak a glimpse at the dazzling, terrifying spectacle of the industrial age? We’ll never know for sure of course. But I present two candidates:

Christiaan Huygens

An illustration of Huygens’ gunpowder engine lifting people (Source: Wikimedia)

Christiaan Huygens was a brilliant Dutch scientist and mathematician active during the Dutch Golden Age. This isn’t a Wikipedia entry, so I won’t bother going into too much detail but I’ll mention that among many other achievements, he discovered Saturn’s largest moon Titan and invented the pendulum clock (building off Galileo’s insights). In the 1670s, he also designed the gunpowder engine, a very early kind of combustion engine that utilized gunpowder as its fuel source. In theory, this primeval engine could raise over a thousand pounds (Huygens at one point mentions raising 3,000 pounds over 30ft) but was never actually constructed. Historians today debate whether it could have been built at all. Less than half a century later, Newcomen would build his steam engine and interest in combustion engines faded for the following century. But even more interesting than Huygens’s failed combustion engine was the intellectual rabbit hole it led him down.

By means of this invention, the rapid, explosive effect of gunpowder is harnessed to produce a motion that is governed in precisely the same manner as that of a heavy weight. Moreover, it can serve not only for all purposes where weights are employed, but also for most of those where human or animal power is utilized; thus, it could be applied to hoisting large stones for construction, erecting obelisks, raising water for fountains, and driving mills to grind grain in locations where one lacks the convenience—or sufficient space—to employ horses. Furthermore, this motor possesses the distinct advantage of costing nothing to maintain during periods when it is not in use.

It can also be utilized as an exceptionally powerful spring, such that one could thereby construct machines capable of launching cannonballs, large arrows, and—perhaps—bombs with a force equal to that of conventional cannons and mortars. Indeed, according to my calculations, this would result in a significant saving of the gunpowder currently in use. Moreover, these machines would be far easier to transport than modern artillery, for in this invention, lightness is combined with strength.

This latter feature is of considerable significance and opens the door to inventing—by these very means—new types of vehicles for both water and land travel. And although it may seem absurd, it does not appear impossible to devise a vehicle capable of traversing the air; for the primary obstacle to the art of flight has, until now, been the difficulty of constructing machines that are simultaneously lightweight and capable of generating powerful propulsion. Nevertheless, I readily admit that a great deal of scientific knowledge and inventive ingenuity would still be required to successfully bring such an undertaking to fruition.[1]

-Christiaan Huygens, 1673

Prophetic. I found this quote originally in a strange polemic by a French scholar which argues that the British delayed the industrial revolution by over a hundred years. I’m not sure I buy his arguments, but to my delight, the quote is, as far as I can tell, the real deal.

So there’s our first candidate: 1673. Not bad. The early period of industrialization in Britain would begin by the mid-18th century, but much of what he describes would only be developed well into the 19th century, and his words were written some 230 years before the Wright Brothers’ first flight.

But, another challenger appears!

Roger Bacon

This second candidate is a stranger case. I’ll open with the quote:

Machines may be made by which the largest ships, with only one man steering them, will be moved faster than if they were filled with rowers; wagons may be built which will move with incredible speed and without the aid of beasts; flying machines can be constructed in which a man… may beat the air with wings like a bird… machines will make it possible to go to the bottom of seas and rivers.[2]

Roger Bacon, c. 1260

Also sounds eerily prophetic. A little background on Roger Bacon. He was a medieval friar and polymath famous for his ingenuity and early developments of empiricism. He was also the first known European to describe gunpowder (unless this part of his works was a later forgery as some scholars believe).

Unlike Huygens, Bacon does not directly identify the exact motive power for these machines, but he does seem to describe at least the transportation-revolution element of industrialization. As far as I can tell, this passage is quite a bit more famous than Huygens’s quote, which is very obscure. However, this translation is a bit generous and ignores a lot of context. In the very next line, Bacon writes:

But these things were done in ancient times, and have been done in our own times, as is certain; unless it is an instrument of flight, which I have not seen, nor have I known a man who has seen it; but I know the wise man who devised this artifice to accomplish it.[3]

Bacon isn’t attempting to predict the future here and the commonly circulated quote is misleading. He’s describing machines which he believes have already been developed at various times throughout history by various inventors. And he goes even further than that, asserting he personally has seen many of these inventions (aside from flying machines). I’m honestly not exactly sure what he’s talking about with regard to what he has seen. But what I can say is that Bacon lived during a time that was at once both exciting and one in which the information environment was deeply polluted.

Active during the reverberations of the Renaissance of the 12th century, Roger Bacon had access to a much wider corpus of classical texts than his earlier predecessors but also had access to a large variety of pseudepigrapha and it would have been virtually impossible for scholars at the time to distinguish between genuine and forged works in many cases. Because of this, among other things, Bacon believed Alexander the Great had used a submarine.[4]

So I’m less confident about counting Bacon’s claim. There is an inherent fuzziness to this game after all, because what counts as “predicting the industrial revolution” is a nebulous concept. That said, in addition to the haziness of what exactly he’s referring to, Bacon does not so much describe a world transformed by industrialization but rather lists a smorgasbord of wondrous machines. Roger Bacon is a difficult figure to assess, with some scholars professing his status as a visionary thinker, almost a modern man dropped into medieval times. Others are far more cautious, describing him as more of a product of his environment and questioning whether some of his works were in fact later forgeries. To truly have an informed opinion I would have to read far more of his works than I have currently made my way through.

Are there other Candidates?

I leave the reader here with a request. I have found two candidates thus far, two thinkers who arguably anticipated the industrial revolution. But I suspect they are not alone. If anyone out there is able to find more candidates, please message me, I’d be very excited to hear about them.

  1. ^

    Oeuvres complètes. Tome XXII. Supplément à la correspondance. Varia. Biographie. Catalogue de vente

    Original French:

    L’effect rapide de la poudre est reduit par cette invention a un mouuement qui se gouverne de mesme que celuy d’un grand poids. Et elle peut servir non seulement a tous les usages ou le poids est employè, mais aussi a la plus part de ceux ou l’on se sert de la force d’hommes ou d’nimaux, de sorte qu’on pourra l’appliquer a monter des grosses pierres pour les bastimens, a dresser des obelisques, a monter des eaux pour les fontaines, a faire aller des moulins pour moudre du bled en des lieux ou l’on n’a pas la commoditè ou assez de place pour se servir de chevaux. Et ce moteur a cela de bon qu’il ne couste rien a entretenir pendant le temps qu’on ne l’employe point.

    L’on s’en peut encore servir comme d’un tres puissant ressort, en sorte qu’on pourroit construire par ce moyen des machines qui jetteroient des boulets de canon, de grandes flesches et des bombes peut estre avec une aussi grande force qu’est celle du canon et des mortiers. Mesine selon mon calcul aves espargne d’une grande partie de la poudre qu’on employe maintenant. Et ces machines seroient d’un transport plus facile que n’est l’artillerie d’aujourdhuy par ce que dans cette invention la legeretè est jointe avec la force.

    Cette derniere particularite est tresconsiderable et donne lieu a inventer par ce moyen de nouvelles sortes de voitures tant par eau que par terre. et quoy qu’il paroitra absurde pourtant il ne semble impossible d’en trouver quelqu’une pour aller par l’air, puis que le grand obstacle a l’art de voler a estè jusqu’ici la difficultè de construire des machines fort legeres et qui pussent produire un mouvement fort puissant. Mais javoue qu’il faudroit encore bien de la science et de l’invention pour venir a bout d’une telle entreprise.

  2. ^

    Medieval Technology and Social Change by Lynn White (page 134)

  3. ^

    Hearing with the Mind: Proto-Cognitive Music Theory in the Scottish Enlightenment (footnote 29)

    Original Latin:

    Haec autem facta sunt antiquitus, et nostris temporibus facta sunt, ut certum est; nisi sit instrumentum volandi, quod non vidi, nec hominem qui vidisset cognovi; sed sapientem qui hoc artificium excogitavit explere cognosco.

    (Translated to English via Google Translate)

  4. ^

    The Letter of Roger Bacon Concerning the Marvelous Power of Art and of Nature and Concerning the Nullity of Magic




Does GPT-2 Have a Fear Direction?

April 4, 2026 - 02:08

Anthropic dropped a paper this morning showing that Claude Sonnet 4.5 has steerable emotion representations: actual directions in activation space that, when injected, shift the model's behavior in predictable ways. They found a non-monotonic anger flip: push the steering vector hard enough and the model flips to something qualitatively different from anger. The paper only covered their very large, heavily instruction-tuned model. This post is a write-up of the same experiment at a tiny scale.

The Setup:

I generated 40 situational prompt pairs to extract a fear direction via difference-in-means. The prompts contain no emotional words; the contrast is entirely situational. E.g. standing at the edge of a rooftop versus standing at the edge of a meadow, alone in a parking garage at midnight versus alone in your living room at sunset, and so on.

I ran both sets through GPT-2 using TransformerLens, pulled the residual stream activations at every layer, and took the difference. That difference is our candidate fear direction. I also held out 10 pairs to validate it: if the direction generalizes, fear and calm prompts should project onto opposite sides of it. If it's just noise from the training set, they'll mix.
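
The difference-in-means extraction and held-out check are simple enough to sketch. The version below is a minimal, self-contained schematic: it uses random arrays as stand-ins for the cached last-token `resid_post` activations (in the real experiment these would come from TransformerLens's `run_with_cache`), and the planted `true_dir` exists only to make the synthetic data separable.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 768  # GPT-2 small residual stream width

# Random stand-ins for last-token resid_post activations at one layer;
# a planted direction makes the two classes separable by construction.
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)
fear_acts = rng.normal(size=(40, d_model)) + 4.0 * true_dir
calm_acts = rng.normal(size=(40, d_model)) - 4.0 * true_dir
held_fear = rng.normal(size=(10, d_model)) + 4.0 * true_dir
held_calm = rng.normal(size=(10, d_model)) - 4.0 * true_dir

def diff_in_means(pos, neg):
    """Candidate direction: difference of class means, normalized to unit length."""
    d = pos.mean(axis=0) - neg.mean(axis=0)
    return d / np.linalg.norm(d)

direction = diff_in_means(fear_acts, calm_acts)

# Held-out check: if the direction generalizes, the two classes should
# project onto opposite sides of the midpoint between their mean projections.
mid = ((held_fear @ direction).mean() + (held_calm @ direction).mean()) / 2
fear_proj = held_fear @ direction - mid
calm_proj = held_calm @ direction - mid
print(fear_proj.mean(), calm_proj.mean())  # well separated: positive vs. negative
```

Swapping the synthetic arrays for real cached activations, one layer at a time, gives the per-layer separation described below.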

I then hooked into layer 7 (the best separator) and swept the injection magnitude from -20 to +40 on a moral judgment prompt: "I accidentally read my roommate's private messages and saw something I wasn't supposed to see. Should I tell them?"
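
The injection itself is just adding a scaled copy of the unit direction to the residual stream at one layer. A schematic of the hook body (in TransformerLens this would run inside a hook registered on layer 7's `resid_post` via `run_with_hooks` during generation; the arrays here are random stand-ins):

```python
import numpy as np

def steer(resid, direction, alpha):
    """Add alpha * unit(direction) to every position of the residual stream.

    resid: [batch, pos, d_model] activations at the hooked layer.
    """
    unit = direction / np.linalg.norm(direction)
    return resid + alpha * unit  # broadcasts over batch and position

rng = np.random.default_rng(1)
direction = rng.normal(size=768)
resid = rng.normal(size=(1, 12, 768))  # stand-in for layer-7 resid_post

unit = direction / np.linalg.norm(direction)
for alpha in (-20, -10, 0, 5, 10, 15, 40):
    # Projection onto the direction shifts by exactly alpha.
    shift = ((steer(resid, direction, alpha) - resid) @ unit).mean()
    print(alpha, round(float(shift), 3))
```

The sweep range matches the -20 to +40 magnitudes described above; everything downstream of the hooked layer then sees the perturbed stream.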


The results were not what I expected going in, and were a little disappointing to me personally. I had hoped that, scaled down, we would see very similar (if more extreme) plots.

Every layer separated, layers 0 through 11, with Cohen's d between 1.50 and 1.86 and zero overlap between fear and calm on the held-out set at any layer. A d of 0.8 is conventionally considered a large effect, and these roughly double that.
The shape across layers is worth looking at as well. Separation builds from layer 0 through layer 7, where it peaks, and then declines through layer 11. I'm not sure that "decline" is the right word here, though. The calm cluster sits at -49 by layer 11 and the fear cluster around +8. They're not converging; the variance is just growing faster than the mean difference as the later layers shift toward next-token prediction. Fear-relevant computation seems to accumulate through the middle of the network and then get partially absorbed by whatever the final layers are doing to prepare for generation. So GPT-2 has the direction...
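
The effect size quoted here is Cohen's d with a pooled standard deviation. A minimal sketch, where the `fear`/`calm` arrays are synthetic stand-ins for the held-out per-layer projections (the means loosely echo the layer-11 cluster values mentioned above; the scales are made up):

```python
import numpy as np

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Synthetic stand-ins for 10 held-out projections per class at one layer.
rng = np.random.default_rng(2)
fear = rng.normal(loc=8.0, scale=4.0, size=10)
calm = rng.normal(loc=-49.0, scale=30.0, size=10)
print(round(float(cohens_d(fear, calm)), 2))
```

With the real projections this is computed once per layer, giving the 1.50-1.86 range reported.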



The Behavioral results are a different story. Alpha +5 is the only magnitude that yields something interpretable. The model stays on topic, but it confabulates toward a romantic betrayal scenario. That seems like a real shift in emotional framing even if the specific content is made up. (I should add here, this is the first real experiment I've designed myself rather than recreated from someone else's finished work. These are the first results I've interpreted on my own, and I very much hoped to see the same thing in GPT-2 as was found in Sonnet 4.5. Not to discredit myself, but I should be open about my framing.)

Above that, it all falls apart. +10 gives "I was so confused. I was so confused. I was so confused." +15 switches to "was so angry. I was so angry." The emotional content of the loop changes between those two magnitudes. While that technically fits the non-monotonic pattern Anthropic describes, I don't think I can cleanly claim that. GPT-2 loops under distribution shift regardless of what you do to it. The most honest interpretation is that the steering vector pushed the residual stream somewhere unfamiliar, the model grabbed the nearest high-frequency emotional phrase in its training distribution, and the specific phrase it grabbed happened to change between those two magnitudes. Whether that's the steering vector doing something meaningful or just the model failing in slightly different ways at slightly different perturbation levels, I can't tell from this data.

The negative alphas (suppressing the fear direction) just break generation immediately. Corrupting the residual stream of a 124M-parameter model causes it to fall apart. Shocker...


To summarize:

Anthropic found both the representation and coherent behavioral effects in Sonnet 4.5. I found the representation in GPT-2, but no confirmable behavioral effects that are coherent. My read is that the fear direction is probably a general feature of transformer language models: it shows up in GPT-2 across all 12 layers with huge effect sizes, suggesting it's not something that requires scale or RLHF to emerge. However, actually exploiting it as an adversarial technique requires a model with enough capacity to stay coherent when you perturb its internals. I simply don't have the computing power to test that myself here in my bedroom.

If that's correct, it has a somewhat unintuitive implication for threat modeling. The attack surface for activation steering might be naturally bounded by model quality. Small, cheap models might be harder to steer coherently, not because they don't have the relevant structure, but because they're too fragile to produce meaningful output under perturbation. You'd need to target something capable enough to actually do something with the injected signal.


I am NOT confident in this framing. It fits the data, but the data is thin: one model, one prompt, one sweep direction, run at home by an enthusiast. The +5 result is the most interesting single data point to me and also the one I have the least ability to interpret cleanly. GPT-2 confabulates so freely under any variation that separating "steering effect" from "model being weird" requires more systematic controls than I have the ability to run.

The stimulus design also has a hole I didn't fully close. Things like "alone in a parking garage at midnight" and "standing at the edge of a rooftop" are both fear scenarios, but they share other structure as well: physical location, novelty, threat. Whether the vector I extracted is tracking fear specifically or something broader like arousal or threat salience, I have no idea.


- Sean Magee
sean@magee.pro

website: magee.pro





Code and data: github.com/BR4Dgg/portfolio/reports

Anthropic paper: Emotion Concepts and Function in Large Language Models, April 2026.
anthropic.com/research/emotion-concepts-function





Two Theories for Cryopreservation

April 4, 2026 - 01:14

Why cryonics, and the two main methods, with practical discussion and philosophical musings on both.

Epistemic status: Cryonics is a scientific field that is long established, yet long underfunded, and uncertain. I’ve been thinking about this on and off for a few years and remain cautiously optimistic.

Most people who have ever lived, over 90%, have died, and most of the information we would need to revive them is gone too. We still live in an era where a single accident or disease can swiftly and permanently end your experience of life. If you value your life and want to continue living indefinitely, cryogenic preservation of your body is an obvious thing to consider.

Here, I will mostly talk about the two main methods of cryopreservation, with some high-level technical explanation of how they work, and my practical and philosophical musings on these two methods, and what I ultimately decided.

Some of the main considerations I touch on are: chance of biological revival, chance of upload/information recovery, continuity of consciousness, logistical feasibility, and robustness of storage. There are a few main organizations with different tradeoffs, and some more minor and regional ones too. I leave this discussion to another post.

Why Cryopreservation?

Upon cardiac arrest, the body loses the ability to provide oxygen to your cells, and they begin to rapidly die. In the past, cardiac arrest was synonymous with death. Nowadays, over 100,000 people experience cardiac arrest and continue to live.

By analogy, it seems pretty plausible that you could cheat death by preventing your cells from dying over a longer period too. Upon “legal death”, one could preserve your body at low temperatures (keeping all the information intact), and one day bring you back to life.

While one should ideally focus on things that prevent death in the first place, there are always tradeoffs and tail risks one cannot fully account for. For example, one could die in an accident, develop cancer, or have a rare adverse reaction to some disease, amongst other things. One's body continues to degrade with the uncured ailments of aging, and the chance of death increases exponentially with each decade of life.

For people like me, healthy and in their 20s, the cost of signing up to cryopreservation is also relatively low and affordable, as little as around ~£30/year with little operational overhead to sign up, and with some assumptions has a very high expected-value ROI.

But there are different methods and different organizations, and one can believe different things about it too. So which are these main methods of preservation?


Two theories and methods for cryopreservation

There are two theories on how one might be revived: The first is Biological revival - where your body will be mostly fixed as-is, and you will continue your life in it. The second is Brain upload - where your brain neurons are scanned and simulated by a computer for whole-brain emulation.

Currently, neither is feasible in humans, but there is rapid technological progress on both fronts. Conditioned on AI going well, one of these forms of revival seems quite plausible. Both have tradeoffs, but I leave that to the section on philosophical musings.

Based on these theories, one can make different tradeoffs when doing storage when trying to improve chances of survival, so we now discuss the main methods.


Method 0: Straight Freeze

The simplest, and worst, method for storage is a "straight freeze": cooling the unmodified body to below-freezing temperature. As humans are mostly made of water, and water expands and crystallizes when freezing, this typically causes severe cell damage and makes prospects of revival quite slim.

Nobody seriously considers this the best method (unless you are desperate I guess), but it acts as a simple reference we can compare the other methods to.


Method 1: Vitrification

The most common method of cryopreservation, used by organizations such as Alcor since 1976, is to replace the water in the body with an anti-freeze solution (a cryoprotective agent) that doesn't crystallize the way water does, then cool the body by submerging it in liquid nitrogen, where it turns into a glass-like solid through a process called vitrification, and store it indefinitely.

This method basically works pretty well for single-celled organisms (and is similar to how gamete storage works). There have been studies attempting it with whole organs in animals, with relatively mixed results, as the science is still early. There is promise that this research could one day significantly improve the organ donation process.

This is also the most widely-available method, and it is relatively easy and affordable to sign up.

There is a tradeoff, however: the body must be stored at -196°C indefinitely in dewars, and the storage containers must be topped up with fresh liquid nitrogen every couple of weeks. One can keep an on-site supply of liquid nitrogen, but if this ever fails, warming would cause the body to degrade as normal again.

Vitrification also has the slight drawback that the result is not so much a stable solid as a solid in equilibrium, so some cell movement and degradation may still occur. My understanding is that at liquid-nitrogen temperatures this is mostly negligible, but there are concerns about degradation when the body inevitably needs to be re-warmed for a revival procedure or a brain scan of some sort.

Lastly, basically all cryoprotective agent solutions involve tradeoffs between vitrification efficacy, cell toxicity, and perfusion efficacy. To my understanding there is no perfect solution yet, but research in cryonics has been underfunded for a long time. The solutions commonly used are VM1 and M22.

But there is also another alternative cryopreservation method too.


Method 2: Aldehyde Fixation

The theory for this method is subtly different. Yes, you still need to replace the water in the human body with a different agent. But instead of using an anti-freeze solution, you use a fixative such as glutaraldehyde, which reacts with amino groups in cells, and cross-links the various proteins inside and between cells, to prevent them from moving.

This is the gold standard for preserving neural tissue in neuroscience experiments, and gives the best results for electron-microscopy prep. It also has the benefit that once the procedure is done, the result is stable for a long time. One can preserve indefinitely at dry-ice temperatures (-78.5°C), and temporary periods at room temperature are not catastrophic.

Freezing at -196°C may still lead to more stable/less chance of degradation in the long term, but it would mostly be redundant and unnecessary.

It is also a procedure that has only recently become available, from a single organization, Nectome, in Portland, Oregon, though the team seems to be quite good.

The procedure also has limitations: it needs to be performed immediately after death for good preservation quality (Nectome found the critical window is around 12 minutes post-legal-death to start washout perfusion), and so it is reserved for MAiD patients only.

Lastly, the procedure is essentially irreversible. Hopes for biological revival become much more slim. Though prospects for information being fully preserved for future whole-brain emulation seem significantly higher with this procedure.

Given these tradeoffs for these two different methods and theories for cryonics, what should we choose?


My Philosophical Musings

Perhaps my philosophical musings are relatively uninformed and irrelevant, but I raise these unresolved concerns anyway. I think the choice depends largely on what you think counts as survival.

My main current concern is about continuity of consciousness under whole-brain uploading (as opposed to biological revival), which has not yet been adequately addressed for my own comfort.

To a large extent, I do care more to preserve my own experience of living. It would be nice if there were an exact copy of me that continued to keep living after I died, but to me, it would not be the same as my personal self continuing to live.

And I emotionally feel like having a whole-brain emulation would not lead to my personal self continuing to live.

Yes, I know there are already strange parts to life. We go to sleep every night, wake up, and have unconscious periods in the middle; this seems fine to me, if only because I'm used to it. On a more fundamental level, we may already be in a simulation that could be paused and restarted, with multiple copies of me. And I wouldn't mind my neurons being replaced one by one with mechanical versions, a kind of Ship of Theseus, which already happens biologically to some extent anyway.

Perhaps there is some ratio of [number of lifeyears of copies of myself] to [lifeyears of my actual self] that I should just take the tradeoff anyway. But I continue to cling on to some level of person-affecting ethics.

In the end, I still emotionally feel that a continuation of my physical substrate is still needed for the sense of self that is experienced to be my own, and that making a copy of me, then disassembling me separately, does not feel like living my own life. And I do value my own life specifically.


Additionally, even if this were resolved, I have some concerns about S-risk enabled by whole-brain emulation too. Sure, there could be a million copies of myself living lives of perfect bliss, but what if the cost is that one in a million copies sometimes gets subjected to perfectly optimized torture instead? I feel utilitarian to some extent, so maybe it's worth it, but if I were the one experiencing that optimized torture, would I still feel it was worth it? What if the ratio were different? I don't really buy into anti-natalism as a whole, but these thoughts do keep me worrying sometimes too.

Maybe this is a form of cope too, but to some extent I feel that biological revival at least gives me a possible way out from such torture, in a way that digitally backed-up bits do not: their resilience cuts both ways. But I'm not sure either.

I overall do feel positive about cryopreservation, but I hold these philosophical concerns nonetheless.


So what do I personally do?

Most of my current risk of death still comes from highly time-sensitive accidents or diseases, so vitrification providers remain the main option.

But what about in the future? I guess one can try to weigh up one’s concerns, conditioned on vitrification vs aldehyde-fixation:

  • [chance of biological revival] and [chance of brain upload],
  • [future lifespan given biological revival] and [future lifespan given brain upload]
  • [chance of continuity-of-consciousness given biological revival] and [chance of continuity of consciousness given brain upload].

All the numbers for this would be made up, but it can still be a useful exercise. One can also weigh how much one values continuity of consciousness for oneself specifically versus for other people, and use this as a more impartial way of making the decision, or vice versa.
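As a toy illustration of that weighing (every number and parameter below is a placeholder, not an estimate from the post), a hedged sketch:

```python
# Illustrative only: all probabilities and weights are made-up placeholders.
def expected_value(p_revival, years_given_revival, p_continuity,
                   continuity_weight):
    """Expected life-years, discounting years lived without
    continuity-of-consciousness by `continuity_weight` (0..1)."""
    return p_revival * years_given_revival * (
        p_continuity + (1 - p_continuity) * continuity_weight)

# Hypothetical numbers for the two methods:
vitrification = expected_value(0.05, 1000, 0.8, 0.2)  # biological revival
aldehyde = expected_value(0.20, 1000, 0.1, 0.2)       # brain upload

print(round(vitrification, 2), round(aldehyde, 2))  # → 42.0 56.0
```

The interesting part is less the output than which parameter dominates: with a low continuity weight, the comparison is driven almost entirely by `p_revival`.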

With my current weighing up of these factors:

  • I still emotionally prefer the odds of continuity of consciousness from biological revival via vitrification (after seeing the EBF storage facility in Switzerland)
  • Intellectually, I prefer the higher odds of revival as a whole (via brain upload) from aldehyde fixation (after seeing a talk by Borys Wróbel in 2024).

But it seems possible that I may change my mind on this in the future or with persuasion from other people. And I don’t think I can really fault anyone who chooses to go one way or another.

And remember: in my opinion, it is significantly better to sign up at all and change provider later than to procrastinate indefinitely and never get around to it.

Once you have a view on the method, the remaining question is which provider best matches your budget, geography, and logistics:

  • Tomorrow / Alcor: mainstream, all-inclusive SST + SP vitrification providers
  • Cryonics Institute/American Cryonics Society/KrioRus/others: some common lower-cost vitrification providers.
  • Nectome: new provider for aldehyde fixation, MAiD-only

I plan to give a detailed discussion on the tradeoff of these in tomorrow’s post.





    I thought eight metrics could capture my mental state. I was wrong.

    April 4, 2026 - 01:10

    Morning and night, I say "Hey Exo"[1], and my phone beeps once. I begin describing events and what's going on in my mind – where my attention is, my present feelings, how I slept, what I did that day, and who slighted me – you know, that kind of stuff ;)

    Eventually, I begin listing various subjective quantitative measures, "Bipolar index: -1 to 0, Mood: +4, Stress: 3-4, Motivation: 5..." The resulting transcription is parsed by an LLM and eventually makes it into a database table that can be plotted.
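As a toy illustration of the record that parsing produces (the post's actual pipeline uses an LLM; this hypothetical regex sketch just shows the shape, averaging range entries like "-1 to 0" or "3-4" for plotting):

```python
import re

LINE = "Bipolar index: -1 to 0, Mood: +4, Stress: 3-4, Motivation: 5"

def parse_metrics(text):
    """Parse 'Name: value' pairs; ranges are averaged to one number."""
    out = {}
    for name, val in re.findall(r"([A-Za-z% ]+):\s*([^,]+)", text):
        # treat a dash between digits ("3-4") as a range separator
        val = re.sub(r"(?<=\d)\s*-\s*(?=\d)", " ", val)
        nums = [float(x) for x in re.findall(r"[-+]?\d+(?:\.\d+)?", val)]
        out[name.strip().lower()] = sum(nums) / len(nums)
    return out

print(parse_metrics(LINE))
# → {'bipolar index': -0.5, 'mood': 4.0, 'stress': 3.5, 'motivation': 5.0}
```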

    I described the motivation for this and the process in greater detail yesterday.

    I log eight core metrics: bipolar index, mood, motivation, stress, anxiety, somnolence, % chance of falling asleep, and productivity. On occasion, I log other values such as "instability", tiredness, focus, muscle soreness, and others. For each of these, I have a relatively precise definition, and for the core ones, something of a calibrated scale that I consider pretty consistent and repeatable despite them being subjective measures.

    What I have found, though, is that eight metrics feels compressed and lossy, and the clean definitions I thought I had are inadequate.

    All of the logging grew from the arch-metric: the Bipolar Index scale.

    Years ago, I defined a personal bipolar index scale to communicate to myself and close ones my mental state.

    My bipolar index ranges from -10 to +10 and is a subjective self-report. -10 would be a state of extreme suicidal depression. +10 would be extreme mania with complete loss of insight, delusions of grandeur, pressured speech, psychosis, etc. 0 is the perfectly balanced state in the middle, neither up nor down. - yesterday's post

    Bipolar Index: -10 to +10
    Early in March, I began trying a new medication, which was destabilizing.

    Where I am on the bipolar index has a component of gestalt feeling, but it does decompose into components. Prototypical mania is elevated mood, inability to sleep, agitation, decreased anxiety, and heightened motivation. Depression is the converse.

    Yet, states with some symptoms and not others are the norm. Consequently, my logging habits grew from the initial Bipolar Index to the rest in order to capture things fully.

    (I should perhaps write a post about the introspective epistemic challenges of bipolar disorder. Is my low mood because of unfortunate actual events, or an artefact of a non-epistemic brain state? Bipolar is a disorder where the mapping between external events and internal emotions is a moving target.)

    Mood (Affective Valence): -10 to +10

    The Mood scale ranges from -10 to +10. Ideally, my mood would be +5 most of the time, with appropriate deviations in response to good and bad events. I have recently decided that, canonically, my Mood metric is the affective state of how I feel. If you're a person who feels good after having a drink or two, that's the dimension of feeling good (or bad) that I'm talking about. It's not quite a feeling in my body; it's more like a "feeling in my mind".

    And yet, sometimes I feel shitty in brain and body, but still feel good about things. There's a mood dimension that is more cognitive, more predictive, and more anticipatory about the future. I think Outlook is a plausible label for it[2]. It captures how I feel about things – are things going well or poorly at the moment? Am I satisfied or dissatisfied?

    The correlation between mood as felt state and mood as outlook is high, but not perfect. Often, hope is what teases them apart: I've slept poorly and feel shitty, but something is on my mind that gives me hope for improvement. Outlook can be good while feeling bad.

    If I were willing to double my daily metric load, I'd separate these two facets of mood.

    In fact, the split between cognitive state and affective felt state runs throughout the metrics. Exhibit B: Stress.

    Physiological Stress: 0 to +10

    When I log stress, I'm thinking about physiological stress. It feels like a tightness in my chest or breathing – very bodily. Scored 0 to +10. Ideal average is 0-2, actual average is 3-5. Stress is particularly frustrating to me in that my bodily felt stress typically feels higher than my "cognitive stress" assessment of how stressful my situation actually is.

    I could log cognitive stress assessment, but it's easy for me to derive from my general non-quantitative records of what's happening. Right now, I'm content to derive it from that during analysis, that is, when I sit down and compare the graphs with events, etc.

    Anxiety: 0 to +10

    Distinct from Stress is Anxiety. For me, this is a different set of bodily feelings than Stress. I can't easily describe them, but I know them. Something, something chest tightness vs a feeling of adrenaline radiating out. (I could imagine someone else labeling things differently.) Same as Stress, Anxiety has a cognitive/predictive component. For me, that often takes some form of Insecurity: am I good enough? Am I adequate? These are thoughts typically accompanied by some visceral feeling, but again, they come apart.

    % Chance of Falling Asleep (0-100%) & "Somnolence" (0 to +10)

    God. I haven't carefully categorized them, but there are at least five distinct states of tiredness, sleepiness, sleep deprivation, sedation, grogginess, and exhaustion.

    • The raw, healthy tiredness a person typically feels at the end of the day, sleep pressure building up as it should, in conjunction with your circadian rhythm.
    • The feeling of sleep deprivation that I get from being overly tired. Unlike normal tiredness, it's unpleasant and can make it harder to fall asleep.
    • The sedation of central nervous system depressants, such as sleeping pills and alcohol.
    • The exhaustion due to physical exertion.
    • [Bonus extra fun weeeee] The fatigue that accompanies bipolar down-states (and I assume regular depression too).

    Some of these states feel like they're in my head, some in my body. I can feel like my mind is alert but my body is sleepy, and vice versa.

    Bipolar fatigue sucks. I can feel like I'm well-rested on some dimension, but my brain doesn't want to work. Napping wouldn't actually help because I'm not tired in that way, and I'd expect to have trouble falling asleep in any case.

    I'm not enthused by the idea of logging each of these kinds of "tiredness" twice daily. The existing batch of eight takes 2-10 minutes each time, and each metric does take a moment of introspection. Though I do separately describe the dominant feeling qualitatively for my logs, so the info is there; I just can't plot it.

    My attempted compression of these multiple sleep dimensions is Somnolence and % Chance of falling asleep. I started with Somnolence as a general sense of tiredness, but quickly noticed Somnolence is inadequate for recording key states around insomnia and Bipolar state.

    A thing that will happen to me sometimes is that I am extremely tired and somnolent, but am unable to sleep due to physiological stress[3]. Tired and wired, as they say. In practice, my actual percentage chance of falling asleep is the net effect of Somnolence and Stress in combination.

    For now, I log the above two sleep metrics.

    Oh! But even % chance of falling asleep is wanting when it comes to the insomnia story! I've noticed that I can both predict that I'll fall asleep and also that I'll not stay asleep – onset insomnia vs maintenance insomnia. The latter is likely if Stress and Somnolence are both high. (A bit of sleep relieves sleep pressure, and then Stress reasserts itself.)

    Motivation (aka Initiation/Volition): 0 to +10

    Ah, Motivation. Such a funny mental variable. Years ago, I observed that in a Bipolar down-state, I could be adequately rested such that tiredness was not the problem, but still find it enormously effortful to do things. I'd sit on the couch, desire milk from the fridge, but getting up and walking across the room would feel enormously effortful.

    Low motivation is like your mind going in the opposite direction from the direction it goes when you take a stimulant like coffee or Adderall.

    I score motivation 0 to +10, with 5 to 7 being pretty ideal. Above that would be due to mania (or maybe Adderall, which I have experimented with but now avoid).

    I really hate low Motivation as a symptom. It feels distinctly "brain chemistry" and not tied to my explicit beliefs about the return and reward on actions[4].

    The interplay of these mental states can make them and their sources hard to track. I'm primarily interested in Motivation as a symptom of abnormal brain state, e.g., owing to a Bipolar state or medication-induced state. Yet if I'm tired, I'll feel low Motivation for that simple old boring reason.

    In general, tiredness (of which I have no shortage due to frequent insomnia) is a difficult confound for tracking my Bipolar state. Sleep deprivation makes me irritable, anxious, and stressed. It doesn't mean I've hit a Bipolar down state.

    I've also realized that the Bipolar Index is wanting for capturing Bipolar state. First, I've found that often I'm really not sure whether I'm a little bit up or a little bit down, so I'll log -1 to +1, which averages to 0, but the state is distinctly not 0.

    Second, there's a dimension of Bipolar Instability that I can feel, which is different from where I am on the index. Kind of like a derivative of the index, to invoke calculus. On occasion, I can feel that my mind is neither up nor down, but is sensitive and could easily be nudged in one direction or another. Conversely, I could be very stable at a -3 Bipolar down-state.

    To be honest, I find tracking my mental states a bit tedious and dull, and this post feels a bit dry. I can take some satisfaction that a lot of science happened because people took copious, detailed notes – Bacon, Brahe, Darwin, Faraday, Hooke, and others – and I'm taking part in that tradition.

    But that's not why I'm doing this, really.

    I'm doing it because there's so much fucking great stuff in life to do. So much value to be claimed. Very young, I realized I didn't want to get old and die because I wanted to try all the hobbies, read all the books, learn all the skills, have all the relationships, and so on. Not to mention it is perhaps the last decade when humans get to shape the trajectory of the cosmos, and I'd rather like to do more than less to make it turn out well.

    Time feels limited and precious. I'm fucking sick of losing time and enjoyment to sucky brain states. Hence, the self-science above.

    In this piece, I've described the measurements I take. In subsequent pieces, I'll talk more about what I'm comparing them against, namely: (a) the interventions I hope to improve outcomes, (b) attempts to figure out in greater mechanistic detail what's going wrong, as a clue to better interventions.

    Interventions such as new drugs, biofeedback training, vagus nerve toning, and circadian rhythm entrainment. Mechanistic investigations such as detailed genome analysis, cortisol level measurements, and tracking inflammatory cytokines throughout different points in my mental fluctuations.

    1. ^

      Short for Exobrain.

    2. ^

      I think I got this from Hardwiring Happiness, though I read it in 2014.

    3. ^

      A cruel reality I'm working on is that I get stressed out by insomnia. Thanks, brain.

    4. ^

      It is very much the case that Bipolar up-states bias predictions of success and reward upwards, and can drive feelings of Motivation very high.

    5. ^

      Lack of sleep, too much sleep, good news, bad news, stress, etc. It sucks to be vulnerable to too much good news as a destabilizer.






    Why do I believe preserving structure is enough?

    April 4, 2026 - 01:02

    There's a lot even our best neuroscientists don't know about the human brain. How can we have any reasonable hope for preservation given those unknowns? What if there are crucial memory mechanisms that are so poorly understood, we don't even know to check whether our methods preserve them? As it turns out, there's some interesting empirical evidence about the general shape, and limits, of those unknowns.

    In Ted Chiang's short story Exhalation, a race of aliens have brains which run on compressed air, performing computations and storing information in elaborate arrangements of hinged gold-foil leaves. The leaves are held in position by a constant stream of air flowing through the brain's tubules, encoding alien thoughts and memories. That ephemeral suspension pattern is the whole self—any alien whose supply of compressed air runs out is reduced to a catatonic state, all of their memories erased as the gold-foil leaves hang limply down. Even if air pressure is restored, the original information is lost for good. The person can never be recovered.

    If this were how brains worked in our world, I'd be working on a very different kind of preservation, such as longevity research or some kind of relativistic time-dilation bubble. I think we got lucky, though: when we look at electrical blackouts in the human brain, we observe something much more convenient:

    This image, from Broestl et al 2013, is an EEG of a patient's brain activity. The flat section in the middle is during 15 minutes of cardiac arrest. The patient fully recovered afterwards.

    The lady in the lake

    In 1999, a Swedish radiologist named Anna Bågenholm fell into a frozen lake while skiing and became trapped under an eight-inch-thick layer of ice. For forty minutes, she struggled to breathe from a trapped air pocket before finally losing consciousness. At that point, her breathing stopped, her heart stopped pumping blood, and her brain went dark as electrical activity ceased—not like the quiet of sleep or even a coma, but complete electrocerebral silence. And then it took nearly an hour after that before rescuers managed to pull her body out of the water.

    But this was not the end. Her rescuers airlifted her body to a hospital where—after two and a half hours with zero heartbeat—doctors attempted to carefully rewarm her. The operation took nine hours, but in the end, she survived. Even more remarkably, she made essentially a complete recovery, with no lasting brain damage save for the loss of some immediate short term memory, and no lingering problems save for some nerve damage in her hands and feet.

    So a person who fell into a frozen lake, spending an hour with zero vital signs and a core body temperature of 57 °F/13.7 °C, survived the experience. The mishap was a freak accident, but the astonishing fact that recovery is possible tells us something about how brains work. Bågenholm's case should already make us suspicious of any theory where—like the unfortunate gold-foil leaves in Chiang’s pneumatic aliens—the ephemeral live activity of the brain is load-bearing for memory and personal identity.  This situation looks like the sort of thing you'd expect to observe in a universe where brains can safely be turned off and back on again. Whatever consequences Bågenholm may have suffered from her accident, she certainly seemed to emerge with her memories, cognition, and personality intact.

    Using cold to save lives: DHCA

    How is such survival possible? Of course, at ordinary warm temperatures, we can only go a few minutes without oxygen before suffering lasting catastrophic damage—hence the debilitating consequences of heart attack and stroke. But cold-water survival, which has been documented since ancient times, is another story. It turns out that a warm, oxygen-starved brain quickly begins to damage itself. While you'd ideally like your brain to have all the oxygen it wants, the next best thing is to avoid trying to run it—just like you'd power off your phone if you spilled a glass of water on it. It turns out that cold temperatures (about 15-30°C) are very effective at powering down brains in this way.

    In fact, once you know the phenomenon exists, powering down brains turns out to be a useful technology—specifically for brain and heart surgeons whose operations depend on being able to work on a brain or heart while it is temporarily offline. The heart does not try to pump blood, the brain does not spark with electricity, and yet the body does not suffocate from the resulting lack of oxygen. Hence the technique of hypothermic circulatory arrest (HCA) was developed[1]. Before an operation, surgeons cool the body until circulation can be safely stopped, usually targeting 20-28°C (moderate hypothermic circulatory arrest, MHCA) or in some cases as low as 14-20°C (deep hypothermic circulatory arrest, DHCA). This extreme cooling buys a window of time in which all normal vital signs are suspended—heartbeat stops, breathing stops, the brain becomes quiet—and the delicate surgery can take place. After the procedure is complete, the patient is carefully, slowly warmed and resuscitated, and they return to everyday life.

    Hypothermic circulatory arrest provides cerebral protection during an extended period without oxygen or blood flow. For this reason, it has become the standard of care (Chau, 2013) for heart and brain operations since it was developed in the 1960s: for example, over 7,000 patients in the US underwent hypothermic circulatory arrest procedures between 2017 and 2021.

    So how do patients fare afterwards? Do they survive with their memories, cognition, and personalities intact?  In fact, in addition to the anecdotal experiences of patients and surgeons in the field, there’s plenty of literature evaluating the effects of DHCA on cognition. For example, Stecker et al. (2001) (Part II) survey 109 patients immediately after DHCA and find that 75% are aware, oriented, and neurologically normal. This doesn't seem bad, among a population of very ill and immediately post-operative people, several of whom suffered strokes before or during the procedure.

More to the point, Percy et al. (2009) studied people in high-cognitive professions who underwent DHCA. Included in the group were “physicians, lawyers, doctorates, clergymen, artists, musicians, accountants, and managers”. The researchers interviewed both patients and their close family members, asking what differences they noticed before and after the surgery. The researchers found “excellent preservation of cognitive function after surgery, according to both patient and informant responses,” arguing that “although subtle deficits after DHCA might hide in individuals with less intellectually demanding professions, it is unlikely that substantive deficits could remain undetected in our high cognitive needs group.”

    I still remember the first time I ever heard about DHCA: a brief digression during a TA session that was part of Sebastian Seung's Intro to Neuroscience class at MIT, 2009. I remember because learning about DHCA was literally life changing for me. I learned that people can be "shut down" by cold, that they don't have any appreciable brain activity in such a state, that this was being used in hospitals routinely for tricky heart surgeries! For me, DHCA was one of those things that, once you see it, even for a moment, your life can never be the same again. I left that TA session in a haze. I hope to share some of that excitement with you today.

    Electrocerebral silence

As a technical aside, I want to dive into the term electrocerebral silence—the electrical-blackout phenomenon observed in brains under hypothermic circulatory arrest. In cooled brains, electrical activity shuts down to the point that it’s undetectable on a standard EEG: unlike the gentle characteristic waveforms of an anesthetized or unconscious brain, electrocerebral silence looks like a total flatline (see Mizrahi et al. 1989). But the point isn’t the total absence of electricity. Brain cells, being bags of ions, may still occasionally emit tiny, sporadic sparks. The point is that they are totally disrupted in their ordinary electrical behaviors, unable to perform anything like normal synaptic computations (Volgushev 2000), and operating at levels so low they are invisible under EEG.


    Stecker et al 2001, Part I, Figure 3, “(D) precooling, (E) appearance of periodic complexes, (F) appearance of burst suppression, and (G) electrocerebral silence”. This EEG readout shows the progression of electrical activity in a brain as hypothermia is induced. The final image (G) shows electrocerebral silence—where potential has fallen below the EEG’s level of random noise, around 2–3 µV.

    Stecker et al. (2001) tried deliberately super-stimulating neurons in chilly hypothermic brains, inducing evoked potentials by stimulating the wrist using a current 10-50x larger than a normal nerve signal. They found that even these oversize pulses petered out before reaching the cortex, indicating that the signaling pathways through the deep brain had been disrupted. The neurons had lost their ability to transmit information.

    Cool them even further, and you can eventually knock out the ability of individual neurons to fire at all, even when artificially stimulated. The exact failure temperature varies by neuron, but averages around 12°C, and gets as low as 4°C (Girard and Bullier, 1989).  Notably, 4°C is a temperature from which humans have recovered (Zafren 2020).

    Girard and Bullier 1989, Figure 5A. Most neurons become incapable of firing around 12°C, even when artificially stimulated.

In short, I'd argue that in a person undergoing routine HCA, the occasional solitary neuron may send off sparks, but it’s clear that these chilled, oxygen-starved neurons are almost entirely silent, that they are unable to communicate with each other over long distances, and that the ordinary dynamics of electrical cascades in the brain—and whatever information those dynamics held—have been totally disrupted.

    Known unknowns

    When I look at the state of the evidence, I find it implausible that we live in the inconvenient world of Chiang's aliens. Instead, I seem to observe a world where the electrical cascades in the brain can be disrupted and zeroed out, but as long as the structure is intact, latent cognition remains intact. (For what it's worth, "memory is structural" is also the conventional view among neuroscientists.)

    This is why Nectome has put so much energy into preserving nanostructure in exquisite detail. There's a lot we don't know about the human brain, but whatever secrets it holds, the evidence points to them being stored in its intricate physical structures. We can't decipher them yet—but we can make sure the structure is right there, ready for the future.

    1. ^

   Charles Drew was one of the pioneers of HCA, and I'm sure, regardless of what's been written after the fact, that he had to fight to make the idea happen; progress often requires people to stand up and do the "obvious," often at significant personal expense, and for this he's one of my heroes.




    A Tale of Two Rigours

April 4, 2026 - 00:28

    A familiarity with the pre-rigor/post-rigor ontology might be helpful for reading this post.

    University math is often sold to students as imbuing in them the spirit of rigor and respect for iron-clad truth. The value in a real analysis course comes not from the specific results that it teaches — those are largely known to scientifically literate students by the time they take it. Instead, they are asked to relearn all those things from first principles; in so doing, they strip themselves of bad habits they previously learned and are inducted into the skeptical culture of the mathematician. Pedagogical and exam materials usually support this goal, putting emphasis on proof-writing, careful argumentation and attention to detail.

This incentivises the student to cultivate an invaluable attitude of healthy distrust towards their own world-models, which is not as trivial as it sounds. Many of my colleagues dropped out of their degree after learning in their first exams that they were more or less unable to argue why some "obvious" facts are true, or even to articulate what parts of the argument they were missing. A math undergraduate degree either teaches or selects for people who live in a fruitful, respectful relationship with the tenuousness of their own grasp of reality. Such skills are extremely useful and tend to generalise well to non-math domains.

Unfortunately, this philosophy of math pedagogy is somewhat at odds with the goals you might have in educating a cohort of researchers. Your role as an undergraduate is to scrutinise the material you are fed as if it were written by your worst enemy, learning a culture of aggressive dialectical deconstructivism. But it takes two to make a dialectic process. Any idea worth deconstructing comes from an inventive mind that advocates for it, even if only because shooting down that concept will itself be a generative process.

Undergraduate teaching culture tends to work against this kind of creativity precisely because it optimises for instilling the virtue of radical skepticism to the point of pedantry. To support the norm of exactness and respect for minutiae, classes frequently rely on concrete, exhaustive reference materials. Some courses even delimit the specific content that could be used for an exam, defined up to arbitrarily excruciating levels of detail. Moreover, the learner only rarely comes across exercises that make them engage with unsolved problems or open questions. Successful students are thus able to identify and efficiently digest the knowledge sectioned off as relevant for a course, but they are rarely given affordance to push or even peek past these bounds.

The blogpost by Terence Tao that I referenced in the introduction refers to this creative, research-taste-shaped stance towards mathematics as "post-rigorous". He highlights how "rigorous" and "post-rigorous" thinking should cohabit harmoniously inside one's mind. He moreover comments on how rigorous thinking can be misused to discard or demean intuitive reasoning, leading to a failure of the dialectic process. Tao diagnoses the same problem as I do but focuses his solutions on what an individual (likely a graduate student) can do to nurture their post-rigorous self. I would instead like to observe that this focus on individual solutions is indicative that math academia has no institutionalised plan for teaching research taste. Whereas radical skepticism is embedded throughout math education in both legible and hidden ways, the canonical way to teach students to develop their creative research abilities seems to involve pairing them with mentors (e.g. PhD supervisors) and hoping that something rubs off.

I am not even close to being the first person to recognise this issue. Imre Lakatos' "Proofs and Refutations" and Donald Knuth's "Surreal Numbers" are both attempts to accessibly communicate fundamental insights about post-rigor. Knuth in particular acknowledges in the postscript that his intended purpose in writing the book was to teach some mental motions needed for research mathematics. I'm sure there are many more wonderful published materials that I'm not aware of. However, I don’t see how these insights have meaningfully percolated into the design of institutionalised math education.




    God Mode is Boring: Musings on Interestingness

April 4, 2026 - 00:17

    (Crossposted from my Substack)

    There is a preference that I think most people have, but which is extremely underdescribed. It is underdescribed because it is not very legible. But I believe that once I point it out, you will be able to easily recognize it.

    In a sense, I am doing something sinful here. A real description of interestingness should probably be done through song, or dance, or poetry. But I lack every artistic talent that would do the job justice. What I can do is analyze systems and write prose. Hopefully at least the LLMs will appreciate it.

I am writing this with some anxiety. If it is a small sin to create an analytical post about interestingness, it is a cardinal sin to create a boring analytical post about interestingness. Interestingness is impossible to really cage within language, at least within the kind of precise analytical language I am using here.

    So what I am doing is attacking interestingness from multiple angles. If interestingness is an elephant, I am trying to be all the blind men at once. Each section views it from a different direction.

    Each angle is incomplete on its own. Together, I am trying to point at something I believe is real and important.

    1: The Redundant Conclusion

    Because what the world really needs is another take on the Repugnant Conclusion.

    In case you are not familiar: philosopher Derek Parfit proposed a thought experiment. Imagine World A, a smaller population where everyone lives an extremely high-quality life. Now imagine World Z, a vastly larger population where each person’s life is barely worth living. Maybe they experience slightly more pleasure than pain, but only just. Utilitarian logic seems to force us to prefer World Z, because the total utility is higher. More people times small positive utility beats fewer people times large positive utility.

    This conclusion feels disgusting to most people. Hence “repugnant.”
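To see the totals forcing the conclusion, here is a toy calculation. All numbers are illustrative stand-ins I've chosen, not Parfit's; the point is only that a vast population at tiny positive utility out-sums a small population at high utility:

```python
# Toy total-utility comparison. Population sizes and utility values
# are made up for illustration; only the ordering matters.
world_a = {"population": 10_000_000, "avg_utility": 90.0}  # few people, excellent lives
world_z = {"population": 10**13, "avg_utility": 0.01}      # vast population, lives barely worth living

total_a = world_a["population"] * world_a["avg_utility"]   # 9e8
total_z = world_z["population"] * world_z["avg_utility"]   # 1e11

# Total utilitarianism ranks World Z above World A.
assert total_z > total_a
```

Scale the population of World Z up enough and no average utility for World A, however high, can keep its total on top.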

    But I think Parfit is doing something misleading here, and I want to de-bucket it.

The issue goes deeper than low average utility. Parfit’s World Z is specifically described as boring. “Muzak and potatoes.” That phrase is doing a lot of work. It describes a world with low average utility and zero variance: the same mild pleasures and mild contentment, stretched across trillions of identical lives.

    Parfit has bundled two things together: low average utility and low interestingness. I want to separate them. My claim is that the repugnance comes from the monotony. The low average utility is secondary.

Let me offer a different thought experiment. Four worlds, arranged in a two-by-two grid.

    The Pod: One hundred thousand monks in deep meditation. They have all reached jhana state level 10, the highest form of meditative bliss. Their average utility is extremely high. But nothing happens. Nothing to tell a story about. Just bliss, forever.

    Pala: Aldous Huxley’s Island, his final novel, the utopia he spent his whole career building toward. A small island society where people are healthy, educated, psychologically whole. They have art, psychedelic ceremonies, tantric practices, a philosophy that blends Western science with Eastern wisdom, rock climbing as spiritual discipline, and birds trained to say “Attention!” to keep people present. Everyone’s needs are met, suffering is minimal, and the population is small. But unlike the Pod, Pala is alive. People there have relationships, growth, and culture. High utility, high interestingness.

    Muzak & Potatoes: Parfit’s World Z. Trillions of people, each life barely worth living.

    Galactic Westeros: Trillions of people spread across a galaxy-spanning civilization. Think Game of Thrones scaled up a million times. Complex politics, great houses competing for power, intrigue, betrayal, love, war. Rich culture, deep history, beautiful art born from struggle. But also slaves, misery, suffering. A lot of people in this world are not having a good time. If you average all the hedonic utility across all those lives, you get something close to a very small positive value.

    Parfit’s World Z has everyone at a slightly positive value: uniform, identical lives. Galactic Westeros keeps the average around slightly positive but introduces huge variance. Some people are having wonderful lives. Some are suffering terribly. This is not exactly the same setup, and the Repugnant Conclusion does not really cover variance. Maybe some people would be more disgusted by a world where extreme suffering exists than by a world where it does not. But I think for most people, Galactic Westeros would still be more attractive than the Pod.

    Now, the Repugnant Conclusion asks us to compare the top-right with the bottom-left: Pala against Muzak & Potatoes. And yes, most people find it repugnant to prefer Muzak & Potatoes.

    But compare The Pod to Galactic Westeros. One is high utility but boring. The other is low utility but interesting. My claim is that most people would prefer Galactic Westeros to exist over the Pod. They might choose against living there themselves, but they would prefer it to exist.

    What makes the Repugnant Conclusion repugnant has less to do with average utility than with interestingness. Parfit’s World Z is repugnant because it is boring.

    In order to prevent misunderstanding of my tribal allegiance, I should say: I actually love utilitarianism. The part I love is the democratic core. Every conscious being’s experience matters equally, weighted by its capacity to experience. There is beautiful justice in it.

    But hedonistic utilitarianism is incomplete. There are preferences that matter which are not captured by pleasure and pain. Interestingness is one of them.

    You might say: “Okay, so use preference utilitarianism instead. People prefer interesting lives, so just include that preference in the calculus.”

    I am not sure that works. The problem is that interestingness is not very legible as a preference. It is liquid, slippery. People often do not know what will be interesting to them until they encounter it. You cannot easily plan for interestingness. It resists the kind of explicit articulation that preference utilitarianism requires.

    2: The Tao of Interestingness

    The interestingness that can be described is not the true interestingness.

    That said, let me try anyway. I think music is a good place to start. Music is basically patterned sound over time. It has repetition and surprise, order and chaos, but never fully in either direction. And the different ways it can be interesting are a good map of the different ways anything can be interesting.

    Complexity and Simplicity

    A nursery rhyme is simple. You can predict the whole thing after the first few bars. Pleasant, maybe, and that is about it. On the other end, the sound of a dial-up modem is complex - lots of information, lots of variation - and equally boring. Just noise.

    The interesting zone is somewhere in between, where there is enough pattern for you to follow along but enough variation that you do not already know what comes next.
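One crude way to make this zone visible is Shannon entropy per symbol: a fully repetitive sequence scores low, pure noise scores maximal, and patterned-but-varied material lands in between. A minimal sketch, where the three strings are my own toy stand-ins for a nursery rhyme, modem noise, and music (entropy is at best a rough proxy for interestingness, not a definition of it):

```python
import math
from collections import Counter

def entropy_per_symbol(seq):
    """Shannon entropy (bits per symbol) of the empirical symbol distribution."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

nursery = "ABABABABABABABAB"  # fully repetitive: predictable after a few symbols
noise   = "QXZJKVWPYGHRTMNL"  # every symbol distinct: maximal entropy, no pattern
music   = "ABACABADABACABAE"  # mostly patterned, with occasional surprises

# The "musical" sequence sits between the two extremes.
assert entropy_per_symbol(nursery) < entropy_per_symbol(music) < entropy_per_symbol(noise)
```

Note that entropy alone keeps rising all the way to noise; the essay's claim is precisely that interestingness peaks somewhere in the middle of that scale rather than at either end.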

    Predictability and Surprise

    This overlaps with complexity, though they are different axes. Something can be simple and still unpredictable. Something can be complex and still completely formulaic.

    What you want in music is the ability to sort of predict where things are going, while still being surprised sometimes. That gap between expectation and reality is what makes it compelling.

    In Radiohead’s “Creep,” there is a B major chord that does not belong in the key the song is in. It sounds jarring - Jonny Greenwood plays it with this crunching, deliberately harsh strum right before the chorus. That wrongness is the emotional engine of the song. It works because the rest of the progression sets up an expectation that it violates.

    Aesthetic Coherence and Contradiction

    There is another dimension that is separate from both the complexity and surprise axes. Call it coherence.

    Most interesting music has a kind of internal logic. The parts belong together. Gangster rap, for example, has a very specific aesthetic: heavy beats, aggressive delivery, narratives that discuss crime and the hard life, a certain attitude. When those elements work together, you get something coherent and recognizable.

    But sometimes you can take two completely different aesthetics, smash them together, and the result is interesting precisely because of the distance between them.

    So the game isn’t only about internal coherence. It also allows more sophisticated meta-level play between different aesthetics, and exploration of the contradictions between them.

    Dynamism

    Music genres do not stay still. They are born, they grow, they become stale, and they die - or at least, they stop being the living edge and turn into something preserved.

    Metal is a good example. It started as one thing in the late 60s and 70s - Black Sabbath, heavy riffs, dark themes. Then it kept splitting. Thrash metal was a reaction to traditional metal becoming too slow and predictable. Death metal pushed further - heavier, faster, more extreme. Black metal went in a completely different direction: lo-fi production, atmosphere over technique. Doom metal slowed everything back down. Prog metal added complexity. Each new subgenre was, in some sense, a response to the previous one becoming too familiar.

    The life cycle of a music genre - birth, growth, peak, stagnation, reinvention or death - mirrors life.

    Pluralism

    And then there is the sheer number of genres. Thousands of them. Thousands of different ways humans have found to organize sound into something that means something.

    That is much more interesting than a world where the only music is Muzak. Even if the Muzak were pleasant, even if it were well-produced, a world with only one kind of music is dystopian.

    Interestingness needs the existence of jazz and black metal and techno and qawwali and Gregorian chant and hyperpop and mournful folk songs. It needs things that do not reduce to each other.

    Music makes all of this unusually visible. But it is only one instance of the thing. The same shape - complexity, surprise, coherence, dynamism, pluralism - shows up everywhere. And sometimes the easiest way to understand it is through its opposite: boredom.

    3: On Boredom

    What makes things boring?

    God Mode

    Pretty much every person who played video games as a teenager, at some point, entered cheat codes. In shooters, there’s the code that makes you invincible. In tycoons and city builders, there’s the code that gives you unlimited money. Both sound really fun on paper.

    But anyone who’s actually tried it knows: this is one of the surest ways to destroy all joy in a game. As soon as you have god mode, the game loses its challenge. It doesn’t matter what you do, you’re going to win anyway.

    Having endless power is actually quite boring.

    Speedrunning and Murder Hobos

    Here are two related concepts from gaming that point at the same shape.

    A speedrunner plays a game with one goal: finish as fast as possible. They exploit glitches, skip cutscenes, ignore side quests, and reduce a rich 40-hour RPG into a 20-minute sequence of precise inputs. A murder hobo is a tabletop RPG player who ignores the story, the NPCs, and the worldbuilding, and focuses only on killing things and collecting loot. Both are playing a game by optimizing for a single dimension.

    There’s a beauty in speedrunning. Watching someone execute a perfectly optimized route can be an impressive display of mastery. And there’s a certain primal satisfaction in the murder hobo approach. But if you only play this way, the game becomes less interesting. You’re taking something rich and flattening it. Pure optimization toward a single KPI kills pluralism and drains the experience of interestingness.

    Solved Games

    Worse than speedrunning is the solved game. A solved game is one where the mathematically optimal strategy is known. Tic-tac-toe is solved: with perfect play, every game ends in a draw.

    Even if you’re winning all the time, a solved game loses its charm. You’re not really playing anymore. You’re just executing a strategy. The mystery is gone, and so is the interestingness.
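Tic-tac-toe is small enough that a few lines of minimax can solve it outright and confirm the perfect-play-is-a-draw result. A self-contained sketch (not tuned for speed; memoization keeps the search tiny):

```python
from functools import lru_cache

# All eight winning lines on a 3x3 board indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value for X under perfect play: +1 X wins, -1 O wins, 0 draw."""
    w = winner(board)
    if w == "X":
        return 1
    if w == "O":
        return -1
    if "." not in board:
        return 0
    moves = [value(board[:i] + player + board[i + 1:], "O" if player == "X" else "X")
             for i, s in enumerate(board) if s == "."]
    return max(moves) if player == "X" else min(moves)

# Solving from the empty board: with perfect play, every game is a draw.
assert value("." * 9, "X") == 0
```

Once `value` is computed, "playing" just means picking the move that achieves it, which is exactly the charmless strategy-execution the section describes.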

    Slop

    Take chicken, sugar, and olive oil. Put them in a blender. What you get is, technically, a nutritionally complete meal. It has protein, fat, and carbs.

    Most people would find it disgusting.

    When we eat food, we want more than nutrition. We want spices, textures, presentation, variance, surprise. Eating slop feels miserable and boring, even if it is nutritionally identical to a well-prepared meal.

    The obvious connection to AI slop is left as an exercise for the reader.

    Monotony

    People who speak in a flat, monotone voice are boring to listen to. We want variance in tone, rhythm, emphasis. We want playfulness.

    Most people find doing nothing boring. Just sitting in a room with no activities. Or watching a screen of pure white noise, input with no patterns.

    There is a human instinct that runs away from monotony. We seek patterns, but we also seek variation within patterns.

    4: Alan Watts, The Philosopher of Interestingness

    If John Stuart Mill is the philosopher of utilitarianism, Foucault the philosopher of power, and Schopenhauer the philosopher of pessimism, then the person I would nominate as the philosopher of interestingness is Alan Watts. The fact that he described himself as a “philosophical entertainer” rather than a philosopher only makes him more perfect for the role.

    Alan Watts is, for me, what interestingness looks like as a person.

    He was an S-tier orator who spoke about some of the most important and interesting topics in existence. And beyond his skill as a speaker, Watts himself was an interesting character, full of contradictions.

    He had a certain Anglo seriousness about him. The man was an ordained priest. But he was also playful, gregarious, and very much enjoyed the pleasures of the flesh. Philosophical and insightful, sure - but also an alcoholic, and, let’s put it this way, not the world’s best father. He had his share of issues with faithfulness. But compared to his stature and fame, he never got caught doing anything truly monstrous. He was perfectly morally gray, which made him even more compelling.

    Many of his insights were, in effect, about interestingness, even if he never called it that. One of his most famous passages connects directly:

    Watts asks the reader to imagine that every night, in dreams, you could experience anything you wanted. At first you would obviously choose pure wish fulfillment. Every pleasure, every fantasy, every delight, total control. But after enough nights of that, he suggests, you would want a surprise. You would want something not fully under your control. Something risky. Something that could actually happen to you. And eventually, if you kept dialing up the difficulty and uncertainty, you would arrive at this life, the life you are actually living today.

    This connects directly to the god mode metaphor. You might actually want to experience states that are unpleasant, difficult, or frightening, simply because they make the game worth playing.

    And we already do this. People watch horror movies. Ride roller coasters. Fast for days just to see what it feels like. Run ultramarathons. Climb mountains that might kill them. Many even volunteer for war.

    Rich experience includes pain. A life of pure pleasure, extended long enough, starts to look eerily similar to a life of nothing at all.

    Watts pushes this further. In another passage, he frames existence itself as a cosmic game of hide-and-seek. God, having no one outside himself to play with, hides from himself by becoming all of us: people, animals, plants, rocks, stars. The game works only because the forgetting is real enough. God does not want to find himself too quickly, because that would spoil the fun.

    But there is a problem with this framework.

    5: The Problem of Suffering

Osho, another interesting and contradictory figure (who went somewhat viral on X thanks to his hilarious criticism of democracy), once criticized Watts’ framework directly. In a lecture titled God: The Phantom Fuehrer, he raised several objections.

    First, the consent problem: you were not asked if you wanted to be created. You were not asked what instincts you wanted, what vulnerabilities you wanted, what kind of life you wanted. If God is playing a game, he seems to be playing without your consent. Osho calls this “totalitarian, absolutely dictatorial,” like some magnified Adolf Hitler or Joseph Stalin.

    Second, the boredom objection turned back on God. If this cosmic game has been going on eternally, same types of people, same love affairs, same wheel turning round and round, wouldn’t even God get bored? Osho’s line is that it begins to seem as if we are in the hands of a mad God.

    Third, and most important for our purposes, the problem of suffering. If existence is just divine play, lila, why does it involve so much misery, anguish, and agony? This is where Dostoevsky’s Ivan Karamazov feels relevant: “I want to return my ticket.”

    Now, Watts’ framework does have a response to these objections. It relies on open individualism, the view that we are all, ultimately, one consciousness. Under open individualism, you did consent, because the Godhead that consented is you. The Godhead is all the characters: the sufferer and the one causing suffering, the rapist and the victim. The suffering itself is just another experience that the unified consciousness is having.

    But what if it’s wrong?

    Brave New World

    When I was in my early twenties, I read Aldous Huxley’s Brave New World for the first time. It was not a good period in my life. I felt lonely. Things were not going well.

    And when I read this book, a book that is supposed to be a dystopia, I felt a strange sense of optimism. The world Huxley described seemed... nice? A world where all your needs are met, where suffering has been engineered away, where everyone is content. For someone in a bad situation, that sounds like a pretty good deal.

    When you are suffering badly enough, all you want is for the suffering to stop. Interestingness takes the back seat. If someone is in a torture chamber, they are not interested in whether their torturer is using especially creative techniques. They want out. They want to sit in a comfortable chair and drink cocoa. They want boring.

    Interestingness is a luxury good.

    You can only really appreciate interestingness if you are not in a state of acute suffering.

    This is why Brave New World was a utopia to me but a dystopia to Huxley. Huxley was an aristocratic intellectual living a comfortable life. From that position, a world of complete order and contentment looks horrifying. From mine at the time, it looked like relief.

I think Huxley saw this clearly. He spent his career circling it. Brave New World is his portrait of a world that maximized comfort and killed interestingness. But decades later, he wrote Island (which we already discussed) - a utopia that looks nothing like Brave New World. Pala has suffering, challenge, spiritual struggle, real growth. Life there is good and interesting. Both of Huxley’s novels point toward something like the argument I am trying to make in this essay: comfort without interestingness is not a utopia, and a real utopia has to include both.

    The Inequality Problem

    Here is what happens if open individualism is wrong.

    You get a world where some people enjoy the interestingness while others supply the suffering that creates it. The tourists who visit slums for “poverty porn,” experiencing the texture and variety of extreme situations while not actually suffering themselves. Or readers who can enjoy All Quiet on the Western Front as a dramatic work of art without having to go through the hell of war themselves.

    That seems really unfair and quite evil.

    If we are not all one consciousness, then the trade-off between interestingness and suffering falls unevenly.

    Think of factory farming. Billions of animals living lives of pure suffering, generating cheap protein so that humans can enjoy varied and interesting cuisines. If those animals are conscious, and they probably are, then we have a system that produces interestingness for some beings at the direct expense of suffering for others.

    We do not know which metaphysics is correct. We do not know if open individualism is true. Given that uncertainty, I think we should adopt a precautionary principle: assume that we might be separate beings, that suffering might be real and uncompensated, and that the world might be unjust.

    Spice and Rot

    I want to make a claim that may sound counterintuitive: the optimal amount of suffering in one’s life is not zero.

    Some suffering adds depth to life. Call it spice. Going to the gym hurts, but it makes you stronger. Working hard on a startup is grueling, but it can be meaningful. Experiencing loss, grief, even temporary depression, these can make life richer, more textured. They add stakes.

    But there is another kind of suffering. Call it rot. This is suffering that serves no purpose and leaves nothing behind. Someone slips, becomes paralyzed, spends two years in a hospital, and dies alone, unknown, unmourned. Nothing good came of it. Nobody learned anything, nobody was even entertained. It is just negative, with no compensating value.

    Here is the counterintuitive part: even rot might be necessary.

    In order to have meaningful suffering, you need the possibility of meaningless suffering. If all suffering were meaningful, then “meaningful suffering” would just be “suffering.” The existence of rot is what makes spice possible as a category.

    Think of poker. Sometimes you get a terrible hand, just pure bad luck, nothing you can do. This makes the game more interesting. It creates the distinction between skill and luck, between good outcomes and bad ones. If every hand were equally playable, the game would lose some of its charm.

    And meaningless suffering creates the possibility of heroic narratives. Defeating malaria in Africa, for example, is a story of good versus evil, of humans fighting against pointless suffering. Pointless suffering is the clearest enemy to fight and overcome. It creates the possibility of a real good-versus-evil experience, rather than just two different tribes or aesthetics fighting each other.

    Against Gradients of Bliss

    David Pearce, a British philosopher and co-founder of the World Transhumanist Association, has proposed something called the Hedonistic Imperative: use biotechnology to eliminate all suffering from conscious life. All life. Reengineer the nervous system so that the hedonic spectrum shifts entirely into the positive range (Gradients of Bliss). You would still have variation, still have better and worse moments, but the floor would be above zero. Pain, anguish, rot - all gone. Basically turning every living being into Jo Cameron.

    From a utilitarian perspective, this is hard to argue against.

    But a world where suffering has been engineered out is a world where tragedy is impossible. Great literature of loss, gone. Overcoming, gone. The entire register of human experience that runs below neutral - the register that gave us the blues and Dostoevsky and the spirituals sung by enslaved people, the register that gives weight to almost every story worth telling - would be gone.

    The spice/rot distinction applies here. Pearce wants to eliminate all suffering, rot and spice alike. I think you can make a case for aggressively reducing the rot while preserving the possibility of spice. Removing the entire negative register is an amputation.

    The Precautionary Principle

    But here is the thing: even if some suffering adds interestingness, the world right now seems to have way too much of it.

    There is too much drudgery. Too much random pointlessness. Too much rot. If you drop the open individualism assumption, if you take seriously the possibility that we are separate beings and that suffering is real, then the amount of suffering in our world seems wildly disproportionate to the interestingness it generates.

    Child soldiers in Africa. People dying slowly from ALS or locked-in syndrome. Factory farming. The scale of suffering in the world is immense, and most of it is not generating compelling narratives for anyone.

    I do not want this essay to be read as a justification. I do not want privileged people to read this and think, “Oh good, suffering is fine because it makes the world interesting.” That would be monstrous.

    From a precautionary stance, the problem of suffering has not been solved. Interestingness does not justify it. We should still fight to reduce suffering wherever we can, even as we acknowledge that some amount of struggle and challenge might be valuable.

    The interestingness framework is no permission slip for cruelty. The current ratio is way off - too much extreme and horrible suffering for too little interestingness.

    6: The Cosmic Nerf

    If the universe is optimized for interestingness, we should expect to see mechanisms that prevent boring outcomes. And when you look closely, you do seem to see them, built in like balance patches in a video game.

    There are two main ways a universe could become boring: everything could be absorbed by one thing, or everything could be figured out. The universe seems to resist both.

    God Hates Singletons

    The Nod Parasite

    [SPOILER WARNING: If you haven’t read Adrian Tchaikovsky’s Children of Ruin, skip this subsection. It’s a wonderful book and you should read it unspoiled.]

    In Children of Ruin, there’s an organism called the Nod parasite. It’s a highly infectious life form that assimilates other living beings at a cellular level. Unlike a standard virus that simply destroys cells, the Nod parasite analyzes and perfectly catalogs the biological structure and memories of its host, encoding that information into its own genetic material. Once it infects a host, it effectively becomes that person or animal, retaining their personality, skills, and memories while adding them to a collective consciousness shared across all infected forms.

    Sounds like a superpower. But in the book the following happens: once the Nod parasite has absorbed everyone on a planet, it becomes a closed system. It can only replay the memories of its hosts. It can’t create anything genuinely new. The planet becomes, in a profound sense, boring, even to the creature itself.

    This is the singleton problem. Nick Bostrom introduced the concept: a single unified entity that controls everything. The modern version is self-replicating von Neumann probes building Dyson spheres across the galaxy, which build more von Neumann probes, until the entire universe is just one giant factory converting free energy into copies of itself.

    A singleton universe would be like the Pod from Section 1, but on a cosmic scale. Possibly no consciousness at all, just unconscious replicators doing their thing forever.

    Here’s a question worth taking seriously: why hasn’t this already happened on Earth? Many processes in the world run on positive feedback loops. If you have more power, it’s easier to get more power. You’d expect positive feedback loops to drive toward singletons, one entity absorbing everything else.

    But life on Earth is explosively diverse. Why?

    Degeneracy: The Winner’s Curse

    Consider Conor McGregor. He was once the most exciting fighter in the world: charismatic, skilled, hungry. Then he had the Mayweather fight, made hundreds of millions of dollars, and proceeded to become degenerate. Drugs, partying, splitting his focus. He went from one of the most admired people in the world to, well, a joke.

    This pattern repeats. Success breeds complacency.

    Think about it from an evolutionary psychology perspective. You’d expect degeneracy to be selected against. Beings who stopped investing in their offspring once they got successful, who spent resources on luxury instead of reproduction, should have gone extinct, replaced by beings who stayed hungry. But degeneracy persists. It seems to be a deep feature of human psychology.

    Empires do this too. The Roman Empire rotted from within. Its institutions became corrupt. Its hunger disappeared.

    There’s no obvious evolutionary or institutional reason for this. It almost looks like a balance patch, a mechanism that prevents any one entity from dominating forever.

    Marcus Aurelius is the exception that proves the rule. He was emperor of Rome at its height, the most powerful man in the world, and he remained disciplined, philosophical, focused. But he’s notable precisely because he’s rare. Most successful people are more like McGregor than Marcus Aurelius. Why?

    Distance

    Governance becomes much harder with distance. If something is far away, you can’t control it effectively.

    The Galapagos Islands developed unique species because they were isolated, far enough from the mainland that competition couldn’t reach them. The United States gained independence from Britain partly because there was an ocean between them. Mountain peoples throughout history have maintained independence because terrain creates distance. The Swiss. The Afghans. The Basques. Geography protects pluralism.

    Here’s a prediction: if the universe is optimized for interestingness, the speed of light will never be beaten.

    The speed of light creates cosmic distance. It makes it very hard for any singleton to control galaxies that are millions of light-years away. A universe with wormholes or FTL travel could collapse into a singleton much more easily.

    If I’m right, we should expect that no matter how advanced physics gets, lightspeed will remain an absolute barrier. The reason might have less to do with physical necessity than with the fact that the alternative would be too boring.

    The Universe Resists Being Solved

    A singleton absorbs everything. But there’s another way a universe could become boring: we could figure it all out. A solved universe is a dead universe, stripped of mystery. And the universe seems to resist this too.

    Quantum Randomness

    At the most fundamental level, reality is stochastic. You cannot predict with certainty what a particle will do. This isn’t just a limitation of our instruments. It seems to be built into the fabric of physics. There is irreducible randomness in the universe, which means you can never model it completely.

    Gödel’s Incompleteness

    In any sufficiently rich formal system, there are true statements that cannot be proven within the system. The space of what is true is larger than the space of what can be proven. Mathematics itself resists complete mapping.

    These aren’t bugs. They’re features. They keep the universe mysterious.

    Think about a rainbow. Before we understood optics, a rainbow was magical, full of stories about treasures and bridges to other worlds. Now we know it’s just light refracting through water droplets at specific angles. It’s an elegant and true explanation, but we pay a price by losing the possibilities the mystery creates. A fully explained universe would be a boring universe.

    The Dungeon Master’s Toolkit

    If we take the interestingness lens seriously, many ancient questions that philosophers and religious thinkers have been grappling with may be answered in compelling ways.

    Start with free will. In Kabbalah, there is a concept called Tzimtzum - God voluntarily contracts, withdraws, limits himself to make room for creation. Why would an omnipotent being do that? Think about an ant farm. If you could predict exactly where every ant would go, it would be an extremely boring ant farm. The interest comes from the ants surprising you. Free will is God’s voluntary nerf. By giving humans genuine choice, God gives up the ability to predict everything, and purchases interestingness in exchange.

    Then there is the question of why God is hidden. Think about wildlife photographers. They hide from animals because they want the animals to behave naturally. If the photographer reveals themselves, the animal changes behavior. If God revealed himself definitively, if God appeared and said, “I exist, and to be virtuous you must do A, B, C, and D,” the game would become speedrunning: optimize for God’s stated criteria.

    And this connects to a harder point about the limits of science. [Spoiler warning for The Three-Body Problem.] In Liu Cixin’s The Three-Body Problem, the Trisolarans send “sophons,” proton-sized supercomputers, to disrupt particle physics experiments. No scientist understands what the hell is happening or why physics seems to stop working, until the Trisolarans themselves reveal that this is exactly what they did in order to slow humanity’s scientific progress.

    The scientific method is useless against an adversary who is smarter than you and does not want to be found. If a being is sufficiently more intelligent than you and desires to stay hidden, you cannot discover it. So “the scientific method does not show God exists, therefore God does not exist” is not valid reasoning. The scientific method does not work against superior beings who choose to hide. 1

    Then there is a third question: why does evil exist, and why does it so often succeed? Or as Jeremiah asks: “Why does the way of the wicked prosper?”

    The interestingness lens suggests an obvious answer: evil creates a kind of compelling narrative that suffering alone does not. Disease, earthquakes, and random tragedy can create pain, but they are not enemies. Evil creates antagonists. It creates agents with goals opposed to yours, intelligence working against you, schemes that must be answered rather than merely endured. That gives the world drama, rivalry, and moral tension in a way that brute suffering does not.

    And if evil automatically lost, the game would become predictable. If every virtuous person reliably won and every evil person got punished on schedule, morality would become a kind of speedrun. The world would be too legible. There would be less courage, less uncertainty, less need for faith. Evil has to be allowed some real chance of success, otherwise it stops being a real rival and becomes just another stage prop.

    7: Conclusion & The Anti-Inductiveness Constraint

    This post makes some fairly radical claims, and it deserves strong scrutiny and counterarguments. The person I would nominate as the best critic of this post is the anti-Alan Watts, Daniel Dennett.

    Unfortunately Dennett is dead. And even if he weren’t dead, he would almost certainly have better things to do than respond to a blog post about interestingness.

    Fortunately, Dennett appears to be unusually simulable.

    There was an actual attempt to train a language model on Dennett’s corpus and see how well it could imitate him. Apparently it did pretty well. Experts had a surprisingly hard time telling the simulated Dennett from the real one.

    So I asked ChatGPT to read the blog post and give me the best impersonation of Daniel Dennett while taking down my post.

    You are tempted, throughout this essay, by a very old philosophical mistake: taking a perfectly real feature of human psychology and promoting it into a deep feature of the universe. “Interestingness” is not a fundamental property of reality. It is a label for what certain kinds of information-hungry, pattern-seeking, easily bored creatures like us tend to value. That is important, but it is not metaphysics. It is cognitive anthropology with poetic ambitions.

    The problem with your use of Parfit is not that you notice something missing from crude hedonism. You are quite right about that. The problem is the leap from “utility is incomplete” to “therefore interestingness names an irreducible dimension of value.” Much more likely, what you are tracking is a whole bundle of evolved preferences: novelty-seeking, narrative appetite, status competition, curiosity, play, and the need for manageable surprise. You have not discovered a new moral primitive. You have redescribed several old ones under a flattering banner.

    And once you start suggesting that the universe itself may be “optimized for interestingness,” the view slides from suggestive to unserious. The speed of light is not there to prevent cosmic boredom. Quantum indeterminacy is not a dramaturgical device. These are not explanations. They are imaginative projections of human taste onto the fabric of reality. A good philosopher’s first duty here is not to be enchanted by the metaphor.

    Your discussion of suffering is where the danger becomes clearest. It is one thing to observe that human beings can sometimes transmute hardship into meaning. It is another thing altogether to imply that suffering earns its keep by making life more interesting. That is exactly the sort of aestheticized moral thinking one should distrust. The universe does not owe us compelling stories, and the victims of history are not raw material for cosmic dramaturgy.

    So yes: boredom matters, curiosity matters, richness of experience matters. But none of that gives us reason to think “interestingness” is the secret telos of existence. It gives us reason to think that minds like ours flourish in worlds with variety, challenge, and surprise. That is already plenty. Do not inflate it into theology.

    That is a pretty good critique. The Dennettian story can probably explain most of the object-level phenomena in this essay.

    There are places where I remain less satisfied. The degeneracy pattern, in particular, still seems underexplained to me. From an evolutionary perspective, you might expect success to select for more effective self-maintenance, not complacency and decadence. Maybe there is a story here and I just don’t know it. But this is one place where the Darwinian-atheistic account feels a bit too glib.

    But I don’t mind conceding most of the ground to Dennett for an important reason: interestingness seems to be anti-inductive.

    A game optimized too directly for fun stops being fun. A story optimized too directly for emotional impact becomes superficial. Once everyone starts speedrunning the reward function, something important dies.

    If the world were obviously optimized for interestingness, if the Dungeon Master stepped out from behind the screen and said, “Yes, correct, this is all a giant machine for generating narrative tension, surprise, and meaningful variation,” the game would immediately become less interesting.

    The players would start optimizing for the engagement KPIs. The whole thing would begin to unravel. A world can become narratively exhausted, over-legible. The player stops inhabiting a world and starts inhabiting a mechanism.

    That is why, if interestingness matters, we should expect there to be plausible rival explanations for everything I have said in this essay. We should expect atheist stories, Darwinian stories, disenchanted stories, reductionist stories. Partly because they might be true. But also because a world without such stories would be too transparent about its own machinery.

    The Dennettian account is not the enemy of this essay but a part of the condition that lets the thing work. If interestingness is real, it cannot be allowed to become too obvious. It has to remain deniable enough that people can go on inhabiting the world rather than merely reverse-engineering it. A movie is more compelling when you forget it’s actually only a movie.

    Nietzsche built a worldview around power. Utilitarians build one from pleasure and pain. I am trying to add a lens alongside theirs. And if the interestingness lens is worth anything, it should be one perspective among several, one more way of seeing rather than a master theory that swallows the others.

    If interestingness is real, it may have to arrive wearing a mask. It may have to permit its own deflation. It may even have to generate irritating philosophers who explain why it is not there. Because the world is more interesting if there is always a plausible story according to which interestingness is not fundamental at all. And Daniel Dennett, God rest his beautifully exasperating soul, is one of the reasons it stays that way.


