LessWrong.com News

A community blog devoted to refining the art of rationality

Observe, babble, and prune

Published on November 6, 2020 9:45 PM GMT

Part 1: A demonstration of deductive logic

If Violet is a mathematician, then she is smart.

If Violet is not smart, then she is not a mathematician.

If Violet is not a mathematician, then she is not smart.

If Violet is smart, then she is a mathematician.

This little poem illustrates a basic set of logical forms. In order, they are the statement, its contrapositive, the inverse, and the converse.

There are three key insights here.

1. Any statement implies the contrapositive, and the contrapositive implies the statement. They go together - if one holds, then the other is guaranteed to hold. They are saying the exact same thing.
2. A statement does not imply the inverse or the converse, as you can see in the poem.
3. The converse is the contrapositive of the inverse. That means we can call "If Violet is not a mathematician, then she is not smart" the statement, which makes "If Violet is smart, then she is a mathematician" the contrapositive. Of course, just because we've changed the "roles" of these statements doesn't mean they've become true! We've simply made a new, unsupported statement. But if it were true - if we lived in a heaven/hell where all smart people were mathematicians - then these statements would "go together" and mean the exact same thing. And our original statement and contrapositive would become the inverse and converse, respectively. (Yes, this means that the inverse implies the converse!)

Once you know this, you can use this "logic machine" as a tool to restate informal arguments in multiple ways in order to understand them better, find the weak points, check for consistency, and see if they accord with your understanding of how the world works.
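The three insights above can be checked mechanically. Here's a minimal Python sketch of the "logic machine," enumerating every truth assignment for the Violet example:

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q

for p, q in product([True, False], repeat=2):
    statement      = implies(p, q)          # if mathematician, then smart
    contrapositive = implies(not q, not p)  # if not smart, then not a mathematician
    inverse        = implies(not p, not q)  # if not a mathematician, then not smart
    converse       = implies(q, p)          # if smart, then a mathematician

    # Insight 1: the statement and its contrapositive always agree.
    assert statement == contrapositive
    # Insight 3: the converse is the contrapositive of the inverse,
    # so those two always agree with each other as well.
    assert inverse == converse

# Insight 2: the statement does not imply the inverse (or converse).
# Witness: Violet is not a mathematician (p=False) but is smart (q=True) -
# the statement holds while the inverse fails.
assert implies(False, True) and not implies(True, False)
```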

It's only fair to make a victim of my own writing. In Let the AI teach you how to flirt, I wrote the following statement:

If you can get your partner to engage in their own natural flirting style, and get good at detecting it, then you can guess their intentions with much more confidence than the average person is capable of.

If somebody else wrote this post and I wanted to make sure I understood it, I might put it through the logic machine, being a bit generous about the word choice as I modify the statement into the contrapositive, inverse, and converse.

Statement: If you can get your partner to engage in their own natural flirting style, and get good at detecting it, then you can guess their intentions with much more confidence than the average person is capable of.

Contrapositive: If you can't guess your partner's intentions with much more confidence than the average person is capable of, then either you're not engaging them in their own natural flirting style, or you're not good at detecting it.

NOT IMPLIED:

Inverse: If either you're not engaging your partner in their own natural flirting style, or you're not good at detecting it, then you cannot guess their intentions with much more confidence than the average person is capable of.

Converse: If you can guess your partner's intentions with much more confidence than the average person is capable of, then you can get them to engage in their own natural flirting style, and are good at detecting it.

What I notice on a phenomenological level is that after rewriting this statement in the contrapositive, inverse, and converse, I lose track of the point. When I wrote the original article, I had a visual, intuitive model foremost in my mind. It was based on my own memories, as well as the article the blog post was inspired by.

After converting the argument into a logical format, I have a very hard time getting back on the horse. It's hard to see how the statement I picked out leads to the next one, or follows from the previous one.

This is because my article is based on inductive reasoning, rather than deductive logic. You can't symbolically compute the next statement from the previous one.

That seems like a problem. Surprisingly, it's almost like deductive logic has temporarily killed my ability to figure out how to flirt...

Well, maybe that's not soooo surprising.

It seems like it would be useful to have a way to tell whether engaging your fast, inductive reasoning mode or your slow, logical mode would be more useful in a given situation.

Part 2: Deductive logic saves us from misfires of intuition

Recently, I had a fit of anxiety in applying to graduate schools. My anxious thought went something like this:

I don't know exactly what I want to research in grad school. But I know that what I can research effectively will depend a lot on the faculty's expertise. What if I get forced into a topic that isn't actually the most effective thing to research? What if the foundation in that ineffective research topic that I build in my MS program then forces me to continue pursuing it at the PhD level? What if I'm jumping into a program that will determine my future career path? What if I need to figure out the most effective thing to research before applying, but I can't figure that out until after getting into grad school? Augh!

No worries if you didn't quite follow - it was an anxious headspace.

I found the application of deductive logic here to be very helpful. Shoehorning these thoughts into an "if... then" format and shifting them around to the contrapositive, inverse, and converse forms forced me to slow down and consider carefully the statement I was making, wording definitions with care. It went something like this:

The outcome of my MS depends on two things: the program, and my work. The outcome of my MS determines what I can attempt to accomplish in my next role. Therefore, if the program determines my work, then it determines my next role in school or work. If the programs I enter determine my work at every stage of my working life, then choosing my MS program determines what I can attempt to accomplish for the rest of my life.

Statement: If the programs I enter determine my work at every stage of my working life, then choosing my MS program determines what I can attempt to accomplish for the rest of my life.

Contrapositive: If choosing my MS program does not determine what I can attempt to accomplish for the rest of my life, then the programs I enter do not determine my work at every stage of my working life.

NOT IMPLIED:

Inverse: If the programs I enter do not determine my work at every stage of my working life, then choosing my MS program does not determine what I can attempt to accomplish for the rest of my life.

Converse: If choosing my MS program determines what I can attempt to accomplish for the rest of my life, then the programs I enter determine my work at every stage of my working life.

Having my original anxious thought forced into this logical format allowed me to stand apart from it and gave me a gut check on whether the statement or the inverse felt more true. The latter felt much more true than the original statement. Of course, my choice of program is influential, but there must be a meaningful degree of freedom within that. From there, I was able to proceed with my planning in a much calmer and more constructive manner.

Part 3: Synthesizing babble and prune

For those familiar with the framework, deductive logic seems to be a form of pruning, while my anxious stream of thought was babble. Sometimes, as in the case of my flirtation article, it is valuable to allow ourselves to babble more and prune less. In the case of my graduate school planning, I needed to prune. If we do one when we ought to do the other, it'll be hard to make intellectual progress. We might draw a blank and fail to even start understanding the issue at hand, a failure of babble. Or we might make decisions based around anxious babble: an idea that feels wrong, but seems like it might be a frightening truth of how the real world works.

Clearly, babble has to come first, or else there would be nothing to prune. Intuition has to precede logic in the sequence of thought.

So the question is, "when is it time to prune, and how do we go about it?"

My guess is there are many forms of pruning. As I wrote this post, I would frequently reword sentences, adjust examples, and sometimes throw out whole sub-ideas that seemed to be leading in the wrong direction. I didn't know where I'd end up when I started writing. So there's constant, lightweight pruning going on as I babble, and it's not all logical, deductive reasoning. Sometimes, it's just choosing a different word, or a gut feeling that leads me to say "nah, I can do better than that," or a new idea that's better and wants to replace an old idea.

In fact, I almost want to introduce a distinction between "pruning," as a form of criticism, and "pivoting," as a change of direction. Pivoting is what I do as I go along with my babble. It's a part of babble that looks a bit like pruning, but isn't. It is part and parcel of developing an idea from the hazy intuitions in my head into a legible form. Even offering counterexamples can be a pivot, a form of babble. It leads the conversation forward.

Pruning, on the other hand, is when I've gotten a thought down, or even a whole series of thoughts. Now, I stand back, and turn them around and around. What am I actually saying here? Are my definitions clear? What if I put a statement through the logic-machine, or shoehorn it into a series of math-like "if... then" statements, just to see what it spits out? Does the cited article actually support the claim that uses it as a reference? This takes place on the level of symbolic manipulation, and kills the progress of thought. It is not generative. It won't allow you to build a map of any more territory than you've already charted. It only allows you to check the symbols you've already put down on paper for accuracy and consistency.

Observe, babble, and prune

I also want to add one more category on top of babble/pivot and prune: observe.

If I sit down and say "I want to babble about X," I might have a very hard time. I can't just demand that my brain come up with all kinds of thoughts about X. And of course, if I can't babble, then I can't prune. What if I wanted to babble about X, though? How do I get this party started?

The first thing is to observe X. This might seem obvious, but it really isn't. It has powerful implications for building good, productive mental habits. Building a strong practice of observation is a foundational skill for a mind, just like being able to babble a model, or prune it.

I want to draw a big circle around observation. It encompasses many things: direct sensory perception, scientific experiment, memory. Even scholarship counts, since it is on some level an attempt to transmit a direct observation or mental experience. Just absorbing observations that seem related to X has to come first before we can start to babble.

My guess is that building a systematic habit of sequencing observe-babble-prune in the proper order would be the fundamental skill for any human being. Probably most people do it on an intuitive level, but having an explicit 1-2-3 map seems like a very useful thing to have.

When I'm trying to think about X, I usually start by trying to babble. In the future, I'm going to try to take a step back and start by observing.

When I read other people's babble, I know that it will usually include some observations. I'm going to try and bear in mind that their babble didn't start as babble. In fact, if it's good babble, it certainly started with many observations, the vast majority of which are probably not included in the babble. There's a pre-screening process of picking and choosing, synthesizing, and putting forth, that happened in the mind of the writer or speaker before they put words on a page. I haven't gone through that same period of observation, and I can't reconstruct it from the babble.

Conclusion: What to do with babble?

If babble can't let us step into the mind of the author, then what's it good for beyond entertainment and parroting the author's opinions?

• Babble can function as an observation-inspiring machine. It suggests a new way of observing the world: new books to read, new patterns to recognize, new experiences to pursue, or old memories to revisit. It can also transmit select observations directly to you, whether it's a story, a mathematical proof, a graph, a recipe, a recommendation, or a picture in a picture-book.
• Babble can function as a babble-inspiring machine. If it has done the work of inspiring your own observations, you might expand on it, offer counter-examples, criticize it fairly or unfairly, or crack a joke.
• Babble can function as a pruning-machine. It goes into your mind, finds an idea, and addresses it on the level of sheer deductive logic. It points out direct contradictions on a symbolic level, or restates an idea in words that are equally accurate but render it less (or more) sympathetic.

I think it's exciting to imagine a shift from rationality as primarily addressing misfires of the mind - statistical mishaps, cognitive biases, and so on - to rationality as a constructive practice. I would feel comfortable offering "observe, then babble/pivot, then prune" as a comprehensive prescription for literally any question or topic that anybody wanted to explore.

All the other techniques of a rational practice fit within it. My guess is that great mappers of the territory understand clearly how to move between all three phases and have a big toolkit of practices for each one. They don't shirk observation before babbling, and they have a lot of background observation beyond what they choose to convey in their written work.

This framework is neatly recursive and can be applied somewhat mechanically: "I want to get better at observing. How do I do that? Well, first, I need to observe how I observe. And to do that, I need to observe what I observe. Oh, OK - what specifically do I want to get better at observing?" and so on.

I can't promise that it's a magic bullet, but I plan to use it as both a lens when I read other people's writing, and as a guide when I'm following my own curiosity.

Discuss

Covid 'Mink variant'

Published on November 6, 2020 5:50 PM GMT

A new strain of Covid - "Covid-19 mink variant" or "Cluster 5" - has been circulating in Denmark. The Danish PM announced that twelve people are confirmed to have been infected with the variant; Denmark has ordered the destruction of all of the country's 15 million+ mink, and has now imposed local lockdowns to stop the spread of the virus.

While there's no evidence that the virus causes worse disease than other strains of coronavirus, two Danish officials have suggested that the mutant is unlikely to respond to the Covid vaccines currently in development.

I'm struggling to find any more information about the new strain. Keen to know if people have thoughts on this; at first glance this has the potential to undermine all the efforts to create a vaccine and build (limited) herd immunity thus far.

Discuss

Generalized Heat Engine II: Thermodynamic Efficiency Limit

Published on November 6, 2020 5:30 PM GMT

This post continues where the previous post left off.

Key idea from previous post: this is a compression problem. We want w of the coins to be deterministic and the transformation to be reversible, so all the information from the initial coin-state must be compressed into the non-deterministic coins in the final state. The thermodynamic flavor of the problem comes from the additional constraint: we want to compress the initial state “data” while also conserving the total number of heads.

In this post, we will make one small tweak to this setup relative to the previous post. In the previous post, “work” just meant making some coins deterministically heads; we didn’t have a strong opinion about which coins, so long as we knew which coins they were. In this post, we’ll assume that we start with some extra coins outside our two pools which are deterministically tails, and use our “work” to make those particular coins deterministically heads. This makes it a bit cleaner to separate the “moving energy around” and “moving uncertainty around” aspects of the problem, though of course those two pieces end up coupled.

This post will dig more into the optimization aspect of the problem, look at temperature as a “price” at which we can trade energy (or analogous conserved quantities) for entropy, and view the heat engine problem as arbitrage between two subsystems with different temperatures/prices. That, in turn, yields the usual thermodynamic efficiency limit. Then, we’ll point to some generalizations and applications to wrap up.

Lagrange Multipliers

Let’s think about just one of our two pools in isolation. We’ll imagine adding/removing marginal heads (analogous to energy) to/from the pool.

Recall that our initial coin distribution for each pool is maxentropic subject to a constraint on the number of heads. If we add one head to a pool, then the constraint is relaxed slightly, so that pool’s entropy can increase - the maximum entropy distribution on those coins with one more head allowed will have slightly more entropy than the initial maximum entropy distribution.

How much can the entropy increase? That’s given by the Lagrange multiplier associated with the constraint. As long as the number of heads we add is small (i.e. small enough to use a linear approximation), the increase in maximum entropy will be roughly the number of heads added times the Lagrange multiplier. In economic terms: the Lagrange multiplier is the “price” at which we can trade marginal heads for a marginal change in maximum entropy. (Indeed, prices in economics are exactly the Lagrange multipliers in agents’ optimization problems.)

In standard stat mech, this Lagrange multiplier is the (inverse) temperature. Specifically: in our setup, if we assume energy is proportional to the number of heads and take the limit as the number of coins goes to infinity, then our constraint on total heads becomes a constraint on average energy. The Lagrange multiplier associated with the average energy constraint in the entropy maximization problem is β ∝ 1/T, with the proportionality constant given by Boltzmann's constant.

Quantitatively, for our example problem, the (initial) Lagrange multiplier for each pool is −log(pH/(1−pH)), i.e. the negative log odds of heads in each pool. That's 2 bits/head for the hot pool and roughly 3.17 bits/head for the cold pool. Conceptually: if heads have probability 0.2, then one head makes a contribution of −log(0.2) to the entropy, while one tail makes a contribution of −log(0.8). Flipping a tail to a head therefore increases entropy by −log(0.2)+log(0.8)=log(4)=2 bits. (Though note that this is a post-hoc conceptual explanation; Lagrange multipliers are usually best calculated using the usual methods of convex optimization and maximum entropy.)
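To make the per-pool "price" concrete, here's a small Python sketch. The heads probabilities 0.2 (hot pool) and 0.1 (cold pool) are my assumption, inferred from the 2-bit and 3.17-bit figures quoted above:

```python
import math

def pool_price_bits(p_heads):
    """Lagrange multiplier of a maxentropic coin pool, in bits per head:
    the entropy gained by flipping one tail to a head, i.e.
    -log2(p) - (-log2(1 - p)) = -log2(p / (1 - p))."""
    return -math.log2(p_heads / (1 - p_heads))

beta_hot  = pool_price_bits(0.2)  # hot pool: exactly 2.0 bits/head
beta_cold = pool_price_bits(0.1)  # cold pool: ~3.17 bits/head
print(beta_hot, beta_cold)
```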

Arbitrage

If we have two pools at different temperatures, then we can “arbitrage” heads/energy between them to increase the maximum entropy of the whole system.

We remove one head from the hot pool (remember: this just means subtracting 1 from the constraint on the number of heads in that pool). In our example, the hot pool’s Lagrange multiplier is 2, so this decreases the maximum entropy by roughly 2 bits. But then, we add one head to the cold pool, so its maximum entropy increases by roughly 3.17 bits. The total number of heads across the full system remains constant, but the maximum entropy of the full system has increased by 1.17 bits.

What does this mean in terms of extractable “work”, i.e. number of bits we can deterministically make heads?

To extract a unit of work, we remove a head from one of the pools and add that head to our initially-tails pool, reducing the maximum entropy of that pool by one head's worth (2 bits for the hot pool, 3.17 bits for the cold pool, same as earlier). To maximize efficiency, we'll take it from the hot pool, so each head of work will decrease the maximum achievable entropy by 2 bits.

To make our whole transformation valid, we must move enough heads from hot pool to cold pool to offset the maximum entropy loss of our work-coins. Assuming we take the work-coins from the hot pool, we’ll need to move roughly 2/1.17 = 1.71 heads from hot to cold for each head of work extracted. In terms of thermodynamic efficiency: for each head removed from the hot pool, we can extract roughly 1/(1 + 1.71) = .37 heads of work.

Writing out the general equation: our Lagrange multipliers are the inverse temperatures 1/TH and 1/TC. Each work-coin "costs" 1/TH bits of maximal entropy, and each head moved from hot to cold "earns" 1/TC − 1/TH bits of maximal entropy, so the number of heads we need to move from hot to cold for each head of work is

(1/TH) / (1/TC − 1/TH) = 1 / (TH/TC − 1)

Finally, the traditional thermodynamic efficiency measure: for each head removed from the hot pool, we can extract 1 / (1 + 1/(TH/TC − 1)) = 1 − TC/TH heads of work.
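Putting the arbitrage numbers together, here's a quick sketch verifying that the bookkeeping above reproduces the Carnot form. The 2 and ~3.17 bits/head prices are taken from the worked example:

```python
import math

beta_hot  = 2.0           # 1/TH: bits of max-entropy per head, hot pool
beta_cold = math.log2(9)  # 1/TC: ~3.17 bits per head, cold pool

# Heads moved hot -> cold per head of work extracted:
# each work-coin "costs" beta_hot bits, and each moved head
# "earns" (beta_cold - beta_hot) bits.
moved_per_work = beta_hot / (beta_cold - beta_hot)  # ~1.71

# Fraction of each head removed from the hot pool that becomes work:
efficiency = 1 / (1 + moved_per_work)               # ~0.37

# The same quantity in Carnot form, with T = 1/beta:
carnot = 1 - beta_hot / beta_cold                   # 1 - TC/TH
assert abs(efficiency - carnot) < 1e-12
print(round(moved_per_work, 2), round(efficiency, 2))  # 1.71 0.37
```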

As expected, we've reproduced the usual thermodynamic efficiency limit.

Recap

Let’s recap the reasoning.

We want our transformation to be reversible, so all of the information from the initial distribution must be “stored” in the final distribution. That means the final distribution must have at least as much entropy as the initial distribution - otherwise we won’t have enough space. So, our transformation must not decrease the maximum achievable entropy. That’s the argument from the previous post.

This post says that, if we decrease the number of heads in a pool by 1, then that has a "cost" in maximum achievable entropy, and that cost is given by the Lagrange multiplier in the entropy maximization problem (i.e. the inverse temperature). With one hot pool and one cold pool, we can "arbitrage" by moving heads from one pool to the other, freeing up extra entropy. We can then "spend" that extra entropy to remove heads from a pool and turn them into work. This gives us the usual thermodynamic efficiency limit, 1 − TC/TH.

Further Generalization

With this setup, it’s easy to see further generalizations. We could have more constraints; these would each have a “price” associated with them, given by the corresponding Lagrange multiplier in the maximum entropy problem. We could even have nonlinear constraints (i.e. not additive across the pools), which we'd handle by local linear approximation. We could have more pools, and we could arbitrage between the pools whenever the prices are different.

We can also generalize thermal equilibrium. Traditionally, we consider two systems to be in thermal equilibrium if they have the same temperature, i.e. the same Lagrange multipliers/prices. More generally, we can consider systems with many constraints and many pools to be in equilibrium when the Lagrange multipliers/prices for all pools match. Notably, this is essentially the same equilibrium condition used in microeconomics: economic agents in a market are in equilibrium when the Lagrange multipliers/prices on all their utility-maximization problems match, and those matching prices are called the “market prices”. So thermal equilibrium corresponds to economic equilibrium in a rather strong sense. (One difference, however: in the thermodynamic picture, the entropies of different pools are added together, whereas we can’t always add utilities across agents in economics - the economic model is a bit more general in that sense. Thermal equilibrium is an economic equilibrium, but the reverse does not apply.)

Finally, a generalization with potentially very wide applicability: it turns out that all of Bayesian probability can be expressed in terms of maximum entropy. When data comes in, our update says “now maximize entropy subject to the data variable being deterministically the observed value”, and this turns out to be equivalent to Bayes’ rule. So, if we express all our models in maximum entropy terms, then essentially any system would be in the right form to apply the ideas above. That said, it wouldn’t necessarily say anything interesting about any random system; it’s the interplay of compression and additional constraints which makes things interesting.

Applications?

I’m still absorbing all this myself. Some examples of the sort of applications I imagine it might apply to, beyond physics:

• I’m designing an AI to operate in some environment. I don’t have direct access to the environment, so I can’t directly reduce my uncertainty about it, but I can program the AI to “move my uncertainty around” in predictable ways. If the AI faces additional constraints (which is already a common way to set up AI problems), then we’d get something thermo-like in the AI design problem.
• In biology, an organism’s genes are effectively a “design” which will operate in some environment which the genes don’t perfectly “know” in advance. So, the genes can’t “reduce their uncertainty” about the environment, but they can program the organism to move that uncertainty around. The organism will also face many other constraints - energy, availability of particular molecules, etc - so we’d expect something thermo-like in the gene optimization problem.
• In economics, business owners/managers, regulators, contract writers, and other incentive-designers need to design rules, and they can’t always observe every time the rules are used. So, they can’t reduce their uncertainty about the environment in which the rules will operate, but they can design the rules to shift uncertainty around. They also probably have other constraints - budget constraints, for instance - so we’d potentially expect something thermo-like in some rule design problems.

I see two (not-insurmountable) barriers to applying thermo-like ideas to these problems. First, outside of physics, our transformations don’t always need to be invertible. In more general problems, I expect we’d want to factor the problem into two parts: one part where our choices reduce the number of possible environments, and another part where we just move uncertainty around within the possible environments. The second part would be thermo-like.

The other barrier is the “goal”. In the thermodynamic setup, we’re trying to deterministically extract a resource - heads, in our toy problem, or energy in physics. This resource-extraction is not synonymous with whatever the goal is in a general optimization problem; resource-extraction would usually just be a subgoal. In any particular problem, we might be able to identify subgoals which involve deterministic resource extraction, but it would be more useful to have a general method for tying a generic goal to the thermo-like problem.

Again, these problems don’t seem insurmountable, or even very conceptually difficult. They’d take some legwork, but are probably tractable.

I’d also be interested to hear other applications which jump to mind. I’m still mulling this over, so there’s probably whole categories of use-cases that I’m missing.

Discuss

Teach People to Recognize the Sound of Covid?

Published on November 6, 2020 2:30 PM GMT

Researchers claim to have a machine learning system that can diagnose Covid-19 over the phone by analyzing recordings of a forced cough (pdf). They claim sensitivity of 98.5% and specificity of 94.2%, and for asymptomatic cases a sensitivity of 100% (?!?) and specificity of 83.2%. I'm curious to what extent errors in the test are correlated from day to day. This can all be offered at unlimited scale for essentially zero additional cost. So, of course, it will presumably be illegal indefinitely because it's not as accurate as a PCR test and/or hasn't gone through the proper approval process, and no one will ever use it. Then again, if one were to somehow download or recreate such a program and run it, who would know?

You can see how they collected data, and contribute your own cough at opensigma.mit.edu. It doesn't seem to me like they actually have great data on which recordings correspond to people who have it? For example, I submitted mine this morning, and said that I don't think I have Covid. If, however, I later learned that I did have Covid at the time I submitted the sample, there doesn't seem to be any way for me to tell them.

You could get better data, however, by collecting alongside regular Covid tests. Have everyone record a sample of a forced cough when you test them, label the cough with their results once you have it, and you end up with high-quality labeled samples. They trained their AI on 5,320 samples, but at current testing rates we could get 80k samples in a single day in just Massachusetts.

It might turn out that even with higher-quality data you still end up with a test that is less accurate than the standard of care, and so you'd be unable to convince the FDA to allow it. (This is an unreasonable threshold, since even a less accurate test can be very useful as a screening tool, but my understanding is the FDA is very set on this point.) Is there another way we could scale out auditory diagnosis?

Very roughly, their system is one where you take lots of samples of coughs, labeled with whether you think they were produced by someone with coronavirus, and train a neural network to predict the label from the sample. What if instead of artificial neural networks, we used brains?

People are really smart, and I suspect that if you spent some time with a program that played you a sample and you guessed which one it was, and then were told whether you were right, you could learn to be quite good at telling them apart. You could speed up the process by starting with prototypical standard and Covid coughs, as identified by the AI, and then showing progressively borderline ones as people get better at it. In fact, I suspect many medical professionals who have worked on a Covid ward already have a good sense of what the cough sounds like.

I don't know the regulations around what needs a license, but I think there's a good chance that hosting a tool like this does not require one, or that it requires one that is relatively practical to get? If so, we could train medical professionals (or even the general public?) to identify these coughs.

Automated screening would be much better, since the cost is so much lower and it could be rolled out extremely widely. But teaching humans to discriminate would be substantially cheaper than what we have today, and with much weaker supply restrictions.

(I looked to see whether the researchers made their samples available, but they don't seem to have. Listening to some would've been a great way to check how practical this seems.)

Discuss

Published on November 6, 2020 2:26 PM GMT

This is the ninth post in the Cartesian frames sequence. Here, we refine our notion of subagent into additive and multiplicative subagents. As usual, we will give many equivalent definitions.

The additive subagent relation can be thought of as representing the relationship between an agent that has made a commitment, and the same agent before making that commitment. The multiplicative subagent relation can be thought of as representing the relationship between a football player and a football team.

Another way to think about the distinction is that additive subagents have fewer options, while multiplicative subagents have less refined options.

We will introduce these concepts with a definition using sub-sums and sub-tensors.

1. Definitions of Additive and Multiplicative Subagent

1.1. Sub-Sum and Sub-Tensor Definitions

Definition: C is an additive subagent of D, written C◃+D, if there exists a C′ and D′≃D with D′∈C⊞C′. Similarly, C is a multiplicative subagent of D, written C◃×D, if there exists a C′ and D′≃D with D′∈C⊠C′.

These definitions are nice because they motivate the names "additive" and "multiplicative." Another benefit of these definitions is that they draw attention to the Cartesian frames given by C′. This feature is emphasized more in the below (clearly equivalent) definition.

1.2. Brother and Sister Definitions

Definition: C′ is called a brother to C in D if D≃D′ for some D′∈C⊞C′. Similarly, C′ is called a sister to C in D if D≃D′ for some D′∈C⊠C′.

E.g., one "sister" of a football player will be the entire rest of the football team. One "brother" of a person that precommitted to carry an umbrella will be the counterfactual version of themselves that instead precommitted to not carry an umbrella.

This allows us to trivially restate the above definitions as:

Definition: We say C◃+D if C has a brother in D and C◃×D if C has a sister in D.

Claim: This definition is equivalent to the ones above.

Proof: Trivial. □

1.3. Committing and Externalizing Definitions

Next, we will give the committing definition of additive subagent and an externalizing definition of multiplicative subagent. These definitions are often the easiest to work with directly in examples.

We call the following definition the "committing" definition because we are viewing C as the result of D making a commitment (up to biextensional equivalence).

Definition: Given Cartesian frames C and D over W, we say C◃+D if there exist three sets X, Y, and Z, with X⊆Y, and a function f:Y×Z→W such that C≃(X,Z,⋄) and D≃(Y,Z,∙), where ⋄ and ∙ are given by x⋄z=f(x,z) and y∙z=f(y,z).
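To make the committing definition concrete, here is a minimal finite sketch in Python. The umbrella scenario, the frame representation, and all names are hypothetical illustrations, not constructions from the post: a frame is a triple of agent options, environment options, and an evaluation function, and the committing witness for C◃+D is a restriction of the agent set with the environment and evaluation held fixed.

```python
# A finite Cartesian frame over a set of worlds W, modeled as a dict of
# agent options, environment options, and an evaluation function into W.
# The umbrella story and all names here are hypothetical illustrations.

def make_frame(agent, env, ev):
    return {"agent": list(agent), "env": list(env), "eval": ev}

def f(y, z):  # f : Y × Z → W
    table = {
        ("umbrella", "rain"): "dry",
        ("umbrella", "sun"): "dry",
        ("bare", "rain"): "wet",
        ("bare", "sun"): "damp",
    }
    return table[(y, z)]

Y = ["umbrella", "bare"]
Z = ["rain", "sun"]
X = ["umbrella"]          # X ⊆ Y: the agent has committed to the umbrella

D = make_frame(Y, Z, f)   # the agent before the commitment
C = make_frame(X, Z, f)   # the committed agent: same f and Z, fewer options

# This shape (X ⊆ Y, shared Z and f) is exactly the committing witness.
assert set(C["agent"]) <= set(D["agent"]) and C["env"] == D["env"]
```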

Claim: This definition is equivalent to the sub-sum and brother definitions of ◃+.

Proof: First, assume that C has a brother in D. Let C=(A,E,⋅), and let D=(B,F,⋆). Let C′=(A′,E′,⋅′) be brother to C in D. Let D′=(B′,F′,⋆′) be such that  D′≃D and D′∈C⊞C′. Then, if we let X=A, let Y=B′=A⊔A′, let Z=F′, and let f(y,z)=y⋆′z, we get D≃D′=(Y,Z,∙), where y∙z=f(y,z), and by the definition of sub-sum, C≃(X,Z,⋄), where  x⋄z=f(x,z).

Conversely, let X, Y, and Z be arbitrary sets with X⊆Y, and let f:Y×Z→W.  Let C≃C0=(X,Z,⋄0), and let D≃D′=(Y,Z,∙), where x⋄0z=f(x,z) and y∙z=f(y,z). We want to show that C has a brother in D. It suffices to show that C0 has a brother in D, since sub-sum is well-defined up to biextensional equivalence. Indeed, we will show that C1=(Y∖X,Z,⋄1) is brother to C0 in D, where ⋄1 is given by x⋄1z=f(x,z).

Observe that C0⊕C1=(Y,Z×Z,∙′), where ∙′ is given by

y∙′(z0,z1)=y⋄0z0=y∙z0

if y∈X, and is given by

y∙′(z0,z1)=y⋄1z1=y∙z1

otherwise. Consider the diagonal subset S⊆Z×Z given by S={(z,z) | z∈Z}. Observe that the map z↦(z,z) is a bijection from Z to S. Observe that if we restrict ∙′ to Y×S, we get ∙′′:Y×S→W given by y∙′′(z,z)=y∙z. Thus (Y,S,∙′′)≅(Y,Z,∙), with the isomorphism coming from the identity on Y, and the bijection between S and Z.

If we further restrict ∙′′ to X×S or (Y∖X)×S, we get ∙0 and ∙1 respectively, given by x∙0(z,z)=x⋄0z and x∙1(z,z)=x⋄1z. Thus (X,S,∙0)≅(X,Z,⋄0) and (Y∖X,S,∙1)≅(Y∖X,Z,⋄1), with the isomorphisms coming from the identities on X and Y∖X, and the bijection between S and Z.

Thus (Y,S,∙′′)∈C0⊞C1, and (Y,S,∙′′)≅D′≃D, so C1 is brother to C0 in D, so C has a brother in D. □

Next, we have the externalizing definition of multiplicative subagent. Here, we are viewing C as the result of D sending some of its decisions into the environment (up to biextensional equivalence).

Definition: Given Cartesian frames C and D over W, we say C◃×D if there exist three sets X, Y, and Z, and a function f:X×Y×Z→W such that C≃(X,Y×Z,⋄) and D≃(X×Y,Z,∙), where ⋄ and ∙ are given by x⋄(y,z)=f(x,y,z) and (x,y)∙z=f(x,y,z).
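The externalizing definition says the two frames carry the same information, just partitioned differently between agent and environment. A finite sketch (football-flavored; the sets and names are hypothetical illustrations, not from the post):

```python
from itertools import product

# Hypothetical finite sets: a player (X) with one teammate (Y), facing an
# environment (Z). Any function f works; here worlds are just tuples.
X = ["kick", "pass"]
Y = ["left", "right"]
Z = ["mud", "grass"]

def f(x, y, z):  # f : X × Y × Z → W
    return (x, y, z)

# D: the whole team as agent. Agent(D) = X × Y, Env(D) = Z.
D_eval = {((x, y), z): f(x, y, z) for (x, y), z in product(product(X, Y), Z)}

# C: the single player as agent; the teammate is moved into the environment.
# Agent(C) = X, Env(C) = Y × Z.
C_eval = {(x, (y, z)): f(x, y, z) for x, (y, z) in product(X, product(Y, Z))}

# Both frames induce the same map into W, grouped differently; this shape
# is exactly the externalizing witness for C ◃× D.
for x, y, z in product(X, Y, Z):
    assert D_eval[((x, y), z)] == C_eval[(x, (y, z))]
```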

Claim: This definition is equivalent to the sub-tensor and sister definitions of ◃×.

Proof: First, assume that C has a sister in D. Let C=(A,E,⋅), and let D=(B,F,⋆). Let C′=(A′,E′,⋅′) be sister to C in D. Let D′=(B′,F′,⋆′) be such that  D′≃D and D′∈C⊠C′. Then, if we let X=A, let Y=A′, let Z=F′⊆hom(C,C′∗), and let

f(x,y,(g,h))=x⋅h(y)=y⋅′g(x),

we get D≃D′=(X×Y,Z,∙), where (x,y)∙z=f(x,y,z), and by the definition of sub-tensor, C≃(X,Y×Z,⋄), where  x⋄(y,z)=f(x,y,z).

Conversely, let X, Y, and Z be arbitrary sets, and let f:X×Y×Z→W. Let C≃C0=(X,Y×Z,⋄0), and let D≃D′=(X×Y,Z,∙), where x⋄0(y,z)=(x,y)∙z=f(x,y,z). We will assume for now that at least one of X and Y is nonempty, as the case where both are empty is degenerate.

We want to show that C has a sister in D. It suffices to show that C0 has a sister in D, since sub-tensor is well-defined up to biextensional equivalence. Indeed, we will show that C1=(Y,X×Z,⋄1) is sister to C0 in D, where ⋄1 is given by y⋄1(x,z)=f(x,y,z).

Observe that C0⊗C1=(X×Y,hom(C0,C∗1),∙′), where ∙′ is given by

(x,y)∙′(g,h)=x⋄0h(y)=y⋄1g(x).

For every z∈Z, there is a morphism (gz,hz):C0→C∗1, where gz:X→X×Z is given by gz(x)=(x,z), and hz:Y→Y×Z is given by hz(y)=(y,z). This is clearly a morphism. Consider the subset S⊆hom(C0,C∗1) given by S={(gz,hz) | z∈Z}. Observe that the map z↦(gz,hz) is a bijection from Z to S. (We need that at least one of X and Y is nonempty here for injectivity.)

If we restrict ∙′ to (X×Y)×S, we get ∙′′:(X×Y)×S→W given by (x,y)∙′′(gz,hz)=(x,y)∙z. Thus, (X×Y,S,∙′′)≅(X×Y,Z,∙), with the isomorphism coming from the identity on X×Y, and the bijection between S and Z.

To show that (X×Y,S,∙′′)∈C0⊠C1, we need to show that C0≃(X,Y×S,∙0) and C1≃(Y,X×S,∙1), where ∙0 and ∙1 are given by

x∙0(y,(gz,hz))=y∙1(x,(gz,hz))=(x,y)∙′′(gz,hz).

Indeed,  x∙0(y,(gz,hz))=x⋄0(y,z) and y∙1(x,(gz,hz))=y⋄1(x,z), so (X,Y×S,∙0)≅(X,Y×Z,⋄0)=C0 and (Y,X×S,∙1)≅(Y,X×Z,⋄1)=C1, with the isomorphisms coming from the identities on X and Y, and the bijection between S and Z.

Thus (X×Y,S,∙′′)∈C0⊠C1, and (X×Y,S,∙′′)≅D′≃D, so C1 is sister to C0 in D, so C has a sister in D.

Finally, in the case where X and Y are both empty, C≅null, and either D≃null or D≃0, depending on whether Z is empty. It is easy to verify that null⊠null={0,null}, since null⊗null≅0, taking the two subsets of the singleton environment in 0 yields 0 and null as candidate sub-tensors, and both are valid sub-tensors, since either way, the conditions reduce to null≃null. □

Next, we have some definitions that more directly relate to our original definitions of subagent.

1.4. Currying Definitions

Definition: We say C◃+D if there exists a Cartesian frame M over Agent(D) with |Env(M)|=1, such that C≃D∘(M).
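For finite frames, the composite D∘(M) used in this definition can be computed directly. The frames below are hypothetical toys; the evaluation rule x⋆′(e,f)=(x⋅e)⋆f matches the one used in the proof below.

```python
# For finite frames, D∘(M) can be computed directly. All frames here are
# hypothetical toys. A frame is (agent options, env options, eval dict).

def compose(M, D):
    """M is a frame over Agent(D); return D∘(M), whose evaluation is
    x ⋆' (e, f) = (x ⋅ e) ⋆ f."""
    agent_m, env_m, eval_m = M
    agent_d, env_d, eval_d = D
    env = [(e, f) for e in env_m for f in env_d]
    ev = {(x, (e, f)): eval_d[(eval_m[(x, e)], f)]
          for x in agent_m for (e, f) in env}
    return agent_m, env, ev

# D has agent options b0, b1 and environments f0, f1; worlds are strings.
D = (["b0", "b1"], ["f0", "f1"],
     {(b, f): b + f for b in ["b0", "b1"] for f in ["f0", "f1"]})
# M is a frame over Agent(D) with a singleton environment: a commitment to b0.
M = (["x"], ["*"], {("x", "*"): "b0"})

agent, env, ev = compose(M, D)
assert agent == ["x"] and len(env) == 2   # the committed agent keeps D's envs
assert ev[("x", ("*", "f0"))] == "b0f0"   # x plays b0 against each environment
```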

Claim: This definition is equivalent to all of the above definitions of ◃+.

Proof: We show equivalence to the committing definition.

First, assume that there exist three sets X, Y, and Z, with X⊆Y, and a function p:Y×Z→W such that C≃(X,Z,⋄) and D≃(Y,Z,∙), where ⋄ and ∙ are given by x⋄z=p(x,z) and y∙z=p(y,z).

Let D=(B,F,⋆), and let (g0,h0):D→(Y,Z,∙) and (g1,h1):(Y,Z,∙)→D compose to something homotopic to the identity in both orders.

We define M, a Cartesian frame over B, by M=(X,{e},⋅), where ⋅ is given by x⋅e=g1(x). Observe that D∘(M)=(X,{e}×F,⋆′), where ⋆′ is given by

x⋆′(e,f)=(x⋅e)⋆f=g1(x)⋆f=x∙h1(f).

To show that (X,Z,⋄)≃D∘(M), we construct morphisms (g2,h2):(X,Z,⋄)→D∘(M) and (g3,h3):D∘(M)→(X,Z,⋄) that compose to something homotopic to the identity in both orders. Let g2 and g3 both be the identity on X. Let h2:{e}×F→Z be given by h2(e,f)=h1(f), and let h3:Z→{e}×F be given by h3(z)=(e,h0(z)).

We know (g2,h2) is a morphism, since for all x∈X and (e,f)∈{e}×F, we have

g2(x)⋆′(e,f)=x⋆′(e,f)=x∙h1(f)=x⋄h1(f)=x⋄(h2(e,f)).

We also have that (g3,h3) is a morphism, since for all x∈X and z∈Z, we have

g3(x)⋄z=x⋄z=x∙z=x∙h1(h0(z))=x⋆′(e,h0(z))=x⋆′h3(z).

Observe that (g2,h2) and (g3,h3) clearly compose to something homotopic to the identity in both orders, since g2∘g3 and g3∘g2 are the identity on X.

Thus, C≃(X,Z,⋄)≃D∘(M), and |Env(M)|=1.

Conversely, assume C≃D∘(M), with |Env(M)|=1.  We define Y=Agent(D) and Z=Env(D). We define f:Y×Z→W by f(y,z)=y∙z, where ∙=Eval(D).

Let X⊆Y be given by X=Image(M). Since |Env(M)|=1, we have M≃⊥X. Thus, C≃D∘(M)≃D∘(⊥X). Unpacking the definition of D∘(⊥X), we get D∘(⊥X)=(X,{e}×Z,⋅), where ⋅ is given by x⋅(e,z)=f(x,z), which is  isomorphic to (X,Z,⋄), where ⋄ is given by x⋄z=f(x,z). Thus C≃(X,Z,⋄) and D=(Y,Z,∙), as in the committing definition. □

Definition: We say C◃×D if there exists a Cartesian frame M over Agent(D) with Image(M)=Agent(D), such that C≃D∘(M).

Claim: This definition is equivalent to all of the above definitions of ◃×.

Proof: We show equivalence to the externalizing definition.

First, assume there exist three sets X, Y, and Z, and a function p:X×Y×Z→W such that C≃(X,Y×Z,⋄) and D≃(X×Y,Z,∙), where ⋄ and ∙ are given by x⋄(y,z)=(x,y)∙z=p(x,y,z).

Let D=(B,F,⋆), and let (g0,h0):D→(X×Y,Z,∙) and (g1,h1):(X×Y,Z,∙)→D compose to something homotopic to the identity in both orders.

We define B′=B⊔{a}, and we define M, a Cartesian frame over B, by M=(X,Y×B′,⋅), where ⋅ is given by x⋅(y,b)=b if b∈B and g0(b)=(x,y), and x⋅(y,b)=g1(x,y) otherwise. Clearly, Image(M)=B, since for any b∈B, if we let (x,y)=g0(b), we have x⋅(y,b)=b.

Observe that for all x∈X, y∈Y, b∈B′ and f∈F, if b∈B and g0(b)=(x,y), then

(x⋅(y,b))⋆f=b⋆f=g1(g0(b))⋆f=g1(x,y)⋆f,

and on the other hand, if b=a or g0(b)≠(x,y), we also have (x⋅(y,b))⋆f=g1(x,y)⋆f.

Thus, we have that D∘(M)=(X,Y×B′×F,⋆′), where ⋆′ is given by

x⋆′(y,b,f)=(x⋅(y,b))⋆f=g1(x,y)⋆f=(x,y)∙h1(f).

To show that (X,Y×Z,⋄)≃D∘(M), we construct morphisms (g2,h2):(X,Y×Z,⋄)→D∘(M) and (g3,h3):D∘(M)→(X,Y×Z,⋄) that compose to something homotopic to the identity in both orders. Let g2 and g3 both be the identity on X. Let h2:Y×B′×F→Y×Z be given by h2(y,b,f)=(y,h1(f)), and let h3:Y×Z→Y×B′×F be given by h3(y,z)=(y,a,h0(z)).

We know (g2,h2) is a morphism, since for all x∈X and (y,b,f)∈Y×B′×F,

g2(x)⋆′(y,b,f)=x⋆′(y,b,f)=(x,y)∙h1(f)=p(x,y,h1(f))=x⋄(y,h1(f))=x⋄(h2(y,b,f)).

We also have that (g3,h3) is a morphism, since for all x∈X and (y,z)∈Y×Z, we have

g3(x)⋄(y,z)=x⋄(y,z)=p(x,y,z)=(x,y)∙z=(x,y)∙h1(h0(z))=x⋆′(y,a,h0(z))=x⋆′h3(y,z).

Observe that (g2,h2) and (g3,h3) clearly compose to something homotopic to the identity in both orders, since g2∘g3 and g3∘g2 are the identity on X.

Thus, C≃(X,Y×Z,⋄)≃D∘(M), where Image(M)=Agent(D).

Conversely, assume C≃D∘(M), with  Image(M)=Agent(D). Let X=Agent(M), let Y=Env(M), and let Z=Env(D). Let f:X×Y×Z→W be given by f(x,y,z)=(x⋅y)⋆z, where ⋅=Eval(M) and ⋆=Eval(D).

Thus C≃D∘(M)≅(X,Y×Z,⋄), where ⋄ is given by x⋄(y,z)=(x⋅y)⋆z=f(x,y,z). All that remains to show is that D≃(X×Y,Z,∙), where (x,y)∙z=f(x,y,z). Let D=(B,Z,⋆).

We construct morphisms (g0,h0):D→(X×Y,Z,∙) and (g1,h1):(X×Y,Z,∙)→D that compose to something homotopic to the identity in both orders. Let h0 and h1 be the identity on Z. Let g1:X×Y→B be given by g1(x,y)=x⋅y. Since Image(M)=Agent(D), g1 is surjective, and so has a right inverse. Let g0:B→X×Y be any choice of right inverse of g1, so g1(g0(b))=b for all b∈B.

We know (g1,h1) is a morphism, since for all (x,y)∈X×Y and z∈Z,

g1(x,y)⋆z=(x⋅y)⋆z=f(x,y,z)=(x,y)∙z=(x,y)∙h1(z).

To see that (g0,h0) is a morphism, given b∈B and z∈Z, let (x,y)=g0(b), and observe

g0(b)∙z=(x,y)∙z=f(x,y,z)=(x⋅y)⋆z=g1(x,y)⋆z=g1(g0(b))⋆z=b⋆h0(z).

(g0,h0) and (g1,h1) clearly compose to something homotopic to the identity in both orders, since h0∘h1 and h1∘h0 are the identity on Z. Thus D≃(X×Y,Z,∙), completing the proof. □

Consider two Cartesian frames C and D with C a subagent of D. By the currying definitions, there exists (up to biextensional equivalence) a frame M whose possible agents are Agent(C) and whose possible worlds are Agent(D), and whose evaluation function maps Agent(C)×Env(M) to Agent(D).

Just as we did in "Subagents of Cartesian Frames" §1.2 (Currying Definition), we can think of this function as a (possibly) nondeterministic function from Agent(C) to Agent(D), where Env(M) represents the nondeterminism. In the case of additive subagents, Env(M) is a singleton, meaning that the function from Agent(C) to  Agent(D) is actually deterministic. In the case of multiplicative subagents, the (possibly) nondeterministic function is surjective.
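A small illustration of this view (all sets and names are hypothetical): represent Eval(M) as a function of an agent choice and an environment choice, then check determinism in the additive case and surjectivity in the multiplicative case.

```python
# Eval(M) as a "possibly nondeterministic" function from Agent(C) to
# Agent(D), with Env(M) supplying the nondeterminism. Hypothetical toys.

def images(eval_m, agents, envs):
    """For each agent choice a, the set of values eval_m(a, e) over e in envs."""
    return {a: {eval_m(a, e) for e in envs} for a in agents}

# Additive case: Env(M) is a singleton, so the induced map is deterministic.
def add_eval(a, e):
    return {"x": "commit_x", "y": "commit_y"}[a]

add_envs = ["*"]
assert all(len(v) == 1 for v in images(add_eval, ["x", "y"], add_envs).values())

# Multiplicative case: the induced (nondeterministic) map is surjective onto
# Agent(D): every team-level option (player choice, teammate choice) is hit.
def mul_eval(a, e):
    return (a, e)

mul_envs = ["e0", "e1"]
agent_d = {(a, e) for a in ["x", "y"] for e in mul_envs}
covered = set().union(*images(mul_eval, ["x", "y"], mul_envs).values())
assert covered == agent_d
```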

Recall that in "Sub-Sums and Sub-Tensors" §3.3 (Sub-Sums and Sub-Tensors Are Superagents), we constructed a frame with a singleton environment to prove that sub-sums are superagents, and we constructed a frame with a surjective evaluation function to prove that sub-tensors are superagents. The currying definitions of ◃+ and ◃× show why this is the case.

1.5. Categorical Definitions

We also have definitions based on the categorical definition of subagent. The categorical definition of additive subagent is almost just swapping the quantifiers from our original categorical definition of subagent. However, we will also have to weaken the definition slightly in order to only require the morphisms to be homotopic.

Definition: We say C◃+D if there exists a single morphism ϕ0:C→D such that for every morphism ϕ:C→⊥ there exists a morphism ϕ1:D→⊥ such that ϕ is homotopic to ϕ1∘ϕ0 .

Claim: This definition is equivalent to all the above definitions of ◃+.

Proof: We show equivalence to the committing definition.

First, let C=(A,E,⋅) and D=(B,F,∙) be Cartesian frames over W, and let (g0,h0):C→D be such that for all (g,h):C→⊥, there exists a (g′,h′):D→⊥ such that (g,h) is homotopic to (g′,h′)∘(g0,h0). Let ⊥=(W,{i},⋆).

Let Y=B, let Z=F, and let X={g0(a) | a∈A}. Let f:Y×Z→W be given by f(y,z)=y∙z. We already have D=(Y,Z,∙), and our goal is to show that C≃(X,Z,⋄), where ⋄ is given by x⋄z=f(x,z).

We construct (g1,h1):C→(X,Z,⋄) and (g2,h2):(X,Z,⋄)→C that compose to something homotopic to the identity in both orders.

We define g1:A→X by g1(a)=g0(a). g1 is surjective, and so has a right inverse. We let g2:X→A be any right inverse to g1, so g1(g2(x))=x for all x∈X. We let h1:Z→E be given by h1(z)=h0(z).

Defining h2:E→Z will be a bit more complicated.  Given an e∈E, let (ge,he) be the morphism from C to ⊥, given by he(i)=e and ge(a)=a⋅e. Let (g′e,h′e):D→⊥ be such that (ge,he) is homotopic to (g′e,h′e)∘(g0,h0). We define h2 by h2(e)=h′e(i).

We trivially have that (g1,h1) is a morphism, since for all a∈A and z∈Z,

g1(a)⋄z=g0(a)∙z=a⋅h0(z)=a⋅h1(z).

To see that (g2,h2) is a morphism, consider x∈X and e∈E, and define (ge,he) and (g′e,h′e) as above. Then,

x⋄h2(e)=g1(g2(x))⋄h′e(i)=g′e(g0(g2(x)))⋆i=g2(x)⋅he(i)=g2(x)⋅e.

We trivially have that (g1,h1)∘(g2,h2) is homotopic to the identity, since g1∘g2 is the identity on X. To see that (g2,h2)∘(g1,h1) is homotopic to the identity on C, observe that for all a∈A and e∈E, defining (ge,he) and (g′e,h′e) as above,

g2(g1(a))⋅e=g1(a)⋄h2(e)=g0(a)⋄h′e(i)=g′e(g0(a))⋆i=a⋅he(i)=a⋅e.

Thus C≃(X,Z,⋄), and C◃+D according to the committing definition.

Conversely, let X, Y, and Z be arbitrary sets with X⊆Y, let f:Y×Z→W, and let C≃(X,Z,⋄) and D≃(Y,Z,∙), where ⋄ and ∙ are given by x⋄z=f(x,z) and y∙z=f(y,z).

Let (g1,h1):C→(X,Z,⋄) and (g2,h2):(X,Z,⋄)→C compose to something homotopic to the identity in both orders, and let (g3,h3):D→(Y,Z,∙) and (g4,h4):(Y,Z,∙)→D compose to something homotopic to the identity in both orders. Let (g0,h0):(X,Z,⋄)→(Y,Z,∙) be given by letting g0 be the embedding of X in Y and h0 the identity on Z. (g0,h0) is clearly a morphism.

We let ϕ:C→D=(g4,h4)∘(g0,h0)∘(g1,h1).

Given a (g,h):C→⊥, our goal is to construct a (g′,h′):D→⊥ such that (g,h) is homotopic to (g′,h′)∘ϕ.

Let ⊥=(W,{i},⋆), let C=(A,E,⋅0), and let D=(B,F,⋅1). Let h′:{i}→F be given by h′=h3∘h2∘h. Let g′:B→W be given by g′(b)=b⋅1h′(i). This is clearly a morphism, since for all b∈B and i∈{i},

g′(b)⋆i=g′(b)=b⋅1h′(i).

To see that (g,h) is homotopic to (g′,h′)∘(g4,h4)∘(g0,h0)∘(g1,h1), we just need to check that (g,h1∘h0∘h4∘h′):C→⊥ is a morphism. Or, equivalently, that (g,h1∘h4∘h3∘h2∘h):C→⊥, since h0 is the identity, and h′=h3∘h2∘h.

Indeed, for all a∈A and i∈{i},

g(a)⋆i=a⋅0h(i)=a⋅0h1(h2(h(i)))=g1(a)⋄h2(h(i))=g1(a)∙h2(h(i))=g1(a)∙h4(h3(h2(h(i))))=g1(a)⋄h4(h3(h2(h(i))))=a⋅0h1(h4(h3(h2(h(i))))).

Thus (g,h) is homotopic to (g′,h′)∘ϕ, completing the proof. □

Definition: We say C◃×D if for every morphism ϕ:C→⊥, there exist morphisms ϕ0:C→D and ϕ1:D→⊥ such that ϕ≅ϕ1∘ϕ0, and for every morphism ψ:1→D, there exist morphisms ψ0:1→C and ψ1:C→D such that ψ≅ψ1∘ψ0.

Before showing that this definition is equivalent to all of the above definitions, we will give one final definition of multiplicative subagent.

1.6. Sub-Environment Definition

First, we define the concept of a sub-environment, which is dual to the concept of a subagent.

Definition: We say C is a sub-environment of D, written C◃∗D, if D∗◃C∗.

We can similarly define additive and multiplicative sub-environments.

Definition: We say C is an additive sub-environment of D, written C◃∗+D, if D∗◃+C∗. We say C is a multiplicative sub-environment of D, written C◃∗×D, if D∗◃×C∗.

This definition of multiplicative sub-environment is redundant, because the multiplicative subagent and multiplicative sub-environment relations coincide, as shown below:

Claim: C◃×D if and only if C◃∗×D.

Proof: We prove this using the externalizing definition of ◃×.

If C◃×D, then for some X, Y, Z, and f:X×Y×Z→W, we have C≃(X,Y×Z,⋄) and D≃(X×Y,Z,∙), where ⋄ and ∙ are given by x⋄(y,z)=f(x,y,z) and (x,y)∙z=f(x,y,z).

Observe that D∗≃(Z,Y×X,⋅) and C∗≃(Z×Y,X,⋆), where ⋅ and ⋆ are given by z⋅(y,x)=f(x,y,z) and (z,y)⋆x=f(x,y,z). Taking X′=Z, Y′=Y, Z′=X, and f′(x,y,z)=f(z,y,x), this is exactly the externalizing definition of D∗◃×C∗, so C◃∗×D.

Conversely, if C◃∗×D, then D∗◃×C∗, so C≅(C∗)∗◃×(D∗)∗≅D. □

We now give the sub-environment definition of multiplicative subagent:

Definition: We say C◃×D if C◃D and C◃∗D. Equivalently, we say C◃×D if C◃D and D∗◃C∗.

Claim: This definition is equivalent to the categorical definition of ◃×.

Proof: The condition that for every morphism ϕ:C→⊥, there exist morphisms ϕ0:C→D and ϕ1:D→⊥ such that ϕ≅ϕ1∘ϕ0, is exactly the categorical definition of C◃D.

The condition that for every morphism ψ:1→D, there exist morphisms ψ0:1→C and ψ1:C→D such that ψ≅ψ1∘ψ0, is equivalent to saying that for every morphism  ψ∗:D∗→⊥, there exist morphisms ψ∗0:C∗→⊥ and ψ∗1:D∗→C∗ such that ψ∗≅ψ∗1∘ψ∗0. This is the categorical definition of D∗◃C∗. □

Claim: The categorical and sub-environment definitions of ◃× are equivalent to the other four definitions of multiplicative subagent above: sub-tensor, sister, externalizing, and currying.

Proof: We show equivalence between the externalizing and sub-environment definitions. First, assume that C=(A,E,⋅) and D=(B,F,⋆) are Cartesian frames over W with C◃D and C◃∗D.

We define X=A, Z=F, and Y=hom(C,D). We define p:X×Y×Z→W by

p(a,(g,h),f)=g(a)⋆f=a⋅h(f).

We want to show that C≃(X,Y×Z,⋄), and D≃(X×Y,Z,∙), where ⋄ and ∙ are given by x⋄(y,z)=(x,y)∙z=p(x,y,z).

To see C≃(X,Y×Z,⋄), we construct (g0,h0):C→(X,Y×Z,⋄) and (g1,h1):(X,Y×Z,⋄)→C that compose to something homotopic to the identity in both orders. Let g0 and g1 be the identity on X and let h0:Y×Z→E be defined by h0((g,h),f)=h(f). By the covering definition of subagent, h0 is surjective, and so has a right inverse. Let h1:E→Y×Z be any right inverse of h0, so h0(h1(e))=e for all e∈E.

We know (g0,h0) is a morphism, because for all a∈A and ((g,h),f)∈Y×Z,

g0(a)⋄((g,h),f)=a⋄((g,h),f)=p(a,(g,h),f)=a⋅h(f)=a⋅h0((g,h),f).

We know (g1,h1) is a morphism, since for x∈X and e∈E, if ((g,h),f)=h1(e),

g1(x)⋅e=x⋅h0((g,h),f)=x⋅h(f)=p(x,(g,h),f)=x⋄((g,h),f)=x⋄h1(e).

(g0,h0) and (g1,h1) clearly compose to something homotopic to the identity in both orders, since g0∘g1 and g1∘g0 are the identity on X.

To see D≃(X×Y,Z,∙), we construct (g2,h2):D→(X×Y,Z,∙) and (g3,h3):(X×Y,Z,∙)→D that compose to something homotopic to the identity in both orders. Let h2 and h3 be the identity on Z and let g3:X×Y→B be defined by g3(a,(g,h))=g(a). By the covering definition of subagent and the fact that D∗◃C∗, g3 is surjective, and so has a right inverse. Let g2:B→X×Y be any right inverse of g3, so g3(g2(b))=b for all b∈B.

We know (g3,h3) is a morphism, because for all f∈F and (a,(g,h))∈X×Y,

g3(a,(g,h))⋆f=g(a)⋆f=p(a,(g,h),f)=(a,(g,h))∙f=(a,(g,h))∙h3(f).

We know (g2,h2) is a morphism, since for z∈Z and b∈B, if (a,(g,h))=g2(b),

g2(b)∙z=(a,(g,h))∙z=p(a,(g,h),z)=g(a)⋆z=g3(a,(g,h))⋆z=b⋆h2(z).

Observe that (g2,h2) and (g3,h3) clearly compose to something homotopic to the identity in both orders, since h2∘h3 and h3∘h2 are the identity on Z.

Thus,  C≃(X,Y×Z,⋄), and D≃(X×Y,Z,∙).

Conversely, if C◃×D according to the externalizing definition, then we also have D∗◃×C∗. However, by the currying definitions of multiplicative subagent and of subagent, multiplicative subagent is stronger than subagent, so C◃D and D∗◃C∗. □

2. Basic Properties

Now that we have enough definitions of additive and multiplicative subagent, we can cover some basic properties.

First: Additive and multiplicative subagents are subagents.

Claim: If C◃+D, then C◃D. Similarly, if C◃×D, then C◃D.

Proof: Clear from the currying definitions. □

Additive and multiplicative subagent are also well-defined up to biextensional equivalence.

Claim: If C◃+D, C′≃C, and D′≃D, then C′◃+D′. Similarly, if C◃×D, C′≃C, and D′≃D, then C′◃×D′.

Proof: Clear from the committing and externalizing definitions. □

Claim: Both ◃+ and ◃× are reflexive and transitive.

Proof: Reflexivity is clear from the categorical definitions. Transitivity of  ◃× is clear from the transitivity of ◃ and the sub-environment definition. Transitivity of ◃+ can be seen using the categorical definition, by composing the morphisms and using the fact that being homotopic is preserved by composition. □

3. Decomposition Theorems

We have two decomposition theorems involving additive and multiplicative subagents.

3.1. First Decomposition Theorem

Theorem: C0◃C1 if and only if there exists a C2 such that C0◃×C2◃+C1.

Proof: We will use the currying definitions of subagent and multiplicative subagent, and the committing definition of additive subagent. Let C0=(A0,E0,⋅0) and C1=(A1,E1,⋅1). If C0◃C1, there exists some Cartesian frame D over A1 such that C0≃C∘1(D).

Let C2=(Image(D),E1,⋅2), where ⋅2 is given by a⋅2e=a⋅1e. C2 is created by deleting some possible agents from C1, so by the committing definition of additive subagent C2◃+C1.

Also, if we let D′ be the Cartesian frame over Image(D) which is identical to D, but on a restricted codomain, then we clearly have that C∘1(D)≅C∘2(D′). Thus C0≃C∘2(D′) and Image(D′)=Agent(C2), so C0◃×C2.

The converse is trivial, since subagent is weaker than additive and multiplicative subagent and is transitive. □

Imagine that a group of kids, Alice, Bob, Carol, etc., is deciding whether to start a game of baseball or football against another group. If they choose baseball, they form a team represented by the frame CB, while if they choose football, they form a team represented by the frame CF. We can model this by imagining that C0 is the group's initial state, and CB and CF are precommitment-style subagents of C0.

Suppose the group chooses football. CF's choices are a function of Alice-the-football-player's choices, Bob-the-football-player's choices, etc. (Importantly, Alice here has different options and a different environment than if the original group had chosen baseball. So we will need to represent Alice-the-football-player, CAF, with a different frame than Alice-the-baseball-player, CAB; and likewise for Bob and the other team members.)

It is easy to see in this case that the relationship between Alice-the-football-player's frame (CAF) and the entire group's initial frame (C0) can be decomposed into the additive relationship between C0 and CF and the multiplicative relationship between CF and CAF, in that order.

The first decomposition theorem tells us that every subagent relation, even ones that don't seem to involve a combination of "making a commitment" and "being a team," can be decomposed into a combination of those two things. I've provided an example above where this factorization feels natural, but other cases may be less natural.

Using the framing from our discussion of the currying definitions: this decomposition is always possible because we can always decompose a possibly-nondeterministic function f into (1) a possibly-nondeterministic surjective function onto f's image, and (2) a deterministic function embedding f's image in f's codomain.
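The decomposition described above is easy to sketch for ordinary (deterministic) functions, with a hypothetical toy f not taken from the post:

```python
# Any f : A → B factors as a surjection onto its image followed by the
# embedding of the image into B. Hypothetical toy example.

def factor(f, domain):
    image = sorted({f(a) for a in domain})
    surj = {a: f(a) for a in domain}   # A ->> image, surjective
    embed = {b: b for b in image}      # image -> B, the inclusion
    return surj, embed, image

def f(n):
    return n % 3

surj, embed, image = factor(f, range(10))
assert image == [0, 1, 2]
assert all(embed[surj[a]] == f(a) for a in range(10))
```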

3.2. Second Decomposition Theorem

Theorem: There exists a morphism from C0 to C1 if and only if there exists a C2 such that C0◃∗+C2◃+C1.

Proof: First, let C0=(A,E,⋅), let C1=(B,F,⋆), and let (g,h):C0→C1. We let C2=(A,F,⋄), where a⋄f=g(a)⋆f=a⋅h(f).

First, we show C2◃+C1. To do this, we let B′⊆B be the image of g, and let C′2=(B′,F,⋆′), where ⋆′ is given by b⋆′f=b⋆f. By the committing definition of additive subagent, it suffices to show that C′2≃C2.

We define (g0,h0):C2→C′2 and (g1,h1):C′2→C2 as follows. We let h0 and h1 be the identity on F. We let g0:A→B′ be given by g0(a)=g(a). Observe that g0 is surjective, and thus has a right inverse. Let g1 be any right inverse to g0, so g0(g1(b))=b for all b∈B′.

We know (g0,h0) is a morphism, since for all a∈A and f∈F, we have

g0(a)⋆′f=g(a)⋆f=a⋄f=a⋄h0(f).

Similarly, we know (g1,h1) is a morphism, since for all b∈B′ and f∈F, we have

g1(b)⋄f=g1(b)⋄h0(f)=g0(g1(b))⋆′f=b⋆′f=b⋆′h1(f).

Clearly, (g0,h0)∘(g1,h1) and  (g1,h1)∘(g0,h0) are homotopic to the identity, since h0∘h1 and h1∘h0 are the identity on F. Thus, C′2≃C2.

The fact that C0◃∗+C2, or equivalently C∗2◃+C∗0, follows by a symmetric argument, since the relationship between C∗2 and C∗0 is the same as the relationship between C2 and C1.

Conversely, if C2◃+C1, there is a morphism from C2 to C1 by the categorical definition of additive subagent. Similarly, if C0◃∗+C2, then C∗2◃+C∗0, so there is a morphism from  C∗2 to C∗0, and thus a morphism from C0 to C2. These compose to a morphism from C0 to C1. □

When we introduced morphisms and described them as "interfaces," we noted that every morphism (g,h):C0→C1 implies the existence of an intermediate frame C2 that represents Agent(C0) interacting with Env(C1). The second decomposition theorem formalizes this claim, and also notes that this intermediate frame is a super-environment of C0 and a subagent of C1.

In our next post, we will provide several methods for constructing additive and multiplicative subagents: "Committing, Assuming, Externalizing, and Internalizing."

I'll be hosting online office hours this Sunday at 2-4pm PT for discussing Cartesian frames.

Discuss

"model scores" is a questionable concept

6 November 2020 - 12:33
Published on November 6, 2020 3:19 AM GMT

A couple of years ago, while I was working on a problem of what to do with the predictions of a random forest model, I noticed that the mean of the predictions on the test set didn't match the mean of the outcomes in the training data. Let me make that more concrete; consider the following problem (isomorphic to a problem I worked on in the past):

A town has two donut shops. The donut shops are in terrible competition, and the townspeople feel very strongly about the shops. It's, like, political. You run the good shop. Every year, 6% of people switch to your competitor. You are a data-driven company, and your kid needed something to do, so they've been sitting in the corner of the shop for years scribbling down information about each visitor and visit. So now you have a bunch of data on all your customers. "Did they come in that Monday 4 weeks ago? How about the Monday after that? How many donuts did they buy each month?" and so on. Representing "They switched to the other shop forever" as 1 and "that didn't happen" as 0, we have a label vector labels of 0's and 1's, grouped into, say, one data record and label per customer-year. We want to predict who might switch soon, so that we can try and prevent the switch.

So we put your kid's notes into some predictive model, train it, and make predictions on some other set of data not used in the training (call this the test set). Each prediction will take as input a data record that looks something like (Maxwell, Male, age 30, visited twice this year), and output a continuous prediction in the range (0, 1) for how likely Maxwell is to leave. You'll also have a label, 1 or 0, for whether Maxwell actually left. After doing these predictions on the test set, you can check how well your model is modeling by comparing the true labels to the predictions in various ways. One thing to check when figuring out how well you did: check that the average predicted probability of leaving is close to the fraction of people who actually left. If there are 100 people in the test set, and only 6 actually left in any given year, you should be unhappy to see a prediction vector where all the predictions look like [0.1, 0.9, 0.2, 0.8] (imagine this list is much longer with numbers on a similar scale), because the expected fraction of leavers from those probabilities is (0.1 + 0.9 + 0.2 + 0.8) / 4 = 2 / 4 = 0.5, or 50%. When only 6% actually left! The predictions are too high. This happens.
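The sanity check described above takes only a few lines; the numbers here are the toy ones from the example, with the 6% base rate from the story:

```python
# Compare the mean predicted probability to the observed base rate.

def mean(xs):
    return sum(xs) / len(xs)

predictions = [0.1, 0.9, 0.2, 0.8]  # model outputs on the test set
base_rate = 0.06                    # 6% of customers actually leave per year

mean_pred = mean(predictions)       # 0.5, far above the 6% base rate
assert abs(mean_pred - base_rate) > 0.1   # red flag: predictions too high
```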

So. I had a project, working with the output of someone else's classification model, isomorphic to the donut model. The input data had been balanced, and the probabilities had the undesired property of mean(predictions) >> mean(true labels). And I thought... Oh. The means are way off. This means models don't always output probabilities! They output something else. I started calling these output numbers "model scores", because they lacked the means-are-close property that I wanted. And for years, when people talked about the outputs of classification models and called them "probabilities", I would suggest calling them model scores instead, because the outputs weren't promised to satisfy this property. But I think I was off-base. Now I think that classifier outputs are still probabilities - they are just bad probabilities, that have the base rate way off. Being off on the base rate doesn't make something not a probability, though, any more than a biased sample or data issue turns probabilities into some separate type of thing. Bad probabilities are probabilities! You can apply the operations and reasoning of probability theory to them just the same.

I still think it's good to distinguish between 1) probabilities that are only good for relative ordering from 2) probabilities that are good for more than that, but saying that the former aren't probabilities was going too far. This is what I mean by "'model scores' is a questionable concept."

Calibrating the probabilities from a model

If your model's prediction probabilities have undesired properties, and re-training isn't an option... you can fix it! You can go from mean(predictions) = 50% to mean(predictions) = mean(true labels) = 6%. In my problem, I wanted to do this so that we could apply some decision theory stuff to it: Say it costs $10 to mail a person a great offer, that has some chance of keeping them around for the next year (out of appreciation). You expect the average retained person to spend $50 in a year. Consider a customer, Kim. Should you mail her a coupon? It might be worth it if Kim seems on the fence about the donut question - maybe you heard her talking about the other shop wistfully recently, and your prediction on her switching soon is high. Then $10 might be worth it, since you might be able to avoid the $50 loss. On the other hand, if Randall has been coming to your shop for 20 years and has zero interest in the other shop, that $10 probably doesn't make a difference either way, and would be a waste. Here, having well-calibrated probabilities, in the means-are-close sense, is important. If I think my customers are 50% likely to leave on average, I should freak out and mail everyone - I'm looking at a 50% * $50 loss per customer. But this isn't actually the situation, since we know just 6% of people leave each year.
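The coupon decision above is just an expected-value comparison. A sketch with the story's numbers ($10 coupon, $50 customer value), under the simplifying assumption, not from the post, that a mailed coupon always retains the customer:

```python
# Expected net value of mailing a coupon, with the story's numbers.
# Simplifying assumption (mine): the coupon always retains the customer.

COST, VALUE = 10.0, 50.0

def expected_gain(p_leave):
    """Expected net value of mailing a coupon to a customer who would
    otherwise leave with probability p_leave."""
    return p_leave * VALUE - COST

assert abs(expected_gain(0.9) - 35.0) < 1e-9   # Kim-like: worth mailing
assert abs(expected_gain(0.2) - 0.0) < 1e-9    # break-even point
assert expected_gain(0.06) < 0                 # average customer: skip
```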

One simple fix is linear scaling: multiply every prediction by mean(true labels) / mean(predictions), taking the original distribution of predictions to this scaled distribution:

The true shape of this distribution is the same as the first one. It only looks a little different because I let my plotting software choose how to bin things.

It looks the same, but the range has been compressed - there are no predictions above 0.21 now. If you zoom out to the full range (0, 1), the latter picture looks like this:

Same shape again, binned differently again.

The means are equal now. But what sucks about this is: what if there is someone with a prediction of 0.90? They'll show up here as having a 0.20 probability of leaving. 90% of $50 is $45, so you expect to make a net $35 from sending the coupon. But if you use the after-linear-scaling prediction of 20%, 0.2 * $50 - $10 gives an expected gain of $0, so you might not bother. Smushing predictions toward 0 will screw up our coupon calculations.

There are actually many ways to change the distribution of the predictions p_m into some distribution of scaled predictions p_s, where mean(p_s) = μ_t. In some cases you can use the inner workings of your model directly, like in this explanation of how to change your priors in a trained logistic regression. What I ended up doing in my problem was some sort of ad-hoc exponential scaling, raising the p_m to whatever power x made mean(p_m^x) = μ_t. I don't remember all my reasons for doing this - I remember it at least avoiding the smush-everything-left behavior of the linear scaling - but overall I don't think it was very well-founded.
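That exponential scaling is easy to reconstruct as a sketch (my guess at the procedure, not the original code): since mean(p^x) decreases monotonically in x when every prediction lies strictly between 0 and 1, a simple bisection on x finds the power that hits the target mean μ_t.

```python
def fit_power(preds, target_mean, tol=1e-9):
    """Find x such that mean(p**x) equals target_mean, by bisection.

    Assumes 0 < p < 1 for all predictions, so mean(p**x) is strictly
    decreasing in x and the root is unique.
    """
    def mean_pow(x):
        return sum(p ** x for p in preds) / len(preds)

    lo, hi = 1e-6, 1.0
    while mean_pow(hi) > target_mean:  # grow hi until the root is bracketed
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_pow(mid) > target_mean:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Unlike linear scaling, p^x keeps high scores comparatively high: at x = 2, a 0.9 maps to 0.81 rather than being smushed down toward 0.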

I think the right thing to do, if you can't retrain the model to do it without balancing, is to write down the qualitative and quantitative conditions you want, as math formulas, and find the scaling function that satisfies them (if it exists). Like, if I want scores above 0.9 to stay above 0.9, and for the mean to equal μ_t, then I have two constraints. These two constraints narrow down the large space of scaling functions. A third constraint narrows it down more, and so on, until you either find a unique family of functions that fits your requirements, or find that there is no such function. In the latter case you can relax some of your requirements. My point here is that you have some ideas in your head of the behavior you want: say you know some people really want to leave - you overheard a very bitter conversation about the change to the bavarians, but you don't know who was talking. You've overheard a few bitter conversations like this every year for the past decade. To represent this, you could write down a constraint like "at least 5% of predictions should be assigned 0.9+". Writing down constraints like this in a real problem is probably hard, and I haven't done it before, but I hope to write a post about doing it in the future.
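A toy sketch of what writing constraints down might look like (all numbers and the score distribution here are made up for illustration; this is not code from the problem above): encode each requirement as a predicate over (original, scaled) pairs, then scan a candidate family of scaling functions for members that satisfy all of them.

```python
def satisfies(scale, preds, constraints):
    """Return True if the scaling function meets every constraint."""
    pairs = [(p, scale(p)) for p in preds]
    return all(check(pairs) for check in constraints)

target = 0.06
constraints = [
    # mean of the scaled scores must (approximately) equal the target
    lambda pairs: abs(sum(s for _, s in pairs) / len(pairs) - target) < 1e-3,
    # scores above 0.9 must stay above 0.9
    lambda pairs: all(s > 0.9 for p, s in pairs if p > 0.9),
]

preds = [0.05] * 90 + [0.5] * 5 + [0.95] * 5  # hypothetical score distribution
feasible = [x / 100 for x in range(1, 500)
            if satisfies(lambda p, x=x / 100: p ** x, preds, constraints)]
# feasible is a narrow band of exponents near x = 2; an empty list would mean
# the power family cannot meet these constraints and one must be relaxed
```

Here a one-parameter family (power scaling) is scanned by brute force; with more constraints or richer families you would hand the same predicates to a proper solver instead.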

Discuss

Being Productive With Chronic Health Conditions

November 6, 2020 - 06:08
Published on November 5, 2020 12:22 AM GMT

This post appeared first on the EA Coaching blog.

A fair number of clients who suffer from chronic health problems come through my metaphorical door. Personally, I’ve dealt with a chronic health condition called POTS since my early teens.

When you’re dealing with longer term, physical chronic health problems that cause fatigue, brain fog, or pain, standard productivity advice might not work. Common advice can even backfire, leaving you worse off. Similarly, shorter term problems such as a concussion or pregnancy will temporarily change what you can do.

Of course, I am not a doctor. This is a collection of productivity tips based on many conversations I’ve had, not proven medical advice. If you have a chronic health issue, you should be getting medical advice from appropriate professionals.

Improving your habits and better managing your symptoms will almost certainly increase your productivity more than trying to power through. So work smarter, not harder. Here are some tips for improving your capacity to do work.

Spend time trying to find treatments as early as possible. The biggest productivity gains may come from finding better ways to manage your condition. You may need to talk to many specialists, try different medications, adopt new routines, or learn new ways of moving. The earlier you do these, the longer you will be enjoying the rewards. College is a great time to do this. Of course, grant yourself some compassion if navigating a complicated medical landscape seems like an overwhelming burden on top of dealing with your normal work.

Get enough sleep. Sleeping less than you need so that you have more time awake will reduce your productivity. If you have any kind of chronic fatigue issue, you just need more sleep than average. It’s common to need nine hours a night or more in such cases. If you’ve been running a sleep debt, you may need a lot of rest to catch up over many days.

Adapt your environment to support you. If people are getting annoyed that you’re taking longer than normal to reply to their emails, set up an autoreply that you’re slow to reply right now because of a health problem. If you’re struggling to remember to take your pills, get a weekly pill organizer or an electric bottle cap that tells you when you last took your meds. Build in redundant systems to make sure the vital habits happen.

Make your key habits robust and flexible. When you find the actions, medication, or tricks that make or break your day, plan a way to do those even when you let your other work slide. When you're “falling behind,” it can feel like you don’t have time to exercise or sleep enough or do your stretches. Build routines flexible enough to adapt. Maybe you need to schedule your deep work for today based on when you have energy. Or maybe you need to make a scaled-down habit list with just one or two items that you follow when you’re not feeling up to your full normal routine.

Learn your warning signs and pay attention to them. Learn the warning signs that your body is nearing its limit. When those warning signs pop up, plan responses that will leave you ready to be productive again tomorrow. E.g. if you have carpal tunnel, stop before your hands hurt.

Plan on not being 100%. When you’re setting goals, give yourself buffer time. Are you brain foggy one in three days? Then don’t make a plan that requires you to be at peak capacity every day. If you can do more than you planned, great! If not, you already have a plan in place for dealing with it.

Slow down and respond to new conditions. If your condition recently developed, slow down a bit more than you think you need. You’re still calibrated to what you could do before, but your “normal” has changed. You need to accept that a good day right now isn’t as good as what you used to consider a good day. While you’re figuring out what you can do, you don’t want to push yourself too hard. You might make things worse. This is especially common with repetitive stress injuries like carpal tunnel. Second, aggressively try to solve this new problem. The above argument for solving a problem earlier rather than later applies. More importantly, however, some health issues are much easier to fix early on. E.g. if you keep working with a concussion, you might cause further brain damage.

Personal examples:

It’s not always easy to imagine what these tips would look like in practice. So I’ve included some personal examples of what they look like for me.

One of my key habits is regular exercise. So I start off the day with fifteen minutes of yoga. A habit established mainly, I confess, through committing to pay a housemate for each missed day. I sat out half the standing poses today because I was feeling lightheaded (one of my warning signs). Since overexercising can leave me tired for weeks, I’m not trying to push myself here.

I have three bottles of salted water beside my bed so I stay hydrated. After what feels like the pharmacological equivalent of shopping for new clothes, one of the many drugs my doctors recommended works really well. The salt water, meds, and other things I’ve done to manage POTS have been huge in increasing my productive time - there is no way I could have just pushed through to my current levels of productivity.

I went to bed early enough last night that I could get a solid 8.5 hours of sleep. My housemates graciously agreed to pause our board game and finish it today so I could get enough sleep.

Most advice you hear will argue either to accept your limits or push yourself more. That’s because most people giving such advice are trying to push back on the pendulum swinging too far in one direction.

This post isn’t about recommending either.

I get the arguments for wanting to accomplish more, and sometimes being able to do so if you push yourself more. I also understand why pushing yourself too much can be net harmful to your productivity (and, of course, your happiness).

What is hard is deciding when you should do each.

Because “push yourself” and “accept your limits” isn’t a binary choice. Rather, you’re trying to find where your limits should be and when you want to push yourself more.

This is hard because the normal benchmarks don’t apply. If work is a boulder each person has to push up a hill, your boulder is bigger than that of a “normal” person. So it doesn’t make sense to measure yourself against the arbitrary eight-hour workday.

Instead, you need to figure out for yourself what you can accomplish. When will you expect that you “should” be productive and push through? When will you accept that you gave it your best but hit a limit?

A couple of therapists and I put our heads together to brainstorm the following criteria for deciding where to set your limit. The goal we aimed for is sustainability – can you maintain this threshold without burning out or further damaging your health?

Here’s some questions we came up with:

Are you exhausted all of the time? Does increasing time off work or rest reduce the exhaustion? If you’re exhausted unless you sleep more, sleep more. Running an increasingly large sleep deficit is not sustainable.

Are you experiencing sudden changes? E.g. sudden loss of energy, sudden low mood or mania, sudden loss or increase of appetite, sudden increase in brain fog, sudden loss of weight? If you experience sudden changes, check with a doctor. These can be a symptom of something severe.

Does taking a break (e.g. a weekend off) substantially improve your symptoms? If taking a break improves your symptoms, think about what breaks you need for sustainability. “I’ll just push through” often doesn’t work for prolonged periods of time, and you can set yourself back far more if you overdo it. Draw your limits around protecting the necessary amount of break time.

Is your physical health impacting your behavior? E.g. snapping at people or making many typos. If so, how costly could these mistakes be? If you might mess up something important because you’re fatigued, consider whether it’s actually more effective for you to work less and rest more.

These questions are about you over time, not just this moment. You might feel okay today getting less sleep, but then consistently be tired after a week of doing so. Your limit isn’t the max you can possibly push yourself; it’s healthy boundaries that protect your ability to sustainably do important work.

Trying to do more

As you build your habits, you probably want to try doing things and noticing when you should ease off. However, constantly checking in with yourself might cause more stress and heighten awareness of symptoms. So it's useful to watch out for that and find a pattern of checking in with yourself that works well for you.

Here are some tips for when you want to try doing more.

Prioritize ruthlessly. You have more limited capacity than other people. However, this usually reduces the amount you can do more than the importance of what you can do. You can make up a good deal of that gap by doing just the most important tasks.

Try to avoid failing with abandon. It can feel tempting to write off the entire day when you wake up feeling awful. Or to think that you shouldn’t try because you’ll never be able to do something. There may be times when those responses are warranted, but they shouldn’t be the default. Even if your day started out bad, maybe you’ll feel up to doing some work in the afternoon. So check in with what you feel up to right now.

Try the 5-minute test. A therapist mentor once told me that for people dealing with anxiety, they suggest the person ask themselves “Can I do this for five minutes?” If the answer is yes, do it for five minutes. Then ask the question again. When the answer is no, it’s okay to stop. You can apply a similar check in with yourself. Ask yourself questions like these: “Do I feel up to doing my top priority for five minutes? If not, do I feel up to doing easier work? If not, what break is most likely to leave me feeling better later?” You can also use it for self-care, “Can I do five minutes of exercise?” or “Can I spend 5 minutes figuring out which meds I need to order?”

Work up slowly. Set goals based on what you’ve done previously. No one would expect to go from couch potato to running a marathon instantly. Similarly, if you normally work three hours a day, don’t suddenly jump to eight. Even if you manage to do it for a day or two, it’s not sustainable. Increase your work in small chunks so that you have time to learn the necessary habits and notice when you are overexerting yourself. When you experience a setback, recalibrate and go more slowly.

Personal examples:

I’m aiming for three hours of deep work today since I only have a few calls. Through trial and error, I know I can expect to hit that goal ~80% of the time if I work hard. So it’s a mild stretch goal that pushes me a little bit.

Today wasn’t a good day. I woke up feeling like my head was stuffed with cotton, and didn’t end up starting my deep work until 5pm - far later than the planned 9am.

Cut yourself slack

Finally, here are some ways to think if you’re near your limit.

You don’t need to compare yourself to others. It doesn’t always feel like you should be cutting yourself slack. By which I mean, it’s really hard to know if you're just "not trying" as hard or if you're pushing a heavier boulder up the hill than others. Psychologically, it can help if the problem suddenly starts or you can get an official diagnosis, so you can compare your current experience to recent memory or quantitative medical criteria to know that your experience is not normal.

In practice, I set thresholds by the above questions on figuring out your limits. If you are testing how much you can do and finding the limits you need to respect so you’re not too tired or in too much pain, then that is the upper bound of what you should expect of yourself. This is true even if someone else doesn’t have those limits.

You’re not slacking off.

It’s easy to feel bad about your productivity if this is you. You probably aren’t able to work as much as you see others working or as much as you think you should be working. I’ve spent years building habits, improving prioritization, and learning to manage my condition – and I still average fewer hours of work each day than my partner does with far less effort. I still have days where I hit a wall and need to nap for several hours.

When you see yourself struggling to accomplish what others seem to do easily, it can feel like you are a failure. And that sucks.

But you are not lazy or slacking off. To return to my earlier analogy, if work is a boulder each person has to push up a hill, your boulder is bigger than theirs. So, cut yourself some slack if you can’t push your boulder up quite as high a hill as someone with a lighter boulder.

You’re pushing through something that is genuinely difficult. For many people with chronic illness, the everyday experience is the same as a normal person’s sick day. The difference is that when a normal person feels as bad as you do right now, they call in sick and stay in bed. But this is your normal. You can’t let yourself take off every “normal person” bad day, or else you wouldn’t have any days to work. So you push through pain or fatigue or brain fog, and that takes extra effort.

Personal examples:

I schedule client calls with breaks every few hours so that I have time to take a nap if necessary.

I score my productivity for the day on a 1-5 scale based on what percent I accomplished of the work I think I could have done given how I felt today. On days when I’m tired or brain foggy, “good enough” translates to much less work than on a good day. I’ve failed at this so many times that I’ve come to respect my warning signs.

Conclusion

Chronic conditions have an up-and-down cycle - sometimes it’s meh, sometimes things seem mostly fine, and sometimes you’re in a really bad place. Managing your productivity is about learning to negotiate that cycle: reduce the downs if you can, make the most of the ups, and practice some self-love as you go.

Because things can get better. While I still work fewer hours than some of my peers, I’m able to work mostly normal hours. That wasn’t true a few years ago.

I hope you found these helpful for deciding when to push or cut yourself slack. To reiterate, please be kind to yourself. You’ll be happier and more productive optimizing for sustainable work rather than beating yourself up for having a health problem.

Resources and acknowledgements

Living a Healthy Life with Chronic Conditions - I haven’t read it, but I found this book on the CDC’s site for self-management of chronic conditions. It seems to have a broad collection of basic tips.

Many thanks to Ewelina Tur, Damon Pourtahmaseb-Sasi, Daniel Kestenholz, Nicole Ross, Mary Wang, Rohin Shah, Bill Zito, Jonathan Mustin, and Amanda Ngo for their input.

Discuss

Sunday, Nov 5: Tuning Your Cognitive Algorithms

November 6, 2020 - 03:57
Published on November 6, 2020 12:57 AM GMT

For this Sunday's LW Online Meetup, I'll be leading an exercise on Mindful Puzzle Solving, derived from the Tuning Your Cognitive Algorithms exercise on bewelltuned.com.

The basic premise is to solve a medium-difficulty puzzle while paying close attention to exactly what your brain is doing at every step. You can then notice which pieces of your process are doing most of the work, and which are wasted motion.

This is not only valuable for improving your puzzle-solving abilities, but for generally improving the feedback loops that improve your cognitive abilities. I've found this to be one of the most essential rationalist skills that I've learned.

We'll be meeting in this Zoom Room at noon PT. I'll give a short talk on the theory and practice of tuning your cognitive algorithms. We'll spend two 20-minute periods doing individual exercises, and then discuss what we learned.

Afterwards, we'll head over to the Walled Garden (LessWrong's persistent gather.town world) for after-meetup chit chat.

Discuss

November 6, 2020 - 03:07
Published on November 4, 2020 7:37 PM GMT

From a pure consumption perspective, the basic problem of gift-giving is that you usually know what you want better than I know what you want. If we both buy each other $20 gifts, then we’ll probably end up with things we want less than whatever we would have bought ourselves for $20. The ceremony and social occasion of gift-giving adds enough value to offset the downsides, but we still end up with entire stores selling useless crap which is given as a gift and then promptly disposed of by the recipient.

The Yard Sale Supply Warehouse - stocking all those little knick-knacks and decorations which you have no use for, but it’s so cute your relative would just love it as a present

We can minimize the problem with wish-lists or gifting cash, but that’s still just buying people the things they’d likely buy for themselves anyway (modulo general thriftiness). What if I want to choose gifts in a way that directly adds real value? In other words: I want to choose gifts which will give someone more value than whatever they would have bought themselves.

On the face of it, this sounds rather… presumptuous. The only way this makes sense is if I know what you want better than you know what you want, in at least some cases.

On the other hand, there are things which money alone cannot buy - specifically knowledge and expertise. If I do not have enough knowledge in an area to distinguish true experts from those who claim to be experts, then I cannot efficiently buy myself the products of true expertise. If you do have enough knowledge to distinguish true experts, or to act as an expert yourself, then that creates an opportunity for value-added gift giving: you can buy the products of true expertise as a gift for me.

A few examples…

My father is into cooking, and has strong opinions about spatulas. A spatula should have some give without being floppy, have a reasonably long lifting-part and a comfortable handle, be dishwasher-safe, and under no circumstances should a spatula be made of plastic. He spent years searching for the perfect spatula, and finally settled on one. I once told him that in his will I wanted him to leave me the spatula - my brother and sister can have the rest, I didn’t need the house or whatever money might be left, but I wanted that spatula.

That particular spatula is no longer in production, but they weren’t very rare when they were made, so he kept an eye out for them at yard sales and the like. A few years ago, he gave me one such spatula for Christmas. That’s a gift that money alone cannot buy.

If you see this at a flea market, check that the flat part flexes a bit under pressure, then BUY IT

A less dramatic example: my father enjoys the sort of light econ books which are popular in rationalist-adjacent circles - think Elephant in the Brain or David Friedman’s lighter books. But he doesn’t have a social circle which recommends and reviews such books. The best he can do is go to a bookstore and look around, or keep an eye on authors he’s read before, or look at Amazon’s recommendations. As a side benefit of reading econ and rationalist blogs, I can often pick out good books for my father that he wouldn’t have encountered himself.

Similarly: I have a group of friends who are really into board games. Prior to the pandemic, we’d have game night once or twice a week. They had a large collection and kept up with new releases, so I’ve experienced a reasonably wide array of games. By contrast, my parents enjoy board games, but they grew up with Hasbro crap and don’t have a social circle which exposes them to anything decent. So, I’ve taken to buying them games as gifts. These have been quite high-value - for instance, they still bring Love Letter with them whenever they travel. (As an added bonus, the new games are good for socializing with the family during the holidays.)

An example from this year: I’ve been to Shanghai with my girlfriend several times. She speaks the language, I don’t. There’s a bunch of tasty Chinese foods which are basically unheard of in the US; often they don’t even have standard English names. (I’m particularly fond of Xinjiang/Uyghur ethnic food.) They’re rare even in authentic Chinese restaurants here, and looking up recipes is hard-to-impossible - even when I can find a recipe in English, it’s usually some bland knockoff written by a white Midwesterner whose idea of “spices” is “maybe a little salt and pepper”. So, for Christmas this year I asked my girlfriend for recipes for my favorite Chinese foods.

These are all gifts which money alone cannot buy - at a bare minimum, they’d require a large amount of extra effort just to turn the money into the product. The general rule is: think about what knowledge or expertise you have which the gift-recipient lacks, and try to use that expertise to produce a gift which the recipient would not be able to buy for themselves, or would not know to buy for themselves.

Discuss

Impostor Syndrome as skill/dominance mismatch

November 6, 2020 - 02:54
Published on November 5, 2020 8:05 PM GMT

I am surprised that there is nothing about Impostor Syndrome on Robin Hanson's website, when to me it seems obviously connected to status. To use the standard formula: Impostor Syndrome is not about lack of skills.

(Also related to: humility, status regulation, unpopularity of nerds.)

Let me quote my older article:

Robin Hanson calls the two basic forms of status "dominance" and "prestige"; the fear-based and the admiration-based sources of social power respectively. He also notes how people high in "dominance" prefer to be perceived as (also) high in "prestige" [1, 2]. Simply said, a brutal dictator often wants to be praised as kind and smart and skilled (e.g. the "coryphaeus of science" Stalin), not merely powerful and dangerous.

[...] If you are not ready to challenge the chieftain for the leadership of the tribe, and if you don't want to risk being perceived as such, the safe behavior is to also downplay your skills as a hunter.

Although humans have two mechanisms of constructing social hierarchies, at the end of the day both of them compete for the same resource: power over people. Thus we see powerful people leveraging their power to also get acknowledged as artists or scientists; and successful artists or scientists leveraging their popularity to express political opinions.

The hierarchy of "dominance" is based on strength, but is not strength alone. The strongest chimp in the tribe can be defeated if the second-strongest and third-strongest join against him. Civilization makes it even more complicated. Stalin wasn't the physically strongest man in the entire Soviet Union.

(In theory, the most powerful person shouldn't need physical strength at all, if they have an army and secret police at command. But in practice, I suppose our instincts demand it; a physically weak leader would probably be a permanent magnet for rebellions. Therefore leaders flaunt their health and strength.)

Similarly, the hierarchy of "prestige" is based on skill, but is not skill alone. The most skilled person can be... what? Outskilled by a coalition of opponents? Nah, sounds like too much work. It is easier to stop them flaunting their skill, either by taking away their tools, or by threatening to break their arms and legs if you see them performing publicly again.

Which makes the prestige ladder a mixed one. To get on the top, you need a combination of superior skill and at least average dominance. If you are 10 at skill and 2 at dominance, you will probably be bullied into submission by someone who is 9 at skill and 7 at dominance. (Remember that dominance is not only physical strength; it also includes social power. Sticks and stones may break your bones, but Twitter can ruin your life.) No one applauds a talent who was too afraid to get on the stage.

What are you supposed to do then, if you happen to be 10 at skill and 2 at dominance, and your neighbor is 9 at skill and 7 at dominance and looks pissed off when you are around? Well, if you value your life, but can't increase your dominance, the solution is to downplay your skill and pretend to be at most 8; maybe even less just to be safe.

How is this all related to the Impostor Syndrome?

I suspect that Impostor Syndrome is simply an instinctive reaction to noticing that your skills are disproportionately high compared to your relative dominance at the workplace. This needs to stop, now! The most reliable way to convince others of your incompetence is to convince yourself. So you notice some imperfection of yourself or your work, and you exaggerate it in your mind until you feel like a complete idiot. Except you still have the prestigious job title, so now you fear you might be punished for that.

What predictions does this model make?

People with Impostor Syndrome are on average physically weaker or less popular or coming from less privileged backgrounds than people who feel like true masters (controlling for the actual level of skill).

Therapy based on "look, according to evidence X, you are really skilled" and "hey, nobody is perfect" will not work, unless it accidentally stumbles on something that makes the patient feel stronger or more popular. On the other hand, weightlifting will reduce the Impostor Syndrome, despite having no relation to the disputed skill.

Discuss

Generalized Heat Engine

November 6, 2020 - 01:25
Published on November 5, 2020 7:01 PM GMT

I’d like to be able to apply more of the tools of statistical mechanics and thermodynamics outside the context of physics. For some pieces, that’s pretty straightforward - a large chunk of statistical mechanics is just information theory, and that’s already a flourishing standalone field which formulates things in general ways. But for other pieces, it’s less obvious. What’s the analogue of a refrigerator or a carnot cycle in more general problems? How do “work” and “heat” generalize to problems outside physics? The principle of maximum entropy tells us how to generalize temperature, and offers one generalization of work and heat, but it’s not immediately obvious why we can’t extract “work” from “heat” without subsystems at different temperatures, or how to turn that into a useful idea in non-physics applications.

This post documents my own exploration of these questions in the context of a relatively simple problem, with minimal reference to physics (other than by analogy). Specifically: we’ll talk about how to construct the analogue of a heat engine using biased coins.

Intuition

The main idea I want to generalize here is that we can “move uncertainty around” without reducing uncertainty. This is exactly what e.g. a refrigerator or heat engine does.

Consider the viewpoint of a refrigerator-designer. All the microscopic dynamics of the (fridge + environment) system must be reversible, so the number of possible microscopic states will never decrease on its own as time passes. The only way to reduce uncertainty about the microscopic state is to observe it. But the fridge designer is designing the system, deciding in advance how it will behave. The designer has no direct access to the environment in which the fridge will run, no way to measure the exact positions the atoms will be in when the fridge first turns on. The designer, in short, cannot directly observe the system. So, from the designer’s perspective, there’s uncertainty which cannot be reduced.

(In statistical mechanics, there are several entirely different justifications for why observations can’t reduce microscopic uncertainty/entropy - for instance, in one approach, macroscopic variables are chosen in such a way that we can deterministically predict future macroscopic observations. Another comes from Maxwell’s demon-style arguments, where the demon’s memory has to be included as part of the system. I’ll use the designer viewpoint, since it’s conceptually simple and easy to apply in other areas - in particular, we can easily apply it to the design of AIs embedded in their environment.)

While we can’t reduce our total uncertainty, we can move it around. We design the machine to apply transformations to the system which leave us more certain about some subsystems (e.g. the inside of the refrigerator), but less certain about other subsystems (e.g. heat baths used to power the system).

Setup

We’re going to apply transformations to these coins. Each transformation replaces some set of coins with new values which are a function of their old values. For instance, one transformation might be

$(X_{C1},\, X_{H3},\, X_{H7}) \leftarrow (X_{C1},\; X_{H3}\overline{X_{C1}} + X_{H7}X_{C1},\; X_{H7}\overline{X_{C1}} + X_{H3}X_{C1})$

(Here the bar denotes logical not - i.e. $\overline{X}$ means "not $X$".) This transformation swaps $X_{H3}$ with $X_{H7}$ if $X_{C1}$ is 1, and leaves everything unchanged if $X_{C1}$ is 0.

We’ll mostly be able to use any transformations we want, but with two big constraints. First: all transformations must be reversible. If we know the final state of the coins and which transformations were applied, then we must be able to reconstruct the initial state of the coins. (This is the analogue of microscopic reversibility.) Our example transformation above is reversible - since it doesn’t change XC1, we can always tell whether XH3 and XH7 were swapped, and we can swap them back if they were (indeed, we can do so by simply reapplying the same transformation).

Second constraint: all transformations must conserve the number of heads; heads can be neither created nor destroyed on net. Here the number of heads is our analogue of energy, and heads-conservation is our analogue of microscopic energy conservation. (In physics, we’d probably describe this as some kind of spin system in an external magnetic field.) Our example transformation above conserves the number of heads: it either swaps two coins or leaves everything alone, so the total number of heads stays the same.
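As a sanity check, here is the example transformation in Python (a sketch; representing the three coins as a dict of 0/1 values is my own choice): it swaps H3 and H7 when the control coin C1 is 1, and the loop verifies that it is its own inverse and conserves heads.

```python
import random

def controlled_swap(state):
    """Swap coins H3 and H7 iff control coin C1 is 1 (state maps names to 0/1)."""
    new = dict(state)
    if state["C1"] == 1:
        new["H3"], new["H7"] = state["H7"], state["H3"]
    return new

# Check both constraints over random states.
rng = random.Random(0)
for _ in range(100):
    s = {name: rng.randint(0, 1) for name in ("C1", "H3", "H7")}
    t = controlled_swap(s)
    assert sum(t.values()) == sum(s.values())  # conserves the number of heads
    assert controlled_swap(t) == s             # reversible: its own inverse
```

Reversibility is immediate here because the transformation is a permutation of states, which is exactly the property the two constraints demand.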

One more key rule: while we will be able to choose what transformation to apply, we do not get to look at the coins before choosing our transformation. Physical analogy: if we’re building a heat engine or refrigerator or the like, we can’t just freely observe the microscopic state of the system. More generally, if we’re designing some machine (like a heat engine), we have to decide up-front how the machine will behave, before we have perfect information about the environment in which it will run. The machine itself can “observe” variables while running, but the machine is part of the system, so those “observations” need to be reversible and energy-conserving just like any other transformations.

Writing it all out mathematically: we choose some transformation T for which

• $(X_H, X_C)' = T(X_H, X_C)$
• $\left(\sum_k X_{Hk} + \sum_k X_{Ck}\right)' = \sum_k X_{Hk} + \sum_k X_{Ck}$
• $T$ is invertible

We’ll want to choose this T to do something interesting, like reduce the uncertainty of particular coins.

Extracting “Work”

General problem: choose a transformation to produce some coins which are 1 with near-zero uncertainty (i.e. asymptotically zero uncertainty). We’ll call these deterministic coins “work”, and use w to denote the number of work-coins produced.

We’ll look at two subproblems to this problem. First, we’ll try to do it using just one of the two pools of coins (the hot one, though it doesn’t matter). This is the equivalent of “turning heat directly into work”, i.e. a type-2 perpetual motion machine; we’d expect it to be impossible. Second, we’ll tackle the problem using both pools, and figure out how much work we can extract. This is the equivalent of a heat engine.

Extracting Work From One Heat Bath

The first key thing to notice is that this is inherently an information compression problem. I have n random coins with heads-probability 0.8 (i.e. tails-probability 0.2). I want to make w of those coins near-certainly 1, while still making the transformation reversible - therefore the remaining n−w transformed coins must contain all of the information from the original n coins. In other words, I need to compress the info from the original n coins into n−w bits with near-certainty.

If we whip out our information theory, that compression is fairly straightforward. Our biased coins have entropy of −(0.2 log(0.2) + 0.8 log(0.8)) ≈ 0.72 bits per coin. So, with a reversible transformation we can compress all of the info into 0.72n of the coins, and the remaining 0.28n coins can all be nearly-deterministic.

(We’re fudging a bit here - we may need to add one or two extra coins from outside to make the compression algorithm handle unlikely cases without loss - but for current purposes that’s not a big deal. I’ll be fudging this sort of thing throughout the post.)
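The entropy arithmetic above can be checked directly. A small sketch (log base 2 throughout, since we are counting bits):

```python
from math import log2

def H(p):
    """Shannon entropy (in bits) of a coin that comes up one way with probability p."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

h_hot = H(0.2)  # same value as H(0.8), since H(p) = H(1 - p)
print(f"entropy per hot coin: {h_hot:.2f} bits")  # ~0.72
# So n coins carry ~0.72n bits of information: compressible into ~0.72n
# fair coins, leaving ~0.28n coins free to be set deterministically.
```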

However, we also need to conserve the number of heads. That’s a problem: fully compressed bits are 50/50 in general, so our 0.72n compressed bits include roughly 0.36n tails. We started with only 0.2n tails, so we have no way to balance the books - even if all of our 0.28n deterministic bits are heads, we still end up with too few heads and too many tails.

This generalizes: we won’t be able to compress our information without producing more tails. Hand-wavy proof: the initial distribution of coins is maxentropic subject to a constraint on the total number of heads. So, we can’t compress it without violating that constraint.

Let’s spell this out a bit more carefully.

A maxentropic variable contains as much information as possible - there is no other distribution over the same outcomes with higher entropy. In general, mutual information I(X,Y) is at most the entropy of one variable H(X) - i.e. the information in X about Y is at most all of the information in X, so the higher the entropy H(X) the more information X can potentially contain about any other variable Y.

In our case, we have an initial state X and a final state X′. We want to compress all the info in X into X′, I(X,X′)=H(X), so we must have H(X′)≥I(X,X′)=H(X). Initial state X is maxentropic: its possible outcomes are all values of n coin flips with a fixed number of heads, and X has the highest possible H(X) over those outcomes. Final state X′ we choose to be maxentropic - we need H(X′)≥H(X), so we make H(X′) as large as possible. However, note that the possible outcomes of X′ are a strict subset of the possible outcomes of X: possible outcomes of X′ are all values of n coin flips with a fixed number of heads AND the first w coins are all heads. So, we choose X′ to be maxentropic on this set of outcomes, but it’s a strictly smaller set of outcomes than for X, so the maximum achievable entropy H(X′) will be less than H(X). Thus: our condition H(X′)≥H(X) cannot be achieved.

We cannot extract deterministic bits (i.e. work) from a single pool of maxentropic-subject-to-constraint random bits (i.e. heat), while still respecting the constraint.
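The counting argument can also be checked numerically for a small instance (illustrative numbers, not from the post): pinning w coins to heads strictly shrinks the set of allowed outcomes, so the achievable entropy of the final state drops below that of the initial state.

```python
from math import comb, log2

n, heads, w = 100, 80, 10

# Max entropy = log of the number of allowed states.
H_initial = log2(comb(n, heads))            # n coins, fixed heads count
H_final = log2(comb(n - w, heads - w))      # same, with w coins pinned to heads

print(f"H(X)  <= {H_initial:.1f} bits")
print(f"H(X') <= {H_final:.1f} bits")
assert H_final < H_initial  # X' cannot hold all the information in X
```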

Even more generally: if we have a pool of random variables which are maxentropic subject to some constraint, we won’t be able to compress them without violating that constraint. If the constraint fixes a value of $\sum_k f_k(X_k)$, and we want to deterministically fix $f_1(X_1)$, then that reduces the number of possible values of $\sum_{k>1} f_k(X_k)$, and therefore reduces the amount of information which the remaining variables can contain. Since they didn’t have any “spare” entropy before (i.e. initial state is maxentropic subject to the constraint), we won’t be able to “fit” all the information into the remaining entropy.

That’s a very general analogue of the idea that we can’t extract work from a single-temperature heat bath. How about two heat baths?

Extracting Work From Two Heat Baths

Now we have 2n coins to play with: n “cold” coins with tails-probability 0.1, and n “hot” coins with tails-probability 0.2. The entropy is roughly 0.72 bits per “hot” coin, and 0.47 bits per “cold” coin. So, we’d need 1.19n coins with a roughly 50/50 mix of heads and tails to contain all the info. That’s still too many tails: full compression would require roughly 0.59n tails, and we only have about (0.1+0.2)n = 0.3n. But our initial distribution is no longer maxentropic given the overall constraint, so maybe it could work if we only partially compress the information?

Let’s set up the problem more explicitly, to maximize the work we can extract.

Our final distribution will contain w deterministic bits and 2n−w information-containing bits. The information-containing bits must contain a total of 0.3n tails. In order to contain as much information as possible, the final distribution of those 2n−w bits should be maxentropic subject to the constraint on the number of tails. So, they should be roughly (remember, large n) IID with heads-probability $\frac{2n-w-0.3n}{2n-w}$, with total entropy $-(2n-w)\left(\frac{2n-w-0.3n}{2n-w}\log\left(\frac{2n-w-0.3n}{2n-w}\right)+\frac{0.3n}{2n-w}\log\left(\frac{0.3n}{2n-w}\right)\right)$. We set that equal to the amount of entropy we need (i.e. 1.19n bits), and solve for w. In this case, I find w = 0.12n. Since we started with about 1.7n heads, we’re able to extract about 7% of them as “work” (or 15% of the “hot” heads).
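The value of w comes from solving that entropy-balance equation numerically. A sketch of the calculation (bisection, working in units of n; the constants 1.19 and 0.3 are the entropy and tail totals derived above):

```python
from math import log2

def H(p):
    """Shannon entropy (bits) of a biased coin."""
    return -(p * log2(p) + (1 - p) * log2(1 - p))

def residual(w):
    # Entropy the remaining 2 - w coins can hold (in units of n), given
    # 0.3 tails total, minus the 1.19 bits we need to store.
    coins = 2.0 - w
    return coins * H(0.3 / coins) - 1.19

# Bisection: residual is positive at w = 0 and negative for large w.
lo, hi = 0.0, 0.5
for _ in range(60):
    mid = (lo + hi) / 2
    if residual(mid) > 0:
        lo = mid
    else:
        hi = mid

print(f"w ≈ {lo:.2f} n")  # ≈ 0.12 n
```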

So we can indeed extract work from two heat baths at different temperatures.

Notably, the “efficiency” we calculated is not the usual theoretical optimal efficiency from thermodynamics. That “optimal efficiency” comes from a slightly different problem - rather than converting all our bits into as much work as possible, that problem considers the optimal conversion of random bits into work at the margin, assuming our heat baths don’t run out. In particular, that means we usually wouldn’t be using equal numbers of bits from the hot and cold pools.

This post is already plenty long, so I’ll save further discussion of thermodynamic efficiency and temperatures for another day.

Takeaway

The point of this exercise is to cast core ideas of statistical mechanics - especially the more thermo-esque ideas - in terms which are easier to generalize beyond physics. To that end, the key ideas are:

• Thermo-like laws apply when we can't gain information about a system (e.g. because we're designing a machine to operate in an environment which we can't observe directly at design time), can't lose information about a system at a low level (either due to physical reversibility constraints or because we don't want to throw out info), and the system has some other constraints (like energy conservation).
• We can operate on the system in ways which move uncertainty around, without decreasing it.
• If we want to move uncertainty around in a way which makes certain variables nearly deterministic (i.e. "extract work"), that's a compression problem.
• We can't compress a maxentropic distribution, so we can't extract work from a single maxentropic-subject-to-constraint pool of variables without violating the constraint.
• We can extract work from two pools of variables which are initially maxentropic under different constraints, while still respecting the full-system constraint.

Discuss

When Money Is Abundant, Knowledge Is The Real Wealth

6 ноября, 2020 - 01:11
Published on November 3, 2020 5:34 PM GMT

First Puzzle Piece

By and large, the President of the United States can order people to do things, and they will do those things. POTUS is often considered the most powerful person in the world. And yet, the president cannot order a virus to stop replicating. The president cannot order GDP to increase. The president cannot order world peace.

Are there orders the president could give which would result in world peace, or increasing GDP, or the end of a virus? Probably, yes. Any of these could likely even be done with relatively little opportunity cost. Yet no president in history has known which orders will efficiently achieve these objectives. There are probably some people in the world who know which orders would efficiently increase GDP, but the president cannot distinguish them from the millions of people who claim to know (and may even believe it themselves) but are wrong.

Last I heard, Jeff Bezos was the official richest man in the world. He can buy basically anything money can buy. But he can’t buy a cure for cancer. Is there some way he could spend a billion dollars to cure cancer in five years? Probably, yes. But Jeff Bezos does not know how to do that. Even if someone somewhere in the world does know how to turn a billion dollars into a cancer cure in five years, Jeff Bezos cannot distinguish that person from the thousands of other people who claim to know (and may even believe it themselves) but are wrong.

When non-experts cannot distinguish true expertise from noise, money cannot buy expertise. Knowledge cannot be outsourced; we must understand things ourselves.

Second Puzzle Piece

The Haber process combines one molecule of nitrogen with three molecules of hydrogen to produce two molecules of ammonia - useful for fertilizer, explosives, etc. If I feed a few grams of hydrogen and several tons of nitrogen into the Haber process, I’ll get out a few grams of ammonia. No matter how much more nitrogen I pile in - a thousand tons, a million tons, whatever - I will not get more than a few grams of ammonia. If the reaction is limited by the amount of hydrogen, then throwing more nitrogen at it will not make much difference.

In the language of constraints and slackness: ammonia production is constrained by hydrogen, and by nitrogen. When nitrogen is abundant, the nitrogen constraint is slack; adding more nitrogen won’t make much difference. Conversely, since hydrogen is scarce, the hydrogen constraint is taut; adding more hydrogen will make a difference. Hydrogen is the bottleneck.

Likewise in economic production: if a medieval book-maker requires 12 sheep skins and 30 days’ work from a transcriptionist to produce a book, and the book-maker has thousands of transcriptionist-hours available but only 12 sheep, then he can only make one book. Throwing more transcriptionists at the book-maker will not increase the number of books produced; sheep are the bottleneck.

When some inputs become more or less abundant, bottlenecks change. If our book-maker suddenly acquires tens of thousands of sheep skins, then transcriptionists may become the bottleneck to book-production. In general, when one resource becomes abundant, other resources become bottlenecks.
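In code, a bottleneck is just a min() over the required inputs. A toy sketch using the book-maker's numbers (the function name is mine, purely illustrative):

```python
def books_producible(sheep_skins, transcriptionist_days):
    # Each book needs 12 sheep skins and 30 days of transcription;
    # output is limited by whichever input runs out first.
    return min(sheep_skins // 12, transcriptionist_days // 30)

print(books_producible(12, 30_000))     # 1: thousands of days available, but skins bind
print(books_producible(12_000, 3_000))  # 100: skins now abundant, transcription binds
```

Piling more of the slack input into either call leaves the output unchanged; only the taut input matters at the margin.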

Putting The Pieces Together

If I don’t know how to efficiently turn power into a GDP increase, or money into a cure for cancer, then throwing more power/money at the problem will not make much difference.

King Louis XV of France was one of the richest and most powerful people in the world. He died of smallpox in 1774, the same year that a dairy farmer successfully immunized his wife and children with cowpox. All that money and power could not buy the knowledge of a dairy farmer - the knowledge that cowpox could safely immunize against smallpox. There were thousands of humoral experts, faith healers, eastern spiritualists, and so forth who would claim to offer some protection against smallpox, and King Louis XV could not distinguish the real solution.

As one resource becomes abundant, other resources become bottlenecks. When wealth and power become abundant, anything wealth and power cannot buy become bottlenecks - including knowledge and expertise.

After a certain point, wealth and power cease to be the taut constraints on one’s action space. They just don’t matter that much. Sure, giant yachts are great for social status, and our lizard-brains love politics. The modern economy is happy to provide outlets for disposing of large amounts of wealth and power. But personally, I don’t care that much about giant yachts. I want a cure for aging. I want weekend trips to the moon. I want flying cars and an indestructible body and tiny genetically-engineered dragons. Money and power can’t efficiently buy that; the bottleneck is knowledge.

Based on my own experience and the experience of others I know, I think knowledge starts to become taut rather quickly - I’d say at an annual income level in the low hundred thousands. With that much income, if I knew exactly the experiments or studies to perform to discover a cure for cancer, I could probably make them happen. (Getting regulatory approval is another matter, but I think that would largely handle itself if people knew the solution - there’s a large profit incentive, after all.) Beyond that level, more money mostly just means more ability to spray and pray for solutions - which is not a promising strategy in our high-dimensional world.

So, two years ago I quit my monetarily-lucrative job as a data scientist and have mostly focused on acquiring knowledge since then. I can worry about money if and when I know what to do with it.

A mindset I recommend trying on from time to time, especially for people with $100k+ income: think of money as an abundant resource. Everything money can buy is “cheap”, because money is "cheap". Then the things which are “expensive” are the things which money alone cannot buy - including knowledge and understanding of the world. Life lesson from Disney!Rumplestiltskin: there are things which money cannot buy, therefore it is important to acquire such things and use them for barter and investment. In particular, it’s worth looking for opportunities to acquire knowledge and expertise which can be leveraged for more knowledge and expertise.

Investments In Knowledge

Past a certain point, money and power are no longer the limiting factors for me to get what I want. Knowledge becomes the bottleneck instead. At that point, money and power are no longer particularly relevant measures of my capabilities. Pursuing more “wealth” in the usual sense of the word is no longer a very useful instrumental goal. At that point, the type of “wealth” I really need to pursue is knowledge.

If I want to build long-term knowledge-wealth, then the analogy between money-wealth and knowledge-wealth suggests an interesting question: what does a knowledge “investment” look like? What is a capital asset of knowledge, an investment which pays dividends in more knowledge?

Mapping out the internal workings of a system takes a lot of up-front work. It’s much easier to try random molecules and see if they cure cancer, than to map out all the internal signals and cells and interactions which cause cancer. But the latter is a capital investment: once we’ve nailed down one gear in the model, one signal or one mutation or one cell-state, that informs all of our future tests and model-building. If we find that Y mediates the effect of X on Z, then our future studies of the Y-Z interaction can safely ignore X. On the other hand, if we test a random molecule and find that it doesn’t cure cancer, then that tells us little-to-nothing; that knowledge does not yield dividends.

Of course, gears-level models aren’t the only form of capital investment in knowledge. Most tools of applied math and the sciences consist of general models which we can learn once and then apply in many different contexts. They are general-purpose gears which we can recognize in many systems.

Once I understand the internal details of how e.g. capacitors work, I can apply that knowledge to understand not only electronic circuits, but also charged biological membranes. When I understand the math of microeconomics, I can apply it to optimization problems in AI. When I understand shocks and rarefactions in nonlinear PDEs, I can see them in action at the beach or in traffic. And the “core” topics - calculus, linear algebra, differential equations, big-O analysis, Bayesian probability, optimization, dynamical systems, etc - can be applied all over. General-purpose models are a capital investment in knowledge.

I hope that someday my own research will be on that list. That’s the kind of wealth I’m investing in now.

Discuss

Babble Challenge: 50 thoughts on stable, cooperative institutions

6 ноября, 2020 - 01:06
Published on November 5, 2020 6:38 AM GMT

In a recent LessWrong question Anna Salamon asks “Where did stable, cooperative institutions come from (like bridges that stay up; the rule of law; or Google)?” She also worries that “the magic that used to enable such cooperative institutions is fading”.

Anna’s post is strong in babble. It does provide gears-level mechanisms and concrete hypotheses. But it also gestures at intuitions, felt senses, and insights-waiting-to-be-had.

This week’s challenge is simple: Have 50 thoughts about Anna’s post and questions.

Do you have a guess at the magic enabling human societies to build roads and postal services? Do you think institutions are actually getting stronger over time? What are 10 examples of how institutions changed in the last 100 years? Or 10 predictions about how they'll change in the future? Etc.

Your thoughts can be hypotheses, questions, anecdotes, confusions, disagreements, feelings... and so forth.

50 thoughts, no need for them to be longer than a sentence.

You have 1 hour.

Looking back

Here are the current rankings. (You gain a star for completing a challenge, and lose one for missing a week. I’m not including myself since it feels weird to be both gamemaster and participant.)

Great job everyone!

★★★★★ gjm

★★★★ Yonge

★★★ Tetraspace Grouping, Slider

★★ Mark Xu, Bucky

★ Turntrout, Harmless, Tao Lin, Daniel Kokotajlo, chasmani, supposedlyfun

Moving Forwards

This is week 6 of my 7-week babble sprint.

It is said that sufficiently advanced technology is indistinguishable from magic.

I think something similar is true for building skills.

There are some skills of which you can see the contours. You can squint and see yourself wielding them, with practice. And there are some things which seem like magic. As though the kinds of humans who wield them are fundamentally different from the kind of human you are. There's no set of steps that could get you to where they are.

Intellectual creativity often falls in this bucket.

For whatever reason, culture loves to create the vision of a genius. The media writes about “the 14-year old who climbed Mount Everest and wrote software for America’s largest bank” when in fact they made an impressive-for-their-age contribution to an open source package and camped out at a lower base station reachable by walking.

Maybe because creativity is so illegible. It seems that there is nothing. And then there’s an idea. George Orwell said of writing that it was like being “driven on by some demon whom one can neither resist nor understand.”

It’s especially illegible from the outside.

It will often happen to me that I read a LessWrong post. Full of brilliant, interesting, novel thoughts; and with a bustling comment section. And faced with this Tower of Babble I take a peek at what my own brain generates, 2 seconds after being hurled into the spotlight —

nothing

— and I despair.

I feel like I don’t have ideas. Like I am a person who does not have ideas.

But I miss that Towers are built one stone at a time. Once, where that great obelisk rests, there was only wind.

I’ve recently been meditating on this, trying to feel this truth in my bones:

I am a machine. One that turns time and food and air into creativity. And machines are in the domain of Science. They are understandable, extendable, lawful.

Now that I’ve done 5 weeks of the Babble Challenge, this is becoming clearer. I can choose to have ideas. I’m getting a better sense of the gears that turn to produce my creativity, I see motion where before there was only fog and magic.

If you’ve also looked at LessWrong threads and felt they were the playing fields of wizards, I also want you to have this experience. I want you to feel like a machine who, with ambition and deliberate practice, can learn to turn time into ideas.

Rules
• Focus on the content of Anna's essay

Think about the ideas and the questions, not the spelling or word choice. Think about institutions and cooperation, not about paragraph length and sentence structure. Try to engage with the substance, rather than the symbol.

• 50 answers or nothing. Shoot for 1 hour.

Any answer must contain 50 ideas to count. That’s the babble challenge.

However, the 1 hour limit is a stretch goal. It’s fine if it takes longer to get to 50.

• Post your answers inside of spoiler tags. (How do I do that?)

This is really important. Sharing babble in public is a scary experience. I don’t want people to leave this having back-chained the experience “If I am creative, people will look down on me”. So be generous with those upvotes.

If you comment on someone else’s post, focus on making exciting, novel ideas work — instead of tearing apart worse ideas.

• Not all your ideas have to work

I've often found that 1 great idea can hide among 10 bad ones. You just need to push through the worse ones. Keep talking. To adapt Wayne Gretzky's great quote: "You miss 100% of the ideas you never generate."

• My main tip: when you’re stuck, say something stupid.

If you spend 5 min agonising over not having anything to say, you’re doing it wrong. You’re being too critical. Just lower your standards and say something, anything. Soon enough you’ll be back on track.

This is really, really important. It’s the only way I’m able to complete these exercises.

---

Now, go forth and Babble!

50 thoughts on the question about stable, cooperative institutions!

Discuss

What considerations influence whether I have more influence over short or long timelines?

6 ноября, 2020 - 01:03
Published on November 5, 2020 7:56 PM GMT

As my timelines have been shortening, I've been rethinking my priorities. As have many of my colleagues. It occurs to us that there are probably general considerations that should cause us to weight towards short-timelines plans, or long-timelines plans. (Besides, of course, the probability of short and long timelines) For example, if timelines are short then maybe AI safety is more neglected, and therefore higher EV for me to work on, so maybe I should be systematically more inclined to act as if timelines are short.

We are at this point very unsure what the most important considerations are, and how they balance. So I'm polling the hive mind!

Discuss

Covid 11/5: Don’t Mention the War

6 ноября, 2020 - 01:00
Published on November 5, 2020 2:20 PM GMT

Mutation has always been an elephant in the Covid-19 room, and I haven’t paid as much attention to it as I should have. It is increasingly clear that Covid-19 has mutated, and the new strain is substantially more infectious than the old one and is now virtually the only strain. One write-up from the Financial Times: Scientists warn a more infectious coronavirus variant spreading across Europe. The new strain doesn’t seem different from the old one once you are infected, in terms of your prognosis or chance of death, and immunity to either strain gives immunity to both.

I don’t have a handle on what ‘ten times more infectious’ cashes out into in terms of magnitude, what the new baseline R0 is, or how much we would need to tighten our precautions to compensate. If you do know, please let us know. My assumption is that ‘ten times more infectious’ probably cashes out to not that big a difference in practice, and likely a misleading way to categorize what’s happening, but I have no idea.

What I do know is that it makes sense that if your previous goal was to do exactly enough to keep things under control, or your people were using control systems to do the same thing, and then the virus gets more infectious, you are going to be in a lot of trouble until you can adjust. Especially if winter arrives at the same time. And adjusting sufficiently won’t be easy.

This also throws a wrench into any calculations regarding herd immunity. Previously I was confident that 50% immunity, if accumulated through uncontrolled spread, would be more than sufficient. If the baseline R0 is suddenly much higher, that could no longer be the case. How much not the case depends on the magnitude of the change. Again, I do not have a good read on that.
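To make the worry concrete, the standard herd-immunity threshold is 1 − 1/R0, so the required immunity fraction is quite sensitive to R0. A quick sketch (these R0 values are illustrative, not estimates for the new strain):

```python
def herd_immunity_threshold(r0):
    # Fraction immune needed so each case infects fewer than one other person:
    # R0 * (1 - immune) < 1  =>  immune > 1 - 1/R0.
    return 1 - 1 / r0

for r0 in [2.0, 2.5, 3.0, 4.0]:
    print(f"R0 = {r0}: need {herd_immunity_threshold(r0):.0%} immune")
```

An R0 of 2 makes 50% immunity sufficient; an R0 of 4 pushes the threshold to 75%.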

I also suspect, both due to dropping death rates (which likely have multiple compounding causes) and also purely on first principles, that the new strain is effectively less deadly. Initial viral load when infected likely matters for how often you die and how bad your prognosis is in general. If the new virus is more infectious due to things like changes in its spike proteins, but replicates at the same rate and is equally vulnerable to an immune system response once it starts, it seems likely that it results in lower average initial viral loads. That in turn would make it effectively less deadly, even if a laboratory test would indicate no difference. I’m curious what other people’s takes on this might be.

Aside from Covid-19, nothing else important happened this week. Let’s run the numbers.

They’re not good.

The Numbers

Deaths

| Date | WEST | MIDWEST | SOUTH | NORTHEAST |
| --- | --- | --- | --- | --- |
| Sep 3-Sep 9 | 1141 | 771 | 2717 | 329 |
| Sep 10-Sep 16 | 1159 | 954 | 3199 | 373 |
| Sep 17-Sep 23 | 1016 | 893 | 2695 | 399 |
| Sep 24-Sep 30 | 934 | 990 | 2619 | 360 |
| Oct 1-Oct 7 | 797 | 1103 | 2308 | 400 |
| Oct 8-Oct 14 | 782 | 1217 | 2366 | 436 |
| Oct 15-Oct 21 | 804 | 1591 | 2370 | 523 |
| Oct 22-Oct 28 | 895 | 1701 | 2208 | 612 |
| Oct 29-Nov 4 | 956 | 1977 | 2309 | 613 |

Things continue as we might have expected, with another steady rise in the death rate especially in the Midwest. There are some details worth pointing out that don’t make it onto the graph or chart.

In particular, the rise in deaths was entirely in the last three days and especially in the final day, when deaths jumped to 1,566 at the same time positive tests first exceeded 100k. That’s 300+ more deaths than any day in the past few months. This raises my fear of a rapid acceleration from here, but the most likely explanation is that everyone was busy or distracted, intentionally or unintentionally, and shifted their reporting forward in time a bit, underreporting earlier and catching up now. It is clear that test and death counts reported on a given day are somewhat correlated due to variance in how much data the system can process on any given day.

Also, some unusually large irregularities out West mostly cancelled out, as Arizona had a dramatic rise in deaths while California took a plunge. I don’t expect either effect to be sustained.

Test Counts

| Date | USA tests | Positive % | NY tests | Positive % | Cumulative Positives |
| --- | --- | --- | --- | --- | --- |
| Aug 27-Sep 2 | 5,041,634 | 5.5% | 611,721 | 0.8% | 1.85% |
| Sep 3-Sep 9 | 4,849,134 | 5.3% | 552,624 | 0.9% | 1.93% |
| Sep 10-Sep 16 | 4,631,408 | 5.8% | 559,463 | 0.9% | 2.01% |
| Sep 17-Sep 23 | 5,739,853 | 5.2% | 610,802 | 0.9% | 2.10% |
| Sep 24-Sep 30 | 5,839,627 | 5.1% | 618,378 | 1.1% | 2.19% |
| Oct 1-Oct 7 | 6,021,807 | 5.2% | 763,935 | 1.3% | 2.29% |
| Oct 8-Oct 14 | 6,327,972 | 5.8% | 850,223 | 1.1% | 2.40% |
| Oct 15-Oct 21 | 6,443,371 | 6.5% | 865,890 | 1.2% | 2.52% |
| Oct 22-Oct 28 | 6,936,300 | 7.5% | 890,185 | 1.4% | 2.68% |
| Oct 29-Nov 4 | 7,244,347 | 8.6% | 973,777 | 1.6% | 2.87% |

Positive Tests

| Date | WEST | MIDWEST | SOUTH | NORTHEAST |
| --- | --- | --- | --- | --- |
| Sep 3-Sep 9 | 47273 | 72439 | 106408 | 21926 |
| Sep 10-Sep 16 | 45050 | 75264 | 115812 | 23755 |
| Sep 17-Sep 23 | 54025 | 85381 | 127732 | 23342 |
| Sep 24-Sep 30 | 55496 | 92932 | 106300 | 27214 |
| Oct 1-Oct 7 | 56742 | 97243 | 110170 | 34042 |
| Oct 8-Oct 14 | 68284 | 125744 | 117995 | 38918 |
| Oct 15-Oct 21 | 75571 | 149851 | 133238 | 43325 |
| Oct 22-Oct 28 | 94983 | 181881 | 158123 | 57420 |
| Oct 29-Nov 4 | 112684 | 252917 | 167098 | 70166 |

Positive test percentages are rising and were especially high yesterday, so if anything the number of cases is rising faster than this. The rise looks more dramatic on the graph because it is a percentage rise off a high baseline and that’s how exponential growth works. This is the baseline scenario of cases continuing to rise at the same pace until deaths rise enough to cause behavior adjustments, and/or we do major lockdowns that I don’t expect.

Positive Test Percentages

| Percentages | Northeast | Midwest | South | West |
| --- | --- | --- | --- | --- |
| 9/3 to 9/9 | 1.97% | 6.02% | 8.48% | 4.13% |
| 9/10 to 9/16 | 2.41% | 5.99% | 11.35% | 4.49% |
| 9/17 to 9/23 | 2.20% | 5.96% | 7.13% | 4.11% |
| 9/24 to 9/30 | 2.60% | 6.17% | 6.18% | 4.27% |
| 10/1 to 10/7 | 2.61% | 6.05% | 6.74% | 4.23% |
| 10/8 to 10/14 | 2.57% | 8.14% | 7.09% | 4.75% |
| 10/15 to 10/22 | 2.95% | 8.70% | 7.85% | 5.36% |
| 10/22 to 10/28 | 3.68% | 9.87% | 8.58% | 6.46% |
| 10/29 to 11/4 | 4.28% | 12.79% | 8.86% | 7.04% |

There was a huge and quite scary jump in the Midwest. Other regions continue to get steadily worse, and on slowly improving test counts. We see no signs that this wave is about to peak, and testing is not expanding fast enough to keep pace. My model is that even when testing looks fully adequate the majority of cases are never identified. When positive test rates are double digit, I believe the vast majority of cases are being missed.

Nothing in the sections below is as important as the numbers. We have uncontrolled spread almost everywhere, growing by double digit percentages on a weekly basis, and we don’t yet see signs people are adjusting behavior or any sign of the political will to impose measures sufficiently effective to work.

I hope that we are only a few weeks from things turning around due to behavior adjustments, despite not yet seeing any sign of those adjustments. There are reasons for optimism and hope, but for now that’s all that is. Hope.

Europe

(That spike in Spain is them adding in past deaths, so ignore it.)

Lockdowns are in effect in many places across the continent. It takes some time to see the benefits in infections, and several weeks more to see the death rate decline. When you first lock down, there is a brief period where in-household infections actually accelerate even as out-of-household ones decline. Next week will get us past that stage, so we’d better start seeing these infection curves turn sharply downward. If they don’t, then either lockdowns will need to tighten further or defeat will need to be conceded. There aren’t any other options.

Sweden’s death rate is going down again. I don’t have a theory or a story to tell, but it’s curious.

Well, Belgium

Belgium seems to have it worst of any country in Europe. It’s bad enough I pulled them from the graphs because adding them changed the Y-axis and made the graphs harder to read – you can play around with the data site here. According to this interview, the medical system is expected to collapse within a week, although things could be sustained longer if patients can be diverted to Germany or other nations. The numbers here are scary as hell. If the source is correct, 25% of doctors are symptomatic right now. The number infected is that much higher, and this is after having a reasonably bad but quick first wave. Doctors known to be positive are working anyway, because everyone is working as often as they can and it’s still not enough, so the alternative is worse.

When I said last week that the worst was likely behind us, it’s because I think back to when we and other nations faced this situation back in March and April, plus supply chain disruptions and potential full economic collapse. That’s the worst. What’s happening in Belgium now is quite bad, but not as bad as what happened to New York City or to Italy in March and April.

For now, Belgium is going into a six week lockdown as of October 30. Schools are being closed for two weeks, in contrast to England and France where they are being kept open, and I hope they then extend this until December. Also in contrast to Germany’s lockdown, which seems far less severe, although that makes political sense since their outbreak is much less severe as well for now.

I continue to believe that, given the decision to lockdown, the lockdown needs to be as severe and total – and as soon – as possible. Better to do it quickly than to drag things out.

What I have not heard is what the plan is for after the new lockdowns. We know that we can reduce levels of infection via lockdown, but that buys only a little time if the hockey stick curve returns shortly thereafter. So what is the long term plan? What does it take?

This response from the above link is telling: “Calculations suggest Germany needs to reduce contacts between people by roughly 75% from the current level,” Lauterbach says. “That is incredibly hard if you want to keep schools and most businesses open.” But bars and restaurants account for many contacts while providing only about 1% of Germany’s gross domestic product, making them “kind of the perfect target for pandemic measures.”

That would make perfect sense… if the current level involved life being as it was before Covid-19.

So it’s essentially an admission that even Germans didn’t take the most basic and efficient measures to socially distance or mask up.

All or nothing. Either do what it takes to drive to zero cases, do what it takes to maintain R0<1 indefinitely, or accept defeat. There are no other options. And if you’re going to do it, do it early, and do it quickly. The link above broadly agrees, and wants to take cases all the way to zero then impose border controls. If politically viable and the vaccine isn’t close, that seems like the least bad option to me. But you have to mean it, or it won’t work.

I continue to be very confused that major European countries are keeping schools open while otherwise locking down, but don’t have anything useful to add on that subject.

Then again, do you know how you know when something is very safe in context? When even one death is reported as news.

If You Can Make It Here, You Still Have to Quarantine Before You Make It Anywhere In Particular

New York had what seemed like a reasonable policy. It had very low case counts, while many other states had high ones. If you were travelling from a state with lots of Covid-19 cases, you needed to quarantine or get tested, even if you were returning home.

Last week, Governor Cuomo had to make special exceptions for New Jersey and Connecticut, because cases have gone up so much everywhere that those states would have qualified for quarantine procedures, along with almost every other state. New York itself would have qualified soon enough, if it hadn’t already. The old policy wasn’t indexed to anything, so it clearly no longer made sense.

The new policy is that all visitors except those from adjacent states must take a test before arriving, get a negative result, then come to the state, then within three days get a second negative result, and must provide proof of both tests. If your second test is positive or you fail to get it, you must quarantine for two weeks. Graciously, he will permit us to take day trips to New Jersey, Pennsylvania and Connecticut without getting tested, but everyone else needs to get tested upon return even if they are gone less than a day. Cuomo has also asked everyone to cancel Thanksgiving.

Does this new policy make sense? Is it cost effective?

When someone travels, they go from interacting with one set of people to interacting with a different set of people. Even if both locations typically have similar infection levels, this mixing of different groups is still going to increase the chance the virus will spread. Combine that with what people typically do when they travel, which is to go meet with other people, and that becomes doubly true. Add in that New York, while not doing great, is doing a lot better than most of the rest of the country at this point, and it seems like a strong argument for taking some precautions. But do these particular precautions make sense and are they efficient?

This is obviously not the first best solution. Whether it makes sense depends on what marginal effects dominate, and on the counterfactual.

What is limiting our ability to run tests? If we have an essentially fixed supply of tests that is slowly increasing, then investing in more tests would be good (and New York is good about this at least relative to other places) but we want to allocate what tests we have in a maximally efficient manner. The first test for each person probably makes sense, but giving two tests to everyone who crosses state lines does not seem efficient.

Whereas if we have spare capacity that we could use if we had the willingness to pay for it, then essentially all additional tests are good ideas on the margin. That’s not quite true but it’s close. Anyone who is out there in the world can benefit from the information and the cost is super low. So in this scenario, if you can’t find other places to get those tests used, then by all means use two per person. Same goes if demand for testing creates its own supply over the short to medium term, which on the margin is plausible.

This certainly makes more sense than universal quarantines from essentially every state, and seems likely to both drive testing capacity and be net helpful. What about convenience cost? I am certainly less excited about traveling outside New York if it means getting tested multiple times. It’s substantially annoying. But that’s plausibly also good, because such travel is risky, so taxing it will help balance things. So long as the tax isn’t wasteful, it’s good.

My rough conclusion after a few days thinking it over is that this is as good a policy as we could have plausibly gotten, and it would make sense to emulate it in other states that are doing much better than the country at large.

Limiting Supply Is a Double Edged Sword

It is tempting to mock statements like this one: “Not sure I understand how reducing how late restaurants, gyms & casinos can be open will help fight #Covid19. The virus doesn’t have a curfew.”

The baseline explanation is obvious. If you close them earlier, fewer people will use restaurants, gyms and casinos, so the risks they contribute will decline. Every time you close something more often on the margin, things improve. Yes, you would prefer to shut them down entirely, but that may not be practical.

But it’s not obvious that this is right. There are plausible circumstances under which this is wrong.

When those running this town decided to shut down the playground over ‘mask non-compliance’, what happened?

(One story is that you caught the no-good people being no good and you successfully punished them, and their suffering will be a Sacrifice to the Gods who will therefore perhaps look upon us with mercy, with no expectation that this makes anyone directly physically safer. Or that this will increase the power of those making this decision, or stroke their egos, or other such explanations, but we’re going to set all that aside for now.)

A plausible story is that they stopped a dangerous activity, and you trade off the safety against the other costs. Perhaps activity moved to backyards.

Another story is that they moved that dangerous activity to other places where it is still permitted, such as the playgrounds in other towns, which are now more crowded and hence got more risky, and also creates a vector for infection to spread from town to town. Or perhaps the activity moved to more playdates, indoors, with an order of magnitude more risk and probably no masks. Or perhaps everyone involved went generally stir-crazy and lost enough sanity points that they started doing various ill-advised things more often.

It all depends on the substitution effects.

The same goes for limiting hours on a restaurant. If you cut hours in half, and people who would have gone at night instead cook for themselves or get delivery, that’s the trade-off you wanted. You’ve reduced risk.

Also, the virus kind of does have somewhat of a (reverse) curfew for restaurants, in the sense that people are more willing to dine outdoors during the day when the sun is out, and less willing at night in a Massachusetts winter. It’s plausible that time-shifting people is valuable here, or that cutting off the riskier part of their activity and not the safer part could make sense. Or that people’s gatherings at night tend to be more social and risky for other reasons.

The nightmare scenario is that you close all the bars at 10pm instead of 2am, and instead of half the people coming to the bar in the evening and half at night, many of those who would have gone at night instead pack into the place in the evening, and risk goes up rather than down.

The gym very plausibly has this problem, and limiting hours there seems rather nuts. I am guessing there will be a high degree of temporal substitution. If you partially shut it down, then everyone has to come in at the same time and the place gets crowded. You actively want them open 24/7 so people can socially distance. So we should either open all the gyms all of the time, or close all of the gyms all of the time. We often foolishly ‘compromise’ on this because we lack a physical model, and make everything worse.

All or nothing. Half measures only backfire.

In Other News

Huge if true! Researchers claim to have a machine learning system that can diagnose Covid-19 over the phone by analyzing recordings of a forced cough (pdf). They claim sensitivity of 98.5% and specificity of 94.2%, and for asymptomatic cases a sensitivity of 100% (?!?) and specificity of 83.2%. I’m curious to what extent errors in the test are correlated from day to day. This can all be offered at unlimited scale for essentially zero additional cost. So, of course, it will presumably be illegal indefinitely because it’s not as accurate as a PCR test and/or hasn’t gone through the proper approval process, and no one will ever use it. Then again, if one were to somehow download or recreate such a program and run it, who would know?
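To see what those asymptomatic-case numbers would mean in practice, one can run them through Bayes’ rule. The 1% prevalence below is an assumed, illustrative value, not a figure from the paper:

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Positive and negative predictive value via Bayes' rule."""
    p_positive = (sensitivity * prevalence
                  + (1 - specificity) * (1 - prevalence))
    ppv = sensitivity * prevalence / p_positive
    npv = specificity * (1 - prevalence) / (1 - p_positive)
    return ppv, npv

# Asymptomatic figures claimed in the paper: 100% sensitivity, 83.2%
# specificity. The 1% prevalence is an assumed, illustrative value.
ppv, npv = ppv_npv(1.0, 0.832, 0.01)
```

With those inputs the positive predictive value comes out under 6%, so most positives would be false alarms, but the claimed perfect sensitivity makes a negative close to definitive, which is exactly the property you want in a free, repeatable screen.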

Attempts at alternative diagnostic methods abound. We also learn that Finland’s Covid sniffer dog trial ‘extremely positive’ except that researchers do not seem to have confirmed, at least in the body of the post, that positive results from the dogs correspond to having Covid-19. All I saw from reading is that the dogs give positive results at about the right frequency. Which is nice, sure, but to be useful it really does have to find the actual positives, no matter how positive the feedback is from travellers.

The WSJ reports that Germany is planning to start vaccinations this year, ideally within hours of the vaccine being approved. As you would expect, I very much approve of this, I hope it works out, and I consider it a huge scandal that we won’t be able to follow them within 24 hours. If the Germans approve a vaccine, give me the damn vaccine.

Especially if it’s for Covid-19.

I still don’t have a good story for exactly what happened in Sweden, and as far as I can tell no one else does either. The graphs are profoundly weird and don’t make any sense. The best I can do is a multi-stage story of adaptive behavior that I would never have proposed unless I was fitting to the graph. Not a great option.

Japan using experiments to fill baseball stadium. Japan has dealt with the virus for the moment, so they get to have half-capacity baseball stadiums. Now they’re looking into whether they can do better than that, by running experiments and gathering data on airflow. Here in America, the NBA used sports as a justification to innovate superior testing and isolation procedures, which we hope can then be used to help the rest of us. In Japan, baseball gets to do the same. I wonder what other experiments we could run using this trick to end-run around civilization’s ban on experiments and data gathering.

Antivirals work best if given early, before you know for sure that you need them. The vulnerable thus face the Catch-22 of not being able to get antiviral meds before they are in the hospital/ICU, because they’re not sick enough to justify it, and then when they end up there because they didn’t get the medication early, are told it is too late because the medication would no longer work. Here is one such story. I have not verified this.

White House Advisor Scott Atlas has said some rather brazenly false things about Covid-19 and the course of the pandemic. It is not my place to say whether he knows they are false, or whether he cares whether they are false, or even thinks the question is meaningful. Either way, given his position, his statements risk doing great harm. I understand the desire to push back.

But destroying more of what remains of academic freedom to do so is not the way. Academic freedom questions arise on campus over COVID-19 strategy conflicts paints a chilling picture of how people on campus are currently thinking about such questions. Consider this quote:

[Spiegel] said, “There are limitations to academic freedom. What you express has to be honest, data-based and reflect what is known in the field. If you are going to claim academic freedom, you had better be academic, as well as free.”

If, in order to claim academic freedom, you have to say things that reflect what is known in the field, then what is the point of academic freedom? Being free to agree with what everyone already thinks is better than not being able to say anything at all, but not by that much. Only being able to say ‘data-based’ things would in many cases not be much better, especially given who decides what is data-based. Censorship in general seems to rapidly be on the rise, in academia, on social media and elsewhere. If left unchecked, this will not only hit the outgroup, and it will not end well.

This seems like a wise framing of how off the rails our (lack of) thinking has been this whole time regarding what is appropriate and ethical:

Reminder that “Much of our understanding of other coronaviruses comes from challenge trials done in the UK in the 1960s” back when we still were capable of doing such things.

One risk with a vaccine is that if the vaccine is only somewhat effective, and it causes mask usage to decline, it could end up not only not helping but making things worse. As with many such ‘papers’ this is a basic mathematical truth rather than some new finding, but important nonetheless, especially if non-vaccinated people also stop taking precautions. I expect to be able to deal with these types of problems, but there’s definitely risk there.

New study estimates 30,000 confirmed cases and 700 deaths from Trump rallies. Which would (among other things) be a CFR of over 2%. Also note that this effect would persist, as those people in turn infect others, so those 700 deaths should recur until we get things under control. All effects that make things worse are deadlier than they look until control systems set in. Effect size seems plausible to me. I would have liked to see more attention on the indoor vs. outdoor distinction, as there were three indoor rallies in their sample.

This study casts doubt on the effectiveness of contact tracing. More precisely, it casts doubt on the effectiveness of such tracing in San Francisco. It was hopelessly slow, waiting 6 days after symptoms to contact potential other cases, and failed to identify many new positives. I think this says a lot about our state capacity and ability to coordinate around such tasks, and very little about contact tracing beyond our inability to implement it usefully. As long as we are more concerned with privacy and voluntary cooperation than pandemic containment, and don’t actually want to do this for real, it won’t work. A shame.

I neglected last week to talk about how risky it would be to go Trick or Treating and whether doing so made sense and how to do so responsibly, both as a parent or as a giver of candy. I didn’t think about it. The answer seems obvious, which is that unless you’re doing it in a large group in an apartment building, all you’re doing is walking around outside and being given individually wrapped candies, so it’s obviously fine to leave out a bowl of candy and take from the bowl. Divia confirms this was how things worked in her area, my wife confirms the same in Warwick. I’m not worried about trick or treating at all. If anything, it’s a relatively low-risk way to stay active and avoid going stir crazy, and I’m sad I wasn’t able to join.

What did worry me were reports of adult indoor parties on Halloween, which fell on a Saturday night at a highly stressful time. I saw a number of them, but it’s hard to translate that into a good sense of how common they were. This is the one night many young adults and teenagers consider themselves to have a pass to act irresponsibly and there’s no reason to think this wouldn’t apply to pandemic precautions as well. Even the Surgeon General’s kid doesn’t get it. One could hope that the holiday would at least encourage mask compliance.

Paul Romer has yet another column on how crazy it is we don’t do massively more testing.

All I Want For Christmas Is (Still) a Covid Vaccine

Part of the story of why it’s going to be a while is that people like Speaker Pelosi are actively getting in the way via refusing to accept an approval in the UK as good enough for us. It would be very difficult to overstate my outrage that this is happening, and I really hope this is pure hypocrisy that will be reversed by coincidence on January 20.

I’m not sure how much of a reset is being called for here. My understanding was that October was always about the best-case scenario but required a lot of things to go right slash it would have been a political move by the administration, and something more like December was more of the hopeful case where still things mostly go right. The reasons to be optimistic are that there’s lots of irons in the fire only one of which needs to work, and there’s huge incentives pushing on everyone involved to get this done.

Like everyone else, I did get my hopes up somewhat, because it’s been quite the long slog, but the timeline we are supposed to reset to does not sound that different. The article talks about asking for emergency authorization in mid-November for Pfizer, with Paul Mango of Health and Human Services expressing hope we can start with our most vulnerable by the end of the year, finish that leg by January, and everyone who wants it by March or April. That seems pretty great if it happens.

We are warned that we ‘may not get a vaccine by the end of 2020’ but that’s a warning that’s been loud and clear the entire time. So what’s the expectations reset here? And what has unexpectedly or unpredictably gone wrong, given the results themselves all seem promising?

Another way to put this is, there have been delays off the engineering schedule, but only those who lack engineering experience expect things to happen according to the engineering schedule.

Particular trials are behind schedule because they were paused or not enough people were infected, but are those things a surprise? Should we have expected anything else? Were these problems inevitable? I also felt the need to double check one other question: What exactly happens when these trials are paused, anyway?

The only thing that happens when a trial is paused is that they stop recruiting new patients. There is no reason to stop following the people already in the trial, you can’t undo a vaccination, and doing only part of a vaccination is not advisable. So pausing a trial for five weeks will slow it down if it hasn’t reached its ideal size, but presuming one is looking for a critical mass of data, it will be much less than five weeks of delay.

This also suggests that it is vitally important, when starting a trial, to get everyone treated as rapidly as possible once you start treating anyone. Otherwise, every time even one person in any arm of the trial has any serious medical issue, which is likely to happen even if the drug is a sugar pill, you end up paused, perhaps for a long time.

The second problem the trials are running into is that not enough people in the trials are catching the virus, which I assume refers to the control group.

Which raises an obvious question. Infections are clearly on the rise. If you’re not getting enough infections now, what was the expectation? What went wrong?

The pauses are getting in the way, but are far shorter than the delays and at least somewhat expected, so that’s at most a minor explanation.

If anything there should be substantially more people infected in the control group than one would have expected. There were previously concerns that cases were down and that (as welcome as it would have been) would interfere with gathering data. The opposite happened, so what’s the problem?

The trials were too small. There were some reports of problems recruiting for the trials, especially among desired subgroups. This of course is partly because it’s illegal to pay participants, but the importance of these trials is off the charts and a substantial portion of people I know would have volunteered if asked. It would not have taken much effort to find more volunteers.

The most basic explanation is ‘you should get many times more people than you need in order to finish quicker than you expect, and then double it again’ and they instead tried to get ‘enough to expect to finish on time’ and then stuff happened.

Or, rather, the main thing didn’t happen much, and this took them by surprise when it really shouldn’t have.

The Old and the Cautious

My guess is a key thing that happened is that they failed to anticipate how careful their trial participants would be compared to the general population.

A point I have emphasized time and again is that most Covid risk comes from doing stupid stuff. Not wearing a mask, indoor gatherings including indoor dining, massive packed political rallies, college students being college students, that sort of thing. There are also some jobs that require repeatedly taking substantial risk, but mostly the statistics I’ve seen suggest that proper mask and barrier precautions work wonders even then. Each of several factors in an activity changes the order of magnitude of risk taken. Those changes multiply together. Many people are doing radically fewer activities with people they don’t live with while taking lots of precautions, while others act as if everything is normal.

The people who take the stupid risks take them repeatedly, so the people you interact with when taking stupid risks are stupidly risky people to interact with, multiplying risk again.
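The multiplicative structure described above can be sketched as follows. The specific multipliers are invented purely for illustration, not measured values:

```python
from functools import reduce

# Invented, illustrative multipliers for how individual choices scale
# the risk of one encounter -- not measured values, only the
# multiplicative structure described in the text.
FACTORS = {
    "indoors": 10.0,
    "no_masks": 4.0,
    "prolonged_close_talk": 3.0,
    "partner_takes_stupid_risks": 5.0,
}

def relative_risk(chosen):
    """Multiply together whichever factors apply to an activity."""
    return reduce(lambda acc, name: acc * FACTORS[name], chosen, 1.0)

careful = relative_risk([])              # baseline encounter
careless = relative_risk(list(FACTORS))  # all four at once: 600x baseline
```

The point is not any individual number; it is that stacking a handful of bad choices moves risk by orders of magnitude, while avoiding them all keeps it near baseline.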

If you do a study with people who care enough about not catching Covid-19 to seek out a vaccine trial, most of your sample is not going to react to maybe being vaccinated by throwing caution to the wind. They’re going to be almost as cautious as before, because that’s the type of people who sign up. The drug companies intuitively thought they’d get an infection rate not too far from population baseline, and (I am guessing) instead got one radically lower.

On reflection, like everyone else I want to see the treatment group and know whether the vaccines work, but I am also remarkably curious about the control group. The control group is not a representative sample of the population. It is a sample of a very interesting subpopulation. Let’s find out what types of activities these people participate in and how often by asking them, and let’s compare their infection rate to the general population adjusting for various known factors, and let’s run the whole correlation matrix and see what happens. If we can’t run studies otherwise, we have to take advantage of every opportunity we can get.

On Herd Immunity (A Version of This Likely Will Become Its Own Post)

Last week I had an influx of people reading my weekly column for the first time, by way of Twitter outrage and the inevitable Streisand Effect. Lots of people were pointing at this supposedly wrong and awful thing to say how wrong and awful it was. They pointed out an important misquote which I corrected. Mostly it was a lot of ‘experts/doctors/Fauci say X and he says ~X so he’s wrong and dangerous’ including after it became clear that once the quote was corrected I locally agreed with Fauci.

Whole event was stressful at an already stressful time for all of us, but did double the post’s readership and correct an important error, so overall I can’t complain.

The only other substantive complaint was that several sources rightfully challenged my herd immunity claim. It is quite a bold claim, and a new reader wouldn’t know how I was justifying it, as I hadn’t gone over my reasoning recently, and perhaps had never put it all together in one place.

Time to do that now, as promised.

Early in the pandemic, I did an analysis post called On R0. Looking back on it now, I see a bunch of stuff I got wrong, but I still agree with the three central points of the post absent substantial mutations of the virus.

The first central point is that if everyone lives their life as if Covid-19 does not exist, we know R0 started out somewhere between 2.5 and 4 in most places, perhaps up to 8 in a few like New York City. If we reduce risk by 75%, we will suppress and potentially eventually eradicate Covid-19 in all but a few places. In those few places, we’ll need more.
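The arithmetic behind that first point is simple enough to write down explicitly, using the post’s own R0 figures:

```python
def effective_r(r0, risk_reduction):
    # Reproduction number after everyone cuts their transmission risk
    # by the given fraction.
    return r0 * (1 - risk_reduction)

# The post's range: 2.5-4 in most places, up to 8 in a few like NYC.
for r0 in (2.5, 4.0, 8.0):
    print(f"R0 {r0}: a 75% reduction gives {effective_r(r0, 0.75):.3f}")
```

A 75% cut takes an R0 of 2.5 down to about 0.6 and an R0 of 4 to exactly 1, but leaves an R0-of-8 place at 2, which is why a few places need more.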

The second central point is that there are a lot of relatively cheap and available ways to cut those risks by 75% or more. Most people I know have cut their risk by 90% or more, before considering that those around them have also mostly done the same.

The third central point is that different people have radically different risk levels, that the riskier people are taking most of the risks, frequently with each other, while others have radically reduced their risks, and that the risk takers will mostly get infected first.

Combine those points and it seems obvious to me that by the time you get 50% infected, you’re going to reduce effective risk by more than 75%. Even with no other changes, that’s a victory condition for most places, and there will doubtless be some amount of behavioral change. We are not going to shake hands as often as we did, or be quite as up close talking, and will be more wary if we start coughing and so on, even if masks mostly fall out of favor once things die down. My conservative guess is that the needed number is more like 35%.

The classic objection is to point to the classic SIR model – susceptible, infected, recovered. This model treats everyone as identical people who take identical risks identically often with identical others completely at random, and are all equally susceptible to infection. That model is, upon reflection, either a toy model upper bound to be used in the spirit in which it was created, or obvious nonsense if taken seriously.
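The gap between the homogeneous SIR threshold and the heterogeneous picture can be made concrete with a toy two-group model. The group sizes and activity levels below are invented for illustration; the sketch assumes activity scales both exposure and infectiousness, with proportional mixing, so infection depletes the riskiest group first:

```python
import math

def herd_threshold(r0, group_sizes, activity):
    """Attack rate at which R_eff falls to 1 when both exposure and
    infectiousness scale with a group's activity level (proportional
    mixing), so infection depletes the riskiest groups first.

    After cumulative force of infection F, a fraction 1 - exp(-a*F) of
    a group with activity a has been infected, and
    R_eff(F) = r0 * sum(p * a^2 * exp(-a*F)) / sum(p * a^2).
    """
    norm = sum(p * a * a for p, a in zip(group_sizes, activity))

    def r_eff(force):
        return r0 * sum(p * a * a * math.exp(-a * force)
                        for p, a in zip(group_sizes, activity)) / norm

    lo, hi = 0.0, 50.0  # r_eff is decreasing in force, so bisect
    for _ in range(100):
        mid = (lo + hi) / 2
        if r_eff(mid) > 1:
            lo = mid
        else:
            hi = mid
    force = (lo + hi) / 2
    return sum(p * (1 - math.exp(-a * force))
               for p, a in zip(group_sizes, activity))

R0 = 3.0
homogeneous = 1 - 1 / R0  # classic SIR threshold, about 67%
# Invented illustrative split: a quarter of people are 6x as active.
heterogeneous = herd_threshold(R0, [0.25, 0.75], [3.0, 0.5])
```

With these made-up parameters the threshold lands in the low 30s of percent rather than the homogeneous 67%. The exact number is an artifact of the invented split; the direction, and its size, are the point.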

The even stronger objection is that you need to infect enough people so that the most risk taking communities have R0 below 1, but then treat infection as if it is evenly distributed. This isn’t a straw man, this happens periodically and happened again last week. The conclusion is then that you need e.g. 90% infected. But this is both lacking one’s cake and not eating it too. If different groups are differently risky enough that it matters, then the infections seek out the risky first.

Thus, the first part of the explanation is that my reasonable range for the original herd immunity threshold is between 25% and 50%. What we’ve seen since then seems compatible with this range. On first principles I would have put substantial weight on the lower range here or even lower, but the evidence since then seems to reject the lower end of my range. It also seems more consistent with the higher end of my range than it would with higher numbers, as one area after another peaks relatively quickly once spread gets completely out of hand, almost no matter what policies are chosen, combined with the antibody test data suggesting that infection rates are much higher than the PCR test results alone.

The second part of the explanation is that there are a lot more cases than positive tests. In most places, I believe we are detecting at most 25% or so of cases, and that number used to be even lower. The higher the positive test rate, the higher the percentage of cases we are likely missing. As of writing this, about 2.8% of the population has tested positive, but I am guessing more like 18% of the population was positive at some point. This roughly matches the machine learning source I was using (which is no longer updating) and other similar sources like this one. There is of course disagreement on the exact number, but mostly I see people with broadly similar guesses and people who effectively treat positive tests and infections as identical. I see few in between.

In my model, that 18% is the only reason cases are only increasing at something like 10%/week instead of going full hockey stick in a similar way to Europe. Over time, more people become immune, but also over time people are taking on more risks, as fatigue sets in and now as the weather gets colder. The two effects work to offset each other, and people adjust their behaviors for the death rate, which combine to create a control system.

To figure out how many deaths would be required to hit herd immunity, we need to know two things: How many must be infected, and the infection fatality rate for new infections. If 18% are already infected and the threshold is at most 50%, then we need 32% additional infections out of 331 million people, or about 106 million new infections. What is the IFR?

We don’t have great data, the same way we don’t have great data on many other things, and for the same reasons. What we do know is that the IFR has dropped dramatically. Early in the pandemic, I was estimating 0.6% in the absence of medical system collapse, and using a 1% IFR for my calculations to be conservative. At this point, I think the upper bound is 0.4% or so, and my best guess is around 0.2%.

If the IFR is 0.2% and we need to infect 106 million people, then we would need just over 200k deaths from here to reach 50% infected, which puts the cap safely well under 500k deaths. If the IFR instead stays at the 0.4% upper bound, then it would be double that, and if the necessary infected were on the high end we would go somewhat above 500k deaths to around 650k. That seems like the worst case scenario absent medical system collapse or fading immunity, and my median estimate – again, assuming there isn’t a medical system collapse – is that this ends around 400k American deaths unless we get a vaccine in time to improve that.
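Laid out explicitly, the arithmetic of those scenarios looks like this; every input is one of the estimates above, not new data:

```python
POPULATION = 331_000_000
ALREADY_INFECTED = 0.18  # the post's estimate of infections to date

def deaths_to_threshold(threshold, ifr):
    """Additional deaths implied by infecting everyone from here up to
    the herd immunity threshold, at a given infection fatality rate."""
    new_infections = max(0.0, threshold - ALREADY_INFECTED) * POPULATION
    return new_infections * ifr

base = deaths_to_threshold(0.50, 0.002)   # 50% threshold, 0.2% IFR: ~212k
worst = deaths_to_threshold(0.50, 0.004)  # 0.4% IFR doubles that: ~424k
```

The 32% gap times 331 million gives the "about 106 million new infections" figure, and the 0.2% IFR turns that into the "just over 200k deaths from here" estimate.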

An intuition pump might be that I think we are roughly 1/3rd of the way through in terms of infections, but the death rate has been at least cut in half, so the later two-thirds shouldn’t kill more people than have already died.

That’s a lot of deaths, and deaths are really bad! It also would mean a lot of suffering. If we have a practical way to cut that down to 300k, it’s worth a lot to do that. I just don’t see any sign that we have a practical path to that outcome.

The new mutation throws a wrench into all of this, potentially driving the necessary thresholds much higher. If the new virus is twice as infectious in the sense that under typical conditions an infected person now infects an average of 8 people rather than 4, then we have to cut risk by 88% instead of 75%. That’s a lot harder.

If I had to make a rough guess, I’d say that it’s probably not as bad as that. I haven’t seen any clear estimates, but Europe is now almost entirely dealing with the new strain as per the Financial Times and other sources, and looks like it had an R0 between 1.2 (the most visible outside estimate I’ve seen) and 1.35 (my eyeballed ballpark estimate for France) before its new wave of lockdowns, despite widespread completely irresponsible behaviors. Before that, governments seemed to be going for a balance to keep R0 at or just below 1 (in some cases like Cuomo here in America we even hear goals just above one, which is madness and completely innumerate) and we didn’t see anything approaching eradication. If anything, until things were well out of control, behavior was growing less responsible rather than more responsible. Thus, we can presume that the previous R0 was very close to 1.

Take that all together, and my guess is that the effective practical increase is on the order of 20%-35%, so we’ll need something like an 80% reduction rather than 75% to put the whole thing to bed. That’s a huge problem when it initially interacts with an ill-prepared control system, but it still seems like a target that is well within range of reasonable measures, and one that accumulating immunity would get us to not that long after it would have reached the 75% mark.
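The reduction targets in this section all follow from the standard relation that pushing R below 1 from a baseline R0 requires cutting transmission by 1 − 1/R0. A quick sketch checking the numbers (the doubling scenario and the 20%-35% range are the assumptions discussed above):

```python
def required_reduction(r0: float) -> float:
    """Fraction by which transmission must fall to bring R down to 1."""
    return 1 - 1 / r0

print(f"{required_reduction(4):.0%}")   # 75% for the old strain (R0 = 4)
print(f"{required_reduction(8):.0%}")   # 88% if infectiousness doubles
for increase in (0.20, 0.35):           # the guessed range for the new strain
    print(f"+{increase:.0%} -> {required_reduction(4 * (1 + increase)):.0%}")
```

The 20%-35% range lands at roughly a 79%-81% required reduction, i.e. "something like 80%".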

This is definitely not an argument that proves we should stop trying to contain Covid-19 and ‘go back to normal’, or rely on herd immunity ‘strategy’. If we stopped trying to contain it on all levels, it seems likely we would collapse the medical system, and we would lose many more people and politically likely be forced to do more lockdowns that would lead to severe damage on many fronts. Also, our preventative measures are likely themselves also reducing the death rate in other ways.

On the margin, however, it is informative of our choices, both individually and as a group. It informs us of the costs and benefits of different choices. If we are going to pay a massive price on many levels to try and do better than the scenario outlined above, we need to know the magnitude of the benefits, and compare it to the magnitude of the costs.

It is also entirely possible that I am wrong. I welcome challenges to the model (in both directions!), so long as they are not another iteration of ‘but experts say’ or ‘you’re not an expert’ or reiterating standard SIR without arguing why my objections to it aren’t valid. To the extent those objections have content, I’ve absorbed and responded to that content already.

If I am making a well-calibrated estimate given everything I know and believe, half the time I’m going to be too optimistic even after adjusting my estimate for such worries. If we lose more than 500k Americans before this is over without medical system collapse or a lot of reinfections, that would surprise me but not shock me. It would likely be because the IFR is somewhat higher than I think it is going forward, combined with us increasing our risk taking more than I expect, or if the mutation of the virus is a bigger deal than I’m estimating. I am not at all saying that this would be a Can’t Happen. I will however be shocked if the medical system does not collapse, reinfections are not a major factor, and we then see 750k deaths or even 1mm deaths.

Of course, we will only know the answer if this scenario happens. That seems increasingly likely, but it is far from certain.

The Good Doctor

Given what happened last week and my accidental misquoting of Dr. Fauci, I should explain how I see Dr. Fauci’s role in all this.

I believe Dr. Fauci mostly has an unusually accurate picture of what is happening, and that he is doing his best to contain the pandemic given his position and the tools available. He’s been doing his best to maintain public health while serving under several presidents. He has had a truly thankless job this time around, forced to fight constantly with the administration over even basic facts of the situation.

Mostly he’s been fighting the good fight. He also has a way with words. There’s a reason the public trusts and likes him so much.

That doesn’t mean we agree on everything, or that I always agree with the things he claims or suggests.

There are several issues.

The first issue is that Dr. Fauci is operating primarily on Simulacra Level 2 with some consideration for Level 1 – he tries to provide information that will cause us to change our perceptions of reality and thus alter our behavior in ways that will save lives, and worries some about whether his information is misleading or false because he thinks credibility matters too.

While not quite where I would want him to be, that is a huge improvement over the bulk of people in power, who are operating mostly on Simulacra Level 3 or even Level 4, and is why this section is titled The Good Doctor.

He has shown he is willing to tell what he feels are white lies in order to change the public’s behavior and save lives. I emphatically would not have done this in his place, but I am not judging. The ethical questions involved are difficult. In his position, most others would do the same.  In practice, I think the lies he and others told about masks were massively destructive in a way that was knowable in advance, but I do think that he lied with the best of intentions.

The whole projection of 400,000 deaths by the end of the year was another (but less egregious and not very harmful) similar case. I believe I know what he’s doing and why he’s doing it, and we can disagree about whether it matters that the numbers don’t add up unless you stare at them in exactly the right way. The fact that he’s up against an administration (that he’s nominally still working for!) that tells a different order of magnitude of lies, in both quantity and severity, feels relevant here as well.

Thus, when Dr. Fauci makes a statement, I have to hear it in light of his primary motive, which is to cause us to act responsibly and save lives. It makes sense to worry that he might tactically join the Doom Patrol in the context of a vaccine, if he was worried it would cause us to let our guards down too far too soon, and he felt the need to lay groundwork to prevent that.

Still, he didn’t do it last week, and I wasn’t as suspicious as I should have been when it looked like he was doing it. Yes, he’s willing to mislead when he sees it as necessary, but this particular claim didn’t fit the pattern. That’s why I felt it was important! It seemed like he was going full Doom Patrol at Level 3 on this issue, which would have been a change from his usual actions. Thus, I noticed that this was an important change that would require pushback and attention, but failed to notice that it also meant there was a good chance it hadn’t happened. I should not only have verified the quote more carefully, I should have been extra suspicious. I wasn’t, I need to do better, and that’s on me.

The second issue is a disagreement between our models. Dr. Fauci probably disagrees strongly with my model of future herd immunity, through some combination of disagreement over how many are infected, and disagreement over the heterogeneity effects. I believe these disagreements are genuine, but also that not much effort is being made to consider models like the one I outline.

The other disagreement we have is that Dr. Fauci (as far as I can tell) believes in the general wisdom of medical ethics and procedures, and in all the barriers we have that stop people from doing things. I don’t.

That’s the context in which I evaluate Dr. Fauci and his statements. I want him to keep his job. Of all the major authorities, his statements are the most credible, both in terms of his expertise and his honesty, but we still have to evaluate his statements in light of the fact that he is primarily telling us what he feels we need to hear in order to stay safe, and only secondarily telling us what he believes.

And, of course, sometimes and on some things, we will disagree.

Paths Forward

This was a super long weekly post because there was some overhead from last week to deal with, and the opportunity to flesh some of it out was a welcome distraction. There’s still a lot to do.

This week I gave a presentation of my updates on Covid-19 in a rationalist town hall. It ended up taking an hour, and there were tons of things I left out. It is high time I wrote up a summary of my current model and key things to know. If I can find the time and focus, I hope to turn that presentation and my thoughts above into a post or series of posts that can act as reference/entry points.

On a practical on-the-ground level, it’s getting steadily worse out there, and is unlikely to turn around in the next month or two. The best case short term scenario is that it soon stops getting worse in terms of real infections, while positive tests and deaths keep rising for a while due to lag. To the extent that infection risks are worth taking, they are not going to get less risky for a while. If you have Thanksgiving or Christmas plans, think carefully about the risks involved, and consider cancelling your plans or at a minimum reducing the size of your gatherings.

It is going to be ugly out there. Stay safe.

Discuss

Sub-Sums and Sub-Tensors

November 5, 2020 - 21:06
Published on November 5, 2020 6:06 PM GMT

This is the eighth post in the Cartesian frames sequence. Here, we define new versions of the sum and tensor of Cartesian frames that can delete spurious possible environments from the frame.

1. Motivating Examples

Consider two players, Alice and Bob, in a prisoner's dilemma. We will let W={0,1,2,3} be the space of utilities for Alice. The Cartesian frame for Alice looks like

C0 = (2 0; 3 1),

where the top row represents Alice cooperating, and the left column represents Bob cooperating.

Let C1 = (2 0) represent Alice committed to cooperating, and let C2 = (3 1) represent Alice committed to defecting. Since the real Alice can either cooperate or defect, one might expect that Alice (C0) would equal the sum (C1⊕C2) of Alice cooperating with Alice defecting. However,

C1⊕C2 = (2 0 2 0; 3 1 1 3).

The last two columns are spurious environments that represent Bob copying Alice's move, and Bob doing the opposite of Alice's move. However, since Bob cannot see Alice's move, Bob should not be able to implement these policies: Bob can only choose to cooperate or defect.

Next, consider a unilateralist's curse game where two players each have access to a button that destroys the Earth. If either player pushes the button, the Earth is destroyed. Otherwise, the Earth is not destroyed. W={0,1}, where 0 represents the world being destroyed and 1 represents the world not being destroyed.

Here, both players have the Cartesian frame

D1 = (0 0; 0 1),

where the first row represents pressing the button, and the first column represents the other player pressing the button.

The two players together can be expressed with the Cartesian frame

D2 = (0; 0; 0; 1),

where the rows in order represent: both players pressing the button; the first player pressing the button; the second player pressing the button; and neither player pressing the button.

One might expect that D1⊗D1 would be D2, but in fact,

D1⊗D1 = (0 0; 0 0; 0 0; 1 0).

The second possible environment, in which the earth is just destroyed regardless of what the players do, is spurious.

In both of the above examples, the spurious environments are only spurious because of our interpretation. In the prisoner's dilemma case, C1⊕C2 would be correct if Alice and Bob were playing a modified dilemma where Bob can see Alice's choice. In the unilateralist's curse example, D1⊗D1 would be correct if there were three people playing the game. The problem is that the ⊕ and ⊗ operations do not see our interpretation, and so include all possible environments.

2. Deleting Spurious Environments

We introduce two new concepts, called sub-sum and sub-tensor, which represent sum and tensor with some (but not too many) spurious environments removed.

Definition: Let C=(A,E,⋅), and let D=(B,F,⋆). A sub-sum of C and D is a Cartesian frame of the form (A⊔B,X,⋄), where X⊆Env(C⊕D) and ⋄ is Eval(C⊕D) restricted to (A⊔B)×X, such that C≃(A,X,⋄C) and D≃(B,X,⋄D), where ⋄C is ⋄ restricted to A×X and ⋄D is ⋄ restricted to B×X. Let C⊞D denote the set of all sub-sums of C and D.

Definition: Let C=(A,E,⋅), and let D=(B,F,⋆). A sub-tensor of C and D is a Cartesian frame of the form (A×B,X,∙), where X⊆Env(C⊗D) and ∙ is Eval(C⊗D) restricted to (A×B)×X, such that C≃(A,B×X,∙C) and D≃(B,A×X,∙D), where ∙C and ∙D are given by a∙C(b,x)=(a,b)∙x and b∙D(a,x)=(a,b)∙x. Let C⊠D denote the set of all sub-tensors of C and D.

Thus, we define C⊞D and C⊠D to be sets of Cartesian frames that can be obtained by deleting columns from C⊕D and C⊗D, respectively, but we have an extra restriction to ensure that we do not delete too many columns.

We will discuss later how to interpret the extra restriction, but first let us go back to our above examples.

If C1 = (2 0) and C2 = (3 1), then C1⊞C2 has 7 elements:

(2 0 2 0; 3 1 1 3), (2 0 2; 3 1 1), (2 0 0; 3 1 3), (2 2 0; 3 1 3), (0 2 0; 1 1 3), (2 0; 3 1), (2 0; 1 3).

The 9 Cartesian frames that can be obtained by deleting columns from C1⊕C2 that are not in C1⊞C2 are:

(2 2; 3 1), (2 0; 3 3), (0 2; 1 1), (0 0; 1 3), (2; 3), (0; 1), (2; 1), (0; 3), ().

The Cartesian frames above in C1⊞C2 are exactly those with all four entries, 0, 1, 2, and 3. This is because the extra restriction to be in C1⊞C2 is exactly that if you delete the bottom row, you get an object biextensionally equivalent to (2 0), and if you delete the top row, you get an object biextensionally equivalent to (3 1).
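This count is easy to verify mechanically. Here is a small sketch (my own illustration, not from the original post) that represents each column of C1⊕C2 as a (top entry, bottom entry) pair, and uses the fact that for a one-row frame, biextensional equivalence just means having the same set of entries:

```python
from itertools import combinations

# Columns of C1 ⊕ C2 in the prisoner's dilemma example:
# each column is (payoff if Alice cooperates, payoff if Alice defects).
columns = [(2, 3), (0, 1), (2, 1), (0, 3)]

def is_sub_sum(cols):
    # The restriction: deleting the bottom row must leave something
    # biextensionally equivalent to (2 0), and deleting the top row
    # something biextensionally equivalent to (3 1).  For one-row
    # frames, that just means the same set of entries appears.
    tops = {c[0] for c in cols}
    bottoms = {c[1] for c in cols}
    return tops == {2, 0} and bottoms == {3, 1}

sub_sums = [subset
            for r in range(len(columns) + 1)
            for subset in combinations(columns, r)
            if is_sub_sum(subset)]
print(len(sub_sums))  # 7, matching the count in the text
```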

Similarly, from the unilateralist's curse example,

D1⊠D1 = { (0; 0; 0; 1), (0 0; 0 0; 0 0; 1 0) }.

It is easy to see that there are no other Cartesian frames in D1⊠D1, since there are only four subsets of the two-element environment of D1⊗D1, and the Cartesian frames corresponding to the other two subsets do not have 1 in their image, so we cannot build anything biextensionally equivalent to D1 out of them.

Conversely, let C=(A,E,⋅) and D=(B,F,⋆) both be D1, and notice that if

(A×B,X,∙) = (0; 0; 0; 1),

then (A,B×X,∙C) and (B,A×X,∙D) are both two-by-two matrices with a single 1 entry and three 0 entries, and so must be isomorphic to D1. Similarly, if

(A×B,X,∙) = (0 0; 0 0; 0 0; 1 0),

then (A,B×X,∙C) and (B,A×X,∙D) are both two-by-four matrices with a single 1 entry and seven 0 entries, and so must be biextensionally equivalent to D1.
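The same mechanical check works here (my own sketch, not from the original post, using the argument above that these almost-all-zero frames are biextensionally equivalent to D1 exactly when a 1 appears in their image; by symmetry between the two players, one check suffices):

```python
from itertools import combinations

# D1 ⊗ D1 as a 4x2 matrix: rows are joint button choices (both press,
# only the first, only the second, neither); the two columns are the
# candidate environments: the ordinary world, and the spurious world
# in which Earth is destroyed regardless.
rows = [(0, 0), (0, 0), (0, 0), (1, 0)]

def equivalent_to_D1(cols):
    # Per the argument in the text, these almost-all-zero curried frames
    # are biextensionally equivalent to D1 exactly when a 1 appears
    # somewhere in the image.
    return any(row[c] == 1 for row in rows for c in cols)

sub_tensors = [X for r in range(3) for X in combinations(range(2), r)
               if equivalent_to_D1(X)]
print(len(sub_tensors))  # 2, matching the two elements of D1 ⊠ D1 above
```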

3. Properties of Sub-Sums and Sub-Tensors

3.1. Sub-Sums and Sub-Tensors Are Commutative

Claim: For any Cartesian frames C0 and C1, there is a bijection between C0⊞C1 and C1⊞C0 that preserves Cartesian frames up to isomorphism. Similarly, there is a bijection between C0⊠C1 and C1⊠C0 that preserves Cartesian frames up to isomorphism.

Proof: Trivial. □

3.2. Tensors Need Not Be Sub-Tensors

For any Cartesian frames C and D with nonempty environments, we have that C⊕D∈C⊞D. However, sometimes C⊗D∉C⊠D. Indeed, sometimes C⊠D={}.

For example, if C and D both have nonempty image, but there are no morphisms from C to D∗, then C⊗D has no environments, and it is easy to see that C⊠D must be empty.

3.3. Sub-Sums and Sub-Tensors Are Superagents

Claim: For any Cartesian frames C0 and C1, and for any D0∈C0⊞C1, we have C0◃D0 and C1◃D0. Similarly, for any D1∈C0⊠C1, we have C0◃D1 and C1◃D1.

Proof: Let Ci=(Ai,Ei,⋅i) and Di=(Bi,Fi,⋆i). First, we show C0◃D0 using the currying definition of subagent. Observe B0=A0⊔A1. Consider the Cartesian frame (A0,{e},⋄) over B0, where ⋄ is given by a⋄e=a. Observe that the definition of sub-sum says that C0≃D∘0(A0,{e},⋄). Therefore, C0◃D0, and by commutativity, we also have C1◃D0.

Similarly, we show C0◃D1 using the currying definition of subagent. Observe that B1=A0×A1. Consider the Cartesian frame (A0,A1,∙) over B1, where ∙ is given by a0∙a1=(a0,a1). Observe that the definition of sub-tensor says that C0≃D∘1(A0,A1,∙). Therefore, C0◃D1, and by commutativity, we also have C1◃D1. □

Observe that in the above proof, the Cartesian frame over B0 we constructed to show that sub-sums are superagents had a singleton environment, and the Cartesian frame over B1 we constructed to show that sub-tensors are superagents had a surjective evaluation function. The relevance of this observation will become clear later.

3.4. Biextensional Equivalence

As we do with most of our definitions, we will now show that sub-sums and sub-tensors are well-defined up to biextensional equivalence.

Claim: Given two Cartesian frames over W, C0 and C1, and given any D∈C0⊞C1, we have that for all C′0≃C0 and C′1≃C1, there exists a D′≃D, with D′∈C′0⊞C′1.

Proof: Let C0=(A0,E0,⋅0), let C1=(A1,E1,⋅1), and let D=(A0⊔A1,X,⋆) be an element of C0⊞C1, so X⊆E0×E1, and a⋆(e0,e1)=a⋅0e0 if a∈A0, and a⋆(e0,e1)=a⋅1e1 if a∈A1.

The fact that D∈C0⊞C1 tells us that for i∈{0,1}, Di≃Ci, where Di=(Ai,X,⋆i) with ⋆i given by a⋆0(e0,e1)=a⋅0e0 and a⋆1(e0,e1)=a⋅1e1.

For i∈{0,1}, let C′i=(A′i,E′i,⋅′i) satisfy C′i≃Ci. Thus, there exist morphisms (gi,hi):Ci→C′i, and (g′i,h′i):C′i→Ci, such that (g0,h0)∘(g′0,h′0), (g′0,h′0)∘(g0,h0), (g1,h1)∘(g′1,h′1), and (g′1,h′1)∘(g1,h1) are all homotopic to the identity.

We define a function f:E0×E1→E′0×E′1 by f(e0,e1)=(h′0(e0),h′1(e1)). Then, we define X′⊆E′0×E′1 by X′={f(x) | x∈X}, and let D′=(A′0⊔A′1,X′,⋆′), where a⋆′(e0,e1)=a⋅′0e0 if a∈A′0, and a⋆′(e0,e1)=a⋅′1e1 if a∈A′1. We need to show D′∈C′0⊞C′1, and that D′≃D.

To show that D′≃D, we will construct a pair of morphisms (j,k):D→D′ and (j′,k′):D′→D that compose to something homotopic to the identity in both orders. We define j:A0⊔A1→A′0⊔A′1 by j(a)=g0(a) if a∈A0, and j(a)=g1(a) if a∈A1. We similarly define j′:A′0⊔A′1→A0⊔A1 by j′(a)=g′0(a) if a∈A′0, and j′(a)=g′1(a) if a∈A′1. We define k′:X→X′ by k′(x)=f(x), which is clearly a function into X′, by the definition of X′. Further, k′ is surjective, and thus has a right inverse. We choose k:X′→X to be any right inverse to k′, so f(k(x))=x for all x∈X′.

To see that (j′,k′) is a morphism, observe that for a∈A′0⊔A′1, and (e0,e1)∈X, if a∈A′i, then

j′(a)⋆(e0,e1)=g′i(a)⋅iei=a⋅′ih′i(ei)=a⋆′(h′0(e0),h′1(e1))=a⋆′k′(e0,e1).

To see that (j,k) is a morphism, consider an arbitrary a∈A0⊔A1 and (e′0,e′1)∈X′, and let (e0,e1)=k(e′0,e′1). Then, if a∈Ai, we have

j(a)⋆′(e′0,e′1)=j(a)⋆′f(e0,e1)=j(a)⋆′(h′0(e0),h′1(e1))=gi(a)⋅′ih′i(ei)=g′i(gi(a))⋅iei=a⋅iei=a⋆(e0,e1)=a⋆k(e′0,e′1).

To see that (j′,k′)∘(j,k) is homotopic to the identity on D, observe that for all a∈A0⊔A1 and (e0,e1)∈X, we have that if a∈Ai,

j′(j(a))⋆(e0,e1)=g′i(gi(a))⋅iei=a⋅iei=a⋆(e0,e1).

Similarly, to see that (j,k)∘(j′,k′) is homotopic to the identity on D′, observe that for all a∈A′0⊔A′1 and (e0,e1)∈X′, we have that if a∈A′i,

j(j′(a))⋆′(e0,e1)=gi(g′i(a))⋅′iei=a⋅′iei=a⋆′(e0,e1).

Thus, D′≃D.

To see D′∈C′0⊞C′1, we need to show that D′i≃C′i, where D′i=(A′i,X′,⋆′i) with ⋆′i given by a⋆′0(e0,e1)=a⋅′0e0 and a⋆′1(e0,e1)=a⋅′1e1. It suffices to show that D′i≃Di, since Di≃Ci≃C′i.

For i∈{0,1}, we construct morphisms (ji,ki):Di→D′i, and (j′i,k′i):D′i→Di. We define ji=gi, j′i=g′i, ki=k, and k′i=k′.

To see that (ji,ki) is a morphism, observe that for all a∈Ai and x∈X′, we have

ji(a)⋆′ix=gi(a)⋆′ix=j(a)⋆′x=a⋆k(x)=a⋆iki(x),

and to see that (j′i,k′i) is a morphism, observe that for all a∈A′i, and x∈X, we have

j′i(a)⋆ix=g′i(a)⋆ix=j′(a)⋆x=a⋆′k′(x)=a⋆′ik′i(x).

To see (j′i,k′i)∘(ji,ki) is homotopic to the identity on Di, observe that for all a∈Ai and x∈X, we have

j′i(ji(a))⋆ix=g′i(gi(a))⋆ix=j′(j(a))⋆x=a⋆x=a⋆ix,

and similarly, to show that (ji,ki)∘(j′i,k′i) is homotopic to the identity on D′i, observe that for all a∈A′i and x∈X′, we have

ji(j′i(a))⋆′ix=gi(g′i(a))⋆′ix=j(j′(a))⋆′x=a⋆′x=a⋆′ix.

Thus, we have that D′i≃Di, completing the proof. □

We have a similar result for sub-tensors, whose proof directly mirrors the proof for sub-sums:

Claim: Given two Cartesian frames over W, C0 and C1, and any D∈C0⊠C1, we have that for all C′0≃C0 and C′1≃C1, there exists a D′≃D, with D′∈C′0⊠C′1.

Proof: Let C0=(A0,E0,⋅0), let C1=(A1,E1,⋅1), and let D=(A0×A1,X,⋆) be an element of C0⊠C1, so X⊆hom(C0,C∗1), and

(a,b)⋆(g,h)=a⋅0h(b)=b⋅1g(a).

The fact that D∈C0⊠C1 tells us that for i∈{0,1}, Di≃Ci, where Di=(Ai,A1−i×X,⋆i) with ⋆i given by

a⋆0(b,(g,h))=b⋆1(a,(g,h))=(a,b)⋆(g,h).

For i∈{0,1}, let C′i=(A′i,E′i,⋅′i) satisfy C′i≃Ci. Thus, there exist morphisms (gi,hi):Ci→C′i, and (g′i,h′i):C′i→Ci, such that (g0,h0)∘(g′0,h′0), (g′0,h′0)∘(g0,h0), (g1,h1)∘(g′1,h′1), and (g′1,h′1)∘(g1,h1) are all homotopic to the identity.

We define a function f:hom(C0,C1∗)→hom(C′0,C′1∗) by f(g,h)=(h′1,g′1)∘(g,h)∘(g′0,h′0). This function is well-defined, since (h′1,g′1)=(g′1,h′1)∗∈hom(C∗1,C′1∗) and (h′0,g′0)∈hom(C′0,C0).

Then, we define X′⊆hom(C′0,C′1∗) by X′={f(g,h) | (g,h)∈X}, and let D′=(A′0×A′1,X′,⋆′), where

(a,b)⋆′(g,h)=a⋅′0h(b)=b⋅′1g(a).

We need to show that D′∈C′0⊠C′1, and that D′≃D.

To show that D′≃D, we will construct a pair of morphisms (j,k):D→D′ and (j′,k′):D′→D that compose to something homotopic to the identity in both orders. We define j:A0×A1→A′0×A′1 by j(a,b)=(g0(a),g1(b)), and we similarly define j′:A′0×A′1→A0×A1 by j′(a,b)=(g′0(a),g′1(b)). We define k′:X→X′ by k′(x)=f(x), which is clearly a function into X′, by the definition of X′. Further, k′ is surjective, and thus has a right inverse. We choose k:X′→X to be any right inverse to k′, so f(k(x))=x for all x∈X′.

To see (j′,k′) is a morphism, observe that for (a,b)∈A′0×A′1, and (g,h)∈X, we have

j′(a,b)⋆(g,h)=(g′0(a),g′1(b))⋆(g,h)=g′1(b)⋅1g(g′0(a))=b⋅′1h′1(g(g′0(a)))=(a,b)⋆′(h′1∘g∘g′0,h′0∘h∘g′1)=(a,b)⋆′k′(g,h).

To see that (j,k) is a morphism, consider an arbitrary (a,b)∈A0×A1 and (g′,h′)∈X′, and let (g,h)=k(g′,h′). Then, we have:

j(a,b)⋆′(g′,h′)=(g0(a),g1(b))⋆′f(g,h)=(g0(a),g1(b))⋆′(h′1,g′1)∘(g,h)∘(g′0,h′0)=g1(b)⋅′1h′1(g(g′0(g0(a))))=g′1(g1(b))⋅1g(g′0(g0(a)))=b⋅1g(g′0(g0(a)))=g′0(g0(a))⋅0h(b)=a⋅0h(b)=(a,b)⋆(g,h)=(a,b)⋆k(g′,h′).

To see that (j′,k′)∘(j,k) is homotopic to the identity on D, observe that for all (a,b)∈A0×A1 and (g,h)∈X, we have:

j′(j(a,b))⋆(g,h)=(g′0(g0(a)),g′1(g1(b)))⋆(g,h)=g′1(g1(b))⋅1g(g′0(g0(a)))=b⋅1g(g′0(g0(a)))=g′0(g0(a))⋅0h(b)=a⋅0h(b)=(a,b)⋆(g,h).

Similarly, to see that (j,k)∘(j′,k′) is homotopic to the identity on D′, observe that for all (a,b)∈A′0×A′1 and (g,h)∈X, we have:

j(j′(a,b))⋆′(g,h)=(g0(g′0(a)),g1(g′1(b)))⋆′(g,h)=g1(g′1(b))⋅′1g(g0(g′0(a)))=b⋅′1g(g0(g′0(a)))=g0(g′0(a))⋅′0h(b)=a⋅′0h(b)=(a,b)⋆′(g,h).

Thus, D′≃D.

To see D′∈C′0⊠C′1, we need to show that D′i≃C′i, where D′i=(A′i,A′1−i×X′,⋆′i) with ⋆′i given by

a⋆′0(b,(g,h))=b⋆′1(a,(g,h))=(a,b)⋆′(g,h).

It suffices to show that D′i≃Di, since Di≃Ci≃C′i.

For i∈{0,1}, we construct morphisms (ji,ki):Di→D′i, and (j′i,k′i):D′i→Di. We define ji=gi and j′i=g′i. We define ki:(A′1−i×X′)→(A1−i×X) by ki(a,x)=(g′1−i(a),k(x)), and similarly define k′i:(A1−i×X)→(A′1−i×X′) by k′i(a,x)=(g1−i(a),k′(x)).

To see that (j0,k0) is a morphism, observe that for all a∈A0 and (b,(g,h))∈A′1×X′, we have:

a⋆0k0(b,(g,h))=a⋆0(g′1(b),k(g,h))=(a,g′1(b))⋆k(g,h)=j(a,g′1(b))⋆′(g,h)=(g0(a),g1(g′1(b)))⋆′(g,h)=g1(g′1(b))⋅′1g(g0(a))=b⋅′1g(g0(a))=(g0(a),b)⋆′(g,h)=j0(a)⋆′0(b,(g,h)).

To see that (j1,k1), (j′0,k′0), and (j′1,k′1) are morphisms is similar.

To see (j′0,k′0)∘(j0,k0) is homotopic to the identity on D0, observe that for all a∈A0 and (b,(g,h))∈A1×X, we have

j′0(j0(a))⋆0(b,(g,h))=(g′0(g0(a)),b)⋆(g,h)=g′0(g0(a))⋅0h(b)=a⋅0h(b)=(a,b)⋆(g,h)=a⋆0(b,(g,h)),

and seeing that (j′1,k′1)∘(j1,k1), (j0,k0)∘(j′0,k′0), and (j1,k1)∘(j′1,k′1) are homotopic to the identity is similar.

Thus, we have that D′i≃Di, completing the proof. □

In our next post, we will use sub-sum and sub-tensor to define additive subagents, which are like agents that have committed to restrict their class of options; and multiplicative subagents, which are like agents that are contained inside other agents. We will also introduce the concept of sub-environments.

I'll be hosting online office hours this Sunday at 2-4pm PT for discussing Cartesian frames.

Discuss

Defining capability and alignment in gradient descent

November 5, 2020 - 17:36
Published on November 5, 2020 2:36 PM GMT

This is the first post in a series where I'll explore AI alignment in a simplified setting: a neural network that's being trained by gradient descent. I'm choosing this setting because it involves a well-defined optimization process that has enough complexity to be interesting, but that's still understandable enough to make crisp mathematical statements about. As a result, it serves as a good starting point for rigorous thinking about alignment.

Defining inner alignment

First, I want to highlight a definitional issue. Right now there are two definitions of inner alignment circulating in the community. This issue was first pointed out to me by Evan Hubinger in a recent conversation.

The first definition is the one from last year's Risks from Learned Optimization paper, which Evan co-authored and which introduced the term. This paper defined the inner alignment problem as "the problem of eliminating the base-mesa objective gap" (Section 1.2). The implication is that if we can eliminate the gap between the base objective of a base optimizer, and the mesa-objectives of any mesa-optimizers that base optimizer may give rise to, then we will have satisfied the necessary and sufficient conditions for the base optimizer to be inner-aligned.

There's also a second definition that seems to be more commonly used. This definition says that "inner alignment fails when your capabilities generalize but your objective does not". This comes from an intuition (pointed out to me by Rohin Shah) that the combination of inner alignment and outer alignment should be accident-proof with respect to an optimizer's intent: an optimizer that's both inner- and outer-aligned should be trying to do what we want. Since an outer-aligned optimizer is one whose base objective is something we want, this intuition suggests that the remaining part of the intent alignment problem — the problem of getting the optimizer to try to achieve the base objective we set — is what inner alignment refers to.

Here I'll try to propose more precise definitions of alignment and capability in an optimizer, and explore what generalization and robustness might mean in the context of these properties. I'll also propose ways to quantify the capability and alignment profiles of existing ML systems.

But before doing that, I want to motivate these definitions with an example.

The base objective

The optimizer I'll be using as my example will be a gradient descent process, which we're going to apply to train a simplified neural network. I want to emphasize that I'm treating gradient descent as the optimizer here — not the neural network. The neural network isn't necessarily an optimizer itself, it's just the output artifact of our gradient descent optimizer.

To make this scenario concrete, we'll imagine the neural network we're training is a simplified language model: a feedforward MLP with a softmax layer at the top. The softmax layer converts the MLP's output activations into a probability distribution over next words, and the model gets scored on the cross-entropy loss between that probability distribution, and the actual next word that appears in the training text. (This ignores many of the complications of modern language models, but I'm keeping this example simple.)

So at a given training step t, the loss function for our language model is

L(t) = (1/N) ∑i=1..N −yi ⋅ log(softmax(f(xi, θ(t))))

I'll refer to the function f(xi,θ(t)) as “the neural network”. Here, “⋅” is the dot product.
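For concreteness, here is a minimal numpy sketch of this loss; the single-layer "MLP", the vocabulary size, and the random data are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy "language model": a single linear layer f(x, theta) = x @ theta.
vocab, d, N = 5, 8, 4
theta = rng.normal(size=(d, vocab))             # the parameters theta(t)
x = rng.normal(size=(N, d))                     # N training inputs
y = np.eye(vocab)[rng.integers(vocab, size=N)]  # one-hot actual next words

# L(t) = (1/N) * sum_i  -y_i . log(softmax(f(x_i, theta(t))))
probs = softmax(x @ theta)
loss = np.mean(-np.sum(y * np.log(probs), axis=1))
print(loss)  # the cross-entropy between predicted and actual next words
```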

Notice that L(t) here is our base objective: it's the quantity we're trying to get our gradient descent process to optimize for. If we'd succeeded in solving the entire outer alignment problem, and concluded that the base objective L(t) was the only quantity we cared about optimizing, then the remaining challenge — getting our gradient descent process to actually optimize for L(t) — would constitute the inner alignment problem, by our second definition above.

So the question now is: under what conditions does gradient descent actually optimize for our base objective?

The true objective

To answer this, we can try to determine which quantity gradient descent is truly optimizing for, and then look at how and when that quantity correlates with the base objective we really care about.

We can start by imagining the (t+1)th step of gradient descent as applying a learning function L to the parameters in θ(t):

L(θ(t))=θ(t+1)=θ(t)+Δθ(t)

Running gradient descent consists of applying L(⋅) repeatedly to θ(0):

L^t(θ(0)) = θ(t)

In the long run, gradient descent should converge on some terminal value θ∗=limt→∞θ(t). (For now, we'll assume that this limit exists.)

The key characteristic of a terminal value θ∗ (when it exists) is that it's a fixed point of the dynamical system defined by L(⋅). In other words:

L(θ∗)=θ∗

Some of the fixed points θ∗ of this system will coincide with global or local minima of our base objective, the cross-entropy loss L(t) — but not all of them. Some will be saddle points, while others will be local or global maxima. And while we don't consider all these fixed points to be equally performant with respect to our base objective, our gradient descent optimizer does consider them all to be equally performant with respect to its true objective.

This disagreement is the core of the inner alignment problem in this setting: our gradient descent process isn't always optimizing for the quantity we want it to. So what quantity is it optimizing for?

When we apply one step of gradient descent, we update each parameter in our neural network by an amount equal to a learning rate, times the error in that parameter that we calculate during backprop on the loss function L(t). The update we apply to the jth parameter, to move it from θj(t) to θj(t+1), can be written as

Δθj(t) = −ϵ(t) (∂L/∂θj)|θ=θ(t)

Here, ϵ(t) represents our learning rate at time step t.

So our gradient descent optimizer will terminate if and only if there exists some time step t∗ such that Δθj(t∗)=0, across all parameters j. (For a fixed learning function L, this condition implies that the gradient updates are zero for all t≥t∗ as well.) And this happens if and only if the sum of the gradients

G(t) = ∑j=1..M |∂L/∂θj| (evaluated at θ = θ(t))

is equal to zero when t≥t∗.

But G(t) represents more than just the terminal condition for our optimizer. It's the quantity that gradient descent is actually trying to minimize: anytime G(t) deviates from zero, the amount of optimization power that's applied to move G(t) towards zero is proportional to G(t) itself. That makes G(t) the true objective of our gradient descent optimizer — it's the loss function that gradient descent is actually optimizing for.
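A toy numeric illustration of this point (my own example, not from the post): on a simple two-parameter saddle, gradient descent drives G(t) to zero at a fixed point that is not a minimum of L, matching the claim that not all fixed points are equally performant on the base objective.

```python
import numpy as np

def L(theta):                      # base objective: a simple saddle, L = a^2 - b^2
    a, b = theta
    return a**2 - b**2

def grad(theta):                   # gradient of L
    a, b = theta
    return np.array([2 * a, -2 * b])

def G(theta):                      # the "true objective": summed |dL/dtheta_j|
    return np.abs(grad(theta)).sum()

eps = 0.1
theta = np.array([1.0, 0.0])       # b = 0: we start on the saddle's stable direction
for t in range(200):
    theta = theta - eps * grad(theta)

print(G(theta))                    # ~0: a fixed point, even though it's a saddle
print(L(theta))                    # ~0, far from minimal (L is unbounded below)
```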

So now we have a base objective L(t), which we've assigned to an optimizer; and we have a true objective G(t), which is the one our optimizer is actually pursuing. Intuitively, the inner alignment of our optimizer seems like it would be related to how much, and under what circumstances, L(t) correlates with G(t) over the course of a training run. So we'll look at that next.

Two examples

Let's now consider two optimizers, A and B. Optimizers A and B are identical apart from one difference: Optimizer A has its parameters initialized at θA(0), while Optimizer B has its parameters initialized at θB(0).

As luck would have it, this small difference is enough to put θA(t) and θB(t) into different basins of attraction of the loss function. As a result, our two optimizers end up in different terminal states:

$$\lim_{t\to\infty}\theta^A(t) = (\theta^A)^*, \qquad \lim_{t\to\infty}\theta^B(t) = (\theta^B)^*$$

These two terminal states also correspond — again, by luck in this example — to different values of the base objective. Indeed, it turns out that θA(0) is in the basin of attraction of a global minimum of the loss function, while θB(0) is in the basin of attraction of a local minimum. As a result, after many training steps, the base objectives of the two optimizers end up converging to different values:

$$\lim_{t\to\infty}L^B(t) > \lim_{t\to\infty}L^A(t)$$

Again, the limit of the loss function LA(t) is less than the limit of LB(t) because (θA)∗ corresponds to a global minimum, while (θB)∗ only corresponds to a local minimum. So Optimizer A is clearly better than Optimizer B, from the standpoint of its performance on our base objective — minimization of the loss function.

But crucially, because (θA)∗ and (θB)∗ both represent fixed points with zero gradients, the true objectives of the two optimizers both converge to zero in the limit:

$$\lim_{t\to\infty}G^A(t) = \lim_{t\to\infty}G^B(t) = 0$$

In other words, Optimizer A and Optimizer B are equally good at optimizing for their true objectives. Optimizer A just does a better job of optimizing for the base objective we want, as a side effect of optimizing for its true objective. Intuitively, we might say that Optimizers A and B are equally capable with respect to their true objectives, while Optimizer A is better aligned with our base objective than Optimizer B is.
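A toy version of this scenario can be simulated directly. The double-well loss below, its minima, and the two initializations are hypothetical stand-ins for the basins of attraction described above:

```python
# Toy 1-D loss with a global minimum (near x = -1) and a local minimum
# (near x = +1). Hypothetical stand-in for the post's loss landscape.
def L(x):
    return (x**2 - 1.0)**2 + 0.3 * x

def dL(x):
    return 4.0 * x * (x**2 - 1.0) + 0.3

def run(theta0, eps=0.01, steps=5000):
    theta = theta0
    for _ in range(steps):
        theta -= eps * dL(theta)
    return theta

theta_A = run(-2.0)   # Optimizer A: basin of the global minimum
theta_B = run(+2.0)   # Optimizer B: basin of the local minimum

# Both reach (near-)zero gradient -- equally good on the "true objective"...
print(abs(dL(theta_A)), abs(dL(theta_B)))
# ...but A attains a lower value of the base objective L.
print(L(theta_A) < L(theta_B))
```

Both runs drive the gradient, and hence G, to zero, but only the run started at −2 finds the global minimum — mirroring the claim that A and B are equally capable on their true objective while differing in alignment with the base objective.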

Let's look at a second example. This time we'll compare Optimizer A to a third optimizer, Optimizer C. These two optimizers are again identical, apart from one detail: while Optimizer A uses learning rate decay with limt→∞ϵA(t)=0, Optimizer C uses a constant learning rate with ϵC(t)=ϵC.

As a result of its learning rate decay schedule, Optimizer A converges on a global minimum in the t→∞ limit. But Optimizer C, with its constant learning rate, doesn't converge the same way. While it's drawn towards the same global minimum as Optimizer A, Optimizer C ends up orbiting the minimum point chaotically, without ever quite reaching it — its finite learning rate means it never perfectly hits the global minimum point, no matter how many learning steps we give it. As a result,

$$\lim_{t\to\infty}G^C(t) > \lim_{t\to\infty}G^A(t) = 0$$

(To be clear, this is an abuse of notation: in reality limt→∞GC(t) generally won't be well-defined for a chaotic orbit like this. But we can think of this instead as denoting the long-term limit of the average of GC(t) over a sufficiently large number of time steps.)

Intuitively, we might say that Optimizer A is more capable than Optimizer C, since it performs better, in the long run, on its true objective.

Optimizer A also performs better than Optimizer C on our base objective:

$$\lim_{t\to\infty}L^C(t) > \lim_{t\to\infty}L^A(t)$$

And interestingly, Optimizer A's better performance than C on our base objective is a direct result of its better performance than C on its true objective. So we might say that, in this second scenario, Optimizer C's performance on the base objective is capability-limited. If we improved C's capability on its true objective, we could get it to perform better on the base objective, too.
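The difference between Optimizers A and C can be reproduced in a toy setting. The quadratic loss and the particular schedules here are illustrative assumptions; a constant learning rate of exactly 1.0 makes the iterate orbit the minimum forever, while a decaying schedule converges:

```python
# Toy illustration on the quadratic loss L(x) = x^2 (so dL/dx = 2x) of how
# a decaying learning rate converges while a constant one orbits forever.
def run(eps_schedule, x0=1.0, steps=1000):
    x = x0
    for t in range(steps):
        x -= eps_schedule(t) * 2.0 * x
    return x

x_A = run(lambda t: 1.0 / (t + 3))  # Optimizer A: decaying learning rate
x_C = run(lambda t: 1.0)            # Optimizer C: constant learning rate

print(abs(2.0 * x_A))  # G^A -> 0: A converges on the minimum
print(abs(2.0 * x_C))  # G^C stays at 2.0: C bounces around x = 0 forever
```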

Capability and alignment

With those intuitions in hand, I'll propose the following two definitions.

Definition 1. Let Lt be a base optimizer acting over t optimization steps, and let L(t) represent the value of its base objective at optimization step t. Then the capability of Lt with respect to the base objective L(t) is

$$C(L) = \lim_{T\to\infty}\frac{1}{T}\sum_{t=1}^{T}\big(L(t) - L(0)\big)$$

Definition 2. Let LtB be a base optimizer with base objective L(t), and LtM be a mesa-optimizer with mesa-objective G(t). Then the mesa-optimizer's alignment with the base optimizer is given by

$$\alpha(L^B, L^M) = \lim_{T\to\infty}\frac{\sum_{t=1}^{T}\big(L(t) - L(0)\big)}{\sum_{t=1}^{T}\big(G(t) - G(0)\big)}$$

If C(LB) and C(LM) are both finite, we can also write LtM's alignment with LtB as

$$\alpha(L^B, L^M) = \frac{C(L^B)}{C(L^M)}$$

The intuition behind these definitions is that the capability C(⋅) of an optimizer is the amount by which the optimizer is able to improve its objective over many optimization steps. One way in which a base optimizer can try to improve its base objective is by delegating part of its optimization work to a mesa-optimizer, which has its own mesa-objective. The alignment factor α in Definition 2 is a way of quantifying how effective that delegation is: to what extent does the mesa-optimizer's progress in optimizing for its mesa-objective "drag along" the base objective of the base optimizer that created it?

In our gradient descent example, our mesa-optimizer LtM was the gradient descent process, and its mesa-objective was what, at the time, I called the "true objective", G(t). But the base optimizer LtB was the human who designed the neural network and ran gradient descent on it. If we think of this human as being our base optimizer, then we can write the capability of our human designer as

$$C(L^B) = \alpha(L^B, L^M)\,C(L^M)$$

In other words, if a base optimizer delegates its objective to a mesa-optimizer, then that base optimizer's capability is equal to the capability of that mesa-optimizer, times how well-aligned the mesa-optimizer is to the base optimizer's base objective. If you fully delegate a goal to a subordinate, your capability on that goal is the product of 1) how capable your subordinate is at achieving their own goals; and 2) how well-aligned their own goals are to the goal you delegated to them. This seems intuitively reasonable.
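As a sanity check, the quantities in Definitions 1 and 2 can be estimated numerically from a recorded training trajectory. The 1-D loss below is a hypothetical example; note that with these definitions, the identity C(LB) = α(LB,LM)·C(LM) holds by construction when the finite-T sums are used consistently:

```python
import numpy as np

# Estimating the capability C and alignment factor alpha from Definitions
# 1 and 2 on a toy 1-D loss (hypothetical example, not a real training run).
def L(x):  return (x**2 - 1.0)**2
def dL(x): return 4.0 * x * (x**2 - 1.0)

def trajectory(x0, eps=0.01, T=2000):
    xs = [x0]
    for _ in range(T):
        xs.append(xs[-1] - eps * dL(xs[-1]))
    return np.array(xs)

xs = trajectory(0.5)
Ls = L(xs)              # base objective L(t)
Gs = np.abs(dL(xs))     # "true objective" G(t)

C_base = np.mean(Ls[1:] - Ls[0])   # finite-T estimate of C for L(t)
C_mesa = np.mean(Gs[1:] - Gs[0])   # finite-T estimate of C for G(t)
alpha = np.sum(Ls[1:] - Ls[0]) / np.sum(Gs[1:] - Gs[0])

# C(L^B) = alpha * C(L^M), as in the identity above.
print(np.isclose(C_base, alpha * C_mesa))
```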

But it also has a curiously unintuitive consequence in gradient descent. We tend to think that when we add neurons to an architecture, we're systematically increasing the capability of gradient descent on that architecture. But the definitions above suggest a different interpretation: because gradient descent might converge equally well on its true objective G(t) on a big neural net as on a small one, its capability as an optimizer isn't systematically increased by adding neurons. Instead, adding neurons improves the degree to which gradient descent converges on a base objective that's aligned with our goals.

Robustness and generalization

As I've defined them above, capability and alignment are fragile properties. Two optimizers Lt1 and Lt2 could be nearly identical, but still have very different capabilities C(L1) and C(L2). This is a problem, because the optimizers in our definitions are specified up to and including things like their datasets and parameter initializations. So something as minor as a slight change in dataset — which we should expect to happen often to real-world optimizers — could cause a big change in the capability of the optimizer, as we've defined it.

We care a lot about whether an optimizer remains capable when we perturb it in various ways, including running it on different datasets. We also care a lot about whether an optimizer with objective G(t) remains capable when we change its objective to something slightly different like G′(t). And we also care to what extent the alignment between two optimizers is preserved when we perturb either optimizer. Below I'll define two properties that describe the degree to which optimizers retain their capability and alignment properties under perturbations.

Definition 3. Let C(L1) be the capability of optimizer Lt1, and let α(L1,L2) be the alignment of optimizer Lt2 with optimizer Lt1. Let δLt1 and δLt2 be finite perturbations applied respectively to Lt1 and Lt2. Then, the capability of Lt1 is robust under perturbation δLt1 if

C(L1)≈C(L1+δL1)

Similarly, the alignment of Lt2 with Lt1 is robust under perturbations δLt1 and δLt2 if

α(L1,L2)≈α(L1+δL1,L2+δL2)

Definition 4. Let Lt1 be an optimizer with objective function L(t), and let Lt2 be an optimizer with objective function G(t). Let δLt1 be a finite perturbation applied to Lt1, such that the optimizer Lt1+δLt1 differs from Lt1 only in that its objective function is L′(t) instead of L(t). Then, the capability of Lt1 generalizes to objective L′(t) if

C(L1)≈C(L1+δL1)

Similarly, the alignment of Lt2 with Lt1 generalizes to objective L′(t) if

α(L1,L2)≈α(L1+δL1,L2)

Intuitively, we're defining a robustly capable optimizer as one whose capability isn't strongly affected by classes of perturbations that we care about — and we're defining robust alignment between two optimizers in an analogous way. We're also thinking of generalization as a special case of robustness, meaning specifically that the optimizer is robust to perturbation to its objective function. So an optimizer whose capabilities generalize is one that continues to work well when we give it a new objective.
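One way to probe Definition 3 empirically is to apply a small perturbation δL to a toy objective and compare capabilities before and after. Everything here — the loss, the perturbation, the tolerance — is an illustrative assumption:

```python
# Checking capability robustness (Definition 3) numerically on a toy 1-D
# loss, with a small linear perturbation b*x standing in for delta-L.
def L(x, b=0.0):  return (x**2 - 1.0)**2 + b * x
def dL(x, b=0.0): return 4.0 * x * (x**2 - 1.0) + b

def capability(b, x0=0.5, eps=0.01, T=2000):
    """Finite-T estimate of C from Definition 1, for the perturbed loss."""
    x, L0, total = x0, L(x0, b), 0.0
    for _ in range(T):
        x -= eps * dL(x, b)
        total += L(x, b) - L0
    return total / T

C0 = capability(0.0)    # unperturbed optimizer
C1 = capability(0.01)   # slightly perturbed objective
print(abs(C0 - C1))     # small: capability is (approximately) robust here
```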

Quantifying inner alignment

With the vocabulary above, we can now define inner alignment more precisely, and even think about how to quantify it in real systems. We might say that a mesa-optimizer Lt2 is inner-aligned with its base optimizer Lt1 if its alignment factor α(Lt1,Lt2) remains robustly high under the variations δ(X,Y) in the datasets that we expect either optimizer to encounter in the future. We can also quantify inner alignment by looking at how much specific variations in the data distribution affect the alignment factor between two optimizers.

We might also be interested in investigating other properties that could affect inner alignment from a safety perspective. For example, under what conditions will alignment between a base optimizer and a mesa-optimizer generalize well to a new base objective? What kinds of perturbations to our optimizers are likely to yield breakdowns in robustness? As we add capacity to a deep learning model, should we expect alignment to improve? And if so, should we expect an inflection point in this improvement — a level of capacity beyond which alignment declines sharply? How could we detect and characterize an inflection point like this? These are some of the topics I'll be exploring in the future.

Terminal states and transients

I want to highlight one final issue with the definitions above: I've defined inner alignment here only in connection with the limiting behavior of our optimizers. That means a mesa-optimizer that's well-aligned with its base optimizer would still — by the definition above — be free to do dangerous things on the path to correctly optimizing for the base objective.

To take an extreme example, we could have a system that's perfectly aligned to optimize for human happiness, but that only discovers that humans don't want to have their brains surgically extracted from their bodies after it's already done so. Even if the system later corrected its error, grew us new bodies, and ultimately gave us a good end state, we'd still have experienced a very unpleasant transient in the process. Essentially, this definition of alignment says to the mesa-optimizer: it's okay if you break a vase, as long as we know that you'll put it back together again in the long run.

I can understand this definition being controversial. It may be the most extreme possible version of the claim that the ends justify the means. So it could also be worth resolving the alignment problem into "weak" and "strong" versions — where weak alignment would refer to the t→∞ limit, while strong alignment would refer to transient behavior over, say, the next N optimization steps. A concept of strong alignment could let us prove statements like "this optimizer will have a performance level of at worst x on our base objective over the next N optimization steps." This seems very desirable.

On the other hand, we may want to prepare for the possibility that the terminal states we want will only be accessible through paths that involve transient unpleasantness. Perhaps one really does have to break eggs to make an omelet, and that's just how the universe is. (I don't think this is particularly likely: high-capacity neural networks and policy iteration in RL are both data points that suggest incrementalism is increasingly viable in higher-dimensional problem spaces.)

To summarize, weak alignment, which is what this post is mostly about, would say that "everything will be all right in the end." Strong alignment, which refers to the transient, would say that "everything will be all right in the end, and the journey there will be all right, too." It's not clear which one will be easier to prove than the other in which circumstance, so we'll probably need to develop rigorous definitions of both.

Big thanks to Rohin Shah, Jan Leike, Jeremie Harris, and Evan Hubinger for reviewing early drafts of this, suggesting ideas, and pointing out mistakes!


Multiple Worlds, One Universal Wave Function

November 5, 2020 - 01:28
Published on November 4, 2020 10:28 PM GMT

The following post is an adaptation of a paper I wrote in 2017 that I thought might be of interest to people here on LessWrong. The paper is essentially my attempt at presenting the clearest and most cogent defense of the Everett interpretation of quantum mechanics—the interpretation that I very strongly believe to be true—that I could. My motivation for posting this now is that I was recently talking with a colleague of mine who mentioned that they had stumbled upon my paper recently and really enjoyed it, and so realizing that I hadn't ever really shared it here on LessWrong, I figured I would put it out there in case anyone else found it similarly useful or interesting.

It's also worth noting that LessWrong has a storied history with the Everett interpretation, with Yudkowsky also defending it quite vigorously. I actually cite Eliezer at one point in the paper—and I basically agree with what he said in his sequence—though I hope that if you bounced away from that sequence you'll find my paper more persuasive.

Abstract

We seek to present and defend the view that the interpretation of quantum mechanics is no more complicated than the interpretation of plate tectonics: that which is being studied is real, and that which the theory predicts is true. The view which holds that the mathematical formalism of quantum mechanics—without any additional postulates—is a complete description of reality is known as the Everett interpretation. We seek to defend the Everett interpretation of quantum mechanics as the most probable interpretation available. To accomplish this task, we analyze the history of the Everett interpretation, provide mathematical backing for its assertions, respond to criticisms that have been leveled against it, and compare it to its modern alternatives.

Introduction

One of the most puzzling aspects of quantum mechanics is the fact that, when one measures a system in a superposition of multiple states, it is only ever found in one of them. This puzzle was dubbed the “measurement problem,” and the first attempt at a solution was by Werner Heisenberg, who in 1927 proposed his theory of “wave function collapse.”[1] Heisenberg proposed that there was a cutoff length, below which systems were governed by quantum mechanics, and above which they were governed by classical mechanics. Whenever quantum systems encounter the cutoff point, the theory stated, they collapse down into a single state with probabilities following the squared amplitude, or Born, rule. Thus, the theory predicted that physics just behaved differently at different length scales. This traditional interpretation of quantum mechanics is usually referred to as the Copenhagen interpretation.

From the very beginning, the Copenhagen interpretation was seriously suspect. Albert Einstein was famously displeased with its lack of determinism, saying “God does not play dice,” to which Niels Bohr quipped in response, “Einstein, stop telling God what to do.”[2] As clever as Bohr’s answer is, Einstein—with his famous physical intuition—was right to be concerned. Though Einstein favored a hidden variable interpretation[3], which was later ruled out by Bell’s theorem[4], the Copenhagen interpretation nevertheless leaves open many questions. If physics behaves differently at different length scales, what is the cutoff point? What qualifies as a wave-function-collapsing measurement? How can physics behave differently at different length scales, when macroscopic objects are made up of microscopic objects? Why is the observer not governed by the same laws of physics as the system being observed? Where do the squared amplitude Born probabilities come from? If the physical world is fundamentally random, how is the world we see selected from all the possibilities? How could one explain the applicability of quantum mechanics to macroscopic systems, such as Chandrasekhar’s insight in 1930 that modeling neutron stars required the entire star to be treated as a quantum system?[5]

The Everett Interpretation of Quantum Mechanics

Enter the Everett Interpretation. In 1956, Hugh Everett III, then a doctoral candidate at Princeton, had an idea: if you could find a way to explain the phenomenon of measurement from within wave mechanics, you could do away with the extra postulate of wave function collapse, and thus many of the problems of the Copenhagen interpretation. Everett worked on this idea under his thesis advisor, Einstein-prize-winning theoretical physicist John Wheeler, who would later publish a paper in support of Everett’s theory.[6] In 1957, Everett finished his thesis “The Theory of the Universal Wave Function,”[7] published as the “‘Relative State’ Formulation of Quantum Mechanics.”[8] In his thesis, Everett succeeded in deriving every one of the strange quirks of the Copenhagen interpretation—wave function collapse, the apparent randomness of measurement, and even the Born rule—from purely wave mechanical grounds, as we will do in the "Mathematics of the Everett Interpretation" section.

Everett’s derivation relied on what was at the time a controversial application of quantum mechanics: the existence of wave functions containing observers themselves. Everett believed that there was no reason to restrict the domain of quantum mechanics to only small, unobserved systems. Instead, Everett proposed that any system, even the system of the entire universe, could be encompassed in a single, albeit often intractable, “universal wave function.”

Modern formulations of the Everett interpretation reduce his reasoning down to two fundamental ideas:[9][10][11][12][13]

• the wave function obeys the standard, linear, deterministic Schrodinger wave equation at all times (the relativistic variant, to be precise), and
• the wave function is physically real.

Specifically, the first statement precludes wave function collapse and demands that we continue to use the same wave mechanics for all systems, even those with observers, and the second statement demands that we accept the physical implications of doing so. The Everett interpretation is precisely that which is implied by these two statements.

Importantly, neither of these two principles are additional assumptions on top of traditional quantum theory—instead, they are simplifications of existing quantum theory, since they act only to remove the prior ad-hoc postulates of wave function collapse and the non-universal applicability of the wave equation.[11][14] The beauty of the Everett interpretation is the fact that we can remove the postulates of the Copenhagen interpretation and still end up with a theory that works.

DeWitt’s Multiple Worlds

Removing the Copenhagen postulates had some implications that did not mesh well with many physicists’ existing physical intuitions. If one accepted Everett’s universal wave function, one was forced to accept the idea that macroscopic objects—cats, people, planets, stars, galaxies, even the entire universe—could be in a superposition of many states, just as microscopic objects could. In other words, multiple different versions of the universe—multiple worlds, so to speak—could exist simultaneously. It was for this reason that Einstein-prize-winning physicist Bryce DeWitt, a supporter of the Everett interpretation, dubbed Everett’s theory of the universal wave function the “multiworld” (or now more commonly “multiple worlds”) interpretation of quantum mechanics.[9]

While the idea of multiple worlds may at first seem strange, to Everett it was simply an extension of the normal laws of quantum mechanics. Simultaneous superposition of states is something physicists already accept for microscopic systems whenever they do quantum mechanics—by virtue of the overwhelming empirical evidence in favor of it. Not only that, but evidence keeps coming out demonstrating superpositions at larger and larger length scales. In 1999, for example, it was demonstrated that Carbon-60 molecules can be put into a superposition.[15] While it is unlikely that a superposition of such a macroscopic object as Schrodinger’s cat will ever be conclusively demonstrated, due to the difficulty of isolating such a system from the outside world, it is likely that the trend of demonstrating superposition at larger and larger length scales will continue. Refusing to accept that a cat could be in a superposition, even if we can never demonstrate it, is thus a failure of induction—a rejection of an empirically-demonstrated trend.

While the Everett interpretation ended up implying the existence of multiple worlds, this was never Everett’s starting point. The “multiple worlds” of the Everett interpretation were not added to traditional quantum mechanics as new postulates, but rather fell out from the act of taking away the existing ad-hoc postulates of the Copenhagen interpretation—a consequence of taking the wave function seriously as a fundamental physical entity. In Everett’s own words, “The aim is not to deny or contradict the conventional formulation of quantum theory, which has demonstrated its usefulness in an overwhelming variety of problems, but rather to supply a new, more general and complete formulation, from which the conventional interpretation can be deduced.”[8] Thus, it is not surprising that Stephen Hawking and Nobel laureate Murray Gell-Mann, supporters of the Everett interpretation, have expressed reservations with the name “multiple worlds interpretation,” and therefore we will continue to refer to the theory simply as the Everett interpretation instead.[16]

The Nature of Observation

Accepting the Everett interpretation raises an important question: if the macroscopic world can be in a superposition of multiple states, what differentiates them? Stephen Hawking has the answer: “in order to determine where one is in space-time one has to measure the metric and this act of measurement places one in one of the various different branches of the wave function in the Wheeler-Everett interpretation of quantum mechanics.”[17] When we perform an observation on a system whose state is in a superposition of eigenfunctions, a version of us sees each different, possible eigenfunction. The different worlds are defined by the different eigenfunctions that are observed.

We can show this, as Everett did, just by acknowledging the existence of universal, joint system-observer wave functions.[7][8] Before measuring the state of a system in a superposition, the observer and the system are independent—we can get their joint wave function simply by multiplying together their individual wave functions. After measurement, however, the two become entangled—that is, the state of the observer becomes dependent on the state of the system that was observed. The result is that for each eigenfunction in the system’s superposition, the observer’s wave function evolves differently. Thus, we can no longer express their joint wave function as the product of their individual wave functions. Instead, we are forced to express the joint wave function as a sum of different components, one for each possible eigenfunction of the system that could be observed. These different components are the different “worlds” of the Everett interpretation, with the only difference between them being which eigenfunction of the system was observed. We will formalize this reasoning in the "The Apparent Collapse of The Wave Function" section.

We are still left with the question, however, of why we experience a particular probability of seeing some states over others, if every state that can be observed is observed. Informally, we can think of the different worlds—the different possible observations—as being “weighted” by their squared amplitudes, and which one of the versions of us we are as a random choice from that weighted distribution. Formally, we can prove that under the Everett interpretation, if an observer interacts with many systems each in a superposition of multiple states, the distribution of states they see will follow the Born rule.[7][8][18][11][19][14] A portion of Everett’s proof of this fact is included in the "The Born Probability Rule" section.

The Mathematics of the Everett Interpretation

Previously, we asserted that universally-applied wave mechanics was sufficient, without ad-hoc postulates such as wave function collapse, to imply all the oddities of the Copenhagen interpretation. We will now prove that assertion. In this section, as per the Everett interpretation, we will accept that basic wave mechanics is obeyed for all physical systems, including those containing observers. From that assumption, we will show that the apparent phenomena of wave function collapse, random measurement, and the Born Rule follow. The proofs given below are adopted from Everett’s original paper.[7][8]

The Apparent Collapse of The Wave Function

Consider the simple case where φ=φ0 and thus we are in initial state Ψ0=ψφ0. In this case, by our previous definition of ψi and requirement that φi remain unchanged, we can write the state after the observation as Ψ1=ψ0φ0. Since quantum mechanics is linear, and the eigenfunctions φi are orthogonal, it must be that this same process occurs for each φi.

Thus, by the principle of superposition, we can write Ψ1 in its general form as

$$\Psi_1 = \sum_i a_i \psi_i \varphi_i$$

For the next observation, each ψi will once again see the same φi, since it has not changed state. As previously defined, we use the notation ψi,i to denote the state of O after observing S in state φi twice. Thus, we can write Ψ2 as

$$\Psi_2 = \sum_i a_i \psi_{i,i} \varphi_i$$

and more generally, we can write Ψn as

$$\Psi_n = \sum_i a_i \psi_{i,i,\ldots,i} \varphi_i$$

where i is repeated n times in i,i,…,i.

Thus, once a measurement of S has been performed, every subsequent measurement will see the same eigenfunction, even though all eigenfunctions continue to exist. We can see this from the fact that the same i is repeated in each state ψi,i,…,i of O. In this way, we see how, despite the fact that the original wave function φ=∑iaiφi for S is in a superposition of many eigenfunctions, once a measurement has been performed, each subsequent measurement will always see the same eigenfunction.

Note that there is no longer a single, independent state ψ of O. Instead, there are many ψi,i,…,i, one for each eigenfunction. What does that mean? It means that for every eigenfunction φi of S, there is a corresponding state ψi,i,…,i of O wherein O sees that eigenfunction. Thus, one is required to accept that there are many observers Oi, with corresponding state ψi,i,…,i, each one seeing a different eigenfunction φi. This is the origin of the Everett interpretation's "multiple worlds."

From the perspective of each Oi in this scenario it will appear as if φ has "collapsed" from a complex superposition ∑iaiφi into a single eigenfunction φi. As we can see from the joint wave function, however, that is not the case—in fact, the entire superposition still exists. What has changed is only that ψ, the state of O, is no longer independent of that superposition, and has instead become entangled with it.
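This entanglement story can be checked numerically on the smallest possible example: a two-state system S and a two-state observer "pointer" O, with a CNOT-style unitary standing in for the measurement interaction. All of the states and amplitudes below are toy assumptions:

```python
import numpy as np

# Toy measurement: system S in superposition, observer pointer O in a
# "ready" state, joint states written in the ordering |observer> ⊗ |system>.
a = np.array([0.6, 0.8])          # amplitudes a_i of S's superposition
psi_ready = np.array([1.0, 0.0])  # observer's pre-measurement state psi

# Before measurement: the joint state is a simple product psi ⊗ phi.
joint_before = np.kron(psi_ready, a)

# Measurement interaction: the observer's state flips iff S is in phi_1
# (a CNOT with the system as control, the observer as target).
U = np.array([[1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0],
              [0, 1, 0, 0]], dtype=float)
joint_after = U @ joint_before

def schmidt_rank(state):
    """Number of nonzero singular values of the reshaped joint state."""
    s = np.linalg.svd(state.reshape(2, 2), compute_uv=False)
    return int(np.sum(s > 1e-12))

print(schmidt_rank(joint_before))  # 1: product state, psi independent of S
print(schmidt_rank(joint_after))   # 2: entangled -- one component per phi_i
```

The Schmidt rank is 1 for a product state and 2 once the observer has become entangled with the superposition — one component per "world."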

The Apparent Randomness of Measurement

Suppose we now have many such systems S, which we will denote Sn where n∈N. Consider O from before, but with the modification that instead of repeatedly observing a single S, O observes different Sn in each measurement, such that Ψn is the joint system-observer wave function after measuring the nth Sn.

As before, we will define the initial joint wave function Ψ0 as

$$\Psi_0 = \psi \sum_{i_1, i_2, \ldots, i_n} a_{i_1, i_2, \ldots, i_n}\, \varphi_{i_1}(x_1)\, \varphi_{i_2}(x_2) \cdots \varphi_{i_n}(x_n)$$

where we are summing over all possible combinations of eigenfunctions for the different systems Sn with arbitrary coefficients ai1,i2,…,in for each combination.

Then, as before, we can use the principle of superposition to find Ψ1 as

$$\Psi_1 = \sum_{i_1, i_2, \ldots, i_n} \psi_{i_1}\, a_{i_1, i_2, \ldots, i_n}\, \varphi_{i_1}(x_1)\, \varphi_{i_2}(x_2) \cdots \varphi_{i_n}(x_n)$$

since the first measurement will see the state φi1 of S1. More generally, we can write Ψn as

$$\Psi_n = \sum_{i_1, i_2, \ldots, i_n} \psi_{i_1, i_2, \ldots, i_n}\, a_{i_1, i_2, \ldots, i_n}\, \varphi_{i_1}(x_1)\, \varphi_{i_2}(x_2) \cdots \varphi_{i_n}(x_n)$$

following the same principle, as each measurement of an Sn will see the corresponding state φin.

Thus, when subsequent measurements of identical systems Sn are performed, the resulting sequence of eigenfunctions observed by O in each ψ appears random (we will derive the distribution it follows in the next subsection), since there is no structure to the sequences i1,i2,…,in. This appearance of randomness holds even though the entire process is completely deterministic. If, alternatively, O were to return to a previously-measured Sn, we would get a repeat of the first analysis, wherein O would always see the same state as was previously measured.
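We can simulate what a single branch of O experiences by sampling outcomes with squared-amplitude weights. To be clear, this presumes the Born rule derived in the next subsection rather than deriving it; it is only an illustration of how a structureless, random-looking sequence arises from a deterministic process (the amplitudes are toy assumptions):

```python
import numpy as np

# Simulating one branch of the observer's experience: measuring many
# identical systems, each in a superposition with amplitudes (0.6, 0.8).
rng = np.random.default_rng(0)
amps = np.array([0.6, 0.8])
weights = amps**2                  # squared-amplitude (Born) weights

outcomes = rng.choice([0, 1], size=100_000, p=weights)

# The sequence i_1, i_2, ..., i_n has no structure, but its statistics
# follow the squared amplitudes: about 64% of outcomes are state 1.
print(outcomes[:20])
print(outcomes.mean())
```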

The Born Probability Rule

As before, consider a system S in state ∑iaiφi. To be able to talk about a probability for an observer O to see state φi, we need some function P(ai) that will serve as a measure of that probability.

Since we know that quantum mechanics is invariant up to an overall phase, we will impose the condition on P that it must satisfy the equation

$$P(a_i) = P\!\left(\sqrt{a_i^* a_i}\right) = P(|a_i|)$$

Furthermore, by the linearity of quantum mechanics, we will impose the condition on P that, for a and φ defined by

$$a\varphi = \sum_i a_i \varphi_i$$

P must satisfy the equation

$$P(a) = \sum_i P(a_i)$$

Together, these two conditions fully specify what function P must be. Assuming φ is normalized, such that ∑iφ∗iφi=1, it must be that

$$a^* a = \sum_i a_i^* a_i$$

or equivalently

$$|a| = \sqrt{\sum_i |a_i|^2}$$

such that

$$P(|a|) = P\!\left(\sqrt{\sum_i |a_i|^2}\right)$$

which, using the phase invariance condition that P(|a|) = P(a), gives

$$P(a) = P\!\left(\sqrt{\sum_i |a_i|^2}\right)$$

Then, from the linearity condition, we have

$$P(a) = \sum_i P(a_i)$$

which, by the phase invariance condition, is equivalent to

$$P(a) = \sum_i P\!\left(\sqrt{|a_i|^2}\right)$$

Putting it all together, we get

$$P(a) = P\!\left(\sqrt{\sum_i |a_i|^2}\right) = \sum_i P\!\left(\sqrt{|a_i|^2}\right)$$

Then, defining a new function g(x)=P(√x) yields

$$g\!\left(\sum_i |a_i|^2\right) = \sum_i g\!\left(|a_i|^2\right)$$

which implies that g must be a linear function, such that for some constant c,

$$g(x) = cx$$

Therefore, since P(x)=g(x2), we have

$$P(x) = cx^2$$

which, imposing the phase invariance condition, becomes

$$P(x) = c|x|^2$$

which, when c is normalized to 1, is the Born rule.
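As a quick numerical check (not a proof), we can verify that the Born measure P(x)=c|x|² satisfies both of the conditions imposed on P above, for an arbitrary set of complex amplitudes:

```python
import numpy as np

# Checking that P(x) = c|x|^2 satisfies both conditions on P for a random
# set of complex amplitudes a_i (an illustrative check, not a derivation).
rng = np.random.default_rng(1)
c = 1.0
P = lambda x: c * np.abs(x)**2

a_i = rng.normal(size=4) + 1j * rng.normal(size=4)  # random amplitudes
a = np.sqrt(np.sum(np.abs(a_i)**2))                 # |a| built from the a_i

# Phase invariance: P(a_i) = P(|a_i|)
print(np.allclose(P(a_i), P(np.abs(a_i))))
# Additivity from linearity: P(a) = sum_i P(a_i)
print(np.isclose(P(a), np.sum(P(a_i))))
```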

The fact that this measure is a probability, beyond that it is the only measure that can be, is deserving of further proof. The concept of probability is notoriously hard to define, however, and without a definition of probability, it is just as meaningful to call P something as arbitrary as the “stallion” of the wave function as the “probability.”[1] Nevertheless, for nearly every reasonable probability theory that exists, such proofs have been provided. Everett provided a proof based on the standard frequentist definition of probability[7][8], David Deutsch (Oxford theoretical physicist) has provided a proof based on game theory[18], and David Wallace (USC theoretical physicist) has provided a proof based on decision theory[11]. For any reasonable definition of probability, wave mechanics is able to show that the above measure satisfies it in the limit without any additional postulates.[19][14][20]

Arguments For and Against the Everett Interpretation

Despite the unrivaled empirical success of quantum theory, the very suggestion that it may be literally true as a description of nature is still greeted with cynicism, incomprehension, and even anger.[21]

David Deutsch, 1996

Falsifiability and Empiricism

Perhaps the most common criticism of the Everett interpretation is the claim that it is not falsifiable, and thus falls outside the realm of empirical science.[22] In fact, this claim is simply not true—many different methods for testing the Everett interpretation have been proposed, and a great deal of empirical data regarding it is already available.

One such method we have already discussed: the Everett interpretation removes the Copenhagen interpretation’s postulate that the wave function must collapse at a particular length scale. Were it ever to be conclusively demonstrated that superposition was impossible past some point, the Everett interpretation would be disproved. Thus, every demonstration performed of superposition at larger and larger length scales—such as for Carbon 60 as was previously mentioned[15]—is a test of the Everett interpretation. Arguably, it is the Copenhagen interpretation which is unfalsifiable, since it makes no claim about where the boundary lies at which wave function collapse occurs, and thus proponents can respond to the evidence of larger superpositions simply by changing their theory and moving their proposed boundary up.

Another method of falsification regards the interaction between the Everett interpretation and quantum gravity. The Everett interpretation makes a definitive prediction that gravity must be quantized. Were gravity not quantized—not wrapped up in the wave function like all the other forces—and instead simply a background metric for the entire wave function, we would be able to detect the gravitational impact of the other states we were in a superposition with.[10][23] In 1957, Richard Feynman, who would later come to explicitly support the Everett interpretation[16] as well as become a Nobel laureate, presented an early version of the above argument as a reason to believe in quantum gravity, arguing, “There is a bare possibility (which I shouldn’t mention!) that quantum mechanics fails and becomes classical again when the amplification gets far enough [but] if you believe in quantum mechanics up to any level then you have to believe in gravitational quantization.”[24]

Another proposal concerns differing probabilities of finding ourselves in the universe we are in depending on whether the Everett interpretation holds or not. If the Everett interpretation is false, and the universe only has a single state, there is only one state for us to find ourselves in, and thus we would expect to find ourselves in an approximately random universe. On the other hand, if the Everett interpretation is true, and there are many different states that the universe is in, we could find ourselves in any of them, and thus we would expect to find ourselves in one which was more disposed than average towards the existence of life. Approximate calculations of the relative probability of the observed universe based on the Hartle-Hawking boundary condition strongly support the Everett interpretation.[10]

Finally, as we made a point of being clear about in the "The Everett Interpretation of Quantum Mechanics" section, the Everett interpretation is simply a consequence of taking the wave function seriously as a physical entity. Thus, it is somewhat unfair to ask the Everett interpretation to achieve falsifiability independently of the theory—quantum mechanics—which implies it.[22] If a new theory were proposed that said quantum mechanics stopped working outside of the future light cone of Earth, we would not treat it as opening a genuine physical controversy—we would say that, unless there is incredibly strong proof otherwise, we should by default assume that the same laws of physics apply everywhere. The Everett interpretation is just that default—it is only by historical accident that it happened to be discovered after the Copenhagen interpretation. Thus, to the extent that one has confidence in the universal applicability of the principles of quantum mechanics, one should have equal confidence in the Everett interpretation, since the latter is a logical consequence of the former. It is in fact all the more impressive—and a testament to its importance to quantum mechanics—that the Everett interpretation manages to achieve falsifiability and empirical support despite its primary claim being simply that quantum mechanics applies universally.

Simplicity

Another common objection to the Everett interpretation is that it “postulates too many universes,” which Sean Carroll, a Caltech cosmologist and supporter of the Everett interpretation, calls “the basic silly objection.”[25] At this point, it should be very clear why this objection is silly: the Everett interpretation postulates no such thing—the existence of “many universes” is an implication, not a postulate, of the theory. Opponents of the Everett interpretation, however, have accused it of a lack of simplicity on the grounds that adding in all those additional universes is unnecessary added complexity, and since by the principle of Occam’s razor the simplest explanation is probably correct, the Everett interpretation can be rejected.[26]

In fact, Occam’s razor is an incredibly strong argument in favor of the Everett interpretation. To explain this, we will first need to formalize what we mean by Occam’s razor, which will require a bit of theoretical computer science. Specifically, we will make use of Solomonoff’s theory of inductive inference: the best, most general framework we have for comparing the probability of empirically indistinguishable physical theories.[27][28][29][2] To use Solomonoff’s formalism, only one assumption is required of us: under some encoding scheme, competing theories of the universe can be modeled as programs. This assumption does not imply that the universe must be computable, only that it can be computably described—a condition that any physical theory capable of being written down already satisfies. From this assumption, and the axioms of probability theory, Solomonoff induction can be derived.[27]

Solomonoff induction tells us that, if we have a set of programs[3] {Ti} which encode for empirically indistinguishable physical theories, the probability P of the theory described by a given program Ti with length in bits (0s and 1s) |Ti| is given by

$$P(T_i) \sim 2^{-|T_i|}$$

up to a constant normalization factor calculated across all the {Ti} to make the probabilities sum to 1.[27] We can see how this makes intuitive sense: if we are predicting an arbitrary system, and thus have no information about the correctness of a program implementing a theory other than its length in bits, we are forced to assign equal probability to each of the two options for each bit, 0 and 1, and thus each additional bit contributes a factor of 1/2 to the total probability of the program. Furthermore, we can see how Solomonoff induction serves as a formalization of Occam's razor, since it gives us a way of calculating how much to discount longer, more complex theories in favor of shorter, simpler ones.
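The prior above is simple enough to compute directly. The sketch below builds the normalized Solomonoff-style prior for three hypothetical, empirically indistinguishable theories whose program lengths (the names and bit counts are invented for illustration) differ by a few bits:

```python
# Hypothetical theories, identified only by program length in bits.
lengths = {"T_short": 100, "T_mid": 101, "T_long": 110}

# Unnormalized weight 2^{-|T_i|} for each theory.
weights = {name: 2.0 ** -bits for name, bits in lengths.items()}

# Normalize across the competing theories so the probabilities sum to 1.
z = sum(weights.values())
prior = {name: w / z for name, w in weights.items()}

# Each extra bit halves a theory's probability: one extra bit costs a factor
# of 2, and ten extra bits cost a factor of 2^10 = 1024.
assert abs(prior["T_short"] / prior["T_mid"] - 2.0) < 1e-9
assert abs(prior["T_short"] / prior["T_long"] - 1024.0) < 1e-6
```

The exponential penalty is what makes the length comparisons in the next paragraphs so decisive.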

Now, we will attempt to apply this formalism to assign probabilities to competing interpretations of quantum mechanics, which we will represent as elements of the set {Ti}. Let W be the shortest program which computes the wave equation. Since the wave equation is a component of all quantum theories, it must be that |W| ≤ |Ti|. Thus, the smallest that any Ti could possibly be is |W|, such that any Ti of length |W| is at least twice as probable as a Ti of any other length. The Everett interpretation is such a Ti, since it requires nothing else beyond wave mechanics, and follows directly from it. Therefore, from the perspective of Solomonoff induction, the Everett interpretation is provably optimal in terms of program length, and thus also in terms of probability.

To get a sense of the magnitude of these effects, we will attempt to approximate how much less probable the Copenhagen interpretation is than the Everett interpretation. We will represent the Copenhagen interpretation C as made of three parts: W, wave mechanics; O, a machine which determines when to collapse the wave function; and L, classical mechanics. Then, where the Everett interpretation E is just W, we can write their relative probabilities as

$$\frac{P(C)}{P(E)} = \frac{2^{-|W|-|O|-|L|}}{2^{-|W|}} = 2^{-|O|-|L|}$$

How large are O and L? As a quick Fermi estimate for L, we will take Newton’s three laws of motion, Einstein’s general relativistic field equation, and Maxwell’s four equations of electromagnetism as the principles of classical mechanics, for a total of 8 fundamental equations. Assume the minimal implementation for each one averages 100 bits—a very modest estimate, considering the smallest chess program ever written is 3896 bits long.[30] Then, the relative probability is at most

$$\frac{P(C)}{P(E)} = 2^{-|O|-|L|} < 2^{-|L|} \approx 2^{-800} \approx 2 \cdot 10^{-241}$$

which is about the probability of picking four random atoms in the universe and getting the same one each time, and is thus so small as to be trivially dismissible.
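The arithmetic behind this Fermi estimate is easy to reproduce. A short sketch, using only the assumptions stated above (8 equations at an assumed ~100 bits each):

```python
import math

# Estimated extra program length for classical mechanics: 8 equations
# at roughly 100 bits each.
bits_L = 8 * 100

# Upper bound on P(C)/P(E), ignoring the (positive) cost of |O| entirely.
ratio_upper_bound = 2.0 ** -bits_L
assert ratio_upper_bound < 1e-240

# In base-10 terms: 2^-800 = 10^(-800 * log10(2)) ≈ 10^-240.8 ≈ 2e-241.
exponent10 = -bits_L * math.log10(2)
assert -241 < exponent10 < -240
```

Since |O| can only add further bits, the true ratio is smaller still.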

The Arrow of Time

Another objection to the Everett interpretation is that it is time-symmetric. Since the Everett interpretation is just the wave equation, its time symmetry follows from the fact that the Schrodinger equation is time-reversal invariant, or more technically, charge-parity-time-reversal (CPT) invariant. The Copenhagen interpretation, however, is not, since wave function collapse is a fundamentally irreversible event.[31] In fact, CPT symmetry is not the only natural property that wave function collapse lacks that the Schrodinger equation has—wave function collapse breaks linearity, unitarity, differentiability, locality, and determinism.[13][12][16][32] The Everett interpretation, by virtue of consisting of nothing but the Schrodinger equation, preserves all of these properties. This is an argument in favor of the Everett interpretation, since there are strong theoretical and empirical reasons to believe that such symmetries are properties of the universe.[33][34][35][5]

Nevertheless, as mentioned above, it has been argued that the Copenhagen interpretation’s breaking of CPT symmetry is actually a point in its favor, since it supposedly explains the arrow of time, the idea that time does not behave symmetrically in our everyday experience.[31] Unfortunately for the Copenhagen interpretation, wave function collapse does not actually imply any of the desired thermodynamic properties of the arrow of time.[31] Furthermore, under the Everett interpretation, the arrow of time can be explained using the standard thermodynamic explanation that the universe started in a very low-entropy state.[36]

In fact, accepting the Everett interpretation gets rid of the need for the current state of the universe to be dependent on subtle initial variations in that low-entropy state.[36] Instead, the current state of the universe is simply one of the many different components of the wave function that evolved deterministically from that initial state. Thus, the Everett interpretation is even simpler—from a Solomonoff perspective—than was shown in the "Simplicity" section, since it forgoes the need for its program to specify a complex initial condition for the universe with many subtle variations.

Other Interpretations of Quantum Mechanics

The mathematical formalism of the quantum theory is capable of yielding its own interpretation.[9]

Bryce DeWitt, 1970

Decoherence

It is sometimes proposed that wave mechanics alone is sufficient to explain the apparent phenomenon of wave function collapse without the need for the Everett interpretation’s multiple worlds. The justification for this assertion is usually based on the idea of decoherence. Decoherence is the mathematical result, following from the wave equation, that tightly-interacting superpositions tend to evolve into non-interacting superpositions.[37][38] Importantly, decoherence does not destroy the superposition—it merely “diagonalizes” it, which is to say, it removes the interference terms.[37] After decoherence, one is always still left with a superposition of multiple states.[39][40] The only way to remove the resulting superposition is to assume wave function collapse, which every statistical theory claiming to do away with multiple worlds has been shown to implicitly assume.[41][19] There is no escaping the logic presented in the "The Apparent Collapse of The Wave Function" section—if one accepts the universal applicability of the wave function, one must accept the multiple worlds it implies.
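The "diagonalization" point can be made concrete with a toy model. The sketch below (assuming, purely for illustration, that environmental interaction suppresses the interference terms exponentially) shows that decoherence leaves a mixture of both branches rather than a single outcome:

```python
import math

# Density matrix for the equal superposition (|0> + |1>)/sqrt(2),
# written as a 2x2 list of lists.
rho = [[0.5, 0.5], [0.5, 0.5]]

def decohere(rho, t, rate=1.0):
    """Suppress the off-diagonal (interference) terms by an assumed factor
    e^{-rate*t}; the diagonal populations are untouched."""
    d = math.exp(-rate * t)
    return [[rho[0][0], rho[0][1] * d],
            [rho[1][0] * d, rho[1][1]]]

later = decohere(rho, t=20.0)

# The interference terms are effectively gone ("diagonalized")...
assert abs(later[0][1]) < 1e-8
# ...but BOTH branches survive with weight 1/2: no collapse to one state.
assert later[0][0] == 0.5 and later[1][1] == 0.5
```

This is exactly the sense in which decoherence explains the *appearance* of collapse while still leaving a superposition of multiple states.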

That is not to say that decoherence is not an incredibly valuable, useful concept for the interpretation of quantum mechanics, however. In the Everett interpretation, decoherence serves the very important role of ensuring that macroscopic superpositions—the multiple worlds of the Everett interpretation—are non-interacting, and that each one thus behaves approximately classically.[41][40] Thus, the simplest decoherence-based interpretation of quantum mechanics is in fact the Everett interpretation. From the Stanford Encyclopedia of Philosophy, “Decoherence as such does not provide a solution to the measurement problem, at least not unless it is combined with an appropriate interpretation of the theory [and it has been suggested that] decoherence is most naturally understood in terms of Everett-like interpretations.”[39] The discoverer of decoherence himself, German theoretical physicist Heinz-Dieter Zeh, is an ardent proponent of the Everett interpretation.[42][36]

Furthermore, we have given general arguments in favor of the existence of the multiple worlds implied by the Everett interpretation, which are all reasons to favor the Everett interpretation over any single-world theory. Specifically, calculations of the probability of the current state of the universe support the Everett interpretation[10], as does the fact that the Everett interpretation allows for the initial state of the universe to be simpler[36].

Consistent Histories

The consistent histories interpretation of quantum mechanics, owing primarily to Robert Griffiths, eschews probabilities over “measurement” in favor of probabilities over “histories,” which are defined as arbitrary sequences of events.[43] Consistent histories provides a way of formalizing what classical probabilistic questions make sense in a quantum domain and which do not—that is, which are consistent. Its explanation for why this consistency always appears at large length scales is based on the idea of decoherence, as discussed above.[43][44] In this context, consistent histories is a very useful tool for reasoning about probabilities in the context of quantum mechanics, and for providing yet another proof of the natural origin of the Born rule.

Proponents of consistent histories claim that it does not imply the multiple worlds of the Everett interpretation.[43] However, since the theory is based on decoherence, there are always multiple different consistent histories, which cannot be removed via any natural history selection criterion.[45][44] Thus, just as the wave equation implies the Everett interpretation, so too does consistent histories. To see this, we will consider the fact that consistent histories works because of Feynman’s observation that the amplitude of any given final state can be calculated as the sum of the amplitudes along all the possible paths to that state.[44][46] Importantly, we know that two different histories—for example, the different branches of a Mach-Zehnder interferometer—can diverge and then later merge back together and interfere with each other. Thus, it is not in general possible to describe the state of the universe as a single history, since other, parallel histories can interfere and change how that state will later evolve. A history is great for describing how a state came to be, but not very useful for describing how it might evolve in the future. For that, including the other parallel histories—the full superposition—is necessary.
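The Mach-Zehnder example can be worked out in a few lines as a sum over paths. The sketch below uses an idealized interferometer with lossless 50/50 beam splitters and the common convention that each reflection contributes a factor of i and each transmission a factor of 1, both scaled by 1/sqrt(2):

```python
t = 1 / 2 ** 0.5   # transmission amplitude at each beam splitter
r = 1j / 2 ** 0.5  # reflection amplitude at each beam splitter

# Detector D1 is reached by two histories that diverge at the first beam
# splitter and merge at the second: transmit-then-reflect, reflect-then-transmit.
amp_D1 = t * r + r * t
# Detector D2 is reached by transmit-transmit and reflect-reflect.
amp_D2 = t * t + r * r

p_D1 = abs(amp_D1) ** 2
p_D2 = abs(amp_D2) ** 2

# The two histories interfere: all photons reach D1 and none reach D2, even
# though either history taken alone would predict a 50/50 split.
assert abs(p_D1 - 1.0) < 1e-12
assert abs(p_D2) < 1e-12
```

Dropping either path from the sums destroys the interference, which is the quantitative version of the claim that a single history cannot describe how the state evolves.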

Once one accepts that the existence of multiple histories is necessary on a microscopic level, their existence on a macroscopic level follows—excluding them would require an extra postulate, which would make consistent histories equivalent to the Copenhagen interpretation. If such an extra postulate is not made, then the result is macroscopic superposition, which is to say, the Everett interpretation. This formulation of consistent histories without any extra postulates has been called the theory of “the universal path integral,” exactly mirroring Everett’s theory of the universal wave function.[46] The theory of the universal wave function—the Everett interpretation—is to the theory of the universal path integral as wave mechanics is to the sum-over-paths approach, which is to say that they are both equivalent formalisms with the same implications.

Pilot Wave Theory

The pilot wave interpretation, otherwise known as the de Broglie-Bohm interpretation, postulates that the wave function, rather than being physically real, is a background which “guides” otherwise classical particles.[47] As we saw with the Copenhagen interpretation, the obvious question to ask of the pilot wave interpretation is whether its extra postulate—in this case adding in classical particles—is necessary or useful in any way. The answer to this question is a definitive no. Heinz-Dieter Zeh says of the pilot wave interpretation, “Bohm’s pilot wave theory is successful only because it keeps Schrodinger’s (exact) wave mechanics unchanged, while the rest of it is observationally meaningless and solely based on classical prejudice.”[42] As we have previously shown in the "The Mathematics of the Everett Interpretation" section, wave mechanics is capable of solving all supposed problems of measurement without the need for any additional postulates. While it is true that pilot wave theory solves all these problems as well, it does so not by virtue of its classical add-ons, but simply by virtue of including the entirety of wave mechanics.[42][48]

Furthermore, since pilot wave theory has no collapse postulate, it does not even get rid of the existence of multiple worlds. If the universe computes the entirety of the wave function, including all of its multiple worlds, then all of the observers in those worlds should experience physical reality by virtue of being computed—it is not at all clear how the classical particles could have physical reality and the rest of the wave function not.[21][42] In the words of David Deutsch, “pilot-wave theories are parallel-universes theories in a state of chronic denial. This is no coincidence. Pilot-wave theories assume that the quantum formalism describes reality. The multiplicity of reality is a direct consequence of any such theory.”[21]

However, since the extra classical particles only exist in one of these worlds, the pilot wave interpretation also does not resolve the problem of the low likelihood of the observed state of the universe[10] or the complexity of the required initial condition[36]. Thus, the pilot wave interpretation, despite being strictly more complicated than the Everett interpretation—both in terms of its extra postulate and the concerns above—produces exactly no additional explanatory power. Therefore, we can safely dismiss the pilot wave interpretation on the grounds of the same simplicity argument used against the Copenhagen interpretation in the "Simplicity" section.

Conclusion

Harvard theoretical physicist Sidney Coleman uses the following parable from Wittgenstein as an analogy for the interpretation of quantum mechanics: “‘Tell me,’ Wittgenstein asked a friend, ‘why do people always say, it was natural for man to assume that the sun went round the Earth rather than that the Earth was rotating?’ His friend replied, ‘Well, obviously because it just looks as though the Sun is going round the Earth.’ Wittgenstein replied, ‘Well, what would it have looked like if it had looked as though the Earth was rotating?’”[49] Of course, the answer is it would have looked exactly as it actually does! To our fallible human intuition, it seems as if we are seeing the sun rotating around the Earth, despite the fact that what we are actually seeing is a heliocentric solar system. Similarly, it seems as if we are seeing the wave function randomly collapsing around us, despite the fact that this phenomenon is entirely explained just from the wave equation, which we already know empirically is a law of nature.

It is perhaps unfortunate that the Everett interpretation ended up implying the existence of multiple worlds, since this fact has led to many incorrectly viewing the Everett interpretation as a fanciful theory of alternative realities, rather than the best, simplest theory we have as of yet for explaining measurement in quantum mechanics. The Everett interpretation’s greatest virtue is the fact that it is barely even an interpretation of quantum mechanics, holding as its most fundamental principle that the wave equation can interpret itself. In the words of David Wallace: “If I were to pick one theme as central to the tangled development of the Everett interpretation of quantum mechanics, it would probably be: the formalism is to be left alone. What distinguished Everett’s original paper both from the Dirac-von Neumann collapse-of-the-wavefunction orthodoxy and from contemporary rivals such as the de Broglie-Bohm theory was its insistence that unitary quantum mechanics need not be supplemented in any way (whether by hidden variables, by new dynamical processes, or whatever).”[11]

There is a tendency of many physicists to describe the Everett interpretation simply as one possible answer to the measurement problem. It should hopefully be clear at this point why that view should be rejected—the Everett interpretation is not simply yet another solution to the measurement problem, but rather a straightforward conclusion of quantum mechanics itself that shows that the measurement problem should never have been a problem in the first place. Without the Everett interpretation, one is forced to needlessly introduce complex, symmetry-breaking, empirically-unjustifiable postulates—either wave function collapse or pilot wave theory—just to explain what was already explicable under basic wave mechanics. The Everett interpretation is not just another possible way of interpreting quantum mechanics, but a necessary component of any quantum theory that wishes to explain the phenomenon of measurement in a natural way. In the words of John Wheeler, Everett’s thesis advisor, “No escape seems possible from [Everett's] relative state formulation if one wants to have a complete mathematical model for the quantum mechanics that is internal to an isolated system. Apart from Everett’s concept of relative states, no self-consistent system of ideas [fully explains the universe].”[6]

References

[1] Heisenberg, W. (1927). The actual content of quantum theoretical kinematics and mechanics. Zeitschrift für Physik.

[2] Anon. (1927). The Solvay Conference, probably the most intelligent picture ever taken.

[3] Einstein, A., Podolsky, B. and Rosen, N. (1935). Can quantum-mechanical description of physical reality be considered complete? Physical Review.

[4] Greenberger, D. M. (1990). Bell’s theorem without inequalities. American Journal of Physics.

[5] Townsend, J. (2010). Quantum physics: A fundamental approach to modern physics. University Science Books.

[6] Wheeler, J. A. (1957). Assessment of Everett’s “relative state” formulation of quantum theory. Reviews of Modern Physics.

[7] Everett, H. (1957). The theory of the universal wave function. Princeton University Press.

[8] Everett, H. (1957). “Relative state” formulation of quantum mechanics. Reviews of Modern Physics.

[9] DeWitt, B. S. (1970). Quantum mechanics and reality. Physics Today.

[10] Barrau, A. (2015). Testing the Everett interpretation of quantum mechanics with cosmology.

[11] Wallace, D. (2007). Quantum probability from subjective likelihood: Improving on Deutsch’s proof of the probability rule. Studies in History and Philosophy of Science.

[12] Saunders, S., Barrett, J., Kent, A. and Wallace, D. (2010). Many worlds?: Everett, quantum theory, & reality. Oxford University Press.

[13] Wallace, D. (2014). The emergent multiverse. Oxford University Press.

[14] Wallace, D. (2006). Epistemology quantized: Circumstances in which we should come to believe in the Everett interpretation. The British Journal for the Philosophy of Science.

[15] Arndt, M., Nairz, O., Vos-Andreae, J., Keller, C., Zouw, G. van der and Zeilinger, A. (1999). Wave-particle duality of C60 molecules. Nature.

[16] Price, M. C. (1995). The Everett FAQ.

[17] Hawking, S. W. (1975). Black holes and thermodynamics. Physical Review D.

[18] Deutsch, D. (1999). Quantum theory of probability and decisions. Proceedings of the Royal Society of London.

[19] Wallace, D. (2003). Everettian rationality: Defending Deutsch’s approach to probability in the Everett interpretation. Studies in History and Philosophy of Science.

[20] Clark, C. (2010). A theoretical introduction to wave mechanics.

[21] Deutsch, D. (1996). Comment on Lockwood. The British Journal for the Philosophy of Science.

[22] Carroll, S. (2015). The wrong objections to the many-worlds interpretation of quantum mechanics.

[23] Hartle, J. B. (2014). Spacetime quantum mechanics and the quantum mechanics of spacetime.

[24] Zeh, H. D. (2011). Feynman’s interpretation of quantum theory. The European Physical Journal.

[25] Carroll, S. (2014). Why the many-worlds formulation of quantum mechanics is probably correct.

[26] Rae, A. I. M. (2009). Everett and the Born rule. Studies in History and Philosophy of Science.

[27] Solomonoff, R. J. (1960). A preliminary report on a general theory of inductive inference.

[28] Soklakov, A. N. (2001). Occam’s razor as a formal basis for a physical theory.

[29] Altair, A. (2012). An intuitive explanation of Solomonoff induction.

[30] Kelion, L. (2015). Coder creates smallest chess game for computers.

[31] Bitbol, M. (1988). The concept of measurement and time symmetry in quantum mechanics. Philosophy of Science.

[32] Yudkowsky, E. (2008). The quantum physics sequence: Collapse postulates.

[33] Ellis, J. and Hagelin, J. S. (1984). Search for violations of quantum mechanics. Nuclear Physics.

[34] Ellis, J., Lopez, J. L., Mavromatos, N. E. and Nanopoulos, D. V. (1996). Precision tests of CPT symmetry and quantum mechanics in the neutral kaon system. Physical Review D.

[35] Agrawal, M. (2003). Linearity in quantum mechanics.

[36] Zeh, H. D. (1988). Measurement in Bohm’s versus Everett’s quantum theory. Foundations of Physics.

[37] Zurek, W. H. (2002). Decoherence and the transition from quantum to classical—revisited. Los Alamos Science.

[38] Schlosshauer, M. (2005). Decoherence, the measurement problem, and interpretations of quantum mechanics.

[39] Anon. (2012). The role of decoherence in quantum mechanics. Stanford Encyclopedia of Philosophy.

[40] Wallace, D. (2003). Everett and structure. Studies in History and Philosophy of Science.

[41] Zeh, H. D. (1970). On the interpretation of measurement in quantum theory. Foundations of Physics.

[42] Zeh, H. D. (1999). Why Bohm’s quantum theory? Foundations of Physics Letters.

[43] Griffiths, R. B. (1984). Consistent histories and the interpretation of quantum mechanics. Journal of Statistical Physics.

[44] Gell-Mann, M. and Hartle, J. B. (1989). Quantum mechanics in the light of quantum cosmology. Int. Symp. Foundations of Quantum Mechanics.

[45] Wallden, P. (2014). Contrary inferences in consistent histories and a set selection criterion.

[46] Lloyd, S. and Dreyer, O. (2015). The universal path integral. Quantum Information Processing.

[47] Bohm, D. J. and Hiley, B. J. (1982). The de Broglie pilot wave theory and the further development of new insights arising out of it. Foundations of Physics.

[48] Brown, H. R. and Wallace, D. (2005). Solving the measurement problem: De Broglie-Bohm loses out to Everett. Foundations of Physics.

[49] Coleman, S. (1994). Quantum mechanics in your face.

1. Fun fact: this paper was part of a paper contest that all undergraduate physics students at Harvey Mudd College participate in (which this paper won) for which there's a longstanding tradition (perpetuated by the students) that each student get a random word and be challenged to include it in their paper. My word was “stallion.” ↩︎

2. In some of these sources, the equivalent formalism of Kolmogorov complexity is used instead. ↩︎

3. To be precise, these should be universal Turing machine programs. ↩︎
