
LessWrong.com News

A community blog devoted to refining the art of rationality

Counterfactuals as a matter of Social Convention

Published on November 30, 2019 10:35 AM UTC

In my last post, I wrote that the counterfactuals in Transparent-Box Newcomb's problem were largely a matter of social convention. One point I overlooked for a long time was that formalising a problem like Newcomb's is trickier than it seems. Depending on how it is written, some statements may seem to apply to just our actual world, some may seem to also refer to counterfactual worlds, and some may seem ambiguous.

To clarify this, I'll consider phrases that one might hear in relation to this problem, plus some variations, and draw out their implications. I won't use modal logic, since it really wouldn't add anything to this discussion except more jargon.

The idea that counterfactuals could have a social element should seem really puzzling at first. After all, counterfactuals determine what counts as a good decision, and surely what is a good decision isn't just a matter of social convention? I think I know how to resolve this problem and I'll address it in a post soon, but for now I'll just provide a hint and link you to a comment by Abram Demski talking about how probabilities are somewhere between subjective and objective.

Example 1:

a) Omega is a perfect predictor

b) You find out from an infallible source that Omega will predict your choice correctly

The first suggests that Omega will predict you correctly no matter what you choose, so we might take it to apply to every counterfactual world, though it is technically possible that Omega is only a perfect predictor in this world. The second is much more ambiguous: you might take the prediction to be correct only in this world and not in the counterfactuals.

Example 2:

a) The first box always contains $1000

b) The first box contains $1000

The first seems to be making a claim about counterfactual worlds again, while the second is ambiguous: it isn't clear whether it applies to all worlds or not.

Example 3:

"The game works as follows: the first box contains $1000, while the second contains $0 or $1000 depending on whether the predictor predicts you'll two-box or one-box"

Talking about the rules of the game seems to be a hint that this will apply to all counterfactuals. After all, decision problems are normally about winning within a game, as opposed to the rules changing according to your decision.

Example 4:

a) The box in front of you contains $1 million

b) The box in front of you contains either $0 or $1 million. In this case, it contains $1 million

The first is ambiguous. The second seems to make a statement about all counterfactuals, then one about this world. If it were making a statement just about this world then the first sentence wouldn't have been necessary.
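To make the distinction in these examples concrete, one could model a decision problem as a set of candidate worlds plus constraints, where a "universal" reading of a statement filters the whole set, while an "actual-world" reading only pins down the real world. A minimal sketch (the representation is my own invention, not part of the original problem statement):

```python
from itertools import product

# Candidate worlds: (your action, Omega's prediction)
worlds = list(product(["one-box", "two-box"], repeat=2))

# Universal reading of "Omega is a perfect predictor": it constrains
# every counterfactual world, pruning any where prediction != action.
universal = [w for w in worlds if w[0] == w[1]]

# Actual-world reading of "Omega will predict your choice correctly":
# it is a fact about the one real world, so no counterfactuals are pruned.
actual_only = worlds

print(len(universal), len(actual_only))  # 2 4
```

On the universal reading, half the counterfactual worlds simply don't exist, which changes what counts as a good decision; on the actual-world reading they all remain live.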


This could be leveraged to provide a critique of the erasure approach. This approach wants to construct a non-trivial decision problem by erasing information, but this analysis suggests that either a) this may be unnecessary because it is already implicit in the problem which information is universal or not or b) the issue isn't that we need to figure out which assumption to erase, but that the problem is ambiguous about which parts should be taken universally.


What attempts have been made at global coordination around AI safety?

Published on November 30, 2019 4:28 AM UTC

For instance, might there be a maintained list of attempts at global agreement, whether they be public or private?

One unenforceable but highly endorsed example is the Future of Life Institute's AI Open Letter, which has now attracted ~8,000 signatures from AI safety researchers, AGI researchers and other AI-adjacent technologists. It's not immediately clear what percentage of individuals concerned with AI safety this represents, but at a cursory glance it would appear to be the largest consensus to date. The letter is merely an agreement to address AI safety sooner rather than later, so I am interested to hear of any agreements that address AI safety policy itself, even if the agreement is considered largely unsuccessful.

Please feel free to answer with personal views on global coordination.


Useful Does Not Mean Secure

Published on November 30, 2019 2:05 AM UTC

Brief summary of what I'm trying to do with this post:

  1. Contrast a “Usefulness” focused approach to building AI with a “Security” focused approach, and try to give an account of where security problems come from in AI.
  2. Show how marginal transparency improvements don’t necessarily improve things from the perspective of security.
  3. Describe what research is happening that I think is making progress from the perspective of security.

In this post I will be attempting to taboo the term 'alignment', and just talk about properties of systems. The below is not very original, I'm often just saying things in my own words that Paul and Eliezer have written, in large part just to try to think through the considerations myself. My thanks to Abram Demski and Rob Bensinger for comments on a draft of this post, though this doesn't mean they endorse the content or anything.

Useful Does Not Mean Secure

This post grew out of a comment thread elsewhere. In that thread, Ray Arnold was worried that there is an uncanny valley in our ability to understand and build AI, in which we can build AGI but not a safe AGI. Rohin Shah replied, and I'll quote from his reply:

Consider instead this worldview:

The way you build things that are useful and do what you want is to understand how things work and put them together in a deliberate way. If you put things together randomly, they either won't work, or will have unintended side effects.

(This worldview can apply to far more than AI; e.g. it seems right in basically every STEM field. You might argue that putting things together randomly seems to work surprisingly well in AI, to which I say that it really doesn't, you just don't see all of the effort where you put things together randomly and it simply flat-out fails.)

The argument "it's good for people to understand AI techniques better even if it accelerates AGI" is a very straightforward non-clever consequence of this worldview.


Under the worldview I mentioned, the first-order effect of better understanding of AI systems, is that you are more likely to build AI systems that are useful and do what you want.

A lot of things Rohin says in that thread make sense. But in this post, let me point to a different perspective on AI that I might consider, if I were to focus entirely on Paul Christiano's model of greedy algorithms in part II of his post on what failure looks like. That perspective sounds something like this:

The way you build things that are useful and do what you want, when you're in an environment with much more powerful optimisers than you, is to spend a lot of extra time making them secure against adversaries, over and above simply making them useful. This is so that the other optimisers cannot exploit your system to achieve their own goals.

If you build things that are useful, predictable, and don't have bad side-effects, but are subject to far more powerful optimisation pressures than you, then by default the things you build will be taken over by other forces and end up not being very useful at all.

An important distinction about artificial intelligence research is that you're not simply competing against other humans, where you have to worry about hackers, governments and political groups, but that the core goal of artificial intelligence research is the creation of much more powerful general optimisers than currently exist within humanity. This is a difference in kind from all other STEM fields.

Whereas normal programming systems that aren't built quite right are more likely to do dumb things or just break, when you make an AI system that isn't exactly what you wanted, the system might be powerfully optimising for other targets in a way that has the potential to be highly adversarial. In discussions of AI alignment, Stuart Russell often likes to use an analogy in which "building bridges that stay up" is simply part of bridge building, not a distinct field. To extend the analogy a little, you might say the field of AI is unusual in that if you don't quite make the bridge well enough, the bridge itself may actively seek out security vulnerabilities that bring the bridge down, hide them from your attention until it has the freedom to take the bridge down in one go, and then take out all the other bridges in the world.

Now, talk of AI necessarily blurs the line between 'external optimisation pressures' and 'the system is useful and does what you want' because the system itself is creating the new, powerful optimisation pressure that needs securing against. Paul's post on what failure looks like talks about this, so I’ll quote it here:

Modern ML instantiates massive numbers of cognitive policies, and then further refines (and ultimately deploys) whatever policies perform well according to some training objective. If progress continues, eventually machine learning will probably produce systems that have a detailed understanding of the world, which are able to adapt their behavior in order to achieve specific goals.

Once we start searching over policies that understand the world well enough, we run into a problem: any influence-seeking policies we stumble across would also score well according to our training objective, because performing well on the training objective is a good strategy for obtaining influence.

How frequently will we run into influence-seeking policies, vs. policies that just straightforwardly pursue the goals we wanted them to? I don’t know.
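The quoted worry can be compressed into a tiny selection model (all numbers invented for illustration): if influence-seeking policies score as well on the training objective as intended ones, then selecting on score alone cannot distinguish them.

```python
# Toy policy search: candidates are judged on the training objective only.
policies = [
    {"kind": "naive",             "train_score": 0.6, "deploy_value": 0.6},
    {"kind": "intended",          "train_score": 1.0, "deploy_value": 1.0},
    {"kind": "influence-seeking", "train_score": 1.0, "deploy_value": -1.0},
]

# Performing well on the training objective is a good strategy for
# obtaining influence, so the influence-seeker ties the intended policy.
survivors = [p for p in policies if p["train_score"] >= 0.9]
print({p["kind"] for p in survivors})  # both high scorers survive selection
```

The training signal prunes the merely mediocre policy but is blind to `deploy_value`, which only differs once the selected policy is out of the training distribution.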

You could take the position that, even though security work is not normally central to a field, this new security work is already central to this one: increasing the ability to build 'useful' things will naturally require solving these novel security problems, so the field of AI will get it right by default.

This is my understanding of Paul's mainline expectation (based on his estimates here and that his work is based around making useful / well motivated AI described here, here and in Rohin’s comment on that post) and also my understanding of Rohin's mainline expectation (based on his estimates here). My understanding is this still means there's a lot of value on the table from marginal work, so both of them work on the problem, but by default they expect the field to engage with this problem and do it well.

Restatement: In normal tech companies, there's a difference between "making useful systems" and "making secure systems". In the field of AI, "making useful systems" includes potentially building powerful adversaries, which involves novel security problems, so you might expect that executing the standard "make useful systems" playbook will result in solving the novel security problems.

For example, in a debate on instrumental convergence between various major AI researchers, this was also the position that Francesca Rossi took:

Stuart, I agree that it would be easy to build a coffee fetching machine that is not aligned to our values, but why would we do this? Of course value alignment is not easy, and still a research challenge, but I would make it part of the picture when we envision future intelligent machines.

However, Yann LeCun said something subtly different:

One would have to be rather incompetent not to have a mechanism by which new terms in the objective could be added to prevent previously-unforeseen bad behavior.

Yann is implicitly taking the stance that there will not be powerful adversarial pressures exploiting such unforeseen differences in the objective function and humanity's values. His responses are of the kind "We wouldn't do that" and "We would change it quickly when those problems arose", but not "Here's how you build a machine learning system that cannot be flawed in this way". It seems to me that he does not expect there to be any further security concerns of the type discussed above. If I pointed out a way that your system would malfunction, it is sometimes okay to say “Oh, if anyone accidentally gives that input to the system, then we’ll see and fix any problems that occur”, but if your government computer system is not secure, then by the time you’ve noticed what’s happening, a powerful adversary is inside your system and taking actions against you.

(Though I should mention that I don't think this is the crux of the matter for Yann. I think his key disagreement is that he thinks we cannot talk usefully about safe AGI design before we know how to build an AGI - he doesn't think that prosaic AI alignment is in principle feasible or worth thinking about.)

In general, it seems to me that if you show me how an AI system is flawed, if my response is to simply patch that particular problem then go back to relaxing, I am implicitly disbelieving that optimisation processes more powerful than human civilization will look for similar flaws and exploit them, as otherwise my threat level would go up drastically.

To clarify what this worry looks like: advances in AGI are hopefully building systems that can scale to being as useful and intelligent as is physically feasible in our universe - optimisation power way above that of human civilization's. As you start getting smarter, you need to build more into your system to make sure the smart bits can't exploit the system for their own goals. This assumes an epistemic advantage, as Paul says in the Failure post:

Attempts to suppress influence-seeking behavior (call them “immune systems”) rest on the suppressor having some kind of epistemic advantage over the influence-seeker. Once the influence-seekers can outthink an immune system, they can avoid detection and potentially even compromise the immune system to further expand their influence. If ML systems are more sophisticated than humans, immune systems must themselves be automated. And if ML plays a large role in that automation, then the immune system is subject to the same pressure towards influence-seeking.

There's a notion whereby if you take a useful machine learning system, and you just make it more powerful, what you're essentially doing is increasing the intelligence of the optimisation forces passing through it, including the adversarial optimisation forces. As you take the system and make it vastly superintelligent, your primary focus needs to be on security from adversarial forces, rather than primarily on making something that's useful. You've become an AI security expert, not an AI usefulness expert. The important idea is that AI systems can break at higher levels of intelligence, even if they're currently quite useful.

As I understand it, this sort of thing happened at Google, which started out as computer-networking experts and then became security experts, because for a while the main changes made to Google Search were to increase security and make it harder for people to game the PageRank system. The adversarial pressures on them have since hit terminal velocity and there probably won't be any further increases, unless and until we build superintelligent AI (be it general or the relevant kind of narrow).

Marginal Transparency Does Not Mean Marginal Security

A key question in figuring out whether to solve this security problem via technical research (as opposed to global coordination) is whether a line of work differentially makes this sort of security from optimisation powers easier to work on, or whether it simply makes the system more useful in ways that don't account for the novel adversaries.

It mostly sounds to me like the case being made above in this instance is that the system is generally becoming more understandable, so it's both easier to build and easier to secure. My general sense is that increasing understanding is always better. But one of the core arguments Nick Bostrom makes is that not all scientific insights are positive for humanity at its current level of wisdom, and that we need to be able to order which insights we get. There are many types of 'increased understanding', and it's not clear that all of them are positive on the current margin.

Let me give a concrete example where more transparency increases usefulness but not security. 

It’s sometimes the case that, in prisons, the prison staff will attempt to get the prisoners to help out with tasks, errands and maintenance around the prison. Now, if you send them into offices and parts of the prison but don’t watch what they do, then they may conspire against the prison staff - build things that can be used as weapons against the guards, build tunnels, use computers to contact the outside world, etc. Now, if you can start looking at what the prisoners are doing when they’re in office block #17, and you have basic rewards/punishments (e.g. giving or removing access to special foods and entertainment things), then you can reward them for doing good work there, and punish them for doing bad work there. You can also punish them for conspiring against you there.

However, this hasn’t really helped: if they were planning to build a tunnel in office block #17, they will now just build a tunnel in office block #18 instead. If I can start to see what they do all the time in certain locations, then I can punish the ones that aren't doing broadly the right work, and generally make sure they're doing the right work. But this doesn't mean that in the places I can't look they're not conspiring against me. It seems plausible to me that many improvements in transparency, while making it easier to make the people do the prison’s work, do not make it harder to conspire against me, and just move around where the conspiring happens.

If you’re trying to delegate work and incentivise people to do labour for you, you might think that you should just bring in a good manager to get people to do useful work. But in the prison situation, you primarily need to become a security expert, over and above being an expert in how to manage well. In this situation, there are many improvements in transparency that help force the prisoners to do useful work, but that don't secure you against them conspiring with each other to break out of the system.
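The prison example can be sketched as a toy model (all details hypothetical): monitoring more locations increases the useful work you can enforce, but as long as any location is unobserved, the conspiring simply relocates there.

```python
def conspiracy_possible(locations, monitored):
    # The adversary plans wherever we are not looking.
    return any(loc not in monitored for loc in locations)

locations = {f"office_{i}" for i in range(1, 21)}  # 20 office blocks

for n in (0, 10, 19):
    monitored = {f"office_{i}" for i in range(1, n + 1)}
    # Usefulness scales with coverage; security does not, until it's total.
    print(n, conspiracy_possible(locations, monitored))
# True for 0, 10 and even 19 monitored blocks; only full coverage helps.
```

Security here is a threshold property of the last unobserved location, not a smooth function of how many locations you watch, which is the sense in which marginal transparency need not mean marginal security.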

I think that with AI, we already have all the weights of the NN to look at, so the system is maximally transparent. We can see everything, for certain values of 'see'. I think the relevant question is on what level you can understand what's going on. As we get higher-level understanding, we can maybe start to figure out whether it's doing certain bad things, or certain good things, and punish/reward those. But making sure that the process does something useful (e.g. invest money, run a hospital, classify images) is not the same as the type of full understanding that ensures adversarial work can't simply be moved to areas that are too hard or too costly for me to understand.

Restatement: Marginal improvements in understandability and transparency can make it much easier to make useful systems but it's not necessarily the case that it produces a meaningful difference in the ability to produce secure systems. It will allow us, at increasingly higher levels of understanding, to be able to change the type of work needed to exploit the system; this is not the same as a design that is safe no matter how powerful the optimisation power against us.

I wrote this in response to Ray trying to figure out how to tell whether any given type of machine learning research is making differential progress. While the specific type of research discussed in that thread has a more detailed story than I will go into here, and mostly seems very helpful from my layman perspective, I think that research "either being of zero impact, or else making the whole field more transparent/understandable" does not mean that the research makes differential progress on making the system secure. Transparency can increase usefulness without increasing security.

In one sense, a machine learning system is maximally transparent - I can see every part of what it is doing. But while I don't understand its reasoning, while there are levels on which I don't know what it’s thinking, by default I’m not confident that adversarial thought isn't happening there.

Current Technical Work on Security

From this perspective, let me talk about the research that seems like it's aiming to help on the security front. This is not all the work being done, just the work that I feel I understand well enough to summarise from this perspective.

My understanding is that the main work attempting to pinpoint where optimisation enters the system in surprising ways is Hubinger, Mikulik, Skalse, van Merwijk and Garrabrant's work on risks from learned optimisation (paper, sequence). This gives lots of names to concepts describing how optimisers work, and asks questions like:

  • Under what conditions will my learned algorithm itself do optimisation?
  • When the learned algorithm does optimisation, what will its objective be, and what will the relationship be between its objective and the loss function of the neural net that produced it?
  • If the learned optimiser has successfully built a model of the objective function that was used to build it, what conditions predict whether it will be working toward my objective as opposed to around it?
  • When should I expect the optimiser in the learned algorithm to try to deceive me?

The paper also asks whether it's possible to prevent influence-seeking algorithms from entering your systems by creating complexity measures on the system, such as time and space penalties. On this topic, Paul Christiano has asked whether requiring systems be maximally efficient according to circuit description length removes all adversarial behaviour; and Evan has offered an answer in the negative.

It's also the case that the Agent Foundations team at MIRI is trying to think about the problem of inner alignment more broadly, and poke at various concepts around here, such as in their writeups on Robust Delegation and Subsystem Alignment. This explores many simple background questions to which we don't have principled answers, and for which we cannot yet draw toy models of intelligent agents that reliably get these problems right.

  • Is there a principled way to figure out whether I should trust that something more intelligent than me shares my values, given that I can't figure out exactly what it's going to do? If I am a child, sometimes adults will do something that is the opposite of what I want - is there a way of figuring out whether they're doing this in accordance with my goals?
  • How should I tell a more intelligent agent than me what I want it to do, given that I don't know everything about what I want? This is especially hard given that optimisation amplifies the differences between what I say I want and what I actually want (aka Goodhart's Law).
  • How do I make sure the different parts of a mind are in a good balance, rather than some parts overpowering other parts? When it comes to my own mind, sometimes different parts get out of whack and I become too self-critical, or overconfident, or depressed, or manic. Is there a principled way of thinking about this?
  • How do I give another agent a good description of what to do in a domain, without teaching them everything I know about the domain? This is a problem in companies, where sometimes people who don't understand the whole vision can make bad tradeoffs.
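The Goodhart's Law point in the second bullet can be demonstrated in a few lines (a generic "proxy = true value + noise" toy of my own, not anything from the MIRI writeups): the harder you select on the proxy, the more of the winning score is noise rather than the thing you actually wanted.

```python
import random

random.seed(0)

def sample():
    true = random.gauss(0, 1)                # what I actually want
    return true, true + random.gauss(0, 1)   # what I measure (the proxy)

def best_by_proxy(n):
    # Select the single best candidate out of n, judged by proxy alone.
    return max((sample() for _ in range(n)), key=lambda p: p[1])

for n in (10, 100_000):
    true, proxy = best_by_proxy(n)
    # As optimisation pressure (n) grows, the winning proxy score keeps
    # climbing, while the true value lags increasingly far behind it.
    print(n, round(true, 2), round(proxy, 2))
```

Selecting the extreme of the proxy selects jointly for true value and for measurement noise, so stronger optimisation widens the gap between what I said and what I meant.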

I am curious about explanations of how other work fits into this framework.


Neural Annealing: Toward a Neural Theory of Everything (crosspost)

Published on November 29, 2019 5:31 PM UTC

The following is QRI's unified theory of music, meditation, psychedelics, depression, trauma, and emotional processing. Implications for how the brain implements Bayesian updating, and future directions for neuroscience. Crossposted from http://opentheory.net


Context: follow-up to The Neuroscience of Meditation and A Future For Neuroscience; a unification of (1) the Entropic Brain & REBUS (Carhart-Harris et al. 2014; 2018; 2019), (2) the Free Energy Principle (Friston 2010), (3) Connectome-Specific Harmonic Waves (Atasoy et al. 2016; 2017), and (4) QRI’s Symmetry Theory of Valence (Johnson 2016; Gomez Emilsson 2017).

0. Introduction

Why is neuroscience so hard?

Part of the problem is that the brain is complicated. But we’ve also mostly been doing it wrong, trying to explain the brain using methods that couldn’t possibly generate insight about the things we care about.

On QRI’s lineages page, we suggest there’s a distinction between ‘old’ and ‘new’ neuroscience:

Traditionally, neuroscience has been concerned with cataloguing the brain, e.g. collecting discrete observations about anatomy, observed cyclic patterns (EEG frequencies), and cell types and neurotransmitters, and trying to match these facts with functional stories. However, it’s increasingly clear that these sorts of neat stories about localized function are artifacts of the tools we’re using to look at the brain, not of the brain’s underlying computational structure.
What’s the alternative? Instead of centering our exploration on the sorts of raw data our tools are able to gather, we can approach the brain as a self-organizing system, something which uses a few core principles to both build and regulate itself. As such, if we can reverse-engineer these core principles and use what tools we have to validate these bottom-up models, we can both understand the internal logic of the brain’s algorithms — the how and why the brain does what it does — as well as find more elegant intervention points for altering it.

That’s a big check to try to cash. What might this look like?

I. Annealing metaphors for the brain

In my post about the neuroscience of meditation, I talked about simulated annealing, a natural implication of Robin Carhart-Harris’s work on entropic disintegration in the brain:

Annealing involves heating a metal above its recrystallization temperature, keeping it there for long enough for the microstructure of the metal to reach equilibrium, then slowly cooling it down, letting new patterns crystallize. This releases the internal stresses of the material, and is often used to restore ductility (plasticity and toughness) on metals that have been ‘cold-worked’ and have become very hard and brittle— in a sense, annealing is a ‘reset switch’ which allows metals to go back to a more pristine, natural state after being bent or stressed. I suspect this is a useful metaphor for brains, in that they can become hard and brittle over time with a build-up of internal stresses, and these stresses can be released by periodically entering high-energy states where a more natural neural microstructure can reemerge.

In his work on the entropic brain, Carhart-Harris studies how psychedelics like LSD and psilocybin add enough energy (neural activity) to the brain that existing neural patterns are disrupted, much like how heating a metal disrupts its existing molecular bonds. Recently, Carhart-Harris and Friston have unified their frameworks under the REBUS (RElaxed Beliefs Under pSychedelics) model, which also imports the annealing metaphor for brains:

The hypothesized flattening of the brain’s (variational free) energy landscape under psychedelics can be seen as analogous to the phenomenon of simulated annealing in computer science—which itself is analogous to annealing in metallurgy, whereby a system is heated (i.e., instantiated by increased neural excitability), such that it attains a state of heightened plasticity, in which the discovery of new energy minima (relatively stable places/trajectories for the system to visit/reside in for a period of time) is accelerated (Wang and Smith, 1998). Subsequently, as the drug is metabolized and the system cools, its dynamics begin to stabilize—and attractor basins begin to steepen again (Carhart-Harris et al., 2017). This process may result in the emergence of a new energy landscape with revised properties.

It’s a powerful metaphor since it ties together and recontextualizes so many core neuroscience concepts: free energy landscapes, Bayesian modeling, the ‘handshake’ between bottom-up sense-data and top-down priors. For a general overview of the math, see Wikipedia on simulated annealing, Metropolis-Hastings algorithm, Parallel tempering; for more on Carhart-Harris’s and Friston’s work, see Scott Alexander’s and Milan Griffes’ commentary. There seems to be some convergence on this metaphor: as Scott Alexander noted,

F&CH aren’t the first people to discuss this theory of psychedelics. It’s been in the air for a couple of years now – and props to local bloggers at the Qualia Research Institute and Mad.Science.Blog for getting good explanations up before the parts had even all come together in journal articles. I’m especially interested in QRI’s theory that meditation has the same kind of annealing effect, which I think would explain a lot.

The basics: how does annealing work?

Carhart-Harris’s and Friston’s model does many very clever things and is a substantial addition to the literature; I start from a similar frame but describe the process slightly differently. The following is QRI’s model (based on my talk on the Neuroscience of Meditation in Thailand):

  • First, energy (neural excitation, e.g. Free Energy from prediction errors) builds up in the brain, either gradually or suddenly, collecting disproportionately in the brain’s natural eigenmodes;
  • This build-up of energy (rate of neural firing) crosses a metastability threshold and the brain enters a high-energy state, causing entropic disintegration (weakening previously ‘sticky’ attractors);
  • The brain’s neurons self-organize into new multi-scale equilibria (attractors), aka implicit assumptions about reality’s structure and value weightings, which given present information should generate lower levels of prediction error than previous models (this is implicitly both a resynchronization of internal predictive models with the environment, and a minimization of dissonance in connectome-specific harmonic waves); 
  • The brain ‘cools’ (neural activity levels slowly return to normal), and parts of the new self-organized patterns remain and become part of the brain’s normal activity landscape;
  • The cycle repeats, as the brain’s models become outdated and prediction errors start to build up again.
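The cycle above maps closely onto textbook simulated annealing. As a reminder of how the algorithmic version works, here is a minimal generic sketch (an illustration of the metaphor's source, not of QRI's model itself; the landscape is invented for the example):

```python
import math
import random

random.seed(0)

def energy(x):
    # A bumpy landscape: local minima ('sticky attractors') separated
    # by barriers.
    return x ** 2 + 3 * math.cos(3 * x)

x = 4.0        # start stuck near a shallow local minimum
temp = 5.0     # 'heated' state: high plasticity, weak attractors
while temp > 1e-3:
    candidate = x + random.gauss(0, 0.5)
    delta = energy(candidate) - energy(x)
    # Metropolis rule: always accept improvements; accept worse moves
    # with probability exp(-delta / temp), which shrinks as we cool.
    if delta < 0 or random.random() < math.exp(-delta / temp):
        x = candidate
    temp *= 0.999  # slow cooling lets new structure crystallise

# After cooling, x has settled into a deep minimum of the landscape.
```

High temperature corresponds to the entropic-disintegration phase, where even energy-increasing moves are accepted and old attractors lose their grip; slow cooling corresponds to the new equilibria hardening into the brain's normal activity landscape.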

Any ‘emotionally intense’ experience that you need time to process most likely involves this entropic disintegration->search->annealing mechanism— this is what emotional processing is.

And I’d suggest that this is the core dynamic of how the brain updates its structure, the mechanism the brain uses to pay down its ‘technical debt’. In other words, entering high-energy states (i.e., intense emotional states which take some time to ‘process’) is how the brain releases structural stress and adapts to new developments. This process needs to happen on a regular basis to support healthy function, and if it doesn’t, psychological health degrades; in particular, mental flexibility and emotional vibrancy go down, analogous to a drop in a metal’s ‘ductility’. People seem to have a strong subconscious drive toward entering these states, and if they haven’t experienced a high-energy brain state in some time, they actively seek one out, sometimes even in destructive ways.

However, the brain spends most of its time in low-energy states, because they’re safer: systems in noisy environments need to limit their rate of updating. There are often spikes of energy in the brain, but these don’t tend to snowball into full high-energy states because the brain has many ‘energy sinks’ (inhibitory top-down predictive models) which soak up excess energy before entropic disintegration can occur.

But the brain can enter high-energy states if these energy sinks are:

(1) De-activated, if certain evolved trigger conditions are present: e.g., death of a loved one, falling in love, good sex, social rejection, getting bitten by a weird animal, failing some important prediction. In these cases there seems to be some sort of adaptive gating mechanism that disables the typical energy sinks in order to allow entropic disintegration->search->annealing to happen.

(2) Overwhelmed, if there’s an enormous magnitude of energy coming in, faster than the energy sinks can mop it up: e.g., watching a horror movie, direct brain stimulation, first day of school, being sleep deprived, military boot camp, cult indoctrinations, your wedding day.

(3) Avoided, if semantically-neutral energy is applied to the system. Essentially, coherent energy which isn’t strongly linked to any cognitive, emotional, or sensory process will be partially illegible to most existing energy sinks, and so it can persist long enough to build up, essentially ‘hacking’ the brain’s activity normalization system. (Hold that thought; this is the most interesting one. We’ll return to it later.)

This is the ‘view from 30,000 feet’ for how simulated annealing in the brain works. If you stopped reading here, you’d walk away with a reasonable toy model of QRI’s “Neural Annealing” framework.
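The cycle above is, mechanically, the classic simulated-annealing algorithm from optimization, and it may help to see that algorithm in code. A minimal sketch (the toy energy landscape, step size, and cooling schedule here are my own arbitrary choices, not claims about brain parameters):

```python
import math
import random

def simulated_annealing(energy, neighbor, x0, t_start=5.0, t_end=0.01, cooling=0.95):
    """Heat the system, explore, cool, and settle into a low-energy configuration."""
    x = best = x0
    best_e = energy(x0)
    t = t_start
    while t > t_end:
        candidate = neighbor(x)
        delta = energy(candidate) - energy(x)
        # At high 'temperature', even energy-increasing moves are accepted
        # (the analogue of entropic disintegration); as t falls, the system
        # settles into whichever basin the search has found (annealing).
        if delta < 0 or random.random() < math.exp(-delta / t):
            x = candidate
        if energy(x) < best_e:
            best, best_e = x, energy(x)
        t *= cooling  # the cooling schedule
    return best, best_e

# A bumpy toy landscape: many local minima, global minimum near x = 0.
landscape = lambda x: x * x + 3 * abs(math.sin(5 * x))
jiggle = lambda x: x + random.uniform(-0.5, 0.5)
best_x, best_e = simulated_annealing(landscape, jiggle, x0=4.0)
```

Started ‘hot’, the search can escape local minima; started ‘cold’, it stays stuck near its starting point, which is the analogue of a brain whose energy sinks never let an annealing cycle begin.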

But there’s a lot more to the model! The rest of this writeup is an iterative tour using Neural Annealing to explain meditation, trauma, love, depression, psychedelics, and effective therapy, with each section adding a variation on the core theme.

Interlude: FEP, CSHW, and EBH/REBUS

QRI’s “Neural Annealing” framework is essentially a unification of Karl Friston’s Free Energy Principle (FEP), Selen Atasoy’s Connectome-Specific Harmonic Waves (CSHW), Robin Carhart-Harris’s Entropic Brain Hypothesis (EBH), and QRI’s own Symmetry Theory of Valence (STV). Recently, Friston and Carhart-Harris have unified their respective paradigms with the Relaxed Beliefs Under pSychedelics (REBUS) model. I believe combining all three is exponentially more powerful, not only giving the computational-level story of REBUS, but also giving us a model for how the brain may be physically implementing REBUS, and Bayesian updating in general, with a correspondingly richer set of predictions. 

First, here’s a quick recap: to paraphrase what I wrote elsewhere,

Karl Friston’s Free Energy Principle (FEP) is the leading theory of self-organizing system dynamics, one which has (in various guises) pretty much taken neuroscience by storm. It argues that any self-organizing system which effectively resists disorder must (as its core organizing principle) minimize its free energy, that free energy is equivalent to surprise (in a Bayesian sense), and that this surprise-minimization drives basically all human behavior. This minimization of surprise revolves around Bayesian-type reasoning: the brain is always getting bottom-up sense data flowing in, more than it can handle. So it relies on top-down predictive models that attempt to sort through all this data so we can focus on the surprising stuff, the stuff that can’t be effortlessly predicted. The core of the FEP is the details of how this ‘handshake’ between bottom-up and top-down happens, and what can influence it. See Friston’s primary work; Scott Alexander’s attempt to distill it. Related to (and sometimes used synonymously with) Active Inference, the Bayesian Brain, and Predictive Processing / Predictive Coding.
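As a toy illustration of this surprise-minimization loop (my own minimal sketch, not Friston’s actual formalism, which is built on variational free energy rather than a scalar error): a top-down prediction is compared against bottom-up sense data, and the model updates in proportion to the prediction error.

```python
def update_model(prediction, observation, learning_rate=0.1):
    """Nudge the top-down model toward the bottom-up data."""
    error = observation - prediction           # the 'surprise'
    return prediction + learning_rate * error  # absorb a fraction of it

prediction = 0.0     # the model's initial expectation
stream = [1.0] * 50  # a stable environment worth learning
errors = []
for obs in stream:
    errors.append(abs(obs - prediction))
    prediction = update_model(prediction, obs)

# Prediction error shrinks as the model resynchronizes with the environment.
print(errors[0], errors[-1])  # 1.0 vs ~0.006
```

In the full predictive-processing story the learning rate is itself precision-weighted and the models are hierarchical, but the basic dynamic is the same: surprising data flows up, and the model changes until it stops being surprised.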

Robin Carhart-Harris’s Entropic Brain Hypothesis (EBH) is essentially an attempt to import key concepts such as entropy and self-organized criticality from statistical physics into neuroscience, in order to explain psychedelic phenomena. As I noted above, it suggests that certain conditions such as psychedelics can add enough energy to brain networks that they undergo ‘entropic disintegration’, and then self-organize into new equilibria. See Carhart-Harris 2018.

Selen Atasoy’s Connectome-Specific Harmonic Waves (CSHW) is a method for applying harmonic analysis to the brain: basically, it uses various forms of brain imaging to infer the brain’s natural resonant frequencies (eigenmodes), and how much energy each of these frequencies carries. The core workflow is three steps: first, combine MRI and DTI to approximate a brain’s connectome; second, use an empirically-derived wave propagation equation to calculate the natural harmonics of this connectome; third, estimate which power distribution among these harmonics would most accurately reconstruct the observed fMRI activity. This framework offers several notable things: (a) these connectome-specific harmonic waves (CSHWs) are natural Schelling points that the brain has probably self-organized around (and so are worth talking about); (b) a plausible mid-level bridge connecting bottom-up neural dynamics and high-level psychological phenomena; (c) something we can actually measure. CSHW is an empirical paradigm, which is very uncommon in theoretical neuroscience. Here’s a transcript of Atasoy’s explanation; I also wrote extensively about CSHW in A Future for Neuroscience.
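To make this workflow concrete, here is a minimal sketch of the second and third steps on a toy ring-shaped ‘connectome’, using a plain graph Laplacian as a stand-in for Atasoy’s empirically-derived wave equation (the Laplacian’s eigenvectors are the graph analogue of a string’s standing waves):

```python
import numpy as np

# Toy 'connectome': a ring of 8 regions, each wired to its two neighbors.
n = 8
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

# Graph Laplacian; its eigenvectors are the network's harmonic modes.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)  # ascending frequencies (eigenvalues)

# Step three of the workflow: estimate how much 'power' each harmonic
# contributes to an observed activity pattern by projecting onto the modes.
activity = np.random.default_rng(0).normal(size=n)
power = (eigvecs.T @ activity) ** 2

# The modes form a complete basis, so the activity can be fully rebuilt.
reconstruction = eigvecs @ (eigvecs.T @ activity)
assert np.allclose(reconstruction, activity)
```

The real method works on a connectome of tens of thousands of nodes and fits the power distribution against fMRI time series, but the linear-algebra skeleton is the same.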

In short: each of these three paradigms is a description of how the brain self-organizes. Friston’s work understands the self-organization from a computational lens; Carhart-Harris an energetic lens; Atasoy a physical lens.

Finally, I’d offer two further pieces of background context: 

QRI’s own Symmetry Theory of Valence (STV), which hypothesizes that given a mathematical representation of an experience, the symmetry of this representation will encode how pleasant the experience is (Johnson 2016). We further hypothesize that consonance between a brain’s connectome-specific harmonic waves (CSHWs) will be a reasonable proxy for this symmetry (Gomez Emilsson 2017).

Marr’s Three Levels: as explained on our lineages page, 

David Marr is most famous for Marr’s Three Levels (along with Tomaso Poggio), which describe “the three levels at which any machine carrying out an information-processing task must be understood:”
>Computational theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?
>Representation and algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
>Hardware implementation: How can the representation and algorithm be realized physically? [Marr (1982), p. 25]
This framework sounds simple, but is remarkably important since arguably most of the confusion in neuroscience (and phenomenology research) comes from starting a sentence on one Marr-Poggio level and finishing it on another, and this framework lets people debug that confusion.

Back to annealing

As noted, Carhart-Harris and Friston have unified their paradigms under REBUS by understanding prediction errors as the ‘energy’ parameter which drives disruption (entropic disintegration) in the brain’s networks. Over time, this drives an evolutionary search function which attempts to minimize these prediction errors. I think this is a very beautiful description of a very clever system, and one which allows us an opportunity to cross-validate each model, and jump between levels of description if we get ‘stuck’. But it’s still missing a story about physical implementation. What is this ‘energy’, physically speaking?

II. How meditation works: semantically-neutral annealing

I believe that almost all techniques that intentionally ‘hack’ the brain’s annealing process share a common mechanism: a build-up of semantically neutral energy. “Semantically neutral energy” refers to neural activity which is not strongly associated with any specific cognitive or emotional process. As I note above, usually energy build-up is limited: once a perturbation of the system neatly falls into a pattern recognized by the brain’s predictive hierarchy, the neural activity propagating this pattern is dissipated. But if a pattern never quite matches anything, or takes advantage of implementation-level structure to persist, and especially if it’s getting continually reinforced by some external or internal dynamic, it can persist long enough to build up. I think meditation is a perfect example of a process which adds semantically-neutral energy to the brain: effortful attention on excitatory bottom-up sense-data, and attenuation of inhibitory top-down predictive models, will naturally lead to a build-up of this ‘non-semantic’ energy in the brain. From The Neuroscience of Meditation:

Furthermore, from what I gather from experienced meditators, successfully entering meditative flow may be one of the most reliable ways to reach these high-energy brain states. I.e., it’s very common for meditation to produce feelings of high intensity, at least in people able to actually enter meditative flow. Meditation also produces more ‘pure’ or ‘neutral’ high-energy states, ones that are free of the intentional content usually associated with intense experiences which may distort or limit the scope of the annealing process. So we can think of intermediate-to-advanced (‘successful flow-state’) meditation as a reheating process, whereby the brain enters a more plastic and neutral state, releases pent-up structural stresses, and recrystallizes into a more balanced, neutral configuration as it cools. Iterated many times, this will drive an evolutionary process and will produce a very different brain, one which is more unified & anti-fragile, less distorted toward intentionality, and in general structurally optimized against stress.
An open question is how or why meditation produces high-energy brain states. There isn’t any consensus on this, but with a nod to the predictive coding framework, I’d offer that bottom-up sense-data is generally excitatory, adding energy to the system, whereas top-down predictive Bayesian models are generally inhibitory, functioning as ‘energy sinks’. And so by ‘noting and knowing’ our sensations before our top-down models activate, in a sense we’re diverting the ‘energy’ of our sensations away from its usual counterbalancing force. If we do this long enough and skillfully enough, this energy can build up and lead to ‘entropic disintegration’, essentially pushing enough energy into the system that existing attractors are disrupted and annealing can occur. 

A natural question here is: what *is* this ‘semantically neutral energy’, exactly? An abstract answer is that “semantically neutral energy” can be thought of as an increase in brain activity which is (1) illegible to Marr’s semantic/computational level, but (2) coherent with regard to Marr’s algorithmic or implementational levels (another term for this might be ‘semantically-illegible energy’). But my concrete answer is that semantically neutral energy is a build-up of energy in the brain’s natural resonances — energy accumulating in CSHWs. And so it’s this that builds up during meditation, and this that starts a semantically-neutral annealing process which has a unique effect profile.

I think semantically-neutral annealing is the best kind of annealing for psychological health, because: 

(1) By mostly avoiding energy sinks, the same entropic disintegration->search->annealing process can happen using less total energy, which is less disruptive to the fine details of the system;

(2) Since this energy is semantically-neutral, it doesn’t depend on or trigger as many semantic processes in the brain (which can have unpredictable effects), and likewise it doesn’t necessarily rely on anti-inductive ‘hacks’ to trick the predictive processing system, and these factors make it a more reliable and repeatable source of annealing; 

(3) Very very importantly: similarly to how vibratory energy applied to a tuning fork quickly collapses to the natural resonant frequency of the tuning fork, I’m speculating that coherent, semantically-neutral energy added to the brain will naturally cluster in the brain’s natural connectome harmonics, which will thus drive an annealing process which strengthens a consonant subset of the brain’s natural harmonic resonances in the long-term— essentially ‘retuning the brain’ toward more resonant/flow states. For more details, see The Neuroscience of Meditation;

(4) Finally, this process should feel really really good and in the long-term, retune the mind to be more pleasant to inhabit. QRI’s work on the Symmetry Theory of Valence (STV) and our method of applying this to the brain (CDNS) suggests that harmony in the brain is literally synonymous with pleasure, and so processes which ‘deepen the grooves’ of core harmonic resonances will tend to boost the mind’s default hedonic level (likely helping significantly with neuroticism and emotional resilience).

I.e., Meditation is a remarkably clever technique which piggybacks on several of the brain’s core principles of self-organization: first, effortful attention on (excitatory) sense-data and inhibiting (inhibitory) predictive storytelling naturally pushes the brain into a high-energy state and makes it more malleable; this excess energy disproportionately collects in natural brain harmonics, and as the brain ‘cools’ from its high-energy state, these energized harmonics become ‘deeper’, leading to more psychological robustness. Less neuroticism and more flow. I think this is where a large portion of the benefits of advanced meditation comes from.
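The ‘tuning fork’ intuition in point (3) is just the physics of resonance, and a toy driven-oscillator sketch shows why coherent input at a natural frequency accumulates far more energy than mismatched input (the parameters here are arbitrary illustrative values, not estimates of any brain quantity):

```python
import math

def peak_energy(drive_freq, nat_freq=10.0, damping=0.1, steps=60000, dt=0.0005):
    """Peak energy reached by a lightly damped oscillator under a sinusoidal drive."""
    x, v, peak = 0.0, 0.0, 0.0
    for i in range(steps):
        t = i * dt
        # damped driven harmonic oscillator, semi-implicit Euler
        a = -nat_freq ** 2 * x - 2 * damping * v + math.sin(drive_freq * t)
        v += dt * a
        x += dt * v
        peak = max(peak, 0.5 * v * v + 0.5 * nat_freq ** 2 * x * x)
    return peak

on_res = peak_energy(10.0)   # drive matches the natural frequency
off_res = peak_energy(17.0)  # mismatched drive: energy never accumulates
```

On-resonance driving builds up orders of magnitude more energy than off-resonance driving, which is the proposed mechanism by which rhythmic, coherent input (meditation objects, music) preferentially loads the brain’s natural harmonics.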

Meditation isn’t the only method to induce build-up of semantically-neutral energy; the “Big Three” are:

Meditation, which seems to work by both increasing excitatory sense-data and decreasing inhibitory top-down predictive models (energy sinks);

Psychedelics, which intuitively may function by disabling existing energy sinks (or perhaps overloading them by increasing baseline firing rates or increasing the branching factor of neural activity);

Music, a sensory input which seems to exist on the knife’s edge between exhibiting highly ordered patterns (some of which will hit natural connectome harmonics and so allow accumulation of energy through resonance) on one hand, and on the other hand not being too predictable (thus dodging most inhibitory top-down predictive models);

Hybrid approaches also exist: e.g. exercise, dance, sex, tantric practices, EMDR, and breath work are essentially combinations of the rhythmic portion of music and the sensory portion of meditation. The fact that psychedelics reliably enhance the potency of each and every one of these practices is not a coincidence, but due to a shared mechanism.[1]

III. Depression as a disorder of annealing; bipolar depression doubly so

To describe depression in one sentence: “Depression is a self-reinforcing perturbation from the natural annealing cycle.” There are two related aspects to this: (1) an inability to anneal normally, and (2) annealing abnormally (more specifically, annealing new attractor basins which are high in dissonance, or annealing a pathological change in energy parameter dynamics).

Most people have a simple model of depression as “being sad all the time” but I think a two-factor model looking at energy parameter and valence offers a lot of clarity and predictive utility. Roughly speaking, this suggests parametrizing depression into three core types:

I. Depression with no high energy states, characterized by a lack of annealing (emotional clarity and dynamism) in general;

II. Depression with high-energy negative states, which over time anneals minds toward suffering and hopelessness;

III. Bipolar depression with high-energy positive & negative states, which over time anneals minds toward the dramatic.

These categories aren’t exclusive or static; too much time in one will increase the probability one may also fall into the others. 

Not annealing frequently enough may be the most important ‘non-obvious’ cause of depression. Brains (especially younger ones, since they’re changing so much) really do need to anneal regularly to pay down their ‘technical debt’, and if they don’t, they grow brittle and neurotic. (Technical debt in the brain builds up as we twist our existing brain networks to accommodate new facts; this debt is ‘paid down’ when we enter high-energy states and let new brain networks which fit these constraints self-organize.) The ‘annealing pressure’ also increases over time, and if a wholesome annealing opportunity fails to present itself, the brain will progressively lower its standards, looking for any opportunity for annealing. Especially if done repeatedly, this can cause long-term damage to the brain’s attractor basin landscape. (We see this in negative coping strategies such as cutting, drama-seeking, and so on: if someone is engaging in such, they’ve probably annealed poorly, and also likely have few realistic opportunities for healthy annealing.) Many forms of entertainment we think of as palliatives in today’s society (e.g. movies, video games) may be weak-and-incomplete-but-still-nonzero drivers of annealing. Not as good as the real thing, but better than nothing if that’s your only option.

At the high-energy extreme, it seems likely and tragic that depression compounds itself by repeatedly causing intense negative emotion (high-energy states) which anneals the brain toward these patterns, and toward assigning salience to the set of problems and types of thoughts (attractor landscape) facing a depressed person — many of which are their own cause and would weaken if ignored. Relatedly, I suspect some CSHW- and music-theory-related math could be found describing how depression anneals what I would call a brain’s ‘connectome key signature’ (CKS) toward a ‘minor key’: an internal logic which feels tragic/hopeless (has fewer harmonious arrangements and progressions), and which the brain then uses as building blocks for its reality.

Bipolar depression seems a little more strange; the extreme highs and lows may in aggregate produce crazier annealing patterns than just one or the other — essentially there’s a ‘tug of war’ between patterns annealed during each extreme, which prioritizes the survival of the class of patterns that exist during both extremely positive and extremely negative states. In practice, over time this anneals a mind’s stories toward the dramatic, and toward reducing the activation energy needed to flip the brain between major and minor keys (the psych literature calls this ‘kindling’). Each of these ‘key signature flips’ would itself release a great deal of pent-up energy, further driving the annealing process. As I note in A Future for Neuroscience:

This is not to say our key signatures are completely static, however: an interesting thread to pull here may be that some brains seem to flip between a major key and a minor key, with these keys being local maxima of harmony. I suspect each is better at certain kinds of processing, and although parts of each can be compatible with the other, each has elements that present as defection to the internal logic of the other and so these attractors can be ‘sticky’. But there can also be a buildup of tension as one gathers information that is incompatible with one’s key signature, which gets progressively more difficult to maintain, and can lead to the sort of intensity of experience that drives an annealing-like process when the key signature flips. And in the case of repeated flips, the patterns which are compatible with both key signatures will be the most strongly reinforced.

In some ways a bipolar brain may confer significant cognitive and creative advantages: perhaps the biggest is more access to high-energy states, which in the short term helps creativity by allowing more exploration and steeper valence gradients to follow, and iterated over the long term allows significantly more optimization pressure on the subsystems that are repeatedly annealed. However, this has corresponding epistemological downsides as noted above; fueling creative work with valence deltas is likely to ‘warp the engine’ over time, to paraphrase Shinzen. Friston’s notion that ‘systems maximizing long-term stability spend most of their time in a small number of states’ seems particularly relevant to mood disorders. (My colleague Andrés suggests this ‘bipolar effect profile’ may be replicated by valence-enhancing drugs with a short duration and hangover, such as cocaine; this at least fits stereotypes.)

I find myself wondering if neuroticism can be thought of as ancient neural technology intended to reduce annealing frequency in the ancestral environment — essentially if we look into the brains of highly neurotic people, we might find strong energy sinks located around natural connectome harmonics which prevent semantically-neutral energy build-up. This likely contributes to certain forms of depression (and leads to pernicious feedback cycles — the less one anneals, the more neurotic one gets, the less able to reach high-energy states one becomes), but might also help prevent seizures or inappropriate updating/annealing, and may have frequency-dependent benefits. E.g., a group with 19 carefree annealers and 1 neurotic guardian will act more wisely than one with 20 carefree annealers or 20 neurotic guardians. The ‘neuroticism=energy sinks’ frame seems to suggest how to reduce neuroticism (anneal more often, especially semantically-neutral annealing), and also offer clues as to how neuroticism is implemented in the brain: we might look into the mathematics of Anderson localization in the connectome: topological features that can ‘eat’ waves.

Is sleep a natural annealing process? If so, this could cleanly explain the connection between depression and chronic sleep disturbances: poor sleep as both a cause and effect of infrequent annealing. And it would indicate a treatment path: a restoration of normal annealing patterns may help improve both mood and sleep. I hold the following lightly, but we might model nREM as the heating-up phase (undamped harmonics) and REM as the neural search & cooling process. From a review drawing parallels between sleep and jhana (intense meditative) states:

This paper is a preliminary report on the first detailed EEG study of jhana meditation, with findings radically different to studies of more familiar, less focused forms of meditation. While remaining highly alert and “present” in their subjective experience, a high proportion of subjects display “spindle” activity in their EEG, superficially similar to sleep spindles of stage 2 nREM sleep, while more-experienced subjects display high voltage slow-waves reminiscent, but significantly different, to the slow waves of deeper stage 4 nREM sleep, or even high-voltage delta coma. Some others show brief posterior spike-wave bursts, again similar, but with significant differences, to absence epilepsy. Some subjects also develop the ability to consciously evoke clonic seizure-like activity at will, under full control. (Dennison 2019)

It seems plausible that broad rhythmic brain activity helps with certain ‘physical housekeeping’ tasks in the brain as well, and if one anneals regularly they may need somewhat less sleep (see recent research on Alzheimer’s, sleep, and rhythmic stimulation helping break up brain plaques).

The ‘dead neuron’ model of neuroticism and depression:

Deep learning models can exhibit ‘dead neurons’: neurons whose activation gets ‘stuck’ in the on or off position, for instance when a sigmoid unit’s input is pushed so high or so low that the function’s slope drops to almost zero. These ‘dead’ neurons can be nigh-impossible to ‘revive’ within the model, since their gradient (implicit sensitivity to input) can be so shallow that there simply aren’t inputs that will nudge them in one direction or another.

[Graphic: the sigmoid function, which loses sensitivity when its input gets too high or too low. Different activation functions can lose sensitivity (lead to ‘dead neurons’) under different scenarios; ReLU is notorious for this.]

These ‘dead’ neurons tend to cause lots of problems, since their “always-on” or “always-off” signal tends to propagate through the network very strongly, causing later neurons in the chain to also exhibit less sensitivity to input. (Sometimes this process will cascade, sometimes not, much like malignant vs. benign tumors.)

I suspect this might be a strong frame for understanding the ‘psychological cruft’ which builds up in brains, and how and why regular annealing is so healthy: over time, sensitive neurons can slide into this broken state, shifting from conditional values to the neurological equivalent of static 0s and 1s. In this case I would expect more neuroticism, less flexible thinking, lower emotional resilience, and worse epistemology from people who haven’t annealed recently: lots of all-or-nothing thinking. But by injecting lots of energy into the system, enough of the internal and external context of these neurons is shifted such that some of them may get ‘reset’ and regain their conditional processing state. At the very least, this self-reorganization process can allow these neurons to move to less-critical points in processing networks. 
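The deep-learning side of this frame can be shown in a few lines (this is standard machine-learning material, nothing brain-specific): once a sigmoid unit saturates, its gradient is effectively zero, so gradient descent can no longer move it.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # derivative of the sigmoid

healthy = sigmoid_grad(0.0)   # ~0.25: the unit still responds to input
dead = sigmoid_grad(20.0)     # ~2e-9: stuck 'on'; no realistic update revives it

# ReLU fails even harder: a unit whose pre-activation goes negative has
# exactly zero gradient, hence ReLU's notoriety for dead neurons.
relu_grad = lambda x: 1.0 if x > 0 else 0.0
print(healthy, dead, relu_grad(-5.0))
```

The annealing analogy is that shaking up the whole system (changing the inputs and context every unit sees) is one of the few interventions that can knock a saturated unit back into its sensitive range.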

An idea related to this frame is that a core function of neural annealing is to maintain a smooth gradient of harmony in the brain (and mind) – to make it possible to “follow your joy” toward better outcomes. If this breaks down and you can’t “follow your joy”, consider putting yourself in a situation which could plausibly kickstart an annealing process (even if you don’t feel emotionally motivated to do so). 

IV. The nature of trauma and the implementation of the Bayesian Brain

Trauma is one of the worst elements of the human condition. It’s easy enough to accumulate that we all have some, and it’s hard to get rid of. But what is it?

Scott Alexander recently reviewed a core work in the PTSD literature, The Body Keeps The Score, and offers some context:

The book stressed the variety of responses to PTSD. Some people get anxious. Some people get angry. But a lot of people, whatever their other symptoms, also go completely numb. They are probably still “having” “emotions” “under” “the” “surface”, but they have no perception of them. Sometimes this mental deficit is accompanied by equally surprising bodily deficits. Van der Kolk describes a study on stereoagnosia in PTSD patients: if blindfolded and given a small object (like a key), they are unable to recognize it by feel, even though this task is easy for healthy people. Sometimes this gets even more extreme, like the case of a massage therapy patient who did not realize they were being massaged until the therapist verbally acknowledged she had started.
The book is called The Body Keeps The Score, and it returns again and again to the idea of PTSD patients as disconnected from their bodies. The body sends a rich flow of information to the brain, which is part of what we mean when we say we “feel alive” or “feel like I’m in my body”. In PTSD, this flow gets interrupted. People feel “like nothing”. …
There’s some discussion of the neurobiology of all this, but it never really connects with the vividness of the anecdotes. A lot of stuff about how trauma causes the lizard brain to inappropriately activate in ways the rational brain can’t control, how your “smoke detector” can be set to overdrive, all backed up with the proper set of big words like “dorsolateral prefrontal cortex” – but none of it seemed to reach the point where I felt like I was making progress to a gears-level explanation. I felt like the level on which I wanted an explanation of PTSD, and the level at which van der Kolk was explaining PTSD, never really connected; I can’t put it any better than that. …
There are a lot of alternative treatments for PTSD. Neurofeedback, where you attach yourself to a machine that reads your brain waves and try to explore the effect your thoughts have on brain wave production until you are consciously able to manipulate your neural states. Internal family systems, where a therapist guides you through discovering “parts” of yourself (think a weak version of multiple personalities), and you talk to them, and figure out what they want, and make bargains with them where they get what they want and so stop causing mental illness. Eye movement desensitization and reprocessing (alternative when the book was written, now basically establishment) where you move your eyes back and forth while talking about your trauma, and this seems to somehow help you process it better. Acupuncture. Massage. Yoga. …
Maybe the most consistent lesson from this book’s tour of successful alternative therapies – keeping with the theme of the title – is that it’s important for PTSD patients to get back in touch with their bodies. Massage therapy, yoga, and acupuncture addressed this directly, usually creating gentle, comfortable sensations that patients could take note of to gradually relax the absolute firewall between bodily sensation and conscious processing.

The simple Neural Annealing take on trauma is that significant negative events can push the brain into a high-energy state filled with ‘trauma patterns’, and as the brain cools, some of these trauma patterns crystallize/anneal in a very durable form, which present as PTSD.

I think this is a more useful answer than what’s out there currently, offering straightforward intuitive answers for (1) what kinds of things are most likely to cause PTSD, (2) why PTSD is so ‘sticky’, and (3) an intuitive solution to PTSD: anneal over the bad patterns with better patterns.

But Scott’s description seems to point at something further: that there’s a disconnection happening with trauma. To address this, I propose the Neural Annealing model for how CSHW could implement the Bayesian Brain model of cognition. We’ll then circle back and discuss what might be going wrong during trauma.

Last year in A Future for Neuroscience, I shared the frame that we could split CSHWs into high-frequency and low-frequency types, and perhaps say something about how they might serve different purposes in the Bayesian brain:

The mathematics of signal propagation and the nature of emotions
High frequency harmonics will tend to stop at the boundaries of brain regions, and thus will be used more for fine-grained and very local information processing; low frequency harmonics will tend to travel longer distances, much as low frequency sounds travel better through walls. This paints a possible, and I think useful, picture of what emotions fundamentally are: semi-discrete conditional bundles of low(ish) frequency brain harmonics that essentially act as Bayesian priors for our limbic system. Change the harmonics, change the priors and thus the behavior. Panksepp’s seven core drives (play, panic/grief, fear, rage, seeking, lust, care) might be a decent first-pass approximation for the attractors in this system. 

I would now add that this roughly implies a continuum of CSHWs, with scale-free functional roles:

  • Region-specific harmonic waves (RSHWs): high-frequency resonances that implement the processing of cognitive particulars, and are localized to a specific brain region (much like how high frequencies don’t travel through walls); in theory quantifiable by simply applying Atasoy’s CSHW method to individual brain regions;
  • Connectome-specific harmonic waves (CSHWs): low-frequency connectome-wide resonances that act as Bayesian priors, carrying relatively simple ‘emotional-type’ information across the brain;
  • Sensorium-specific harmonic waves (SSHWs): very-low-frequency waves that span not just the connectome, but the larger nervous system and parts of the body. These encode somatic information; in theory, we could infer sensorium eigenmodes by applying Atasoy’s method to not only the connectome but the nervous system, adjusting for variable nerve-lengths, and validate against something like body-emotion maps.[2][3]

These waves shade into each other – a ‘low-frequency thought’ shades into a ‘high-frequency emotion’, a ‘low-frequency emotion’ shades into somatic information. As we go further up in frequencies, these waves become more localized.
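Atasoy’s CSHW method is, at its core, an eigendecomposition of the connectome’s graph Laplacian, and the continuum above can be illustrated on a toy graph: low-eigenvalue modes vary smoothly across the whole network, while high-eigenvalue modes oscillate locally. A minimal sketch (the random ‘connectome’ and all parameters are made up for illustration):

```python
import numpy as np

# Toy symmetric "connectome" adjacency matrix (random, purely illustrative).
rng = np.random.default_rng(0)
n = 20
A = rng.random((n, n))
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)

# Harmonic modes = eigenvectors of the graph Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order

# eigvals[0] ~ 0 is the constant mode; small eigenvalues correspond to
# smooth, network-spanning ("low-frequency") harmonics, large eigenvalues
# to rapidly varying, more localized ("high-frequency") harmonics.
low_mode = eigvecs[:, 1]
high_mode = eigvecs[:, -1]
```

In this frame, applying the same decomposition to a single region’s subgraph (a submatrix of A) would yield that region’s RSHWs.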

An interesting implication here is that we may get Bayesian updating to emerge naturally from this typology, through interactions between these various waves: essentially, I think it’s ‘injection-locking all the way down’. (Injection-locking is when harmonic oscillators, like CSHWs, ‘sync up’ their periods and phases.) Specifically:

Low-frequency CSHWs carry priors; higher-frequency RSHWs deal with particulars. Lower frequencies span the brain; higher frequencies resonate within more local regions of the brain — the higher the frequency of the wave, the smaller the region it tends to resonate in. The RSHWs in different regions can’t talk to each other directly, since (definitionally) these waves can’t travel across regional boundaries. But they can talk to each other indirectly, through interacting with low-frequency CSHWs.

More specifically, I speculate that regions and CSHW-encoded priors interact through a power-weighted averaging between CSHWs and RSHWs, as mediated by the math of injection-locking and injection-pulling. This allows both functional partitioning and global updating: regions get some isolation in order to perform their specialized computations, but they also get exposure to data about the overall Bayesian prior situation, aka what we call ‘emotional information’. I.e., Region A syncs up with CSHWs, which carry the information to Region B and sync up with the RSHWs there, and so on.

Of note, there’s a delicate, power-weighted handshake between CSHWs and RSHWs: low-frequency harmonics (emotions / Bayesian priors) carry more power per harmonic (lower due to frequency, much higher due to amplitude), but there are many more high-frequency harmonics (sensory + cognitive particulars). Strong emotions like anger likely pump huge amounts of energy into CSHWs and upend this balance, forcing RSHWs to synchronize with CSHWs. We can think of this as sacrificing the delicate epistemology-harmonization handshake in favor of unity of processing and clarity of action — or put simply, forcing perception to match top-down expectations.
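The ‘sync up’ behavior of injection-locking has a standard minimal model, the Adler equation, which tracks the phase difference between a local oscillator (here, standing in for an RSHW) and an injected signal (standing in for a CSHW). A hedged sketch with invented parameters: locking occurs when the coupling strength K exceeds the frequency detuning, which is one way to read the ‘power-weighted handshake’ above.

```python
import numpy as np

# Adler equation for injection locking: dphi/dt = d_omega - K * sin(phi),
# where phi is the phase difference between a local oscillator (RSHW stand-in)
# and an injected signal (CSHW stand-in). All parameters are illustrative.
def final_phase_difference(d_omega, K, phi0=0.0, dt=0.001, steps=200_000):
    phi = phi0
    for _ in range(steps):
        phi += dt * (d_omega - K * np.sin(phi))  # forward-Euler integration
    return phi

# K > |d_omega|: phases lock at a fixed offset, phi* = arcsin(d_omega / K).
locked = final_phase_difference(d_omega=1.0, K=2.0)

# K < |d_omega|: no fixed point exists and the phase difference drifts forever.
drifting = final_phase_difference(d_omega=2.0, K=0.5)
```

A strong emotion pumping power into a CSHW corresponds, in this toy picture, to cranking K until every local oscillator falls inside the locking range.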

On entropic disintegration, search, and annealing in evolved harmonic systems:

The noisy, stochastic nature of brain activity, along with practical requirements for homeostasis, will lead to strong optimization of the CSHW+RSHW configuration toward local minima which are resistant to change. However, a large enough perturbation will push the system out of this basin (the entropic disintegration step). The neural search step is essentially the system stochastically testing different harmonic configurations; the neural annealing step is the system ‘settling into’ a configuration as its top-down predictive models get sufficiently good at sopping up the excess energy in the system, essentially forming a new basin that will again take a large perturbation to escape. The strength of annealing can be thought of as the steepness of this basin, and also as the Hebbian reinforcement of system attractors (“neurons that fire together, wire together”). Insofar as partitioning is possible in a broadly-coupled harmonic system, these perturbations will tend to be ‘local’, as the brain has strong incentives to preserve structure that doesn’t need updating.
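The metallurgic analogy can be made concrete with a toy simulated-annealing loop (the two-basin energy function and all parameters here are invented for illustration, not a model of any actual neural quantity): a high ‘temperature’ lets the search escape the shallow basin, and as the system cools it settles near whichever minimum it ends up in.

```python
import math
import random

# Two-basin "energy landscape": a shallow local minimum near x = -1 and a
# deeper one near x = +2 (an invented function, purely for illustration).
def energy(x):
    return 0.5 * (x + 1) ** 2 * (x - 2) ** 2 - 0.5 * x

def anneal(x=-1.0, temp=5.0, cooling=0.999, steps=20_000, seed=0):
    rng = random.Random(seed)
    for _ in range(steps):
        x_new = x + rng.gauss(0, 0.3)          # stochastic "neural search"
        dE = energy(x_new) - energy(x)
        # Accept downhill moves always; uphill moves with Boltzmann probability.
        if dE < 0 or rng.random() < math.exp(-dE / temp):
            x = x_new
        temp *= cooling                        # the system cools: annealing
    return x

final_x = anneal()  # ends settled near one of the two minima
```

The cooling schedule plays the role of the brain’s energy sinks: once the temperature is low, only a fresh perturbation can knock the state out of its basin.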

Toward a generalized definition of trauma: a breakdown of information-propagation-via-injection-locking

I propose that sometimes the brain needs to rapidly halt information propagation across regions to prevent cascading system failure (a metaphor that comes to mind is an uncontrolled prion-like change in the local key signature that ripples out from a traumatized region, progressively breaking cybernetic calibrations). I believe the brain uses two interlinked mechanisms to do this: (1) weakening CSHWs, thus weakening information propagation throughout the brain, and (2) arranging different brain regions into frequency regimes which make information transfer difficult between them (the golden mean is the mathematically-optimal ratio for non-interaction). Once this happens, it can be very hard to reverse, since it forms a self-sustaining cycle: (1) causes (2) and (2) causes (1). We call this ‘trauma’.

Some predictions from this: I’d expect to see substantially less energy in low-frequency CSHWs after trauma, and substantially more energy in low-frequency CSHWs during both therapeutic psychedelic use (e.g. MDMA therapy) and psychological integration work. Stretching a little, perhaps we could also apply Atasoy’s CSHW algorithm to individual brain regions and compare their spectrums (and those of CSHWs), to quantify the expected frequency-coupling between each region.[4] Possibly these two measures could be developed into a causal quantitative metric for trauma.
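The first prediction reduces to a very simple quantity: the fraction of total spectral power falling in a low-frequency band. A sketch on a synthetic signal (the sampling rate, band edges, and signal are all made up; real CSHW spectra would come from Atasoy’s pipeline, not a raw FFT):

```python
import numpy as np

def band_power_fraction(signal, fs, f_lo, f_hi):
    """Fraction of total power in [f_lo, f_hi) Hz, via a simple periodogram."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2
    band = (freqs >= f_lo) & (freqs < f_hi)
    return psd[band].sum() / psd.sum()

fs = 250.0                                   # sampling rate (Hz), illustrative
t = np.arange(0, 10, 1 / fs)                 # 10 seconds of synthetic signal
sig = np.sin(2 * np.pi * 4 * t) + 0.3 * np.sin(2 * np.pi * 40 * t)

low_frac = band_power_fraction(sig, fs, 3, 6)  # captures the 4 Hz component
```

Comparing this fraction pre- and post-trauma (or pre- and post-therapy) across the low-frequency CSHW band is the shape of the proposed test.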

This generalized ‘breakdown of communication’ definition of trauma neatly fits with the story Scott tells about PTSD, where people 

[A]re probably still “having” “emotions” “under” “the” “surface”, but they have no perception of them … PTSD patients as disconnected from their bodies. The body sends a rich flow of information to the brain, which is part of what we mean when we say we “feel alive” or “feel like I’m in my body”. In PTSD, this flow gets interrupted. People feel “like nothing”.

It also fits with the therapies that seem to work: EMDR, neurofeedback, Internal Family Systems (IFS), yoga, massage — the consistent thread that connects these is they all plausibly help restart and strengthen communication within the brain (which I hold is strongly mediated by CSHWs). Scott doesn’t mention music, but I’d expect it to be surprisingly effective at boosting emotional integration — and I’d expect the most effective music will have strong low-frequency rhythms.

This shades into novel types of therapeutic approaches: perhaps we could simply pump energy into lower-frequency bands (perhaps harmonic stimulation centered at ~3-6hz) to kickstart emotional integration.[5]

Sidenote on music: The simple description I gave of music was 

[A] sensory input which seems to exist on the knife’s edge between exhibiting highly ordered patterns (some of which will hit natural connectome harmonics and so allow accumulation of energy through resonance) on one hand, and on the other hand not being too predictable (thus dodging most inhibitory top-down predictive models).

Armed with the CSHW/RSHW distinction, we can give this a second pass. In short, I expect the above story to be true, but in a fractal way: music will be hitting both CSHWs and RSHWs. Naturally, different regions will have different sets of harmonics, which means simple tones are unlikely to produce much cross-regional resonance. Instead, the music most effective at increasing the brain’s energy parameter will tie together and layer a diverse set of motifs, with two goals: (1) hitting as many connectome-specific *and* region-specific resonances as possible, and (2) entraining disparate regions and pulling them into sync, essentially using injection-locking to pull RKSs (Regional Key Signatures) into sync with each other and with the CKS (Connectome Key Signature).

Could we quantify what the ‘perfect song’ would be, for a given connectome? Not exactly, since so much of music’s effects rely on getting through the brain’s predictive processing gauntlet and the state of this gauntlet isn’t well-captured by a static connectome, but we could possibly use this framework to design (potentially much) more evocative songs.

It’s also worth noting that better music and better ways to listen to music shade quickly into potential therapies for trauma under this model. 

V. On psychedelics:

As noted above, Neural Annealing suggests a very simple model for understanding the effects of psychedelics: as substances which “may function by disabling existing energy sinks (or perhaps overloading them by increasing baseline firing rates or increasing the branching factor of neural activity),” dramatically increasing semantically-neutral energy. Psychedelics share a ‘characteristic feeling’ (and characteristic emotional aftereffects) with each other and with activities such as meditation, listening to music, EMDR, breath work, and so on, because all of these things increase the energy parameter of the brain. Psychedelics are particularly interesting because they do this so powerfully, effortlessly, and noisily (with the effects bleeding over into sensory modalities, not just accumulating in harmonics).

A full Neural Annealing model of psychedelics will have to wait a few more months as internal QRI discussion settles on a unified story. But a few preliminary notes:

First, we could define ‘psychedelics’ in a principled way, as any substance, pattern, or process that produces semantically-neutral energy accumulation – anything that disables, overloads, or avoids the brain’s energy normalization system. The implication here is interesting, that anything that adds semantically neutral energy into the brain should produce psychedelic effects, regardless of how this is done. E.g., even things like modern art may be classifiable as a psychedelic, insofar as it generates semantically-neutral energy (see Gomez Emilsson 2019). But we should also note that current psychedelics are not necessarily perfect sources of ‘clean semantically-neutral energy’; they’re substances that happen to massively increase the energy parameter of the brain, with no guarantees about how ‘balanced’ this boost is. There may be better and more targeted methods to do this in the future. In the meantime, I would recommend modest caution with substances which involve a hangover after use, as negative valence or affective blunting during a critical window could ‘sour’ the annealing process with subtle long-term mood effects.[6]

As mentioned above, I’ve been thinking more and more that the core psychological changes driven by psychedelics are best understood in terms of the amount and ‘statistical flavor’ of the semantically-neutral energy they add to the system. Or, as an alternate framing, psychedelics may be best understood as temporary disrupters of the brain’s natural energy sinks, each with a specific target or ‘flavor’ of disruption (or psychedelics may add to neural activity’s ‘branching factor’, which in turn will add a specific flavor to the energy). I also find myself wondering, all else being equal, whether psychedelic visuals actually are inversely correlated with annealing effects, since by diverting energy into the visual system (which plausibly has very effective energy sinks), there is less energy available to drive entropic disintegration.[7]

As I noted in A Future for Neuroscience, another starting point for sorting through psychoactive drugs would be 

[T]o parametrize the effects (and ‘phenomenological texture’) of all psychoactive drugs in terms of their effects on the consonance, dissonance, and noise of a brain, both in overall terms and within different frequency bands (Gomez Emilsson 2017).

In the long term, we’ll want to move upstream and predict connectome-specific effects of drugs, treating psychoactive substances as operators on neuroacoustic properties which produce region-by-region changes in how waves propagate in the brain (and thus different people will respond differently to a drug, because these sorts of changes will generate different types of results across different connectomes). Essentially, this would involve evaluating how various drugs change the internal parameters of the CSHW model, instead of just the outputs. Moving upstream like this might be necessary to predict why e.g. some people respond well to a given SSRI while others don’t (nobody has a clue how this works right now).

Possibly this would allow us to generate a principled typology of psychoactives, and also check for missing quadrants: psychoactives and psychedelics we haven’t discovered or created yet. (See also Andrés’s notion of parametrizing the ‘information vs energy trajectory’ of a trip.) We can also think of anti-psychotic drugs as anti-psychedelics: substances that rapidly decrease the energy parameter of the brain (Gomez Emilsson 2019). We at QRI strongly believe this makes anti-psychotics more dangerous than commonly realized: the neural search process is complex and delicate, and an externally-forced, uneven rapid cooling process may warp the internal landscape of the brain in subtle but deleterious ways.  In theory, we could test this indirectly by evaluating the effects of anti-psychotics on sensory integration tasks in healthy controls – but as noted above, this may be an unethical experiment.

Another frame would be ‘psychedelics as full-spectrum resonance agents’: CSHWs are meant to substantially resonate during normal human operation (falling in love, orgasm, etc.); RSHWs are not. The perceptual and epistemological changes we sometimes see during psychedelics could be due to the fine logical machinery that usually deals with high-context sensory particulars (facts and logical inferences) starting to malfunction as its natural eigenmodes are activated. Like linking and rhythmically flipping all the bits in a memory register, ignoring what that register is “supposed to” compute. If psychedelic visuals are an example of RSHW resonance, HPPD may be an example of this RSHW resonance annealing into durable patterns.

On MDMA’s strangely powerful therapeutic effects, I’d suggest MDMA shares the ‘basic psychedelic package’ with substances like LSD and psilocybin (albeit a little weaker at common doses). Anything with this ‘baseline’ package significantly increases the energy parameter of the brain, which both allows escape from bad local minima and canalizes the brain’s core CSHWs, both of which should be highly therapeutic. My intuition is MDMA may also have a particular effect on stochastic firing frequencies of neurons, and that this effect essentially acts as an emergent metronome – and this metronome will drive synchronicity between diverse brain regions. Given the presence of such a region-spanning ‘clean’ metronomic signal, brain regions that have partially ‘stopped talking to each other’ will re-establish integration, and some of this integration will persist while sober (or rather, some of the reasons for the lack of integration will have been negotiated away during the MDMA-driven integration). Plausibly this ‘emergent metronome’ effect may also underlie the particular phenomenological effects of 5-MeO-DMT, particularly in terms of sense of unity, high valence, and therapeutic potential.[8]

Somewhat poetic sidenote: on taking psychedelics:

In the abstract I think psychedelics are more powerful, more dangerous, and more healing than commonly assumed.

But we don’t live in the abstract. The natural question for any given person is thus: should I take them?

There’s no one-size-fits-all answer, and I recommend checking with local laws. But I can share a simple heuristic for who shouldn’t worry too much about the downsides of psychedelics and who should be very careful: do you trust your own aesthetic?

Psychedelics massively increase the ‘energy parameter’ of the brain, so naturally there’s a large amount of very-high-dimensional exploration going on. There are countless ‘micro-choices’ your brain makes as to how to anneal after this exploration: we can think of a person’s ‘aesthetic’ as individual variance in these annealing choices. What the self-organizing system which is the brain’s subconscious finds beautiful in the moment and implicitly strives to save.

Sometimes, and in some people, we want the right things, we find the right things beautiful. Things that have a deep elegance and fit with everything about us and fit with how reality works. We just need enough energy parameter to get there. Psychedelics are a great way to get there.

Other times, we might not want the right things. Evolution is kind of a jerk, epistemologically speaking: it cares much more about genetic reproduction than it does about deep coherence and calibration with reality and such. Sometimes we’re at a functional local maximum, but we’re not pointed in the right direction globally, and frankly speaking our lack of a high energy parameter is our saving grace – our inability to directly muck up our emotional landscape. Insofar as this is true – and it will be more true at certain times than others, and in certain people than others, and perhaps in certain combinations of people than others – using psychedelics to crank the energy parameter is not good for a person. Our ‘Psychedelic Extrapolated Volition’ (PEV) is not a healthy vector.[9]

The natural follow-up is, how do you know whether your PEV is positive or not? 

Hard question, but probably good to ask your friends – group epistemology seems healthy in these cases. And in general it seems strongly preferable to err on the side of caution. You can always take that LSD tomorrow, or next week, or next year.

(But, don’t be too paranoid about one trip permanently breaking your brain, either. My guess is the annealing that tends to ‘stick’ is that which actually finds better local minima (thankfully) – if it’s an unsuccessful exploration I suspect the system can usually climb back to where it was (with some caveats).)

A separate factor is your current energy parameter and how psychedelics may increase this baseline: if you’re dragging on the bottom of your energetic attractor basins, maybe a little kick could be healthy. But if you’re already ‘high on life’ –  consider skipping the LSD and MDMA. Increasing a high baseline can redline the system into exquisitely unbearable intensity.

VI. Love and other types of Neural Annealing

It’s important to note that most annealing doesn’t happen in a vacuum: just as “set and setting” matter quite a lot for psychedelics, and for emotional updating in general, the importance of context in the annealing model is hard to overstate. Much as holding a magnet close to iron as it cools can magnetize the metal, the intentional content present when entropic disintegration->annealing happens provides important constraints for which new patterns form. I propose there are four general types of neural annealing:

A. Annealing to an object or event. Annealing which is ‘pointed at’ something is by far the most common type. Some object, or event, or new insight makes itself known in a surprising or otherwise intensely salient way, and this pushes the brain into a high-energy state, kickstarting a self-organization process for accommodating the presence and significance of this new thing. This can involve intense positive emotion — a new romantic partner, the birth of your child, your wedding day. This sort of annealing can also be caused by trauma — getting bitten by a weird animal, social rejection, losing a loved one. As I suggested in The Neuroscience of Meditation, neural annealing may offer a rather pithy description of love:

Finally, to speculate a little about one of the deep mysteries of life, perhaps we can describe love as the result of a strong annealing process while under the influence of some pattern. I.e., evolution has primed us such that certain intentional objects (e.g. romantic partners) can trigger high-energy states where the brain smooths out its discontinuities/dissonances, such that given the presence of that pattern our brains are in harmony. This is obviously a two-edged sword: on one hand it heals and renews our ‘cold-worked’ brain circuits and unifies our minds, but also makes us dependent: the felt-sense of this intentional object becomes the key which unlocks this state. (I believe we can also anneal to archetypes instead of specific people.)
Annealing can produce durable patterns, but isn’t permanent; over time, discontinuities creep back in as the system gets ‘cold-worked’. To stay in love over the long-term, a couple will need to re-anneal in the felt-presence of each other on a regular basis. From my experience, some people have a natural psychological drive toward reflexive stability here: they see their partner as the source of goodness in their lives, so naturally they work hard to keep their mind aligned on valuing them. (It’s circular, but it works.) Whereas others are more self-reliant, exploratory, and restless, less prone toward these self-stable loops or annealing around external intentional objects in general. Whether or not, and within which precise contexts, someone’s annealing habits fall into this ‘reflexive stability attractor’ might explain much about e.g. attachment style, hedonic strategy, and aesthetic trajectory.

Perhaps we can go further now, and hypothesize that ‘falling in love’ is a specific algorithm the brain runs, which is triggered when the ‘felt sense’ of another person (a pattern distributed across RSHWs, CSHWs, and SSHWs) produces substantial systemic resonance. When this happens, and in the absence of warning signs (dissonance), a person will actively seek to fill their sensorium with this signal, which amplifies the systemic resonance (potentially to extreme levels) and further synchronizes priors and other regions into harmony with the original pattern. As you fall in love, you literally anneal to your felt-sense of that person – you take their rhythm as yours, because your body judged it to be so. A key which fit your connectome’s lock. This will naturally do two things: (1) fuzz boundaries between lovers, as patterns progressively synchronize, and (2) add a harmonic echo, or ‘warm consonant glow’, to all thoughts about the person. This latter phenomenon will feel nice, but also keep itself stable: the presence of this bundle of synchronized frequencies will stabilize (via injection-locking) many forms of drift – effectively preventing certain thoughts/perceptions. This may fade over time if not refreshed, but perhaps to completely ‘fall out of love’ the brain has to build a competing key signature elsewhere, e.g. in a golden mean ratio to this harmonic echo, and these rivalrous key signatures (implicitly, Bayesian priors about what is real and what is good) battle it out. (Thanks to Andrés for discussion on competing key signatures.) This ‘de-annealing’ process – literally erasing someone’s patterns and rhythms from your body – can follow several trajectories, few of them pleasant, as the system renegotiates new (or old) equilibria.

B. Annealing to an ontology. A much more general type of annealing is when the entropic disintegration->annealing process is pointed toward an ontology, and the brain reorganizes its internal structure (‘ontological contours’) to accommodate this new ontology. This can happen implicitly and weakly, over the course of entropic disintegration->annealing to multiple separate ideas, or explicitly and strongly, for instance by reading some book in college which completely reshapes one’s view of reality.

Any craftsman, any intellectual, any philosopher worth their salt is strongly annealed toward at least one nuanced ontology, and in fact much of the influence of the Great Philosophers can be found in how they’ve laid out their thoughts in a way that others can use as a coherent annealing target. What makes something a good annealing target? I’d offer it’s the presence of clear archetypes arranged in both a novel but ultimately cognitively efficient way. These archetypes can be thought of as a combination of nature (innate Jungian-type limbic resonances) and nurture (prior annealed patterns & cultural reifications).

An important point here is that people’s conception of where goodness comes from is dependent upon their ontology; change the ontology, change the perceived nature of goodness itself! See e.g. John Lilly’s discussion of the supra-self-metaprogrammer (SSMP). This frame-shift can also manifest at the extreme end of falling in love, where all the world’s goodness seems to come from your special person (a dangerous thing).

C. Social annealing. A special hybrid of annealing to an ontology and to other people is social annealing, wherein a group of people undergoes the ‘entropic disintegration -> neural search -> annealing’ process together, within some shared context – a religious service, a sporting event, a retreat. This seems like the natural mechanism by which tribes are formed (loosely speaking, group synchronization of connectome-specific harmonic wave dynamics) and underlies many of our most sacred experiences. The power of social annealing is such that a religious experience that lacks it no longer feels like a religious experience – merely the mouthing of dogma. On the other hand, any group experience that does increase the group’s energy parameter and trigger annealing starts to take on a pseudo-religious frame – e.g. ecstatic dance, festivals, protest marches, even concerts.

D. Semantically-neutral annealing. Almost all neural annealing is semantic annealing, or annealing toward some intentional object. This process is pointed at something, often the thing that caused the entropic disintegration process in the first place, be it a person, an event, an idea, an ontology. But there’s nothing in the laws of neuroscience that implies annealing has to have an intentional object as a focus. As per Section II, I believe this is a particularly healthy form of annealing.

Toward a new psychology & sociology?

Speculatively, we may be able to re-derive much of psychology and sociology from just the energy-parameter view of the brain: e.g.,

Gopnik 2017 suggests that different developmental windows may involve different implicit ‘heat parameters’ for simulated annealing, with young people having higher parameters. Speculatively, this may correspond to a different ‘lived intensity of experience’ at different ages – young brains (and lifelong learners) might not only be more plastic than average, but may actually be having experience that is objectively more visceral. One way to frame this is that being young is like microdosing on LSD all the time. This could have interesting implications for ethics.

Most likely, there’s been significant recent sexual selection for a higher energy parameter, for several reasons:

  • Selecting for neoteny plausibly also implicitly selects for an energy parameter that starts higher and/or decays less with age;
  • A high energy parameter would be a good proxy for cognitive-emotional-behavioral dynamicism, perhaps the most strongly sexually-selected-for trait;
  • A high energy parameter would be an honest signal of not being in a bad ‘iterated aesthetics’ attractor (otherwise they would have self-destructed previously).

Psychology has various personality metrics, with the most widely used being the Big 5, also known as OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism). One of the most interesting subfindings here is that we can still get reasonable predictive utility if we collapse these into a one-variable model: the ‘Big One’ personality factor. Scoring high in this factor is “associated with social desirability, emotionality, motivation, well-being, satisfaction with life, and self-esteem.” Scoring low is associated with depression, frailty, lack of emotionality, and so on. I wouldn’t be surprised if the ‘Big One’ simply tracks how frequently and deeply someone anneals.

Continuing the thread on Social Annealing, I think we can push into sociology with the Neural Annealing model too; to understand a society, we need to understand how and when annealing happens in that society. To gauge the wisdom of a society, look at how its decision-makers anneal; to gauge the cultural direction of a society, look at how its young people anneal. To understand the strongest social bonds of a society, look at the contexts in which group annealing happens.

This also suggests why drugs like alcohol and certain psychedelics are ritualistically celebrated in so many cultures: they allow social-annealing-on-demand, a key technology in building and maintaining social cohesion and coordination.

Likewise, we could envision a field of ‘social archeology’ evaluating annealing patterns in the past: how often did peasants and nobles in the Middle Ages anneal? In what contexts did the annealing happen, and which institutions controlled them? Perhaps most political conflicts could be reinterpreted as conflicts over annealing.[10] And so on. My colleague Andrés has suggested that a good rule of thumb for identifying annealing (and making good movies) is that intuitively, annealing defines where you should actually point the camera if you were making a movie of a historical period, since where annealing is happening is where changes that ‘matter’ are taking place: cognitive updates, decisions about how to feel, and so on.

On the effect of profession on emotional vibrancy: It would be somewhat surprising if certain repeated computational tasks didn’t tend to push regions’ key signatures into being highly coupled (=intense emotions), whereas other classes of tasks push regions’ key signatures into fairly orthogonal configurations (=‘white noise’ as emotional state). A lifetime of dance or poetry might literally make you feel emotions more strongly; a lifetime of doing accounting might literally produce a segmented brain and affective blunting. From Darwin’s autobiography:

I have said that in one respect my mind has changed during the last twenty or thirty years. Up to the age of thirty, or beyond it, poetry of many kinds, such as the works of Milton, Gray, Byron, Wordsworth, Coleridge, and Shelley, gave me great pleasure, and even as a schoolboy I took intense delight in Shakespeare, especially in the historical plays. I have also said that formerly pictures gave me considerable, and music very great delight. But now for many years I cannot endure to read a line of poetry: I have tried lately to read Shakespeare, and found it so intolerably dull that it nauseated me. I have also almost lost my taste for pictures or music. … I retain some taste for fine scenery, but it does not cause me the exquisite delight which it formerly did. …
This curious and lamentable loss of the higher aesthetic tastes is all the odder, as books on history, biographies, and travels (independently of any scientific facts which they may contain), and essays on all sorts of subjects interest me as much as ever they did. My mind seems to have become a kind of machine for grinding general laws out of large collections of facts, but why this should have caused the atrophy of that part of the brain alone, on which the higher tastes depend, I cannot conceive. 

Reading this account, I find it plausible that Darwin repeatedly pushed (and annealed) his mind toward RSHW-driven ‘clockwork piecemeal integration’ interactions rather than CSHW/SSHW-driven global symmetry gradients, although Darwin’s age, sickness, and depression may have also contributed. A warning sign for us theorists and systematizers.


Neural Annealing is a neuroscience paradigm which aims to find the optimal tradeoff between elegance and detail. It does this by identifying a level of abstraction which supports parallel description under three core principles of self-organization: physical self-organization (around connectome resonances), computational self-organization (around minimization of surprise), and energetic self-organization (around conditional entropic disintegration).

There is yet much work to be done: in particular, there are huge bodies of literature around receptor affinities, network topologies, regional anatomies and cell types, and so on. The promise of Neural Annealing is it’s not only a predictive and generative theory in its own right, but it provides a level of description by which to connect these disparate maps, and an extensible context to build on as we add more and more detail to the model.

Finally, we can ask: why does good neuroscience matter? I would offer the following.

The future could be much better than the present. Much better.

Material conditions are only very loosely coupled with well-being. If life is to be radically better in the future, it will be due to better neuroscience pointing out how we can be kinder to ourselves and others, and future neurotechnology changing the hedonic calculus of the human condition.

A unified theory of emotional updating, depression, trauma, meditation, and psychedelics may give us the tools to build a future that’s substantially better than the present. This has been my hope while writing this.

–Michael Edward Johnson, Executive Director, Qualia Research Institute


[1] The ‘semantically neutral energy’ model also suggests why transcranial magnetic stimulation (TMS) seems to help treat depression – essentially, TMS injects a large amount of energy into the brain, and this energy (1) triggers some entropic disintegration, allowing escape from bad local minima, and (2) may slightly collect in the brain’s natural harmonics, which may help pull the brain out of dissonant equilibria. Note that this could be done much more effectively: instead of the present strategy of using a quick flash of unpatterned, pulsed TMS (e.g., 5 seconds @ 100hz) which overpowers the brain but quickly dissipates and likely doesn’t lead to a significant build-up in harmonics, we could instead try an entrainment approach via lower-power, rhythmic, continuous TMS, applied for longer durations (keeping the brain above its ‘recrystallization temperature’ for longer, allowing a fuller self-organization process), perhaps paired with music.
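The dynamic this footnote invokes — inject energy to escape bad local minima, then cool so the system settles into a better configuration — is the same one used in the simulated annealing algorithm. A minimal sketch, where the toy energy landscape and cooling schedule are illustrative assumptions, not a model of the brain:

```python
import math
import random

def simulated_anneal(energy, x0, temp=5.0, cooling=0.995, steps=5000):
    """Minimize `energy` by accepting uphill moves with probability
    exp(-dE/T): high temperature lets the walker escape local minima,
    and gradual cooling 'recrystallizes' it into a (hopefully) deeper one."""
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        x_new = x + random.gauss(0, 1)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill ones stochastically.
        if e_new < e or random.random() < math.exp((e - e_new) / temp):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
        temp *= cooling  # cooling schedule
    return best_x, best_e

# A toy landscape: a shallow local minimum far from the deeper global one.
landscape = lambda x: 0.1 * (x - 2) ** 2 + math.sin(3 * x)

random.seed(0)
x_best, e_best = simulated_anneal(landscape, x0=-2.0)
```

A greedy (zero-temperature) walker started at the same point would tend to stay trapped near its starting basin; the high-temperature phase is what buys escape, at the cost of temporarily accepting worse states.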

[2] Thanks to Andrés for the idea about somatic information, and the suggestion of sensorium as the label.

[3] I suspect that muscle tension could be a core mechanism for regulating SSHWs and perhaps CSHWs. Tensing muscles will strongly influence body resonance, and one’s body resonance configuration will likely have ripple effects on what sorts of frequencies persist in the brain. This suggests that traditions such as yoga are basically right when they posit a link between problems in muscles and problems in the mind: we may hold tension in one system in order to compensate for a problem in the other. Speculatively, this compensatory regulation may also be found across humans, especially in pair bonds: that tension in your back might in some literal way be an attempt to help your partner with their emotional regulation. This would suggest muscle tension should change significantly after a break-up. (Thanks to Emily Crotteau, Lena Selesneva, and Ivanna Evtukhova for pieces of the puzzle here.)

[4] My colleague Andrés suggests that “[A] more direct method, though perhaps more difficult, would be to look directly for the spectral signatures of injection locking — we’d predict you will see a seriously diminished degree of injection locking signatures on people who are heavily traumatized, and see it come back after MDMA therapy.”

[5] Perhaps we could model Persistent Non-Symbolic Experience (PNSE) as persistent partial injection locking of key regions by low frequency CSHWs: essentially this would involve entraining (and effectively partially disabling) the machinery that usually handles interpretation of certain particulars / cognitive interpretations. Perhaps highly neurotic or traumatized individuals with strong top-down control exhibit the opposite: essentially trying to entrain CSHWs to a specific region (with predictably poor results).

[6] My colleague Andres also recommends against “psychedelic substances that have as part of their activity profile a high level of body-load, such as nausea and cramps as these patterns might themselves become annealing targets (cf. compounds notorious for this, according to PsychonautWiki such as 2C-E, 2C-T-2, and 2C-P, are probably best avoided as therapeutic aids).”

[7] On psychedelic tolerance: if the semantically-neutral energy model of psychedelics proves out, we should also be open to subtle corollaries: e.g., to what extent is the temporary tolerance effect of psychedelics biochemical (depletion of some neurotransmitters, per the current story) and to what extent is it information-theoretic — associated with the release and depletion of systemic sources of Free Energy? I.e., there is potential energy of a sort liberated when the system finds a better local minimum, and if the system has undergone strong annealing recently, there are fewer such ‘energetic free lunches’ around to help power the psychedelic effects. (Hypothesis held weakly, as my colleague Andrés points out there are psychedelics which do not trigger tolerance, such as N,N-DMT and 5-MeO-DMT.)

[8] HT to Steve Lehar for pointing at this ’nystagmus’ phenomenon as being somehow linked to MDMA’s mood-lifting effect, and to Andrés for calling my attention to Lehar’s work and suggesting 5-MeO-DMT may also share this mechanism.

[9] This is a reference to Eliezer Yudkowsky’s “Coherent Extrapolated Volition” (CEV) concept, which is an attempt to sketch a heuristic for how to use a radically-powerful optimization process (such as an AGI) safely. Essentially, CEV suggests we could aggregate all human preferences (volitions), find some way to merge them (make them ‘cohere’), then repeat (extrapolate), until we get to a self-stable loop. A ‘psychedelic extrapolated volition’ is a variation of this: if it becomes easier to change yourself on psychedelics, and then that person you turn into can change themselves into someone else, and so on, where do you end up? What generates a ‘positive vector’ here?

[10] This naturally and unfortunately makes the access to and contexts of social annealing an axis of cultural conflict: those who control these events control the emotional tone and contours of coordination of a society. Taking away healthy annealing contexts from your opponents and giving more social annealing opportunities to your people is a key (but also very dirty) way to ‘win’ the culture war. (Perhaps the opioid crisis, and the crack-cocaine crisis before it, could in some sense be exacerbated by a lack of healthier annealing opportunities.)


For attribution in academic contexts, please cite this work as:

Michael Edward Johnson, “Neural Annealing: Toward a Neural Theory of Everything”, https://opentheory.net/2019/11/neural-annealing-toward-a-neural-theory-of-everything/ , San Francisco (2019).


I’d like to thank Andrés Gómez Emilsson for many great conversations on annealing (and first calling my attention to the term), energy sinks, and countless other topics, and offering careful feedback on a draft of this work; Robin Carhart-Harris and Karl Friston for a beautiful description of simulated annealing; Romeo Stevens for wide discussion about annealing & ontologies; Adam Safron for introducing me to the depth of explanation afforded by predictive coding, pointing me toward injection locking, and many great conversations in general; Quintin Frerichs for his hard work toward making therapeutic applications of this theory real, and the rest of the QRI team for support and inspiration; Milan Griffes for careful feedback on a draft of this work; Alex Alekseyenko and James Dama for discussions about simulated annealing; Anthony Markwell for sharing the Buddhist Dhamma with me in such a thoughtful and generous way; Justin Mares for his constant curiosity and encouragement; my parents, for their endless love and patience; Lena Zaitseva and Lena Selesneva for their warmth and support; and especially Ivanna Evtukhova, who has made my life radically better and whose love, energy, and obsession with Buddhist enlightenment was why this work happened.

To gratitude.

Timeline: most of this document written ~Feb-April 2019, as a continuation of The Neuroscience of Meditation and this talk, and shared internally and with select reviewers; section dealing with trauma written July 2019. Document reordered for flow and polished in Oct-Nov and posted Thanksgiving 2019.


Preserving Practical Science (Phronesis)

November 29, 2019 - 01:37
Published on November 28, 2019 10:37 PM UTC

This is my first commentary on the previous Saint Louis Junto Meeting whose notes are here. Since a meetup can cover diverse topics, I have decided that I will not include running commentary in the meeting notes, and instead reflect upon a few of the discussions in subsequent posts. If I like the results of this procedure, I will stick with it.


There is a certain type of knowledge gained from experience which is different from scientific theory and from rudimentary skills, but which is at the same time skill-based and scientific. Here are the examples we came up with in our meeting:

i. Iron working, guilds and trades, etc. functioned through an unbroken apprentice system. It takes a blacksmith to make a blacksmith. The non-academic approach makes learning what was done at any particular time difficult. High-iteration processes generally have this feature. What fields today are very difficult to learn about without doing?

ii. NASA’s Apollo Program was so “do engineering” focused that it took a long time after the fact to figure out exactly how all the parts were made to fit together.

iii. Nuclear scientists today are monitored so that their know-how is put to use only in approved countries; the same goes for WMD specialists, etc.

iv. International students in the US have limited access to highly advanced fields and the most cutting-edge work. Sometimes this is for national security reasons, but mostly for patent protection and the intellectual property rights of professors and grant-making agencies. It's not that other countries can't figure out these technologies in principle; it's that implementation is somehow a different category of knowledge.

The Greek word is phronesis, usually translated as 'practical wisdom' or 'prudence', which is distinct from rudimentary skills (techne) and scientific knowledge (episteme). These three categories of knowing were codified by Aristotle, and we should probably be careful about applying them to today. Nonetheless, they are a decent starting point for thinking about a type of knowledge that is sometimes hard to preserve, and extremely expensive to recover once lost.

I came up with some additional examples that are not in scientific fields.

i. How to raise a family. A community of friends of different ages and generations develops an art of living and takes care to provide advice and help to young parents so that they can do well and ensure the children have the resources and opportunities to succeed. When a community does not have this, it becomes extremely hard to overcome poverty.

ii. A business becomes the best at doing some activity, say, HR consulting, because the managers collectively have more experience executing projects of this type than any other business AND consequently they have developed more effective processes than their competitors. If the business goes belly up, what are the chances that the unique know-how developed in the business, but not in the head of any one person, survives?

Society has done a few things to make knowledge more durable.

1. Put it on the internet. Much stuff is on the internet. There is a lot of advice out there, some good, some bad. But for everyday know-how the internet is everyone's one-stop shop.

2. Internal audits. Perhaps your information is proprietary, secret, or classified; internal audits ensure that what is happening within the organization is codified and recorded so that it is at least possible to go back in time and figure out what was happening.

3. Good Old Education. There are probably more technical skill programs in existence today than at any time in history. While learning the skill is not exactly the same as getting the experience, it is necessary for the re-creation of that practical wisdom.

To 1: The internet is a great source of knowledge. But I am not prepared to say it is a great source of highly specialized know-how. Is how to be a good University president an easily discoverable and widely discussed skill on the internet? No. How about Governor of a State or Mayor of a City? These obviously require skill and know-how, but the practical wisdom is extremely tied up with the subtle particulars of each individual situation. One might rightly call these fields 'overdetermined' and the people who hold these positions are frequently 'overfitted' for the position through a network of 'who-you-know' stretching back at least one generation.

For many business issues, no one on the internet has already addressed your specific situation, because if there are 15 binary factors at play, then it is likely no one has encountered this exact combination before. The same goes for scientific researchers and political operatives and all the highly specific factors in your life. You will get general guidance by looking at the archives or a mathematical model, but never a specific answer. Your own judgement is ultimately required; hopefully you have a community in the know to discuss it with... I guess that's why we invented conferences. Yet somehow I don't think conferences are the grand answer to breaking open narrow communities of expertise.
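To put a number on the "15 binary factors" point: each independent yes/no factor doubles the count of distinct situations, so 15 of them already yield tens of thousands of combinations — far more than any archive of written-up cases is likely to cover:

```python
# Each independent binary factor doubles the number of distinct situations.
factors = 15
situations = 2 ** factors
print(situations)  # 32768 distinct combinations
```

And the growth is exponential: at 30 factors the count passes a billion, which is why "someone has written about exactly my situation" stops being a safe bet so quickly.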

To 2: Internal audits, done well, ensure that everything is being recorded and essential aspects of the system can be put back together again. How to run the business is not usually part of the audit, but the process for creating the product is. Preserving that is essential. Keep doing it! However, there is a danger that mergers and acquisitions will screw up the knowledge contained in the processes, misplace that hidden information, and then do things worse forever.

Furthermore, the preservation of old information is a hard problem. Doing it without the internet is even harder.

To 3: Two complaints about education. "We teach quickly outdated skills. Technical training is useless if students don't know how to reason." And "Everything I learned in school was theoretical, and though interesting, had no practical value!" Robert Heinlein's most divisive quotation, I think, concerns the middle ground: the type of practical competence gained after years of both practicing and theorizing.

A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyse a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.

I appreciate the spirit behind Heinlein's quixotic list. A society with such skills more widely distributed could only come to be if communities of experienced practitioners were constantly enrolling inexperienced amateurs into their ranks. Even so, there can never be enough experts to train the next generation of novices in a local area. Much specialization is required.

In conclusion, our society preserves and extends practical wisdom through internet blogs and vlogs, conferences, professional societies, internal audits, research journals, and educational programs. If we can make sure these things happen in our own fields of expertise, then we can go a long way to making sure the future is not condemned to reinventing the wheel, writing, or actuarial science.


Order and Chaos

November 29, 2019 - 00:27
Published on November 28, 2019 9:27 PM UTC

Follows from:

Map and Territory

Babble and Prune

Warning: I strongly recommend not using the concepts in this sequence to try to build a generalized artificial intelligence. These concepts describe how humans function as conscious entities, and humans are not known for being safe or friendly to human values. Human limitations currently provide the only check on human abuse of power, and building something with human cognitive abilities but without those limitations would be ill-advised.


This series of articles is about applied metacognition, laying the groundwork for developing the skills to approach and effectively handle any type of problem or situation.

The concepts in this particular article may already be familiar. I’m presenting them here because we will use them in later articles to derive the different types of skills and explain how those skills work and how they fit into our toolbox.

The Map and the Territory

A previous article, No Really, Why Aren’t Rationalists Winning?, established that skills are highly compressed procedural information. In our sequence premiere, The Foundational Toolbox for Life: Introduction, we looked at them from a different angle: skills start as paradigms which filter out information. We develop our paradigms into skills by calibrating them with experience to produce useful answers to problems in a practical time frame.

Now we will look at the map/territory distinction and how we use it to define the basic building blocks of cognition itself. Once we’ve done that, we can finally move on to what we can build with those blocks.

Every skillset, from science to art to athletics to management, requires an explicit or implicit mental model of certain aspects of the world: a map. Every person has at least one map in their brain, which represents the territory that is the real world, or at least the part they deal with. The map lets a person predict the outcomes of their actions, and thereby allows them to effectively navigate the territory and change it in pursuit of their desires. Without the map, there would be no way for a person to predict which options lead to desirable outcomes.

Even primitive life forms have evolved rudimentary maps. Their instincts represent the effects of their potential responses to specific stimuli on the probability that they will survive and reproduce. The correlations encoded in these instincts are a narrow, low-resolution map of the organisms’ native environments.

However, instinct maps are updated by the processes of mutation and natural selection—in other words, chance and death. Each individual is stuck with an instinct map that either succeeds or fails fatally. Humans generally want to improve their models of reality in less lethal ways, so they use their more sophisticated neural hardware to learn about the world and update their maps on the individual and cultural levels rather than on the species genome level.

Order and Chaos

The map’s relationship with the territory creates a fundamental dichotomy that helps define every tool in our toolbox: the duality of order and chaos.

“Order” represents the degree to which the map accurately reflects the territory. This accuracy is measured by how well the map makes predictions. In short, order is what we say we “know”. When we speak of requirements and limits, what must happen or what cannot happen, we are speaking of order.

Additionally, “order” can refer to how easily knowledge and information can be compressed, or how much information we can derive from a small sample size. Patterns across time or space are called “orderly” because knowing only a fraction of the pattern can enable us to predict the rest. For all territories of a given size, the more orderly ones require fewer bits of information to describe them with a map. For example, a bilaterally symmetrical object allows you to predict what is on one side if you have already seen the other side, so you can describe it in full by showing only one side and defining the plane of symmetry. The map of a particularly orderly territory might translate to a few sample data points and a relatively simple rule.
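This compressibility notion of order can be made concrete: a general-purpose compressor needs far fewer bytes for a patterned (orderly) sequence than for a random (chaotic) one of the same length. A small illustration using Python's zlib (the exact byte counts depend on the compressor and are not meaningful in themselves):

```python
import random
import zlib

random.seed(0)
n = 10_000
orderly = b"ab" * (n // 2)  # a repeating pattern: a fraction describes the whole
chaotic = bytes(random.randrange(256) for _ in range(n))  # patternless noise

orderly_size = len(zlib.compress(orderly))
chaotic_size = len(zlib.compress(chaotic))

# The orderly sequence compresses to a tiny fraction of its length;
# the chaotic one barely compresses at all.
```

The compressed size of the orderly string is essentially "a few sample data points and a relatively simple rule," exactly as described above, while the chaotic string admits no map shorter than itself.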

By contrast, “chaos” represents the omissions and errors in the map, the degree to which the map fails to accurately represent the territory. Chaos is the “unknown”. When we speak of possibilities and uncertainties, of what may or may not happen, we are speaking of chaos.

The unknowns of chaos include both unknown unknowns (pure chaos) and known unknowns (chaos bounded by order). Pure chaos manifests as outside context problems or black swan events, like being invaded by a continent you never suspected existed. However, much of the chaos that adult humans experience is bounded by order. Although they don’t know exactly what will happen, they feel fairly certain it will fall within a range of “normal” events. The roll of a die provides a more specific example of bounded chaos, since we know every possible face value even if we don’t know which one it will be. A trusted probability distribution also imposes some certainty on unpredictable outcomes, at least with large sample sizes. Even if individual measurements may vary, we know roughly what the data on the group as a whole will look like.

Whenever something you thought you knew turns out to be false or incomplete, that is chaos as well. The truest knowledge of the territory is limited to our scattered data points of direct experience, and we create our maps to interpolate and explain those data points as best we know how. Whenever we get a new data point that falsifies the map we were using, when we try to predict the territory and fail, it is another manifestation of chaos.

Moreover, chaos can refer to how difficult it is to compress information, or to figure out the details of a situation from limited data. A situation described as “chaotic” is difficult to predict because the information you have about it cannot be used to derive the information you want. For example, in a messy room, the knowledge of one sock's location does not allow you to locate its pair.

Although (or because) they are opposite concepts, order and chaos are more or less inextricable. The known and unknown are present to varying degrees in almost every situation you encounter, because they are essential to conscious existence as we know it. We often say that perfectly certain knowledge of the territory is impossible, but we don’t classify everything as completely unknown, either. Instead of a binary label of “known” or “unknown,” we have gradients of certainty that inform how much of our resources and safety we are willing to bet on various unknowns.

As a technical explanation of these concepts, “chaos” simply describes a relatively smooth and somewhat even distribution of probability mass across a range of hypotheses, where no hypothesis in the range is considered overwhelmingly more likely than another. “Order” describes a sharper, uneven distribution where probability mass is concentrated into a relatively small number of hypotheses. If (as usual) you have a subset of hypotheses that are overwhelmingly more likely than all others but roughly equal in probability with each other, that’s bounded uncertainty: chaos bounded by order (or chaos inherent in order, depending on which one you want to imply is dominant).
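One way to quantify this picture is Shannon entropy: a flat distribution over hypotheses (chaos) has maximal entropy, a sharply peaked one (order) has low entropy, and "chaos bounded by order" sits in between. A minimal sketch, with invented example distributions:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; higher = probability mass more spread out."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

chaos = [1/6] * 6                                   # fair die: no face favored
order = [0.97] + [0.006] * 5                        # one hypothesis dominates
bounded = [0.49, 0.49, 0.005, 0.005, 0.005, 0.005]  # two live hypotheses

# entropy(chaos) > entropy(bounded) > entropy(order)
```

The "bounded" case matches the description above: a small subset of hypotheses is overwhelmingly more likely than the rest but roughly equal among themselves, so its entropy falls between the fully chaotic and fully ordered extremes.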

You may have gathered that order and chaos are also subjective in their application. A territory cannot be intrinsically “orderly” or “chaotic” without reference to a given map (or compression algorithm). A situation will appear more or less chaotic to you in proportion to your ignorance of it. After all, confusion (or lack thereof) is in the map.

Order and chaos are also implicitly based on what information people consider important. The die roll mentioned above is “bounded by order” because it has a finite number of defined results. However, the reason the number of defined results is so low is because we don’t pay any attention to the location at which the die comes to rest, or the direction it faces, or the amount of time it takes to stop rolling…

The fact that these definitions of order and chaos are relative rather than objective is to be expected, because all the concepts in our toolbox are based on solving problems. Problems can only be defined in terms of a person’s desires, what sorts of obstacles stand in the way of those desires, and what the person can do to overcome those obstacles.

The next section deals with how our minds process order and chaos. Understanding how we deal with the shape of what we know and what we don’t know (and what we don’t know we don’t know) is vital for describing how our skills work.

Guessing and Checking

At the most fundamental level of mental activity that is still complex enough to be recognized as mental, we find two processes. These processes explore chaos and order, respectively, so that the mind can develop and refine its map. I call these processes “guessing and checking”. Elsewhere on this site, they are known as “babble and prune”. As far as I can tell, these phrases refer to the same pair of concepts.

Guessing is more or less free association: it links our current experiences and thoughts with any concepts that are remotely similar, and calls our attention to those concepts. To guess is to throw one’s map up against the territory in various ways (without judging the results—that’s where check comes in). Guessing is the mind wrangling chaos. It follows possibilities based on an initial idea and makes them concrete in the mind. It allows the mind to model (and therefore address) the unknown territory by giving shapes to the potential that lurks within.

Checking is the process by which we judge whether a concept is relevant to the current situation. It evaluates how we are applying the map to the territory, and the predictions we make from it, by comparing them to other observations of the territory or to memories that our guessing has summoned up. Based on this evaluation, the check accepts or rejects the accuracy (predictive utility) of the concept in the given situation. Checking is the mind wrangling order, because it decides what information gets to become and remain part of the map. It allows the mind to produce and curate knowledge by judging how well the map of the known matches the territory (or other parts of the map) in the way that guessing has applied it.

To illustrate how these cognitive processes work, we can look at what happens when each of them is shut off.

All guessing and no checking would be like an incoherent dream, or (people tell me) the effects of some recreational drugs: a parade of random impressions. It would consist of complete free association, but with nothing for assessing correspondence with reality and filtering out what doesn't fit.

Inversely, all checking and no guessing would allow one to apply a single concept, but one would have no ability to update the paradigm of how to apply it. For instance, an entity with checking but no guessing might be able to classify organisms as cats or dogs, but it wouldn't be able to realize that some organisms are neither (unless it already had a label for that).

From these examples, it is clear that both guessing and checking are necessary aspects of any skill, because both are necessary to generate and calibrate our maps.

As a more technical explanation of guessing and checking, guessing iterates through locations in hypothesis space. However, hypothesis space—the space of possible maps—is theoretically infinite, with infinite dimensions, and is therefore non-ordered (that is, it doesn’t have a linear sequence). To make iteration through unlimited possibilities computationally tractable, our brains use free association. The brain keys off of immediate sensory inputs or thoughts to find possibly related concepts, then keys off of those concepts to find more distantly related concepts, and so on. Through this process, guessing takes us to the most salient-seeming locations in hypothesis space. Each location visited, correct or not, is added to a map of possibilities as a candidate for representing the territory.

As we guess each hypothesis in turn, checking will accept or reject the hypothesis with varying degrees of confidence by pumping probability mass into or out of it, redistributing the probability mass across the range of hypothesis space we are exploring. It performs this redistribution by updating our map of possibilities, revising the degrees of certainty across the board based on its evaluation of each option.

Naturally, if we examine all the major hypotheses and decide they're still equally likely, then our pumping has canceled itself out. If our checking, based on our prior probabilities, has decided they all match quite well, then we’re still undecided. If checking decides the hypotheses all match equally poorly, it may be that something improbable happened, or that our guessing didn’t go far enough to come up with a more likely hypothesis, or that our check rejected something in error because we were ignorant of a factor making it more probable.
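The "pumping probability mass" picture is essentially Bayesian updating: each check multiplies a hypothesis's mass by how well that hypothesis predicted the observation, then renormalizes so the mass redistributes across the whole range. A toy sketch (the hypotheses and likelihood values here are invented for illustration):

```python
def check(prior, likelihoods):
    """Redistribute probability mass across hypotheses given one observation.
    `likelihoods[h]` measures how strongly hypothesis h predicted it."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    total = sum(unnorm.values())
    return {h: m / total for h, m in unnorm.items()}

# Guessing proposed three hypotheses; checking starts them as equally likely.
beliefs = {"cat": 1/3, "dog": 1/3, "neither": 1/3}

# Observation: "it purred" -- strongly predicted by 'cat' only,
# so mass is pumped into that hypothesis and out of the others.
beliefs = check(beliefs, {"cat": 0.9, "dog": 0.05, "neither": 0.1})
```

Note that if every hypothesis had predicted the observation equally well, the multiplication and renormalization would cancel out and leave the beliefs unchanged — the undecided case described in the preceding paragraph.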

Distinct and Subliminal

There’s one other dichotomy that we need to finish laying the groundwork for the basic skills: distinct versus subliminal.

When a guessing or checking activity takes place in our brains, it can do so in two modes.

First, it can run distinctly, where there is an explicit record (a memory) of the iteration process and we are fully aware of what possibilities or implications we are considering. We refer to distinct processes as taking place in our System 2, which might also be called the “manual” system.

Second, guessing or checking can run subliminally (“below the threshold”) where there is no explicit record of the process, and we are only left aware of the end result, if even that. We frequently form beliefs and decisions based on subliminal processes without realizing we’ve done so, to our detriment or benefit. We say these processes are part of System 1, often called the “automatic” system.

The modes in which our guessing and checking processes run determine what sort of maps we use and how we update them. The maps we use define the types of skills that we employ, what aspects of situations they deal with, and their advantages and disadvantages. We’ll go into more detail on these maps and what happens when cognitive processes run in various modes in the next part of the sequence.


All skills involve both guessing and checking in order to update our maps. What differentiates one skill from another is how each process runs: distinctly or subliminally. These modes of guessing and checking inform what sorts of features a skill’s map contains and therefore what aspects of the territory its map represents. The dichotomy of order and chaos, describing the degree to which a map corresponds with a territory, is a core concept for distinguishing the different tools in our toolbox.

In the next article I’ll introduce the basic mindsets and explain what sorts of maps they use and how they are defined by how they guess and check.


What are the requirements for being "citable?"

November 29, 2019 - 00:24
Published on November 28, 2019 9:24 PM UTC

A sort of... "stretch goal", for the 2018 Review, is developing a system wherein LessWrong has proven itself credible enough for some posts to actually be citable by other institutions.

I'm not sure how much of this has to do with "just actually do a good job ensuring accuracy/relevance", and how much has to do with "jumping through arbitrary hoops", and how hard those hoops are to jump through.

Two obvious things to shoot for might be:

  • Being considered a valid source by wikipedia
  • Being considered a valid source by google scholar.

I'm curious if anyone is familiar with how either of those work, in detail, and whether this is an achievable goal.


What sources (i.e., blogs) of nonfiction book reviews do you find most useful?

November 28, 2019 - 22:43
Published on November 28, 2019 7:43 PM UTC

Also tell us why. I personally find SSC’s book reviews useful, because I feel like I get a summary of an interesting book, with the added bonus of seeing how the author (whose thought process I like) approaches absorbing its information.


Kansas City Dojo meetup 11-12-19

November 28, 2019 - 22:11
Published on November 28, 2019 7:11 PM UTC


We discussed the situation from last meeting, when a newcomer and non-rationalist showed up and got into a testy argument with Life Engineer. We concluded that it was primarily a matter of conflicting expectations; the newcomer was expecting a more "salon"-type, open-ended, intellectual conversation like the kind we have on our casual Sunday lunches. Additionally, he has probably never been asked such probing questions as were asked of him; in fact, most people probably haven't.

We decided that we need to be more clear on setting expectations. "A genuine desire to change" will be an explicit requirement for anyone who attends, in the same way that "a desire to not be an alcoholic" is a requirement for AA.

Since it wasn't quite time to start yet, and only core members were present, we allowed ourselves to follow a short tangent from Life Engineer about a comic he read recently about the failure modes of the United States government; specifically how political parties and the internet have thrown a wrench into the system that the Founders created. Since we weren't seriously entertaining the problem we didn't allocate much brainpower to it; the only obvious solution is a factory reset of our government, which would likely only happen if something else Very Bad happens.

With the arrival of our 4th and final RSVP for the evening, we began the Dojo.


"W", one of our founding members, began by reporting that his adventures in weight loss continue to go well. His conscious effort to abstain from Entertainment Eating has become habitual now, requiring little to no effort, after about 2-3 months of implementing it. He spoke about his aspirations; how his ideal self is an "impossible person", something to constantly strive for. In addition to a lovely tangent on "Positive Social Butterflies" (the butterfly effect mixed with an improved variant of the Golden Rule), W talked about how one of his favorite techniques is a particular kind of reframing: instead of asking what actions he can take to accomplish his goals, he first checks to see if there's anything he can *stop* doing in order to accomplish his goals.

I went next. My book reading is going slowly but surely. Chapter a day on average. My review of A Way of Being is positive so far. I also vented about some drama with friends; will share more when I need actionable advice (I’m just in a waiting period right now). As general actionable advice, Life Engineer thinks people should take more responsibility for the words they say; specifically they should say what they mean. We then concluded my phase by discussing ‘Ask’ vs ‘Guess’ vs ‘Tell’ culture and how it relates to my social drama.

Since the order came around to Life Engineer at this point, he spoke up and said "I bought 10 copies of a Brene Brown book and handed them out to everyone I know."


He is primarily concerned with growing his business. Word of mouth tends to work best in his industry, but he is unsure of how to inject himself into the community. I shared a personal anecdote about my recent move which may or may not be helpful; I printed out an introductory note and put copies on the doors of my nearest neighbors, offering friendship and cooperation. I had one person actually reach out, which was surprising, and we plan to hang out soon. Life Engineer was not inspired much by this, but he is interested in how it pans out for me, and thinks I should go all-out and make posters.

We had a semi-related discussion at this point about what I thought was the pretty typical awkward relationship people have with their apartment neighbors. W doesn't experience this; he is very much involved in a culture that his building shares, which puts everyone on a first name basis. He talks about a neighbor who has a missing leg, and muses about why it's a social taboo to openly remark about such things. This threatened to suck us into one of our famous tangents, which we briefly touched on: why there are social taboos about people's injuries/anomalies.


Now we come to our usual topic at the end about how to improve the Dojo. Life Engineer is interested in unpacking the Dojo's slogan "we are all imperfect decision makers"; he doesn't see in his personal life that he ever really makes "bad" decisions. He can see where he could have done something different to make his life easier, but the payoff would have been negligible. However, in his line of work, he encounters people all the time who have the same belief: "I don't make bad decisions", and yet their marriages are falling apart, they have an addiction, etc. He acknowledges that "people make bad decisions" is a reality, but he has trouble seeing it in himself and is interested in whether that's because he truly doesn't make bad decisions, or if he is in the same boat as his clients where he simply doesn't see it.

This is more of an open question for us to mull over than anything else. We need to think about it more in-depth before we propose solutions. He wants to wrestle with it not only due to the obvious concerns (eg "Change into what?", "Is what we are doing as a community valuable?"), but also due to the failure mode of sounding patronizing to people who are in a completely different worldview.


SSC Meetups Everywhere Retrospective

Published on November 28, 2019 7:10 PM UTC

Slate Star Codex has regular weekly-to-monthly meetups in a bunch of cities around the world. Earlier this autumn, we held a Meetups Everywhere event, hoping to promote and expand these groups. We collected information on existing meetups, got volunteers to create new meetups in cities that didn’t have them already, and posted times and dates prominently on the blog.

During late September and early October, I traveled around the US to attend as many meetups as I could. I hoped my presence would draw more people; I also wanted to learn more about meetups and the community and how best to guide them. Buck Shlegeris and a few other Bay Area effective altruists came along to meet people, talk to them about effective altruism, and potentially nudge them into the recruiting pipeline for EA organizations.

Lots of people asked me how my trip was. In a word: exhausting. I got to meet a lot of people for about three minutes each. There were a lot of really fascinating people with knowledge of a bewildering variety of subjects, but I didn't get to pick their brains anywhere near as thoroughly as I would have liked. I'm sorry if I talked to you for three minutes, you told me about some amazing project you were working on to clone neuroscientists or eradicate bees or convert atmospheric CO2 into vegan meat substitutes, and I mumbled something and walked away. You are all great and I wish I could have spent more time with you.

I finally got to put faces to many of the names I’ve interacted with through the years. For example, Bryan Caplan is exactly how you would expect, in every way. Also, in front of his office, he has a unique painting, which he apparently got by asking a Mexican street artist to paint an homage to Lord of the Rings. The artist had never heard of it before, but Bryan described it to him very enthusiastically, and the completely bonkers result is hanging in front of his office. This is probably a metaphor for something.

Philadelphia hosted their meetup in a beautiful room that looked like a Roman temple, and had miniature cheesesteaks for everybody. Chicago held theirs in a gym; appropriate, given this blog’s focus on BRUTE STRENGTH. Berkeley’s was in a group house with posters representing the Twelve Virtues Of Rationality hanging along the staircase. In Fairbanks, a person who had never read the blog showed up to get a story and an autograph for his brother who did. In New York, someone brought the best bread I have ever had, maybe the best bread anyone has ever had, I am so serious about this. In Boston, the organizers set up a prediction market to determine how many attendees they needed to plan for; they still ended up being off by a factor of two. This is also probably a metaphor for something. If only they had used more BRUTE STRENGTH!

Along the way, I got to see America. Most of it I saw from an airplane window, but I still saw it. In Portland, I ate from a makeshift food court formed by a bunch of really good food trucks congregating in the same empty lot; one of them just sold like a dozen different kinds of french fries. In Texas, I rode with an Uber driver whose day job is driving mechanical bulls to parties that need mechanical bulls, and who Ubers people around while he waits for the party to finish. In Washington DC, I tried to see the White House, only to be thwarted by the construction of a new security fence; they say that before you change the world you must change your own home, and it seems like our Wall-Builder-In-Chief takes this seriously. In Delaware, I stood on the spot where the Swedes first landed in America and declared it to be the colony of New Sweden; probably there are alternate timelines out there who could appreciate this more than I did. In New Jersey, I confirmed that the Pine Barrens are, in fact, really creepy.

People gave me things. You are all so nice, but you also seem to think I am about ten times more classy and fashionable than I really am. One person gave me a beautiful record of their audiobook – a real, honest-to-goodness vinyl record – as if I had any idea what to do with it. A reader in Philadelphia gave me a beautiful glossy magazine about Philadelphia culture, which I stared at intently for twenty minutes. Many people gave me beautifully-bound copies of my own work, which was so incredibly thoughtful that I feel bad that I will have to hide them in a closet so nobody sees them and thinks I am the kind of narcissist who makes beautifully-bound copies of my own work. The Charter Cities Institute people gave me a very nice Charter Cities Institute bag (although I assume that if I ever take it outside in Berkeley, someone will punch me and it will start a National Conversation). I am still really grateful to all of you.

But you already know how great you are. Let’s get to the statistics.

Mingyuan, the Official SSC Meetup Coordinator, sent out a survey to get information on the meetups we weren’t able to visit, and determined that we had somewhere between 81 and 111 meetups around the world. I’m sorry I can’t be more precise. 111 meetups were supposed to happen, 81 organizers reported back to Mingyuan that their meetups happened, and I’m not sure what happened to the other 30. Although most activity was concentrated in the Anglosphere, there were meetups as far away as Bangalore (9 people), Tel Aviv (25 people), Oslo (9 people), and Seoul (4 people). Medellin, Colombia reports a one person meetup; I am sorry it sounds like you did not have a good time. Montreal, Canada, reports a zero person meetup, which sounds very computer-sciency, kind of like a heap of zero grains of sand.

Here’s the histogram of attendance, binned by fives. About twenty meetups had 0-5 people, thirty had 5-10, and the remaining thirty had more than 10. The best-attended meetups were Boston (140), NYC (120), and Berkeley (105). Total meetup attendance around the world was almost 1500 people!

Did the event fulfill its goal of bringing more people to meetups? Many organizers had only a vague idea how many people usually attended their meetups, and many said their city didn’t have a usual meetup group at all. But as best I can tell, about 2.3x as many people attended the Meetups Everywhere meetup in a city compared to the average previous meetup. Breaking it down by tour status, meetups on my tour had much higher attendance (6.1x usual), but even meetups off my tour had somewhat higher attendance (1.6x usual).

Did the event succeed in bringing some people into meetup groups who might stay around later? I suggested meetup organizers bring a signup sheet that people could sign to get on a mailing list for future meetups. My data on this is sparse, because people took the survey question overly literally and wrote things like "I didn't have a signup sheet, I just asked people for their emails" and then didn't tell me how many people gave them. But for the 40 meetups where I have data, people on average got a number of new signers equal to 77% of their previous regular attendance; that is, a meetup group that usually had 100 people had 77 new people sign up for its mailing list. Breaking it down by tour status, meetups on my tour gained 170%, other meetups gained 58%.
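The figures above are just per-meetup ratios averaged across groups. As a sketch of the arithmetic with made-up numbers (the actual survey data is not reproduced here):

```python
# Hypothetical per-meetup records; "usual" is typical prior attendance,
# "event" is Meetups Everywhere attendance, "signups" is new mailing-list names.
meetups = [
    {"usual": 40, "event": 100, "signups": 30},
    {"usual": 10, "event": 25,  "signups": 8},
    {"usual": 20, "event": 44,  "signups": 16},
]

# Average of per-meetup attendance multipliers (event / usual).
attendance_multiplier = sum(m["event"] / m["usual"] for m in meetups) / len(meetups)

# Average of per-meetup signup ratios (new signups / usual attendance).
signup_ratio = sum(m["signups"] / m["usual"] for m in meetups) / len(meetups)

print(round(attendance_multiplier, 2), round(signup_ratio, 2))  # 2.4 0.78
```

Averaging the per-meetup ratios (rather than dividing the totals) keeps a few huge meetups from dominating the statistic, which matches how the "2.3x" and "77%" figures are described.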

This seems implausibly large; did one event nearly double the attendance of SSC meetup groups around the world? I don’t know how many people who signed up for the mailing list will really start attending regularly. But I will probably survey the organizers again next year, and they might be able to help me figure out how many people stayed around.

In total, 1,476 people attended SSC meetups, and 339 people added their name to mailing lists (the ratio here doesn’t match the previous numbers because most organizers didn’t have a mailing list or didn’t report mailing list data, and the ratios above only counted those who did).

So much for the numbers. What did I learn?

I don’t want to generalize too much – I deliberately went to the biggest meetups, and things that work for a group of 100 people might not apply to a group of 2 people. So take all of this with a grain of salt, but:

1. Tables and chairs kill big meetups. Some people tried to hold meetups at a restaurant or a park with picnic tables or something. Everyone would sit down at the table, talk to the 3-4 people in their immediate neighborhood, and that would be that. Eventually I figured out that I need to force everyone out of the picnic tables and into the rest of the park. This caused a phase shift from solid to gas, with people milling about, talking to everyone, finding the conversations that most interested them.

2. The welcomeness sentence is really important. In the meetup descriptions on the blog, I included a sentence like “Please feel free to come even if you feel awkward about it, even if you’re not ‘the typical SSC reader’, even if you’re worried people won’t like you, etc.” It sounds silly, but I had so many people come up to me saying the only reason they came was because of that sentence. It happened again and again and again. Anybody planning any kind of meetup about anything should strongly consider including a sentence like that (as long as it’s true). Maybe there are other simple hacks like this waiting to be discovered.

3. Group houses are important community nuclei. Obvious in retrospect, but it was pretty stark seeing the level of community in cities that did have rationalist group houses vs. the ones that didn’t, even if there were good meetup groups in both. This also came out in listening to some people mourn the loss of the main group house in their city and talk about all the great things they were no longer able to do.

I was thinking of this last one because a lot of the meetups felt kind of superficial. Everyone shows up, talks about their favorite SSC post or what their job is or what kind of interesting thing they read recently, and then they go home. Lots of people seemed to enjoy that, I enjoyed it, but seeing the kind of really great rationalist communities in the Bay Area or Seattle gave me a sense that more is possible. I don’t know, maybe it’s not possible in cities with only 10 or 20 interested people; maybe only places like the Bay Area and Seattle have enough people, and everywhere it’s possible it’s already happening. But group houses seem to be a big part of it.

I was also struck by the number of female meetup organizers; the female:male ratio on the meetup organizer survey is almost twice that on the SSC survey in general. When there were cities that didn’t have regular meetup groups, and I asked for a volunteer to set one up, it was usually a woman who volunteered.

This suggests to me that we’re not just performing at some kind of theoretical maximum for the amount of people and interest in a given community; there’s a shortage of something (speculatively, social initiative) that (in this community) women are better than men at. I don’t know how to solve this (though integrating more with the EA community, which has more women, might help), but I think it’s an interesting problem.

And Buck has written his own retrospective of his EA work at the meetups here.


Building awareness habits (cognitive)

Published on November 27, 2019 1:36 AM UTC

Something I struggle with (probably largely due to ADHD) is building habits/patterns of being aware of things I want to notice/practice cognitively.

For example, I find myself easily becoming somewhat thoughtless in social situations, in that I do not stop to observe, think, and measure my actions and responses at all. I have, for a couple years now, been intending to work on this, and not really doing anything to act on this intention. I often plan to step back and just observe in social situations, but never do.

Since I struggle with this, I figure that the best thing I can do now is try and pick up some strategies for both reminding myself to think back to cognitive patterns I want to build when I enter into the relevant situation, and for doing said things.

So, does anyone have suggestions on how one can:

  • More consistently remember something they want to watch out for in various situations (especially social/fast-paced situations)
  • Build new mental patterns/habits (in general)
  • Effectively observe social situations as they occur, in person, to learn from the interactions in order to develop social skills (bonus if the strategy allows for one to not appear/be awkward or violate social etiquette)
  • Somewhat less relevantly, banish aversions to human connection/intimacies that make things like eye contact difficult/overwhelming (the aversion is mostly habitual at this point)


Open-Box Newcomb's Problem and the limitations of the Erasure framing

Published on November 28, 2019 11:32 AM UTC

One of the most confusing aspects of the Erasure Approach to Newcomb's problem is that in Open-Box Newcomb's it requires you to forget that you've seen that the box is full. This is a strange thing to do, so it deserves further explanation. And as we'll see, this might not be the best way to think about what is happening.

Let's begin by recapping the problem. In a room there are two boxes: one containing $1000, and a transparent box that contains either nothing or $1 million. Before you entered the room, a perfect predictor predicted what you would do if you saw $1 million in the transparent box. If it predicted that you would one-box, then it put $1 million in the transparent box; otherwise it left the box empty. If you can see $1 million in the transparent box, which choice should you pick?

The argument I provided before was as follows: If you see a full box, then you must be going to one-box if the predictor really is perfect. So there would only be one decision consistent with the problem description and to produce a non-trivial decision theory problem we'd have to erase some information. And the most logical thing to erase would be what you see in the box.

I still mostly agree with this argument, but I feel the reasoning is a bit sparse, so this post will try to break it down in more detail. I'll just note in advance that when you start breaking it down, you end up performing a kind of psychological or social analysis. However, I think this is inevitable when dealing with ambiguous problems; if you could provide a mathematical proof of what an ambiguous problem meant then it wouldn't be ambiguous.

As I noted in Deconfusing Logical Counterfactuals, there is only one choice consistent with the problem (one-boxing), so in order to answer this question we'll have to construct some counterfactuals. And in order to construct these counterfactuals we'll have to consider situations with at least one of the above assumptions missing. Now we want to consider counterfactuals involving both one-boxing and two-boxing. Unfortunately, it is impossible for a two-boxer to see $1 million in a box if a) the money is only in the box if the predictor predicts the agent will one-box in this situation and b) the predictor is perfect.

Speaking very roughly, it is generally understood that the way to resolve this is to relax the assumption that the agent must really be in that situation and to allow the possibility that the agent may only be simulated as being in such a situation by the predictor. I said speaking very roughly because some people claim that the agent could actually be in the simulation. In my mind these people are confused; in order to predict an agent, we may only need to simulate the decision theory parts of its mind, not all the other parts that make you you. A second reason why this isn't precise is that it isn't defined how to simulate an impossible situation; one of my previous posts points out that we can get around this by simulating what an agent would do when given input representing an impossible situation. There may also be some people who have doubts about whether a perfect predictor is possible even in theory. I'd suggest that these people read one of my past posts on why the sense in which you "could have chosen otherwise" doesn't break the prediction and how there's a sense that you are pre-committed to every action you take.

In any case, once we have relaxed this assumption, the consistent counterfactuals become either a) the agent actually seeing the full box and one-boxing b) the agent seeing the empty box. In case b), it is actually consistent for the agent to one-box or two-box since the predictor only predicts what would happen if the agent saw a full box. It is then trivial to pick the best counterfactual.
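To make the comparison concrete, here is a minimal sketch (my illustration, not anything from the original post) that enumerates the two policies for "what would I do on seeing a full transparent box?" and the payoff each leads to under a perfect predictor. The dollar amounts come from the problem statement; the function name and structure are assumptions of this example.

```python
# Hedged sketch of the relaxed counterfactuals in Open-Box Newcomb's problem.
def payoff(policy_on_full_box: str) -> int:
    """Payoff for a given policy about what to do upon seeing a full box."""
    if policy_on_full_box == "one-box":
        # The predictor foresees one-boxing, so it fills the transparent box;
        # you actually see $1 million and take just that box.
        return 1_000_000
    else:
        # The predictor foresees two-boxing, so it leaves the box empty; you
        # see an empty box, and taking both boxes nets at most the $1000.
        return 1_000

best = max(["one-box", "two-box"], key=payoff)
print(best)  # one-box
```

This mirrors the argument in the text: once we allow "actually seeing a full box" or "merely being simulated seeing a full box" as the two consistent counterfactuals, picking the better one is trivial.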

This problem actually demonstrates a limitation of the erasure framing. After all, we didn't just justify the counterfactuals by removing the assumption that you saw a full box; we instead modified it to seeing a full box OR being simulated seeing a full box. In one sense, this is essentially the same thing - since we already knew you were being simulated by the predictor, we essentially just removed the assumption. On the other hand, it is easier to justify that it is the same problem by turning it into an OR than by just removing the assumption.

In other words, thinking about counterfactuals in terms of erasure can be incredibly misleading, and in this case it actively made it harder to justify our counterfactuals. After all, in this case we didn't erase an assumption, but simply relaxed one instead. Perhaps I should pick a better term, but I am reluctant to rename this approach until I have a better understanding of what exactly is going on.


Updating a Complex Mental Model - An Applied Election Odds Example

Published on November 28, 2019 9:29 AM UTC

There are probabilities, and there are probabilities about probabilities. How do these get updated? I've had the same discussion several times, and have tried to describe this, but it is hard without going into the math. The formal model is clear, but I have found that the practical implications are hard to describe concretely. I just ran into a great concrete example, however, and I wanted to work through the logic of how I'm updating as a way to show what should happen.

The example I'm using is my expectations about the 2020 election, how accurate various models are.
{font-family: MJXc-TeX-main-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Bold.otf') format('opentype')} @font-face {font-family: MJXc-TeX-main-I; src: local('MathJax_Main Italic'), local('MathJax_Main-Italic')} @font-face {font-family: MJXc-TeX-main-Ix; src: local('MathJax_Main'); font-style: italic} @font-face {font-family: MJXc-TeX-main-Iw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Italic.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Italic.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Italic.otf') format('opentype')} @font-face {font-family: MJXc-TeX-main-R; src: local('MathJax_Main'), local('MathJax_Main-Regular')} @font-face {font-family: MJXc-TeX-main-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-math-I; src: local('MathJax_Math Italic'), local('MathJax_Math-Italic')} @font-face {font-family: MJXc-TeX-math-Ix; src: local('MathJax_Math'); font-style: italic} @font-face {font-family: MJXc-TeX-math-Iw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Math-Italic.eot'); src /*2*/: 
url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Math-Italic.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Math-Italic.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size1-R; src: local('MathJax_Size1'), local('MathJax_Size1-Regular')} @font-face {font-family: MJXc-TeX-size1-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size1-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size1-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size1-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size2-R; src: local('MathJax_Size2'), local('MathJax_Size2-Regular')} @font-face {font-family: MJXc-TeX-size2-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size2-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size2-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size2-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size3-R; src: local('MathJax_Size3'), local('MathJax_Size3-Regular')} @font-face {font-family: MJXc-TeX-size3-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size3-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size3-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size3-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size4-R; src: local('MathJax_Size4'), local('MathJax_Size4-Regular')} @font-face {font-family: 
MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')} @font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')} @font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold} @font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')} 1, and how important the inputs are. This type of problem is fairly common - I have both an object level prediction about the winner, and a prediction about / model of how accurate different sources of information will be.

So, what do I do when information comes in that seems surprising? Two things: I update in the direction the information indicates, and I update against the reliability of the data. The second may seem counter-intuitive, but the example makes it clearer.
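A toy numeric version of this double update (all numbers below are invented for illustration): suppose the prior probability of a win is 0.4, the source is believed reliable with probability 0.7, and the source then reports a mildly surprising "win". Updating jointly over (outcome, source reliability):

```python
# All numbers are invented for illustration. Prior P(win) = 0.4, prior
# P(source is reliable) = 0.7; the source then reports "win", which is
# mildly surprising given the prior.
P_WIN, P_REL = 0.4, 0.7

# P(source says "win" | actual outcome, source reliability):
likelihood = {
    (True, True): 0.9,     # reliable source, actual win
    (False, True): 0.1,    # reliable source, actual loss
    (True, False): 0.55,   # noisy source, actual win
    (False, False): 0.45,  # noisy source, actual loss
}

# Unnormalized joint posterior over (outcome, reliability) after the report:
joint = {
    (w, r): (P_WIN if w else 1 - P_WIN)
            * (P_REL if r else 1 - P_REL)
            * likelihood[(w, r)]
    for w in (True, False)
    for r in (True, False)
}
z = sum(joint.values())

posterior_win = (joint[(True, True)] + joint[(True, False)]) / z
posterior_rel = (joint[(True, True)] + joint[(False, True)]) / z

print(round(posterior_win, 3))  # rises above the 0.4 prior
print(round(posterior_rel, 3))  # falls below the 0.7 prior
```

The single surprising report moves both beliefs at once: the forecast shifts toward what the source says, while the source itself gets slightly discounted.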

The economy is doing well - better than expected. Presidents with great economies tend to get re-elected. Trump is also unpopular, and unpopular presidents tend not to get re-elected2. How do we balance these two factors, and how do they interact? My model of whether he will win is fairly uncertain, and my model of the sources of data is also uncertain. They are also related in complex ways3. For instance, if Trump's popularity plummets because the impeachment inquiries find something shocking and horrible even to his base, I expect that GDP matters far less for his reelection chances. Other data sources also constrain how far I will update4 - no level of GDP growth alone will make me say he's certain5 to win.

So I updated towards Trump's reelection based on the economic data, but my underlying model is telling me that it is decreasingly relevant. That means I'm very slightly down-weighting the importance of economic factors compared to approval rating, since he's seemingly not getting credit for the growth (or the growth isn't helping most voters.) The net impact is that I have updated slightly towards Trump's reelection.

1) For long-term forecasts of presidential elections, forecasts based on fundamentals do just OK, but forecasts based on polls also do poorly far in advance of the election. (Special elections seem to point to a huge shift towards the Democrats, despite fundamentals.) More complete models take some of each type of information - but how to combine them is tricky. Some models do it poorly, others do it well.

2) I also have expectations about the future inputs to the models. Most presidents have fluctuating approval ratings, so long-term forecasts do poorly. Trump's split of approval/disapproval has been remarkably steady, so unless his approval significantly shifts from the current low-40s, or he runs against an incredibly unpopular Democrat (which is possible, but seems pretty unlikely), models that consider this point towards him being unlikely to win. It may still prove volatile, though: the impeachment could solidify his base, or could reduce his popularity further.

3) This is tricky to describe, but for understanding the overall behavior, a useful strategy is to consider the limit - what happens if the economy is amazing, but everyone hates the president? I'd assume he doesn't get reelected. Similarly, if everyone loves the president but the economy is in a deep recession (for which he's seemingly not being blamed), he probably gets reelected.

4) Special elections are favoring Democrats, voter turnout among liberals is expected to be very high because of polarization, etc.

5) By which I mean highly confident - certainty is impossible. It would take a confluence of events to make me highly confident, and even then the election is far enough away that I'm not willing to put odds above ~90% or below ~10%; there are fundamentally hard questions about the future that impact the probability. (We don't know who the Democratic nominee is, for instance.)


How to make TensorFlow run faster

28 November 2019 - 03:28
Published on November 28, 2019 12:28 AM UTC

I briefly wrote this up for a fellow AI alignment researcher/engineer, so I thought I'd share it here.


I implement these recommendations by setting the following in the code:

tf.config.threading.set_inter_op_parallelism_threads(2)
tf.config.threading.set_intra_op_parallelism_threads(6)  # Number of physical cores.

And setting the environment variables in the PyCharm run configuration and Python Console settings.
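Outside an IDE, the environment-variable half of the setup can also be done from Python before TensorFlow is imported. A minimal sketch - the OMP_/KMP_ variable names come from the Intel guide linked below, and the particular values are assumptions to benchmark on your own machine rather than fixed advice:

```python
import os

# The OMP_/KMP_ variable names follow Intel's TensorFlow guide; the
# specific values here are assumptions to benchmark, not fixed
# recommendations.
def threading_env(physical_cores):
    """Environment a TensorFlow process would be launched with."""
    return {
        "OMP_NUM_THREADS": str(physical_cores),
        "KMP_BLOCKTIME": "0",
        "KMP_AFFINITY": "granularity=fine,compact,1,0",
    }

# Must take effect before TensorFlow is imported in the process:
os.environ.update(threading_env(6))
```

Setting these in the launching environment (rather than after `import tensorflow`) matters, because the threading runtime reads them at import time.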

As a result, the per-epoch training time dropped by 20% with a small multi-layer perceptron on Fashion MNIST.

Another important point is to use a TensorFlow binary that uses all available CPU capabilities. I.e., it should display something like this:

2019-11-27 17:26:42.782399: I tensorflow/core/platform/cpu_feature_guard.cc:145] This TensorFlow binary is optimized with Intel(R) MKL-DNN to use the following CPU instructions in performance critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA

And not something like this:

2019-11-28 09:18:26.315191: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA

One way to achieve this is to install TensorFlow with Conda instead of Pip.

Here is more information about that: https://software.intel.com/en-us/articles/intel-optimization-for-tensorflow-installation-guide

If you're on a Mac and the Conda-TensorFlow crashes with an OMP error, here's the solution: https://stackoverflow.com/a/53692707/5091738


Getting Ready for the FB Donation Match

27 November 2019 - 22:20
Published on November 27, 2019 7:20 PM UTC

Facebook is again going to be matching donations on Giving Tuesday:

  • $7M total
  • $20k max per donor
  • $100k max per organization
  • First-come first-served
  • Processing fees covered by FB
  • 8AM Eastern, Tuesday 12/3

I participated last year, and while the match ran out in seconds I was able to direct the full counterfactual $20k by being (a) prepared and (b) lucky not to get any declines. As with last year, I'm planning to donate more than $20k across several cards, so even if I get declines this year I'll still have a good chance of getting my donation matched.

There are good instructions on the EA Giving Tuesday site. One way things are different this year is that by confirming your identity ahead of time you can get FB to raise your per-donation limit. This means instead of twelve $2,499 transactions I'm planning to do three $9,999 ones.

I did one test $2,501 donation a few days ago to verify that confirming my identity worked, and three $9,999 donations yesterday to check for declines and to practice. My guess is that this year the matching funds run out in 1-2 seconds, which means there's probably only time for one $9,999 donation before the $7M is gone. Very curious how it goes though!

Comment via: facebook


[AN #75]: Solving Atari and Go with learned game models, and thoughts from a MIRI employee

27 November 2019 - 21:10
Published on November 27, 2019 6:10 PM UTC

Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. I'm always happy to hear feedback; you can send it to me by replying to this email.

Audio version here (may not be up yet).


Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (Julian Schrittwieser et al) (summarized by Nicholas): Up until now, model-free RL approaches have been state of the art at visually rich domains such as Atari, while model-based RL has excelled for games which require planning many steps ahead, such as Go, chess, and shogi. This paper attains state-of-the-art performance on Atari using a model-based approach, MuZero, while matching AlphaZero (AN #36) at Go, chess, and shogi with less compute. Importantly, it does this without requiring any advance knowledge of the rules of the game.

MuZero's model has three components:

1. The representation function produces an initial internal state from all existing observations.

2. The dynamics function predicts the next internal state and immediate reward after taking an action in a given internal state.

3. The prediction function generates a policy and a value prediction from an internal state.

Although these are based on the structure of an MDP, the internal states of the model do not necessarily have any human-interpretable meaning. They are trained end-to-end only to accurately predict the policy, value function, and immediate reward. This model is then used to simulate trajectories for use in MCTS.
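As a toy sketch of how the three components chain together during such a simulated rollout (the stub functions below are hypothetical placeholders for the learned networks, chosen only to show the data flow, not the actual architecture):

```python
import random

# Hypothetical stand-ins for MuZero's three learned functions. Real MuZero
# uses neural networks; these toy stubs only illustrate how the components
# chain together.
def representation(observations):
    """Map the observation history to an initial internal state."""
    return sum(observations) % 7

def dynamics(state, action):
    """Predict the next internal state and the immediate reward."""
    next_state = (state * 3 + action) % 7
    reward = 1.0 if next_state == 0 else 0.0
    return next_state, reward

def prediction(state):
    """Produce a policy and a value estimate from an internal state."""
    policy = [0.5, 0.5]  # uniform over two toy actions
    value = 0.1 * state
    return policy, value

def simulate(observations, depth):
    """Roll out one trajectory entirely inside the learned model,
    as MCTS does when expanding a search path."""
    state = representation(observations)
    total_reward = 0.0
    for _ in range(depth):
        policy, _ = prediction(state)
        action = random.choices([0, 1], weights=policy)[0]
        state, reward = dynamics(state, action)
        total_reward += reward
    _, leaf_value = prediction(state)
    return total_reward + leaf_value
```

Note that the environment is never consulted inside `simulate`: after the initial `representation` call, everything happens in the model's internal state space.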

Nicholas's opinion: This is clearly a major step for model-based RL, becoming the state of the art on a very popular benchmark and enabling planning approaches to be used in domains with unknown rules or dynamics. I am typically optimistic about model-based approaches as progress towards safe AGI. They map well to how humans think about most complex tasks: we consider the likely outcomes of our actions and then plan accordingly. Additionally, model-based RL typically has the safety property that the programmers know what states the algorithm expects to pass through and end up in, which aids with interpretability and auditing. However, MuZero loses that property by using a learned model whose internal states are not constrained to have any semantic meaning. I would be quite excited to see follow up work that enables us to understand what the model components are learning and how to audit them for particularly bad inaccuracies.

Rohin's opinion: Note: This is more speculative than usual. This approach seems really obvious and useful in hindsight (something I last felt for population-based training of hyperparameters). The main performance benefit (that I see) of model-based planning is that it only needs to use the environment interactions to learn how the environment works, rather than how to act optimally in the environment -- it can do the "act optimally" part using some MDP planning algorithm, or by simulating trajectories from the world model rather than requiring the actual environment. Intuitively, it should be significantly easier to learn how an environment works -- consider how easy it is for us to learn the rules of a game, as opposed to playing it well. However, most model-based approaches force the learned model to learn features that are useful for predicting the state, which may not be the ones that are useful for playing well, and this can handicap their final performance. Model-free approaches on the other hand learn exactly the features that are needed for playing well -- but they have a much harder learning task, so it takes many more samples to learn, though it can lead to better final performance. Ideally, we would like to get the benefits of using an MDP planning algorithm, while still only requiring the agent to learn features that are useful for acting optimally.

This is exactly what MuZero does, similarly to this previous paper: its "model" only predicts actions, rewards, and value functions, all of which are much more clearly relevant to acting optimally. However, the tasks that are learned from environment interactions are in some sense "easier" -- the model only needs to predict, given a sequence of actions, what the immediate reward will be. It notably doesn't need to do a great job of predicting how an action now will affect things ten turns from now, as long as it can predict how things ten turns from now will be given the ten actions used to get there. Of course, the model does need to predict the policy and the value function (both hard and dependent on the future), but the learning signal for this comes from MCTS, whereas model-free RL relies on credit assignment for this purpose. Since MCTS can consider multiple possible future scenarios, while credit assignment only gets to see the trajectory that was actually rolled out, we should expect that MCTS leads to significantly better gradients and faster learning.

I'm Buck Shlegeris, I do research and outreach at MIRI, AMA (Buck Shlegeris) (summarized by Rohin): Here are some beliefs that Buck reported that I think are particularly interesting (selected for relevance to AI safety):

1. He would probably not work on AI safety if he thought there was less than 30% chance of AGI within 50 years.

2. The ideas in Risks from Learned Optimization (AN #58) are extremely important.

3. If we build "business-as-usual ML", there will be inner alignment failures, which can't easily be fixed. In addition, the ML systems' goals may accidentally change as they self-improve, obviating any guarantees we had. The only way to solve this is to have a clearer picture of what we're doing when building these systems. (This was a response to a question about the motivation for MIRI's research agenda, and so may not reflect his actual beliefs, but just his beliefs about MIRI's beliefs.)

4. Different people who work on AI alignment have radically different pictures of what the development of AI will look like, what the alignment problem is, and what solutions might look like.

5. Skilled and experienced AI safety researchers seem to have a much more holistic and much more concrete mindset: they consider a solution to be composed of many parts that solve subproblems that can be put together with different relative strengths, as opposed to searching for a single overall story for everything.

6. External criticism seems relatively unimportant in AI safety, where there isn't an established research community that has already figured out what kinds of arguments are most important.

Rohin's opinion: I strongly agree with 2 and 4, weakly agree with 1, 5, and 6, and disagree with 3.

Technical AI alignment   Problems

Defining AI wireheading (Stuart Armstrong) (summarized by Rohin): This post points out that "wireheading" is a fuzzy category. Consider a weather-controlling AI tasked with increasing atmospheric pressure, as measured by the world's barometers. If it made a tiny dome around each barometer and increased air pressure within the domes, we would call it wireheading. However, if we increase the size of the domes until it's a dome around the entire Earth, then it starts sounding like a perfectly reasonable way to optimize the reward function. Somewhere in the middle, it must have become unclear whether or not it was wireheading. The post suggests that wireheading can be defined as a subset of specification gaming (AN #1), where the "gaming" happens by focusing on some narrow measurement channel, and the fuzziness comes from what counts as a "narrow measurement channel".

Rohin's opinion: You may have noticed that this newsletter doesn't talk about wireheading very much; this is one of the reasons why. It seems like wireheading is a fuzzy subset of specification gaming, and is not particularly likely to be the only kind of specification gaming that could lead to catastrophe. I'd be surprised if we found some sort of solution where we'd say "this solves all of wireheading, but it doesn't solve specification gaming" -- there don't seem to be particular distinguishing features that would allow us to have a solution to wireheading but not specification gaming. There can of course be solutions to particular kinds of wireheading that do have clear distinguishing features, such as reward tampering (AN #71), but I don't usually expect these to be the major sources of AI risk.

Technical agendas and prioritization

The Value Definition Problem (Sammy Martin) (summarized by Rohin): This post considers the Value Definition Problem: what should we make our AI system try to do (AN #33) to have the best chance of a positive outcome? It argues that an answer to the problem should be judged based on how much easier it makes alignment, how competent the AI system has to be to optimize it, and how good the outcome would be if it was optimized. Solutions also differ on how "direct" they are -- on one end, explicitly writing down a utility function would be very direct, while on the other, something like Coherent Extrapolated Volition would be very indirect: it delegates the task of figuring out what is good to the AI system itself.

Rohin's opinion: I fall more on the side of preferring indirect approaches, though by that I mean that we should delegate to future humans, as opposed to defining some particular value-finding mechanism into an AI system that eventually produces a definition of values.

Miscellaneous (Alignment)

Self-Fulfilling Prophecies Aren't Always About Self-Awareness (John Maxwell) (summarized by Rohin): Could we prevent a superintelligent oracle from making self-fulfilling prophecies by preventing it from modeling itself? This post presents three scenarios in which self-fulfilling prophecies would still occur. For example, if instead of modeling itself, it models the fact that there's some AI system whose predictions frequently come true, it may try to predict what that AI system would say, and then say that. This would lead to self-fulfilling prophecies.

Analysing: Dangerous messages from future UFAI via Oracles and Breaking Oracles: hyperrationality and acausal trade (Stuart Armstrong) (summarized by Rohin): These posts point out a problem with counterfactual oracles (AN #59): a future misaligned agential AI system could commit to helping the oracle (e.g. by giving it maximal reward, or making its predictions come true) even in the event of an erasure, as long as the oracle makes predictions that cause humans to build the agential AI system. Alternatively, multiple oracles could acausally cooperate with each other to build an agential AI system that will reward all oracles.

AI strategy and policy

AI Alignment Podcast: Machine Ethics and AI Governance (Lucas Perry and Wendell Wallach) (summarized by Rohin): Machine ethics has aimed to figure out how to embed ethical reasoning in automated systems of today. In contrast, AI alignment starts from an assumption of intelligence, and then asks how to make the system behave well. Wendell expects that we will have to go through stages of development where we figure out how to embed moral reasoning in less intelligent systems before we can solve AI alignment.

Generally in governance, there's a problem that technologies are easy to regulate early on, but that's when we don't know what regulations would be good. Governance has become harder now, because it has become very crowded: there are more than 53 lists of principles for artificial intelligence and lots of proposed regulations and laws. One potential mitigation would be governance coordinating committees: a sort of issues manager that keeps track of a field, maps the issues and gaps, and figures out how they could be addressed.

In the intermediate term, the worry is that AI systems are giving increasing power to those who want to manipulate human behavior. In addition, job loss is a real issue. One possibility is that we could tax corporations relative to how many workers they laid off and how many jobs they created.

Thinking about AGI, governments should probably not be involved now (besides perhaps funding some of the research), since we have so little clarity on what the problem is and what needs to be done. We do need people monitoring risks, but there’s a pretty robust existing community doing this, so government doesn't need to be involved.

Rohin's opinion: I disagree with Wendell that current machine ethics will be necessary for AI alignment -- that might be the case, but it seems like things change significantly once our AI systems are smart enough to actually understand our moral systems, so that we no longer need to design special procedures to embed ethical reasoning in the AI system.

It does seem useful to have coordination on governance, along the lines of governance coordinating committees; it seems a lot better if there's only one or two groups that we need to convince of the importance of an issue, rather than 53 (!!).

Other progress in AI   Reinforcement learning

Learning to Predict Without Looking Ahead: World Models Without Forward Prediction (C. Daniel Freeman et al) (summarized by Sudhanshu): One critique of the World Models (AN #23) paper was that in any realistic setting, you only want to learn the features that are important for the task under consideration, while the VAE used in the paper would learn features for state reconstruction. This paper instead studies world models that are trained directly from reward, rather than by supervised learning on observed future states, which should lead to models that only focus on task-relevant features. Specifically, they use observational dropout on the environment percepts, where the true state is passed to the policy with a peek probability p, while a neural network, M, generates a proxy state with probability 1 - p. At the next time-step, M  takes the same input as the policy, plus the policy's action, and generates the next proxy state, which then may get passed to the controller, again with probability 1 - p.

They investigate whether the emergent 'world model' M behaves like a good forward predictive model. They find that even with very low peek probability e.g. p = 5%, M learns a good enough world model that enables the policy to perform reasonably well. Additionally, they find that world models thus learned can be used to train policies that sometimes transfer well to the real environment. They claim that the world model only learns features that are useful for task performance, but also note that interpretability of those features depends on inductive biases such as the network architecture.
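The observational-dropout loop described above can be sketched as follows (toy dynamics and a fixed stub policy, purely illustrative of the peek mechanism):

```python
import random

# Toy illustration of observational dropout. With probability p ("peek")
# the policy sees the true environment state; otherwise it sees a proxy
# state generated by the model M. Dynamics and policy are stubs.
def true_env_step(state, action):
    return state + action  # toy environment dynamics

def M(prev_proxy, action):
    return prev_proxy + action  # learned proxy dynamics (stub)

def rollout(p, steps, seed=0):
    rng = random.Random(seed)
    true_state, proxy_state = 0, 0
    observations = []
    for _ in range(steps):
        obs = true_state if rng.random() < p else proxy_state
        observations.append(obs)
        action = 1  # fixed stub policy; a real policy would act on obs
        true_state = true_env_step(true_state, action)
        proxy_state = M(proxy_state, action)
    return observations

# With p = 1.0 the policy always peeks at the true state:
print(rollout(1.0, 3))  # [0, 1, 2]
```

In the real setup, M is trained only through the policy's task reward, which is what pushes it to model only task-relevant features.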

Sudhanshu's opinion: This work warrants a visit for the easy-to-absorb animations and charts. On the other hand, they make a few innocent-sounding observations that made me uncomfortable because they weren't rigorously proved nor labelled as speculation, e.g. a) "At higher peek probabilities, the learned dynamics model is not needed to solve the task thus is never learned.", and b) "Here, the world model clearly only learns reliable transition maps for moving down and to the right, which is sufficient."

While this is a neat bit of work, well presented, it is nevertheless unlikely that this (and most other current work in deep model-based RL) will scale to more complex alignment problems such as Embedded World-Models (AN #31); these world models do not capture the notion of an agent, and do not model the agent as an entity making long-horizon plans in the environment.

Deep learning

SATNet: Bridging deep learning and logical reasoning using a differentiable satisfiability solver (Po-Wei Wang et al) (summarized by Asya): Historically, deep learning architectures have struggled with problems that involve logical reasoning, since they often impose non-local constraints that gradient descent has a hard time learning. This paper presents a new technique, SATNet, which allows neural nets to solve logical reasoning problems by encoding them explicitly as MAXSAT-solving neural network layers. A MAXSAT problem provides a large set of logical constraints on an exponentially large set of options, and the goal is to find the option that satisfies as many logical constraints as possible. Since MAXSAT is NP-complete, the authors design a layer that solves a relaxation of the MAXSAT problem in its forward pass (which can be solved quickly, unlike MAXSAT itself), while the backward pass computes gradients as usual.
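As a concrete reference point for what is being relaxed, here is a brute-force MAXSAT solver over a toy instance (the clauses are invented for illustration; SATNet's layer solves a differentiable relaxation of this search rather than enumerating assignments):

```python
from itertools import product

# A toy MAXSAT instance (clauses invented for illustration). A positive
# integer i means "variable i is true"; negative means "variable i is false".
clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
n_vars = 3

def satisfied(clause, assignment):
    """A clause is satisfied if any of its literals holds."""
    return any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)

def max_sat(clauses, n_vars):
    """Brute-force search: feasible only at toy scale, since the
    assignment space grows as 2**n_vars."""
    best = max(
        product([False, True], repeat=n_vars),
        key=lambda a: sum(satisfied(c, a) for c in clauses),
    )
    return best, sum(satisfied(c, best) for c in clauses)

assignment, score = max_sat(clauses, n_vars)
print(score)  # 4 - all four toy clauses can be satisfied simultaneously
```

The exponential blow-up of this enumeration is exactly why a fast, differentiable relaxation is needed to make the layer usable inside a network.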

In experiments, SATNet is given bit representations of 9,000 9 x 9 Sudoku boards, which it uses to learn the logical constraints of Sudoku, and is then presented with 1,000 test boards to solve. SATNet vastly outperforms traditional convolutional neural networks given the same training / test setup, achieving 98.3% test accuracy where the convolutional net achieves 0%. It performs similarly well on a "Visual" Sudoku problem where the trained network consists of initial layers that perform digit recognition followed by SATNet layers, achieving 63.2% accuracy where the convolutional net achieves 0.1%.

Asya's opinion: My impression is that this is a big step forward in being able to embed logical reasoning in current deep learning techniques. From an engineering perspective, it seems extremely useful to be able to train systems that incorporate these layers end-to-end. It's worth being clear that in systems like these, a lot of generality is lost, since part of the network is explicitly carved out for solving a particular problem of logical constraints -- it would be hard to use the same network to learn a different problem.


AI Safety Unconference 2019 (David Krueger, Orpheus Lummis, and Gretchen Krueger) (summarized by Rohin): Like last year, there will be an AI safety unconference alongside NeurIPS, on Monday Dec 9 from 10am to 6pm. The website suggests a registration deadline of Nov 25; the organizers have told me it's a soft deadline, but you should probably register now to secure a place.

Copyright © 2019 Rohin Shah, All rights reserved.



Mental Mountains

November 27, 2019 - 08:30
Published on November 27, 2019 5:30 AM UTC


Kaj Sotala has an outstanding review of Unlocking The Emotional Brain; I read the book, and Kaj’s review is better.

He begins:

UtEB’s premise is that much if not most of our behavior is driven by emotional learning. Intense emotions generate unconscious predictive models of how the world functions and what caused those emotions to occur. The brain then uses those models to guide our future behavior. Emotional issues and seemingly irrational behaviors are generated from implicit world-models (schemas) which have been formed in response to various external challenges. Each schema contains memories relating to times when the challenge has been encountered and mental structures describing both the problem and a solution to it.

So in one of the book’s example cases, a man named Richard sought help for trouble speaking up at work. He would have good ideas during meetings, but felt inexplicably afraid to voice them. During therapy, he described his narcissistic father, who was always mouthing off about everything. Everyone hated his father for being a fool who wouldn’t shut up. The therapist conjectured that young Richard observed this and formed a predictive model, something like “talking makes people hate you”. This was overly general: talking only makes people hate you if you talk incessantly about really stupid things. But when you’re a kid you don’t have much data, so you end up generalizing a lot from the few examples you have.

When Richard started therapy, he didn’t consciously understand any of this. He just felt emotions (anxiety) at the thought of voicing his opinion. The predictive model output the anxiety, using reasoning like “if you talk, people will hate you, and the prospect of being hated should make you anxious – therefore, anxiety”, but not any of the intermediate steps. The therapist helped Richard tease out the underlying model, and at the end of the session Richard agreed that his symptoms were related to his experience of his father. But knowing this changed nothing; Richard felt as anxious as ever.

Predictions like “speaking up leads to being hated” are special kinds of emotional memory. You can rationally understand that the prediction is no longer useful, but that doesn’t really help; the emotional memory is still there, guiding your unconscious predictions. What should the therapist do?

Here UtEB dives into the science on memory reconsolidation.

Scientists have known for a while that giving rats the protein synthesis inhibitor anisomycin prevents them from forming emotional memories. You can usually give a rat noise-phobia by pairing a certain noise with electric shocks, but this doesn’t work if the rats are on anisomycin first. Probably this means that some kind of protein synthesis is involved in memory. So far, so plausible.

A 2000 study found that anisomycin could also erase existing phobias in a very specific situation. You had to “activate” the phobia – get the rats thinking about it really hard, maybe by playing the scary noise all the time – and then give them the anisomycin. This suggested that when the memory got activated, it somehow “came loose”, and the brain needed to do some protein synthesis to put it back together again.

Thus the idea of memory reconsolidation: you form a consolidated memory, but every time you activate it, you need to reconsolidate it. If the reconsolidation fails, you lose the memory, or you get a slightly different memory, or something like that. If you could disrupt emotional memories like “speaking out makes you hated” while they’re still reconsolidating, maybe you could do something about this.

Anisomycin is pretty toxic, so that’s out. Other protein synthesis inhibitors are also toxic – it turns out proteins are kind of important for life – so they’re out too. Electroconvulsive therapy actually seems to work pretty well for this – the shock disrupts protein formation very effectively (and the more I think about this, the more implications it seems to have). But we can’t do ECT on everybody who wants to be able to speak up at work more, so that’s also out. And the simplest solution – activating a memory and then reminding the patient that they don’t rationally believe it’s true – doesn’t seem to help; the emotional brain doesn’t speak Rationalese.

The authors of UtEB claim to have found a therapy-based method that works, which goes like this:

First, they tease out the exact predictive model and emotional memory behind the symptom (in Richard’s case, the narrative where his father talked too much and ended up universally hated, and so if Richard talks at all, he too will be universally hated). Then they try to get this as far into conscious awareness as possible (or, if you prefer, have consciousness dig as deep into the emotional schema as possible). They call this “the pro-symptom position” – giving the symptom as much room as possible to state its case without rejecting it. So for example, Richard’s therapist tried to get Richard to explain his unconscious pro-symptom reasoning as convincingly as possible: “My father was really into talking, and everybody hated him. This proves that if I speak up at work, people will hate me too.” She even asked Richard to put this statement on an index card, review it every day, and bask in its compellingness. She asked Richard to imagine getting up to speak, and feeling exactly how anxious it made him, while reviewing to himself that the anxiety felt justified given what happened with his father. The goal was to establish a wide, well-trod road from consciousness to the emotional memory.

Next, they try to find a lived and felt experience that contradicts the model. Again, Rationalese doesn’t work; the emotional brain will just ignore it. But it will listen to experiences. For Richard, this was a time when he was at a meeting, had a great idea, but didn’t speak up. A coworker had the same idea, mentioned it, and everyone agreed it was great, and congratulated the other person for having such an amazing idea that would transform their business. Again, there’s this same process of trying to get as much in that moment as possible, bring the relevant feelings back again and again, create as wide and smooth a road from consciousness to the experience as possible.

Finally, the therapist activates the disruptive emotional schema, and before it can reconsolidate, smashes it into the new experience. So Richard’s therapist makes use of the big wide road Richard built that let him fully experience his fear of speaking up, and asks Richard to get into that frame of mind (activate the fear-of-speaking schema). Then she asks him, while keeping the fear-of-speaking schema in mind, to remember the contradictory experience (coworker speaks up and is praised). Then the therapist vividly describes the juxtaposition while Richard tries to hold both in his mind at once.

And then Richard was instantly cured, and never had any problems speaking up at work again. His coworkers all applauded, and became psychotherapists that very day. An eagle named “Psychodynamic Approach” flew into the clinic and perched atop the APA logo and shed a single tear. Coherence Therapy: Practice Manual And Training Guide was read several times, and God Himself showed up and enacted PsyD prescribing across the country. All the cognitive-behavioralists died of schizophrenia and were thrown in the lake of fire for all eternity.

This is, after all, a therapy book.


I like UtEB because it reframes historical/purposeful accounts of symptoms as aspects of a predictive model. We already know the brain has an unconscious predictive model that it uses to figure out how to respond to various situations and which actions have which consequences. In retrospect, this framing perfectly fits the idea of traumatic experiences having outsized effects. Tack on a bit about how the model is more easily updated in childhood (because you’ve seen fewer other things, so your priors are weaker), and you’ve gone a lot of the way to traditional models of therapy.

But I also like it because it helps me think about the idea of separation/noncoherence in the brain. Richard had his schema about how speaking up makes people hate you. He also had lots of evidence that this wasn’t true, both rationally (his understanding that his symptoms were counterproductive) and experientially (his story about a coworker proposing an idea and being accepted). But the evidence failed to naturally propagate; it didn’t connect to the schema that it should have updated. Only after the therapist forced the connection did the information go through. Again, all of this should have been obvious – of course evidence doesn’t propagate through the brain, I was writing posts ten years ago about how even a person who knows ghosts exist will be afraid to stay in an old supposedly-haunted mansion at night with the lights off. But UtEB’s framework helps snap some of this into place.

UtEB’s brain is a mountainous landscape, with fertile valleys separated by towering peaks. Some memories (or pieces of your predictive model, or whatever) live in each valley. But they can’t talk to each other. The passes are narrow and treacherous. They go on believing their own thing, unconstrained by conclusions reached elsewhere.

Consciousness is a capital city on a wide plain. When it needs the information stored in a particular valley, it sends messengers over the passes. These messengers are good enough, but they carry letters, not weighty tomes. Their bandwidth is atrocious; often they can only convey what the valley-dwellers think, and not why. And if a valley gets something wrong, lapses into heresy, as often as not the messengers can’t bring the kind of information that might change their mind.

Links between the capital and the valleys may be tenuous, but valley-to-valley trade is almost non-existent. You can have two valleys full of people working on the same problem, for years, and they will basically never talk.

Sometimes, when it’s very important, the king can order a road built. The passes get cleared out, and high-bandwidth communication with a particular valley becomes possible. If he does this to two valleys at once, then they may even be able to share notes directly, each passing through the capital to get to each other. But it isn’t the norm. You have to really be trying.

This ended up a little more flowery than I expected, but I didn’t start thinking this way because it was poetic. I started thinking this way because of this:

Frequent SSC readers will recognize this as from Figure 1 of Friston and Carhart-Harris’ REBUS And The Anarchic Brain: Toward A Unified Model Of The Brain Action Of Psychedelics, which I review here. The paper describes it as “the curvature of the free-energy landscape that contains neuronal dynamics. Effectively, this can be thought of as a flattening of local minima, enabling neuronal dynamics to escape their basins of attraction and—when in flat minima—express long-range correlations and desynchronized activity.”

Moving back a step: the paper is trying to explain what psychedelics do to the brain. It theorizes that they weaken high-level priors (in this case, you can think of these as the tendency to fit everything to an existing narrative), allowing things to be seen more as they are:

A corollary of relaxing high-level priors or beliefs under psychedelics is that ascending prediction errors from lower levels of the system (that are ordinarily unable to update beliefs due to the top-down suppressive influence of heavily-weighted priors) can find freer register in conscious experience, by reaching and impressing on higher levels of the hierarchy. In this work, we propose that this straightforward model can account for the full breadth of subjective phenomena associated with the psychedelic experience.

These ascending prediction errors (ie noticing that you’re wrong about something) can then correct the high-level priors (ie change the narratives you tell about your life):

The ideal result of the process of belief relaxation and revision is a recalibration of the relevant beliefs so that they may better align or harmonize with other levels of the system and with bottom-up information—whether originating from within (e.g., via lower-level intrinsic systems and related interoception) or, at lower doses, outside the individual (i.e., via sensory input or extroception). Such functional harmony or realignment may look like a system better able to guide thought and behavior in an open, unguarded way (Watts et al., 2017; Carhart-Harris et al., 2018b).

This makes psychedelics a potent tool for psychotherapy:

Consistent with the model presented in this work, overweighted high-level priors can be all consuming, exerting excessive influence throughout the mind and brain’s (deep) hierarchy. The negative cognitive bias in depression is a good example of this (Beck, 1972), as are fixed delusions in psychosis (Sterzer et al., 2018).25 In this paper, we propose that psychedelics can be therapeutically effective, precisely because they target the high levels of the brain’s functional hierarchy, primarily affecting the precision weighting of high-level priors or beliefs. More specifically, we propose that psychedelics dose-dependently relax the precision weighting of high-level priors (instantiated by high-level cortex), and in so doing, open them up to an upsurge of previously suppressed bottom-up signaling (e.g., stemming from limbic circuitry). We further propose that this sensitization of high-level priors means that more information can impress on them, potentially inspiring shifts in perspective, felt as insight. One might ask whether relaxation followed by revision of high-level priors or beliefs via psychedelic therapy is easy to see with functional (and anatomic) brain imaging. We presume that it must be detectable, if the right questions are asked in the right way.

Am I imagining this, or are Friston + Carhart-Harris and Unlocking The Emotional Brain getting at the same thing?

Both start with a piece of a predictive model (= high-level prior) telling you something that doesn’t fit the current situation. Both also assume you have enough evidence to convince a rational person that the high-level prior is wrong, or doesn’t apply. But you don’t automatically smash the prior and the evidence together and perform an update. In UtEB‘s model, the update doesn’t happen until you forge conscious links to both pieces of information and try to hold them in consciousness at the same time. In F+CH’s model, the update doesn’t happen until you take psychedelics which make the high-level prior lose some of its convincingness. UtEB is trying to laboriously build roads through mountains; F+CH are trying to cast a magic spell that makes the mountains temporarily vanish. Either way, you get communication between areas that couldn’t communicate before.


Why would mental mountains exist? If we keep trying to get rid of them, through therapy or psychedelics, or whatever, then why not just avoid them in the first place?

Maybe generalization is just hard (thanks to MC for this idea). Suppose Goofus is mean to you. You learn Goofus is mean; if this is your first social experience, maybe you also learn that the world is mean and people have it out for you. Then one day you meet Gallant, who is nice to you. Hopefully the system generalizes to “Gallant is nice, Goofus is still mean, people in general can go either way”.

But suppose one time Gallant is just having a terrible day, and curses at you, and that time he happens to be wearing a red shirt. You don’t want to overfit and conclude “Gallant wearing a red shirt is mean, Gallant wearing a blue shirt is nice”. You want to conclude “Gallant is generally nice, but sometimes slips and is mean.”

But any algorithm that gets too good at resisting the temptation to separate out red-shirt-Gallant and blue-shirt-Gallant risks falling into the opposite failure mode where it doesn’t separate out Gallant and Goofus. It would just average them out, and conclude that people (including both Goofus and Gallant) are medium-niceness.

And suppose Gallant has brown eyes, and Goofus green eyes. You don’t want your algorithm to overgeneralize to “all brown-eyed people are nice, and all green-eyed people are mean”. But suppose the Huns attack you. You do want to generalize to “All Huns are dangerous, even though I can keep treating non-Huns as generally safe”. And you want to do this as quickly as possible, definitely before you meet any more Huns. And the quicker you are to generalize about Huns, the more likely you are to attribute false significance to Gallant’s eye color.

The end result is a predictive model which is a giant mess, made up of constant “This space here generalizes from this example, except this subregion, which generalizes from this other example, except over here, where it doesn’t, and definitely don’t ever try to apply any of those examples over here.” Somehow this all works shockingly well. For example, I spent a few years in Japan, and developed a good model for how to behave in Japanese culture. When I came back to the United States, I effortlessly dropped all of that and went back to having America-appropriate predictions and reflexive actions (except for an embarrassing habit of bowing whenever someone hands me an object, which I still haven’t totally eradicated).

In this model, mental mountains are just the context-dependence that tells me not to use my Japanese predictive model in America, and which prevents evidence that makes me update my Japanese model (like “I notice subways are always on time”) from contaminating my American model as well. Or which prevent things I learn about Gallant (like “always trust him”) from also contaminating my model of Goofus.

There’s actually a real-world equivalent of the “red-shirt-Gallant is bad, blue-shirt-Gallant is good” failure mode. It’s called “splitting”, and you can find it in any psychology textbook. Wikipedia defines it as “the failure in a person’s thinking to bring together the dichotomy of both positive and negative qualities of the self and others into a cohesive, realistic whole.”

In the classic example, a patient is in a mental hospital. He likes his doctor. He praises the doctor to all the other patients, says he’s going to nominate her for an award when he gets out.

Then the doctor offends the patient in some way – maybe refuses one of his requests. All of a sudden, the doctor is abusive, worse than Hitler, worse than Mengele. When he gets out he will report her to the authorities and sue her for everything she owns.

Then the doctor does something right, and it’s back to praise and love again.

The patient has failed to integrate his judgments about the doctor into a coherent whole, “doctor who sometimes does good things but other times does bad things”. It’s as if there’s two predictive models, one of Good Doctor and one of Bad Doctor, and even though both of them refer to the same real-world person, the patient can only use one at a time.

Splitting is most common in borderline personality disorder. The DSM criteria for borderline include splitting (there defined as “a pattern of unstable and intense interpersonal relationships characterized by alternating between extremes of idealization and devaluation”). They also include things like “markedly and persistently unstable self-image or sense of self”, and “affective instability due to a marked reactivity of mood”, which seem relevant here too.

Some therapists view borderline as a disorder of integration. Nobody is great at having all their different schemas talk to each other, but borderlines are atrocious at it. Their mountains are so high that even different thoughts about the same doctor can’t necessarily talk to each other and coordinate on a coherent position. The capital only has enough messengers to talk to one valley at a time. If tribesmen from the Anger Valley are advising the capital today, the patient becomes truly angry, a kind of anger that utterly refuses to listen to any counterevidence, an anger pure beyond your imagination. If they are happy, they are purely happy, and so on.

About 70% of people diagnosed with dissociative identity disorder (previously known as multiple personality disorder) have borderline personality disorder. The numbers are so high that some researchers are not even convinced that these are two different conditions; maybe DID is just one manifestation of borderline, or especially severe borderline. Considering borderline as a failure of integration, this makes sense; DID is total failure of integration. People in the furthest mountain valleys, frustrated by inability to communicate meaningfully with the capital, secede and set up their own alternative provincial government, pulling nearby valleys into their new coalition. I don’t want to overemphasize this; most popular perceptions of DID are overblown, and at least some cases seem to be at least partly iatrogenic. But if you are bad enough at integrating yourself, it seems to be the sort of thing that can happen.

In his review, Kaj relates this to Internal Family Systems, a weird form of therapy where you imagine your feelings as people/entities and have discussions with them. I’ve always been skeptical of this, because feelings are not, in fact, people/entities, and it’s unclear why you should expect them to answer you when you ask them questions. And in my attempts to self-test the therapy, indeed nobody responded to my questions and I was left feeling kind of silly. But Kaj says:

As many readers know, I have been writing a sequence of posts on multi-agent models of mind. In Building up to an Internal Family Systems model, I suggested that the human mind might contain something like subagents which try to ensure that past catastrophes do not repeat. In subagents, coherence, and akrasia in humans, I suggested that behaviors such as procrastination, indecision, and seemingly inconsistent behavior result from different subagents having disagreements over what to do.

As I already mentioned, my post on integrating disagreeing subagents took the model in the direction of interpreting disagreeing subagents as conflicting beliefs or models within a person’s brain. Subagents, trauma and rationality further suggested that the appearance of drastically different personalities within a single person might result from unintegrated memory networks, which resist integration due to various traumatic experiences.

This post has discussed UtEB’s model of conflicting emotional schemas in a way which further equates “subagents” with beliefs – in this case, the various schemas seem closely related to what e.g. Internal Family Systems calls “parts”. In many situations, it is probably fair to say that this is what subagents are.

This is a model I can get behind. My guess is that in different people, the degree to which mental mountains form a barrier will cause the disconnectedness of valleys to manifest as anything from “multiple personalities”, to IFS-findable “subagents”, to UtEB-style psychiatric symptoms, to “ordinary” beliefs that don’t cause overt problems but might not be very consistent with each other.


This last category forms the crucial problem of rationality.

One can imagine an alien species whose ability to find truth was a simple function of their education and IQ. Everyone who knows the right facts about the economy and is smart enough to put them together will agree on economic policy.

But we don’t work that way. Smart, well-educated people believe all kinds of things, even when they should know better. We call these people biased, a catch-all term meaning something that prevents them from having true beliefs they ought to be able to figure out. I believe most people who don’t believe in anthropogenic climate change are probably biased. Many of them are very smart. Many of them have read a lot on the subject (empirically, reading more about climate change will usually just make everyone more convinced of their current position, whatever it is). Many of them have enough evidence that they should know better. But they don’t.

(again, this is my opinion, sorry to those of you I’m offending. I’m sure you think the same of me. Please bear with me for the space of this example.)

Compare this to Richard, the example patient mentioned above. Richard had enough evidence to realize that companies don’t hate everyone who speaks up at meetings. But he still felt, on a deep level, like speaking up at meetings would get him in trouble. The evidence failed to connect to the emotional schema, the part of him that made the real decisions. Is this the same problem as the global warming case? Where there’s evidence, but it doesn’t connect to people’s real feelings?

(maybe not: Richard might be able to say “I know people won’t hate me for speaking, but for some reason I can’t make myself speak”, whereas I’ve never heard someone say “I know climate change is real, but for some reason I can’t make myself vote to prevent it.” I’m not sure how seriously to take this discrepancy.)

In Crisis of Faith, Eliezer Yudkowsky writes:

Many in this world retain beliefs whose flaws a ten-year-old could point out, if that ten-year-old were hearing the beliefs for the first time. These are not subtle errors we’re talking about. They would be child’s play for an unattached mind to relinquish, if the skepticism of a ten-year-old were applied without evasion…we change our minds less often than we think.

This should scare you down to the marrow of your bones. It means you can be a world-class scientist and conversant with Bayesian mathematics and still fail to reject a belief whose absurdity a fresh-eyed ten-year-old could see. It shows the invincible defensive position which a belief can create for itself, if it has long festered in your mind.

What does it take to defeat an error that has built itself a fortress?

He goes on to describe how hard this is, to discuss the “convulsive, wrenching effort to be rational” that he thinks this requires, the “all-out [war] against yourself”. Some of the techniques he mentions explicitly come from psychotherapy, others seem to share a convergent evolution with it.

The authors of UtEB stress that all forms of therapy involve their process of reconsolidating emotional memories one way or another, whether they know it or not. Eliezer’s work on crisis of faith feels like an ad hoc form of epistemic therapy, one with a similar goal.

Here, too, there is a suggestive psychedelic connection. I can’t count how many stories I’ve heard along the lines of “I was in a bad relationship, I kept telling myself that it was okay and making excuses, and then I took LSD and realized that it obviously wasn’t, and got out.” Certainly many people change religions and politics after a psychedelic experience, though it’s hard to tell exactly what part of the psychedelic experience does this, and enough people end up believing various forms of woo that I hesitate to say it’s all about getting more rational beliefs. But just going off anecdote, this sometimes works.

Rationalists wasted years worrying about various named biases, like the conjunction fallacy or the planning fallacy. But most of the problems we really care about aren’t any of those. They’re more like whatever makes the global warming skeptic fail to connect with all the evidence for global warming.

If the model in Unlocking The Emotional Brain is accurate, it offers a starting point for understanding this kind of bias, and maybe for figuring out ways to counteract it.


Could someone please start a bright home lighting company?

November 26, 2019 - 22:20
Published on November 26, 2019 7:20 PM UTC

Elevator pitch: Bring enough light to simulate daylight into your home and office.

This idea has been shared in Less Wrong circles for a couple years. Yudkowsky wrote Inadequate Equilibria in 2017 where he and his wife invented the idea, and Raemon wrote a playbook in 2018 for how to do it yourself. Now I and at least two other friends are trying to build something similar, and I suspect there's a bigger-than-it-looks market opportunity here because it's one of those things that a lot of people would probably want, if they knew it existed and could experience it. And it's only recently become cheap enough to execute well.

Coelux makes a high-end artificial skylight which certainly looks awesome, but it costs upwards of $30k and also takes a lot of headroom in the ceiling. Can we do better for cheaper?

Brightness from first principles

First let's clear up some definitions:

  • Watts is a measure of power consumption, not brightness.

    • "Watt equivalent" brightness is usually listed for LED bulbs, at least for the standard household bulb form factor. It is confusing and you should generally ignore it; just look at the lumens rating instead. Normally "watt equivalent" is computed by dividing lumens by 15 or so. (Bulb manufacturers like to make LED bulbs that are easy to compare, by having similar brightness to the incandescents they replace, hence "watt equivalent".)
  • Lumens measure the total light output of an individual bulb, but say nothing about how that light is distributed. For that you want to be doing math to estimate lux.

  • "Lux" is a measure of illuminance: how bright the light falling on a certain surface (such as a wall or your face) is. Lux is measured in lumens per square meter. (Total lumens, by contrast, measure "luminous flux", the bulb's raw output.) Usually, your end goal when designing lighting is to create a certain amount of lux.

    • Direct sunlight is about 100k lux (source for these figures: Wikipedia)
    • Full daylight (indirect) is more than 10k lux
    • An overcast day or bright TV studio lighting is 1000 lux
    • Indoor office lighting is typically 500 lux
    • An indoor living room at night might be only 50 lux

Side note: This scale surprises me greatly! We make good use of vision across four or more orders of magnitude of lux within a single day. Our human vision hardware is doing a lot of work to make the world look reasonable across these vast differences in the amount of light. Regardless, this post is about getting a lot of lux. I hypothesize that lux is associated with both happiness and productivity, and during the "dark season", when we don't get as much lux from the sun, I'm looking to get some from artificial lights.

If you put a single 1000-lumen (66-watt-equivalent) omnidirectional bulb in the center of a spherical room of 2m radius (which approximates a 12' square bedroom), its output spreads over the sphere's 4π · (2m)² ≈ 50 m² of surface, giving about 20 lux at the walls. So now we can get a sense of the scope of the problem. When doctors say you should be getting 10,000 lux for 30 minutes a day, the defaults for home lighting are more than two orders of magnitude off.

  • Raemon's bulbs are "100W equivalent", which is ~1500 lumens per bulb, so he's got 36k lumens. If we treat this as a point source and expect that Raemon's head is 2m away from the bulbs, then he's getting roughly 700 lux (36,000 / (4π · 2²)), within a factor of two of the 1000-lux "TV studio" level, which seems pretty respectable. I haven't accounted for reflected light from the ceiling either, so reality might be better than this, but I doubt it changes the calculation by more than a factor of 2 -- but I don't have a robust way of estimating ambient light, so ideas are welcome.
  • David Chapman's plan uses three 20k-lumen LED light bars for offroad SUV driving, for a total of 60k lumens. But because the light bars aim the light at a relatively focused point on the floor, David estimates that most of that light is being delivered to a roughly 6-square-meter workspace for a total of 10k lux. The photos he shared of his workspace seem to support this estimate.
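
The back-of-the-envelope arithmetic in this section can be sketched as a simplified point-source model. It deliberately ignores reflections and fixture directionality, so real rooms will usually do somewhat better than these numbers:

```python
import math

def point_source_lux(lumens, distance_m):
    """Illuminance from an omnidirectional point source: its lumens
    spread over a sphere of area 4*pi*r^2, so lux falls off with the
    square of distance. Reflected/ambient light is ignored."""
    return lumens / (4 * math.pi * distance_m ** 2)

# A single 1000-lumen bulb seen from 2 m away:
print(round(point_source_lux(1000, 2)))    # → 20 lux
# 36k lumens of ceiling bulbs, head 2 m away:
print(round(point_source_lux(36_000, 2)))  # → 716 lux
# 60k lumens of light bars focused onto a ~6 m^2 workspace:
print(round(60_000 / 6))                   # → 10000 lux
```

The comparison makes the design lesson clear: aiming the light at the area you occupy, as the light-bar plan does, buys you as much as an order of magnitude over letting the same lumens spread in all directions.
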
Other important factors besides brightness

Color temperature seems important to well-being. Color temperature is measured in kelvins with reference to black-body radiation, but you can think of it as, on the spectrum from "warm white" to "cool white", what do you prefer? Raemon's plan uses an even split between 2700K and 5000K bulbs. 2700K is quite yellow-y, 5000K is nearly pure white. In my experimentation I discovered that I liked closer to 5000K in the mornings and closer to 2700K in evenings.

And what about light distribution? Large "panels" of bright light would seem the closest to daylight in form-factor. Real windows are brighter near the top, and it is considered dramatic and unnatural to have bright lighting coming from the ground. Also, single bright point sources are painful to look at and can seem harsh. I think there's a lot of flexibility here, but I think my personal ideal light would be a large, window-sized panel of light mounted on the ceiling or high on the wall.

Also, color accuracy: LEDs are notoriously narrow spectrum by default; manufacturers have to do work to make their LEDs look more like incandescent bulbs in how they light up objects of different colors. Check for a measure called Color Rendering Index, or CRI, in product descriptions. 100 is considered perfect color rendering, and anything less than 80 looks increasingly awful as you go down. The difference between CRI 80 and 90 is definitely noticeable to some people. I haven't blind-tested myself, and might be imagining it, but I feel like there was some kind of noticeable upgrade in the "coziness" or "warmth" of my room when upgrading from CRI 80 to CRI 95 bulbs.

Dimmability? (Are you kidding? We want brightness, not dimness!) Okay, fine, if you insist. Most high-end LED bulbs seem dimmable today, so I hope this is not an onerous requirement.

Last thing I can think of is flicker. I have only seen flicker as a major problem with really low-end bulbs, but I can easily see and be annoyed by 60 Hz flicker out of the corner of my eye. Cheap Christmas LED light strings have super bad flicker, but it seems like manufacturers of nicer LEDs today have caught on, because I haven't had any flicker problems with LED bulbs in years.

Okay, so to summarize: I want an all-in-one "light panel" that produces at least 20,000 lumens and can be mounted to a wall or ceiling, with no noticeable flicker, good CRI, and adjustable (perhaps automatically adjusting) color temperature throughout the day.

A redditor made a fake window for their basement which is quite impressive for under $200. This is definitely along the axis I am imagining.

I haven't mentioned operating cost. Full-spectrum LEDs seem to output about 75 lumens per watt, so if our panel is 20k lumens then we should expect our panel to draw 266 watts. This seems reasonable to me. If you leave it on 8 hours a day, you're going to use 25 cents per day in electricity (at $.12 per kWh).
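The arithmetic above can be sketched as a small calculator; the 75 lm/W efficacy, 8 hours, and $0.12/kWh rate are the assumptions stated in the paragraph:

```python
def panel_operating_cost(lumens: float, lm_per_watt: float = 75,
                         hours_per_day: float = 8,
                         usd_per_kwh: float = 0.12) -> tuple[float, float]:
    """Power draw (watts) and daily electricity cost (USD) for an LED panel."""
    watts = lumens / lm_per_watt
    daily_cost = watts * hours_per_day / 1000 * usd_per_kwh
    return watts, daily_cost

watts, cost = panel_operating_cost(20_000)
print(f"{watts:.1f} W, ${cost:.2f}/day")  # -> 266.7 W, $0.26/day
```

Changing any of the defaults (cheaper electricity, more efficient LEDs, longer hours) scales the answer linearly.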

Marketing and Costs

What do you think people will pay for the product? I have already put 6+ hours into researching this and don't have a satisfactory solution yet. I would probably pay at least $400 to get that time back, if the result satisfied all my requirements; I expect to put in quite a bit more time, so I think I could probably be convinced to pay north of $1000 for a really good product. Hard to say what others would pay, but I wouldn't be surprised if you could build a good product in the $400-1200 range that would be quite popular.

What about costs? Today, Home Depot sells Cree 90-CRI, 815-lumen bulbs on their website for $1.93 per bulb for a cost of $2.37 per 1000 lumens. This is the cheapest I've seen high quality bulbs. (The higher lumen bulbs are annoyingly quite a bit more expensive). To get 36k lumens at this price costs under $100 retail. Presumably there are cooling considerations when packing LEDs close together but those seem solvable if you're doing the "panel" form factor. There are other costs I'm sure, but it seems like the LEDs and driver are likely to dominate most of the costs. These are dimmable but not color temperature adjustable.

Yuji LEDs sells 2700K-6500K dimmable LED strips, also with 95+ CRI, at $100 for 6250 lumens (so a cost of $16 per 1000 lumens). This is 7x more expensive per lumen, but knowing that it exists is really helpful.
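The price comparison between the two options can be sketched the same way, using the retail prices quoted above:

```python
import math

def usd_per_kilolumen(price_usd: float, lumens: float) -> float:
    """Normalize a product's price to dollars per 1000 lumens."""
    return price_usd / (lumens / 1000)

cree = usd_per_kilolumen(1.93, 815)   # Cree 815-lumen bulb
yuji = usd_per_kilolumen(100, 6250)   # Yuji LED strip
print(f"${cree:.2f} vs ${yuji:.2f} per 1000 lumens, {yuji / cree:.1f}x")

# Bulbs needed to reach 36k lumens, and the retail cost:
bulbs = math.ceil(36_000 / 815)
print(bulbs, f"${bulbs * 1.93:.2f}")  # -> 45 $86.85
```

Normalizing to dollars per kilolumen makes products with very different form factors directly comparable.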

Promotion and Distribution

Kickstarter is the obvious idea for getting this idea out there. I would also recommend starting a subreddit (if it doesn't exist; I haven't checked yet) for do-it-yourselfers who want to build or buy really bright lighting systems for their homes, as I think there is probably enough sustained interest in such a topic for it to exist.

You can also try to get press. The idea of "indoor light as bright as daylight" is probably somewhat viral, so I'd hope you can get people to write about you. Coelux got a bunch of press a few years ago doing this exact thing, but their product is so expensive that they don't even list the price on their website; in articles about Coelux you can see people commenting that they wish they could afford one.

I do think the idea needs to be spread more. Most people don't know this is possible, so there's a lot of work you'll be doing to just explain that such a thing is possible and healthy.


Competition

I don't think there's any relevant competition out there today. Coelux is super high end. The real competition is do-it-yourselfers, but this market is far bigger than the number of people who are excited to do it themselves.

Some have mentioned "high bay" lights, which are designed to be mounted high in warehouses and such, and throw a light cone a long distance to the floor. I am excited to try this and I will probably try it next, but I am not super optimistic about it because I expect it to be quite harsh. This is the one that Yuji sells, but you can find cheaper and presumably lower-quality ones on Amazon.

Part of my motivation for writing this blog post is to source ideas for other things that exist that could fill this niche. Comment here if you solved this problem in a way I haven't described! I'll update this post with ideas. If you start this company, also email me and I'll buy one and try your product and probably write about it :)

Building a Sustainable Business

If you put a bunch of research into designing a really great product and it succeeds but gets effectively copied by low-cost clones, you'll be sad. I am not sure how to defend this, and I think it is probably the weakest point of this business model; but it is a weakness that many hardware companies share, and a lot of them still carve out a niche. One idea would be to build up your product's branding and reputation, by explaining why low-cost clones suck in various ways. Another is just to give really good service. Lastly, if you avoid manufacturing things in China, maybe Chinese clone companies won't copy your technology as quickly.


3 Cultural Infrastructure Ideas from MAPLE

Published on November 26, 2019 6:56 PM UTC

About six months ago, I moved to the Monastic Academy in Vermont. MAPLE for short.

You may have curiosities / questions about what that is and why I moved there. But I'll save that for another time.

I was having a conversation last week about some cultural infrastructure that exists at MAPLE (that I particularly appreciate), and these ideas seemed worth writing up.

Note that MAPLE is a young place, less than a decade old in its current form. So, much of it is "experimental." These ideas aren't time-tested. But my personal experience of them has been surprisingly positive, so far.

I hope you get value out of playing with these ideas in your head or even playing with various implementations of these ideas.

1. The Care Role or Care People

MAPLE loves its roles. All residents have multiple roles in the community.

Some of them are fairly straightforward and boring. E.g. someone's role is to write down the announcements made at meals and then post them on Slack later.

Some of them are like "jobs" or "titles". E.g. someone is the bookkeeper. Someone is the Executive Director.

One special role I had the honor of holding for a few months was the Care role.

The Care role's primary aim is to watch over the health and well-being of the community as a group. This includes their physical, mental, emotional, and psychological well-being.

The Care role has a few "powers."

The Care role can offer check-ins or "Care Talks" to people. So if I, in the Care role, notice someone seems to be struggling emotionally, I can say, "Hey would you like to check in at some point today?" and then schedule such a meeting. (MAPLE has a strict schedule, and this is not something people would normally be able to do during work hours, but it's something Care can do.)

People can also request Care Talks from Care.

The Care role also has the power to plan / suggest Care Days. These are Days for community bonding and are often either for relaxation or emotional processing. Some examples of Care Days we had: we went bowling; we did a bunch of Circling; we visited a nearby waterfall.

The Care role can request changes to the schedule if they believe it would benefit the group's well-being. E.g. asking for a late wake-up. (Our usual wake-up is 4:40AM!)

Ultimately though, the point of this is that it's someone's job to watch over the group in this particular way. That means attending to the group field, learning how to read people even when they are silent, being attentive to individuals but also to the "group as a whole."

For me as Care, it gave me the permission and affordance to devote part of my brain function to tracking the group. Normally I would not bother devoting that much energy and attention to it because I know I wouldn't be able to do much about it even if I were tracking it.

Why devote a bunch of resources to tracking something without the corresponding ability / power to affect it?

But since it was built into the system, I got full permission to track it and then had at least some options for doing something about what I was noticing.

This was also a training opportunity for me. I wasn't perfect at the job. I felt drained sometimes. I got snippy and short sometimes. But it was all basically allowing me to train and improve at the job as I was doing it. No one is perfect at the Care role, though some people are more suitable than others.

The Care role also has a Care assistant. The Care assistant is someone to pick up the slack when needed or if Care goes on vacation or something. In practice, I split Care Talks fairly evenly with the Care assistant, since those are a lot for one person to handle. And, people tend to feel more comfortable with certain Care people over others, so it's good to give them an option. The Care assistant is also a good person for the Care role to get support from, since it tends to be more challenging for the Care role to receive Care themselves.

I could imagine, for larger groups, having a Care Team rather than a single Care role with Care assistant.

That said, there is a benefit to having one person hold the mantle primarily. Which is to ensure that someone is mentally constructing a model of the group plus many of the individuals within it, keeping the bird's eye view map. This should be one of Care's main mental projects. If you try to distribute this task amongst multiple people, you'll likely end up with a patchy, stitched-together map.

In addition, understanding group dynamics and what impacts the group is another good mental project for the Care person. E.g. learning how it impacts the group when leaders exhibit stress. Learning how to use love languages to tailor care for individuals. Etc.

1.5. The Ops Role

As an addendum, it's worth mentioning the Ops role too.

At MAPLE, we follow a strict schedule and also have certain standards of behavior.

The Ops role is basically in charge of the schedule and the rules and the policies at MAPLE. They also give a lot of feedback to people (e.g. "please be on time"). This is a big deal. It is also probably the hardest role.

It is important for the Ops role and the Care role to not be the same person, if you can afford it.

The Ops role represents, in a way, "assertive care." The Care role represents "supportive care." These are terms about healthy, skillful parenting that I read originally from the book Growing Up Again.

You can read more about supportive and assertive care here.

Basically, assertive points to structure, and supportive points to nurture. Both are vital.

Care builds models of the group's physical and emotional well-being, of how their interactions are going, and of how to read people.

Ops builds models of what parts of the structure / schedule are important, how to be fair, how to be reasonable, noticing where things are slipping, building theories as to why, and figuring out adjustments. Ops has to learn how to give and receive feedback a lot more. Ops has to make a bunch of judgment calls about what would benefit the group and what would harm the group (in the short-term and long-term), and ultimately has to do it without a higher authority telling them what to do.

It's a difficult position, but it complements the Care role very well.

As Care, I noticed that people seemed to be worse off and struggled more when the Ops role failed to hold a strong, predictable, and reasonable container. The Ops role is doing something that ultimately cares for people's emotional, mental, and physical well-being—same as Care. But they do it from a place of more authority and power.

As Care, I would sometimes find myself wanting to do "Ops"-like things, like reminding people about rules or structures. But it's important for Care to avoid those tasks, so that people feel more open and don't have that "up for them" with Care. Care creates a space where people can process things and just get support.

It's not really beneficial for Care to take on the Ops role, or for Ops to take on the Care role; mixing the two creates floppiness and confusion.

2. Support Committees

Sometimes, people struggle at MAPLE. Once in a while, they struggle in a way that is more consistent and persistent, in an "adaptive challenge" way. A few Care Talks aren't sufficient for helping them.

If someone starts struggling in this way, MAPLE can decide to spin up a support committee for that person. Let's call this struggling person Bob.

The specific implementation at MAPLE (as far as I know, at this particular time) is:

  • Three people are selected to be on Bob's support committee.
  • Some factors in deciding those people include: Is Bob comfortable with them? Do they have time? Do they want to support Bob? Do they seem like they'd do a decent job of supporting Bob?
  • The way the decision actually gets made differs for each case, but it probably always involves the Executive Director.
  • The support committee meets with Bob about once a week.
  • They discuss ways they can be supportive to Bob. Could he use reminders to avoid caffeine? Could he use an exercise accountability person? Could he use regular Care Talks? Could he use help finding a therapist?
  • They also give Bob feedback of various kinds. E.g. maybe Bob has been making chit-chat during silent periods; maybe Bob has been yelling things at Alice when he gets scared; maybe Bob is taking naps during work period. In this frame, it should be clear that Bob is the responsible party for his own growth and improvement and well-being. Ultimately he has to hold to his commitments / responsibilities / roles in the community, and the support committee can't do that for him. But they can help him as much as seems reasonable / worth trying.
  • Current implementation of this doesn't have a pre-set deadline for when the committee ceases, but there are check-ins with the Executive Director to see how things are progressing with Bob and the support committee.
  • Sometimes it may eventually make sense to ask Bob to leave the community, if things aren't improving after some time has passed (3-6 months maybe?). If everyone has put in a reasonable best effort and Bob still can't hold to his commitments, then there may be a decision to part ways.
  • Hopefully most of the time, the support committee thing works enough to get Bob to a place where he's no longer struggling and can get back into the flow of things without a support committee.

I appreciate support committees!

They're trying to strike a tricky balance between being supportive and holding people accountable, and they keep communication channels open and treat it like a two-way street.

Bob isn't totally in the dark about what's going on. He isn't being suddenly told there's a problem and that he can't stay. He also isn't being held totally responsible, as one might be at a normal job. "Either shape up or ship out" sort of thing. It's also not the thing where people act "open and supportive" but really it's still "on you" to fix yourself, and no one lifts a finger, and you have to do all the asking.

With a support committee, Bob gets regular support from the community in a structured way. He gets to set goals for himself, in front of others. He gets regular feedback on how he's doing on those goals. If he needs help, he has people who can brainstorm with him on how to get that help, and if they commit to helping him in some way, they actually do it. If he needs someone to talk to, he can have regularly scheduled Care Talks.

He is neither being coddled nor neglected.

It's also helpful to generally foster a feeling that the community is here for you and that there's a desire to do what's best for everyone, from all parties.

Would this kind of thing work everywhere for all groups? No, of course not.

It's a bit resource-intensive as it currently is. It also seems to ask for a high skill level and value-aligned-ness from people. But there's room to play around with the specific format.

3. The Schedule

The Schedule at MAPLE is not viable for most people in most places.

But many people who come to stay at MAPLE find out that the Schedule is something they hugely benefit from having. It's often named as one of the main pros to MAPLE life.

Basically, there's a rigid schedule in place. It applies to five-and-a-half days out of the week. (Sundays are weird because we go into town to run an event; Mondays are off-schedule days.)

But most days, it's the same routine, and everyone follows it. (The mornings and evenings are the most regimented part of the day, with more flexibility in the middle part.)

4:40AM chanting. 5:30AM meditation. 7AM exercise. 8:05AM breakfast. Then work. Etc. Etc. Up until the last thing, 8:30PM chanting.

Which is more surprising:

  • The fact that most people, most of the time, show up on time to each of these activities? (Where "on time" means being a little bit early?)
  • Or the fact that often there's at least one person who's at least one minute late, despite there theoretically being very few other things going on, relatively speaking?


Anyway, here's why I think the Schedule is worth talking about as a cultural infrastructure idea:

It's more conducive to getting into spontaneous motion.

You don't have to plan (as much) about what you're going to do, when. The activities come one right after the other.

At MAPLE I don't get stuck in bed, wondering whether to get up now or later.

I have spent hours and hours of my life struggling with getting out of bed (yay depression). Here, regardless of my mood or energy level, I just get out of bed, and it's automatic, and I don't think about it, and suddenly I'm putting on my socks, and I'm out the door.

This has translated somewhat to my off-schedule / vacation days also.

When left to my own devices, I do not exercise. I have never managed to exercise regularly as an adult. While I'm on-schedule, I just do it. I don't push myself harder than I can push; sometimes I take it easy and focus on stretching and Tai Chi. But sometimes I sprint, and sometimes I get sore, and my stamina is noticeably higher than before.

This is so much better than what it was like without the Schedule! It has proven to be more effective than my years of attempts to debug the issue using introspection.

The Schedule lets me just skip the decision paralysis. I often find myself "just spontaneously doing it." It becomes automatic habit. Like starting the drive home and "waking up" to the fact I am now home.

This is relaxing. It's more relaxing to just exercise than to internally battle over whether to exercise. It's more relaxing to just get up and start the day than to internally struggle over whether to get up. There is relief in it.

It's easier to tell when people are going through something.

As Care, it was my job to track people's overall well-being.

As it turns out, if someone starts slipping on the Schedule (showing up even a bit late to things more often), it's often an indication of something deeper.

The Schedule provides all these little fishing lines that tug when someone could use some attention, and the feedback is much faster than a verbal check-in.

Sometimes I would find myself annoyed by someone falling through or breaking policy or whatever. If I dug into it, I'd often find out they were struggling on a deeper level. Like I might find out their mom is in the hospital, or they are struggling with a flare-up of chronic pain, or something like that.

Once I picked up on that pattern, I learned to view people's small transgressions or tardiness as a signal for more compassion, rather than less. Where my initial reaction might be to tense up, feel resistance, or get annoyed, I can remind myself that they're probably going through some Real Shit and that I would struggle in that situation too, and then I relax.

Everyone's doing it together.

Everyone doing something together creates conducive conditions for common knowledge, even when there's no speaking involved. Common knowledge is a huge efficiency gain. And I suspect it's part of why it's internally easy for me to "just do it." (And maybe points to why it's harder for me to "just do it" when no one else notices or cares.)

Having more shared reality with each other reduces the need for verbal communication, formal group decision-making processes, and internal waffling.

If everyone can see the fire in the kitchen, you don't need to say a word. People will just mobilize and put out the fire.

If everyone sees that Carol is late, and Carol knows everyone has seen that she is late, it's harder for anyone to create alternative stories, like "Carol was actually on time." No one has to waste time on that.

There are lots of more flexible versions of the Schedule that people use and benefit from already. Shared meals in group houses, for instance.

But I'd love to see more experimentation with this, in communities or group houses or organizations or what-have-you.

Dragon Army attempted some things in this vein, and I saw them getting up early and exercising together on a number of occasions. I'd love to see more attempts along these lines.


Effect of Advertising

Published on November 26, 2019 2:30 PM UTC

I've recently had several conversations around whether advertising is harmful, and specifically whether ads primarily work by tricking people into purchasing things they don't need. One way to think about this is: what would the world be like if we didn't allow advertising? No internet ads, TV ads, magazine ads, affiliate links, sponsored posts, product placement, everything. Let's also assume that enforcement is perfect, though of course edge cases would be very tricky. Here's my speculation about how this would change people's purchasing:

  • Products would be a lot stickier. A lot of advertising tries to move people between competitors. Sometimes it's an explicit "here's a way we're better" (ex: we don't charge late fees); other times it's a more general "you should think positively of our company" (ex: we agree with you on political issue Y). Banning ads would probably mean higher prices (Benham 2013), since it would be harder to compete on price.

  • Relatedly, it would be much harder to get many new products started. Say a startup makes a new credit card that keeps your purchase history private: right now a straightforward marketing approach would be (a) show that other credit cards are doing something their target audience doesn't like, (b) build on the audience's sense that this isn't ok, and (c) present the new card as a solution. Without ads they would likely still see uptake among people who were aware of the problem and actively looking for a solution, but mostly people would just stick with the well-known cards.

  • A major way ads work is by building brand associations: people who eat Powdermilk Biscuits are probably Norwegian bachelor farmers, listen to public radio, or want to signal something along those lines. Branded products both provide something of a service, by making more ways to signal identity, and charge for it, by being more expensive in order to pay for clever ad campaigns. Without ads we would probably still have these associations, however, and products that happened to be associated with coveted identities would still play this role. The way these associations developed would be less directed, though brands would probably still try pretty hard to influence them even without ads. You can also choose to signal the "frugal" identity, which lets you avoid the brand tax.

  • Reviewers would be much more trustworthy. There's a long history of reviewers getting 'captured' by the industry they review.

  • Purchases of things people hadn't tried before would decrease, both things that people were in retrospect happy to have bought and things they were not. One of the roles of advertising is to let people know about things that they would want to buy if they knew about them. But "buy stuff they don't need" isn't a great gloss for this, since after buying the products people often like them a lot. On the other hand, I do think this applies to children, and one of the things people learn as they grow up is how to interpret ads. Which is also why we have regulations on ads directed at kids.

Don't put too much stock in this: I work on the technical side of ads and don't have a great view into their social role, and even if I were in a role like that it would still be very hard to predict how the world would be different with such a large change. But broad "we'd see more of X and less of Y" analysis gives a way to explore the question, and I'm curious what other people's impressions are.

(Disclosure: I work in ads but am speaking only for myself. I may be biased, though if I thought my work was net negative I wouldn't do it.)

Comment via: facebook