# LessWrong.com News

A community blog devoted to refining the art of rationality
Updated: 12 minutes 12 seconds ago

### ACDT: a hack-y acausal decision theory

January 15, 2020 - 20:22
Published on January 15, 2020 5:22 PM UTC

Inspired by my post on problems with causal decision theory (CDT), here is a hacked version of CDT that seems to be able to imitate timeless decision theory (TDT) and functional decision theory[1] (FDT), as well as updateless decision theory (UDT) under certain circumstances.

Call this ACDT, for (a)causal decision theory. It is, essentially, CDT which can draw extra, acausal arrows on the causal graphs, and which attempts to figure out which graph represents the world it's in. The drawback is its lack of elegance; the advantage, if it works, is that it's simple to specify and focuses attention on the important aspects of deducing the graph.

Defining ACDT

CDT and the Newcomb problem

In the Newcomb problem, there is a predictor Ω who leaves two boxes, and predicts whether you will take one ("one-box") or both ("two-box"). If Ω predicts you will one-box, it has put a large prize in that first box; otherwise that box is empty. There is always a small consolation prize in the second box.

In terms of causal graphs, we can represent it this way:

The dark red node is the decision node, which the agent can affect. The green node is a utility node, whose value the agent cares about.

The CDT agent uses the "do" operator from Pearl's Causality. Essentially, all the incoming arrows to the decision node are cut (though the CDT agent keeps track of any information gained that way); then the CDT agent maximises its utility by choosing its action:

In this situation, the CDT agent will always two-box, since it treats Ω's decision as fixed, and in that case two-boxing dominates, since you get whatever's in the first box, plus the consolation prize.
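To make the dominance argument concrete, here is a minimal sketch (my own toy formalisation, with illustrative prize values; none of these names come from the post). With Ω's prediction held fixed, two-boxing beats one-boxing for either value of the prediction:

```python
# Illustrative prize values (assumptions, not from the post).
BIG, SMALL = 1_000_000, 1_000

def payoff(action, prediction):
    """Payoff given the agent's action and Ω's (fixed) prediction."""
    box1 = BIG if prediction == "one-box" else 0
    return box1 + (SMALL if action == "two-box" else 0)

# CDT cuts the link from its decision to the prediction, so it compares
# actions with the prediction held constant:
for prediction in ("one-box", "two-box"):
    assert payoff("two-box", prediction) > payoff("one-box", prediction)
print("two-boxing dominates under the CDT graph")
```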

ACDT algorithm

The ACDT algorithm is similar, except that when it cuts the causal links to its decision, it also adds potential links from that decision node to all the other nodes in the graph. Then it attempts to figure out which diagram is correct, and then maximises its utility in the CDT way.

Note that ACDT doesn't take a position on what these extra links are - whether they are pointing back in time or are reflecting some more complicated structure (such as the existence of predictors). It just assumes the links could be there, and then works from that.

In a sense, ACDT can be seen as anterior to CDT. How do we know that causality exists, and the rules it runs on? From our experience in the world. If we lived in a world where the Newcomb problem or the predictors exist problem were commonplace, then we'd have a different view of causality.

It might seem gratuitous and wrong to draw extra links coming out of your decision node - but it was also gratuitous and wrong to cut all the links that go into your decision node. Drawing these extra arrows undoes some of the damage, in a way that a CDT agent can understand (they don't understand things that cause their actions, but they do understand consequences of their actions).

ACDT and the Newcomb problem

As well as the standard CDT graph above, ACDT can also consider the following graph, with a link from its decision to Ω's prediction:

It now has to figure out which graph represents the better structure for the situation it finds itself in. If it's encountered the Newcomb problem before, and tried to one-box and two-box a few times, then it knows that the second graph gives more accurate predictions. And so it will one-box, just as well as the TDT family does.
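A toy sketch of this learning process (my own formalisation, not code from the post): ACDT keeps whichever candidate graph best predicts its observations over repeated play, then maximises utility CDT-style on the winning graph. The prize values and function names are all illustrative assumptions.

```python
BIG, SMALL = 1_000_000, 1_000   # illustrative prize values

def omega_predicts(action):     # an accurate predictor
    return action

def graph_cdt(action):          # CDT graph: prediction independent of the decision
    return "two-box"

def graph_acausal(action):      # extra acausal link: prediction tracks the decision
    return action

def score(graph, history):
    """How often the graph's predicted Ω-prediction matched what was observed."""
    return sum(graph(a) == p for a, p in history)

# Explore both actions a few times and record Ω's predictions.
history = [(a, omega_predicts(a)) for a in ["one-box", "two-box"] * 10]

best_graph = max([graph_cdt, graph_acausal], key=lambda g: score(g, history))

def payoff(action, prediction):
    return (BIG if prediction == "one-box" else 0) + (SMALL if action == "two-box" else 0)

choice = max(["one-box", "two-box"], key=lambda a: payoff(a, best_graph(a)))
print(choice)  # -> one-box
```

The acausal graph predicts every observation, the CDT graph only half of them, so ACDT adopts the acausal graph and one-boxes.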

Generalising from other agents

If the ACDT agent has not encountered Ω itself, but has seen it run the Newcomb problem for other agents, then the "figure out the true graph" task becomes more subtle. UDT and TDT are built from the assumption that equivalent algorithms/agents in equivalent situations will produce equivalent results.

But ACDT, built out of CDT and its solipsistic cutting process, has no such assumptions - at least, not initially. It has to learn that the fate of other, similar agents, is evidence for its own graph. Once it learns that generalisation, then it can start to learn from the experience of others.

ACDT on other decision problems

Predictors exist

Each round of the predictors exist problem has a graph similar to the Newcomb problem, with the addition of a node to repeat the game:

After a few rounds, the ACDT agent will learn that the following graph best represents its situation:

And it will then swiftly choose to leave the game.

Prisoner's dilemma with identical copy of itself

If confronted by the prisoner's dilemma with an identical copy of itself, the ACDT agent, though unable to formalise "we are identical", will realise that they always make the same decision:

And it will then choose to cooperate.

Parfit's hitchhiker

The Parfit's hitchhiker problem is as follows:

Suppose you're out in the desert, running out of water, and soon to die - when someone in a motor vehicle drives up next to you. Furthermore, the driver of the motor vehicle is a perfectly selfish ideal game-theoretic agent, and even further, so are you; and what's more, the driver is Paul Ekman, who's really, really good at reading facial microexpressions. The driver says, "Well, I'll convey you to town if it's in my interest to do so - so will you give me $100 from an ATM when we reach town?" Now of course you wish you could answer "Yes", but as an ideal game theorist yourself, you realize that, once you actually reach town, you'll have no further motive to pay off the driver. "Yes," you say. "You're lying," says the driver, and drives off leaving you to die.

For ACDT, it will learn the following graph:

And will indeed pay the driver.

XOR blackmail

XOR blackmail is one of my favourite decision problems:

An agent has been alerted to a rumor that her house has a terrible termite infestation that would cost her $1,000,000 in damages. She doesn’t know whether this rumor is true.

A greedy predictor with a strong reputation for honesty learns whether or not it’s true, and drafts a letter: I know whether or not you have termites, and I have sent you this letter iff exactly one of the following is true: (i) the rumor is false, and you are going to pay me $1,000 upon receiving this letter; or (ii) the rumor is true, and you will not pay me upon receiving this letter.

The predictor then predicts what the agent would do upon receiving the letter, and sends the agent the letter iff exactly one of (i) or (ii) is true. Thus, the claim made by the letter is true. Assume the agent receives the letter. Should she pay up?

The CDT agent will have the following graph:

And the CDT agent will make the simple and correct decision not to pay. ACDT can eventually reach the same conclusion, but may require more evidence. It also has to consider graphs of the following sort:

The error of evidential decision theory (EDT) is, in effect, to act as if the light green arrow existed: that they can affect the existence of the termites through their decision. ACDT, if confronted with similar problems often enough, will eventually learn that the light green arrow has no effect, while the dark green one does have an effect (more correctly: the model with the dark green arrow is more accurate, while the light green arrow doesn't add accuracy). It will then refuse to pay, just like the CDT agent does.

Not UDT: counterfactual mugging

The ACDT agent described above differs from UDT in that it doesn't pay the counterfactual mugger:

Ω appears and says that it has just tossed a fair coin, and given that the coin came up tails, it decided to ask you to give it $100. Whatever you do in this situation, nothing else will happen differently in reality as a result. Naturally you don't want to give up your $100. But Ω also tells you that if the coin came up heads instead of tails, it'd give you $10,000, but only if you'd agree to give it $100 if the coin came up tails.
Do you give Ω the $100?

Non-coincidentally, this problem is difficult to represent in a causal graph. One way of seeing it could be this way:

Here the behaviour of the agent in the tails world determines Ω's behaviour in the heads world. It would be tempting to try to extend ACDT by drawing an arrow from that decision node to the Ω node in the heads world.

But that doesn't work, because that decision only happens in the tails world - in the heads world, the agent has no decision to make, so ACDT will do nothing. And in the tails world, the heads world is only counterfactually relevant.

Now ACDT, like EDT, can learn, in some circumstances, to pay the counterfactual mugger. If this scenario happens a lot, then it can note that agents that pay in the tails world get rewarded in the heads world, thus getting something like this:

But that's a bit too much of a hack, even for a hack-y method like this. More natural and proper would be to have the ACDT agent not use its decision as the node to cut-and-add-links from, but its policy (or, as in this post, its code). In that case, the counterfactual mugging can be represented as a graph by the ACDT agent:

The ACDT agent might have issues with fully acausal trade (though, depending on your view, this might be a feature not a bug).

The reason is that, since the ACDT agent never gets to experience acausal trade, it never gets to check whether there is a link between it and hypothetical other agents. Imagine a Newcomb problem where you never get to see the money (which may be going to a charity you support, but that charity may not exist either), nor whether Ω exists.

If an ACDT agent ever discovered acausal trade, it would have to do so in an incremental fashion. It would first have to become comfortable enough with prediction problems that drawing links to predictors is a natural thing for it to do. It would have to become comfortable enough with hypothetical arguments being correct that it could generalise to situations where it cannot ever get any empirical evidence.

So whether an ACDT agent ever engages in fully acausal trade depends on how it generalises from examples.

Neural nets learning to be ACDT

It would be interesting to program a neural net ACDT agent, based on these examples. If anyone is interested in doing so, let me know and go ahead.

Learning graphs and priors over graphs

The ACDT agent is somewhat slow and clunky at learning, needing quite a few examples before it can accept unconventional setups.

If we want it to go faster, we can choose to modify its priors. For example, we can look at what evidence would convince us that an accurate predictor existed, and set a prior that favours a certain graph, conditional on seeing that evidence.
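A minimal sketch of this prior-modification idea (my own formalisation; the graph names and numbers are illustrative assumptions): hold a prior over candidate graphs and update it on observations of an accurate predictor. Each graph assigns a likelihood to the observation "Ω's prediction matched the agent's action".

```python
# Illustrative prior favouring the conventional CDT graph.
prior = {"cdt_graph": 0.9, "acausal_graph": 0.1}
# P(prediction matches action | graph) -- assumed likelihoods.
likelihood_match = {"cdt_graph": 0.5, "acausal_graph": 0.99}

posterior = dict(prior)
for _ in range(10):  # ten observed matches between prediction and action
    unnorm = {g: posterior[g] * likelihood_match[g] for g in posterior}
    z = sum(unnorm.values())
    posterior = {g: p / z for g, p in unnorm.items()}

print(posterior)  # mass shifts rapidly onto the acausal graph
```

Even with a 9:1 prior against it, ten matched predictions push most of the posterior mass onto the acausal graph; a prior tuned to this kind of evidence is what speeds the agent up.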

Or if we want to be closer to UDT, we could formalise statements about algorithms, and about their features and similarities (or formalise mathematical results about proofs, and about how to generalise from known mathematical results). Adding that to the ACDT agent gives an agent much closer to UDT.

So it seems that ACDT + "the correct priors" is close to various different acausal agent designs.

1. Since FDT is still somewhat undefined, I'm viewing it as TDT-like rather than UDT-like for the moment. ↩︎

Discuss

### In defense of deviousness

January 15, 2020 - 18:35
Published on January 15, 2020 11:56 AM UTC

Why hard-to-read code can be good code. A complex criticism of the universal "keep it simple" coding advice

It is easy to find articles and decalogues in which experienced software developers set out what they think are good rules for coding. Usually those recommendations are expressed with strong conviction, reinforced by the similar opinions of most other authors, and because they apparently follow from plausible goals like readability, maintainability or simplicity. Who could question those goals? Probably no one, since they are desirable goals, but… are they objectively measurable?

After years of trying to think like a machine, which has surely affected my perception of complexity, I can write this response to some of those recommendations, not out of an opposing belief, but because it is necessary to reflect on how those recommendations are supposed to lead to those goals.

I would like to begin with an anecdote

Some years ago I was working on an R&D project, trying to solve a really intricate problem. I had been blocked for more than five months when I finally saw the light. One week later I had the corresponding algorithm implemented, and it worked. I felt like a victorious warrior after the hardest battle.

Nearly two years later, I noticed that a particular case was not correctly handled by the algorithm, so I had to review it, nothing special. I read the algorithm's documentation, I read the source code and, to my surprise, I was unable to understand it! How was that possible? I had figured out the solution! I had implemented the algorithm! I had commented the source code! I had written the extended documentation! Then why wasn't I able to understand what my own program was doing?

I read everything again several times, and it took me two hours of thinking about the problem before I understood the algorithm I had implemented to solve it. The experience was so unpleasant that I promised myself it would never happen again. I had to be more careful and exhaustive when documenting a process as complex as that one and when commenting its implementation. So I prepared to rewrite the extended documentation and… amazingly, I realized that the documentation I had initially written correctly explained the problem and its solution, and that the comments in the source code were useful and sufficient.

What had actually happened was very simple. After so much time working on other, different problems, I had almost completely forgotten the details of this one and how I had worked around it. The problem was too devious and the solution too complex to expect immediate comprehension. But although the initial documentation was adequate, I concluded it was necessary to add the following warning for the next time the code needed to be maintained: "WARNING: You will have to reflect on the problem and the proposed solution for more than an hour before working on it, in order to rebuild the mental schemes necessary to understand it."

“Alice in wonderland” versus “The theory of general relativity”

Between the "Hello world" application and the most sophisticated scientific or AI tools, there are many types of software projects, each offering various degrees of complexity. All of them share a pretty obvious characteristic: they are all pieces of software written in one of the available programming languages. This common feature usually leads to the belief that any piece of software can be maintained by any IT professional with enough experience in the corresponding programming language. While this is true for some maintenance tasks, the maintenance job in its full extension requires something else.

For a person other than the author to successfully perform the complete and continuous maintenance of a piece of software, that professional must not only be an expert in the technology used to develop it; they also need a deep understanding of the underlying problem the software solves, which, in turn, reveals the very sense of the code to be maintained. When this problem is complex and new to the person, immersion in the source code will demand effort, patience and, probably, additional experience in a specific knowledge field.

You can read "Alice in Wonderland" at a rate of one page per minute, but you cannot do that when trying to understand "The theory of general relativity". And you cannot ask Einstein to keep it simple, simply because it is not simple. It may take several months to feel comfortable with the existing source code when you join a new project, even when it is exhaustively documented. And if you are expecting to find "Alice in Wonderland" but face "The theory of general relativity", you will blame the original programmer, claiming the code is unreadable, obfuscated and too hard to maintain, and arguing that he didn't follow those universal programming rules that everybody knows generate "good code".

The spaghetti mind

On the other hand, it is also true that there are disastrous programmers who write unnecessarily devious, poorly documented code and who, when criticized, may argue they are misunderstood precisely because they are like Einstein. If you are programming the "Hello world" application and nobody is able to understand it, then you should review your coding methodology. But I'm not speaking about that case; I'm speaking about the case where the deviousness of the code arises from the real and unavoidable complexity of the underlying problem. Is there, then, a universal methodology to "keep it simple"?

Spaghetti code, that dreadful programming concept where too many things are chaotically interconnected, could be more natural than we usually think. Curiously, software is one of the best artificial representations of the processes carried out within our mind. Indeed, our brain's storage system is spaghetti memory (the technical term is hypergraph). Unfortunately, most of these mental processes are intrinsically more complex than we would like, and little can be done to simplify them.

There is nothing wrong with implementing six or more nested loops when working with multidimensional objects, if that is the natural way to do it. It is OK to modify the arguments of a function if you know what you are doing and why it is convenient. There is nothing bad about writing a function or a class method with 500 lines of code if the process demands it, and splitting that code into smaller functions may be of little help. Do not encapsulate a code fragment within a function unless you foresee using that function in several places or in a recursive algorithm. Forcing the creation of a function creates a new object that complicates the code: it breaks the natural flow of the code and hinders the possibility of reusing variables already instantiated. And take into account that a function call is a process in itself that consumes computing resources, so avoid calling functions within a loop if you can.

If you want to increase readability, it is better to separate the code fragments corresponding to specific subprocesses with blank lines and give them a title with an uppercase comment.
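A small sketch of the style just described (the task itself, normalising a vector and accumulating a running sum, is my own illustration, not from the article): blank lines and uppercase comment titles mark the subprocesses instead of forcing each into its own function.

```python
data = [3.0, 1.0, 2.0]

# NORMALISE THE INPUT VECTOR
total = sum(data)
normalised = [x / total for x in data]

# ACCUMULATE THE RUNNING SUM
running = []
acc = 0.0
for x in normalised:
    acc += x
    running.append(acc)

print(running)
```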

Conclusion

“Sometimes we have to find the beauty in the complexity of an efficient solution to a devious problem”

Sometimes an algorithm is extremely complex because the underlying problem it solves is extremely complex, and nothing can be done to make the code easily understandable. Some methodologies and recommendations demand a limit on the size of functions, methods or classes, but you should question whether this can really make the code simpler. Make your code compact, optimize it by removing unnecessary instructions, and document it in detail.

Our reaction to complexity depends greatly on psychology. We are naturally conditioned to perceive harmony in simplicity. But sometimes we have to find the beauty in the complexity of an efficient solution to a devious problem, and in the exhaustive management of its many cases.

Dealing with devilish deviousness is a dirty job but, from time to time, someone has to do it. Keep it simple? Yes, of course… when you can.

Discuss

### What plausible beliefs do you think could likely get you diagnosed with a mental illness by a psychiatrist?

January 15, 2020 - 14:13
Published on January 15, 2020 11:13 AM UTC

Discuss

### Avoiding Rationalization

January 15, 2020 - 13:55
Published on January 15, 2020 10:55 AM UTC

Previously: Red Flags for Rationalization

It is often said that the best way to resist temptation is to avoid it. This includes the temptation to rationalize. You can avoid rationalization by denying it the circumstances in which it would take place.

When you avoid rationalization, you don't need to worry about infinite regress of metacognition, nor about overcompensation. As a handy bonus, you can demonstrate to others that you weren't rationalizing, which otherwise involves a lot of introspection and trust.

Here are three ways to do it:

Double Blinding

Identify the thing that would control your rationalization and arrange not to know it.

The trope namer is experimental science. You might be tempted to give better care to the experimental group than to the control group, but not if you don't know which is which. In many cases, you can maintain the blinding in statistical analysis as well, comparing "group A" to "group B" and only learning which is which after doing the math.

Similarly, if you are evaluating people (e.g. for a job) and are worried about subconscious sexism (or overcompensation for it), write a script or ask a friend to strip all gender indicators from the application.
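Such a stripping script could be sketched as follows (my own crude illustration, not a production blinding tool; real applications leak gender through many more channels than names and pronouns):

```python
import re

# Gendered pronoun -> neutral replacement (an assumed, incomplete mapping).
PRONOUNS = {"he": "they", "she": "they", "him": "them", "her": "them",
            "his": "their", "hers": "theirs"}

def blind(text, applicant_name):
    """Replace the applicant's name and gendered pronouns with neutral tokens."""
    text = text.replace(applicant_name, "[CANDIDATE]")
    def sub(match):
        word = match.group(0)
        repl = PRONOUNS[word.lower()]
        return repl.capitalize() if word[0].isupper() else repl
    pattern = r"\b(" + "|".join(PRONOUNS) + r")\b"
    return re.sub(pattern, sub, text, flags=re.IGNORECASE)

print(blind("Alice is strong; she mentored interns.", "Alice"))
# -> [CANDIDATE] is strong; they mentored interns.
```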

Unfortunately, this technique requires you to anticipate the rationalization risk. Once you notice you might be rationalizing, it's usually too late to double-blind.

End-to-End Testing

A logical argument is a series of mental steps that build on each other. If the argument is direct, every step must be correct. As such, it is somewhat similar to a computer program. You wouldn't write a nontrivial computer program, look it over, say "that looks right" and push it to production. You would test it, with as close as possible to a full, end-to-end test, before trusting it.

What does testing look like for an argument? Take a claim near the end of the argument which can be observed. Go and observe it. Since you have a concrete prediction, it should be a lot easier to make the observation.

Once you've got that, you don't need the long chain of argument that got you there. So it doesn't matter if you were rationalizing along the way. The bit at the end of the argument which you haven't chopped off still needs scrutinizing, but it's shorter, so you can give each bit more attention.

Suppose you're organizing some sort of event, and you want to not bother planning cleanup because you'll just tell all the crowd at the end "ok, everybody clean up now". You expect this to work out because of the known agentiness and competence of the crowd, overlap with HOPE attendees, estimates of the difficulty of cleanup tasks.... There's a lot of thinking that could have gone wrong, and a lot of communal pride pressuring you to rationalize. Instead of examining each question, try asking these people to clean up in some low-stakes situation.

(Do be careful to actually observe. I've seen people use their arguments to make predictions, then update on those predictions as if they were observations.)

This approach is highly effective against non-rationalization errors in analysis as well. It's also especially good at spotting problems involving unknown unknowns.

The Side of Safety

Sometimes, you don't need to know.

Suppose you're about to drive a car, and are estimating whether you gain nontrivial benefit from a seatbelt. You conclude you will not, but note this could be ego-driven rationalization causing you to overestimate your driving talent. You could try to re-evaluate more carefully, or you could observe that the costs of wearing a seatbelt are trivial.

When you're uncertain about the quality of your reasoning, it makes sense to have a probability density function of posteriors for a yes-no question. But when the payoff table is lopsided enough, you might find the vast bulk of the PDF is on one side of the decision threshold. And then you can just decide.
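The seatbelt example can be put in numbers (all of them my own illustrative assumptions): even if you are 95% sure your "no benefit" reasoning is right, the lopsided payoffs put almost all the decision weight on one side.

```python
# Assumed, illustrative numbers -- not from the post.
p_reasoning_wrong = 0.05        # chance the "seatbelt gives no benefit" conclusion is wrong
crash_cost_unbelted = 1_000_000 # cost of an unbelted crash (arbitrary units)
cost_of_wearing = 1             # trivial inconvenience of buckling up

eu_wear = -cost_of_wearing
eu_skip = -p_reasoning_wrong * crash_cost_unbelted

print(eu_wear, eu_skip)  # wearing dominates without re-examining the reasoning
```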

Discuss

### Tips on how to promote effective altruism effectively? Less talk, more action.

January 15, 2020 - 09:33
Published on January 14, 2020 11:17 PM UTC

Hi!

We are two 20-year-old Finnish students from the University of Helsinki who are very concerned about global issues. We have a lot of energy and have met a couple of times to organize our thoughts. We found out about effective altruism, and it seems to be exactly the ideology we want to strive for.

But at our meetings we ran into a problem: what is the most effective way to concentrate this energy, so that we can advance from talking to taking action? Our agenda and goals seem too ambiguous at the moment, so we hope you can shed some light on what on earth (no pun intended) we should do.

As far as we know, effective altruism hasn't really had a breakthrough in changing how people think. The movement in Finland is even more marginal: there are only 30 people in the EA Finland Telegram group, and around 10+ people attend the meetups. As we don't have much experience in popularizing a social movement, we would need help in achieving our goals for this project, or at least in ridding ourselves of the anxious ambiguity of not knowing exactly what concrete things we should do.

Best regards,

culturechange

P.S. Has there been any research on effectively promoting effective altruism?

Discuss

### What are beliefs you wouldn't want (or would feel apprehensive about being) public if you had (or have) them?

January 15, 2020 - 08:30
Published on January 15, 2020 5:30 AM UTC

Discuss

### Artificial Intelligence and Life Sciences (Why is Big Data not enough to capture biological systems?)

January 15, 2020 - 05:38
Published on January 15, 2020 1:59 AM UTC

Introduction

Artificial intelligence, and in particular deep learning, is becoming increasingly relevant to biology research. With these algorithms it is possible to analyze large amounts of data that cannot be analyzed with other techniques, detecting characteristics that might otherwise be impossible to recognize. They make it possible to classify cellular images, make genomic connections, advance drug discovery, and even find links between different types of data, from genomics and imaging to electronic medical records [1].

However, as S. Webb pointed out in a technology feature published in Nature, as with any computational-biology technique, the results that arise from artificial intelligence are only as good as the data that go in. Overfitting a model to its training data is also a concern. In addition, for deep learning, the criteria for data quantity and quality are often more rigorous than some experimental biologists might expect [2]. In this respect, if organisms behave more like observers who make decisions depending on a constantly changing environment, then this strategy is not useful at all, because individual decisions introduce constant irregularities into the collected data that cannot be captured in a simple model or principle [3].

This perspective is not only a challenge for research in biology but also calls for an alternative approach to our understanding of intelligence, since the goal is not just to understand and classify individual complex structures and organisms, but also to understand how these individual intelligences distribute (heterogeneously) across spaces and (eco)systems and generate coherence [4] from an apparently incoherent world. This perspective also implies that research on intelligence, and any derived technologies, cannot be seen as a phenomenon isolated from the rest of nature.

Big data as golden pathway in biology?

We currently live in a golden age of biology, partly due to the idea that biological systems consist of a relationship between the parts (e.g. molecules) and the whole (e.g. organisms or ecosystems). Thanks to advances in technology (for instance in microscopy), mathematics and informatics, it is now possible to measure and model complex systems by interconnecting different components at different scales in space and time [5]. To this end, large amounts of data are gathered to identify patterns [6] that allow the identification of dynamic models.

This (apparently successful) working method is defined as the "microarray" paradigm [7]. The name originates from the microarray technique in biology, which visualizes relevant intracellular biochemical reactions by labeling proteins with phosphorescent markers. An eye-catching and important example of this trend can be seen in the biosciences, where drug effectiveness or disease detection are, in practice, addressed through the study of groups, similarities and other structured characteristics of huge sets of chemical compounds (microarrays) derived from genetic, protein or even tissue samples. The traditional study of biological systems requires reductive methods in which amounts of data are collected by category. Computers are useful for analyzing and modeling these data [8], in part by using machine learning tools such as deep learning [9], to create accurate, real-time models of the response of a system to environmental and internal stimuli, for example in the development of therapies for cancer treatment [10].

However, a significant challenge for the advancement of this systemic approach is that the data needed for modeling are incomplete, contain biases and imbalances, and are continually evolving. Usually the fault is placed on the modeler and the methods and technologies used to design and build the model, who then has no choice but to increase the amount and diversity of the data, perform various model optimizations and rounds of training, and even make use of sloppy parameterization in the models [11].

If we assume that organisms are like robots that operate automatically through a set of well-defined internal mechanisms, then bias in the data can only originate from factors outside the organism.

But even when data problems stem from the measurement methodology, we should not forget that biological systems and organisms are also observers, with complex organized interior states that allow them to observe their environment. As organisms respond to their inner states, they are also obliged to move within their environment [12]. This implies a constant adaptation that continuously undermines the use of historical data to train models. Thus, the flexible responses of organisms to their environment imply that bias can also originate from the organism itself, and that organisms can adapt, or restrict their adaptation, to experimental conditions. Therefore, in biology it also holds that the experimenter can influence the experimental outputs.

This implies that living entities do not react blindly to their environment, but subjectively decide what their response should be. For instance, Denis Noble argues that "life is a symphony that connects organisms and environment across several scales and is much more than a blind mechanism" [13]. This constant sensing across several scales implies that mechanisms and pathways [14] are also adapting and changing in a constant flow [15], and that they are implicitly incomplete, challenging the explicit or implicit accounting of causal explanations based on mechanisms and pathways.

Therefore, cognition is not a characteristic of single organisms alone, but involves the participation of all the organisms subject to any analysis. This kind of naturally distributed cognition poses a challenge to the way we understand the world and process that information, whether to build mechanistic models or to use entropy-reduction methods in machine learning to build black-box models. Once we recognize that "biological systems" are sets of observers who make decisions, we are continually dealing with systems with "ears and eyes" that are also able to model their environment and actions, including the experimental conditions to which they are subject. Because organisms, such as cells, are continually interacting with and deciding about their environment [16], they can also develop intrinsic biases, for example in their responsiveness to growth factors [17] [18].

Consequently, instead of accumulating ever more data to try to approximate a biological system better, it is preferable to better understand how an organism senses and "computes" [19] its environment [20]. To this end, we argue that concepts from cognitive computation will be relevant for better analyzing small amounts of data. As an overview of this topic has noted, cognitively inspired mechanisms should be investigated in order to make algorithms more intelligent, powerful and effective at extracting insightful knowledge from huge amounts of heterogeneous Big Data [21], with topics that span cognitively inspired neural networks to semantic knowledge understanding. Our focus in this work is oriented in part toward such approaches, while recognizing the possibility of naturally distributed cognition in biological systems.

Finally, we want to point out that acknowledging this kind of distributed cognition in biological systems is relevant not only from a philosophical but also from a practical point of view, since it points toward minimizing the amount of data required to train models. For instance, Ghanem et al. employed two metaheuristics, the artificial bee colony algorithm and the dragonfly algorithm, to define a metaheuristic optimizer that trains a multilayer perceptron, reaching a set of weights and biases that yields high performance compared to traditional learning algorithms[22]. Therefore, a clever combination of natural bias with a recognition of heterogeneities in the training data can contribute to the development of efficient modelling methodologies in biology that use few datasets.
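To make the idea concrete, here is a minimal sketch of metaheuristic weight search for a tiny one-neuron "perceptron". The real bee-colony and dragonfly optimizers of Ghanem et al. are far more elaborate; every detail below (the toy dataset, population size, perturbation scale) is invented purely for illustration.

```python
import math
import random

# Toy dataset: the logical OR function, a stand-in for real biological data.
DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def loss(w):
    """Squared error of a sigmoid unit with weights w = (w1, w2, bias)."""
    err = 0.0
    for (x1, x2), y in DATA:
        pred = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + w[2])))
        err += (pred - y) ** 2
    return err

def metaheuristic_search(pop_size=20, steps=200, seed=0):
    """Population-based search: keep the best candidate and explore around
    it (roughly the role of scout and worker bees in the real algorithm)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(steps):
        best = min(pop, key=loss)
        # New candidates are Gaussian perturbations of the current best.
        pop = [best] + [[b + rng.gauss(0, 0.3) for b in best]
                        for _ in range(pop_size - 1)]
    return min(pop, key=loss)

weights = metaheuristic_search()
print(loss(weights))  # small on this toy problem
```

The point of the sketch is only that a derivative-free population search can fit a small model from very few data points, which is the regime the text argues biology often lives in.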

Perspectives

Currently, deep learning algorithms used in biology research require extremely large and well-annotated datasets to learn how to distinguish features and categorize patterns. Large, clearly labeled datasets, with millions of data points representing different experimental and physiological conditions, give researchers the most flexibility to train an algorithm. But for optimal training, algorithms require many well-annotated examples, which can be exceptionally difficult to obtain[23].

This implies that safe use of this technology for modelling biological systems requires looking at the model through the lens of the training data[24]. Our central point is that the safe use of modelling methods in biology and medicine must recognize that organisms are also active and intelligent observers[25], and that modelling technologies can be used in a “safe” way only when organisms or agents behave more or less “mechanistically”.

On the other hand, we argue that using small amounts of data, for instance through a clever use of cognitive bias[26], can be a promising way to better understand and eventually model biological systems in a fairer (and safer) way.

These arguments also imply that, from a biological perspective, no form of intelligence should be investigated as an isolated phenomenon: any form of intelligence (in any organism, including humans) should be considered relative to its environment and its interactions with other organisms, since nature is a loop in which each organism models other organisms as well as the common environment.

Interestingly, this notion is the central topic of the film Jurassic Park: a theme park was able to reproduce dinosaurs and model their behavior in order to control them, but no one in the park was aware that the dinosaurs were adapting and changing their behavior and modeling the humans in turn, so that in the end the park came under the control of the dinosaurs. It is a sci-fi film that seems far from reality, but it helps illustrate several examples in biology, such as the observed ability of octopuses to develop empathy and friendships with humans[27], or horses that model human behavior and even appear to demonstrate high cognitive abilities (e.g. the Clever Hans effect[28]).

[4] Coherence has in this context a physical meaning

[6] Such as space-based forms of organization or fluctuations in time

[11] This notion also applies to other models, such as climate and economics (Freedman, 2011)

[14] In what follows we distinguish mechanisms, biological phenomena that can be understood from fundamental and invariant physicochemical principles, from pathways, sequences of causal steps that string together an upstream cause, a set of causal intermediates, and some downstream outcome.

[18] Intrinsic biases in biology are similar to, but not identical with, the internal biases of agents in psychology and economics, a concept applicable only to human cognition.

[19] In this context, “computing” means sensing rather than executing algorithms to perform a given operation

Discuss

### Clarifying The Malignity of the Universal Prior: The Lexical Update

January 15, 2020 - 03:00
Published on January 15, 2020 12:00 AM UTC

In Paul's classic post What does the universal prior actually look like? he lays out an argument that the universal prior, if it were to be used for important decisions, would likely be malign, giving predictions that would effectively be under the control of alien consequentialists. He argues for this based on an 'anthropic update' the aliens could make that would be difficult to represent in a short program. We can split this update into two parts: an 'importance update' restricting attention to bits fed into priors used to make important decisions, and what I'm calling a 'lexical update' which depends on the particular variant of the universal prior being used. I still believe that the 'importance update' would be very powerful, but I'm not sure anymore about the 'lexical update'. So in this post I'm going to summarize both in my own words then explain my skepticism towards the 'lexical update'.

As background, note that 'straightforwardly' specifying data such as our experiences in the universal prior will take far more bits than just describing the laws of physics, as you'll also need to describe our location in spacetime, an input method, and a set of Everett branches (!), all of which together will probably take more than 10,000 bits (compared to the laws alone, which likely take only a few hundred). Thus, any really short program (a few hundred bits, say) that could somehow predict our experiences well would likely have a greater probability according to the universal prior than the 'straightforward' explanation.
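The size of that gap is easy to make concrete. A program of n bits gets prior weight roughly 2^−n, so the head start of a short rival program is an exact power of two (the bit counts below are just the post's own rough illustrative figures):

```python
# Rough prior-mass comparison using the post's illustrative bit counts.
laws_bits = 300               # laws of physics: a few hundred bits
locating_bits = 10_000        # location, input method, Everett branches, ...
straightforward_bits = laws_bits + locating_bits

short_program_bits = 500      # a rival program of "a few hundred bits"

# The universal prior weights an n-bit program roughly as 2**-n, so the
# short program's advantage is 2**(difference in lengths). Python integers
# are arbitrary precision, so we can compute it exactly.
advantage = 2 ** (straightforward_bits - short_program_bits)
print(advantage == 2 ** 9800)  # an astronomical head start
```

Any short program that predicts our experiences at all thus dominates the straightforward one by thousands of orders of magnitude.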

Paul's post argues that there likely do exist such programs. I'm going to fix a reference prefix machine U which generates a universal prior. The argument goes:

A) there are many long-running programs with short descriptions according to U, such as our universe.

B) If other programs are like our universe's program, aliens could evolve there and end up taking over their programs.

C) Since their program has high measure in U, the aliens will plausibly have been selected to be motivated to control short programs in U.

D) To control U, the aliens could try to manipulate beings using the universal prior who have control over short programs in U (like us, hypothetically).

E) If the aliens are reasonably motivated to manipulate U, we can sample them doing that with few bits.

F) The aliens will now try to output samples from Q, the distribution over people using the universal prior to make important decisions (decisions impacting short programs in U). They can do this much more efficiently than any 'straightforward' method. For instance, when specifying which planet we are on, the aliens can restrict attention to planets which eventually develop life, saving a great many bits.

G) The aliens can then choose a low-bit broadcast channel in their own universe, so the entire manipulative behavior has a very short description in U.

H) For a short program to compete with the aliens, it would essentially need access to Q. But this seems really hard to specify briefly.

So far I agree. But the post also argues that even a short program that could sample from Q would still lose badly to the aliens, based on what I'm calling a 'lexical update', as follows:

I) In practice most people in U using 'the universal prior' won't use U itself but one of many variants U' (different universal programming languages)

J) Each of those variants U' will have their own Q', the distribution over people making important decisions with U'. Q is then defined as the average over all of those variants (with different U' weighted by simplicity in U)

K) Since short programs in different U' look different from each other, the aliens in those programs will be able to tell which variant U' they are likely to be in.

L) The distributions Q' of people in U using different variants U' all look different. Describing each Q' given Q should take about as many bits as it takes to specify U' using U.

M) But the aliens will already know they are in U', and so can skip that, gaining a large advantage even over Q.

But there's a problem here. In C) it was argued that aliens in short programs in U will be motivated to take over other short programs in U. When we condition on the aliens actually living somewhere short according to U', they are instead motivated to control short programs in U'. This would reduce their motivation to control short programs in U proportionally to the difficulty of describing U in U', and with less motivation, it takes more bits to sample their manipulative behaviors in E). The advantage they gained in L) over Q was proportional to the difficulty of describing U' in U. On average these effects should cancel out, and the aliens' probability mass will be comparable to Q.
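The claimed cancellation can be written as a toy bit-accounting (every figure below is invented for illustration; the argument only needs the two description costs to be comparable on average):

```python
# Toy accounting of the 'lexical update' counterargument.
bits_Uprime_in_U = 40   # cost for Q to single out the variant U' (step L):
                        # this is the aliens' claimed advantage over Q.
bits_U_in_Uprime = 40   # motivation penalty: aliens selected by U' care
                        # about U' programs, so sampling manipulation aimed
                        # at U costs extra bits (weakened step E).

alien_advantage = bits_Uprime_in_U   # bits the aliens save relative to Q
alien_penalty = bits_U_in_Uprime     # bits they pay in diluted motivation

net_advantage = alien_advantage - alien_penalty
print(net_advantage)  # zero when the two description costs match
```

When describing U' in U and describing U in U' cost about the same number of bits, the net lexical advantage washes out, which is the post's claim that the aliens end up merely comparable to Q rather than decisively ahead.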

The universal prior is still likely malign, as it's probably hard to briefly specify Q, but it no longer seems to me like the aliens would decisively beat Q. I still feel pretty confused about all this so comments pointing out any mistakes or misinterpretations would be appreciated.

Discuss

### A rant against robots

January 15, 2020 - 01:03
Published on January 14, 2020 10:03 PM UTC

What comes to your mind when you hear the word "artificial intelligence" (or "artificial general intelligence")? And if you want to prepare the future, what should come to your mind?

It seems that when most people hear "AI", they think of robots. Oddly, this applies both to laymen and to some top academics. Stuart Russell's book (which I greatly enjoyed) is one example: it often presents robots as its examples of AI.

But this seems problematic to me. I believe that we should dissociate AIs from robots a lot more. In fact, given that most people will nevertheless think of robots when we discuss AIs, we might even want to use the terminology "algorithms" rather than "AIs". And perhaps "algorithms with superhuman-level world models and planning capabilities" instead of "AGIs"...

To defend this claim, I shall argue that the most critical aspects of today's and tomorrow's world-scale ethical problems (including x-risks) have, and will continue to have, to do with algorithms, not robots. Moreover, and most importantly, the example of robots raises both concerns and solutions that are in fact irrelevant to algorithms. Finally, I'll conclude by arguing that the example of today's large-scale algorithms is actually useful, because it motivates AI alignment.

It's about algorithms, not robots!

AIs that matter are algorithms

Today's AI research is mostly driven by non-robotic applications, from natural language processing to image analysis, from protein folding to query answering, from autocompletion to video recommendation. This is where the money is. Google is investing (hundreds of?) millions of dollars in improving its search engine and YouTube's recommendation system. Not in building robots.

Today's ranking, content moderation and automated labeling algorithms are arguably a lot more influential than robots. YouTube's algorithms have arguably become the biggest opinion-maker worldwide. They present risks and opportunities on a massive scale.

And it seems that there is a significant probability that tomorrow's most influential algorithms will be somewhat similar, even if they achieve artificial general intelligence. Such algorithms will likely be dematerialized, in the cloud, with numerous copies of themselves stored in several data centres and terminals throughout the world.

And they will be extremely powerful. Not because they have some strong physical power. But because they control the flow of information.

The power of information

At the heart of the distinction between algorithms and robots is the distinction between information and matter. Physics long turned our attention towards matter and energy. Biology studied animals, plants and key molecules. Historians focused on monuments, artefacts and the industrial revolution. But as these fields grew, they all seem to have been paying more and more attention to information. Physics studied entropy. Biology analyzed gene expression. History celebrated the invention of language, writing, printing, and now computing.

Arguably, this is becoming the case for all of our society. Information has become critical to every government, every industry and every charity. Essentially all of today's jobs are information processing jobs. They are about collecting information, storing information, processing information and emitting information. This very blog post was written by collecting information, which was then processed and is now being emitted.

By collecting and analyzing information, you can have a much better idea of what is wrong and what to do. And crucially, by emitting the right information to the right entities, you can start a movement, manipulate individuals and start a revolution. Information is what changes the world.

Better algorithms are information game changers

We humans used to be the leading information processing units on earth. Our human brains were able to collect, store, process and emit information in a way that nothing else on earth could.

But now, there are algorithms. They can collect, store, process and emit far more information than any group of humans ever could. They can now figure out what is wrong and what to do, sometimes far better than we humans can, by learning from information that we humans could not collect, store, process and emit. Algorithms can start movements, manipulate individuals and start revolutions on a global scale. They have become the most powerful entities on earth.

In fact, because such powerful algorithms are deployed by the most powerful companies which also have huge incentives to make their algorithms more capable, it seems much more likely to me that the first algorithm with superhuman-level world model and planning capabilities will be much more similar to YouTube's recommendation algorithm than to a robot. Recall that such an algorithm has access to a truly massive amount of data from all over the world. And that data is clearly critical to algorithmic capabilities.

As another example, an algorithm able to send messages through the internet to get a 3D printer to print killer drones seems a lot more dangerous than any of the killer drones it creates...

This is why I believe that the biggest challenges of AI safety and ethics have likely little to do with robots. These challenges rather seem to concern information and information processing. They are about algorithms.

Not robots. Algorithms.

The case of robots is very misleading

It would be fine to still focus on robots if they were similar enough to algorithms. In the end, I don't really care why you would want to solve AI or AGI safety; it just matters (to me) that you do want to solve AI or AGI safety.

Unfortunately though, having a robot in mind as the example of AI or AGI also seems extremely misleading. In fact, many AGI safety debates could probably be shortcut by focusing on algorithms rather than robots.

Distributed algorithms are really hard to interrupt

Let's take the case of safe interruptibility. Many AI safety critics would say that this isn't a problem, because you can just unplug the AI. Well, admittedly, if a robot is not skillful enough to prevent you from unplugging it, and if you have access to its plug, yes, sure, you could probably unplug it.

But now try to unplug an algorithm. Especially a distributed algorithm like the YouTube recommendation system or the Google search engine! Even if you were the CEO of Google, I'm skeptical you would be able to interrupt these algorithms.

Worse still: try to unplug Bitcoin. Essentially, you would have to unplug all of the Internet... Good luck with that! Bitcoin was designed to be uninterruptible by any small group of users. This is the whole point of designing distributed algorithms! They are designed to be so-called Byzantine fault tolerant.

It seems more than reasonable to assume that any algorithm with superhuman-level world knowledge and planning capabilities will make sure it is Byzantine-fault tolerant too.

Algorithms work on very different space and time scales

Another misleading feature of robots is that we usually expect them to interact with us at our space and time scales. Intuitively, whatever a robot says can be analyzed. And if what it says is suspicious, we would have time to correct it before it causes harm.

The case of large-scale algorithms like the YouTube recommendation system is very different. YouTube "speaks" at the rate of millions of recommendations per minute. It "reads" at the rate of 500 hours of videos per minute, and millions of new human behaviours per minute. And YouTube does so on a global scale.

In particular, this means that no human could ever check even a small fraction of what this algorithm does. The mere oversight of large-scale algorithms is way beyond human capability. We need algorithms for algorithmic surveillance.

Finally, and perhaps most importantly, robots just aren't here. Even self-driving cars have yet to be commercialized. In this context, it's hard to get people to care about AGI risks, or about alignment. The example of robots is not something familiar to them. It's even associated with science fiction and other dubious futuristic stories.

Conversely, large-scale hugely influential and sophisticated algorithms are already here. And they're already changing the world, with massive unpredictable uncontrollable side effects. In fact, it is such side effects of algorithm deployments that are existential risks, especially if algorithms gain superhuman-level world model and planning capabilities.

Interestingly, today's algorithms also already pose huge ethical problems that absolutely need to be solved. Whenever a user searches "vaccine", "Trump" or "AGI risks" on YouTube, there's an ethical dilemma over which video should be recommended first. Sure, it's not a life-or-death decision (though "vaccine" could be). But this occurs billions of times per day! And it might make a young scholar mock AGI risks rather than be concerned about them.

Perhaps most interestingly to me, alignment (that is, making sure the algorithm's goal is aligned with ours) already seems critical to make today's algorithms robustly beneficial. This means that by focusing on the example of today's algorithms, it may be possible to convince AI safety skeptics to do research that is nevertheless useful to AGI safety. As an added bonus, we wouldn't need to sacrifice any respectability.

This is definitely something I'd sign for!

Conclusion

In this post, I briefly shared my frustration at seeing people discuss AIs and robots, often in the same sentence, without a clear distinction between the two. I think this attitude is highly counterproductive to the advocacy of AI risks and to research in AI safety. I believe that we should insist a lot more on the importance of information and of information processing through algorithms. This seems to me a more effective way to promote quality discussion and research on algorithmic alignment.

Discuss

### What is the relationship between Preference Learning and Value Learning?

January 14, 2020 - 23:15
Published on January 13, 2020 9:08 PM UTC

It appears that in the last few years the AI Alignment community has dedicated great attention to the Value Learning Problem [1]. In particular, the work of Stuart Armstrong stands out to me.

Concurrently, during the last decade, researchers such as Eyke Hüllermeier and Johannes Fürnkranz produced a significant amount of work on the topics of preference learning [2] and preference-based reinforcement learning [3].

While I am not highly familiar with the Value Learning literature, I consider the two fields closely related, if not overlapping, yet I have not often seen references to the Preference Learning work in it, and vice versa.

Is this because the two fields are less related than I think? And more specifically, how do the two fields relate to each other?

References

[1] - Soares, Nate. "The value learning problem." Machine Intelligence Research Institute, Berkeley (2015).

[2] - Fürnkranz, Johannes, and Eyke Hüllermeier. Preference learning. Springer US, 2010.

[3] - Fürnkranz, Johannes, et al. "Preference-based reinforcement learning: a formal framework and a policy iteration algorithm." Machine learning 89.1-2 (2012): 123-156.

Discuss

### Is backwards causation absurd?

January 14, 2020 - 22:25
Published on January 14, 2020 7:25 PM UTC

In Newcomb's problem an agent picks either one-box or two-box and finds that, no matter which option they picked, a predictor predicted it in advance. I've gone to a lot of effort to explain how this can be without requiring backwards causation (The Prediction Problem, Deconfusing Logical Counterfactuals), yet now I find myself wondering if backwards causation is such a bad explanation after all.

Unfortunately I'm not a physicist, so take what I say with a grain of salt, but there seems to be a reasonable argument that either time or its direction is an illusion. One prominent theory of time is Eternalism, in which there is no objective flow of time and terms like "past", "present" and "future" can only be used in a relative sense. An argument in its favour is that it is often very convenient in physics to model space-time as a 4-dimensional space. If time is just another dimension, why should the future be treated differently from the past? Nothing in this model differentiates the two. If we have two blocks X and Y next to each other, we can view either X as the left one or Y as the left one depending on the direction we look from. Similarly, if A causes B in the traditional forwards sense, why can't we symmetrically view B as backwards-causing A? Viewed the other way around, A to B would be the backwards causation and B to A the forwards causation.

Another relativistic argument against time flowing is that simultaneity is only defined relative to a reference frame. Therefore, there is no unified present which is supposed to be what is flowing.

Thirdly, entropy has often been called the arrow of time, with the other physical laws claimed to be reversible. We are in a low-entropy world, so entropy increases. However, if we were in a high-entropy world, it would decrease, so time and causation would seem (from our perspective) to be going backwards. This would suggest that backwards causation is just as valid a phenomenon as forwards causation.

I want to remind readers again that I am not a physicist. This post is intended to spark discussion more than anything else.

(Another possibility I haven't discussed is that causation might be in the map rather than the territory)

Discuss

### CDT going bonkers... forever

January 14, 2020 - 19:19
Published on January 14, 2020 4:19 PM UTC

I've been wanting to get a better example of CDT misbehaving, where the behaviour is more clearly suboptimal than it is in the Newcomb problem (which many people don't seem to accept as CDT being suboptimal).

So consider this simple example: the player is playing against Omega, who will predict their actions[1]. The player can take three actions: "zero", "one", or "leave".

If ever they do "leave", then the experiment is over and they leave. If they choose "zero" or "one", then Omega will predict their action, and compare this to their actual action. If the two match, then the player loses 1 utility; if the action and the prediction differs, then the player gains 3 utility and the experiment ends.

Assume that Omega actually is a perfect or quasi-perfect predictor, with a good model of the player. An FDT or EDT agent would realise after a few tries that it couldn't trick Omega, and would quickly end the game.

But the CDT player is incapable of reaching this conclusion. Whatever distribution it computes over Omega's prediction, it will always estimate that it has at least a 50% chance of choosing the other option[2], for an expected utility gain of at least 1. And so it will continue playing, and continue losing... forever.

1. Omega need not make this prediction before the player takes their action, nor even without seeing the action, but it makes the prediction independently of that knowledge. And that's enough for CDT. ↩︎

2. For example, suppose the CDT agent estimates the prediction will be "zero" with probability p, and "one" with probability 1−p. Then if p≥1/2, they can say "one", and have a probability p≥1/2 of winning, in their own view. If p<1/2, they can say "zero", and have a probability 1−p>1/2 of winning. ↩︎
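The endless losing loop above is easy to simulate. A minimal sketch, in which "Omega" is simply assumed to run the same decision procedure as the agent (which is all quasi-perfect prediction requires here):

```python
def cdt_choice(p_zero):
    # CDT treats the prediction as a fixed coin with P("zero") = p_zero,
    # and names whichever option it thinks is less likely to be predicted.
    return "one" if p_zero >= 0.5 else "zero"

def play(rounds, p_zero=0.5):
    """Quasi-perfect Omega: it simulates the agent's own decision
    procedure, so its prediction always matches the actual action."""
    utility = 0
    for _ in range(rounds):
        action = cdt_choice(p_zero)
        prediction = cdt_choice(p_zero)  # Omega runs the same computation
        utility += 3 if action != prediction else -1
        # CDT's expected gain each round: 0.5*3 + 0.5*(-1) = 1 > 0,
        # so by its own lights it should keep playing... forever.
    return utility

print(play(100))  # the agent loses one utility every round
```

The prediction matches the action on every round, so the agent's realized utility is −1 per round even though its causal expected value stays positive, which is exactly the divergence the post describes.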

Discuss

### Austin LW/SSC Far-comers Meetup: Feb. 8, 1:30pm

January 14, 2020 - 17:46
Published on January 14, 2020 2:46 PM UTC

• When: Saturday, February 8, 2020, 1:30pm.
• Where: Central Market North Lamar, Austin, Texas (4001 N Lamar Blvd, Austin, TX 78756). We'll be in the in-store cafe, either inside or outside depending on weather and seating availability.
• What: This is the designated "far-comers meetup" where we encourage the attendance of everyone who lives too far away from Austin to attend regularly. We currently have no topic or activity planned (other than general discussion), but if we think of something I'll update this post accordingly.

Discuss

### Ascetic aesthetic

January 14, 2020 - 15:58
Published on January 14, 2020 12:58 PM UTC

I have a theory that ethics comes from aesthetics. Values come from your view of what is pretty and what is not. Say you value the strong protecting the weak. I don't believe that people thought about this, did a game-theoretical calculation of outcomes, and concluded that "the strong protecting the weak" is the best strategy for society. Instead, the strong protecting the weak simply seems right, just like a beautiful view of mountains and woods looks good, even if you can think of a thousand reasons why living in such an environment is good for your health. We list good-sounding reasons for our values, but in fact they are derived from our sense of beauty. The strong protecting the weak seems right and looks good. It appeals to the same part of our minds as music we like, or beautiful views of nature.

Trying to rationally calculate your actions is good, because "rationality" here means that you actually get to your goals (rational = the way that makes the most sense). But I find that a certain kind of naive view of rationality leads some to ignore their sense of aesthetics. I don't mind people deciding to do the "rational" thing despite their aesthetics, but I think they should at least be aware of their aesthetics before discarding them.

My own aesthetic roughly revolves around asceticism, so I have had the good fortune to call it "ascetic aesthetic". Considering the things I value, most of them check the box for minimalism, independence, resilience, or, more broadly, asceticism. From the type of clothes I like to wear to the type of career I've considered, it always reflects the same… style. It seems silly to compare my plain black shirt and stretchy black jeans with the type of person that I am, but the more I think about it, the more it makes sense. Not to say that my personality is plain, black or stretchy (?) but it values the same underlying attributes as these clothes possess: simple, appropriate for all occasions (= always ready), flexible and so on.

I don't like authority and that is the main reason why I haven't joined the army. But going through training and hardship - that has always been very attractive. Why? Primarily because being calm and ascetic is a key job requirement, and that's the part that appeals to me.

I came to the thought that aesthetics = ethics when I recently talked to a friend. I told him that I stopped regularly drinking coffee because I didn't want to depend on it - I felt ashamed when I got headaches after not having coffee, and thought to myself: "Really man? You've sunk that low that you're experiencing withdrawal, like a junkie?" He was perplexed that I seemed disgusted by the idea of being addicted to something - in his view, being addicted to coffee was not much different than having to eat. He didn't mind his own coffee addiction - coffee was not harmful and he enjoyed having it a couple of times every day. And that's when I realized that we were looking at the same "painting" but with different aesthetics, and the painting was actually values. He's not wrong - being addicted to coffee is not that different from having to eat. But eating is kinda indulgent as well, you know. My aesthetics would prefer fasting.

There's preference ordering in systems of aesthetics, and if you're a capitalist, probably untapped markets for under-served aesthetics. For example, one preference ordering in my aesthetics would be: drinking water is better than alcoholic beverages (because the water is somehow… purer? I don't know), but if drinking alcohol, then drinking dry gin is better than sweet cocktails. And I don't think that there is a consistent framework under which this works, it's just a loose notion of indulgence = bad, spread over values, clothes, political opinions, advice given, cars driven, books read and so on.

In practical terms, it's good to get acquainted with your aesthetics. Whether you decide to go with or against them is your decision, but it's good, I think, to first have an understanding of what you find intuitively pleasing, before jumping to a "rational" calculation.

The important question though is where do aesthetics come from? And is there even a generalized aesthetic that manifests itself, or am I trying to tie together completely unrelated phenomena? I don't know yet, and don't know how I'd test it. But, fortunately, my ascetic aesthetic values the search for understanding, so at least I'm on the right path.

Discuss

### Red Flags for Rationalization

January 14, 2020 - 10:34
Published on January 14, 2020 7:34 AM UTC

Previously: Why a New Rationalization Sequence?

What are Red Flags?

A red flag is a warning sign that you may have rationalized: something that is practical to observe and more likely in the rationalization case than in the non-rationalization case.

Some are things which are likely to cause rationalization. Others are likely to be caused by it. One on this list is even based on common cause. (I don't have any based on selection on a common effect, but in theory there could be.)

How to Use Red Flags

Seeing a red flag doesn't necessarily mean that you have rationalized, but it's evidence. Likewise, just because you've rationalized doesn't mean your conclusion is wrong, only that it's not as supported as you thought.

So when one of these flags raises, don't give up on ever discovering truth; don't stop-halt-catch-fire; definitely don't invert your conclusion.

Just slow down. Take the hypothesis that you're rationalizing seriously and look for ways to test it. The rest of this sequence will offer tools for the purpose, but just paying attention is half the battle.

A lot of these things can be present to a greater or lesser degree, so you'll want to set thresholds. I'd guess an optimal setting has about 1/3 of triggers being true positives. High enough that you keep doing your checks seriously, but low because the payoff matrix is quite asymmetrical.

Basically use these as trigger-action planning. Trigger: anything on this list. Action: spend five seconds doing your agenty best to worry about rationalization.

Conflict of Interest

This is a classic reason to distrust someone else's reasoning. If they have something to gain from you believing a conclusion separate from that conclusion being true, you have reason to be suspicious. But what does it mean for you to gain from you believing something apart from it being true?

Not Such Great Liars

Probably the simplest reason is that you need to deceive someone else. If you're not a practiced liar, the easiest way to do this is to deceive yourself.

Simple example: you're running late and need to give an estimate of when you'll arrive. If you say "ten minutes late" and arrive twenty minutes late, it looks like you hit another ten minutes' worth of bad luck, whereas saying "twenty minutes" looks like your fault. You're not good at straight-up lying, but if you can convince yourself you'll only be ten minutes late, all is well.

Unendorsed Values

Values aren't simple, and you aren't always in agreement with yourself. Let's illustrate this with examples:

Perhaps you believe that health and long life are more important than fleeting pleasures like ice cream, but there's a part of you that has a short time preference and knows ice cream is delicious. That part would love to convince the rest of you of a theory of nutrition that holds ice cream as healthy.

Perhaps you believe that you should follow scientific results wherever the evidence leads you, but it seems to be leading someplace that a professor at Duke predicted a few months ago, and there's a part of you that hates Duke. If that part can convince the rest of you that the data is wrong, you won't have to admit that somebody at Duke was right.

Wishful Thinking

A classic cause of rationalization. Expecting good things feels better than expecting bad things, so you'll want to believe it will all come out all right.

Catastrophizing Thinking

The opposite of wishful thinking. I'm not sure what the psychological root is, but it seems common in our community.

Conflict of Ego

The conclusion is: therefore I am a good person. The virtues I am strong at are the most important, and those I am weak at are the least. The work I do is vital to upholding civilization. The actions I took were justified. See Foster & Misra (2013) on Cognitive Dissonance and Affect.

Variant: therefore we are good people. Where "we" can be any group membership the thinker feels strongly about. Note that the individual need not have been involved in the virtue, work or action to feel pressure to rationalize it.

This is particularly insidious when "we" is defined partly by a large set of beliefs, such as the Social Justice Community or the Libertarian Party. Then it is tempting to rationalize that every position "we" have ever taken was correct.

In my experience, the communal variant is more common than the individual one, but that may be an artifact of my social circles.

Reluctance to Test

If you have an opportunity to gain more evidence on the question and feel reluctant, this is a bad sign. This one is illustrated by Harry and Draco discussing Hermione in HPMOR.

Suspicious Timing

Did you stop looking for alternatives as soon as you found this one?

Similarly, did you spend a lot longer looking for evidence on one side than the other?

Failure to Update

This was basically covered in Update Yourself Incrementally and One Argument Against An Army. The pattern of failing to update even when the weight of evidence points the other way is a recognizable one.

The Feeling of Doing It

For some people, rationalization has a distinct subjective experience that you can train yourself to recognize. Eliezer writes about it in Singlethink and later refers to it as "don't even start to rationalize".

Agreeing with Idiots

True, reversed stupidity is not intelligence. Nevertheless, if you find yourself arriving at the same conclusion as a large group of idiots, this is a suspicious observation that calls for an explanation. Possibilities include:

• It's a coincidence: they got lucky. This can happen, but the more complex the conclusion, the less likely.
• They're not all that idiotic. People with terrible overall epistemics can still have solid understanding within their comfort zones.
• It's not really the same conclusion; it just sounds the same when both are summarized poorly.
• You and they rationalized the conclusion following the same interest.

Naturally, it is this last possibility that concerns us. The less likely the first three, the more worrying the last one.

Disagreeing with Experts

If someone who is clearly established as an expert in the field (possibly by having notable achievements in it) disagrees with you, this is a bad sign. It's more a warning sign of bad logic in general than of rationalization in particular, but rationalization is a common cause of bad logic, and many of the same checks apply.

Discuss

January 14, 2020 - 05:25
Published on January 14, 2020 2:25 AM UTC

Please bring either (or both!) a laptop or a smartphone for this event!

This event is hosted by Arthur Milchior and Rowan Carson.

Anki is software that helps you learn things and remember them in the long term. Anki can be used to learn a lot of different topics; I have successfully used it to learn music, programming languages, mathematics, names of people, and song lyrics. Ideally, though, each topic requires using Anki in a different way.

Depending on the interest of the audience, this event will either be a discussion about how we use Anki efficiently, or I will demonstrate the tricks I created and use daily to get the best out of Anki. Most of my tricks are documented on http://www.milchior.fr/blog_en/index.php/category/Anki

Please let me know (at arthur@milchior.fr, in the comments, or on Facebook) if you have any questions or topics you want us to consider during the event.

Hosts' bio

I've used Anki since 2016 and use it daily. I have contributed to the Anki and AnkiDroid (for Android) code. I also created add-ons for Anki which have been downloaded 40 thousand times.

Discuss

### Anki (Memorization Software) for Beginners

January 14, 2020 - 04:55
Published on January 14, 2020 1:55 AM UTC

Please bring either (or both!) a laptop or a smartphone for this event!

Anki is spaced-repetition software that helps people remember things in the long term. For example, Anki is often used to learn foreign languages and medicine, both of which require a lot of memorization. Getting started with Anki can be a little tricky, and the goal of this event is to help you explore and try the software to see whether you like it.

For the first part, I'll help guide the audience through setting up Anki.

Then, I'll quickly show personal examples explaining how Anki changed my life in many ways, to illustrate what can be done with it. For example:

• I now remember the names of people I meet,
• I can finally play some music without sheet music, and
• when I pick a topic back up after a pause of a few months (mathematics, CS, music theory, ...), I don't forget everything I learned before!

Afterwards, I will be available to answer any beginner questions (if there is a question you already know you have, please send it to arthur@milchior.fr, ask it in a comment below, or ask on Facebook; I'll check all of those).

If you are curious and can't attend, all examples are on http://www.milchior.fr/blog_en/index.php/category/Anki

Hosts' bio

I've used Anki since 2016 and use it daily. I have contributed to the Anki and AnkiDroid (for Android) code. I also created add-ons for Anki which have been downloaded 40 thousand times.

Discuss

### Repossessing Degrees

January 14, 2020 - 00:40
Published on January 13, 2020 9:40 PM UTC

Several months ago I argued that we should allow student loans to be discharged through bankruptcy. Yesterday I realized another way to modify student loans to be more compatible with bankruptcy: allow lenders to effectively repossess degrees. This could make letting student loans be discharged in bankruptcy more politically practical.

Sometimes people bring up the idea of repossessing degrees as a way of illustrating the absurdity of student loans, but it's actually pretty reasonable. A large majority of the benefit of getting a degree comes from having the credential as opposed to having learned skills: employers use "do they have a college degree" as a filter. If you took out a student loan to get a degree, and then later declared bankruptcy, the court could require you to no longer represent yourself as having the degree as a condition of having your debt discharged.

I think the goal is that someone with a repossessed degree should look the same as someone who completed part of the degree but dropped out before finishing. Colleges would need to be part of this as well; one way for this to work is if as a condition of continued participation in the student loan system colleges would need to agree to revoke degrees if asked to do so by the bankruptcy court.

One way this could work poorly would be if people just ignored the law and told employers in person that they have a revoked degree. I think they mostly wouldn't: degrees are often used at an early stage of hiring screening, where if you don't have a degree you don't even get to the stage of being able to talk to an interviewer and explain the situation. Additionally, someone who decides to claim they went to Harvard after having their degree repossessed looks just like someone who's lying about having gone to college, or who dropped out.

This would probably need to be combined with some sort of waiting period, something like five years, to avoid the case where someone declares bankruptcy before finishing their degree when there's nothing to repossess, and then completes their degree later.

Implementing this in a way compatible with the First Amendment seems tricky but doable. In a standard non-disclosure agreement you trade your right to share some information for something else you value more. That's what we're talking about here, so it seems like we're ok? But I'm just speculating.

This does change some of the incentives around education and hiring: students might try to learn valuable skills, and employers might try to evaluate people based on what they know and can do. But this would be a great outcome!

(Note that this is not compatible with my proposal that we prohibit employers from considering degrees entirely, and is a much less radical alternative.)

Discuss

### What do we mean by “moral uncertainty”?

January 13, 2020 - 15:13
Published on January 13, 2020 12:13 PM UTC

This post follows on from my prior post; consider reading that post first.

In my prior post, I discussed overlaps with and distinctions between moral uncertainty and related concepts. In this post, I continue my attempt to clarify what moral uncertainty actually is (rather than how to make decisions when morally uncertain, which is covered later in the sequence). Specifically, here I’ll discuss:

1. Is what we “ought to do” under moral uncertainty an objective or subjective (i.e., belief-relative) matter?
2. Is what we “ought to do” under moral uncertainty a matter of rationality or morality?

An important aim will be simply clarifying the questions and terms themselves. That said, to foreshadow, the tentative “answers” I’ll arrive at are:

1. It seems both more intuitive and more action-guiding to say that the “ought” is subjective.
2. Whether the “ought” is a rational or a moral one may be a “merely verbal” dispute with no practical significance. But I’m very confident that interpreting the “ought” as a matter of rationality works in any case (i.e., whether or not interpreting it as a matter of morality does, and whether or not the distinction really matters).

This post doesn’t explicitly address what types of moral uncertainty would be meaningful for moral antirealists and/or subjectivists, or explore why a person (or agent) might perceive themselves to be morally uncertain (as opposed to what moral uncertainty “really is”). Those matters will be the subject of a later post.[1]

Epistemic status: The concepts covered here are broad, fuzzy, and overlap in various ways, making definitions and distinctions between them almost inevitably debatable. Additionally, I’m not an expert in these topics (though I have now spent a couple weeks mostly reading about them). I’ve tried to mostly collect, summarise, and synthesise existing ideas (from academic philosophy and the LessWrong and EA communities). I’d appreciate feedback or comments in relation to any mistakes, unclear phrasings, etc. (and just in general!).

Objective or subjective?

(Note: What I discuss here is not the same as the objectivism vs subjectivism debate in metaethics.)

As I noted in a prior post:

Subjective normativity relates to what one should do based on what one believes, whereas objective normativity relates to what one “actually” should do (i.e., based on the true state of affairs).

Hilary Greaves & Owen Cotton-Barratt give an example of this distinction in the context of empirical uncertainty:

Suppose Alice packs the waterproofs but, as the day turns out, it does not rain. Does it follow that Alice made the wrong decision? In one (objective) sense of “wrong”, yes: thanks to that decision, she experienced the mild but unnecessary inconvenience of carrying bulky raingear around all day. But in a second (more subjective) sense, clearly it need not follow that the decision was wrong: if the probability of rain was sufficiently high and Alice sufficiently dislikes getting wet, her decision could easily be the appropriate one to make given her state of ignorance about how the weather would in fact turn out. Normative theories of decision-making under uncertainty aim to capture this second, more subjective, type of evaluation; the standard such account is expected utility theory.

Greaves & Cotton-Barratt then make the analogous distinction for moral uncertainty:

How should one choose, when facing relevant moral uncertainty? In one (objective) sense, of course, what one should do is simply what the true moral hypothesis says one should do. But it seems there is also a second sense of “should”, analogous to the subjective “should” for empirical uncertainty, capturing the sense in which it is appropriate for the agent facing moral uncertainty to be guided by her moral credences [i.e., beliefs], whatever the moral facts may be. (emphasis added)

(This objective vs subjective distinction seems to me somewhat similar - though not identical - to the distinction between ex post and ex ante thinking. We might say that Alice made the right decision ex ante - i.e., based on what she knew when she made her decision - even if it turned out - ex post - that the other decision would’ve worked out better.)

MacAskill notes that, in both the empirical and moral contexts, “The principal argument for thinking that there must be a subjective sense of ‘ought’ is because the objective sense of ‘ought’ is not sufficiently action-guiding.” He illustrates this in the case of moral uncertainty with the following example:

Susan is a doctor, who faces three sick individuals, Greg, Harold and Harry. Greg is a human patient, whereas Harold and Harry are chimpanzees. They all suffer from the same condition. She has a vial of a drug, D. If she administers all of drug D to Greg, he will be completely cured, and if she administers all of drug D to the chimpanzees, they will both be completely cured (health 100%). If she splits the drug between the three, then Greg will be almost completely cured (health 99%), and Harold and Harry will be partially cured (health 50%). She is unsure about the value of the welfare of non-human animals: she thinks it is equally likely that chimpanzees’ welfare has no moral value and that chimpanzees’ welfare has the same moral value as human welfare. And, let us suppose, there is no way that she can improve her epistemic state with respect to the relative value of humans and chimpanzees.

[...]

Her three options are as follows:

A: Give all of the drug to Greg

B: Split the drug

C: Give all of the drug to Harold and Harry

Her decision can be represented in the following table, using numbers to represent how good each outcome would be:

|  | Chimpanzee welfare has no moral value (50%) | Chimpanzee welfare has full moral value (50%) |
| --- | --- | --- |
| A: Give all of the drug to Greg | 100 | 100 |
| B: Split the drug | 99 | 199 |
| C: Give all of the drug to Harold and Harry | 0 | 200 |

Finally, suppose that, according to the true moral theory, chimpanzee welfare is of the same moral value as human welfare and that therefore, she should give all of the drug to Harold and Harry. What should she do?

Clearly, the best outcome would occur if Susan does C. But she doesn’t know that that would cause the best outcome, because she doesn’t know what the “true moral theory” is. She thus has no way to act on the advice “Just do what is objectively morally right.” Meanwhile, as MacAskill notes, “it seems it would be morally reckless for Susan not to choose option B: given what she knows, she would be risking severe wrongdoing by choosing either option A or option C” (emphasis added).

To capture the intuition that Susan should choose option B, and to provide actually followable guidance for action, we need to accept that there is a subjective sense of “should” (or of “ought”) - a sense of “should” that depends in part on what one believes. (This could also be called a “belief-relative” or “credence-relative” sense of “should”.)[2]

An additional argument in favour of accepting that there’s a subjective “should” in relation to moral uncertainty is consistency with how we treat empirical uncertainty, where most people accept that there’s a subjective “should”.[3] This argument is made regularly, including by MacAskill and by Greaves & Cotton-Barratt, and it seems particularly compelling when one considers that it’s often difficult to draw clear lines between empirical and moral uncertainty (see my prior post). That is, if it’s often hard to say whether an uncertainty is empirical or moral, it seems strange to say we should accept a subjective “should” under empirical uncertainty but not under moral uncertainty.

Ultimately, most of what I’ve read on moral uncertainty is premised on there being a subjective sense of “should”, and much of this sequence will rest on that premise also.[4] As far as I can tell, this seems necessary if we are to come up with any meaningful, action-guiding approaches for decision-making under moral uncertainty (“metanormative theories”).

But I should note that some writers do appear to argue that there’s only an objective sense of “should” (one example, I think, is Weatherson, though he uses different language and I’ve only skimmed his paper). Furthermore, while I can’t see how this could lead to action-guiding principles for making decisions under uncertainty, it does seem to me that it’d still allow for resolving one’s uncertainty. In other words, if we do recognise only objective “oughts”:

• We may be stuck with fairly useless principles for decision-making, such as “Just do what’s actually right, even when you don’t know what’s actually right”
• But (as far as I can tell) we could still be guided to clarify and reduce our uncertainties, and thereby bring our beliefs more in line with what’s actually right.

Rational or moral?

There is also debate about precisely what kind of “should” is involved [in cases of moral uncertainty]: rational, moral, or something else again. (Greaves & Cotton-Barratt)

For example, in the above example of Susan the doctor, are we wondering what she rationally ought to do, given her moral uncertainty about the moral status of chimpanzees, or what she morally ought to do?

It may not matter either way

Unfortunately, even after having read up on this, it’s not actually clear to me what the distinction is meant to be. In particular, I haven’t come across a clear explanation of what it would mean for the “should” or “ought” to be moral. I suspect that what that would mean would be partly a matter of interpretation, and that some definitions of a “moral” should could be effectively the same as those for a “rational” should. (But I should note that I didn’t look exhaustively for such explanations and definitions.)

Additionally, both Greaves & Cotton-Barratt and MacAskill explicitly avoid the question of whether what one “ought to do” under moral uncertainty is a matter of rationality or morality.[5] This does not seem to at all hold them back from making valuable contributions to the literature on moral uncertainty (and, more specifically, on how to make decisions when morally uncertain).

Together, the above points make me inclined to believe (though with low confidence) that this may be a “merely verbal” debate with no real, practical implications (at least while the words involved remain as fuzzy as they are).

However, I still did come to two less-dismissive conclusions:

1. I’m very confident that the project of working out meaningful, action-guiding principles for decision-making under moral uncertainty makes sense if we see the relevant “should” as a rational one. (Note: This doesn’t mean that I think the “should” has to be seen as a rational one.)
2. I’m less sure whether that project would make sense if we see the relevant “should” as a moral one. (Note: This doesn’t mean I have any particular reason to believe it wouldn’t make sense if we see the “should” as a moral one.)

I provide my reasoning behind these conclusions below, though, given my sense that this debate may lack practical significance, some readers may wish to just skip to the next section.

A rational “should” likely works

Bykvist writes:

An alternative way to understand the ought relevant to moral uncertainty is in terms of rationality (MacAskill et al., forthcoming; Sepielli, 2013). Rationality, in one important sense at least, has to do with what one should do or intend, given one’s beliefs and preferences. This is the kind of rationality that decision theory often is seen as invoking. It can be spelled out in different ways. One is to see it as a matter of coherence: It is rational to do or intend what coheres with one’s beliefs and preferences (Broome, 2013; for a critic, see Arpaly, 2000). Another way to spell it out is to understand it as a matter of rational processes: it is rational to do or intend what would be the output of a rational process, which starts with one’s beliefs and preferences (Kolodny, 2007).

To apply the general idea to moral uncertainty, we do not need to take a stand on which version is correct. We only need to assume that when a conscientious moral agent faces moral uncertainty, she cares about doing right and avoiding doing wrong but is uncertain about the moral status of her actions. She prefers doing right to doing wrong and is indifferent between different right doings (at least when the right doings have the same moral value, that is, none is morally supererogatory). She also cares more about serious wrongdoings than minor wrongdoings. The idea is then to apply traditional decision theoretical principles, according to which rational choice is some function of the agent’s preferences (utilities) and beliefs (credences). Of course, different decision‐theories provide different principles (and require different kinds of utility information). But the plausible ones at least agree on cases where one option dominates another.

Suppose that you are considering only two theories (which is to simplify considerably, but we only need a logically possible case): “business as usual,” according to which it is permissible to eat factory‐farmed meat and permissible to eat vegetables, and “vegetarianism,” according to which it is impermissible to eat factory‐farmed meat and permissible to eat vegetables. Suppose further that you have slightly more confidence in “business as usual.” The option of eating vegetables will dominate the option of eating meat in terms of your own preferences: No matter which moral theory is true, by eating vegetables, you will ensure an outcome that you weakly [prefer] to the alternative outcome: if “vegetarianism” is true, you prefer the outcome; if “business as usual” is true, you are indifferent between the outcomes. The rational thing for you to do is thus to eat vegetables, given your beliefs and preferences. (line breaks added)

It seems to me that that reasoning makes perfect sense, and that we can have valid, meaningful, action-guiding principles about what one rationally (and subjectively) should do given one’s moral uncertainty. This seems further supported by the approach Christian Tarsney takes, which seems to be useful and to also treat the relevant “should” as a rational one.
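Bykvist’s dominance reasoning can be sketched in a few lines (the credences and the 0/-1 values below are illustrative assumptions; only their ordering matters, and, as the code shows, the credences play no role in dominance):

```python
# Sketch of Bykvist's dominance argument. Credences and values are
# illustrative assumptions; dominance depends only on the value ordering.
credence = {"business_as_usual": 0.55, "vegetarianism": 0.45}

# Moral status of each option under each theory:
# 0 = permissible, -1 = impermissible.
value = {
    "eat_meat":       {"business_as_usual": 0, "vegetarianism": -1},
    "eat_vegetables": {"business_as_usual": 0, "vegetarianism": 0},
}

def weakly_dominates(a, b):
    """a is at least as good as b under every theory, and strictly
    better under at least one (credences never enter the comparison)."""
    at_least_as_good = all(value[a][t] >= value[b][t] for t in credence)
    strictly_better = any(value[a][t] > value[b][t] for t in credence)
    return at_least_as_good and strictly_better

print(weakly_dominates("eat_vegetables", "eat_meat"))  # True
```

So eating vegetables weakly dominates eating meat even though the agent has slightly more confidence in “business as usual”, which is exactly why plausible decision theories agree on this case.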

Furthermore, MacAskill seems to suggest that there’s a correlation between (a) writers fully engaging with the project of working out action-guiding principles for decision-making under moral uncertainty and (b) writers considering the relevant “should” to be rational (rather than moral):

(Lockhart 2000, 24, 26), (Sepielli 2009, 10) and (Ross 2006) all take metanormative norms to be norms of rationality. (Weatherson 2014) and (Harman 2014) both understand metanormative norms as moral norms. So there is an odd situation in the literature where the defenders of metanormativism (Lockhart, Ross, and Sepielli) and the critics of the view (Weatherson and Harman) seem to be talking past one another.

A moral “should” may or may not work

I haven’t seen any writer (a) explicitly state that they understand the relevant “should” to be a moral one, and then (b) go on to fully engage with the project of working out meaningful, action-guiding principles for decision-making under moral uncertainty. Thus, I have an absence of evidence that one can engage in that project while seeing the “should” as moral, and I take this as (very weak) evidence that one can’t engage in that project while seeing the “should” that way.

Additionally, as noted above, MacAskill writes that Weatherson and Harman (who seem fairly dismissive of that project) see the relevant “should” as a moral one. Arguably, this is evidence that that project of finding such action-guiding principles won’t make sense if we see the “should” as moral (rather than rational). However, I consider this to also be very weak evidence, because:

• It’s only two data points.
• It’s just a correlation anyway.
• I haven’t closely investigated the “correlation” myself. That is, I haven’t checked whether or not Weatherson and Harman’s reasons for dismissiveness seem highly related to them seeing the “should” as moral rather than rational.

Closing remarks

In this post, I’ve aimed to:

• Clarify what is meant by the question “Is what we “ought to do” under moral uncertainty an objective or subjective matter?”
• Clarify what is meant by the question “Is that ‘ought’ a matter of rationality or of morality?”
• Argue that it seems both more intuitive and more action-guiding to say that the “ought” is subjective.
• Argue that whether the “ought” is a rational or a moral one may be a “merely verbal” dispute with no practical significance (but that interpreting the “ought” as a matter of rationality works in any case).

I hope this has helped give readers more clarity on the seemingly neglected matter of what we actually mean by moral uncertainty. (And as always, I’d welcome any feedback or comments!)

In my next post, I’ll continue in a similar vein, but this time focusing on whether, when we’re talking about moral uncertainty, we’re actually talking about moral risk rather than about moral (Knightian) uncertainty - and whether such a distinction is truly meaningful.

1. But the current post is still relevant for many types of moral antirealist. As noted in my last post, this sequence will sometimes use language that may appear to endorse or presume moral realism, but this is essentially just for convenience. ↩︎

2. We could further divide subjective normativity up into, roughly, “what one should do based on what one actually believes” and “what one should do based on what it would be reasonable for one to believe”. The following quote, while not directly addressing that exact distinction, seems relevant:

Before moving on, we should distinguish subjective credences, that is, degrees of belief, from epistemic credences, that is, the degree of belief that one is epistemically justified in having, given one’s evidence. When I use the term ‘credence’ I refer to epistemic credences (though much of my discussion could be applied to a parallel discussion involving subjective credences); when I want to refer to subjective credences I use the term ‘degrees of belief’.

The reason for this is that appropriateness seems to have some sort of normative force: if it is most appropriate for someone to do something, it seems that, other things being equal, they ought, in the relevant sense of ‘ought’, to do it. But people can have crazy beliefs: a psychopath might think that a killing spree is the most moral thing to do. But there’s no sense in which the psychopath ought to go on a killing spree: rather, he ought to revise his beliefs. We can only capture that idea if we talk about epistemic credences, rather than degrees of belief.

(I found that quote in this comment, where it’s attributed to MacAskill’s BPhil thesis. Unfortunately, I can’t seem to access that thesis, including via Wayback Machine.) ↩︎

3. Though note that Greaves and Cotton-Barratt write:

Not everyone does recognise a subjective reading of the moral ‘ought’, even in the case of empirical uncertainty. One can distinguish between objectivist, (rational-)credence-relative and pluralist views on this matter. According to objectivists (Moore, 1903; Moore, 1912; Ross, 1930, p.32; Thomson, 1986, esp. pp. 177-9; Graham, 2010; Bykvist and Olson, 2011) (respectively, credence-relativists (Prichard, 1933; Ross, 1939; Howard-Snyder, 2005; Zimmermann, 2006; Zimmerman, 2009; Mason, 2013)), the “ought” of morality is uniquely an objective (respectively, a credence-relative) one. According to pluralists, “ought” is ambiguous between these two readings (Russell, 1966; Gibbard, 2005; Parfit, 2011; Portmore, 2011; Dorsey, 2012; Olsen, 2017), or varies between the two readings according to context (Kolodny and Macfarlane, 2010).

↩︎
4. In the following quote, Bykvist provides what seems to me (if I’m interpreting it correctly) to be a different way of explaining something similar to the objective vs subjective distinction.

One possible explanation of why so few philosophers have engaged with moral uncertainty might be serious doubt about whether it makes much sense to ask about what one ought to do when one is uncertain about what one ought to do. The obvious answer to this question might be thought to be: “you ought to do what you ought to do, no matter whether or not you are certain about it” (Weatherson, 2002, 2014). However, this assumes the same sense of “ought” throughout.

A better option is to assume that there are different kinds of moral ought. We are asking what we morally ought to do, in one sense of ought, when we are not certain about what we morally ought to do, in another sense of ought. One way to make this idea more precise is to think about the different senses as different levels of moral ought. When we face a moral problem, we are asking what we morally ought to do, at the first level. Standard moral theories, such as utilitarianism, Kantianism, and virtue ethics, provide answers to this question. In a case of moral uncertainty, we are moving up one level and asking about what we ought to do, at the second level, when we are not sure what we ought to do at the first level. At this second level, we take into account our credence in various hypotheses about what we ought to do at the first level and what these hypotheses say about the moral value of each action (MacAskill et al., forthcoming). This second level ought provides a way to cope with the moral uncertainty at the first level. It gives us a verdict of how to best manage the risk of doing first order moral wrongs. That there is such a second‐level moral ought of coping with first‐order moral risks seems to be supported by the fact that agents are morally criticizable when they, knowing all the relevant empirical facts, do what they think is very likely to be a first‐order moral wrong when there is another option that is known not to pose any risk of such wrongdoing.

Yet another (and I think similar) way of framing this sort of distinction could make use of the following two terms: “A criterion of rightness tells us what it takes for an action to be right (if it’s actions we’re looking at). A decision procedure is something that we use when we’re thinking about what to do” (Askell).

Specifically, we might say that the true first-order moral theory provides objective “criteria of rightness”, but that we don’t have direct access to what these are. As such, we can use a second-order “decision procedure” that attempts to lead us to take actions that are as close as possible to the best actions (according to the unknown criteria of rightness). To do so, this decision procedure must make use of our credences (beliefs) in various moral theories, and is thus subjective. ↩︎
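One common form such a second-order decision procedure can take is maximizing expected choiceworthiness: weight each first-order theory's evaluation of an action by one's credence in that theory. Here is a toy sketch of that idea; the theory names, actions, and numbers are all illustrative, not drawn from the text.

```python
# Toy sketch of a second-level "decision procedure" under moral
# uncertainty: maximize expected choiceworthiness across first-order
# theories, weighted by one's credence in each theory.
# All names and numbers below are illustrative assumptions.

def expected_choiceworthiness(action, credences, values):
    """credences: theory -> probability; values: theory -> {action: value}."""
    return sum(p * values[theory][action] for theory, p in credences.items())

def choose(actions, credences, values):
    """Pick the action with the highest credence-weighted value."""
    return max(actions, key=lambda a: expected_choiceworthiness(a, credences, values))

# Example: two theories disagree about which action is best.
credences = {"utilitarian": 0.6, "deontological": 0.4}
values = {
    "utilitarian":   {"lie": 10,   "tell_truth": 4},
    "deontological": {"lie": -100, "tell_truth": 5},
}

# Despite higher credence in utilitarianism, the large deontological
# penalty for lying makes truth-telling the expected-value choice.
print(choose(["lie", "tell_truth"], credences, values))
```

Note that this sketch quietly assumes the theories' value scales are intertheoretically comparable, which is itself a contested assumption in this literature.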

5. Greaves & Cotton-Barratt write: “For the purpose of this article, we will [...] not take a stand on what kind of “should” [is involved in cases of moral uncertainty]. Our question is how the “should” in question behaves in purely extensional terms. Say that an answer to that question is a metanormative theory.”

MacAskill writes: “I introduce the technical term ‘appropriateness’ in order to remain neutral on the issue of whether metanormative norms are rational norms, or some other sort of norms (though noting that they can’t be first-order norms provided by first-order normative theories, on pain of inconsistency).” ↩︎

Discuss

### Drawing Toward Power

January 13, 2020 - 10:46
Published on January 13, 2020 4:24 AM UTC

At the Bay Area pre-solstice unconference, I gave a talk which started as a discussion of how rationalists could build tools to enable better organization and conversation online. I'm a largely improvisational speaker, though, and it quickly turned into a discussion of the future of the movement.

I'm pretty old now, and I've seen a number of groups of people move from relative obscurity, to positions of power or, at least, to being subjects of general curiosity. The most relevant, I think, for rationalists are the early (pre-2002) Googlers.

If a social group gets projected onto the broader canvas of mass attention, or their interests get magnified through access to the levers of power, you get criticism and praise in varied amounts (Paul Graham has just written a piece about this). But just as visibly, the tiniest omissions and imperfections of the analysis and attitudes of the founders are also magnified.

I gave a talk in, I think, 2009, about this at a Foo Camp. I warned in an environment that was ready to hear it, but not capable of changing matters much, that geeks were about to become dangerous: that we had a set of Tragic Flaws, that we were already seeing magnified in the wider world.

I don't remember all of the characteristics of geekdom I gave then that would lead to its downfall, but a couple of them stuck with me: that we were bending the workplace into our own vision of a pleasant place to be (unbounded by the 9-5, contract-driven, full of interest and unboring, self-driven, university-like), and that meant that we were work-focused, and pushing others to be so too (even when their lives could not be bent that way). We experienced burn-out as a part of our lives, and now we were driving others into burn-out. We viewed efficiency as a life goal for ourselves, individually, and that was how we were encouraging others to live. We were strongly influenced by our alienation from others as young people, and that meant (paradoxically) that when we did assume power, we would not recognise it, and instead continued to use the habits and attitudes of outsiders with no power. (I called this "Stalinism": Stalin's paranoia and cruelty may have come from being part of a weak minority group experiencing cruelty from the more powerful. I'm not sure if that is truly the case, but perhaps a better example is Bill Gates, who for years assumed that Microsoft had to act as a scrappy competitor, because it was weak and could easily be destroyed by IBM -- even when it had reached the point of dominating the PC market.)

I see the same dynamic playing out among rationalists now. Dominic Cummings will not be the last powerbroker to see the rationalist point of view as providing an edge that can be swiftly adopted. At my talk at the Solstice unconference, I described this opportunity as emerging from rationalists' appearing similar to the existing powerful groups (well-educated, mildly secular, polite, verbal) -- but also seemingly "harmless".

This is important, because some groups who seek to transform society are quickly beaten down with baseball bats, either because of their unfamiliarity or because they are all too easily pattern-matched as "dangerous" radicals. (I may have noted that everyone at the Unconference looked like the radical unitarians from Unsong, whose very existence in the book was to play up the joke of how un-dangerous they seemed.)

What this means is that rationalist ideas, even as they are disparaged as weird by the first waves of media, will be adopted by enclaves of the powerful far more quickly than one might expect -- or than might be healthy for anyone involved. If Bill Gates, or Googlers (or Stalin!), can go from unforeseen upstarts to being able to affect world events in a decade or two, those minor flaws can turn into tragedies without ever being addressed or corrected.

Rationalists have the advantage of their own internal warning system. You have, as they say, noticed the skulls. So what are the heroic flaws, the blind spots, the monocultural assumptions that will lead to the movement's downfall?

Discuss