Utility ≠ Reward

LessWrong.com News - September 5, 2019 - 20:28
Published on September 5, 2019 5:28 PM UTC

This essay is an adaptation of a talk I gave at the Human-Aligned AI Summer School 2019 about our work on mesa-optimisation. My goal here is to write an informal, accessible and intuitive introduction to the worry that we describe in our full-length report.

I will skip most of the detailed analysis from our report, and encourage the curious reader to follow up this essay with our sequence or report.

The essay has six parts:

Two distinctions draws the foundational distinctions between “optimised” and “optimising”, and between utility and reward.

What objectives? discusses the behavioral and internal approaches to understanding objectives of ML systems.

Why worry? outlines the risk posed by the utility ≠ reward gap.

Mesa-optimisers introduces our language for analysing this worry.

An alignment agenda sketches different alignment problems presented by these ideas, and suggests transparency and interpretability as a way to solve them.

Where does this leave us? summarises the essay and suggests where to look next.

The views expressed here are my own, and do not necessarily reflect those of my coauthors or MIRI. While I wrote this essay in first person, all of the core ideas are the fruit of an equal collaboration between Joar Skalse, Chris van Merwijk, Evan Hubinger and myself. I wish to thank Chris and Joar for long discussions and input as I was writing my talk, and all three, as well as Jaime Sevilla Molina, for thoughtful comments on this essay.

≈3300 words.

Two distinctions

I wish to draw a distinction which I think is crucial for clarity about AI alignment, yet is rarely drawn. That distinction is between the reward signal of a reinforcement learning (RL) agent and its “utility function”[1]. That is to say, it is not in general true that the policy of an RL agent is optimising for its reward. To explain what I mean by this, I will first draw another distinction, between “optimised” and “optimising”. These distinctions lie at the core of our mesa-optimisation framework.

It’s helpful to begin with an analogy. Viewed abstractly, biological evolution is an optimisation process that searches through configurations of matter to find ones that are good at replication. Humans are a product of this optimisation process, and so we are to some extent good at replicating. Yet we don’t care, by and large, about replication in itself.

Many things we care about look like replication. One might be motivated by starting a family, or by having a legacy, or by similar closely related things. But those are not replication itself. If we cared about replication directly, gamete donation would be a far more mainstream practice than it is, for instance.

Thus I want to distinguish the objective of the selection pressure that produced humans from the objectives that humans pursue. Humans were selected for replication, so we are good replicators. This includes having goals that correlate with replication. But it is plain that we are not motivated by replication itself. As a slogan, though we are optimised for replication, we aren’t optimising for replication.

Another clear case where “optimised” and “optimising” come apart are “dumb” artifacts like bottle caps. They can be heavily optimised for some purpose without optimising for anything at all.

These examples support the distinction I want to make: optimised ≠ optimising. They also illustrate how this distinction is important in two ways:

1. A system optimised for an objective need not be pursuing any objectives itself. (As illustrated by bottle caps.)
2. The objective a system pursues isn’t determined by the objective it was optimised for. (As illustrated by humans.)

The reason I draw this distinction is to ask the following question:

Our machine learning models are optimised for some loss or reward. But what are they optimising for, if anything? Are they like bottle caps, or like humans, or neither?

In other words, do RL agents have goals? And if so, what are they?

These questions are hard, and I don’t think we have good answers to any of them. In any case, it would be premature, in light of the optimised ≠ optimising distinction, to conclude that a trained RL agent is optimising for its reward signal.

Certainly, the RL agent (understood as the agent’s policy representation, since that is the part that does all of the interesting decision-making) is optimised for performance on its reward function. But in the same way that humans are optimised for replication, but are optimising for our own goals, a policy that was selected for its performance on reward may in fact have its own internally-represented goals, only indirectly linked to the intended reward. A pithy way to put this point is to say that utility ≠ reward, if we want to call the objective a system is optimising its “utility”. (This is by way of metaphor – I don’t suggest that we must model RL agents as expected utility maximizers.)

Let’s make this more concrete with an example. Say that we train an RL agent to perform well on a set of mazes. Reward is given for finding and reaching the exit door in each maze (which happens to always be red). Then we freeze its policy and transfer the agent to a new environment set for testing. In the new mazes, the exit doors are blue, and red distractor objects are scattered elsewhere in the maze. What might the agent do in the new environment?

Three things might happen.

1. It might generalise: the agent could solve the new mazes just as well, reaching the exit and ignoring the distractors.
2. It might break under the distributional shift: the agent, unused to the blue doors and weirdly-shaped distractor objects, could start twitching or walking into walls, and thus fail to reach the exit.
3. But it might also fail to generalise in a more interesting way: the agent could fail to reach the exit, but could instead robustly and competently find the red distractor in each maze we put it in.

To the extent that it's meaningful to talk about the agent's goals, the contrast between the first and third cases suggests that those goals depend only on its policy, and are distinct from its reward signal. It is tempting to say that the objective of the first agent is reaching doors; that the objective of the third agent is to reach red things. It does not matter that in both cases, the policy was optimised to reach doors.
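To make the contrast between the first and third cases concrete, here is a minimal sketch of the maze story. Everything in it (the grid encoding, the hand-coded greedy policies standing in for trained ones) is invented for illustration: the point is only that a "go to red" policy and a "go to the exit" policy agree on the training maze, where the exit is red, and disagree on the test maze.

```python
# Toy illustration: a policy that internalised "go to red" behaves
# identically to "go to the exit" in training, where the exit is
# always red, but diverges in testing. All details here are invented.

from collections import deque

def find_nearest(grid, target):
    """BFS from 'S'; return the coordinates of the nearest `target` cell."""
    rows = [list(r) for r in grid]
    start = next((i, j) for i, row in enumerate(rows)
                 for j, c in enumerate(row) if c == "S")
    q, seen = deque([start]), {start}
    while q:
        i, j = q.popleft()
        if rows[i][j] == target:
            return (i, j)
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < len(rows) and 0 <= nj < len(rows[0]) \
                    and rows[ni][nj] != "#" and (ni, nj) not in seen:
                seen.add((ni, nj))
                q.append((ni, nj))
    return None

red_seeker = lambda grid: find_nearest(grid, "R")   # the learned proxy goal
exit_seeker = lambda grid: find_nearest(grid, "E")  # the intended goal

# Training maze: the exit cell is red ('R'), so the two goals coincide.
train = ["S..#.",
         "..#..",
         "....R"]
assert red_seeker(train) == (2, 4)

# Test maze: the exit 'E' is blue; a red distractor 'R' sits elsewhere.
test = ["S..R.",
        "..#..",
        "....E"]
print(red_seeker(test))    # → (0, 3): the distractor, not the exit
print(exit_seeker(test))   # → (2, 4): what we actually wanted
```

Both policies are optimal on the training distribution, so nothing in training distinguishes them.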

This makes sense if we consider how information about the reward gets into the policy:

For any given action, the policy’s decision is made independently of the reward signal. The reward is only used (standardly, at least) to optimise the policy between actions. So the reward function can’t be the policy’s objective – one cannot be pursuing something one has no direct access to. At best, we can hope that whatever objective the learned policy has access to is an accurate representation of the reward. But the two can come apart, so we must draw a distinction between the reward itself and the policy’s internal objective representation.
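The point about where reward enters can be made concrete with a minimal REINFORCE-style sketch (a hypothetical two-armed bandit, not any particular system from the text): the reward appears only inside the update step that runs between actions, never inside the action-selection function.

```python
# Minimal sketch of the standard RL loop, showing where the reward
# signal actually enters: only in `update`, never in `act`. The
# policy's decision at each step is a function of its parameters alone.

import math, random

theta = [0.0, 0.0]  # policy parameters for a 2-armed bandit

def act():
    # Action selection: depends only on theta, not on any reward.
    z = [math.exp(t) for t in theta]
    p = [x / sum(z) for x in z]
    return 0 if random.random() < p[0] else 1

def update(action, reward, lr=0.1):
    # The ONLY place the reward appears: a REINFORCE-style gradient
    # step that nudges theta between actions.
    z = [math.exp(t) for t in theta]
    p = [x / sum(z) for x in z]
    for a in range(2):
        grad = (1.0 if a == action else 0.0) - p[a]
        theta[a] += lr * reward * grad

random.seed(0)
for _ in range(500):
    a = act()
    r = 1.0 if a == 1 else 0.0  # arm 1 is the rewarded one
    update(a, r)

print(theta)  # theta[1] > theta[0]: reward shaped the policy only indirectly
```

Whatever goals the trained policy ends up with must be encoded in `theta`; the reward function itself is not available to `act` at decision time.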

To recap: whether an AI system is goal-directed or not is not trivially answered by the fact that it was constructed to optimise an objective. To say that is to fail to draw the optimised ≠ optimising distinction. If we then take goal-directedness in AI systems seriously, we must draw a distinction between the AI’s internal learned objective and the objective it was trained on; that is, draw the utility ≠ reward distinction.

What objectives?

I’ve been talking about the objective of the RL agent, or its “utility”, as if it is an intuitively sensible object. But what actually is it, and how can we know it? In a given training setup, we know the reward. How do we figure out the utility?

Intuitively, the idea of the internal goal being pursued by a learned system feels compelling to me. Yet right now, we don't have any good ways to make the intuition precise – figuring out how to do that is an important open question. As we start thinking about how to make progress, there are at least two approaches we can take: what I’d call the behavioural approach and the internal approach.

Taking the behavioural approach, we look at how decisions made by a system systematically lead to certain outcomes. We then infer objectives from studying those decisions and outcomes, treating the system as a black box. For example, we could apply Inverse Reinforcement Learning to our trained agents. Eliezer’s formalisation of optimisation power also seems to follow this approach.

Or, we can peer inside the system, trying to understand the algorithm implemented by it. This is the internal approach. The goal is to achieve a mechanistic model that is abstract enough to be useful, but still grounded in the agent’s inner workings. Interpretability and transparency research take this approach generally, though as far as I can tell, the specific question of objectives has not yet seen much attention.

It’s unclear whether one approach is better, as both potentially offer useful tools. At present, I am more enthusiastic about the internal approach, both philosophically and as a research direction. Philosophically, I am more excited about it because understanding a model’s decision-making feels more explanatory[2] than making generalisations about its behaviour. As a research direction, it has potential for empirically-grounded insights which might scale to future prosaic AI systems. Additionally, there is the possibility of low-hanging fruit, as this space appears underexplored.

Why worry?

Utility and reward are distinct. So what? If a system is truly optimised for an objective, determining its internal motivation is an unimportant academic debate. Only its real-world performance matters, not the correct interpretation of its internals. And if the performance is optimal, then isn’t our work done?

In practice, we don’t get to optimise performance completely. We want to generalise from limited training data, and we want our systems to be robust to situations not foreseen in training. This means that we don’t get to have a model that’s perfectly optimised for the thing we actually want. We don’t get optimality on the full deployment distribution complete with unexpected situations. At best, we know that the system is optimal on the training distribution. In this case, knowing whether the internal objective of the system matches the objective we selected it for becomes crucial, as if the system’s capabilities generalise while its internal goal is misaligned, bad things can happen.

Say that we prove, somehow, that optimising the world with respect to some objective is safe and useful, and that we can train an RL agent using that objective as reward. The utility ≠ reward distinction means that even in that ideal scenario, we are still not done with alignment. We still need to figure out a way to actually install that objective (and not a different objective that still results in optimal performance in training) into our agent. Otherwise, we risk creating an AI that appears to work correctly in training, but which is revealed to be pursuing a different goal when an unusual situation happens in deployment. So long as we don’t understand how objectives work inside agents, and how we can influence those objectives, we cannot be certain of the safety of any system we build, even if we literally somehow have a proof that the reward it was trained on was “correct”.

Will highly-capable AIs be goal-directed? I don’t know for sure, and it seems hard to gather evidence about this, but my guess is yes. Detailed discussion is beyond our scope, but I invite the interested reader to look at some arguments about this that we present in section 2 of the report. I also endorse Rohin Shah’s Will Humans Build Goal-Directed Agents?.

All this opens the possibility for misalignment between reward and utility. Are there reasons to believe the two will actually come apart? By default, I expect them to. Ambiguity and underdetermination of reward mean that there are many distinct objectives that all result in the same behaviour in training, but which can disagree in testing. Think of the maze agent, whose reward in training could mean “go to red things” or “go to doors”, or a combination of the two. For reasons of bounded rationality, I also expect pressures for learning proxies for the reward instead of the true reward, when such proxies are available. Think of humans, whose goals are largely proxies for reproductive success, rather than replication itself. (This was a very brief overview; section 3 of our report examines this question in depth, and expands on these points more.)

The second reason these ideas matter is that we might not want goal-directedness at all. Maybe we just want tool AI, or AI services, or some other kind of non-agentic AI. Then, we want to be certain that our AI is not somehow goal-directed in a way that would cause trouble off-distribution. This could happen without us building it in – after all, evolution didn’t set out to make goal-directed systems. Goal-directedness just turned out to be a good feature to include in its replicators. Likewise, it may be that goal-directedness is a performance-boosting feature in classifiers, so powerful optimisation techniques would create goal-directed classifiers. Yet perhaps we are willing to take the performance hit in exchange for ensuring our AI is non-agentic. Right now, we don’t even get to choose, because we don’t know when systems are goal-directed, nor how to influence learning processes to avoid learning goal-directedness.

Taking a step back, there is something fundamentally concerning about all this.

We don’t understand our AIs’ objectives, and we don’t know how to set them.

I don’t think this phrase should ring true in a world where we hope to build friendly AI. Yet today, to my ears, it does. I think that is a good reason to look more into this question, whether to solve it or to assure ourselves that the situation is less bad than it sounds.

Mesa-optimisers

This worry is the subject of our report. The framework of mesa-optimisation is a language for talking about goal-directed systems under the influence of optimisation processes, and about the objectives involved.

A part of me is worried that the terminology invites viewing mesa-optimisers as a description of a very specific failure mode, instead of as a language for the general worry described above. I don’t know to what degree this misconception occurs in practice, but I wish to preempt it here anyway. (I want data on this, so please leave a comment if you had confusions about this after reading the original report.)

In brief, our terms describe the relationship between a system doing some optimisation (the base optimiser, e.g.: evolution, SGD), and a goal-directed system (the mesa-optimiser, e.g.: human, ML model) that is being optimised by that first system. The objective of the base optimiser is the base objective; the internal objective of the mesa-optimiser is the mesa-objective.

(“Mesa” is a Greek word that means the opposite of “meta”. The reason we use “mesa” is to highlight that the mesa-optimiser is an optimiser that is itself being optimised by another optimiser. It is a kind of dual to a meta-optimiser, which is an optimiser that is itself optimising another optimiser.

While we’re on the topic of terms, “inner optimiser” is a confusing term that we used in the past to refer to the same thing as “mesa-optimiser”. It did not accurately reflect the concept, and has been retired in favour of the current terminology. Please use ”mesa-optimiser” instead.)

I see “optimiser” in “mesa-optimiser” as a way of capturing goal-directedness, rather than a commitment to some kind of (utility-)maximising structure. What feels important to me is the goal-directedness of the mesa-optimiser, not its optimisational nature: a goal-directed system which isn’t taking strictly optimal actions (but which is still competent at pursuing its mesa-objective) is still worrying. It seems plausible that optimisation is a good way to model goal-directedness—though I don’t think we have made much progress on that front—but equally, it seems plausible that some other approach we have not yet explored could work better. So I myself read the “optimiser” in “mesa-optimiser” analogously to how I accept treating humans as optimisers; as a metaphor, more than anything else.

I am not sure that mesa-optimisation is the best possible framing of these concerns. I would welcome more work that attempts to untangle these ideas, and to improve our concepts.

An alignment agenda

There are at least three alignment-related ideas prompted by this worry.

The first is unintended optimisation. How do we ensure that systems that are not supposed to be goal-directed actually end up being not-goal-directed?

The second is to factor alignment into inner alignment and outer alignment. If we expect our AIs to be goal-directed, we can view alignment as a two-step process. First, ensure outer alignment between humans and the base objective of the AI training setup, and then ensure inner alignment between the base objective and the mesa-objective of the resulting system. The former involves finding reward functions that are low-impact, corrigible, aligned with human preferences, or otherwise desirable, and has been the focus of much of the progress made by the alignment community so far. The latter involves figuring out learned goals, interpretability, and a whole host of other potential approaches that have not yet seen much popularity in alignment research.

The third is something I want to call end-to-end alignment. It’s not obvious that alignment must factor in the way described above. There is room for trying to set up training in such a way to guarantee a friendly mesa-objective somehow without matching it to a friendly base-objective. That is: to align the AI directly to its human operator, instead of aligning the AI to the reward, and the reward to the human. It’s unclear how this kind of approach would work in practice, but this is something I would like to see explored more. I am drawn to staying focused on what we actually care about (the mesa-objective) and treating other features as merely levers that influence the outcome.

We must make progress on at least one of these problems if we want to guarantee the safety of prosaic AI. If we don’t want goal-directed AI, we need to reliably prevent unintended optimisation. Otherwise, we want to solve either inner and outer alignment, or end-to-end alignment. Success at any of these requires a better understanding of goal-directedness in ML systems, and a better idea of how to control the emergence and nature of learned objectives.

More broadly, it seems that taking these worries seriously will require us to develop better tools for looking inside our AI systems and understanding how they work. In light of these concerns I feel pessimistic about relying solely on black-box alignment techniques. I want to be able to reason about what sort of algorithm is actually implemented by a powerful learned system if I am to feel comfortable deploying it.

Right now, learned systems are (with maybe the exception of feature representation in vision) more-or-less hopelessly opaque to us. Not just in terms of goals, which is the topic here—most aspects of their cognition and decision-making are obscure. The alignment concern about objectives that I am presenting here is just one argument for why we should take this obscurity seriously; there may be other risks hiding in our poor understanding of AI inner workings.

Where does this leave us?

In summary, whether a learned system is pursuing any objective is far from a trivial question. It is also not trivially true that a system optimised for achieving high reward is optimising for reward.

This means that with our current techniques and understanding, we don’t get to know or control what objective a learned system is pursuing. This matters because in unusual situations, it is that objective that will determine the system’s behaviour. If that objective mismatches the base objective, bad things can happen. More broadly, our ignorance about the cognition of current systems does not bode well for our prospects at understanding cognition in more capable systems.

This forms a substantial hole in our prospects at aligning prosaic AI. What sort of work would help patch this hole? Here are some candidates:

• Empirical work. Distilling examples of goal-directed systems and creating convincing scaled-down examples of inner alignment failures, like the maze agent example.
• Philosophical, deconfusion and theoretical work. Improving our conceptual frameworks about goal-directedness. This is a promising place for philosophers to make technical contributions.
• Interpretability and transparency. Getting better tools for understanding decision-making, cognition and goal-representation in ML systems.

These feel to me like the most direct attacks on the problem. I also think there could be relevant work to be done in verification, adversarial training, and even psychology and neuroscience (I have in mind something like a review of how these processes are understood in humans and animals, though that might come up with nothing useful), and likely in many more areas: this list is not intended to be exhaustive.

While the present state of our understanding feels inadequate, I can see promising research directions. This leaves me hopeful that we can make substantial progress, however confusing these questions appear today.

1. By “utility”, I mean something like “the goal pursued by a system”, in the way that it’s used in decision theory. In this post, I am using this word loosely, so I don’t give a precise definition. In general, however, clarity on what exactly “utility” means for an RL agent is an important open question. ↩︎

2. Perhaps the intuition I have is a distant cousin to the distinction drawn by Einstein between principle and constructive theories. The internal approach seems more like a “constructive theory” of objectives. ↩︎


Nonviolent Communication: Practice Session

Events at Kocherga - September 5, 2019 - 19:30
How can you have fewer conflicts without sacrificing your own interests? Nonviolent communication is a set of skills for reaching mutual understanding with people. Come to our practice sessions to learn these skills and communicate more sensitively and effectively.

Logical Counterfactuals and Proposition graphs, Part 3

LessWrong.com News - September 5, 2019 - 18:03
Published on September 5, 2019 3:03 PM UTC

Note that many of the words and symbols I am using are made up. When this maths is better understood, someone should reinvent the notation. My notation isn't that great, but it's hard to make good notation when you are still working out what you want to describe.

A theory (of my new type, not standard maths) T is formally defined to be a tuple T=(ψ,ρ,Ξ,type,arity), where:

ψ={s1,s2,...} is the theory's set of symbols. These can be arbitrary mathematical objects.

Ξ={Ξ1,Ξ2,...} is the set of types, also arbitrary.

type:ψ→Ξ is a function.

arity:ψ→⋃_{i=0}^∞ Ξ^i is also a function; it assigns each symbol a finite list of types.

An expression E in a theory is defined recursively: it consists of a list E=[s,v1,v2,⋯,vn], where s∈ψ and each vi (1≤i≤n) is an expression.

Let arity(s)=[x1,x2,...xm]

Sometimes we will write type(E); what we really mean is type(s). Similarly, arity(E)=arity(s). We write symbol(E)=s when we want to refer to just s, ignoring v1,...,vn.

Expressions have the restriction that m=n: the number of types in the arity of s is the same as the number of subexpressions.

We also insist that for all i, type(vi)=xi

The base case happens when arity(s)=[], the empty list.

All expressions are strictly finite mathematical objects. There are no self referential expressions.

Expressions can be written s(v1,...,vn)

We can define equality between expressions e=s(v1,...) and f=t(w1,...) recursively by saying e=f iff symbol(e)=symbol(f) and for all i: vi=wi.

The difference e−f between expressions e, f is defined recursively:

e−f=None if e=f

e−f=[e,f] if symbol(e)≠symbol(f)

e−f=[e,f] if ∃i≠j: vi≠wi and vj≠wj (two or more subexpressions differ)

e−f=vi−wi if ∃! i: vi≠wi (exactly one subexpression differs)
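These definitions translate almost line-for-line into code. Here is a sketch in Python (types and arities are omitted for brevity, and the list representation of expressions is my own choice, not part of the theory):

```python
# A direct Python transcription of the expression definitions above.
# s(v1, ..., vn) is represented as the list ["s", v1, ..., vn].

def eq(e, f):
    """e = f iff the head symbols match and all subexpressions are equal."""
    return e[0] == f[0] and len(e) == len(f) \
        and all(eq(v, w) for v, w in zip(e[1:], f[1:]))

def diff(e, f):
    """The difference e - f: None, [e, f], or recursed into one child."""
    if eq(e, f):
        return None
    if e[0] != f[0]:
        return [e, f]
    unequal = [i for i, (v, w) in enumerate(zip(e[1:], f[1:])) if not eq(v, w)]
    if len(unequal) == 1:                # exactly one child differs: recurse
        i = unequal[0]
        return diff(e[1 + i], f[1 + i])
    return [e, f]                        # two or more children differ

zero = ["0"]
e = ["a", ["b", zero]]   # a(b(0))
f = ["a", zero]          # a(0)
print(diff(e, f))        # → [['b', ['0']], ['0']]: the localised disagreement
```

Note how `diff` walks past the shared outer `a` and returns only the innermost pair of subexpressions that actually disagree.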

These can be uniquely expressed as strings made up of ψ∪{'(', ',', ')'}.

Let's define V(n)=V(Ξn) to be the set of all possible expressions of type Ξn.

A function f:V(n1)×⋯×V(nt)→V(m) is called basic if it is constant, or it is a projection function (so f(e1,...et)=ek for fixed k≤t), or f can be expressed as

f(e1,...et)=s(v1,...vn) where s is constant and each vi is a basic function of e1,...et.

Note that you can pass as much extra junk into f as you like and ignore it. If f(e1,e2) is basic, so is g(e1,e3,e2,e4)=f(e1,e2).

Basic functions can be said to have type(f)=Ξm and arity(f)=[Ξn1,...Ξnt]

Basic functions can be defined in the style of f(α,β)=s1(c1,α,s2(α,β))

Finally, ρ={[f1,g1],[f2,g2],⋯}, where each fi and gi are basic functions with the same domains and codomains.

We write x(α)≡y(z(α)) to mean that the pair of basic functions f(α)=x(α) and g(α)=y(z(α)) are matched in ρ. Ie [f,g]∈ρ where x,y,z∈ψ

We express the concept that, for expressions e and f, e−f=[p,q] and there exist [f,g]∈ρ and expressions v1,...vn such that p=f(v1,...vn) and q=g(v1,...vn), by saying

e≡f.

We express that ∃ e1,e2,...,en such that ∀ i<n:ei≡ei+1 and e1=e and en=f as e∼f. (previous posts use ≡ for both concepts)

Let's create a new theory, called T1. This is a very simple theory: it has only 1 constant, 2 functions and 2 substitution rules, with only a single type.

a(b(α))≡α

b(a(α))≡α

With the only constant being 0. This theory can be considered a weak theory of the integers with only a +1 and a -1 function. It has a connected component of its graph for each integer. Propositions are in the same component if and only if they have the same count(a)−count(b). Theorems look like a(a(a(b(b(0)))))∼b(a(a(0))).

Now consider the theory T2 formed by the symbols f,g,h

f(f(f(g(g(α)))))≡α

f(α)≡h(g(α))

f(h(α))≡h(f(α))

g(h(α))≡h(g(α))

It turns out that this theory also has one connected component for each integer, but this time, propositions are in the same component if 2×count(f)−3×count(g)+5×count(h) is the same.
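Both invariant claims can be checked mechanically: every substitution rule of T1 preserves count(a)−count(b), and every rule of T2 preserves 2×count(f)−3×count(g)+5×count(h), because the two sides of each rule carry the same weight. A small sketch (the string encoding of unary-rule spines is my own shorthand):

```python
# Sanity check of the two invariants: each rewrite rule's left- and
# right-hand sides have equal weight, so rewriting can never change
# the invariant of a proposition. A rule s1(s2(...(α))) ≡ t1(...(α))
# is recorded as a pair of symbol strings.

def weight(symbols, w):
    return sum(w[s] for s in symbols)

t1_rules = [("ab", ""), ("ba", "")]            # a(b(α)) ≡ α, b(a(α)) ≡ α
t1_w = {"a": 1, "b": -1}

t2_rules = [("fffgg", ""),                     # f(f(f(g(g(α))))) ≡ α
            ("f", "hg"),                       # f(α) ≡ h(g(α))
            ("fh", "hf"),                      # f(h(α)) ≡ h(f(α))
            ("gh", "hg")]                      # g(h(α)) ≡ h(g(α))
t2_w = {"f": 2, "g": -3, "h": 5}

for rules, w in [(t1_rules, t1_w), (t2_rules, t2_w)]:
    for lhs, rhs in rules:
        assert weight(lhs, w) == weight(rhs, w)
print("every rule of T1 and T2 preserves its theory's invariant")
```

For example, f(α)≡h(g(α)) checks out because 2 = 5−3.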

When S=(ψS,ρS,ΞS,typeS,arityS) is a theory, and similarly for T, we can define the disjoint union S⊔T to be the theory (ψS⊔ψT,ρS⊔ρT,ΞS⊔ΞT,typeS⊔typeT,arityS⊔arityT).

Consider a function f:ΞS→ΞT and a function Q:ψS→basic functions in T, such that type(Q(si))=f(type(si)), and where arity(si)=[X1,...Xn] implies arity(Q(si))=[f(X1),...f(Xn)].

These arity conditions mean that for any expression e=s(v1,...vn) in S, we can define a corresponding expression in T.

VT is the set of expressions in T. Call Q∗:VS→VT a transjection if it meets the condition that Q∗(e)=Q(s)(Q∗(v1),...Q∗(vn)). For each Q meeting the above conditions, there exists exactly one transjection.

We can now define a relation.

We say S≲T iff there exists Q∗:VS→VT a transjection such that e∼f in S iff Q∗(e)∼Q∗(f) in T. Call such transjections projections.

Say S≈T iff S≲T and T≲S.

Theorem

S≲T and T≲U implies S≲U.

Proof

There exist projections Q:VS→VT and R:VT→VU.

The composition of basic maps is basic, by the recursive definition.

The composition of transjections is a transjection.

So R∘Q is a transjection.

For all e,f∈VS: e∼f⟺Q(e)∼Q(f)⟺R(Q(e))∼R(Q(f))

Hence R∘Q is a projection.

Lemma

S≲S

Let Q be the identity. The identity map is a projection.

Theorem

Suppose S and T are theories, with Q:VS→VT and R:VT→VS transjections.

Suppose that for all e=s(v1,...,vn)∈VS we know that R(Q(e))∼e.

And the same when we swap S with T. Call Q and R pseudoinverses.

If we also know that, for all pairs of basic functions [f(v1,...vn),g(v1,...vn)]∈ρS and for all v1,...vn, we have Q(f(v1,...))∼Q(g(v1,...)), say that Q is axiom preserving. Again, we know the same when S is swapped with T, i.e. that R is axiom preserving.

Note that all these claims are potentially provable with a finite chain a∼b∼c..., even when the expressions contain arbitrary constants v1,...vn.

All the stuff above implies that S≈T.

Proof

Consider e≡f∈VS. We know that e−f=[a,b], such that there exists [f,g]∈ρS and some v1,... with f(v1,...)=a and g(v1,...)=b (or vice versa).

This tells us that Q(a)∼Q(b)

If ∀i:Q(vi)∼Q(wi) then Q(s(v1,...vn))∼Q(s(w1,...wn)). True because Q(s) is a basic function, and they preserve similarity.

Repeat this to find that Q(e)∼Q(f).

If e∼f then e=e1, f=en and ei≡ei+1, so Q(ei)∼Q(ei+1) so Q(e)∼Q(f).

On the other hand, if Q(e)∼Q(f) then R(Q(e))∼R(Q(f)) because the same reasoning also applies to R. For any e′,f′∈VT we know e′∼f′⟹R(e′)∼R(f′).

We know Q(e),Q(f)∈VT. But Q and R are pseudoinverses, so e∼R(Q(e))∼R(Q(f))∼f, hence e∼f⟺Q(e)∼Q(f)

Therefore S≲T. Symmetrically, T≲S so S≈T

Remember our theories T1 and T2 from earlier? The ones about a,b and f,g,h?

We can now express the concept that they are basically the same. T1≈T2.

We can prove this by giving basic functions for each symbol, which generate transjections, and by showing that these are pseudoinverses and axiom preserving; then we know they are projections and T1≈T2.

Q:T1→T2

Q(0)=0 , Q(a(α))=f(f(g(Q(α)))) , Q(b(α))=f(g(Q(α))).

R:T2→T1

R(f(α))=a(a(R(α))) , R(g(α))=b(b(b(R(α)))) , R(h(α))=a(a(a(a(a(R(α)))))) , R(0)=0 .

For example, pick the symbol a. To show that Q and R are psudoinverses, we need to show that R(Q(a(α)))∼a(α). We know R(Q(a(α)))=R(f(f(g(Q(α)))))=a(a(a(a(b(b(b(R(Q(α)))))))))∼a(a(a(a(b(b(b(α)))))))∼a(α).

To prove these transjections to be pseudoinverses, do this with all symbols in ψT1⊔ψT2.

Finally we prove that Q is axiom preserving. We must show that Q(a(b(α)))∼Q(α) and that Q(b(a(α)))∼Q(α) . Q(a(b(α)))=f(f(g(f(g(Q(α))))))≡f(f(g(h(g(g(Q(α)))))))≡f(f(h(g(g(g(Q(α)))))))≡f(f(f(g(g(Q(α))))))≡Q(α)

Likewise Q(b(a(α)))∼Q(α).

R is also axiom preserving.

So T1≈T2.
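A quick mechanical check of why the connected components line up: each symbol's image under Q carries the same T2-weight as the symbol's T1-weight, and vice versa for R. (A sketch; the dictionary encoding of the symbol images is my own shorthand.)

```python
# Check that Q and R are consistent with the two counting invariants:
# the weight of each symbol equals the weight of its image, so the
# transjections map each integer component of T1 onto the matching
# component of T2, and back.

t1_w = {"a": 1, "b": -1}
t2_w = {"f": 2, "g": -3, "h": 5}

Q = {"a": "ffg", "b": "fg"}              # Q(a(α)) = f(f(g(Q(α)))), etc.
R = {"f": "aa", "g": "bbb", "h": "aaaaa"}  # R(f(α)) = a(a(R(α))), etc.

for s, image in Q.items():
    assert t1_w[s] == sum(t2_w[c] for c in image)
for s, image in R.items():
    assert t2_w[s] == sum(t1_w[c] for c in image)
print("Q and R both respect the integer invariants")
```

For instance, a maps to ffg with weight 2+2−3 = 1, matching a's weight of +1 in T1.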

Conclusion

We have formally defined a notion of a theory, and provided a criterion for telling when two theories are trivial distortions of each other. This will allow us to define notions of logical provability that aren't dependent on an arbitrary choice of formalism. By looking at all equivalent theories, weighted by simplicity, we can hope for a less arbitrary system of logical counterfactuals based on something thematically similar to proof length, although kind of more continuous, with graduations of partly true.


Somerville Housing Over Time

LessWrong.com News - September 5, 2019 - 17:40
Published on September 5, 2019 2:40 PM UTC

I wanted to understand how the number of housing units in Somerville had changed over our history, but wasn't able to find anything searching online. We can get an estimate, however, by looking at the "Year Built" field in the Assessor's Database. This shows the age of Somerville's current housing stock, to the best estimate of the Assessor.

Decade   Number Built
17xx     7
180x     36
182x     3
183x     1
184x     30
185x     65
186x     79
187x     143
188x     476
189x     1025
190x     4657
191x     1863
192x     2104
193x     401
194x     55
195x     75
196x     104
197x     181
198x     394
199x     72
200x     24
201x     51 (57 projected)

In estimating how much building was happening in various decades this will be biased towards the present: a building that was demolished and replaced will show up as being built recently, replacing the earlier older listing. On the other hand, newer projects are often larger, as former industrial sites are converted to housing. Still, it shows that current building levels in Somerville, while higher than in the 2000s, are still very low by historical standards.

(The most recent year I see in the data is 2018, so I've scaled the numbers for the 2010s decade up by 10/9 to reflect that we're missing 2019 data. I've also truncated the data to just the decade because my understanding is it's common for the database to use the first year as a stand in for "some time that decade", with 1910 to mean 1910s. Only 11,847 of the 12,145 parcel records contain a "Year Built" field so I'm ignoring the other 2%. I'm also ignoring 84 Perkins St, parcel 14057, which claims to have been built in 1518.)
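The scaling described in the note is just this arithmetic:

```python
# The 2010s count covers only 2010-2018 (nine years of a ten-year
# decade), so scale by 10/9 to project the full decade.

observed_2010s = 51          # units with "Year Built" in 2010-2018
projected = round(observed_2010s * 10 / 9)
print(projected)  # → 57, matching the "(57 projected)" figure in the table
```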

Discuss

Living the Berkeley idealism

LessWrong.com News - September 5, 2019 - 01:20
Published on September 4, 2019 10:20 PM UTC

Quick observation, more funny than insightful.

Today I was thinking about how to publish my thesis when it's finished, and rethinking again what format to put it in. The standard PDF format seems reasonable, but it is merely digital print, with no interactivity. Adding interactivity requires me not only to learn something (web hosting, JavaScript, etc.), but also to guess which technology is future-proof. Flash and Java applets are daily reminders of what not to bet on for future-proofness.

Book publishers enjoy relative longevity. Go to the library and open a book that hasn't been touched for three hundred years, and it still works. Books run on light energy, with almost no upkeep cost (if kept in a building with dry, ~300 K air). It would be a small cause for celebration to find something interactive online that is ten years old and still works as intended.

To live with this kind of uncertainty is to experience Berkeley's idealism. Berkeley thought that anything exists only because God is perceiving it, and that if God stops perceiving it, that thing disappears. Just so: as long as you are paying attention to something online, it keeps existing. As soon as you forget about it, it has a serious chance of decaying and ceasing to work.

From which we conclude that Berkeley's God would probably feel annoyed all the time that the world can't seem to run without His constant staring.

Discuss

LessWrong.com News - September 4, 2019 - 22:53
Published on September 4, 2019 7:53 PM UTC

This is the 3rd post of 5 containing the transcript of a podcast hosted by Eric Weinstein interviewing Peter Thiel. See here for the first post in the sequence.

Student Debt

Peter Thiel: It's like, again, if you come back to something as reductionist as the ever-escalating student debt, you know, the bigger the debt gets, you can sort of think: what is the $1.6 trillion, what does it pay for? And in a sense, it pays for $1.6 trillion worth of lies about how great the system is.

Peter Thiel: And so, the more the debt grows, the crazier the system gets, but also the more you have to tell the lies, and these things sort of go together. It's not a stable sequence. At some point this breaks. You know, again, I would bet on a decade, not a century.

Eric Weinstein: Well, this is the fascinating thing. You, of course, famously started the Thiel Fellowship as a program which, correct me if I'm wrong on this, 2005 is when student debt became non-dischargeable even in bankruptcy.

Peter Thiel: Yes. The Bush 43 bankruptcy revision. If you don't pay off your student loans by the time you're 65, the government will garnish your social security to pay off your student debt.

Eric Weinstein: Right. This is amazing that this exists in a modern society. And of course, well, so let me ask, am I right that you were attacking what was necessary to keep the college mythology going, and you were frightened that college might be enervating some of our sort of most dynamic minds?

Peter Thiel: Well, I think there are sort of a lot of different critiques one can have of the universities. I think the debt one is a very simple one. It's always dangerous to be burdened with too much debt. It sort of does limit your freedom of action. And it seems especially pernicious to do this super early in your career.

Peter Thiel: And so, if out of the gate you owe $100,000, and it's never clear you can get out of that hole, that's going to either demotivate you, or it's going to push you into maybe slightly higher-paying, very uncreative professions of the sort that are probably less good at moving our whole society forwards. And so I think the whole thing is extraordinarily pernicious.

Peter Thiel: I started talking about this back in 2010, and it was already, like, controversial, but it was not, you know... younger people all agreed with me.

Eric Weinstein: The younger people did?

Peter Thiel: And it's a decade later, it's a lot crazier, we haven't yet completely won, but I think there are sort of more and more people who agree with this. I think at this point the Gen X parents of college students tend to agree, whereas I would say the baby boomer parents, you know, 15 years ago, would not have agreed.

Peter Thiel: The 2008 crisis was a big watershed in this too, where you could say the tracking debt, you know, roughly made sense as long as everything, all the tracked careers worked, and 2008 really blew up, you know, consulting, banking, you know, sort of a number of the more track professions got blown up, and so that was kind of a watershed.

Eric Weinstein: I mean this is incredibly dangerous, but also, therefore, quite interesting, if you imagine that the baby boomers have, in some sense, in order to keep the structure of the university going, have loaded it up with administrators, have hiked the tuition much faster than even medical inflation, let alone general inflation, this becomes a crushing debt problem for people who are entering the system.

Eric Weinstein: I saw a recent article that said that the company that, I think it's called Seeking Arrangements, which introduces older men and women with money to younger men and women with a need for money for some sort of ambiguous hybridized dating, companionship, financial transfer. And the claim was that lots of students were using this supposed sugar daddy-ing and sugar mommy, I don't know what the terminology is, in order to alleviate their debt burden.

Eric Weinstein: It's almost as if the baby boomers, in so creating a system, are subjecting their own children to things that are pushing them towards a gray area a few clicks before you get to honest prostitution.

Peter Thiel: No, look, I don't want to impute too much intentionality to how this happened.

Eric Weinstein: No, no, no, it's somewhat emergent.

Peter Thiel: I think a lot of these, it was mostly emergent, mostly these things, people, you know, yeah, that we had sort of somewhat cancerous, we don't distinguish real growth from cancerous growth, and then once the cancer sort of metastasizes to a certain size, you know, you sort of somehow try to keep the whole thing going, and it doesn't make that much sense.

Peter Thiel: But yes, I think one of the reasons, one of the challenges in, on our side, let's be a little more self critical here, on this, is that the question we always are confronted with, well, what is the alternative? How do you actually do something?

Peter Thiel: And it's not obvious what the individual alternatives are. You know, on an individual level, if you get into an elite university, it probably still makes sense to go, you know, it probably doesn't make sense to go to number 100 or something like this.

Eric Weinstein: Yeah, I think that's right.

Peter Thiel: There is sort of a way it can still work individually even if it does not work for our country as a whole. And so, there are sort of all these challenges in coming up with alternate tracks.

Peter Thiel: I think in software there's some degree to which people are going to be hired if they're just good at coding, and it's not quite as critical that they have a computer science degree. You know, can one do this in other careers, other fields? I would tend to think one could. It's been slow to happen.

Political Solutions for Students

Eric Weinstein: Well, so you and I have been excited about a great number of things that have been taking place outside of the institutional system, but one of the things that I continue to be mystified by is that we are somewhat politically divided, where you are well known as a conservative and I really come from a fairly radical progressive streak. So, we have this common view of a lot of the problems, but sometimes we come to very different ideas about how those problems should be solved.

Eric Weinstein: Do you want to maybe just try riffing?

Peter Thiel: Sure.

Eric Weinstein: Like, assume that we somehow found ourselves in possession of some degree of power, with an ability to direct a little bit more than we have currently. What would you do to create the preconditions - so not necessarily picking particular projects - but what would you try to do to create the preconditions where people are really dreaming about futures, both at a technological level, family formation, making our civil society healthier. Where would you start to work first?

Peter Thiel: So, I'm always a little bit uncomfortable with this sort of question, because-

Eric Weinstein: You can turn it on me, too.

Peter Thiel: ... because I feel like, you know, we're not going to be dictators of the United States, and then, you know, there all sorts of things we could do if we were dictators. But certainly, I would look at the college debt thing very seriously. I would say that it's dischargeable in bankruptcy, and if people go bankrupt then part of the debt has to be paid for by the university that did it. There has to be some sort of local accountability. So, this would be-

Eric Weinstein: Love that.

Peter Thiel: ... that would be sort of a more right wing answer.

Peter Thiel: The left wing answer is we should socialize the debt in some ways, and the universities should never pay for it, which would be more the, you know, Sanders-Warren approach. But so, that would be one version.

Peter Thiel: I think one of the main ways inequality has manifested in our society in the last 20, 30 years - I think it's more stagnation than inequality - but just on the inequality side it's the runaway housing costs, and there's sort of, there's a baby boomer version where you have super strict zoning laws so that the house prices go up, and the house is your nest egg. It's not a place to live, it's your nest egg for retirement. And I would, yeah, I would try to figure out some ways to dial all that stuff back massively.

Peter Thiel: And that's probably intergenerational transfer, where it's bad for the asset prices of baby boomer homeowners, but better for younger people to get started in sort of family formation or starting households.

Eric Weinstein: What do you think about the idea of a CED, a college equivalency degree, where you can prove that you have a level of knowledge that would be equivalent, let's say, to a graduating Harvard chemistry major, right? Or a fraction thereof, where you have the ability to prove that through some sort of online delivery mechanism, you can-

Peter Thiel: Great idea. I love it.

Eric Weinstein: Yeah?

Peter Thiel: I think it's very hard to implement. Again, I think these things are hard to do, but great idea.

Peter Thiel: But look, we have all these people who have something like Stockholm syndrome, where they, you know, if you got a Harvard chemistry degree, and if you suspect that actually the knowledge could be had by a lot of people, and if it's just a set of tests you have to pass, that your degree would be a lot less special, you'll resist this very, very hard.

Peter Thiel: You know, if you're in an HR department, or in a company hiring people, you will want to hire people who went to a good college because you went to a good college, and if we broaden the hiring and said we're going to hire all sorts of people, maybe that's self-defeating for your own position. So, you know, I think one should not underestimate how many people have a form of Stockholm syndrome here.

Eric Weinstein: I should've said earlier that the Thiel Fellowship, for those who don't know, is a program that has historically, at least began paying very young people who had been admitted to colleges to drop out of those colleges. So, they got to keep the idea that they'd been admitted to some fairly prestigious place, but then they were given money to actually live their dreams and not put them on hold.

Peter Thiel: Yes, it has been an extremely successful and effective program. It's not scalable.

Eric Weinstein: Right.

Peter Thiel: So, we had to hack the prestige status thing, where it was as hard, or harder, to get a Thiel fellowship than to get into a top university. And so, that's part that's very hard to scale.

Eric Weinstein: When I was looking at that program for you, one of the things that I floated was the idea that if you look at every advanced degree, like a JD, or an MD, a PhD, none of them seem to carry the requirement of having a BA, which is quite mysterious.

Eric Weinstein: And if you fail to get a PhD, let's say, there's usually an embedded master's degree that you get as a going away present. And therefore, if you could get people to skip college, if you give them, perhaps, four years of their lives back, and you could use the first year of graduate school, which is very often kind of a rapid recapitulation of what undergraduate was, so everybody's on a level playing field, and then, worse comes to worst, people would leave with a master's. They would, in general, get a stipend, because a lot of the tuition is remitted to them in graduate programs. Is that a viable program to get some group of people who are highly motivated to avoid the BA entirely as sort of the administrator's degree rather than the professor's degree?

Peter Thiel: Let me see. There are all these different subtle critiques I can have, or disagreements, but yeah, I think the BA is not as valuable as it looks. I also think the PhD is not as valuable as it looks.

Eric Weinstein: Oh, you know how to hurt a guy.

Peter Thiel: So, I sort of feel it's a problem across the board. It strikes me that what you're proposing is a bit of an uphill struggle, because at the top universities the BA is the far more prestigious degree than the PhD at this point. So, if you're at Stanford or Harvard, you know, it's pretty hard to get into the undergraduate, and then you have more PhD students than you have undergraduates.

Peter Thiel: There are all these people who are on a very questionable track. They've made questionable choices. And they probably are going to have some sort of psychological breakdown in their future. You know, their dating prospects aren't good. There are all these things that are a little bit off.

Peter Thiel: So yeah, in theory, if you had a super tightly controlled PhD program, that might work, but you have to at least make those two changes. As it is, the people in graduate schools, like, it's like Tribbles in Star Trek. We have just so many, and they all feel expendable and unneeded, and that's not a good place to be.

Peter Thiel: And, whereas I think the undergraduate conceit is still that it's more K-selected instead of R-selected, that it's more that everybody is special and valuable. You know, that's often not true either.

Peter Thiel: So, I'd be critical of both, but yeah, if we could have a real PhD that was, you know, much harder, and that actually led to sort of an academic position or some other comparable position, that would be good.

Teleology

Peter Thiel: You know, one of the questions I always come back to in this, is what is the teleology of these programs? Where do they go? One of the analogies I've come up with, is I think elite undergraduate education is like junior high school football.

Eric Weinstein: Junior high school football. I did not see that coming.

Peter Thiel: Playing football in junior high school is probably not damaging for you, but it's not going anywhere-

Eric Weinstein: Ah, I see.

Peter Thiel: -because if you keep playing football in high school, and college, and then professionally, that's just bad. And the better you are, the more successful you are, the less well it works.

Peter Thiel: And then the question is what's the motivational structure? And when I was an undergraduate in the 1980s there was still a part of it where you thought the professors were cool, it might be something you'd like to be at some point in the future, and they were role models, just like in junior high school football an NFL player would have been a role model.

Eric Weinstein: But now it just looks like brain damage on both sides.

Peter Thiel: And now we think it's, yeah, you're just doing lots of brain damage, and it's a track that doesn't work, and therefore the teleology sort of has broken down.

Peter Thiel: So undergraduate, part of the teleology was that it was preparing you for graduate school, and that part doesn't work, and that's what's gotten deranged. Then graduate school, well, it's preparing you to be a postdoc, and then, well, that's the postdoc apocalypse, or whatever you want to call it, postdocalypse.

Eric Weinstein: Postdocalypse?

Peter Thiel: Postdocalypse.

Eric Weinstein: You heard it here, folks, postdocalypse.

Peter Thiel: But just at every step, I think, the teleology of the system is in really bad shape. Of course, this is true of all these institutions with fake growth that are sociopathic or pathological, but at the universities it's striking as very bad.

Peter Thiel: And I think this was already true in important ways back in the '80s, early '90s, when I was going through the system. And when I think back on it, I think I was most intensely motivated academically in high school, because the teleology was really clear. You were trying to get into a good college. And then, by the time I was at Stanford, it was a little bit less clear, by the time I was at law school, really unclear where that was going. And by the time I was 25 I was far less motivated than at age 18, and I think these dynamics are just more extreme than ever today.

Eric Weinstein: What I find so dispiriting about your diagnosis is first of all that I agree with it. Second of all, if we don't train people in these fields, if we don't get people to go into molecular biology, or bioinformatics, or something like that, we're never going to be able to find the low hanging fruit in that orchard. So, it seems to me that we have to find some way that it makes sense for a life to explore these questions.

Conformity and Malthus

Eric Weinstein: One of the things that I don't understand, and I don't know if you have any insight, is it feels to me that almost all of our institutions are carbon copies of each other at different levels of quality, and that there are only a tiny number of actually innovative institutions. It used to be that, you know, Reed College was sex, drugs, and Goethe, and you had St. John's with the great books curriculum that didn't look like anything else, or Deep Springs, and the University of Chicago was crazy about young people, but the diversity of institutions is unbelievably low. Is that wrong?

Peter Thiel: I think that's fair, but I would say the bigger problem with a lot of these fields is, yeah, I think we have to keep training people. I think we need to keep training people in physics or even these fields that seem completely dead, you know?

Eric Weinstein: That’s super important.

Peter Thiel: But I think the question we have to always ask is how many people should we be training-

Eric Weinstein: Way fewer.

Peter Thiel: -and my intuition is you want the gates to be very tight.

Peter Thiel: One of my friends is a professor in the Stanford economics department, and the way he describes it to me is they have about 30 graduate students starting PhDs in economics at Stanford every year. It's six to eight years to get a PhD. At the end of the first year, the faculty has an implicit ranking of the students, where they've sort of agreed who the top three or four are. The ranking never changes. The top three or four are able to get a good position in academia, the others not so much.

Peter Thiel: And, you know, we're pretending to be kind to people and we're actually being cruel.

Eric Weinstein: Incredibly cruel.

Peter Thiel: And so, I think that if there are going to be - you know, it's a supply and demand of labor - if there are going to be good positions in academia, where you can have a reasonable life, it's not a monastic vow of poverty that you're taking to be an academic, if we're going to have that, you don't want this sort of Malthusian struggle. If you have 10 graduate students in a chemistry lab, and you have to have a fistfight for a Bunsen burner or a beaker, and, you know, if somebody says one politically incorrect thing, you can happily throw them all out of the overcrowded bus. The bus is still overcrowded with nine people on it. That's what's unhealthy.

Peter Thiel: And so, yes, it would be a mistake to say we should dial this down and have zero people in these fields.

Eric Weinstein: Right. But this is what's scary to me.

Peter Thiel: That's not what I'm advocating, or what was being advocated here, but there is a point where if you just add more and more people in a starvation Malthusian context, that's not healthy.

Power Laws

Eric Weinstein: Well, this gets to another topic which, I think, is really important, and it's a dangerous one to discuss, which is it seems to me that power laws, those distributions with very thick tails where you have a small number of outliers that often dominate all other activity, are ubiquitous, and that particularly with respect to talent, whether we like them or not, they seem to be present, where a small number of people do a fantastic amount of all of the innovation.

Eric Weinstein: What do we do, if power laws are common, to make people more comfortable with the fact that there is a kind of endowment inequality that seems to be part of species makeup? I mean, I don't even think it's just limited to humans.

Peter Thiel: Well, I'm not convinced these sort of power laws are equally true in all fields of activity. You know, the United States was a frontier country in the 19th century, and most people were farmers, and presumably some people were better farmers than others, but everyone started with 140 acres of land, and there was this wide open frontier. Even if you had some parts of the society that had more of a power law dynamic, there was a large part that didn't. And that was what, I think, gave it a certain amount of health.

Peter Thiel: And yeah, the challenge is if we've geared our society saying that all that matters is education, and PhDs, and academic research, and that this has this crazy power law dynamic, then you're just going to have a society in which there are lots of people playing video games in basements or something like that.

Peter Thiel: So, that's the way I would frame it. But yeah, I think there definitely are some areas where this is the case. And then we just need, you know, we need more growth for the whole society. If you have growth, you'll have a rising tide that lifts all boats. So it's the stagnation that is the problem.

Eric Weinstein: Well, I've joked about this as we are not even communistic in our progressivism, because the old formulation of communism was from each according to his abilities, to each according to his needs, and the inability to recognize different levels of ability. I mean, almost every mathematician or physicist who encountered John von Neumann just said, "The guy is smarter than I am." He's not necessarily the deepest, or he did all of the great work, but you know when you're dealing with somebody who's able to employ skills that you simply don't have. I mean, I know I'm not a concert pianist, and-

Peter Thiel: Right. Look, I don't know how you solve the social problem if everybody has to be a mathematician or a concert pianist. I want a society in which we have great mathematicians and great concert pianists. That seems that that would be a very healthy society. It's very unhealthy if every parent thinks their child has to be a mathematician or a concert pianist, and that's the kind of society we unfortunately have.

Next post on Friday will be Political Violence and Distraction Theories.

Discuss

Contest: $1,000 for good questions to ask to an Oracle AI

LessWrong.com News - September 4, 2019 - 21:48
Published on September 4, 2019 6:48 PM UTC

Edit: the contest is closed now; I will start assessing the entries.

The contest

I'm offering $1,000 for good questions to ask of AI Oracles. Good questions are those that are safe and useful: ones that allow us to get information out of the Oracle without increasing risk.

To enter, put your suggestion in the comments below. The contest ends at the end[1] of the 31st of August, 2019.

Oracles

A perennial suggestion for a safe AI design is the Oracle AI: an AI confined to a sandbox of some sort, that interacts with the world only by answering questions.

This is, of course, not safe in general; an Oracle AI can influence the world through the contents of its answers, allowing it to potentially escape the sandbox.

Two of the safest designs seem to be the counterfactual Oracle, and the low bandwidth Oracle. These are detailed here, here, and here, but in short:

• A counterfactual Oracle is one whose objective function (or reward, or loss function) is only non-trivial in worlds where its answer is not seen by humans. Hence it has no motivation to manipulate humans through its answer.
• A low bandwidth Oracle is one that must select its answers from a relatively small list. Though its answer is a self-confirming prediction, the negative effects and potential for manipulation are restricted, because there are only a few possible answers available.

Note that both of these Oracles are designed to be episodic (they are run for single episodes, get their rewards by the end of that episode, aren't asked further questions before the episode ends, and are only motivated to perform best on that one episode), to avoid incentives for longer-term manipulation.
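As a toy sketch of the counterfactual Oracle idea above: the Oracle's answer is sometimes "erased" (never shown to humans), and its objective is non-trivial only in those erased episodes, so it has no incentive to manipulate readers. This is my own illustration, assuming a numeric prediction task; the names, the erasure probability, and the squared-error loss are all illustrative assumptions, not part of the designs linked in the post:

```python
import random

def counterfactual_episode(oracle_predict, world_outcome, erase_prob=0.1):
    """One episode of a counterfactual-Oracle-style setup.

    With probability erase_prob the answer is withheld from humans,
    the world evolves without it, and the Oracle is scored against
    that unseen outcome. When humans do see the answer, the loss is
    identically zero, so the answer's effect on readers carries no
    training signal."""
    prediction = oracle_predict()
    erased = random.random() < erase_prob
    if erased:
        outcome = world_outcome(seen=False)   # world without the answer
        loss = (prediction - outcome) ** 2    # only source of signal
    else:
        loss = 0.0                            # answer shown; no signal
    return prediction, erased, loss
```

The key property is that gradient signal only ever comes from worlds where no human read the answer, which is the sense in which the Oracle has "no motivation to manipulate humans through its answer".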

The counterfactual and low bandwidth Oracles are safer than unrestricted Oracles, but this safety comes at a price. The price is that we can no longer "ask" the Oracle any question we feel like, and we certainly can't have long discussions to clarify terms and so on. For the counterfactual Oracle, the answer might not even mean anything real to us - it's about another world, that we don't inhabit.

Despite this, it's possible to get a surprising amount of good work out of these designs. To give one example, suppose we want to fund one of a million projects on AI safety, but are unsure which one would perform best. We can't directly ask either Oracle, but there are indirect ways of getting advice:

• We could ask the low bandwidth Oracle which team A we should fund; we then choose a team B at random, and reward the Oracle if, at the end of a year, we judge A to have performed better than B.
• The counterfactual Oracle can answer a similar question, indirectly. We commit that, if we don't see its answer, we will select team A and team B at random and fund them for a year, and compare their performance at the end of the year. We then ask which team A[2] it expects to most consistently outperform any team B.

Both these answers get around some of the restrictions by deferring to the judgement of our future or counterfactual selves, averaged across many randomised universes.
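The low bandwidth comparison scheme above can be sketched as a single episode. This is a minimal sketch under my own assumptions: the "Oracle" is a stand-in for whatever model is being trained, `teams` is the small answer list, and `judge` stands for our end-of-year human judgement; none of these names come from the post:

```python
import random

def run_low_bandwidth_episode(oracle_pick, teams, judge):
    """One episode of the team-comparison scheme: the Oracle picks
    team A from a small fixed list, team B is drawn at random from
    the rest, and the Oracle is rewarded iff A is later judged to
    have outperformed B."""
    team_a = oracle_pick(teams)  # answer restricted to the small list
    team_b = random.choice([t for t in teams if t != team_a])
    # Reward is only computed at the end of the episode (the year).
    reward = 1 if judge(team_a, team_b) else 0
    return team_a, team_b, reward
```

Because the answer space is just the list of teams, and the reward is settled by our own future judgement within one episode, the Oracle's room for manipulation stays limited in the way the post describes.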

But can we do better? Can we do more?

This is the purpose of this contest: for you to propose ways of using either Oracle design to get the most safe-but-useful work.

So I'm offering $1,000 for interesting new questions we can ask of these Oracles. Of this:

• $350 for the best question to ask a counterfactual Oracle.
• $350 for the best question to ask a low bandwidth Oracle.
• $300 to be distributed as I see fit among the non-winning entries; I'll be mainly looking for innovative and interesting ideas that don't quite work.

Exceptional rewards go to those who open up a whole new category of useful questions.

Questions and criteria

Put your suggested questions in the comments below. Because of the illusion of transparency, it is better to explain more rather than less (within reason).

Comments that are submissions must be in their own separate comment threads, must start with "Submission", and must specify which Oracle design you are submitting for. You may submit as many as you want; I will still delete them if I judge them to be spam. Anyone can comment on any submission. I may choose to ask for clarifications on your design; you may also choose to edit the submission to add clarifications (label these as edits).

It may be useful for you to include details of the physical setup, what the Oracle is trying to maximise/minimise/predict, and what the counterfactual behaviour of the humans using the Oracle is assumed to be (in the counterfactual Oracle setup). Explanations as to how your design is safe or useful could be helpful, unless it's obvious. Some short examples can be found here.

EDIT after seeing some of the answers: decide on the length of each episode, and how the outcome is calculated. The Oracle is run only once per episode (and other Oracles can't generally be used on the same problem; if you want to run multiple Oracles, you have to justify why this would work), and it has to get its objective/loss/reward by the end of that episode, which therefore has to be estimated in some way at that point.

1. A note on timezones: as long as it's still the 31 of August, anywhere in the world, your submission will be counted. ↩︎

2. These kind of conditional questions can be answered by a counterfactual Oracle, see the paper here for more details. ↩︎

Discuss

Kansas City SSC meetup

LessWrong.com News - September 4, 2019 - 18:27
Published on September 4, 2019 3:27 PM UTC

We meet every Sunday, but this meeting in particular will be special inasmuch as it is the formally-announced meetup on SSC.

We will welcome any newcomers and encourage them to share their SSC origin story, and what they are looking for in a group that meets regularly.

Discuss

The Power to Judge Startup Ideas

LessWrong.com News - September 4, 2019 - 18:07
Published on September 4, 2019 3:07 PM UTC

This is Part III of the Specificity Sequence

When Steve claims that Uber exploits its drivers, he's role-playing the surface behaviors of an opinionated intellectual, but doesn't bother to actually be an opinionated intellectual, which would require him to nail down a coherent opinion.

It turns out that a lot of startups are founded by people doing something analogous to Steve: they role-play the surface behaviors of running a company and building a product, but don't bother to nail down a coherent picture of what customers would ever come to their business for.

Startup Steves

Paul Graham, cofounder of Y Combinator, calls this failure mode one of The 18 Mistakes that Kill Startups:

Having No Specific User In Mind
A surprising number of founders seem willing to assume that someone, they’re not sure exactly who, will want what they’re building. Do the founders want it? No, they’re not the target market. Who is? Teenagers. People interested in local events (that one is a perennial tarpit). Or “business” users. What business users? Gas stations? Movie studios? Defense contractors?

I'm in the startup industry, and I watch a lot of startups committing suicide by not being specific enough about who their customer is. From my perspective, the failure to be specific isn’t just a top-18 mistake, it’s the #1 mistake that founders make.

If you watch Paul Graham Office Hours at Startup School 2011, you can see for yourself that most of the founders on stage don’t seem to have a specific idea of who they’re building their product for and what difference it makes in their lives. Eliezer observes:

There was an exchange in Paul Graham [and Harj Taggar]'s office hours that went like this, while interviewing a startup that did metrics — analyzing pageviews, roughly — and the entrepreneur was having great trouble describing what they did that Mixpanel didn't. It went on for a while. It was painful to watch.

Paul: I don't get what the difference is. I still don't get what the difference is. What's the difference between you and Mixpanel?

Entrepreneur: The difference is — when you have to supplement — they're a view company and we're a platform. That's what it comes down to. They're like a view, a reporting company. If you need something they don't have, a feature -

Harj: So what's an example of somewhere you'd use your thing over Mixpanel? Can you give a use-case?

Entrepreneur: Yeah, I mean, we had revenue on day zero. There's a good reason for um… it's a start up, it's a series A company in the daily deals space. One we've signed a social game company to -

Harj: And why do they prefer your thing?

Paul: That wasn't what Harj was asking.

The problem (from the perspective of our present discussion) is that the Entrepreneur did not understand that Paul and Harj were repeatedly asking him to move downward on the ladder of abstraction. When the Entrepreneur said "We had revenue on day zero", he was trying to offer confirmation of the abstract statement "We can do things Mixpanel can't", but Paul and Harj still had no idea what his startup actually did.

How many early-stage startups have no specific user in mind? I’d guess about 80% of them. And how bad is not having a specific user in mind? So bad that I don't think they should even be considered real startups, in the same way that Steve's argument about Uber wasn't a real argument.

Every Startup's Demolishable Claim

Every startup founder makes the same claim to themselves and to the investors they pitch for funding: "We're going to make a lot of money." So what I do, naturally, is ask the founder to furnish a specific example of that claim: a hypothetical story about a single person who might be convinced to pay them a few bucks. And here's how the conversation usually goes:

Founder: We're going to make billions of dollars and have millions of users!

Liron: Ok, what's a hypothetical example of how you give one specific user some value?

Founder: [Nothing]

Maybe they don't literally say nothing, but they say something that doesn't count for one of these reasons:

• They answer in the abstract instead of giving the example I requested of how they might give value to a specific user
• They choose a specific example wherein their startup's product or service isn't any better for their hypothetical user than the user's available alternatives

At this point, I understand if you think I'm just knocking down a straw man, so here's a real example.

Golden is a 2-year-old startup with $5M in funding from Andreessen Horowitz, Founders Fund, and other notable investors. Their product is intended to be a superior alternative to Wikipedia. Here's an excerpt from the conversation I had with Golden's founder, Jude Gomila, on Twitter:

Liron: What specific use case exists on Golden today which is better than could have been achieved if the same amount of writer-effort had been spent on a pre-existing platform?

Jude: Quick tldr on this, some points covered in the blog post, however, 1. 1000x the topic space as a mission 2. removal of notability req 3. Using AI to automate flows 4. Using AI to compile knowledge 5. Better fact validation/hi res cites 6. Better schema eg timeline 7. Features and functions eg faving, activity feed, parallel rabbit hole 8. query results like these as well plus many many more. Have you tested the editor: magic cells, citations product and AI suggestions?

As for the specific examples Jude provided in response to my question... well, I'll just give you the first two and you can judge them for yourself:

Golden has received more funding than startups normally get before having any market traction to show, and the company's high profile makes it a juicy example to illustrate my point here. But there are countless other companies I could have singled out instead. Remember, the majority of early-stage startups are operating in this same failure mode. There are enough examples of startups visibly failing this way that I've started a blog to collect them.

The Value Prop Story Test

When I chat with a founder about their new startup, or I look through the slide deck that they're using to pitch their idea to investors, the first thing I do is try to pull out what I call a Value Prop Story: one specific story wherein their startup gives somebody some value. A well-formed Value Prop Story must fit into this template:

1. Describe a specific person with a specific problem
2. Describe their current best effort to solve their problem
3. Describe why it’s still a problem
4. Describe how their life gets better thanks to you

I've previously observed that telling a well-formed Value Prop Story doesn’t require you to show any market research or empirical evidence validating the quality of your idea. This is like how Steve didn't yet need to give us any empirical or theoretical justifications for his claim about Uber's driver exploitation; he just needed to tell us a story about one hypothetical specific driver getting exploited in a specific way.

Who is a specific hypothetical person who will use your product, and in which specific scenario will they use it? That's it, that's the question most startups can’t answer.

Answering this question seems objectively easy to me, in the sense that a well-designed AI wouldn't stumble over it at all. What about for a brain though, is it a tough mental operation? Actually, I think you'll find that this is an easy mental operation if you actually have a good startup idea. Here's a Value Prop Story I wrote about my own startup without much trouble:

1. Describe a specific person with a specific problem
A 23 year old male who can’t get a date
2. Describe their current best effort to solve their problem
He gets a Tinder account and does his best to use it on his own
3. Describe why it’s still a problem
His matches barely respond to his messages, and when they do, the conversation feels boring and forced. He uses it for 1 hour every day but only manages to get 1 date every 2 months.
4. Describe how their life gets better thanks to you
Once Relationship Hero coaches guide him through writing his texts, he suddenly has much better conversations that result in a date each week

Since my startup actually has a broad range of use cases (clients come to us for help with a broad range of relationship issues), this Value Prop Story isn't particularly representative of what we do.
Its job was merely to prove that there are more than zero plausible specific use cases for Relationship Hero, and it gets that job done.

Given how easy this exercise is - we're talking five minutes, tops - I find it mind-boggling that 80% of startups recklessly skip it and go straight to, um… whatever else they think startups are supposed to do. Paul Graham writes:

Another of the characteristic mistakes of young founders is to go through the motions of starting a startup. They make up some plausible-sounding idea, raise money at a good valuation, rent a cool office, hire a bunch of people. From the outside that seems like what startups do. But the next step after rent a cool office and hire a bunch of people is: gradually realize how completely fucked they are, because while imitating all the outward forms of a startup they have neglected the one thing that's actually essential: making something people want.

Why would you spend time and money building a product when you can’t yet tell a specific Value Prop Story? I think it's because designing and building a product is fun and gives you a false sense of control. You can lie to yourself the whole time about the likelihood that you’ll eventually get people to use what you’re building. But people usually won’t use what you’re building. Whenever a new startup excitedly launches their product for the first time, the most likely outcome is that they get literally zero users.

The Secret famously claimed that wishing for something makes the universe give it to you, which is BS, but the converse is true: if you haven’t made a specific enough wish about what your initial market traction is supposed to look like, then the universe won’t give you any traction.

The Extra-Powerful Sanity Check

Is it healthy for us to be obsessed with judging startups and demolishing claims about their value propositions?
When we say that a startup idea is bad on account of lacking a Value Prop Story, is it right and proper to feel pleased with ourselves, or are we being gratuitously adversarial? Along these lines, Mixpanel cofounder Suhail Doshi has tweeted:

I get little satisfaction stomping on someone's startup idea. It's so easy to. Somewhere deep, hidden in their abstract description is a distinct yet narrow problem worth solving that's significant. It's more fun to attempt finding it, together.

I basically agree with this, and I basically agree with the commenters on my demolish bad arguments post who emphasized that we should seek to shine a light on whatever kernels of truth our conversation partner may have brought to the table. But...

Have you ever sanity checked something? A sanity check is like when you punch 583 x 772 into your calculator, and you quickly multiply the two rightmost digits in your head, 3 x 2 = 6, and then confirm that the calculator's output ends in a 6. If you ever accidentally punch the wrong sequence of keys into the calculator, then you'll be pretty likely to see the calculator's answer end in something other than a 6. It's a good use of two seconds of your time to calculate 3 x 2; you get a substantial dose of Bayesian evidence for your trouble.

The Value Prop Story test is likewise a sanity check for startup ideas. In theory, of course a startup founder who is already hiring a team of engineers and building a software product should be able to describe how one specific user will get value from that product. In practice, they often can't. And it's easy for us to quickly check.

Here's what's crazy though: we usually expect sanity checks to have a low rate of detecting failures. You expect to successfully multiply numbers on your calculator most of the time, but you do the 3 x 2 sanity check anyway because it's quick. But with the Value Prop Story test, you'll see a high rate of failures!
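The calculator check above is mechanical enough to write down. Here's a minimal sketch in Python (the function name is just illustrative):

```python
def last_digit_check(a, b, claimed_product):
    # The last digit of a*b is fully determined by the last digits of a and b,
    # so a mismatch proves the claimed product is wrong. A match is only weak
    # (Bayesian) evidence that it's right -- which is all a sanity check promises.
    return claimed_product % 10 == (a % 10) * (b % 10) % 10

print(last_digit_check(583, 772, 450076))  # True: 3 x 2 = 6, and 450076 ends in 6
print(last_digit_check(583, 772, 450072))  # False: a mistyped result is caught
```

Note the asymmetry: a failed check is conclusive, while a passed check merely fails to raise an alarm, exactly as with the Value Prop Story test.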
A sanity check with a high failure rate is a rare treat; it's an extra-powerful sanity check. When you're lucky enough to have an extra-powerful sanity check in your toolbox, don't make it a final step in your process, make it the first step in the process.

So here's how you can use the Value Prop Story test to upend the traditional order of operations for building a startup: first, repeatedly sanity check yourself with the value prop test until you pass it. Second, do everything else.

In all seriousness, I recommend that early-stage startup founders follow this flow chart:

But how should we treat founders who are stuck in the flowchart's "Give Value to One Person" stage? When someone is struggling to pass a sanity check, that doesn't mean we should write off their potential to succeed. It means we should focus our effort on helping them pass the sanity check.

Applying the Value Prop Story test is like placing a low bar in a founder's path. Yes, the bar will trip the ones who aren't seeing it. But for the ones who do see the bar, they can step up onto it and then be on their way. And the next step in their path, such as building a quality product, or building a sales funnel, is sure to be a steeper one than that little first one.

Next post: The Power to Make Scientific Breakthroughs (coming this weekend)

Discuss

SSC Paris meetup

Новости LessWrong.com - 4 сентября, 2019 - 13:42

Published on September 4, 2019 9:33 AM UTC

There will be a SSC meetup in Paris on Sunday September 15th, starting at 3pm. The meetup will take place in the Jardins du Trocadéro, the exact location is 48°51’37.9″N 2°17’11.8″E. For more information, you can join our Discord at https://discord.gg/SDDETuZ .

Discuss

Caching on Success

Новости LessWrong.com - 4 сентября, 2019 - 09:34

Published on September 4, 2019 6:34 AM UTC

What I previously took from Cached Thoughts was primarily us failing to examine memes before trotting them out on cue.
Some snippet of information A gets lodged *from outside* as the response to some question Q. Whenever Q appears we consult the lookup table and out pops A.

More recently I was reflecting on a pattern I have noticed in my work as a statistical programmer. I realised that I was following this pattern on a regular basis, in particular by caching action sequences that worked. It goes as follows. My code doesn't work in situation Q, so try X. It still doesn't work, so try Y. It still doesn't work, so try Z. It works. So, when presented with failure Q, do XYZ.

Clearly, this is not proper science. However, it is fairly effective. When presented with that particular kind of failure again, XYZ will make things work. However, change the problem slightly and XYZ may well fail, because we don't fully understand why it succeeded in the first instance.

Why not do proper science all the time then? I'd love to, in theory. However, if we look at just those three available actions: possibly the order matters, in which case there are 5 other orderings to explore. Possibly we could select some subsets to explore, and there are four we didn't examine at all yet: {Y}, {Z}, {X,Z}, {Y,Z}. The length-two subsets each have two orderings, so there are something like 12 remaining action sequences to try out.

Of course, with intuition, many of these action sequences could be dismissed. But the point is that it's pretty time consuming and explodes with the action menu. The cost of testing is not just cognitive. E.g. suppose that testing one action sequence burns $1000 of resources, or training a model takes 24 hours. Or e.g. you just need this thing to work right now and have three ideas: do you faff about looking to understand the optimal solution, or do you just throw all three at it and fix the problem?
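The counting above can be checked directly. A quick sketch (the action names are placeholders):

```python
from itertools import permutations

actions = ["X", "Y", "Z"]

# Every ordered, repetition-free action sequence over {X, Y, Z}:
# 3 singletons + 6 ordered pairs + 6 orderings of the full set = 15.
all_sequences = [
    seq
    for r in range(1, len(actions) + 1)
    for seq in permutations(actions, r)
]

# The debugging session described above effectively tested three of them:
# X alone, then X followed by Y, then X followed by Y followed by Z.
tried = [("X",), ("X", "Y"), ("X", "Y", "Z")]

remaining = [seq for seq in all_sequences if seq not in tried]
print(len(all_sequences), len(remaining))  # 15 12
```

So "something like 12 remaining action sequences" is exact under this counting, and the total grows explosively: with n candidate actions there are sum over r of n!/(n-r)! sequences to consider.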

So, I see why my brain often wants to stick with the tried and true method. The risky element is that it then wants to confabulate some reason why XYZ in particular worked. E.g. suppose X was ineffective and only YZ mattered; I'll find I still want to tell a story about why X was important. Consciously I'm now trying to avoid this by saying something like, "I suspected XYZ would work and it did, but I'm not sure how to credit that to the components."

What would a good heuristic for finding the minimal solution look like? I'm talking about general principles that could help pare back to minimal solutions independently of the underlying subject matter. My first thought would be to try substrings of the existing solution starting from the end (Z, YZ, Y).
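One way to make such a heuristic concrete is a greedy one-pass minimisation, in the spirit of delta debugging: try dropping each action in turn and keep any drop after which the sequence still works. The `works` predicate here is a hypothetical stand-in for re-running the failing pipeline with a candidate sequence:

```python
def minimise(solution, works):
    # Greedily drop actions whose removal still leaves a working sequence.
    # `works(candidate)` is assumed to re-test the problem with that candidate;
    # each call may be expensive, which is exactly the cost discussed above.
    seq = list(solution)
    i = 0
    while i < len(seq):
        candidate = seq[:i] + seq[i + 1:]
        if works(candidate):
            seq = candidate  # the dropped action was unnecessary
        else:
            i += 1
    return seq

# Toy example: pretend only Y and Z actually mattered.
works = lambda s: "Y" in s and "Z" in s
print(minimise(["X", "Y", "Z"], works))  # ['Y', 'Z']
```

This finds a locally minimal subset in a number of tests linear in the sequence length, though it assumes the fix isn't order-sensitive among the surviving actions; a more careful pass would also permute them.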

The other issue is to make the action set legible. It's easy to make changes in the world without clearly recording what those changes are or bucketing them into discrete containers. We may have an idea for an action W and implement it, but afterwards realise it could be decomposed into Y and Z. But if we don't reflect on the decomposition, in future problem Q may well result in cache hit W. I'd certainly be interested in patterns people have for forcing these decompositions more regularly.

In summary: caching may save a lot more effort than I had previously given it credit for, while also routinely muddying my thinking much more often than I had thought was the case, even in highly goal-directed behaviour.

Discuss

Street Epistemology. Practice Session

Kocherga events - September 3, 2019 - 19:30
Street epistemology is a particular way of conducting dialogues. It makes it possible to examine any belief, even on the most explosive topics, without sliding into argument, while letting the participants improve their methods of acquiring knowledge.

Counterfactuals are an Answer, Not a Question

Новости LessWrong.com - 3 сентября, 2019 - 18:36
Published on September 3, 2019 3:36 PM UTC

I'm going to subtly contradict my last post on logical counterfactuals. I now think that raw or actually-consistent counterfactuals are just an especially useful model of what counterfactuals are as opposed to the be all and end all.

An Old Classic

You may have heard this one before: The Ship of Theseus is a very old ship, so old that all the original parts had been replaced. Is it still the Ship of Theseus? What if we gathered all of the original parts and used them to rebuild the ship? Would this new construct be the Ship of Theseus now?

If you've internalised the idea of the map and the territory, the answer should be clear. Terms like "The Ship of Theseus" are human constructions with no real existence. They can be defined in whatever manner is most useful.

Counterfactuals Don't Exist in the Universe

So let's start with regular counterfactuals. What are they? Do they exist in the universe itself? Unless you're a modal realist (ie. David Lewis and his clones) the answer is no. Given the exact state of the universe and an agent, the agent can only make one decision*. In other words, counterfactuals are something we construct.

This means that it is a mistake to search for an objectively true definition. Just as in the Ship of Theseus, our expectation should be that multiple definitions could have value. Of course, some definitions may be more useful or natural than others and indeed, as I've argued, Raw Counterfactuals are particularly natural. However, I wouldn't go as far as I went in that post where I argued that insofar any other kind of counterfactual had meaning, it was derived from Raw Counterfactuals.

Logical counterfactuals are no different. We may not know everything about logic, but this doesn't mean that logic could have been different. We can construct a model where we imagine that logic being different than it is, but there isn't a fact of the matter about how logic would be if 1+1=3 instead of 2 that exists as part of standard logic without any extensions, any more than counterfactuals exist in-universe without any extensions.

Since counterfactuals don't have an objectively true definition, asking, "What are counterfactuals?" is a confused question. It'd be better to ask, "What definitions of counterfactuals are useful?", but then this is still kind of vague. A better answer is to figure out the kinds of questions that we tend to want to answer when counterfactuals arise. This will vary, but in general counterfactuals relate to problems that can be solved via iteration.

An Initial Question

Raw counterfactuals are especially important because they answer a particularly common kind of question related to partial knowledge. One common simplification is to assume that we can enumerate all the states of the universe that would match our given knowledge. Since states are only valid if they are consistent, the states we enumerate will be raw counterfactuals. Indeed, as I previously argued, most uses of CDT-style counterfactuals are justified in terms of how they approximate raw counterfactuals. Pretending that an agent magically decides to turn left instead of right at the last moment isn't that different from assuming that the universe was slightly different so that the agent was always going to turn left, but that this change wasn't ever going to affect anything else.

When the Story Breaks

So why doesn't the story end here? Why might we desire other notions of counterfactual? I won't claim that all of the following situations would lead to an alternative notion of counterfactual, but I wouldn't want to rule it out.

Firstly, the assumption in the previous question is that we have already abstracted out a world model; that is, for any observation, we have an idea of what worlds are possible and other relevant facts like probabilities. While this describes ideal Bayesian agents, it doesn't really describe more realistic agents who choose their model based at least somewhat upon their observations.

Secondly, we might sometimes want to consider the possibility that our logical assumptions or deductions might be incorrect. This is possible, but tricky, since if our logic is incorrect, we'll be using our flawed logic to determine how to handle these issues.

Thirdly, it's not always immediately obvious whether a particular state is consistent or not. For example, if you know that you're a utility maximiser, then it would be inconsistent for you to pick any option that provides suboptimal utility, but if you knew the utility was suboptimal you wouldn't need to examine that option. So it would often be useful to be able to analyse these inconsistent situations. Then you could just iterate over a bunch of possibilities without worrying about consistency in advance.

Ambient Decision Theory (much of which has been wrapped into FDT) takes advantage of this. You consider various counterfactuals (X chooses action 1, action 2, etc.), only one of which should be consistent given the fact that the program is a utility maximiser. The hope is that the only consistent result will represent the best move, and in the right circumstances it will.

Inconsistency

This last strategy is neat, but it faces significant challenges. For one, what does it mean to say that when we choose the only consistent result we choose the best outcome? "Best" requires a comparison, which requires a value. In other words, we have assumed a way of assigning utility to inconsistent situations. What does this mean?

From a logical viewpoint, if we ever have P and Not P, then we have a contradiction and can prove anything. So we shouldn't strictly view these inconsistent situations as logical entities, at least in the classical sense. Instead, it may be more useful to view them as a datastructure and the utility function as something that extracts a particular element or elements, then runs a regular utility function over these.

For example, suppose we have:

• The cat is alive
• The cat is dead
• The mouse is alive

Viewed as logical propositions, we can prove that the mouse is also dead from the contradiction, but viewed as a datastructure, the mouse is only alive. This doesn't perfectly map to the situation where you're given an inconsistent list of logical propositions and told to look for contradictions, but it is close enough to be analogous: in either case, the inconsistent object is more a pseudo-logical object than a logical object.

So given these datastructures, we can define a notion of "best" for inconsistent situations, but this begs the question, "Why do we care about the best such (inconsistent) datastructure?" Do we only care about these inconsistent objects insofar as they stand in for an actually consistent situation? I suspect the answer could be no and I don't currently have a very good answer for this, but I'm hoping to have one soon.

*Quantum mechanics allows this to be a probability distribution, but then it's just probabilistically deterministic instead, so it only complicates the issue without really changing anything

Discuss

How to write good AI forecasting questions + Question Database (Forecasting infrastructure, part 3)

Новости LessWrong.com - 3 сентября, 2019 - 17:50
Published on September 3, 2019 2:50 PM UTC

This post introduces an open-source database of 76 questions about AI progress, together with detailed resolution conditions, categorisations and several spreadsheets compiling outside views and data, as well as learning points about how to write good AI forecasting questions. It is the third part in a series of blog posts which motivate and introduce pieces of infrastructure intended to improve our ability to forecast novel and uncertain domains like AI.

Background and motivation

Through our work on AI forecasting over the past year, we’ve tried to write many questions that track important facets of progress, and we've gained some experience in how hard this is and how to do it better.

In doing this, we’ve found that most previous public attempts at writing AI forecasting questions (e.g. this and this) fall prey to several failure modes that worsen the signal of the questions, as did many of the questions we wrote ourselves. Overall, we think operationalisation is an unsolved problem, and this has been the impetus behind the work described in this sequence of posts on forecasting infrastructure.

A previous great resource on this topic is Allan Dafoe’s AI Governance Research Agenda, which has an appendix with forecasting desiderata (page 52-53). This blog post complements that agenda by adding a large number of concrete examples.

We begin by categorising and giving examples of some ways in which technical forecasting questions can fail to capture the important, intended uncertainty. (Note that the examples below are not fully fleshed-out questions, in order to allow for easier reading.) We then describe the question database we’re open-sourcing.

Ambiguity

Terms that can have many different meanings, such as “AGI” or “hardcoded knowledge”.

Underspecification

Resolution criteria that neglect to specify how questions should be resolved in some possible scenarios.

Examples

This question resolves positively if an article in a reputable journal finds that commercially-available automated speech recognition is better than human speech recognition (in the sense of having a lower transcription error rate).

If a journal article is published finding that commercially-available automated speech recognition is better only in some domains (e.g. for HR meetings), but worse in most others, it is unclear from the resolution criteria how this question should be resolved.

Spurious resolution

Edge-case resolutions that technically satisfy the description as written, yet fail to capture the intention of the question.

Examples

Positively resolving the question:

Will there be an incident causing the loss of human life in the South China Sea (a highly politically contested sea in the Pacific Ocean) by 2018?

by having a battleship accidentally run over a fishing boat. (This is adapted from an actual question used by the Good Judgment Project.)

Positively resolving the question:

Will an AI lab have been nationalized by 2024?

by the US government nationalising GM for auto-manufacturing reasons, yet GM nonetheless having a self-driving car research division.

Trivial pathways

Most of the variance in the forecasting outcome of the question is driven by an unrelated causal pathway to resolution, which “screens off” the intended pathways.

A question which avoids resolution by trivial pathways is roughly what Allan Dafoe calls an “accurate indicator”:

We want [AI forecasting questions] to be accurate indicators, as opposed to noisy indicators that are not highly correlated with the important events. Specifically, where E is the occurrence or near occurrence of some important event, and Y is whether the target has been reached, we want P(not Y|not E)~1, and P(Y | E) ~1. An indicator may fail to be informative if it can be “gamed” in that there are ways of achieving the indicator without the important event being near. It may be a noisy indicator if it depends on otherwise irrelevant factors, such as whether a target happens to take on symbolic importance as the focus of research”.

Examples

Forecasting

When will there be a superhuman Angry Birds agent using no hardcoded knowledge?

and realizing that there seems to be little active interest in the yearly benchmark competition (with performance even declining over years). This means that the probability entirely depends on whether anyone with enough money and competence decides to work on it, as opposed to what key components make Angry Birds difficult (e.g. physics-based simulation) and how fast progress is in those domains.

Forecasting

How sample efficient will the best Dota 2 RL agent be in 2020?

by analyzing OpenAI’s decision on whether or not to build a new agent, rather than the underlying difficulty and progress of RL in partially observable, high-dimensional environments.

Forecasting

Will there have been a 2-year interval where the amount of training compute used in the largest AI experiment did not grow 32x, before 2024?

by analyzing how often big experiments are run, rather than what the key drivers of the trend are (e.g. parallelizability and compute economics) and how they will change in the future.

Failed parameter tuning

Any free variables are set to values which do not maximise uncertainty.

Examples

Forecasting

When will 100% of jobs be automatable at a performance close to the median employee and cost <10,000x of that employee?

can be very different from using the parameter 99.999%, as certain edge-case jobs (“underwater basket weaving”) might be surprisingly hard to automate, but for non-interesting reasons.

Will global investment in AI R&D be <$100 trillion in 2021?

is not interesting, even though asking about values in the range of ~$30B to ~$1T might have been.

Non-incentive compatible questions

Questions where the answer that would score highest on some scoring metric is different from the forecaster’s true belief.

Examples

Forecasting

”Will the world end in 2024?”

as 1% (or whatever else is the minimum for the given platform), because for any higher number you wouldn’t be around to cash out the rewards of your calibration.

Question database

We’re releasing a database of 76 questions about AI progress, together with detailed resolution conditions, categorisations and several spreadsheets compiling outside views and data. We make these questions freely available for use by any forecasting project (under an open-source MIT license).

The database has grown organically through our work on Metaculus AI, and several of the questions have associated quantitative forecasts and discussion on that site. The resolution conditions have often been honed and improved through interaction with forecasters. Moreover, in cases where the questions stem from elsewhere, such as the BERI open-source question set, or the AI Index, we’ve often spent a substantial amount of time improving the resolution conditions in cases where they were lacking.

Of particular interest might be the column “Robustness”, which tracks our overall estimate of the quality of questions -- that is, the extent to which they avoid suffering from the failure modes listed above. For example, the question:

By 2021, will a neural model reach >=70% performance on a high school mathematics exam?

is based on a 2019 DeepMind paper, and on the face of it liable to several of the failure modes above. Yet its robustness is rated as “high”, since we have specified a detailed resolution condition as: By 2021, will there EITHER… 1.
… be a credible report of a neural model with a score of >=70% on the task suite used in the 2019 DeepMind paper…
2. OR be judged by a council of experts that it’s 95% likely such a model could be implemented, were a sufficiently competent lab to try…
3. OR be a neural model with performance on another benchmark judged by a council of experts to be equally impressive, with 95% confidence?

The question can still fail in winner’s curse/Goodharting-style cases, where the best performing algorithm on a particular benchmark overestimates progress in that domain, simply because selecting for benchmark performance also selects for overfitting to the benchmark as opposed to mastering the underlying challenge. We don’t yet have a good default way of resolving such questions in a robust manner.

How to contribute to the database

We welcome contributions to the database. Airtable (the software where we’re hosting it) doesn’t allow for comments, so if you have a list of edits/additions you’d like to make, please email hello@parallelforecast.com and we can make you an editor.

Discuss

How Specificity Works

Новости LessWrong.com - 3 сентября, 2019 - 15:11

Published on September 3, 2019 12:11 PM UTC

This is Part II of the Specificity Sequence

You saw what mayhem we brought forth when we activated the first power of specificity, the power to demolish arguments, and hopefully your curiosity is piqued to see what’ll happen when we activate all the other powers. But first, let’s pause here to ask: How does specificity work?

Consider this dialogue between Steve and one of his pals:

Steve: Information should be free!

Steve's Pal: Whoa, this is thought-provoking stuff. Ok, so, would you say you're advocating for digital socialism, or more like digital libertarianism?

Oh jeez. Not only is Steve's pal not pushing for Steve to be more specific, the pal is an enabler who pushes Steve to be less specific. They're climbing the ladder of abstraction the wrong way.
The Ladder of Abstraction

When you want to nail down a claim, the operative word is “down”: you want to bring the discussion down the ladder of abstraction. If Steve says, “Information should be free!” and I'm trying to understand what he means, here's what I'd say:

Liron: Ok, why do you think I shouldn’t have had to pay Amazon for my paperback copy of To Kill A Mockingbird?

This way, I'm nosediving all the way down to the bottom rung of the ladder of abstraction. Down here, the conversation becomes grounded in the concrete language of everyday experience, with substantive statements like “Amazon charged my credit card $6.99 and kicked back $1.15 to Harper Lee’s estate.”

And how about that free shipping, Steve?

Having this kind of grounded discussion is usually more productive than having a flingfest of the higher level ballpit-words “information”, “freedom”, “socialism”, and “libertarianism”.

As Steve and I are hanging out on the bottom rung of the ladder of abstraction, talking through specific examples of information getting exchanged freely vs. non-freely, we’ll be able to notice if certain features seem to remain constant across our chosen examples. For instance, we might discuss various hypothetical authors who yearn to write books for a living, and observe that all such authors can still have some plausible mechanism to earn money (running ads on their blogs?), even without directly charging for the privilege of reading their books.

After loading our brains with specific examples, we can then abstract over these examples, climb our way back up the ladder of abstraction, and put forth a generalized claim about whether “information should be free”. Since we've been careful to think about specific example scenarios that our claim applies to, we'll be putting forth a coherent and meaningful proposition - a claim that others can study under the magnifying lens of specificity, not an empty claim that gets demolished.

So when someone makes an abstract assertion like "information should be free", the best thing you can do is hold their hand and guide them down the ladder of abstraction. I know it’s tempting to skip that process and just attack or admire their original abstract claim. But show me two people discussing a topic in purely abstract terms, and I'll show you two people who are talking past each other.

How To Slide Downward

How do you take a concept and slide it down the ladder of abstraction to obtain a more specific concept? What mental operation must your brain perform?

In Replace the Symbol with the Substance, Eliezer explains how to do it with baseball terms:

You have to visualize. You have to make your mind’s eye see the details, as though looking for the first time.

Is that a “bat”? No, it’s a long, round, tapering, wooden rod, narrowing at one end so that a human can grasp and swing it.

Is that a “ball”? No, it’s a leather-covered spheroid with a symmetrical stitching pattern, hard but not metal-hard, which someone can grasp and throw, or strike with the wooden rod, or catch.

Are those “bases”? No, they’re fixed positions on a game field, that players try to run to as quickly as possible because of their safety within the game’s artificial rules.

Or imagine you’re discussing the US’s education system. Here's how Eliezer slides it down the ladder of abstraction:

Why are you going to “school”? To get an “education” ending in a “degree”. Blank out the forbidden words and all their obvious synonyms, visualize the actual details, and you’re much more likely to notice that “school” currently seems to consist of sitting next to bored teenagers listening to material you already know, that a “degree” is a piece of paper with some writing on it, and that “education” is forgetting the material as soon as you’re tested on it.

Leaky generalizations often manifest through categorizations: People who actually learn in classrooms are categorized as “getting an education”, so “getting an education” must be good; but then anyone who actually shows up at a college will also match against the concept “getting an education”, whether or not they learn.

Let’s try it ourselves with the concept of a school “lecture”. What do we get when we slide it down the ladder of abstraction?

"A stage presentation of publicly-available educational material, hand-produced and performed by a professor who works at your educational institution, which you watch by locating yourself in a set building at a set meeting time, and which proceeds in a fixed order and at a fixed rate like broadcast television pre-YouTube."

Wow. When you hear it that way, it raises a lot of questions:

• How about using the best educational materials from the internet as a course’s official materials? Those are surely better than what any professor in your school has ever performed.
• How about making it a standard expectation for students to consume lectures at their own pace, including taking advantage of the pause and speed up / slow down features?
• How about not forcing students to show up to a lecture hall at a specific time?

If people had never heard the word “lecture”, if people were always forced to talk about lectures via a specific description of what a lecture is… well, then they would have killed off lectures by now.

I believe the college lecture is only alive today because the word “lecture” is a protective abstraction-bubble.

I grabbed this out of Steve’s ball pit

If you crack open the protective shell of “lecture” and press your nose close enough, you breathe in the stinky innards: “a stage presentation of publicly-available educational material”.

If everyone involved with the university system — administrators, professors, parents, students — were themselves cracking open the shell and taking a whiff of “lecture”, they would have noticed when the expiration date passed (the day YouTube went mainstream) and taken out the trash.

Instead, we’ve landed in a weird place where the concept of a “stage presentation of publicly-available educational material” is ridiculous on its face, while our ears still tell us that the concept of a school “lecture” sounds pretty good.

Ground Your Terms

You probably know that to have a clear discussion (or just a clear thought), you need to define your terms. How do you define a term?

S. I. Hayakawa illustrates an attempt to define the term “red” by connecting it to concepts higher up the ladder of abstraction (h/t Eliezer):

“What is meant by the word red?”
“It’s a color.”
“What’s a color?”
“Why, it’s a quality things have.”
“What’s a quality?”

This approach of defining something by sliding it up the ladder of abstraction doesn’t feel productive. It might help define “red”, but it’s neither necessary nor sufficient to define “red”.

Similarly, when I asked Steve to define what “exploiting workers” means in regard to Uber, and he put forth “to use selfishly for one’s own ends”, we found ourselves no closer to understanding what the heck his point was supposed to be.

So how can we nail down “red”? How can we slide it down the ladder of abstraction? Hayakawa illustrates:

“What is meant by the word red?”
“Well, the next time you see some cars stopped at an intersection, look at the traffic light facing them. Also, you might go to the fire department and see how their trucks are painted.”

Now we’re getting somewhere. This is a good enough definition to satisfy someone who previously didn’t know what “red” means. Whenever we define a concept in this manner, by sliding it down the ladder of abstraction, we can call it grounding the concept.

Grounding is easy enough to do. Just follow Eliezer’s instructions from the previous section:

You have to visualize. You have to make your mind’s eye see the details, as though looking for the first time.

For example: What is fire?

If you know chemistry, you might define it as “rapid oxidation accompanied by heat and, usually, light”. But if you don’t know chemistry, what would you say? Most people would give up.

Don’t worry, just follow the instructions to ground it. Close your eyes and describe what you see:

"The bright orange heat and light that appears when I strike a match, and can sometimes be transferred to other things it touches, and keeps appearing as long as there is wood and air around it."

Concepts have many definitions, and not all of them are groundings. But in daily life, grounding a term is usually as good as precisely defining it, if not better. So when a term is confusing, just slide that sucker down the ladder of abstraction.

Effort and Risk Asymmetry

If you observe your own stream of consciousness in a discussion, you might feel it being gently buoyed up the ladder of abstraction. You might start a discussion with a few specific statements about firetrucks, but before you know it you're talking about redness and colors in general.

Why is that? If the most productive kind of discussion is grounded and concrete, then why do so many people seem to relish the experience of pontificating and arguing abstractly?

Eliezer says in The 5-Second Level:

Abstraction is a path of least resistance, a form of mental laziness. Over-abstraction happens because it’s easy to be abstract. It’s easier to say “red is a color” than to pause your thoughts for long enough to come up with the example of a stop sign.

It seems our brain stores each concept with an easily-navigable pointer upwards toward its category (like red→color), but doesn’t store easily-navigable pointers downward toward specific examples (like red→firetruck). I’m not sure what larger aspect of the brain's architecture accounts for this, but here's one observation about why the two operations are asymmetrical:

When you slide a concept up, you remove information. When you slide it down, you add information, and that requires you to make an arbitrary choice with more degrees of freedom. I think there's a sense in which going up the ladder of abstraction is safer, while going down is riskier, in the sense of leaking information about you that the elephant in the brain is keen to monitor and filter.

Eliezer's description of how he intentionally sticks his neck out in arguments has always stayed with me (source), and I suspect it's related to why we find abstraction appealing:

I stick my neck out so that it can be chopped off if I'm wrong, and when I stick my neck out it stays stuck out, and if I have to withdraw it I'll do so as a visible concession.  I may parry[...] but I at least endeavor not to dodge. Where I plant my standard, I have sent an invitation to capture that banner; and I'll stand by that invitation.

When you say something abstract, like "information should be free", the space of possible things you can mean is vast. That means you're not sticking your neck out - you're not affixing your neck to a precise location in claim-space, so you never have to fear that an opponent's sword might slash there. Plus, as a bonus, vague statements make you sound smarter. You're signaling more intelligence and sophistication talking about "digital libertarianism" than about "buying a paperback on Amazon". That's why abstraction is appealing.

The upshot is that you'll have to make a sustained conscious effort to acquire the skill and habit of activating your specificity powers. But it'll be worth it.

Next post: The Power to Judge Startup Ideas (coming tomorrow)


[Hammertime Final Exam] Quantum Walk, Oracles and Sunk Meaning

News from LessWrong.com - 3 September 2019 - 14:59
Published on September 3, 2019 11:59 AM UTC

One and a half years ago, alkjash published his Hammertime sequence. Admittedly a bit late to the party, I recently went through the 30 days together with a few other aspiring rationalists, who may also post their "final exams" in the coming weeks. The final exam consists of three tasks:

1. Design an instrumental rationality technique.
2. Introduce a rationality principle or framework.
3. Describe a cognitive defect, bias, or blindspot.

For each of these, we're supposed to spend 5 minutes brainstorming plus 5 minutes writing the text. The brainstorming budget worked out in my case, but the writing took somewhat longer, especially turning the whole thing into a semi-readable post. (I couldn't figure out how to properly format things here; it took me 15 minutes to create proper headlines, as they were never limited to my selection but turned my whole text into a headline, and creating these fancy separators always removed big chunks of text; so, sorry about that.)

Technique: Quantum Walk

Murphyjitsu is a CFAR technique that helps you bulletproof plans that might feel secure but in reality probably aren't. It makes you focus on all these things that could go wrong which you haven't taken into account yet.

I propose a similar yet inverse method: picking goals that you're hesitant to set because they seem unreachable for whatever reason. Given you have such a goal, imagine that x months from now you will have reached it, and now think what course of actions may have brought you from where you are now to that point.
So here we are not looking for unforeseen failure modes, but instead for a hidden path to success, one that may make the difference between shying away from that goal, and making tangible progress toward it.

The reason I'm calling it "quantum walk" is twofold: Firstly, I couldn't think of any better name, as "anti murphyjitsu" sounds suboptimal. Secondly, I recently read Life on the Edge and am now primed to see quantum mechanics everywhere. One thing the book explained was how photosynthesis contains a step that only works due to quantum mechanics: some particle has to find a certain very specific spot to settle in. If that particle behaved in a classical way, finding that spot would be incredibly unlikely and the whole process of photosynthesis wouldn't actually work. Yet utilizing quantum laws the particle finds its way to its destination.
The technique's approach is similar in the way that you're not searching a way to the goal from where you are now. Instead you simply accept that the goal is reached in some hypothetical future situation, and backtrack from there. The analogy is weak, but at least the name has a certain ring to it.

For those interested, I suggest the following challenge: Pick any goal you'd like to have but are afraid to actually accept as a goal. In case you can't think of any, use the technique of picking any existing goal and doubling its scale until it seems utterly ridiculous. Now try to answer the question "assuming x months from now I've reached that goal, what course of actions has led me there?" and feel very free to share whether or not this has led you to any new insights.

(Disclaimer: my explanation of that photosynthesis phenomenon may be somewhat (or very) off, but as it's not really the point of the text, I take that risk.)

Framework: Oracles

There are times when we're trying to learn or understand something but aren't really getting anywhere. Whether it's learning some esoteric concept in physics class, assembling a piece of furniture, setting up a computer or debugging a piece of code. Sometimes these cases seem really obscure and highly confusing - we don't even know where to start, let alone how to solve the issue. As aspiring rationalists we're aware that this confusion stems not from the territory itself but from the map we're using, yet we feel unable to adjust our map accordingly: it's so distant from the territory that banging our head against the wall may seem just as promising a step as any other we could possibly think of.

There is one thing we have control over, however: becoming an expert on our own confusion. Only after we know exactly what we don't understand and why can we take the necessary steps to fill in the gaps and slowly get to a higher vantage point. One way to approach this is to ask oneself which question, if answered by a hypothetical oracle that truthfully answers yes or no to any question that has a clear answer, would reduce one's confusion the most.

Tutoring often works in a way that the student struggles to understand a certain concept and thus the teacher tries to explain it to the student. Hence the teacher is more active and the student, in a comparably passive role, takes the consumed information and tries to add it to their mental model wherever it sticks. This framework turns things around and puts the student in the active role and any progress emerges from them instead of the teacher. The hypothetical teacher in this case - the oracle - is completely reactive and does nothing other than passively answering the student's enquiries.

In reality of course there is no oracle. At best there's another person with an understanding that exceeds yours. At worst there's still the internet and text books.

I personally often find myself in the situation that I struggle with something, say some API I want to use in a web application but it doesn't behave as expected, and I get frustrated and blame outside sources such as the developer of that API or the bad documentation. This framework forces me to take full responsibility and focus on what actually matters, which is what exactly is currently limiting my understanding and what I could do to change that. Once I have a set of questions the answers to which would allow me to progress, the process of finding answers to these questions is often easier than expected. I tend to focus so much on wanting to find answers that I forget to realize the crucial part is to come up with the right questions. And the latter doesn't depend on the territory to which I'm lacking access, but entirely on my personal map.
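The idea of hunting for the single most confusion-reducing question has a neat formal analogue: among a set of candidate hypotheses, the best yes/no question is the one that minimises the expected uncertainty of what remains. A minimal sketch of that idea (the hypotheses and questions below are invented examples, not anything from this post):

```python
import math

def expected_remaining_entropy(n_yes, n_no):
    """Expected bits of uncertainty left after a yes/no question that
    splits n_yes + n_no equally likely hypotheses into two groups."""
    n = n_yes + n_no
    h = lambda k: math.log2(k) if k > 0 else 0.0  # entropy of k uniform options
    return (n_yes / n) * h(n_yes) + (n_no / n) * h(n_no)

# four equally plausible explanations for a misbehaving API call
hypotheses = ["bug in my code", "bug in the API",
              "stale documentation", "version mismatch"]

# candidate questions, each described by how many hypotheses "yes" keeps
questions = {
    "Is the problem on my side?": 1,         # keeps only "bug in my code"
    "Does a minimal example also fail?": 2,  # keeps two of the four
}

for q, n_yes in questions.items():
    bits_left = expected_remaining_entropy(n_yes, len(hypotheses) - n_yes)
    print(f"{q} -> {bits_left:.3f} bits left")
```

On four hypotheses, the question that splits them evenly leaves 1 bit of uncertainty on average, while the 1-vs-3 split leaves about 1.19 bits, so the even split is the better question to ask the oracle first.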

Bias: Sunk Meaning

We all know the concept of "sunk cost", usually in the context of money, time, effort or some other limited resource. I'd like to discuss a related idea which could be called "sunk meaning".

Imagine a cow is slaughtered because somebody wants to eat its meat. For whatever reason, that meat turns out to be unusable and cannot be consumed. The only thing that remains which could be utilized in any way is the cow's skin.

You are now invited as the expert to decide whether or not to skin the cow in order to produce some leather, and you know with certainty that doing so would cost exactly X resources (including time, money, material, etc.). You also know there's a new innovative method to synthesize perfect leather indistinguishable from animal leather but without using an animal, which also costs exactly X resources. You also happen to own a waste dematerializer, so getting rid of the cow's remains is not an issue whichever action you choose.
You have these two options, each exactly equally expensive and with exactly the same outcome. Which option would you prefer, if any?

I'd assume more than half of all people would feel like using the cow is the reasonable option here, for some reason similar to "otherwise the cow would have died for nothing". I recently witnessed a member of the rationalist community using that exact line of reasoning.

To this I counter the following: causality never moves back in time.

A present action can have no causal effect on something that happened in the past. Using pieces of the cow today does not affect the death of the cow in any real way. Saying "otherwise the cow would have died for nothing" says nothing about the cow or its death, only about our personal current interpretation. The thing that may or may not improve by us using the cow instead of the synthetic method has nothing to do with what the argument states, instead it has everything to do with how we personally feel.

The thing I'm hinting at is that every time we feel like something gives meaning to the past, we should be aware that this is 100% imaginary. It may certainly affect how people feel and behave and as such has to be taken seriously, but the same is true for sunk cost, and we should at the very least be that honest when communicating that issue. Falling for sunk meaning is just as (ir)rational as falling for sunk cost.
We're not using the cow's skin because it gives meaning to the cow's death or because it makes it less of a waste. We're using the cow's skin to feel better.


AIXSU - AI and X-risk Strategy Unconference

News from LessWrong.com - 3 September 2019 - 14:35
Published on September 3, 2019 11:35 AM UTC

Start: Friday, November 29, 10am
End: Sunday, December 1, 7pm
Location: EA Hotel, 36 York Street, Blackpool

AIXSU is an unconference on AI and existential risk strategy. As it is an unconference, the event will be created by the participants. There will be an empty schedule which you, the participants, will fill up with talks, discussions and more.

AIXSU is inspired by TAISU, a successful AI Safety unconference held at the EA Hotel in August. The AI and existential risk strategy space seems to be in need of more events, and AIXSU hopes to close this gap a bit. The unconference will be three days long.

To enable high-level discussion during the unconference, we require that all participants have some prior involvement with AI or existential risk strategy. AI and existential risk strategy concerns the broad spectrum of things we need to solve in order for humanity to handle the technological transitions ahead of us. Topics of interest include but are not limited to: Macrostrategy, technological forecasting, technological scenarios, AI governance, AI policy, AI ethics, cooperative principles and institutions, and foundational philosophy on the future of humanity. Here is an incomplete list of sufficient criteria:

• Are currently or have previously worked for or interned at an established existential risk reduction organization
• Have published papers or sufficiently high quality blog posts on strategy-related topics
• Combination of involvement in AI safety or other existential risk work with involvement in AI strategy. For example, you’ve worked on AI safety on and off for a few years and also have an active interest in strategy-related questions. Or you attended one of AI Safety Camp, MSFP/AISFP, or the Human-aligned AI Summer School, and are now focusing on strategy.
• You are pursuing a future in AI strategy or existential risk strategy and have read relevant texts on the topic.

If you feel uncertain about qualifying, please feel free to reach out and we can have a chat about it.

You can participate in the unconference as many or as few days as you would like to. You are also welcome to stay longer at the EA Hotel before or after the unconference.

Price: Pay what you want (cost price is £10/person/day).
Food: All meals will be provided by EA Hotel. All food will be vegan.
Lodging: The EA Hotel has two dorm rooms that have been reserved for AIXSU participants. If the dorm rooms fill up, or if you would like your own room, there are many nearby hotels that you can book. We will provide information on nearby hotels.

Attendance is on a first-come, first-served basis. Make sure to apply soon if you want to secure your spot.

Apply to attend AIXSU here


Book Review: Ages Of Discord

News from LessWrong.com - 3 September 2019 - 09:30
Published on September 3, 2019 6:30 AM UTC

I.

I recently reviewed Secular Cycles, which presents a demographic-structural theory of the growth and decline of pre-industrial civilizations. When land is plentiful, population grows and the economy prospers. When land reaches its carrying capacity and income declines to subsistence, the area is at risk of famines, diseases, and wars – which kill enough people that land becomes plentiful again. During good times, elites prosper and act in unity; during bad times, elites turn on each other in an age of backstabbing and civil strife. It seemed pretty reasonable, and authors Peter Turchin and Sergey Nefedov had lots of data to support it.

Ages of Discord is Turchin’s attempt to apply the same theory to modern America. There are many reasons to think this shouldn’t work, and the book does a bad job addressing them. So I want to start by presenting Turchin’s data showing such cycles exist, so we can at least see why the hypothesis might be tempting. Once we’ve seen the data, we can decide how turned off we want to be by the theoretical problems.

The first of Turchin’s two cyclic patterns is a long cycle of national growth and decline. In Secular Cycles‘ pre-industrial societies, this pattern lasted about 300 years; in Ages of Discord‘s picture of the modern US, it lasts about 150:

This summary figure combines many more specific datasets. For example, archaeologists frequently assess the prosperity of a period by the heights of its skeletons. Well-nourished, happy children tend to grow taller; a layer with tall skeletons probably represents good times during the relevant archaeological period; one with stunted skeletons probably represents famine and stress. What if we applied this to the modern US?

Average US height and life expectancy over time. As far as I can tell, the height graph is raw data. The life expectancy graph is the raw data minus an assumed constant positive trend – that is, given that technological advance is increasing life expectancy at a linear rate, what are the other factors you see when you subtract that out? The exact statistical logic is buried in Turchin’s source (Historical Statistics of the United States, Carter et al. 2004), which I don’t have and can’t judge.
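The detrending operation described here is simple to state: fit a straight line to the series and keep only the residuals. A toy sketch with invented numbers (not Turchin's actual data), just to show the mechanics:

```python
import numpy as np

# hypothetical life-expectancy readings, one per decade
years = np.arange(1880, 2010, 10).astype(float)
trend = 0.2 * (years - 1880) + 45                       # steady technological gain
cycle = 2.0 * np.sin(2 * np.pi * (years - 1880) / 150)  # slow secular swing
life_exp = trend + cycle

# fit a linear trend and subtract it; the residual is the
# "other factors" component that the grand-cycle story cares about
slope, intercept = np.polyfit(years, life_exp, 1)
residual = life_exp - (slope * years + intercept)
```

By construction the residual averages to zero, so what survives detrending is only the shape of the swings around the long-run improvement, which is exactly what the cyclic hypothesis needs to be tested against.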

This next graph is the median wage divided by GDP, a crude measure of income equality:

Lower values represent more inequality.

This next graph is median female age at first marriage. Turchin draws on research suggesting this tracks social optimism. In good times, young people can easily become independent and start supporting a family; in bad times, they will want to wait to make sure their lives are stable before settling down:

This next graph is Yale tuition as a multiple of average manufacturing worker income. To some degree this will track inequality in general, but Turchin thinks it also measures something like “difficulty of upward mobility”:

This next graph shows DW-NOMINATE’s “Political Polarization Index”, a complicated metric occasionally used by historians of politics. It measures the difference in voting patterns between the average Democrat in Congress and the average Republican in Congress (or for periods before the Democrats and Republicans, whichever two major parties there were). During times of low partisanship, congressional votes will be dominated by local or individual factors; during times of high partisanship, it will be dominated by party identification:

I’ve included only those graphs which cover the entire 1780 – present period; the book includes many others that only cover shorter intervals (mostly the more recent periods when we have better data). All of them, including the shorter ones not included here, reflect the same general pattern. You can see it most easily if you standardize all the indicators to the same scale, match the signs so that up always means good and down always means bad, and put them all together:

Note that these aren’t exactly the same indicators I featured above; we’ll discuss immigration later.
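The "standardize, match signs, average" recipe for the summary curve amounts to z-scoring each indicator and taking a plain mean. A minimal sketch with invented numbers (not the book's data), assuming signs have already been flipped so that up always means good:

```python
import numpy as np

# rows = indicators (different units), columns = time points
indicators = np.array([
    [170.0, 172.0, 169.0, 171.0],  # e.g. average height in cm
    [0.40, 0.45, 0.35, 0.42],      # e.g. wage share of GDP
])

# standardize each indicator to mean 0, standard deviation 1,
# so series measured in different units become comparable
mu = indicators.mean(axis=1, keepdims=True)
sigma = indicators.std(axis=1, keepdims=True)
z = (indicators - mu) / sigma

# the composite "well-being" curve is the plain average across indicators
composite = z.mean(axis=0)
```

Because each row is forced onto the same scale first, no single indicator (say, tuition in dollars) can dominate the average merely by having large raw numbers.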

The “average” line on this graph is the one that went into making the summary graphic above. Turchin believes that after the American Revolution, there was a period of instability lasting a few decades (eg Shays’ Rebellion, Whiskey Rebellion) but that America reached a maximum of unity, prosperity, and equality around 1820. Things gradually got worse from there, culminating in a peak of inequality, misery, and division around 1900. The reforms of the Progressive Era gradually made things better, with another unity/prosperity/equality maximum around 1960. Since then, an increasing confluence of negative factors (named here as the Reagan Era trend reversal, but Turchin admits it began before Reagan) has been making things worse again.

II.

Along with this “grand cycle” of 150 years, Turchin adds a shorter instability cycle of 40-60 years. This is the same 40-60 year instability cycle that appeared in Secular Cycles, where Turchin called it “the bigenerational cycle”, or the “fathers and sons cycle”.

Timing and intensity of internal war in medieval and early modern England, from Turchin and Nefedov 2009.

To check this empirically, Turchin tries to measure the number of “instability events” in the US over various periods. He very correctly tries to use lists made by others (since they are harder to bias), but when people haven’t catalogued exactly the kind of instability he’s interested in over the entire 1780 – present period, he sometimes adds his own interpretation. He ends up summing riots, lynchings, terrorism (including assassinations), and mass shootings – you can see his definition for each of these starting on page 114; the short version is that all the definitions seem reasonable but inevitably include a lot of degrees of freedom.

When he adds all this together, here’s what happens:

Political instability / violent events show three peaks, around 1870, 1920, and 1970.

The 1870 peak includes the Civil War, various Civil War associated violence (eg draft riots), and the violence around Reconstruction (including the rise of the Ku Klux Klan and related violence to try to control newly emancipated blacks).

The 1920 peak includes the height of the early US labor movement. Turchin discusses the Mine War, an “undeclared war” from 1920-1921 between bosses and laborers in Appalachian coal country:

Although it started as a labor dispute, it eventually turned into the largest armed insurrection in US history, other than the Civil War. Between 10,000 and 15,000 miners armed with rifles fought thousands of strike-breakers and sheriff’s deputies, called the Logan Defenders. The insurrection was ended by the US Army. While such violent incidents were exceptional, they took place against a background of a general “class war” that had been intensifying since the violent teens. “In 1919 nearly four million workers (21% of the workforce) took disruptive action in the face of employer reluctance to recognize or bargain with unions” (Domhoff and Webber, 2011:74).

Along with labor violence, 1920 was also a peak in racial violence:

Race-motivated riots also peaked around 1920. The two most serious such outbreaks were the Red Summer of 1919 (McWhirter 2011) and the Tulsa (Oklahoma) Race Riot. The Red Summer involved riots in more than 20 cities across the United States and resulted in something like 1,000 fatalities. The Tulsa riot in 1921, which caused about 300 deaths, took on an aspect of civil war, in which thousands of whites and blacks, armed with firearms, fought in the streets, and most of the Greenwood District, a prosperous black neighborhood, was destroyed.

And terrorism:

The bombing campaign by Italian anarchists (“Galleanists”) culminated in the 1920 explosion on Wall Street, which caused 38 fatalities.

The same problems: labor unrest, racial violence, terrorism – repeated during the 1970s spike. Instead of quoting Turchin on this, I want to quote this Status 451 review of Days of Rage, because it blew my mind:

“People have completely forgotten that in 1972 we had over nineteen hundred domestic bombings in the United States.” — Max Noel, FBI (ret.)

Recently, I had my head torn off by a book: Bryan Burrough’s Days of Rage, about the 1970s underground. It’s the most important book I’ve read in a year. So I did a series of running tweetstorms about it, and Clark asked me if he could collect them for posterity. I’ve edited them slightly for editorial coherence.

Days of Rage is important, because this stuff is forgotten and it shouldn’t be. The 1970s underground wasn’t small. It was hundreds of people becoming urban guerrillas. Bombing buildings: the Pentagon, the Capitol, courthouses, restaurants, corporations. Robbing banks. Assassinating police. People really thought that revolution was imminent, and thought violence would bring it about.

One thing that Burrough returns to in Days of Rage, over and over and over, is how forgotten so much of this stuff is. Puerto Rican separatists bombed NYC like 300 times, killed people, shot up Congress, tried to kill POTUS (Truman). Nobody remembers it.

The passage speaks to me because – yeah, nobody remembers it. This is also how I feel about the 1920 spike in violence. I’d heard about the Tulsa race riot, but the Mine War and the bombing of Wall Street and all the other stuff was new to me. This matters because my intuitions before reading this book would not have been that there were three giant spikes in violence/instability in US history located fifty years apart. I think the lesson I learn is not to trust my intuitions, and to be a little more sympathetic to Turchin’s data.

One more thing: the 1770 spike was obviously the American Revolution and all of the riots and communal violence associated with it (eg against Tories). Where was the 1820 spike? Turchin admits it didn’t happen. He says that because 1820 was the absolute best part of the 150 year grand cycle, everybody was so happy and well-off and patriotic that the scheduled instability peak just fizzled out. Although Turchin doesn’t mention it, you could make a similar argument that the 1870 spike was especially bad (see: the entire frickin’ Civil War) because it hit close to (though not exactly at) the worst part of the grand cycle. 1920 hit around the middle, and 1970 during a somewhat-good period, so they fell in between the nonissue of 1820 and the disaster of 1870.

III.

I haven’t forgotten the original question – what drives these 150 year cycles of rise and decline – but I want to stay with the data just a little longer. Again, these data are really interesting. Either some sort of really interesting theory has to be behind them – or they’re just low-quality data cherry-picked to make a point. Which are they? Here are a couple of spot-checks to see if the data are any good.

First spot check: can I confirm Turchin’s data from independent sources?

– Here is a graph of average US height over time which seems broadly similar to Turchin’s.

– Here is a different measure of US income inequality over time, which again seems broadly similar to Turchin’s. Piketty also presents very similar data, though his story places more emphasis on the World Wars and less on the labor movement.

– The Columbia Law Review measures political polarization over time and gets mostly the same numbers as Turchin.

I’m going to consider this successfully checked; Turchin’s data all seem basically accurate.

Second spot check: do other indicators Turchin didn’t include confirm the pattern he detects, or did he just cherry-pick the data series that worked? Spoiler: I wasn’t able to do this one. It was too hard to think of measures that should reflect general well-being and that we have 200+ years of unconfounded data for. But here are my various failures:

– The annual improvement in mortality rate does not seem to follow the cyclic pattern. But isn’t this more driven by a few random factors like smoking rates and the logic of technological advance?

– Treasury bonds maybe kind of follow the pattern until 1980, after which they go crazy.

– Divorce rates look kind of iffy, but isn’t that just a bunch of random factors?

– Homicide rates, with the general downward trend removed, sort of follow the pattern, except for the recent decline?

– USD/GBP exchange rates don’t show the pattern at all, but that could be because of things going on in Britain?

The thing is – really I have no reason to expect divorce rates, homicide rates, exchange rates etc to track national flourishing. For one thing, they may just be totally unrelated. For another, even if they were tenuously related, there are all sorts of other random factors that can affect them. The problem is, I would have said this was true for height, age at first marriage, and income inequality too, before Turchin gave me convincing-sounding stories for why it wasn’t. I think my lesson is that I have no idea which indicators should vs. shouldn’t follow a secular-cyclic pattern and so I can’t do this spot check against cherry-picking the way I hoped.

Third spot check: common sense. Here are some things that stood out to me:

– The Civil War is at a low-ish part of the cycle, but by no means the lowest.

– The Great Depression happened at a medium part of the cycle, when things should have been quickly getting better.

– Even though there was a lot of new optimism with Reagan, continuing through the Clinton years, the cycle does not reflect this at all.

Maybe we can rescue the first and third problem by combining the 150 year cycle with the shorter 50 year cycle. The Civil War was determined by the 50-year cycle having its occasional burst of violence at the same time the 150-year cycle was at a low-ish point. People have good memories of Reagan because the chaos of the 1970 violence burst had ended.

As for the second, Turchin is aware of the problem. He writes:

There is a widely held belief among economists and other social scientists that the 1930s were the “defining moment” in the development of the American politico-economic system (Bordo et al 1998). When we look at the major structural-demographic variables, however, the decade of the 1930s does not seem to be a turning point. Structural-demographic trends that were established during the Progressive Era continued through the 1930s, although some of them accelerated.

Most notably, all the well-being variables went through trend reversals before the Great Depression – between 1900 and 1920. From roughly 1910 to 1960 they all increased roughly monotonically, with only one or two minor fluctuations around the upward trend. The dynamics of real wages also do not exhibit a breaking point in the 1930s, although there was a minor acceleration after 1932.

By comparison, he plays up the conveniently-timed (and hitherto unknown to me) depression of the mid-1890s. Quoting Turchin quoting McCormick:

No depression had ever been as deep and tragic as the one that lasted from 1893 to 1897. Millions suffered unemployment, especially during the winters of 1893-4 and 1894-5, and thousands of ‘tramps’ wandered the countryside in search of food […]

Despite real hardship resulting from massive unemployment, well-being indicators suggest that the human cost of the Great Depression of the 1930s did not match that of the “First Great Depression” of the 1890s (see also Grant 1983:3-11 for a general discussion of the severity of the 1890s depression). Furthermore, while the 1930s are remembered as a period of violent labor unrest, the intensity of class struggle was actually lower than during the 1890s depression. According to the US Political Violence Database (Turchin et al. 2012) there were 32 lethal labor disputes during the 1890s that collectively caused 140 deaths, compared with 20 such disputes in the 1930s with a total of 55 deaths. Furthermore, the last lethal strike in US labor history was in 1937…in other words, the 1930s was actually the last uptick of violent class struggle in the US, superimposed on an overall declining trend.

The 1930s Depression is probably remembered (or rather misremembered) as the worst economic slump in US history, simply because it was the last of the great depressions of the post-Civil War era.

Fourth spot check: Did I randomly notice any egregious errors while reading the book?

On page 70, Turchin discusses “the great cholera epidemic of 1849, which carried away up to 10% of the American population”. This seemed unbelievably high to me. I checked the source he cited, Kohl’s “Encyclopedia Of Plague And Pestilence”, which did give that number. But every other source I checked agreed that the epidemic “only” killed between 0.3% – 1% of the US population (it did hit 10% in a few especially unlucky cities like St. Louis). I cannot fault Turchin’s scholarship in the sense of correctly repeating something written in an encyclopedia, but unless I’m missing something I do fault his common sense.

Also, on page 234, Turchin interprets the percent of medical school graduates who get a residency as “the gap between the demand and supply of MD positions”, which he ties into a wider argument about elite overproduction. But I think this shows a limited understanding of how the medical system works. There is currently a severe undersupply of doctors – try getting an appointment with a specialist who takes insurance in a reasonable amount of time if you don’t believe me. Residencies aren’t limited by organic demand. They’re limited because the government places so many restrictions on them that hospitals don’t sponsor them without government funding, and the government is too stingy to fund more of them. None of this has anything to do with elite overproduction.

These are just two small errors in a long book. But they’re two errors in medicine, the field I know something about. This makes me worry about Gell-Mann Amnesia: if I notice errors in my own field, how many errors must there be in other fields that I just didn’t catch?

My overall conclusion from the spot-checks is that the data as presented are basically accurate, but that everything else is so dependent on litigating which things are vs. aren’t in accordance with the theory that I basically give up.

IV.

Okay. We’ve gone through the data supporting the grand cycle. We’ve gone through the data and theory for the 40-60 year instability cycle. We’ve gone through the reasons to trust vs. distrust the data. Time to go back to the question we started with: why should the grand cycle, originally derived from the Malthusian principles that govern pre-industrial societies, hold in the modern US? Food and land are no longer limiting resources; famines, disease, and wars no longer substantially decrease population. Almost every factor that drives the original secular cycle is missing; why even consider the possibility that it might still apply?

I’ve put this off because, even though this is the obvious question Ages of Discord faces from page one, I found it hard to get a single clear answer.

Sometimes, Turchin talks about the supply vs. demand of labor. In times when the supply of labor outpaces demand, wages go down, inequality increases, elites fragment, and the country gets worse, mimicking the “land is at carrying capacity” stage of the Malthusian cycle. In times when demand for labor exceeds supply, wages go up, inequality decreases, elites unite, and the country gets better. The government is controlled by plutocrats, who always want wages to be low. So they implement policies that increase the supply of labor, especially loose immigration laws. But their actions cause inequality to increase and everyone to become miserable. Ordinary people organize resistance: populist movements, socialist cadres, labor unions. The system teeters on the edge of violence, revolution, and total disintegration. Since the elites don’t want those things, they take a step back, realize they’re killing the goose that lays the golden egg, and decide to loosen their grip on the neck of the populace. The government becomes moderately pro-labor and progressive for a while, and tightens immigration laws. The oversupply of labor decreases, wages go up, inequality goes down, and everyone is happy. After everyone has been happy for a while, the populists/socialists/unions lose relevance and drift apart. A new generation of elites who have never felt threatened come to power, and they think to themselves “What if we used our control of the government to squeeze labor harder?” Thus the cycle begins again.

But at other times, Turchin talks more about “elite overproduction”. When there are relatively few elites, they can cooperate for their common good. Bipartisanship is high, everyone is unified behind a system perceived as wise and benevolent, and we get a historical period like the 1820s US golden age that historians call The Era Of Good Feelings. But as the number of elites outstrips the number of high-status positions, competition heats up. Elites realize they can get a leg up in an increasingly difficult rat race by backstabbing against each other and the country. Government and culture enter a defect-defect era of hyperpartisanship, where everyone burns the commons of productive norms and institutions in order to get ahead. Eventually…some process reverses this or something?…and then the cycle starts again.

At still other times, Turchin seems to retreat to a sort of mathematical formalism. He constructs an extremely hokey-looking dynamic feedback model, based on ideas like “assume that the level of discontent among ordinary people equals the urbanization rate x the age structure x the inverse of their wages relative to the elite” or “let us define the fiscal distress index as debt ÷ GDP x the level of distrust in state institutions”. Then he puts these all together into a model that calculates how the level of discontent affects and is affected by the level of state fiscal distress and a few dozen other variables. On the one hand, this is really cool, and watching it in action gives you the same kind of feeling Seldon must have had inventing psychohistory. On the other, it seems really made-up. Turchin admits that dynamic feedback systems are infamous for going completely haywire if they are even a tiny bit skew to reality, but assures us that he understands the cutting edge of the field and how to make them not do that. I don’t know enough to judge whether he’s right or wrong, but my priors are on “extremely, almost unfathomably wrong”. Still, at times he reminds us that the shifts of dynamic feedback systems can be attributed only to the system in its entirety, and that trying to tell stories about or point to specific factors involved in any particular shift is an approximation at best.
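To give a feel for what a model of this general shape looks like (and why small parameter changes matter so much), here is a deliberately toy two-variable sketch. Every variable name and coefficient below is my own invention for illustration; Turchin's actual model couples a few dozen variables with empirically fitted parameters.

```python
# Toy structural-demographic feedback loop (illustrative only, not Turchin's model).
# Two coupled variables: worker well-being ("wages") and popular discontent.
# Low wages breed discontent; high discontent pressures elites into concessions
# that raise wages; low discontent lets elites squeeze labor again.

def simulate(steps=300, dt=0.5):
    wages = 1.0        # relative worker well-being (arbitrary units)
    discontent = 0.2   # popular discontent index
    history = []
    for _ in range(steps):
        # Discontent grows when wages are below 1.0, and slowly decays on its own.
        d_discontent = (1.0 / wages - 1.0) * 0.1 - 0.02 * discontent
        # Elites concede (wages rise) when discontent exceeds a threshold,
        # and squeeze labor (wages fall) when it is below it.
        d_wages = (discontent - 0.5) * 0.05
        discontent += d_discontent * dt
        wages += d_wages * dt
        history.append(wages)
    return history

h = simulate()
```

Even this cartoon version produces the qualitative story: wages sag while discontent is low, discontent builds, and concessions eventually push wages back up, over and over. It also shows the fragility Turchin concedes: nudge the coefficients and the cycle can damp out entirely or spiral off to infinity, which is why "a tiny bit skew to reality" is such a live worry.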

All of these three stories run into problems almost immediately.

First, the supply of labor story focuses pretty heavily on immigration. Turchin puts a lot of work into showing that immigration follows the secular cycle patterns; it is highest at the worst part of the cycle, and lowest at the best parts:

In this model, immigration is a tool of the plutocracy. A high supply of labor (relative to demand) drives down wages, increases inequality, and lowers workers’ bargaining power. If the incoming workers are poorly organized, come from places that don’t understand the concept of “union”, don’t know their rights, and face racial and linguistic barriers preventing them from cooperating with the rest of the working class, well, even better. Thus, periods when the plutocracy is successfully squeezing the working class are marked by high immigration. Periods when the plutocracy fears the working class and feels compelled to be nice to them are marked by low immigration.

This position makes some sense and is loosely supported by the long-term data above. But isn’t this one of the most-studied topics in the history of economics? Hasn’t it been proven almost beyond doubt that immigrants don’t steal jobs from American workers, and that since they consume products themselves (and thus increase the demand for labor) they don’t affect the supply/demand balance that sets wages?

It appears I might just be totally miscalibrated on this topic. I checked the IGM Economic Experts Panel. Although most of the expert economists surveyed believed immigration was a net good for America, they did say (50% agree to only 9% disagree) that “unless they were compensated by others, many low-skilled American workers would be substantially worse off if a larger number of low-skilled foreign workers were legally allowed to enter the US each year”. I’m having trouble seeing the difference between this statement (which economists seem very convinced is true) and “you should worry about immigrants stealing your job” (which everyone seems very convinced is false). It might be something like – immigration generally makes “the economy better”, but there’s no guarantee that these gains are evenly distributed, and so it can be bad for low-skilled workers in particular? I don’t know, this would still represent a pretty big update, but given that I was told all top economists think one thing, and now I have a survey of all top economists saying the other, I guess big updates are unavoidable. Interested in hearing from someone who knows more about this.

Even if it’s true that immigration can hurt low-skilled workers, Turchin’s position – which is that increased immigration is responsible for a very large portion of post-1973 wage stagnation and the recent trend toward rising inequality – sounds shocking to current political sensibilities. But all Turchin has to say is:

An imbalance between labor supply and demand clearly played an important role in driving real wages down after 1978. As Harvard economist George J. Borjas recently wrote, “The best empirical research that tries to examine what has actually happened in the US labor market aligns well with economic theory: An increase in the number of workers leads to lower wages.”

My impression was that Borjas was an increasingly isolated contrarian voice, so once again, I just don’t know what to do here.

Second, the plutocratic oppression story relies pretty heavily on the idea that inequality is a unique bad. This fits the zeitgeist pretty well, but it’s a little confusing. Why should commoners care about their wages relative to elites, as opposed to their absolute wages? Although median-wage-relative-to-GDP has gone down over the past few decades, absolute median wage has gone up – just a little, slowly enough that it’s rightly considered a problem – but it has gone up. Since modern wages are well above 1950s wages, in what sense should modern people feel like they are economically badly off in a way 1950s people didn’t? This isn’t a problem for Turchin’s theory so much as a general mystery, but it’s a general mystery I care about a lot. One answer is that the cost disease is fueled by a Baumol effect pegged to per capita income (see part 3 here), and this is a way that increasing elite wealth can absolutely (not relatively) immiserate the lower classes.

Likewise, what about The Spirit Level Delusion and other resources showing that, across countries, inequality is not particularly correlated with social bads? Does this challenge Turchin’s America-centric findings that everything gets worse along with inequality levels?

Third, the plutocratic oppression story meshes poorly with the elite overproduction story. In elite overproduction, united elites are a sign of good times to come; divided elites means dysfunctional government and potential violence. But as Pseudoerasmus points out, united elites are often united against the commoners, and we should expect inequality to be highest at times when the elites are able to work together to fight for a larger share of the pie. But I think this is the opposite of Turchin’s story, where elites unite only to make concessions, and elite unity equals popular prosperity.

Fourth, everything about the elite overproduction story confuses me. Who are “elites”? This category made sense in Secular Cycles, which discussed agrarian societies with a distinct titled nobility. But Turchin wants to define US elites in terms of wealth, which follows a continuous distribution. And if you’re defining elites by wealth, it doesn’t make sense to talk about “not enough high-status positions for all elites”; if you’re elite (by virtue of your great wealth), by definition you already have what you need to maintain your elite status. Turchin seems aware of this issue, and sometimes talks about “elite aspirants” – some kind of upper class who expect to be wealthy, but might or might not get that aspiration fulfilled. But then understanding elite overproduction hinges on what makes one non-rich person a commoner vs. another non-rich person an “elite aspirant”, and I don’t remember any clear discussion of this in the book.

Fifth, what drives elite overproduction? Why do elites (as a percent of the population) increase during some periods and decrease during others? Why should this be a cycle rather than a random walk?

My guess is that Ages of Discord contains answers to some of these questions and I just missed them. But I missed them after reading the book pretty closely to try to find them, and I didn’t feel like there were any similar holes in Secular Cycles. As a result, although the book had some fascinating data, I felt like it lacked a clear and lucid thesis about exactly what was going on.

V.

Accepting the data as basically right, do we have to try to wring some sense out of the theory?

The data cover a cycle and a half. That means we only sort of barely get to see the cycle “repeat”. The conclusion that it is a cycle and not some disconnected trends is based only on the single coincidence that it was 70ish years from the first turning point (1820) to the second (1890), and also 70ish years from the second to the third (1960).

A parsimonious explanation would be “for some reason things were going unusually well around 1820, unusually badly around 1890, and unusually well around 1960 again.” This is actually really interesting – I didn’t know it was true before reading this book, and it changes my conception of American history a lot. But it’s a lot less interesting than the discovery of a secular cycle.

I think the parsimonious explanation is close to what Thomas Piketty argued in his Capital In The Twenty-First Century. Inequality was rising until the World Wars, because that’s what inequality naturally does given reasonable assumptions about growth rates. Then the Depression and World Wars wiped out a lot of existing money and power structures and made things equal again for a little while. Then inequality started rising again, because that’s what inequality naturally does given reasonable assumptions about growth rates. Add in a pinch of The Spirit Level – inequality is a mysterious magic poison that somehow makes everything else worse – and there’s not much left to be explained.

(some exceptions: why was inequality decreasing until 1820? Does inequality really drive political polarization? When immigration corresponds to periods of high inequality, is the immigration causing the inequality? And what about the 50 year cycle of violence? That’s another coincidence we didn’t include in the coincidence list!)

So what can we get from Ages of Discord that we can’t get from Piketty?

First, the concept of “elite overproduction” is one that worms its way into your head. It’s the sort of thing that was constantly in the background of Increasingly Competitive College Admissions: Much More Than You Wanted To Know. It’s the sort of thing you think about when a million fresh-faced college graduates want to become Journalists and Shape The Conversation and Fight For Justice and realistically just end up getting ground up and spit out by clickbait websites. Ages of Discord didn’t do a great job breaking down its exact dynamics, but I’m grateful for its work bringing it from a sort of shared unconscious assumption into the light where we can talk about it.

Second, the idea of a deep link between various indicators of goodness and badness – like wages and partisan polarization – is an important one. It forces me to reevaluate things I had considered settled, like that immigration doesn’t worsen inequality, or that inequality is not a magical curse that poisons everything.

Third, historians have to choose what events to focus on. Normal historians usually focus on the same normal events. Unusual historians sometimes focus on neglected events that support their unusual theses, so reading someone like Turchin is a good way to learn parts of history you’d never encounter otherwise. Some of these I was able to mention above – like the Mine War of 1920 or the cholera epidemic of 1849; I might make another post for some of the others.

Fourth, it tries to link events most people would consider separate – wage stagnation since 1973, the Great Stagnation in technology, the decline of Peter Thiel’s “definite optimism”, the rise of partisan polarization. I’m not sure exactly how it links them or what it has to say about the link, but link them it does.

But the most important thing about this book is that Turchin claims to be able to predict the future. The book (written just before Trump was elected in 2016) ends by saying that “we live in times of intensifying structural-demographic pressures for instability”. The next bigenerational burst of violence is scheduled for about 2020 (realistically +/- a few years). It’s at a low point in the grand cycle, so it should be a doozy.

What about beyond that? It’s unclear exactly where he thinks we are right now in the grand cycle. If the current cycle lasts exactly as long as the last one, we would expect it to bottom out in 2030, but Turchin never claims every cycle is exactly as long. A few of his graphs suggest a hint of curvature, suggesting we might currently be in the worst of it. The socialists seem to have gotten their act together and become an important political force, which the theory predicts is a necessary precursor to change.

I think we can count the book as having made correct predictions if violence spikes in the very near future (is the current number of mass shootings enough to satisfy this requirement? I would have to see it graphed using the same measurements as past spikes), and if sometime in the next decade or so things start looking like there’s a ray of light at the end of the tunnel.

I am pretty interested in finding other ways to test Turchin’s theories. I’m going to ask some of my math genius friends to see if the dynamic feedback models check out; if anyone wants to help, let me know how I can help you (if money is an issue, I can send you a copy of the book, and I will definitely publish anything you find on this blog). If anyone has any other ideas for indicators that should be correlated with the secular cycle, and ideas about how to find them, I’m interested in that too. And if anyone thinks they can explain the elite overproduction issue, please enlighten me.

I ended my review of Secular Cycles by saying:

One thing that strikes me about [Turchin]’s cycles is the ideological component. They describe how, during a growth phase, everyone is optimistic and patriotic, secure in the knowledge that there is enough for everybody. During the stagflation phase, inequality increases, but concern about inequality increases even more, zero-sum thinking predominates, and social trust craters (both because people are actually defecting, and because it’s in lots of people’s interest to play up the degree to which people are defecting). By the crisis phase, partisanship is much stronger than patriotism and radicals are talking openly about how violence is ethically obligatory.

And then, eventually, things get better. There is a new Augustan Age of virtue and the reestablishment of all good things. This is a really interesting claim. Western philosophy tends to think in terms of trends, not cycles. We see everything going on around us, and we think this is some endless trend towards more partisanship, more inequality, more hatred, and more state dysfunction. But Secular Cycles offers a narrative where endless trends can end, and things can get better after all.

This is still the hope, I guess. I don’t have a lot of faith in human effort to restore niceness, community, and civilization. All I can do is pray the Vast Formless Things accomplish it for us without asking us first.


Open & Welcome Thread - September 2019

LessWrong.com news - September 3, 2019 - 05:53
Published on September 3, 2019 2:53 AM UTC

• If it’s worth saying, but not worth its own post, here's a place to put it.
• And, if you are new to LessWrong, here's the place to introduce yourself.
• Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ.

The Open Thread sequence is here.


The Transparent Society: A radical transformation that we should probably undergo

LessWrong.com news - September 3, 2019 - 05:27
Published on September 3, 2019 2:27 AM UTC

In 1998, David Brin published a vision of a potentially inevitable societal shift towards all-encompassing democratised surveillance. It could not be called a panopticon, because those who patrol the observation house would be as visible as anyone else.

We would be able to watch our watchers.

Privacy would disappear, but many kinds of evil would go along with it.

The book was okay. It's definitely worth reading the first few chapters, and some of Brin's reflections on the development of the internet are really interesting, but I found the rest very meandering and a little bit unsatisfying. I'm going to summarise my own understanding of radical societal transparency and why I'm convinced that it would be extremely good, actually, so that I don't have to keep recommending the entire book, which didn't capture my stance very well.

I'm also going to go over some of the potential problems a transparent society would have, and explain why I'm not yet deterred by them. Some of them could turn out to be lethal to the idea, but that seems unlikely to me so far. I'm eager to explore those doubts until we're sure.

It should not surprise anyone that a radically open society would have many advantages. Information is useful. If we know more about each other, we can arrange more trades, and we can trust each other more easily.

In order of importance:

• It would likely prevent most "easy nuke" technological x-risks discussed in Bostrom's black ball paper (The Vulnerable World Hypothesis)
• That is to say, if a very harmful technology rapidly emerges, for instance, a method for manufacturing a species-ending virus requiring only very common lab equipment, the transparent society would be able to police against it without any special laws. Every government could fail utterly to recognise the threat, and we might still be able to survive it; in this example, biotech workers would simply be able to watch each others' behaviour, and make a note of it if anyone reads about this apocalyptic method, keeps copies of the method around, seems to be especially moody, or has, you know, physically gone to the lab at 3am and started to actuate the method. The probability of the species one day being wiped out by a few unbalanced individuals wielding humanity's inevitably growing powers decreases dramatically.
• The easy nukes concern is most often dismissed with an argument that as technology gets stronger, we will also find new technologies to police its misuse. My answer to that is: yes, that's what the transparent society is. What were you expecting? Some kind of anti-biotech that would cancel the dangerous biotech out? (I do not envy whoever has to think about securing the human body against its many attack vectors.) This is the technology that will police against the misuse of other technologies, now help us to deploy it.
• Bostrom discusses surveillance, but does not discuss this position that a surveillance state might be less inclined to tyranny and more liveable if we stopped torturing ourselves with this impossible project of privacy and just let data be free, as data seems to want.
• Preventing crime
• Remember, crime includes rape and trafficking and murder. People do really really unambiguously bad things sometimes and the world would be dramatically improved if they couldn't get away with them any more. If a person does not find that possibility exciting, they might have an affect disorder.
• No doubt, a lot of laws would become too powerful under transparency and would need to be thrown out, but as long as we don't make it a boiled frog thing, there's plenty of energy around right now to get those laws thrown out if legislators can be made to realise we've hit an inflection point and we need to react to it. Still. Worthy of more discussion.
• Watching the watchers
• Politicians, police, and anyone else in a sensitive public position would be under a lot of pressure to be as open as possible about their dealings. This opens the way to having genuinely trustworthy politicians and police, which is a big deal.
• Positive changes to the legal system as a result of detection becoming easier?
• Parole could be a lot more effective. It would be possible, for the first time, to enforce a sentence/treatment "do not speak with or listen to bad influences through any channel"
• Promoting social pressures to donate a lot more
• Humans have always gone to great expense to signal strength and moral purity. We should hope that this energy could be harnessed for useful ends, as in Raikoth's symbolic beads
• If both individuals and organisations had to be more open about their income and expenses, it would be a lot easier to imagine these pressures coming to bear. If the information about people's personal donations were exposed unconditionally, our taboo against discussing them might not be able to hold together. We would not be able to hide our friends' shame about not buying enough bednets.
• That said, I'm confused as to why there is so little social pressure to donate to things as it is. I wonder how much of it is value-dysphoria: knowing that the values we espouse don't quite align with our hearts, everyone knowing it, softening when our friends confide in us that they aren't living up to those values. "It's okay, I understand that it's just not what you sincerely wanted to do." I hope that radical openness will allow us to, first, admit that what is agreed to be good is not always what we as individuals desire to see (to admit that we are not, at heart, altruists, as is plain from the records of our choices), and second, to get closer to figuring out what our real values are, so that we can develop truly humane systems of accountability to pursue those instead.
• Free, complete, and accurate statistics about every facet of human life. Please think of all the science we could do.
• Forcing people to accept and contend with the weirdness of other people, and their own weirdness.
• We would no longer be able to hide from it. This is one thing that makes me hopeful that transparency wouldn't result in the emergence of some new totalitarian normative orthodoxy. There would be heretics everywhere and we'd all be able to hear them (when we choose to) and any crusade short of complete totalitarianism would never be able to completely silence them.
• Automated systems for maintaining information about who owns what and how much they seem to use it, and then using that to arrange mutually beneficial trades (or, if you're of the position that transparency might obviate the need for money, think of it this way: it would be easier for us to notice opportunities to make people's lives better by sharing things with them).
• Worrying less about surveillance capitalism/states.
• With less of an imbalance in hoarded data, The People would have just as much surveillance capacity as Google. Though if the megacorps can analyse the data better than The People can, maybe they still have to worry. More has been written about this, which others could probably recall more easily than I could. Homo Deus anticipated large, transformative effects of Big Data analysis, but I don't remember being moved by any specific claims; maybe Harari cites someone else in those sections? I don't have a copy on hand to check.

• Even in the most open tribes, humans seem to have an instinct for shyness. I'm not sure we know what happens to humans when they're deprived of the opportunity to do things in private. Maybe it's mostly about sex. I dunno. What are the evolutionary teloses underlying humans' coyness about sex?
• There's infidelity, of course. We reserve our right to fuck and not tell our spouses, but it seems to be mostly agreed upon that it's not especially good that we have this right.
• I could imagine there being a thing about obscuring paternity leading to greater tribal cohesion... but I don't think anything like that exists in any developed country to be protected. Also doesn't seem terribly hard to accommodate under standard transparency technologies.
• I guess I'm not very worried about this. In most tribes, people do most things in the company of others. It is strange to us to share so much, but there's nothing unnatural about it, no reason to think humans would thrive less under it.
• The emergence of new, hyper-strong universal orthodoxies.
• Transparency makes it possible to enforce against even the tiniest transgression against a dominant power, which may make the dominant power incontestable.
• (counterargument: see 'forcing people to accept and contend with the weirdness of other people')
• A weakness that a transparent state's non-transparent enemies could easily exploit to destroy them (assuming the future will contain any major wars; since the creation of the atomic bomb it's not clear that we can have major wars any more, but still, worth considering).
• Information imbalance in war
• Imagine that you're playing a game of chess, and the enemy can see all of your pieces, but you can't see any of theirs. You'd be fucked. But that isn't a legitimate analogy; it would be more like a game of chess where the enemy can see all of your pieces and you can only see a subset of theirs.
The question for me is whether the internal cohesion of a transparent society will make it strong enough to win such a war uphill. To extend the analogy: you can't see all of their pieces all of the time, but if your high-trust society, with its perfect economy of complete information, is able to build more pieces than them, maybe you win anyway. You're at a disadvantage, but you also have this other advantage ready to go. Which one weighs more, the advantage or the disadvantage? I don't know. We'll need to experiment. I'm very eager to do that.
• So, the easiest first step would be to actually define and play the Open vs Closed chess variant. I'm a game designer, but I'm not an expert on chess, so I'm not sure what the specific rules should be... White should have more pieces, or more powerful pieces, to represent its stronger economy and internal cohesion. Black should have partially invisible pieces, or maybe White should only see Black's moves two turns after they were made? It's difficult. If anyone reading this knows chess very well, I'd enjoy collaborating on getting this variant designed, played, and reviewed as an analogy.
• (It's tempting to me to propose doing a thing where Black has to deal with some uncertainty about the position of its own pieces, to reflect the awkward realities of not being a transparent society, but I don't think that would be charitable. In a closed society, perhaps the brain does not know what the stomach is doing, but it is still going to know what the arm is doing; it will still be able to coordinate its military with a fair bit of clarity.)
• But it would probably be more informative to talk to wargamers.
• We must wonder why anyone would attack a transparent society when it could demonstrate thoroughly that it is institutionally incapable of being the first to break a treaty. A transparent state could be far more able to prove nonaggression. When it says it isn't plotting anything, an intelligence agency can see directly that it isn't plotting anything.
• I'm not sure whether "not being able to keep technological secrets" counts as a significant weakness. The scarce asset is generally not theory, theory is hard to protect, the scarce asset is usually practitioners.
• The problems a transparent society has with protecting registered intellectual property are no different from the problems of a closed society. It wouldn't be IP if it weren't circulating in the open. The whole idea of IP is much more closely aligned with radical openness than with closedness: a surreal releasing-yet-protecting of private information that enables conversations, inspiration and trade that would otherwise be impossible.
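The delayed-visibility rule floated in the chess-variant bullets above (White only sees Black's moves some number of turns after they were played) is easy to prototype in code before anyone commits to a full ruleset. Here is a minimal sketch in Python; the class and method names are hypothetical, and this is just one possible reading of the rule, not a finished variant design:

```python
from typing import List, Tuple

class DelayedVisibilityLog:
    """Tracks Black's moves and reveals them to White only after
    a fixed delay in turns, per the proposed variant rule."""

    def __init__(self, delay_turns: int = 2):
        self.delay = delay_turns
        self.turn = 0
        self._black_moves: List[Tuple[int, str]] = []  # (turn played, move)

    def record_black_move(self, move: str) -> None:
        self._black_moves.append((self.turn, move))

    def advance_turn(self) -> None:
        self.turn += 1

    def visible_to_white(self) -> List[str]:
        # White sees only moves played at least `delay` turns ago.
        return [move for (played, move) in self._black_moves
                if self.turn - played >= self.delay]

log = DelayedVisibilityLog(delay_turns=2)
log.record_black_move("e7e5")   # Black moves on turn 0
log.advance_turn()              # turn 1: still hidden from White
log.advance_turn()              # turn 2: two turns old, now revealed
print(log.visible_to_white())   # prints ['e7e5']
```

A filter like this could sit between a standard chess engine and White's interface, controlling what board information reaches them; the interesting design questions (what happens when a still-hidden move captures one of White's pieces?) are exactly the ones that would need playtesting.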

So here's what we should try to do in light of all of that:

• Investigate the problem areas described above and try to resolve the difficult remaining questions in the ways suggested. In summary,
• Figure out whether there is potential for a lastingly destructive social consensus monoculture.
• Figure out whether a radically open society with a wealth advantage of about one order of magnitude can survive aggression (war or sabotage) from a closed one.
• Figure out what a good legal system would look like in a transparent society. It is likely to be harder, considering that every law would be consistently enforced.

If the answer to those questions is "It'll be fine, go ahead",

• Develop the relevant technologies. Transparent computing (trusted computing, smart contracts, that kind of thing), cheap recording devices, better wireless networks.
• Promote the culture of radical openness. Pursue the dream of a society where honesty is rewarded, that votes in politicians on the basis of who they really are rather than how good they are at acting. Promote socially positive radically open celebrities. Ensure that the support exists.

If the answer turns out to be "no, this would be bad actually"...

You must still try to deploy the constrained forms of global surveillance and policing proposed in Bostrom's black ball paper. It is well documented that we failed to handle nukes, and only an idiot would bet that nukes are the blackest ball that's gonna come out of the urn.

Disarmament still hasn't happened.

As long as the bomb can be hidden, there will remain an indomitable incentive to have the bomb.

Is there a good reason to think we're going to be able to defuse it with anything short of a total abolition of secrecy?
