# LessWrong.com News

A community blog devoted to refining the art of rationality

### Who are some people you met that were the most extreme on some axis?

Published on December 9, 2019 4:49 PM UTC

Characteristics like:

• person that had the most imaginary friends
• person that was the most extroverted
• person that lived the most online
• person that recorded their life the most
• person that asked the most questions online
• person that was the most secure
• person that was the most caring
• person that was the most frugal
• person that was the most transparent
• person that was the most productive
• person that was the most minimalist
• person that spent the least
• person with the most automated life
• person with the most artificial/mindful way to speak
• person that ate the same thing the most
• person that was the most (emotionally) empathetic
• person that was the most emotional
• person that was the most expressive
• etc.

I'm curious about a general description of them (you should probably avoid outing anyone without their consent).

I also place a lot of value on interacting with such people in general. I find them interesting, and they help me understand a wider range of human experience :)

Discuss

### What are some things you would do (more) if you were less averse to being/looking weird?

Published on December 9, 2019 4:10 PM UTC

Discuss

### Long Bets by Confidence Level

Published on December 9, 2019 2:20 PM UTC

If you want to make a long-term bet, one of your options is to register your bet with the Long Now Foundation as a Long Bet. They have some rules, which are roughly:

• Both parties put up the same amount, at least $200 each.
• Long Bets effectively runs a donor-advised fund (DAF).
• When the bet concludes, the winner chooses a charity to receive the money.
• The charity gets the initial stakes, plus half the investment income.

While people have all sorts of reasons why they might want to use Long Bets, one question is: how confident do you need to be for placing a Long Bet to result in more money going to your preferred charity than just putting the money in a DAF now?

Let's say I claim we'll have talking horses ten years from now, and you're skeptical. You consider betting $1000 against my $1000 via Long Bets. If you win you'll get your $1000 back, my $1000, and half the investment income, which (figuring the stock market returns a nominal 7%) will be ~$967, for a total of ~$2967. On the other hand, if you had just put your $1000 in a DAF you'd have ~$1967.

Is this a good deal? Provided putting the money in a DAF for at least that long would otherwise be your best option, if you're 100% confident that (a) you'll win and (b) Long Bets will still be around, then it's a solid deal: you're up about 50%. On the other hand, the less confident you are, the worse the deal looks. At 60% confidence you're neutral at 6 years, and negative after that. At 75% you're down to neutral at 16 years. At 90%, 32 years. At 99%, 75 years.

For an organization trying to promote long-term thinking, it's surprising they would choose a fee structure that penalizes long-term bets so heavily.

Comment via: facebook

Discuss

### What are the best arguments and/or plans for doing work in "AI policy"?

Published on December 9, 2019 7:04 AM UTC

I'm looking to get oriented in the space of "AI policy": interventions that involve world governments (particularly the US government) and existential risk from strong AI.
When I hear people talk about "AI policy", my initial reaction is skepticism, because (so far) I can think of very few actions that governments could take that seem to help make progress on the core problems of ensuring safe AI. However, I haven't read much about this area, and I don't know what actual policy recommendations people have in mind.

So what should I read to start? Can people link to plans and proposals in the AI policy space? Research papers, general-interest web pages, and one's own models are all admissible.

Thanks.

Discuss

### The Paradox of Robustness

Published on December 9, 2019 1:49 AM UTC

This post builds off 2-D robustness by identifying a conflicting dynamic present in the decomposition of robustness. Vladimir Mikulik illustrates that robustness can be decomposed into two variables, each varying freely on its own axis: robustness in capabilities, and robustness in alignment. In fact, these quantities are not always orthogonal to each other. Sometimes, getting more of one necessarily means getting less of the other. Hence, the "paradox." Let me explain.

Human DNA is considered robust in the following sense: if you mutate a single gene in any given person's genome, the result is highly unlikely to lead to any large functional deficits in that person. On the other hand, humans are not robust in the alignment sense, since our values drift frequently.

Given a hard-coded agent that explicitly computed the consequences of its actions, and then took the action which maximized expected value according to its utility function, we would observe precisely the opposite behavior. Mutate a single line of code and the functionality of this agent would almost certainly be eliminated. However, the agent is still robust in the alignment sense, as its values will never drift.

This pattern shows up frequently in computer science and can be generalized beyond mere "capabilities" versus "alignment".
Perhaps another way of framing the issue is by using the terms robustness in function and robustness in specification. For example, it has long been argued that the problem with "Good Old-Fashioned AI" is that logical systems are not very robust. In light of our previous framing, connectionist systems were intended to solve the problem of robustness in function, at the cost of robustness in specification. Logical systems are robust in specification because their behavior is very predictable off-distribution, but they are not robust in function, in the sense of being able to adapt to sudden, unpredictable changes.

This distinction is relevant because it hits at the core of why we would expect mesa optimizers to come into existence. It is currently common in machine learning to view our models as essentially bags of heuristics. In order to surpass human intelligence, however, our models must eventually exhibit some form of explicit reasoning, such as mathematical or inductive reasoning -- in short, they must look something like a mesa optimizer. Thus, if the current learning paradigm is to create general intelligence, we will necessarily encounter problems endemic to the "old" type of AI.

The stronger version of my claim is that methods of solving robustness in function may ultimately be counterproductive, if what we really wanted was robustness in specification. Here's one oversimplified example of why that might be the case: suppose one's solution to robustness for machine learning was to simply slap together a gargantuan neural network trained across the true distribution of real environments it could encounter. In one sense, the problem has truly been solved. But now our neural network is too big to understand. We might expect it to perform extremely well and be quite robust to changes in its environment, yet be brittle in the sense of having no explicit principles which underlie its goal-seeking.
Unlike a hard-coded approach, which was hand-designed to preserve reflective stability and to be internally coherent and comprehensible, this new neural network is instead a complete mess. So, in one sense, by making our model more robust, we have actually made it more brittle.

Discuss

### The 2018 Review: Helping LessWrong Users Understand Old Posts

Published on December 9, 2019 12:54 AM UTC

LessWrong is currently doing a major review of 2018 — looking back at old posts and considering which of them have stood the test of time. There are three phases:

• Nomination (completed)
• Review (ends Dec 31st)
• Voting on the best posts (ends January 7th)

We’re now in the Review Phase, and there are 75 posts that got two or more nominations. The full list is here. Now is the time to dig into those posts, and for each one ask questions like “What did it add to the conversation?”, “Was it epistemically sound?” and “How do I know these things?”.

The LessWrong team will award $2000 in prizes to the reviews that are most helpful to them for deciding what goes into the Best of 2018 book.

If you’re a nominated author and for whatever reason don’t want one or more of your posts to be considered for the Best of 2018 book, drop me an email at benitopace@gmail.com.

Creating Inputs For LW Users' Thinking

The goal for the next month is for us to try to figure out which posts we think were the best in 2018.

Not which posts were talked about a lot when they were published, or which posts were highly upvoted at the time, but which posts, with the benefit of hindsight, you're most grateful for being published, and are well suited to be part of the foundation of future conversations.

This is in part an effort to reward the best writing, and in part an effort to solve the bandwidth problem (there were more than 2000 posts written in 2018) so that we can build common knowledge of the best ideas that came out of 2018.

With that aim, when I'm reviewing a post, the main question I'm asking myself is

What information can I give to other users to help them think clearly and accurately about whether a given post should be added to our annual journal?

A large part of the review phase is about producing inputs for our collective thinking. With that in mind, I’ve gathered some examples of things you can write that help others understand posts and their impacts.

1) Personal Experience Reports

There were a lot of examples of this in the nomination phase, which I found really useful, and would find useful to read more of. Here are some examples:

This post... may have actually had the single-largest effect size on "amount of time I spent thinking thoughts descending from it."

johnswentworth

A special case here is data from the author themselves, e.g. “Yeah, this has been central to my thinking” or “I didn’t really think about it again” or “I actually changed my mind and think this is useful but wrong”. I would generally be excited for users to review their own posts now that they've had ~1.5 years of hindsight, and I plan to do that for all the posts I've written that were nominated.

If a post had a big or otherwise interesting impact on you, consider writing that up.

2) Big Picture Analysis (e.g. Book Reviews)

There are lots of great book reviews on the web that really help the reader understand the context of the book, and explain what it says and adds to the conversation. Some good examples on LessWrong are the reviews of Pearl's The Book of Why, The Elephant in the Brain, The Secret of Our Success, Consciousness Explained, Design Principles of Biological Circuits, The Case Against Education (part 2, part 3), and The Structure of Scientific Revolutions.

Many of these reviews do a great job of things like

• Talking about how the post fits into the broader conversation on that topic
• Trying to pass the ITT of the author by explaining how they see the world
• Looking at that same topic through their own worldview
• Pointing out places they see things differently and offering alternative hypotheses.

A review of some LessWrong posts would be that time Scott reviewed Inadequate Equilibria. Oh, and don’t forget that time Scott reviewed Inadequate Equilibria.

Many of the posts we’re reviewing are shorter than most of the reviews I linked to, so the format doesn’t apply literally, but much of the spirit of these reviews does. Also check out other short book reviews and consider writing something in that style (e.g. SSC, Thing of Things).

Consider picking a book review style you like and applying it to one of the nominated posts.

3) Testing Subclaims (e.g. Epistemic Spot Checks)

Elizabeth Van Nostrand has written several posts in this style.

For another example, in Scott's review of Secular Cycles, one way he tried to think about the ideas in the book was to gather a bunch of alternative data sets on which to test some of the author’s claims.

These aren't meant to be full reviews of the entire book or paper, or advice on how to judge it overall. They take narrower questions that are definitively answerable, such as whether a random sample of testable claims is literally true, and answer them as fully as possible.

If there is an important subclaim of a post you think you can check, consider trying to verify or falsify it and writing up your results, even partial ones.

Go forth and think out loud!

Discuss

### Books on the zeitgeist of science during Lord Kelvin's time.

Published on December 9, 2019 12:17 AM UTC

There is nothing new to be discovered in physics now. All that remains is more and more precise measurement.

That Lord Kelvin quote has always stuck out to me. I remember hearing from some other source that this was a general attitude at the time.

Does anyone have good book recommendations to get a feel for what the intellectual / scientific zeitgeist of this moment in history in Europe was like?

Discuss

### What determines the balance between intelligence signaling and virtue signaling?

Published on December 9, 2019 12:11 AM UTC

Lately I've come to think of human civilization as largely built on the backs of intelligence and virtue signaling. In other words, civilization depends very much on the positive side effects of (not necessarily conscious) intelligence and virtue signaling, as channeled by various institutions. As evolutionary psychologist Geoffrey Miller says, "it’s all signaling all the way down."

A question I'm trying to figure out now is, what determines the relative proportions of intelligence vs virtue signaling? (Miller argued that intelligence signaling can be considered a kind of virtue signaling, but that seems debatable to me, and in any case, for ease of discussion I'll use "virtue signaling" to mean "other kinds of virtue signaling besides intelligence signaling".) It seems that if you get too much of one type of signaling versus the other, things can go horribly wrong (the link is to Gwern's awesome review/summary of a book about the Cultural Revolution). We're seeing this more and more in Western societies, in places like journalism, academia, government, education, and even business. But what's causing this?

One theory is that Twitter with its character limit, and social media and shorter attention spans in general, have made it much easier to do virtue signaling relative to intelligence signaling. But this seems too simplistic and there has to be more to it, even if it is part of the explanation.

Another idea is that intelligence is valued more when a society feels threatened by an outside force, against which it needs competent people to protect itself. US policy changes after Sputnik are a good example of this. This may also explain why intelligence signaling continues to dominate, or at least is not dominated by, virtue signaling in the rationalist and EA communities (i.e., we're really worried about the threat from Unfriendly AI).

Does anyone have other ideas, or know of more systematic research into this question?

Once we understand the above, here are some followup questions: Is the trend towards more virtue signaling at the expense of intelligence signaling likely to reverse itself? How bad can things get, realistically, if it doesn't? Is there anything we can or should do about the problem? How can we at least protect our own communities from runaway virtue signaling? (The recent calls against appeals to consequences make more sense to me now, given this framing, but I still think they may err too much in the other direction.)

PS, it was interesting to read this in Miller's latest book Virtue Signaling:

Where does the term ‘virtue signaling’ come from? Some say it goes back to 2015, when British journalist/author James Bartholomew wrote a brilliant piece for The Spectator called ‘The awful rise of ‘virtue signaling.’’ Some say it goes back to the Rationalist blog ‘LessWrong,’ which was using the term at least as far back as 2013. Even before that, many folks in the Rationalist and Effective Altruism subcultures were aware of how signaling theory explains a lot of ideological behavior, and how signaling can undermine the rationality of political discussion.

I didn't know that "virtue signaling" was first coined (or at least used in writing) on LessWrong. Unfortunately, from a search, it doesn't seem like there was much substantial discussion around this term. Signaling in general was much discussed on LessWrong and OvercomingBias, but I find myself still updating towards it being more important than I had realized.

Discuss

### Counterfactuals: Smoking Lesion vs. Newcomb's

Published on December 8, 2019 9:02 PM UTC

We will consider a special version of the Smoking Lesion where there is 100% correlation between smoking and cancer - i.e. if you have the lesion, then you smoke and have cancer; if you don't have the lesion, then you don't smoke and don't have cancer. We'll also assume the predictor is perfect in the version of Newcomb's we are considering. Further, we'll assume that the Lesion is outside the "core" part of your brain, which we'll just refer to as the brain, and that it affects the brain by sending hormones to it.

Causality

Notice how similar the problems are. Getting the $1000 or smoking a cigarette is a Small Gain. Getting cancer or missing out on the $1 million is a Big Loss. Anyone who Smokes or Two-Boxes gets a Small Gain and a Big Loss. Anyone who Doesn't Smoke or One-Boxes gets neither.

So while from one perspective these problems might seem the same, they seem different when we try to think about them causally.

For Newcomb's:

• Imagine a One-Boxer counterfactually Two-Boxing
• Then their brain must be that of a Two-Boxer, so they are predicted to Two-Box, so they miss out on the million

Brain --> Decision------------------------------> Outcome

\-----> Prediction ---> Box Contents---/

For Smoking Lesion:

• Imagine that a Non-Smoker counterfactually Smokes
• Then we don't imagine this giving them the Lesion, so they still don't get cancer

Lesion --> Brain ------>Smoking------> Outcome

\-----> Cancer -------------------/

Or at least these are the standard interpretations of these problems. The key question to ask here is: why does it seem reasonable to imagine the predictor changing its prediction if you counterfactually Two-Box, but the lesion remaining the same if you counterfactually Smoke?

The mystery deepens when we realise that in Smoking Lesion, the Lesion is taken to cause both Smoking and Cancer, while in Newcomb's, your Brain causes both your Decision and the Prediction. For some reason, we seem more inclined to cut the link between Smoking and the Lesion than between your Decision and your Brain.

How do we explain this? One possibility is that for the Lesion there is simply more indirection - the link is Lesion -> Brain -> Decision - and that this pushes us to see it as easier to cut. However, I think it's worth paying attention to the specific links. The link between your Brain and your Decision is very tightly coupled. It's hard to imagine a mismatch here without the situation becoming inconsistent. We could imagine a situation where the output of your brain goes to a chip which makes the final decision, but then we've added an entirely new element into the problem and so we hardly seem to be talking about the same problem.

On the other hand, this is much easier to do with the link between the Lesion and Brain - you just imagine the hormones never arriving. That would contradict the problem statement, but it isn't inconsistent physically. But why do we accept this as the same problem?

Some objects in problems "have a purpose" in that if they don't perform a particular function, we'll feel like the problem "doesn't match the description". For example, the "purpose" of your brain is to make decisions and the "purpose" of a predictor is to predict your decisions. If we intervene to break either the Brain-Decision linkage or the Brain-Predictor linkage, then it'll feel like we've "broken the problem".

In contrast, the Lesion has two purposes - to affect your behaviour and whether you have cancer. If we strip it of one, then it still has the other, so the problem doesn't feel broken. In other words, in order to justify breaking a linkage, it's not enough that it just be a past linkage, but we also have to be able to justify that we're still considering the same problem.
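The two intervention conventions described above can be sketched as a toy simulation (all variable and function names here are my own, purely illustrative):

```python
# Smoking Lesion: intervening on Smoking does NOT propagate back to the
# Lesion, so Cancer (caused directly by the Lesion) is unchanged.
def smoking_lesion(lesion, intervene_smoke=None):
    cancer = lesion                                          # Lesion -> Cancer
    smoking = lesion if intervene_smoke is None else intervene_smoke
    return {"smoking": smoking, "cancer": cancer}

# Newcomb: the Prediction is read off the same Brain that produces the
# Decision, so imagining a different decision means imagining a different
# brain -- and hence a different prediction.
def newcomb(two_boxer_brain, intervene_two_box=None):
    brain = two_boxer_brain if intervene_two_box is None else intervene_two_box
    prediction_two_box = brain                               # Brain -> Prediction
    decision_two_box = brain                                 # Brain -> Decision
    million_in_box = not prediction_two_box
    return {"two_boxes": decision_two_box, "gets_million": million_in_box}

# A non-smoker who counterfactually smokes still has no cancer:
print(smoking_lesion(lesion=False, intervene_smoke=True))
# {'smoking': True, 'cancer': False}

# A one-boxer who counterfactually two-boxes loses the million:
print(newcomb(two_boxer_brain=False, intervene_two_box=True))
# {'two_boxes': True, 'gets_million': False}
```

The asymmetry lives entirely in where the intervention is applied: upstream of the brain in the Smoking Lesion, at the brain itself in Newcomb's.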

Reflections

It's interesting to compare my analysis of the Smoking Lesion to CDT. In this particular instance, we intervene at a point in time and only causally flow the effects forward, in the same way that CDT does. However, we haven't completely ignored the inconsistency issue, since we can imagine whatever hormones the lesion releases not actually reaching the brain. This involves ignoring one aspect of the problem, but prevents the physical inconsistency. And the reason why we can do this for the Smoking Lesion, but not Newcomb's Problem, is that the coupling from Lesion to Brain is not as tight as that from Brain to Decision.

The counterfactuals ended up depending on both the physical situation and the socio-linguistic conventions. How tightly various aspects of the situation were bound determined the amount of intervention that would be required to break the linkage without introducing an inconsistency, while the socio-linguistic conventions determined whether the counterfactual was accepted as still being the same problem.

Discuss

### Progress and preservation in IDA

Published on December 8, 2019 7:39 PM UTC

This post arose out of my attempts to understand IDA and ways it could fail. It might help you do the same and could provide useful vocabulary for discussing desiderata for IDA.

We want IDA to satisfy progress (decomposition should make answering questions easier) and preservation (semantics should be retained across transformations). We need progress in each decomposition; furthermore, repeated decompositions must eventually simplify each question to the point where it can be answered directly by a human. Likewise, each decomposition and aggregation of questions and answers must introduce no more than a bounded amount of semantic drift; furthermore, the series of decompositions and aggregations as a whole must also introduce no more than a bounded amount of semantic drift.

IDA

Iterated distillation and amplification (henceforth IDA) is a proposal for improving the capability of human-machine systems to suprahuman levels in complex domains where even evaluation of system outputs may be beyond unaugmented human capabilities. For a detailed explanation of the mechanics, I'll refer you to the original paper just linked, section 0 of Machine Learning Projects for Iterated Distillation and Amplification, or one of the many other explanations floating around the Web.

We can view IDA as dynamic programming with function approximation[1] instead of a tabular cache. Just like the cache in dynamic programming, the machine learning component of IDA is a performance optimization. We can excise it and look at just the divide-and-conquer aspect of IDA in our analysis. Then this simplified IDA roughly consists of: (1) repeatedly decomposing tasks into simpler subtasks; (2) eventually completing sufficiently simple subtasks; and (3) aggregating outputs from subtasks into an output which completes the original, undecomposed task. We'll examine this simplified model[2] in the rest of the post. (If you'd like a more concrete description of the divide-and-conquer component of IDA, there's a runnable Haskell demo here.)
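The simplified model can be sketched in a few lines of Python (a toy illustration of the divide-and-conquer skeleton, not the actual IDA algorithm; the summation task and the dict standing in for the distilled model are my own choices):

```python
# Toy divide-and-conquer core of simplified IDA. The "question" is a tuple
# of numbers and the task is to sum them. `cache` plays the role of the
# distilled model: answers to previously-seen questions are reused.

def decompose(question):
    """Split a hard question into two easier subquestions."""
    mid = len(question) // 2
    return [question[:mid], question[mid:]]

def answer_directly(question):
    """The 'human' can only answer sufficiently simple questions."""
    return question[0]

def aggregate(subanswers):
    """Combine subanswers into an answer to the original question."""
    return sum(subanswers)

def ida_answer(question, cache):
    if question in cache:                  # "distilled" lookup
        return cache[question]
    if len(question) <= 1:                 # simple enough: answer directly
        result = answer_directly(question)
    else:                                  # otherwise: decompose, recurse, aggregate
        result = aggregate(ida_answer(q, cache) for q in decompose(question))
    cache[question] = result
    return result

print(ida_answer((3, 1, 4, 1, 5, 9), {}))  # 23
```

As with dynamic programming, removing the cache changes nothing about what is computed, only how expensively: it is purely a performance optimization, which is why the analysis can safely excise it.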

Safety is progress plus preservation

For type systems, the slogan is "safety is progress plus preservation". Because we're using this only as a cute analogy and organizing framework, we'll not get into the details. But for type systems:

• Progress: "A well-typed term is [...] either [...] a value or it can take a step according to the evaluation rules."
• Preservation: "If a well-typed term takes a step of evaluation, then the resulting term is also well typed."

(Both from (Pierce 2002).)

We also need progress and preservation in IDA. Roughly:

• Progress: A question is easy enough to be answered directly or can be decomposed into easier subquestions.
• Preservation: The answer from aggregating subquestion answers is just as good as answering the original question.

Let's try to make this more precise.

progress

There are several ways we might interpret "easier". One that seems to have some intuitive appeal is that one question is easier than another if it can be answered with fewer computational resources[3].

Regardless, we'll say that we satisfy progress_qa if a question Q is decomposed into subquestions q1, …, qn such that every subquestion qi is not harder than Q and at least one is easier. This is the most obvious thing that IDA is supposed to provide---a way to make hard problems tractable.

But just noting the existence of such a decomposition isn't enough. We also need to be able to find and carry out such a decomposition more easily than answering the original question. We'll call this property progress↓. progress↑ demands that we be able to find and carry out an aggregation of subquestion answers that's easier than answering the original question.

Each of these three properties is necessary but they are not even jointly sufficient for progress[4]---it could be the case that each of decomposition, answering and aggregation is easier than answering the original question but that all three together are not.
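Under the additive model of difficulty mentioned in footnote [4], the jointly-sufficient check would look like this (a sketch under that assumption; the cost numbers are invented):

```python
# Overall progress under an (assumed) additive difficulty model:
# decomposing, answering the subquestions, and aggregating must
# *together* be no harder than answering the original question directly.

def satisfies_progress(direct_cost, decompose_cost, subanswer_costs, aggregate_cost):
    total = decompose_cost + sum(subanswer_costs) + aggregate_cost
    return total <= direct_cost

# Each piece alone is cheaper than direct answering (cost 10),
# yet the three together are not -- the failure mode described above:
print(satisfies_progress(10, 4, [4, 4], 4))  # False
print(satisfies_progress(10, 1, [3, 3], 1))  # True
```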

We can also view this graphically. In the figure below representing a single step of decomposition and aggregation, we want it to be the case that the computation represented by the arrow from original Q0 to corresponding answer A0 is harder than any of the computations represented by the other arrows.

progress_qa, progress↓ and progress↑ mean that the top arrow from Q0 to A0 represents a more difficult computation than the bottom, left, and right arrows, respectively.

preservation

There are also several possible interpretations of "as good as". To start with, let's assume it means that one question and answer pair is just as good as another if they have exactly the same denotation.

We say that a decomposition satisfies preservation↓ if the denotations of (Q, A) and (Q, aggregate*(answer*(decompose(Q)))) are identical, where (Q, A) is a question and answer pair, aggregate* is an ideal aggregation, and answer* is an ideal answering algorithm. We say that an aggregation satisfies preservation↑ if the denotations of (Q, A) and (Q, aggregate(answer*(decompose*(Q)))) are identical, where (Q, A) is a question and answer pair, decompose* is an ideal decomposition, and answer* is an ideal answering algorithm. (A trailing asterisk marks an idealized operation.)

Explained differently, preservation↓ requires that the below diagram commute while assuming that answering and aggregation are ideal. preservation↑ requires that the diagram commute while assuming that answering and decomposition are ideal.

preservation↓ means that the diagram commutes with an ideal bottom and right arrow. preservation↑ means that the diagram commutes with an ideal bottom and left arrow.

PROGRESS

progress_qa actually isn't sufficient for our purposes---it could be the case that a series of decompositions produces easier and easier questions but never actually produces questions that are simple enough for a human to answer directly. We name the requirement that our decompositions eventually produce human-answerable subquestions PROGRESS_qa.

PRESERVATION

Now let's relax our definition of "as good as" a bit since it's quite demanding. Instead of requiring that the question and answer pairs have exactly the same denotation, we allow some wiggle room. We could do this in a variety of ways including: (1) suppose there is some metric space of meanings and require that the denotations are within ϵ of each other; (2) require that acting on either question-answer pair produces the same expected utility; (3) require that the utilities produced by acting on each question-answer pair are within ϵ of each other. For the sake of discussion let's assume something like (1) or (3).

Hopefully, the corresponding versions of preservation↓ and preservation↑ under this new interpretation of "as good as" are clear. (Briefly, the aggregated, answered decomposition should be within ϵ of the original answer.)

Unfortunately, the new interpretation means that the single-step (i.e. just one level of decomposition and aggregation) properties are no longer sufficient to guarantee multi-step preservation. It could be the case that each step introduces skew less than ϵ but that the cumulative skew between the original question and a fully decomposed set of human-answerable questions exceeds ϵ. We'll call the requirement that the series of decompositions maintain skew less than ϵ, PRESERVATION↓, and that the series of aggregations maintains skew less than ϵ, PRESERVATION↑.
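A tiny numeric illustration of this failure mode (the skew values are invented for the example):

```python
# Even if every individual decomposition step keeps semantic skew under
# the tolerance eps, the cumulative drift over a series of steps can
# still exceed it (worst case: skews simply add up).

eps = 1.0
per_step_skew = 0.4   # each step individually stays under eps
steps = 5

single_step_ok = per_step_skew < eps
cumulative_skew = per_step_skew * steps
multi_step_ok = cumulative_skew < eps

print(single_step_ok)   # True:  preservation holds at every step
print(multi_step_ok)    # False: PRESERVATION fails over the series
```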

PRESERVATION↓ means that the left-hand side of the diagram doesn't break commutativity. PRESERVATION↑ means that the right-hand side doesn't break commutativity.

Summary

For every question, there must be a full decomposition into human-answerable questions satisfying PROGRESS_qa, and each decomposed set of questions along the way must satisfy each of progress_qa, progress↓, and progress↑. That full decomposition must satisfy PRESERVATION↓ and the corresponding full aggregation must satisfy PRESERVATION↑. Each decomposition and aggregation along the way must satisfy preservation↓ and preservation↑.

progress and preservation properties apply to single steps of decomposition and aggregation. PROGRESS and PRESERVATION properties apply to repeated decomposition and aggregation.

References

Pierce, Benjamin C. 2002. Types and Programming Languages. MIT Press.

1. Asking whether IDA problems have the optimal substructure and overlapping subproblems that dynamic programming requires also seems fruitful. ↩︎

2. This should be okay because function approximation only makes the problems of progress and preservation harder. ↩︎

3. Of course, "computational resources" is a leaky abstraction. ↩︎

4. If we settled on a precise notion of "easier", we could specify what would be sufficient. For example, if difficulties simply add, the overall progress requirement would be that the sum of the difficulties of decomposition, aggregation and answering is no more than the difficulty of answering the original question directly. ↩︎

Discuss

### Confabulation

Published on December 8, 2019 10:18 AM UTC

When someone asks me "why" I did or said something I usually lie because the truthful answer is "I don't know". I literally don't know why I make >99% of my decisions. I think through none of these decisions rationally. It's usually some mixture of gut instinct, intuition, cultural norms, common sense and my emotional state at the time.

Instead, I make up a rational-sounding answer on the spot. Even when writing a mathematical proof I'll start with an answer and then rationalize it post hoc. If I'm unusual it's because I confabulate consciously. Most humans confabulate unconsciously. This is well-established through studies of split-brained patients, Anton's syndrome, Capgras' syndrome and choice blindness.

Confabulation is lazy evaluation. At the end of the day it's more important to be less wrong than more rational. Confabulation is cheaper and faster than reason. If you can confabulate the right behavior then you should confabulate instead of reasoning.

Confabulation becomes a problem when you misconstrue it for reason. A long time ago I wanted to understand Christianity so I asked a Christian a series of "why" questions the way I'd ask a physicist. His answers became increasingly confabulated until he eventually accused me of attacking him. I have stayed friends with another Christian from the same church who simply declares "I don't know".

Mathematics is a unique field because if you put any two mathematicians in a room with a mountain of stationery then they will eventually agree on what they can and can't prove. This is performed by confabulating smaller and smaller inductive leaps until they're all reduced to trivialities.

We perform a less rigorous form of proof writing when we rationally explain a decision. Rationality is best rationed to points of disagreement. Lazy evaluation is adequate for the vast swathes of human agreement. In this way reason and confabulation coexist mutualistically. One cannot exist without the other. Together they constitute rationality.

Which brings us to antimemes.

Antimemes are self-keeping secrets. Occasionally you'll stumble upon one by accident. When this happens you'll unconsciously shy away from it the way your eyes drift from the building next to the Leaky Cauldron to the one on the other side. Normally this is the end of things. You move on and forget about it. Your mind stitches over the antimeme the same way it stitches over your blind spots. But if someone else draws your attention to the antimeme then you will emit a series of confabulations.

Saying things makes you believe them. Just thinking you said something (even when you didn't) makes you believe it. The more confabulations you state to defend an antimeme the harder it'll get for you to catch that antimeme. You're digging yourself deeper into a conceptual hole. It is therefore imperative to short-circuit antimemetic confabulation as early as you can.

How do you distinguish antimemetic confabulations from the rationally symbiotic kind?

Unlike good confabulations, antimemetic confabulations will make you increasingly uncomfortable. You might even get angry. The distractions feel like being in the brain of a beginner meditator or distractible writer. You can recognize this pattern as an antimemetic signature. People love explaining things. If you feel uncomfortable showing off your knowledge it's probably because you have something to hide.

Once you've identified antimemetic confabulation all you have to do is set your ego aside, admit ignorance and methodically sift through the data. You'll find it eventually.

Discuss

### How do you create a sequence?

8 декабря, 2019 - 11:36
Published on December 8, 2019 8:36 AM UTC

How do you create a sequence linking several posts together? I can't find a button in the UI. Is there a setting I need to enable? Is this feature only available to certain people?

Discuss

### Credibility, Peaceful bargaining

8 декабря, 2019 - 08:26
Published on December 8, 2019 5:26 AM UTC

Credibility is a central issue in strategic interaction. By credibility, we refer to the issue of whether one agent has reason to believe that another will do what they say they will do. Credibility (or lack thereof) plays a crucial role in the efficacy of contracts (Fehr et al., 1997; Bohnet et al., 2001), negotiated settlements for avoiding destructive conflict (Powell, 2006), and commitments to carry out (or refuse to give in to) threats (e.g., Kilgour and Zagare 1991; Konrad and Skaperdas 1997).

In game theory, the fact that Nash equilibria (Section 1.1) sometimes involve non-credible threats motivates a refined solution concept called subgame perfect equilibrium (SPE). An SPE is a Nash equilibrium of an extensive-form game in which a Nash equilibrium is also played at each subgame. In the threat game depicted in Figure 1, “carry out” is not played in an SPE, because the threatener has no reason to carry out the threat once the threatened party has refused to give in; that is, “carry out’’ isn’t a Nash equilibrium of the subgame played after the threatened party refuses to give in.

So in an SPE-based analysis of one-shot threat situations between rational agents, threats are never carried out because they are not credible (i.e., they violate subgame perfection).

However, agents may establish credibility in the case of repeated interactions by repeatedly making good on their claims (Sobel, 1985). Secondly, despite the fact that carrying out a threat in the one-shot threat game violates subgame perfection, it is a well-known result from behavioral game theory that humans typically refuse unfair splits in the Ultimatum Game [1] (Güth et al., 1982; Henrich et al., 2006), which is equivalent to carrying out the threat in the one-shot threat game. So executing commitments which are irrational (by the SPE criterion) may still be a feature of human-in-the-loop systems (Section 6), or perhaps systems which have some humanlike game-theoretic heuristics in virtue of being trained in multi-agent environments (Section 5.2). Lastly, threats may become credible if the threatener has credibly committed to carrying out the threat (in the case of the game in Fig. 1, this means convincing the opponent that they have removed the option (or made it costly) to “Not carry out’’). There is a considerable game-theoretic literature on credible commitment, both on how credibility can be achieved (Schelling, 1960) and on the analysis of games under the assumption that credible commitment is possible (Von Stackelberg, 2010; Nash, 1953; Muthoo, 1996; Bagwell, 1995).

3.1 Commitment capabilities

It is possible that TAI systems may be relatively transparent to one another; capable of self-modifying or constructing sophisticated commitment devices; and making various other “computer-mediated contracts’’ (Varian, 2010); see also the lengthy discussions in Garfinkel and Dafoe (2019) and Kroll et al. (2016), discussed in Footnote 1, of potential implications of cryptographic technology for credibility.
We want to understand how plausible changes in the ability to make credible commitments affect risks from cooperation failures.

• In what ways does artificial intelligence make credibility more difficult, rather than less so? For instance, AIs lack evolutionarily established mechanisms (like credible signs of anger; Hirshleifer 1987) for signaling their intentions to other agents.

• The credibility of an agent’s stated commitments likely depends on how interpretable [2] that agent is to others. What are the possible ways in which interpretability may develop, and how does this affect the propensity to make commitments? For instance, in trajectories where AI agents are increasingly opaque to their overseers, will these agents be motivated to make commitments while they are still interpretable enough to overseers that these commitments are credible?

• In the case of training regimes involving the imitation of human exemplars (see Section 6), can humans also make credible commitments on behalf of the AI system which is imitating them?

3.2 Open-source game theory

Tennenholtz (2004) introduced program games, in which players submit programs that have access to the source codes of their counterparts. Program games provide a model of interaction under mutual transparency. Tennenholtz showed that in the Prisoner’s Dilemma, both players submitting Algorithm 1 is a program equilibrium (that is, a Nash equilibrium of the corresponding program game). Thus agents may have incentive to participate in program games, as these promote more cooperative outcomes than the corresponding non-program games.

Algorithm 1: Tennenholtz (2004)'s construction of a program equilibrium of the one-shot Prisoner's Dilemma. The program cooperates if its counterpart's program's source code is identical to its own (and thus both players cooperate), and defects otherwise.

Input: program source codes s1, s2
if s1 = s2 then return Cooperate
else return Defect
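Algorithm 1 can be sketched in runnable form as a toy model, with each submitted program represented by an explicit source string plus a behavior function (the names below are illustrative, not from the paper):

```python
# Toy model of a program game: a program sees both source strings and
# returns a move. clique_bot mirrors Algorithm 1: cooperate iff the
# counterpart's source is identical to its own.
def clique_bot(own_src: str, other_src: str) -> str:
    return "Cooperate" if own_src == other_src else "Defect"

def defect_bot(own_src: str, other_src: str) -> str:
    return "Defect"

CLIQUE_SRC = "clique_bot_v1"  # stand-in for the program's source code
DEFECT_SRC = "defect_bot_v1"

def play(prog1, src1, prog2, src2):
    """Run a one-shot program game and return both moves."""
    return prog1(src1, src2), prog2(src2, src1)

print(play(clique_bot, CLIQUE_SRC, clique_bot, CLIQUE_SRC))  # ('Cooperate', 'Cooperate')
print(play(clique_bot, CLIQUE_SRC, defect_bot, DEFECT_SRC))  # ('Defect', 'Defect')
```

Note why this is an equilibrium: any unilateral deviation from clique_bot changes the submitted source, so clique_bot defects against the deviator, who can then do no better than mutual defection.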

For these reasons, program games may be helpful to our understanding of interactions among advanced AIs.

Other models of strategic interaction between agents who are transparent to one another have been studied (more on this in Section 5); following Critch (2019), we will call this broader area open-source game theory. Game theory with source-code transparency has been studied by Fortnow 2009; Halpern and Pass 2018; LaVictoire et al. 2014; Critch 2019; Oesterheld 2019, and models of multi-agent learning under transparency are given by Brafman and Tennenholtz (2003); Foerster et al. (2018). But open-source game theory is in its infancy and many challenges remain [3].

• The study of program games has, for the most part, focused on the simple setting of two-player, one-shot games. How can (cooperative) program equilibrium strategies be automatically constructed in general settings?

• Under what circumstances would agents be incentivized to enter into open-source interactions?

• How can program equilibrium be made to promote more efficient outcomes even in cases of incomplete access to counterparts’ source codes?

• As a toy example, consider two robots playing a single-shot program prisoner’s dilemma, in which their respective moves are indicated by a simultaneous button press. In the absence of verification that the output of the source code actually causes the agent to press the button, it is possible that the output of the program does not match the actual physical action taken. What are the prospects for closing such "credibility gaps’’? The literature on (physical) zero-knowledge proofs (Fisch et al., 2014; Glaser et al., 2014) may be helpful here.

• See also the discussion in Section 5.1 on multi-agent learning under varying degrees of transparency.

4 Peaceful bargaining mechanisms

In other sections of the agenda, we have proposed research directions for improving our general understanding of cooperation and conflict among TAI systems. In this section, on the other hand, we consider several families of strategies designed to actually avoid catastrophic cooperation failure. The idea of such "peaceful bargaining mechanisms'' is, roughly speaking, to find strategies which are 1) peaceful (i.e., avoid conflict) and 2) preferred by rational agents to non-peaceful strategies[4].

In the first subsection, we present some directions for identifying mechanisms which could implement peaceful settlements, drawing largely on existing ideas in the literature on rational bargaining. In the second subsection we sketch a proposal for how agents might mitigate downsides from threats by effectively modifying their utility function. This proposal is called surrogate goals.

4.1 Rational crisis bargaining

As discussed in Section 1.1, there are two standard explanations for war among rational agents: credibility (the agents cannot credibly commit to the terms of a peaceful settlement) and incomplete information (the agents have differing private information which makes each of them optimistic about their prospects of winning, and incentives not to disclose or to misrepresent this information).

Fey and Ramsay (2011) model crisis bargaining under incomplete information. They show that, in 2-player crisis bargaining games with voluntary agreements (players are able to reject a proposed settlement if they think they will be better off going to war), mutually known costs of war, unknown types θ1,θ2 measuring the players' military strength, a commonly known function p(θ1,θ2) giving the probability of player 1 winning when the true types are θ1,θ2, and a common prior over types, a peaceful settlement exists if and only if the costs of war are sufficiently large. Such a settlement must compensate each player's strongest possible type by the amount they expect to gain in war.
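As a toy illustration of the "costs sufficiently large" condition (an assumed functional form and hypothetical numbers, not Fey and Ramsay's model verbatim): with a unit-value prize, if player i's strongest type expects p̄i − ci from war, a peaceful split compensating both strongest types fits inside the prize exactly when those expected war payoffs sum to at most 1.

```python
# Toy feasibility check (assumed form, hypothetical numbers): each
# player's strongest type must receive at least its expected war
# payoff p_bar_i - c_i, and the two shares must fit in a unit prize.
def settlement_exists(p_bar_1: float, c_1: float,
                      p_bar_2: float, c_2: float) -> bool:
    return (p_bar_1 - c_1) + (p_bar_2 - c_2) <= 1

print(settlement_exists(0.7, 0.1, 0.6, 0.1))  # False: costs too small
print(settlement_exists(0.7, 0.2, 0.6, 0.2))  # True
```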

Potential problems facing the resolution of conflict in such cases include:

• Reliance on common prior μ and agreed-upon win probability model p(θ1,θ2). If players disagree on these quantities it is not clear how bargaining will proceed. How can players come to an agreement on these quantities, without generating a regress of bargaining problems? One possibility is to defer to a mutually trusted party to estimate these quantities from publicly observed data. This raises its own questions. For example, what conditions must a third party satisfy so that their judgements are trusted by each player? (Cf. Kydd (2003), Rauchhaus (2006), and sources therein on mediation).

• The exact costs of conflict to each player ci are likely to be private information, as well. The assumption of a common prior, or the ability to agree upon a prior, may be particularly unrealistic in the case of costs.

Recall that another form of cooperation failure is the simultaneous commitment to strategies which lead to catastrophic threats being carried out (Section 2.2). Such "commitment games'' may be modeled as a game of Chicken (Table 1), where Defection corresponds to making commitments to carry out a threat if one's demands are not met, while Cooperation corresponds to not making such commitments. Thus we are interested in bargaining strategies which avoid mutual Defection in commitment games. Such a strategy is sketched in Example 4.1.1.

Example 4.1.1 (Careful commitments).

Consider two agents with access to commitment devices. Each may decide to commit to carrying out a threat if their counterpart does not forfeit some prize (of value 1 to each party, say). As before, call this decision D. However, they may instead commit to carrying out their threat only if their counterpart does not agree to a certain split of the prize (say, a split in which Player 1 gets p). Call this commitment Cp, for "cooperating with split p''.

When would an agent prefer to make the more sophisticated commitment Cp? In order to say whether an agent expects to do better by making Cp, we need to be able to say how well they expect to do in the "original'' commitment game where their choice is between D and C. This is not straightforward, as Chicken admits three Nash equilibria. However, it may be reasonable to regard the players' expected values under the mixed strategy Nash equilibrium as the values they expect from playing this game. Thus, split p could be chosen such that p and 1−p exceed player 1 and 2's respective expected payoffs under the mixed strategy Nash equilibrium. Many such splits may exist. This calls for selecting among the possible splits p, for which we may turn to a bargaining solution concept such as Nash (Nash, 1950) or Kalai-Smorodinsky (Kalai and Smorodinsky, 1975). If each player uses the same bargaining solution, then each will prefer committing to honor the resulting split of the prize over playing the original threat game, and carried-out threats will be avoided.
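To make Example 4.1.1 concrete, here is a sketch with illustrative (made-up) Chicken payoffs, computing the mixed-strategy equilibrium value that a proposed split p must beat:

```python
from fractions import Fraction

# Illustrative symmetric Chicken payoffs for the commitment game
# (these numbers are assumptions, not from the text):
# C = don't commit to the threat, D = commit.
CC, CD = Fraction(0), Fraction(-1)   # row plays C vs opponent's C / D
DC, DD = Fraction(1), Fraction(-10)  # row plays D vs opponent's C / D

def mixed_equilibrium_value():
    """Row player's expected payoff in the symmetric mixed-strategy NE."""
    # The opponent defects with probability q chosen to make the row
    # player indifferent: q*CD + (1-q)*CC == q*DD + (1-q)*DC.
    q = Fraction(DC - CC, (DC - CC) + (CD - DD))
    return q * CD + (1 - q) * CC

v = mixed_equilibrium_value()
print(v)  # -1/10 with these payoffs
# Any split (p, 1-p) of the unit prize with p > v and 1-p > v is
# preferred by both players; symmetry suggests the Nash bargaining
# split p = 1/2.
```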

Of course, this mechanism is brittle in that it relies on a single take-it-or-leave-it proposal which will fail if the agents use different bargaining solutions, or have slightly different estimates of each players' payoffs. However, this could be generalized to a commitment to a more complex and robust bargaining procedure, such as an alternating-offers procedure (Rubinstein 1982; Binmore et al. 1986; see Muthoo (1996) for a thorough review of such models) or the sequential cooperative bargaining procedure of Van Damme (1986).

Note that in the case where there is uncertainty over whether each player has a commitment device, sufficiently high stakes will mean that players with commitment devices still face Chicken-like payoffs. So this model can be straightforwardly extended to cases where the credibility of a threat comes in degrees. An example of a simple bargaining procedure to commit to is the Bayesian version of the Nash bargaining solution (Harsanyi and Selten, 1972).

Lastly, see Kydd (2010)'s review of potential applications of the literature on rational crisis bargaining to resolving real-world conflict.

4.2 Surrogate goals [5]

In this section we introduce surrogate goals, a recent [6] proposal for limiting the downsides from cooperation failures (Baumann, 2017, 2018) [7]. We will focus on the phenomenon of coercive threats (for game-theoretic discussion see Ellsberg (1968); Harrenstein et al. (2007)), though the technique is more general. The proposal is: In order to deflect threats against the things it terminally values, an agent adopts a new (surrogate) goal [8]. This goal may still be threatened, but threats carried out against this goal are benign. Furthermore, the surrogate goal is chosen such that it incentivizes at most marginally more threats.

In Example 4.2.1, we give an example of an operationalization of surrogate goals in a threat game.

Example 4.2.1 (Surrogate goals via representatives)

Consider the game between Threatener and Target, where Threatener makes a demand of Target, such as giving up some resource. Threatener can — at some cost — commit to carrying out a threat against Target. Target can likewise commit to give in to such threats or not. A simple model of this game is given in the payoff matrix in Table 3 (a normal-form variant of the threat game discussed in Section 3 [9]).

Unfortunately, players may sometimes play (Threaten, Not give in). For example, this may be due to uncoordinated selection among the two pure-strategy Nash equilibria ((Give in, Threaten) and (Not give in, Not threaten)).

But suppose that, in the above scenario, Target is capable of certain kinds of credible commitments, or otherwise is represented by an agent, Target’s Representative, who is. Then Target or Target’s Representative may modify its goal architecture to adopt a surrogate goal whose fulfillment is not actually valuable to that player, and which is slightly cheaper for Threatener to threaten. (More generally, Target could modify itself to commit to acting as if it had a surrogate goal in threat situations.) If this modification is credible, then it is rational for Threatener to threaten the surrogate goal, obviating the risk of threats against Target’s true goals being carried out.

As a first pass at a formal analysis: Adopting an additional threatenable goal adds a column to the payoff matrix, as in Table 4. And this column weakly dominates the old threat column (i.e., the threat against Target’s true goals). So a rational player would never threaten Target’s true goal. Target does not themselves care about the new type of threats being carried out, so for her, the utilities are given by the blue numbers in Table 4.
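As a toy check of that dominance claim (the payoff numbers below are illustrative assumptions; Table 4 itself is not reproduced here):

```python
# Hypothetical Threatener payoffs: threatening the surrogate goal is
# slightly cheaper than threatening the true goal, so its column
# weakly (here strictly) dominates for the Threatener.
PRIZE, COST_TRUE, COST_SURROGATE = 2.0, 1.0, 0.9

threatener_payoff = {
    ("threaten_true", "give_in"): PRIZE - COST_TRUE,            # 1.0
    ("threaten_true", "refuse"): -COST_TRUE,                    # -1.0
    ("threaten_surrogate", "give_in"): PRIZE - COST_SURROGATE,  # 1.1
    ("threaten_surrogate", "refuse"): -COST_SURROGATE,          # -0.9
}

def weakly_dominates(a: str, b: str) -> bool:
    """a weakly dominates b if it does at least as well vs every reply."""
    return all(threatener_payoff[(a, r)] >= threatener_payoff[(b, r)]
               for r in ("give_in", "refuse"))

print(weakly_dominates("threaten_surrogate", "threaten_true"))  # True
```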

This application of surrogate goals, in which a threat game is already underway but players have the opportunity to self-modify or create representatives with surrogate goals, is only one possibility. Another is to consider the adoption of a surrogate goal as the choice of an agent (before it encounters any threat) to commit to acting according to a new utility function, rather than the one which represents their true goals. This could be modeled, for instance, as an extensive-form game of incomplete information in which the agent decides which utility function to commit to by reasoning about (among other things) what sorts of threats having the utility function might provoke. Such models have a signaling game component, as the player must successfully signal to distrustful counterparts that it will actually act according to the surrogate utility function when threatened. The game-theoretic literature on signaling (Kreps and Sobel, 1994) and the literature on inferring preferences in multi-agent settings (Yu et al., 2019; Lin et al., 2019) may suggest useful models.

The implementation of surrogate goals faces a number of obstacles. Some problems and questions include:

• The surrogate goal must be credible, i.e., threateners must believe that the agent will act consistently with the stated surrogate goal. TAI systems are unlikely to have easily-identifiable goals, and so must signal their goals to others through their actions. This raises questions both of how to signal so that the surrogate goal is at all credible, and how to signal in a way that doesn’t interfere too much with the agent’s true goals. One possibility in the context of Example 4.2.1 is the use of zero-knowledge proofs (Goldwasser et al., 1989; Goldreich and Oren, 1994) to reveal the Target's surrogate goal (but not how they will actually respond to a threat) to the Threatener.

• How does an agent come to adopt an appropriate surrogate goal, practically speaking? For instance, how can advanced ML agents be trained to reason correctly about the choice of surrogate goal?

• The reasoning which leads to the adoption of a surrogate goal might in fact lead to iterated surrogate goals. That is, after having adopted a surrogate goal, Target may adopt a surrogate goal to protect that surrogate goal, and so on. Given that Threatener must be incentivized to threaten a newly adopted surrogate goal rather than the previous goal, this may result in Target giving up much more of its resources than it would if only the initial surrogate goal were threatened.

• How do surrogate goals interact with open-source game theory (Sections 3.2 and 5.1)? For instance, do open source interactions automatically lead to the use of surrogate goals in some circumstances?

• In order to deflect threats against the original goal, the adoption of a surrogate goal must lead to a similar distribution of outcomes as the original threat game (modulo the need to be slightly cheaper to threaten). Informally, Target should expect Target’s Representative to have the same propensity to give in as Target; how this is made precise depends on the details of the formal surrogate goals model.

A crucial step in the investigation of surrogate goals is the development of appropriate theoretical models. This will help to gain traction on the problems listed above.

1. The Ultimatum Game is the 2-player game in which Player 1 proposes a split (pX,(1−p)X) of an amount of money X, and Player 2 accepts or rejects the split. If they accept, both players get the proposed amount, whereas if they reject, neither player gets anything. The unique SPE of this game is for Player 1 to propose as little as possible, and for Player 2 to accept the offer. ↩︎

2. See Lipton (2016); Doshi-Velez and Kim (2017) for recent discussions of interpretability in machine learning. ↩︎

3. See also Section 5.1 for discussion of open-source game theory in the context of contemporary machine learning, and Section 2 for policy considerations surrounding the implementation of open-source interaction. ↩︎

4. More precisely, we borrow the term "peaceful bargaining mechanisms'' from Fey and Ramsay (2011), for whom a "peaceful mechanism'' is a mapping from each player's type to a payoff such that the probability of war is 0 for every set of types. ↩︎

5. This subsection is based on notes by Caspar Oesterheld. ↩︎

2. The idea of modifying preferences in order to get better outcomes for each player was earlier discussed by Raub (1990) under the name "preference adaptation’’; he applied it to the promotion of cooperation in the one-shot Prisoner’s Dilemma. ↩︎

7. See also the discussion of surrogate goals and related mechanisms in Christiano and Wiblin (2019). ↩︎

8. Modifications of an agent’s utility function have been discussed in other contexts. Omohundro (2008) argues that "AIs will try to preserve their utility functions’’ and "AIs will try to prevent counterfeit utility’’. Everitt et al. (2016) present a formal model of a reinforcement learning agent who is able to modify its utility function, and study conditions under which agents self-modify. ↩︎

9. Note that the normal form representation in Table 3 is over-simplifying; it assumes the credibility of threats, which we saw in Section 3 to be problematic. For simplicity of exposition, we will nevertheless focus on this normal-form game in this section. ↩︎

Discuss

### What do the Charter Cities Institute likely mean when they refer to long term problems with the use of eminent domain?

8 декабря, 2019 - 03:53
Published on December 8, 2019 12:53 AM UTC

Without using eminent domain, a large chunk of the possible future value goes to surrounding land-owners who may have done little or nothing to create that value. It does not seem economically possible to build a city that is cheap to live in without locking down the price of land in some way, at some point. It is not obvious how to do this well, but eminent domain seems to be a necessary component of it. If even fairly rural land starts out pre-speculated, there is no hope: there is a value/livability ceiling that cities can never rise above.

Apparently the Charter Cities Institute, for all their dreams, does not dream of transcending that limit any time soon. They seem to reject the use of eminent domain completely.

From their FAQ

How do you minimize the risk of charter city developers using eminent domain to secure land?

The Charter Cities Institute will never become involved with a project that takes land from its rightful [weasel words?] owners. Generally, charter cities are decades long projects. As such, we encourage developers to take long term perspectives. While eminent domain might save money in the short run, it delegitimizes the charter city and sets up a host of problems later on.

Does anyone know what they're talking about? A search of their site didn't turn up anything.

Discuss

### The Lesson To Unlearn

8 декабря, 2019 - 03:50
Published on December 8, 2019 12:50 AM UTC

The most damaging thing you learned in school wasn't something you learned in any specific class. It was learning to get good grades.

Discuss

### Ungendered Spanish

7 декабря, 2019 - 19:20
Published on December 7, 2019 4:20 PM UTC

Spanish has grammatical gender in a way English doesn't:

una amiga ruidosa — a loud (female) friend
un amigo ruidoso — a loud (male) friend
unas amigas ruidosas — some loud (female) friends
unos amigos ruidosos — some loud (not-all-female) friends

I remember when I was studying Spanish, learning the rule that even if you had a hundred girls and one boy you would use the male plural. My class all thought this was very sexist and unfair, but our teacher told us we were misapplying American norms and intuitions.

It's been interesting, ~twenty years later, following the development of gender-neutral ‑e:

unes amigues ruidoses — some loud friends

Spanish gender-neutral ‑e has something in common with English singular they that makes me optimistic about it: a decent path through which it could become a standard and unremarkable part of the language. For they this has looked like:

• Existing long-standing use in someone and everyone constructions: someone lost their fork.

• Usage expands into more generic constructions: I hear you have a new lab partner; what are they like?

• Many non-binary people adopt it as their pronoun. People get practice referring to specific named individuals with it: Pat said they might be early.

• Usage expands into cases where the person's gender is not relevant: The person who gave me a ride home from the dance last night doesn't take care of their car.

• [prediction] Usage expands to where people use they unless they specifically want to emphasize gender.

Unlike the alternatives, amigos y amigas, amigxs, amig@s, and amig*s, gender-neutral ‑e fits well with spoken Spanish. Reading articles from a while ago it seems strange to me that this wasn't seen as more of a priority before? Still, there's now something of a path for it to enter the language as it's generally spoken:

• Existing long-standing use in words like estudiante, though still with gendered articles and agreement (los estudiantes ruidosos).

• Usage starts as an inclusive plural: Les estudiantes ruidoses.

• Usage also starts with non-binary people: Presento a Sol, mi espose.

• Usage expands to when the individual isn't known: Necesito une amigue...

• Usage expands to when the gender isn't known: No puedo adivinar el género de le maestre por su voz.

• [prediction] Usage expands to when the speaker doesn't want to specify gender for whatever reason.

• [prediction] Usage expands to where people use ‑e unless they specifically want to emphasize gender.

My Spanish is pretty bad, and the cultural issues are probably different in ways I'm missing as well, but I'm very curious to see where this goes.

Discuss

### The 5 Main Muscles Made Easy.

7 декабря, 2019 - 12:17
Published on December 7, 2019 9:17 AM UTC

Anatomy is wordy and it's easy to get lost... but knowing the details isn't important:

• Study the pictures. See.

Keep thinking about these 5 (paired - left and right) 'main muscles of moment' and how you use them as you move through your daily life, with your 'Base-Line' muscles at the core.

1. Rectus femoris

Below the knee, feel for the lump at the front of your shin bone (tibia). Run your hands up, over your kneecaps and front of your thighs, to just below the sticking-out bone at the front of your pelvis (hip bone). This is the full extent of the rectus femoris muscle.

• Aim for the whole muscle to be active.
• A strong pole down the front of each thigh.
• Think of pulling your kneecaps up. (+/- downward from hips).
The rectus femoris muscles align the hip and knee joints.

From shin - ligament with kneecap turning into a layer of tough connective tissue. From hip - layer of tough connective tissue down the front. Muscle tissue sandwiched between the layers.

2. Gluteus maximus.

The largest skeletal muscles of the body (covering a lot of complicated anatomy prone to pain/injury).

Hands on buttocks - feel for the muscles contracting. "Buns of steel".

The gluteus maximus work in tandem with the rectus femoris stabilising the legs through a full range of natural movement when connected to Base-Line support.
3. Pelvic floor. BASE

A basket of muscles within the bones of the pelvis.

Left and right sides a mirror image.

Forming a crescent shape on the body's midline.

Aim for a balanced contraction left and right sides, forming the base foundation for the body.

Closely associated with the anus and genitals.

Pelvic floor muscles at the root of all movement & the base point to align with the 'body map in the mind'.
4. Rectus abdominis. LINE

"The abs" = rectus abdominis muscles.

Think of these muscles as your central LINE, extending from your BASE, the pelvic floor.

• 2 side-by-side strips from pelvis to chest.
• Made up of 'panels' of muscle. The panels create the "6 pack look" but the number of sections of muscle depends on the individual - 4, 6, 8, 10 packs can occur.
• As you breathe in, place your hands over the muscles, starting from the pubic symphysis between your legs, then move your hands up, feeling the muscles activate and elongate - section by section - all the way up to your chest.
The rectus abdominis muscles. Our 'core pillar of strength'. Think stronger and longer with every breath in.
5. Trapezius

A blanket of muscle that should be smooth and wrinkle-free.

From mid-back to the back of the skull, extending out towards each shoulder, the trapezius muscles should be free to move in all directions.

Picture the 6 sections (approximating 2 triangles and a horizontal strip on each side).

• Hands on ears and fingers to back of skull - feel for the bump in middle. This is the external occipital protuberance (see below).
• Move your fingers towards your ears to feel the ridge where the trapezius muscles attach to the skull and 'drop down' from.
• Sculpted down the neck, attaching to both collar bone and shoulder blade - feel for the bones.
• Hands on lower ribs, thumbs pointing down, fingers around back. Move thumbs under ribs towards the spine to feel the lower extent of the trapezius muscles - where they come to a 'V' mid-back.

Think of your arms starting from your midline, i.e. including the scapula, not just starting at your shoulders.

Like wings extending from the middle of your back.

Lift your shoulders from below rather than pulling them up.

The trapezius muscles - guiding and supporting the head and arms through a full range of movement.
Working Towards Body Alignment:

Imagine a ribbon running from the pubic symphysis of your pelvis to the external occipital protuberance (the midline bump) at the back of your skull.

It should be possible to fully extend the ribbon - our anatomy 'aligned'. This is possible when the body has a full range of movement and is functioning at its optimum.

Working with the Base-Line (pelvic floor and rectus abdominis) muscles gives us a connection to our linea alba (Latin for 'white line').

THE LINEA ALBA - OUR PRIMARY GUIDE FOR BODY ALIGNMENT.

Feel for the anatomical markers associated with the linea alba and their state of alignment:

1. Pubic symphysis (home of the clitoris/suspensory ligament of the penis).

2. Navel (belly button).

3. Midline "⋏" at the bottom of the breastbone (sternum).

5 Key Muscles to Balance Mind and Body.

Time and Effort Required.
• Find these 5 muscles on your body.
• Be guided by your Base-Line.
• Feel for engagement and balance.
• Develop the connection between body and brain.

Link to the 3D model on biodigital.com (not finished, but might be worth playing with).

Discuss

### What is a reasonably complete set of the different meanings of "hard work"?

December 7, 2019 - 07:54
Published on December 7, 2019 4:54 AM UTC

The concept of "hard work" seems very confusing to me.

I think it could mean combinations of many different things. It doesn't seem that helpful for actually carving reality at the joints for engineering purposes and getting better at whatever traits are needed. It seems like a messy construct composed of many dimensions, and in fact the dials on those dimensions could be set to very different settings and still be called "hard work" without distinguishing between the different underlying factors.

I want to be able to reductionistically decompose what another person could mean when they use the phrase, so I could then narrow it down to the precise anticipation of experience that they are implying. I have some guesses as to some of those parts below.

What I want is a list of Hard-Workingness Theories, composed of particular precise-as-possible lower-level pieces, rank-ordered by plausibility/usefulness/interestingness at capturing what you think people in general mean, and/or your best attempt at defining it. Maybe there is some general feature that neatly explains most of it.

Spoiler alert if you want to think before seeing my thoughts (also, apologies for redundancies and bad grouping in my list). If someone wants (someone else) to work hard, it could mean any combination of the following components:

• Working for a long period of time, relative to some baseline, measured in hours.
• Muscular usage, measured by literal sweat, heart rate, and soreness the day after
• How physically fast a person is at completing a task
• Engaging in activities that correlate with high trait Conscientiousness (this still makes me unsatisfied with how much it passes the buck but at least it's reasonably operationalized)
• How ritualized a person's schedule is
• How much a person's attention is focused on the work
• How much a person's attention is required for success at the work
• How much working memory is required to perform a task
• How valuable others think the task(s) performed are, measured in dollars
• How meaningful a person felt a task was afterwards
• How long the time is between effort and payoff (deferment of gratification, time preference)
• How much enthusiasm a person displays for doing a task
• How much the person worked compared to other people doing the same thing
• How much VUCA the person is capable of tolerating (Volatility, Uncertainty, Complexity, and Ambiguity). I suspect this is the linchpin of much of what constitutes "hard work" for white collar workers in the 21st century. It's quite different and possibly sometimes even in conflict with Muscular Usage, however.
• The amount that a person learns from a task
• How much anxiety a person faces and does a task anyway
• Or more generally than the previous: "the ability for a person to do what a person doesn't want to do". (!) I think this usage has the largest attack surface for being used as a way to shame other people for doing things that aren't actually good for them but other people want them to do.
• "Whether a person is exercising their conscientiousness above their normal conscientiousness level" (distinguished from, "this person has a high conscientiousness level")
• Ability for the person to tolerate physical pain while doing a task
• The amount of different kinds of actions a person has to do while working, or how many different stimuli they have to stay engaged with (multitasking, parallelization)
• Relatedly: the amount of mental switching costs a person can engage in.
• Whether they can do things that others can't
• How intelligent they are
• Whether, after they are done working, they feel exhausted and like they can't handle any more of it
• To what extent the work is not something they are intrinsically motivated to do anyway without recompense of some kind
• How 'enjoyable' or 'unenjoyable' a task is
• The extent to which someone would do a thing without being observed by other people
• How much the person actually gets done, quantitatively
• How effective a person is/how much they get done qualitatively
• A judgement of how much other people deem the work to be hardworking (I think relevant in cases where humans are really bad at judging what a profession is actually like and they over- or under-estimate its impressiveness?)
• The extent to which a person changes themselves to do a thing they could not previously do
• The extent to which a person does not change themselves but changes their environment to make themselves more effective
• Attention to detail/thoroughness/checking
• Attempting to perform the task has a high risk of failure for that person, yet they do it anyway
• Attempting to perform the task has a high risk of failure for most other people, but not that person
• The person judging hard-workingness does not understand what the work involves
• The extent to which the person being judged to be hard-working has high-status generally, or has a high-status job in particular, and the extent to which praising/imputing their hard-workingness would therefore be consonant with their external identity regardless of how they actually function at their work
• The extent to which a task is socially acceptable. Is drug-dealing "hard-working" (which drugs)? Is thievery? Is prostitution? What would polls of different demographics of people say about those fields?

There are various examples that could be classified as "hard work" or "not hard work" depending on what exactly is meant by the example and the particular profile of meanings a hard-workingness classifier is using. One can test their theories of hard-workingness on these examples and ask, "Once fully specified, when will or won't this category be hard work, according to this hard-workingness theory?"

• Going to the gym
• Calling people
• Listening and taking notes
• Listening and not taking notes
• Traveling
• Giving a talk
• Making a Discord server
• Moderating a Discord server
• Writing a LessWrong post
• Writing fiction
• Driving a U-Haul to pick up a fridge
• Moving a couch
• Going to a church when you don't normally go but really having to reel in your impulse to argue with people there
• Getting to a place on time but you're quite tired
• Being an airline pilot
• Being an air traffic controller
• Being a toddler
• Starting a startup
• Doing things that have a high beneficial impact on the world

Discuss

### The New Age of Social Engineering

December 7, 2019 - 03:23
Published on December 7, 2019 12:23 AM UTC

Why have so many online social networks failed to form healthy communities, and instead gained notoriety as hostile spaces? I argue that these platforms failed because they didn't learn the lessons taught by the High Moderns, when humans were first faced with the challenge of engineering alongside systems that were built through millennia of natural evolution. In a chaotic environment such as human social relations, a different engineering approach is necessary to ensure that more good is done than harm. To gain the skills necessary to make these projects a success, we need to learn from the history of social environments themselves, and of human engineering strategies. What follows is the story of social evolution becoming social engineering, how the meaning of both has changed radically in the last 20 years, and what this means for designers in the new Information Era.

Part 1 — Ten millennia of social engineering

A key part of my thesis is that the way our social environment is formed has changed over the course of human history, and more rapidly in recent years. How do we know that to be true? Much of the work I'm building on comes out of the accounts provided by The Secret of Our Success by Joseph Henrich, as well as Seeing Like a State by James C. Scott. There are many things that I disagree with in these works, but I think they both get to the core idea that there exist two main ways in which human society develops. One of those ways is via an evolutionary process, where some societies develop some technique that aids in survival and flourishing, pass it on, and end up growing and outcompeting other societies. The people practicing these traditions often don't have concrete knowledge as to why they work, but they become enshrined as tradition because they help the group succeed. This ranges from knowledge about which plants are edible to complex ideas like how the group should be structured. On the other hand, there is social engineering. In social engineering, explicit models of human behavior are used to derive new social conventions and structures. Usually this doesn't mean designing something new from whole cloth, but instead an effective synthesis of ideas that the culture has generated over time into a compelling ideological canon or into new distinct institutions.

For most of human history, we relied primarily on social evolution instead of social engineering. This was for good reason: social engineering done poorly is very often worse than social evolution. A mother who breaks tradition and tries feeding new plants to her children, because she doesn't know of any reason those plants are harmful, may discover unexpected side effects of their consumption. This reality is often referred to as Chesterton's fence, and is often discussed as an argument in favor of traditionalism. However, much of the social and technological progress of the industrial era has come through the rejection of tradition. How do we square these conflicting forces? I think a key reason is simply that for a society to succeed in social engineering, a detailed historical record and careful specialists are usually necessary. Only in this way can new first-principles knowledge be solidified and built upon. It's for that reason that the societies that appear the most engineered also tend, in general, to be those with more detailed historical records and information about other differing societies. When one is able to see the culture "from above", that unique perspective can enable one to design effective institutions.

One excellent example is perhaps one of the most successful early cultural engineers, Confucius. The Great Teacher developed his unique philosophy while travelling around China and seeing various social issues, their causes, and the variety of different social structures present in China at the time. By synthesizing these insights into a central canon, he created an enduring cultural institution that was central to Chinese administration for centuries. Of course it is true that Confucianism relies heavily on tradition, and can in some ways be considered no more than a collection of various preexisting traditions, but its success indicates that it must have some quality beyond that of the constituent parts. Ultimately, Confucianism and the society it created lost supremacy because while it itself was a result of synthesis, it became unable to assimilate or change at pace with the world it inhabited. What once created a powerful bureaucratic class capable of financing great discoveries, ended up as a chain that left the society unable to appreciate the possibility of learning from outside influences. Innovation became tradition, and tradition cannot change course by its very nature.

Part 2 — Modernity

We’re going to leave behind ancient societies, because although they are a rich source of insight, others have better studied those trends in depth than I. Instead we are going to turn to look at the relatively modern. I am going to focus for a minute on the United States. For all that American Exceptionalism is a real risk, I think there is something somewhat unique and interesting about the formation of the US. Specifically, the US is one of the best examples of what I consider full social engineering. A group of people sat down in a room and set out to design, in a written legal document, how its society would function. It was a group of what can only be called engineers who set out to design structures to improve upon the governments they were aware of. They didn’t just try and say “tyranny is bad so we won’t be tyrants,” they tried to engineer complex social structures that took advantage of human behavior in order to guide the behavior of the government — independent of any single political actor. The idea of applying contractual thinking to the structure of our nations and institutions didn’t start with the US, but the US can be seen as a culmination of those ideas. This project has had varied success to say the least, but it’s notable that so many modern institutions function in this way. A group of founders get together and try to set the community’s direction at both an object and meta-level. Just as the objects and tools we use have become increasingly engineered, so too have our institutions become influenced by engineering. The evolution and design of social norms is at the heart of what we call society. I would venture to say that it is the defining feature of the human species, and of intelligent species in general. The Machiavellian Intelligence Hypothesis posits that intelligence arose, not to better use tools or better hunt prey, but instead to better compete in the social arena. 
Therefore, I think it's fair to say that top-down engineering in the world of social norms faces an uphill battle to outperform the metis of traditional culture. However, this applies much less when we turn our eyes towards the engineering of the physical spaces our cultures occupy.

Social Engineering in the context of the physical is about the design of objects and spaces in the traditional engineering sense, but with consideration for how that design influences the group rather than the individual. It's obvious that a designer making a chair must consider how the chair interacts with the behaviors and preferences of the person who will eventually use it. This same thoughtfulness should be applied, and usually is, when dealing with objects and spaces that drive social interaction. Someone trying to build a successful bar will think carefully about the layout and decoration of the space and how that will influence their patrons. They may think about other layouts they've seen and how they might improve on those designs in order to give the space the mood they want. The pub is undeniably a social institution, and it is designed to both encourage and discourage certain types of social behavior. Every space you interact with, from the supermarket to the sidewalk, has generations of trial, error, and improvement behind it. That doesn't mean every space is perfect, but it's easy to forget the marvel that is present all around us. However, it is the process of conscious engineering that has also introduced many institutions that are detrimental to healthy communities. One such piece of design often discussed is the American shopping mall.

In 1954, architect Victor Gruen built the first modern shopping mall in Michigan. Two years before his death, in 1978, he would describe malls as destroyers of cities. The suburbanization of America, a process due in large part to decisions made by urban engineers, killed key social spaces and is widely seen as a social engineering mistake of the highest order. Many of these engineers would see the damage within their lifetimes and, like Gruen, spend their lives trying to reverse course. For much of human history, cities didn't have designers, and even when city design started, it satisfied itself with general zoning, usually to protect and enforce class divisions, as often seen in early Chinese urban environments. City planning has introduced big improvements in quality of life and health, but also class segregation and the destruction of communities. Much of the excellent book Seeing Like a State by James C. Scott is focused on this phenomenon. The core conclusion of that book is that top-down planning is deeply flawed, evidenced by a variety of failures at such tasks, from farming to the city of Brasilia. The points made by Scott are well argued, and I recommend taking a look at the analysis of the book by Scott Alexander or Lou Keep at the very least. I agree with Scott that top-down planning of human environments is an incredibly difficult task to do successfully, and there are a mountain of failures left behind by the High Moderns for us to learn from. As Scott Alexander once said, "The road we're on is littered with the skulls of the people who tried to do this before us."

To me the key takeaway is that the High Moderns failed because of the particular approaches they took when undertaking their design projects. They regularly ignored the actual desires of the people living in the cities they were to redesign, and their motivations were counter to the goals of the populace. The government wanted more legible, easier-to-tax cities; the citizens wanted community and more local control (features which notably go hand in hand with organized resistance). Today, most of our designers are deeply aware of the failures of the High Moderns. Books such as Seeing Like a State detail these past failures and provide guidance towards avoiding these pitfalls. Arguably, cities at the forefront of growth have overlearned these lessons, with any attempt to demolish old buildings met with fierce opposition. We aren't perfect, but lessons have been learned in the way we design our social spaces, with one massive exception.

Part 3 — The Internet

The internet has opened up a new frontier in the design of social institutions. We are designing platforms that are used by wide masses of people, that grow more quickly than any historical analog, and that provide unprecedented levers of control over discourse. Facebook was founded in 2004; ten years later there were more monthly active Facebook users than Catholics. The ability to implement a new social institution with basically no startup cost and end up with this much influence is unprecedented, and thus it's not surprising that we've seen so many instances of the new social internet having issues with healthy discourse. So much of the evolutionary work that went into shaping our meatspace social institutions has been ignored during the construction of these new online spaces. Platform designers often repeat the mistakes of the High Moderns. The users of the platforms are rarely given a voice in discussions, and are frequently treated as antagonists rather than stakeholders. Worst of all, centralized platforms have very little immediate incentive to improve the quality of discourse, as network effects prevent users from easily moving to a nicer competitor. Network operators are encouraged to gain users as fast as possible, keep them in the space as long as possible to view ads, and completely ignore the social well-being of the communities that form. Additionally, online platforms provide a nigh-microscopic level of control over the interactions between users. Someone designing a bar can choose the layout of the tables and the lighting, but an online platform designer can run automated testing to determine what text, fonts, and layouts best guide the user into behaving in a desired way. In the case of platforms like YouTube, complex AI systems are constantly optimizing every nanometer of the system to maximize ad revenue.

In the early days of the internet there was a lot of optimism about the ability of the web to bring people together, to form new understanding between distant peoples, and to provide an escape from tyranny. It’s important to recognize some of the successes on these fronts. I regularly communicate with people from other nations, and that has helped give me a broader view of the world and of differing cultural norms. However, anybody can see that we have failed to live up to this early promise. Most of the social spaces online were driven in design by technological constraints and financial motives, not by a consistent dedication to building prosocial institutions. The rapid expansion of the internet has left engineers struggling to make their websites function at all, let alone spend resources on deep analysis of user behaviors. Such work is only done by established players with the goal of increasing revenue, and thus almost inevitably results in a worsening of user experiences because of misaligned incentives. To fix these issues we need both philosophical changes and technological changes.

I’d like to take a moment to go back to those early internet idealists. To imagine the potential provided by the internet. Imagine settlers arriving at a continent where there is unlimited space for new communities and cultures. Where physical violence is impossible, and where nobody can be prevented from leaving a community they don’t like. We aren’t there, and maybe it will be a long long time before we get there. Despite that, I’m an optimist at heart. I believe in human ingenuity, and in the potential for people to rise up to the challenge and opportunity they are presented with. The internet is still young, and we have time to make sure that future generations will benefit from it in ways we can’t even imagine today.

Part 4 — Closing Remarks

I tried to focus on a descriptive approach in this essay. It’s obviously informed by my own perspective, but I avoid spending a lot of time on the particulars of how I would design a social platform, instead focusing on the technological and incentive structures that provide the foundation for any platform. In the next part of this series on the design of social platforms, I intend to dive more deeply into the specifics of how I might design a platform given the sentiments expressed above, as well as my own thoughts on the nature of community.

Footnotes

This essay is heavily inspired by The Uruk Series by Sam[ ]zdat, also known as Lou Keep, whose writing was really inspirational to me, a person who is more high-modernist by nature.

Discuss

### What subfields of mathematics are most useful for what subfields of AI?

December 6, 2019 - 23:45
Published on December 6, 2019 8:45 PM UTC

I’ve seen https://www.lesswrong.com/posts/snzFQJsNYqzPZS2nK/course-recommendations-for-friendliness-researchers, but that seems to be specialized to AI alignment.

Bonus points for sorting the prerequisites into different importance tiers.

Some advice on how to balance and order learning time between math, coding, and the CS-theoretic parts would be appreciated.

Discuss