LessWrong.com News

A community blog devoted to refining the art of rationality

I found a wild explanation for two big anomalies in metaphysics then became very doubtful of it

Published on April 1, 2019 3:19 AM UTC

It's April first, so I'm going to talk about the theology I've been doing.

I wanted to play this as straight as I possibly could, but partway through writing the essay I tripped over some difficult questions I don't think I can answer. I can no longer in good conscience project the kind of anguished conviction that I'd need to be able to prank myself and my readers with a bostromian religious conversion. I am sorry. I still think it's an interesting theory, so instead of really trying to sell you on it I'll just summarise it and discuss the flaws. I'll include the rest of the essay, but I recommend it only to those with a special interest in this sort of thing.

Problems to be solved

I'll start by observing two things about the anthropics of our universe that don't make sense under the standard model (insofar as we have a standard model for anthropics):

  • Anthropic measure seems to be concentrated in humans/living things, even though most of the things that exist are dead clouds of hydrogen. It is strange that when existence observes itself, despite the relative rarity of living things, observers like you or I find our subjectivity situated in the positions of living things.
    • I assume panpsychism as the null hypothesis. It is often thought that any part of the universe that observes must be conscious or sentient to do it, that observing or measuring is intrinsically linked with those patterns of behaviour. It seems clear to me that that's overfitting. The thing you know to possess anthropic measure (you) happens to have conscious behaviour, so you presume there's a link. Your dataset has only one element. That's not enough.
      • Think of Alzheimer's. If you progressively take away the features of conscious behaviour (memory formation, symbolic thought), would you really doubt that anthropic measure still remains? Subjectivity/experience and conscious behaviour are not the same thing.
    • If you don't consider anthropic measure to be a coherent or important concept, I wrote this short for you. The concept of anthropic measure is a mysterious bastard, but a time will come when betting agents won't be able to avoid dealing with it any more.
  • The Doomsday Argument. It is strange that we find ourselves in the early stages of a civilization's history, considering that there will be many millions of times more of us later on. In the theory proposed here, I'm exploring a rarely mentioned branch of the doomsday trilemma.
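For readers who haven't seen the arithmetic behind the Doomsday Argument, here is a minimal Bayesian sketch under the self-sampling assumption. The two hypotheses and their numbers are illustrative choices of mine, not anyone's real estimates:

```python
# Toy Doomsday Argument: two hypotheses about the total number of
# humans who will ever live, with equal prior probability.
prior = {200e9: 0.5,    # "doom soon": ~200 billion humans total
         200e12: 0.5}   # "doom late": ~200 trillion humans total
rank = 100e9            # our rough birth rank today

# Self-sampling assumption: P(rank | N) = 1/N for rank <= N.
likelihood = {N: (1.0 / N if rank <= N else 0.0) for N in prior}
evidence = sum(prior[N] * likelihood[N] for N in prior)
posterior = {N: prior[N] * likelihood[N] / evidence for N in prior}

# The "doom soon" hypothesis comes out roughly 1000x more likely,
# purely because an early birth rank is less surprising under it.
print(posterior)
```

The force of the argument is just this likelihood ratio: an early rank is evidence for a small total population, unless something else (such as the simulation story above) explains the rank away.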

The Teeming Consortium Hypothesis tries to make sense of these anomalies. It claims that

  • Living things eventually come to pervade and control almost every region of the universe
    • (I still find this kinda plausible. The universe is not designed to be impermeable to superintelligences, and things that are not designed to be impermeable to superintelligences (or even regular intelligences) generally won't be?)
  • Life will tend towards the goal of making as much life (think: progeny) as possible, and higher efficiencies are possible in simulations.
    • Biological evolution, in reality, does not optimise the ratio of living to dead very well. Even if you can build a host planet without any immense core of stone (quite dead, is stone), the process produces a lot of dead bodies, dirt, mounds of skeletons (coral, chalk). Plants? Do we want plants or any other kind of non-sentient life? Better to just spin up a natural-ish physical simulation, while only approximating its non-sentient features.

The measure concentration anomaly is no longer an anomaly; it is explained. We find ourselves in the position of a sentient thing because most of existence has been made into a simulation substrate for sentient things. A sentient thing is the most common sort of thing that a thing can turn out to be. It is now possible to be a panpsychist, and to also expect living things to have disproportionate concentrations of subjectivity, at the same time!

The anomaly of conspicuous youth is no longer an anomaly; it is explained that we are not truly young. We are part of an old thing. At the same time, the doomsday argument may still be sort of true, and we might not get the future we expected if we try to lift the stars and aestivate; we will stop resembling Life and we will no longer be so interesting. Perhaps we will be thrown away, or perhaps we will be sent somewhere new.

  • (A future post might explore the question "What if we make an acausal commitment to providing the living occupants of our simulations with an afterlife beyond the end of the simulation, to boost the probability that we will find we're already covered by such an agreement?" I have done a post about theories in that genre before, but I don't recommend it, for various reasons. It was dry to the point of hostility, and some of the arguments it was making fail under some approximate metaphysical relative-measure arguments I came across later (in summary, it turned out that we probably couldn't "afford" it; it isn't a profitable use of resources for most kinds of agents). I may now see a variant that won't run into this issue.)

I consider The Teeming Consortium Hypothesis less "The universe is weird because god did it" and more "The universe is weird because the sorts of gods we're going to build totally would do it".

The problems

I am not certain that these flaws are terminal. All I can say is that my physics/metaphysics is gravely offended by them. If anyone sees a way to overcome these issues, please speak up; it could be very important.

The problems mainly concern entropy. It seems that, as a rule, there is much more entropy in the universe than energy. Before we even get to uranium, we had stars, burning and exploding continuously for tens of billions of years. If entropy can have anthropic measure, then that's a whole lot of stuff that can't ever be made living. We can get out of it by proposing that entropy doesn't have anthropic measure, does not participate in subjectivity, and while this seems less arbitrary than claiming that dead stuff uniquely has no anthropic measure, ultimately it still seems like the same crime, though a less severe case.

I don't know that the entropy that exists can ever be contained (see "The predication of Containment" for more). I was going to leave this as a "maybe we'll figure this out, let's experiment", but now I'd say it pushes the theory into the red, for me. Note, it is not enough to just slow down entropy for a long time. No matter how long you can keep your machines running, if the entropy gets in, they'll exist far longer in a state of death.

The Sell

Here lies the remains of a failed essay. It's not even proofed. I do not recommend continuing beyond this point. Unless you want to.

I'll define panpsychism as the notion that, essentially, everything that exists possesses some anthropic measure. It is a position from which the universe can be observed.

To me, panpsychism feels like the null hypothesis. This is not an ordinary position; most people, philosophers included, tacitly assume that anthropic measure and the behaviours of sentience are, if not equivalent, self-evidently coincident.

I've found an argument that seems to allow them both to coexist.

I accept panpsychism's assertion that there is no obvious reason that a thing's degree of subjectivity should be entangled with whether or not it implements the behaviour of thought or agency. That seems intuitive, to me. If I were a formless spirit floating in a void before light or time, when I open whatever cognitive medium I have to external inputs, I would not bet that I would find myself in the vessel of a living thing. Living things are very rare, in the grand scheme of things. Dead things are common, so, if all I know is that I exist, I am probably a dead thing.

And yet, despite their rarity, I have opened my eyes and found myself in the position of a living thing. We can play a lot of silly games by selecting arbitrary reference classes, but the reference class of the living does not seem arbitrary. (I think a program for identifying living things would be quite compact.)

Panpsychism, as a conclusion, then, seems unlikely. Despite how physically tiny You are, You is where you have ended up. It stands to reason, then, that there must be something special about You (or, if you don't want to seem solipsistic, maybe all humans are special in this way, or all higher mammals, or all living things), some quality that attracts anthropic measure, and since we're good children of Science, we're going to tend to focus on qualities of ours that are salient between taxa: our aliveness, our sapience. Probably, we reason, it has something to do with that stuff.

The Concentrated Reality hypothesis attempts to explain the coincidence between life and measure. It does this from the stance that seems most to me like the null hypothesis: Panpsychism, as I define it: Anthropic measure starts out evenly distributed throughout the universe, and does not inherently favour humans over anything else.

Concentrated Reality then explains how anthropic measure still finds itself concentrated in such small strange things as humans.

The Teeming Consortium premise holds that the majority of species (of those who can agree well enough to be said to collectively value anything) value the abundance of living things.

It's not an odd telos for a consortium of living things to cohere around. I don't think I know many individual humans who have advocated it, but if we ever, somehow, reach far enough out to start interfacing with alien civilizations, and if we had to find some universal morality to build a covenant around, I struggle to think what else could come out of those negotiations. The directive of the teeming consortium handily allows everyone to say, it's okay that you exist. Life is its own purpose. We will continue the work of our collective mother, Evolution. We will tile the universe with the sorts of systems that fascinate us most, and no citizen of our consortium will ever be made sacrifice to a dead, grey monument.

How do they define life? Life is not a negotiated agency rearranging matter. That is what it will build, but that is not its character. Life is shreds of sinews fighting their way up out of the dirt. To invoke life, you must invoke the earlier stages of a bloody, painful, brutally random full life-history. Its parents were not machines. Its parents were chance and soil and physics, and nothing else.

It's often been thought conspicuous that we find ourselves in the early stages of a life-history, rather than in the celestial gardens of some post-organic intergalactic commonwealth. The Doomsday Argument (or see a video; interestingly, Nick expresses a reluctance to accept the argument as being instructive) assumes that the conspicuous youth of our upswell implies that there shall be no gardens.

I'll agree that our youth is conspicuous, but Concentrated Reality offers a different explanation: The doomsday didn't come, the teeming consortium is very real, it made its celestial gardens, and we are presently inside them.

Materially, their gardens only loosely approximate our stars and our mountains. Much greater quantities of mass and energy (sometimes to the point of redundancy) are assigned to giving reality to the bodies and minds. Aside from the consortium's imperative to make life the most abundant substance in the universe, there are a few reasons this measure concentration might be a natural requirement of simulation.

  • If you want the story to make sense from the perspective of the living, that's the thing you have to get absolutely right. The protagonists won't notice if the trees that fall in the forest don't make a sound. For all of that, you can use an approximation. An approximation requires far fewer computing resources. That is to say, less mass and less energy. That is to say, less existence.
  • Even when one of these living things turns a microscope upon a nano-scale silicon lattice and demands that it behave less like an approximation, that lattice still requires fewer resources to simulate than it takes to simulate a mind. The dead are predictable, and the question of which properties of the dead will be perceptible to the living has mostly predictable answers.

You might, now, ask, "What does this hypothesis predict? How can it be tested? What use is it?"

There is nothing vacuous about this hypothesis.

It explains two of our reality's big peculiarities (Life's conspicuous youth, and the relationship between life and anthropic measure).

It is predicated upon a number of significant claims about metaphysics, intelligent life, and technology, which we will get into below. If any of them fail to apply, the theory also fails.

The predications might just be predictions. I'm going to call them predications because to me it feels like I'm producing them by rationalising. Sufficiently holistic, sensitive rationalising might not be distinguishable from ingenuous rationality. I don't know. Maybe rationality unmoors us all from common sense and maybe we have no choice but to believe crazy things. All I can say now is that, if the metaphysics to dismiss the predications exists, I haven't stumbled over it yet.

The predication of Permeability

In the territories of our consortium, the measure of the living is boosted, but what about deadzones outside of those territories? What about regions of existence whose physical laws are not so finely tuned as to produce or support life? What about the vast places beyond any consortium's reach where life never stirs?

The premise of Permeability holds that there are very few impermeable deadzones, that the boundaries between different parts of the universe tend to be leaky.

The Concentrated Existence hypothesis seems to depend on the assumption that the universe (that universes, generally) will be found to be permeable. We were wrong when we said no one would ever go beyond the mountains, or the desert, or the oceans. We were wrong when we said no one could go to the moon. Maybe our models are wrong to predict that no one will ever leave the local cluster, and maybe they will be wrong again in so many other ways. If I ask you whether you are confident that no technology will ever let us move faster than c, for now, it wouldn't be wise to give a firm answer. We will struggle to grow as vast, clever, and mighty as anything here can, and then we'll meet the walls of the known universe again. If we then find that the old laws still hold and our expansion has to end there, we will be able to say with some degree of confidence that the multiverse probably isn't permeable, and so the Concentrated Reality hypothesis cannot apply, and the dead will always outnumber the living. For now, the question has to remain open.

Let's get closer to formalising this

Define universes as regions of existence that cannot interact with each other.

Permeability makes two subclaims:

Inoculation: most universes (a fraction of universes with measure close to one) contain life somewhere. That is to say, in any fairly complex thing, there is a place with the right balance of variation and stability for life to emerge.

Transmissibility: there are very few barriers within universes that an intelligent species cannot pass through.

An issue with entropy

Imagine that a consortium is able to colonise the entire multiverse. For a time, the whole thing teems. Every piece of matter anywhere is ingeniously slotted into a single great resplendent garden.

What happens when the energy runs out?

However long it lasted, there is now just as much Stuff in the universe as there was before, but now all of it is Dead, and Death will have a much longer reign than Life did.

So how can the full coverage occur if entropy exists?

I see two possible answers.

  • Entropy, specifically, does not have anthropic measure, and so does not detract from the measure of the living. I don't like this answer. Anthropic measure is weird enough that it might be true, but I don't like it.
  • Most of the entropy that exists in a universe can be contained. To break this down into subclaims:
    • Reversible computers are possible: that it is possible to instate complex computations that run without consuming energy.
      • It is currently inconceivable to us, but it is hardly outlandish to think that we will keep trying to do it forever, and forever is a long time for a technological species. Eventually, we may break through to some level of reality with simple enough physics to create a frictionless wheel.
  • Hidden Wells: What entropy naturally exists (exists before any living thing can reach that section of the universe) is relatively small next to stable (largely hidden) usable energy
    • A parable to explain why I consider this plausible:
    • For a long time, humans derived their energy from forces of nature that they could see with their eyes. Wood sustained their fires. People drove their querns and food drove the people. Water powered their mills. Unearthed fossils powered their engines.
    • One day, a great change visited us. Humans discovered that way down in the matter of things, hiding behind the shells of atoms, was an immense well of energy, far greater than anything they had conceived of before. Even though this new energy had been lying there completely silent and stable in the earth, it was so vast that humans now had the power to destroy themselves and nature, to unleash forces that could turn the whole of the earth's surface into a barren desert.
    • I do not think nuclear energy will be the last instance of Life discovering an extremely voluminous, extremely stable source of energy hidden deeper in the matter of existence than we had previously looked. I think this will happen many times. I think this is just how things tend to be. Wells of energy hidden behind the shells of atoms, beyond the reach of nature's roving fires.
  • That the entropy can be permanently contained - that the unknowability of some small part of reality will not gradually infect the rest of the universe.
    • To illuminate this, it might be fun to ask the following question.
    • Say there's a superintelligent line of cells in a Conway's Game of Life system (or, a line of cells whose state we can control).
    • A small portion of the grid is configured in an unknown, random state.
    • Can such a physics support a way of containing the entropic part of the system that works most of the time? Can we prove that we can't?
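The Game of Life version of the question is concrete enough to poke at with a toy simulation. A minimal sketch follows; the grid size, random seed, and containment radius are arbitrary choices of mine, and the "containment" here is merely passive empty space, not the actively controlled boundary the thought experiment asks about:

```python
from collections import Counter
import random

def step(live):
    """One Conway's Game of Life step on a set of live (x, y) cells."""
    counts = Counter((x + dx, y + dy)
                     for (x, y) in live
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if (dx, dy) != (0, 0))
    # A cell is live next step if it has 3 neighbours,
    # or 2 neighbours and was already live.
    return {c for c, n in counts.items()
            if n == 3 or (n == 2 and c in live)}

def escaped(live, radius):
    """True if any live cell lies outside the square of the given radius."""
    return any(max(abs(x), abs(y)) > radius for (x, y) in live)

# Seed a small random "entropic" patch and watch whether its activity
# leaks past a passive containment radius.
random.seed(0)
patch = {(x, y) for x in range(-3, 4) for y in range(-3, 4)
         if random.random() < 0.5}
for t in range(200):
    if escaped(patch, 10):
        print(f"activity escaped the radius-10 zone at step {t}")
        break
    patch = step(patch)
```

An actual containment scheme would replace the empty buffer with cells the superintelligent line controls, and the open question is whether any control policy works for almost all random interiors.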

[Epistemic status: A great writer said, last year, "There’s a Jewish tradition that laypeople should only speculate on the nature of God during Passover, because God is closer to us and such speculations might succeed. And there’s an atheist tradition that laypeople should only speculate on the nature of God on April Fools’ Day, because believing in God is dumb, and at least then you can say you’re only kidding.".]


On the Nature of Agency

Published on April 1, 2019 1:32 AM UTC

Epistemic status: Fairly high confidence. Probably not complete, but I do think pieces presented are all part of the true picture.

Agency and being an agent are common terms within the Effective Altruist and Rationalist communities. I have only heard them used positively, typically to praise someone’s virtue as an agent or to decry the lack of agents and agency.

As a concept, agency is related to planning. Since I’ve been writing about planning of late, I thought I’d attempt a breakdown of agency within my general planning paradigm. I apologize that this write-up is a little rushed.

Examples of Agency and Non-Agency

To keep the exposition concrete, let’s start with some instances of things I expect to be described as more or less agentic.

Things likely to be described as more agentic:
  • Setting non-trivial, non-standard (ambitious) goals and achieving them.
  • Dropping out of school and founding a startup.
  • Becoming president.
  • Making a million dollars.
  • Self-teaching rather than enrolling in courses.
  • Building and customizing your own things rather than purchasing pre-made.
  • Researching niche areas which you think are promising instead of the popular ones.
  • Disregarding typical relationship structures and creating your own which work for you.
  • Noticing there are issues in your workplace and instigating change.
  • Noticing there are issues in your society and instigating change.
  • Gaming the system (especially when the system is unjust).
  • Reading diverse materials to form your own models and opinions.
  • Being able to receive a task with limited instruction and execute it competently to a high standard.
  • Ignoring conventional advice and inventing your own better way.
  • Having heretical thoughts and believing things other people think are wrong or crazy.
  • Strong willingness to trust one’s own opinion despite the views of others and even experts.
  • Accomplishing any difficult task which most people are unable to.
Things likely to be described as less agentic:
  • Spending your life working on the family farm like your parents before you.
  • Unquestioningly adopting the views of your friends, family, faith, or other authorities.
  • Proceeding through school, undergraduate, graduate degree as the simplest pathway.
  • Sticking to conservatively prestigious or stable professions like medicine, law, nursing, construction, teaching, or even programming.
  • Seeking social approval for one’s plans and actions or at least ensuring that one’s actions do not leave the range of typically socially-approved actions.
  • Prioritizing safety, security, and stability over gambits for greater gain.
  • Requiring specific direction, instruction, training, or guidance to complete novel tasks.
  • Following current fads or what’s in fashion. Generally high imitation of others.

I have not worked hard to craft these lists so I doubt they are properly comprehensive or representative, but they should suffice to get us on the same page.

At times it has been popular, and admittedly controversial, to speak of how some people are PCs (player characters) and others are mere NPCs (non-player characters). PCs (agents) do interesting things and save the day. NPCs (non-agents) follow scripted, boring behaviors, like stocking and manning the village store for the duration of the game. PCs are the heroes; NPCs are not. (It is usually the case that anyone who is accomplished or impressive is granted the title of agent.)

The Ingredients of Agency

What causes people in one list to be agentic and those in the other to be not so? A ready answer is that agentic people are willing to be weird. The examples divide nicely along conformity vs. nonconformity, doing what everyone else does vs. forging your own path.

This is emphatically true - agency requires willingness to be different - but I argue that it is incidental. If you think agency is about being weird, you have missed the point. Though it is not overly apparent from the examples, the core of agency is about accomplishing goals strategically. Foremost, an agent has a goal and is trying to select their actions so as to accomplish that goal.

But in a way, so does everyone. We need a little more detail than this standard definition that you’ve probably heard already. Even if we say that a computer NPC is mindlessly executing their programming, a human shopkeeper legitimately does have their own goals and values towards which their actions contribute. It should be uncontroversial to say that all humans are choosing their actions in a way that digital video game NPCs are not. So what makes the difference between a boring human shopkeeper and Barack Obama?

It is not that one chooses their actions and the other does not at all, but rather the process by which they do so.

First, we must note that planning is really, super-duper, fricking hard. Planning well requires the ability to predict reality well and do some seriously involved computation. Given this, one of the easiest ways to plan is to model your plan on someone else’s. It’s even better if you can model your plan on those executed by dozens, hundreds, or thousands of others. When you choose actions already taken by others, you have access to some really good data about what will happen when you take those actions. If I want to go to grad school, there’s a large supply of people I could talk to for advice. By imitating the plans of others, I ensure that I probably won’t get any worse results than they did, plus it’s easier to know which plans are low-variance when lots of people have tried them.

The difference is that agents are usually executing new computation and taking risks with plans that have much higher uncertainty and risk associated with them. The non-agent gets to rely on the fact that many people’s models deemed particular actions a good idea, whereas the agent must rely much more on their own models.

Consider the archetypical founders dropping out of college to work on their idea (back before this was a cool, admirable archetype). Most people were following a pathway with a predictably good outcome. Wozniak, Jobs, and Gates probably would have graduated and gotten fine jobs just like people in their reference class. But they instead calculated that a better option for them was to drop out with the attendant risk. This was a course of action that stemmed from them thinking for themselves what would most lead towards their goals and values. Bringing their own models and computation to the situation.

This bumps into another feature of agency: agents who are running their own action-selection computation for themselves, rather than imitating others (including their past selves), are able to be a lot more responsive to their individual situation. Plans made by the collective have limited ability to include parameters which customize the plan to the individual.

Returning to the question of willingness to be weird: it is more a prerequisite for agency than the core definition. An agent who is trying to accomplish a goal as strategically as possible, and who is running new computation and performing a search for the optimal plan for them, simply doesn’t want to be restricted to existing solutions. If an existing solution is the best, no problem; it’s just that you don’t want to throw out an optimal solution just because it’s unusual.

What other people do is useful data, but to an agent it won’t inherently be a limitation. (Admittedly, you do have to account for how other people will react to your deviance in your plans. More on this soon.)

Mini-summary: an agent tries to accomplish a goal by running relatively more of their own new computation/planning relative to pure imitation of cached plans of others or their past selves; they will not discard plans simply because they are unusual.

Now why be agentic? When you imitate the plans of others, you protect against downside risk and likely won’t get worse than most. On the other hand, you probably won’t get better results either. You cap your expected outcomes within a comfortable range.

I suspect that among the traits which cause people to exhibit the behaviors we consider agentic are:

  • A sense that more is possible. They believe that there are reachable outcomes much better than the existing default.
  • An aspiration, striving, or ambition toward the more which they can envision.
  • Something to protect.

  • Conversely, complacency is the enemy of agency.

There has to be something which makes a person want to invest the effort to come up with their own plans rather than marching along the beaten paths with everyone else.

Or maybe not, maybe some people have powerful and active minds so that it’s relatively cheap to them to be thinking fresh for themselves. Maybe in their case, the impetus is boredom.

An agent must believe that more is possible, and more crucially they must believe that it is possible for them to cause that more. This corresponds to the locus-of-control and self-efficacy variables in the core self-evaluations framework.

Further, any agent whose significant work you’re able to see has likely possessed a good measure of conscientiousness. I’m not sure if lazy geniuses might count as an exception. Still, I expect a strong correlation here. Most people who are conscientious are not agents, but those agents whom you observe are probably conscientious.

The last few traits could be considered “positive traits”: active traits that agents must possess. There are also “negative traits”: traits that most people have and agents must have less of.

Agents strive for more, but the price they pay is a willingness to risk getting even less. If you drop out of college, you might make millions of dollars, or you might end up broke and without a degree. When you make your own plans and possibly go off the beaten path, there is a likelihood of failure. What’s worse, if you fail then you can be blamed for your failure. Pity may be withheld because you could have played it safe and gone along with everyone else, and instead, you decided to be weird.

Across all the different situations, agents might be risking money, home, respect, limb, life, love, career, freedom and all else they value. Not everyone has the constitution for that.

Now, just a bit more needs to be said about agents and the social situation. Above it was implied that the plans of others are essentially orthogonal to those of an agent. They’re not limited by them. That is true as far as the planning process goes, but as far as acting on plans goes, it takes a little more.

An agent doesn’t just risk that their unusual plans might fail in ways more standard plans don’t, they also have to risk they will 1) lose out on approval because they are not doing the standard things, 2) actively be punished for being a deviant with their plans.

If there is status attached to going along certain popular pathways, e.g. working in the right prestigious organizations, then anyone who decides to follow a different plan that only makes sense to them must necessarily forego status they might have otherwise attained. (Perhaps they are gambling that they’ll make more eventually on their own path, but at least at first they are foregoing.) This creates a strong filter: agents are those people who were either indifferent to status or willing to sacrifice it for greater gain.

Ideally it would only be potentially foregone status which would affect agents, yet instead there is the further element that deviance is often actively punished. It’s the stereotype that the establishment strikes out against the anti-establishment. Every group will have its known truths and its taboos. Arrogance and hubris are sins. We are hypocrites who simultaneously praise those who have gone above and beyond while sneering at those who attempt to do the same. Agents must have thick skin.

Indeed, agents must have thick skin and be willing to gamble. In contrast, imitation (which approximates non-agency) serves the multifold function of a) saving computation, b) reducing risk, and c) guarding against social opprobrium and even optimizing for social reward.

Everyday Agents

I fear the above discussion of agency has tended too much toward the grandiose, toward revolutionaries and founders of billion-dollar companies. Really though, we need agency on much more mundane scales too.

Consider that an agentic employee is a supremely useful employee since:

  • If you give them a task with limited instruction, they will use their own new computation/planning to figure out how to execute it well. They don’t need things step by step.
  • They will supply their own sense that more is possible and push for excellence.
  • They will take initiative to make things better because of their sense that more is possible.
  • They will not be inhibited by excessive fear of failure or of your disapproval because they did something other than your explicit instruction.
  • They’re willing to take on new and unusual tasks and learn new skills because:
    • They have high self-efficacy
    • They’re in the habit of thinking their own fresh thoughts instead of imitating and enacting ready-made plans.
    • They’re willing to fail in the course of trial-and-error to figure things out.

An agentic employee is the kind of employee who doesn’t succumb to defensive decision-making.

Why Agency is Uncommon

The discussion so far can be summarized neatly by listing what makes agency uncommon:

  • Agency is rare because it involves planning for yourself, going off the beaten path rather than imitating and copying the plans of others or your past self. Planning for yourself is really, really hard.
    • It requires the skill of planning for yourself.
    • It requires the expenditure of effort to do so.
  • Agency requires both a sense that more is possible and a striving to reach that more. Conversely agency is poisoned by the presence of complacency.
  • Agency requires belief in one’s self-efficacy and that one is the locus of control in their life.
  • Agency requires lower than average risk-aversion since attempting potentially non-standard plans means risking non-standard failure.
    • In particular, it requires low social risk-aversion.
    • This applies at both the macro and micro scale.
  • Agency requires conscientiousness.
  • Agency requires a resilience to social sacrifice either passively via foregone status or approval or actively via the punishment received for deviating from the norm.
Agent/Non-Agent vs. More and Less Agentic

This post is primarily written in terms of agents and non-agents. While convenient, this language is dangerous. I fear that when being an agent is cool, everyone will think themselves one and go to sleep each night congratulating themselves for being an agent, unlike all those bad dumb non-agents.

Better to treat agency as a spectrum upon which you can score higher or lower on any given day.

  • How agentic was I today?
  • Was I being too risk averse today?
  • Was I worrying too much about social approval?
  • Am I trying to think with fresh eyes and from first principles about new ways I could accomplish my goals? Or am I just rehashing the same possibilities again and again?
Addendum: Mysterious Old Wizards

A friend of mine has the hypothesis that a primary way to cause people to be more agentic is to have someone be their mysterious old wizard, a la Gandalf, Dumbledore, and Quirrell. A mysterious old wizard shows up, believes in someone, probably says some mysterious stuff, and this helps induce agency.

I can see this working. This might have happened to me a bit, too. If someone shows up and is sufficiently high-status in your mind, and they tell you that you are capable of great things, they can cause all the following:

  • You allow yourself to believe more is possible because the wizard believes it too.
  • You believe that you are capable (self-efficacy, locus of control) because the wizard does.
  • You are willing to go on your quest despite social opprobrium, because now you only care about the society of you and the wizard, not anyone else.

I can see it working.


Experimental Open Thread April 2019: Socratic method

April 1, 2019 - 04:29
Published on April 1, 2019 1:29 AM UTC

This post was popular, but the idea never got picked up. Let's have an experimental open thread this month!

The rules:

Top level comments would be claims. Second level comments would be discouraged from directly saying that someone is wrong, and instead encouraged to ask questions to get them to think.

Let top level comments be debatable claims, first tier responses be questions, second tier responses be answers, then further questions and answers, etc. Try to go as deep as possible; I'd expect an actual update to become increasingly likely as you continue the conversation.


Open Thread April 2019

April 1, 2019 - 04:14
Published on April 1, 2019 1:14 AM UTC

If it’s worth saying, but not worth its own post, you can put it here.

Also, if you are new to LessWrong and want to introduce yourself, this is the place to do it. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome. If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, and seeing if there are any meetups in your area.

The Open Thread sequence is here.


What are effective strategies for mitigating the impact of acute sleep deprivation on cognition?

March 31, 2019 - 21:31
Published on March 31, 2019 6:31 PM UTC

I've recently been finding that I struggle much more with intellectual work (math, hard programming, writing, etc.) when I sleep less than 6.5-7 hours. While I'm at peace with the fact that I seem to generally require >7 hours of sleep, it's frustrating that even though I set aside enough time for adequate sleep, I'll often wake up after only ~6 hours and not be able to fall back asleep.

My cognitive ability seems to be impacted by a single night of bad sleep even when I've been sleeping well in the recent past. Concretely, if I've slept 8 hours every night for two weeks, a single night of poor sleep can still result in a ~50% less productive day.

In addition to impacting productivity, acute sleep deprivation also leaves me much less capable of entertaining myself by thinking, so I become much more inclined to seek out distracting forms of entertainment like scrolling through the internet. It also seems to increase my cravings for generally "unhealthy" foods (I've seen references to this in literature, but won't bother linking them since it's not the focus of my question).

Other useful notes about my general sleep habits/history include:

  • I'm not sure if I've always been this sensitive to sleep deprivation and just notice it more due to a combination of more introspection and spending more time on certain activities, or if something's changed and I've become more sensitive.
  • I generally have 1 cup of coffee in the morning around when I wake up. More cups of coffee do not seem to offset sleep deprivation's impact on my cognitive ability, and in fact have at times exacerbated it.
  • I've tried napping when it's fit with my schedule and each time ended up lying awake for the 20-40 minutes during which I intended to nap.

I'd love to hear others' strategies for mitigating the impact of acute sleep deprivation on cognitive ability. I've done some preliminary searching for papers, articles, etc., but those that I've found focus on reducing tiredness rather than on returning cognitive ability to baseline. I'm open to trying strategies including but not limited to diet changes, supplements, medication, and habit changes.


The Case for The EA Hotel

March 31, 2019 - 15:31
Published on March 31, 2019 12:31 PM UTC

Epistemic Status: I strongly believe all the things I’m writing here. These are mostly heuristics and mental models rather than hard data, which I think is necessary for a project so young. I’m trying to make a strong case for the EA hotel, not a balanced one (although it will probably be balanced by the $100 on the line for articles taking the opposite view).

The EA Chasm

There’s something broken about the pipeline for both talent and projects in the EA community. There are a lot of talented people in EA who want to do good, and a lot of people with ideas for projects that could do good. Finally, projects like Charity Entrepreneurship seem to indicate that there’s no shortage of ways to do good. What’s missing is a way to go from a talented EA—with no evidence behind your project, no previous projects under your belt, and little status within the EA community—to someone who has enough capital to prove that their project has merit.

This gap exists for a number of reasons, including strong risk aversion in the EA community, a lack of diversity in grant decision making processes, and a lack of manpower to vet hundreds of projects for the small amount of money they would need to prove themselves enough to move up to the “projects with strong evidence” category. A number of solutions have also been proposed to fill in this gap, including an EA projects evaluation platform and a suggestion for EAs to work on Non-EA projects in order to get a good track record and higher status (and thus be able to be hired or get grants). However, both of these suggestions miss out on one of the big reasons the chasm needs to be filled—strong vetting is nice, but there’s no replacement for simply trying many things and seeing what works.

Why The Chasm Matters

This Chasm is a big deal for the community. Organizations like CEA can work to guide the community towards a better future, and organizations like Charity Entrepreneurship can slowly work to allow more organizations that do good work. But by not tapping into the creativity and sheer variety of thought of the bottom two sections of the picture above, the EA community is losing out on a large number of utils that come from trying a lot of things from a diversity of perspectives, creating tight feedback loops, and seeing what works.

Silicon Valley is great proof of this concept. While it’s true that the standards for seed funding have been growing in recent years (and this may be another factor in the EA model, if they’re trying to copy Silicon Valley), it’s also true that preseed accelerators with extremely low vetting standards have still generated tens of billions of dollars worth of value. EA, with a surplus of ideas that don’t have capital to get off the ground, and a surplus of talented individuals willing to work on these ideas, should view this as a neglected opportunity to do a lot of good for the world. And they should view the EA Hotel as a wonderful proof of concept for an organization looking to fill in this Chasm.

The EA Hotel is More Effective Than Directly Sponsoring Individuals or Projects

One way to view the EA hotel is as a grant giving organization that pays for people’s living expenses for a period of time, while those people have opportunities to prove that their projects are good enough to get to the next stage of funding. For EAs who are still looking for projects, it provides a bridge to focus on gaining skills and knowledge while getting chances to join new projects as they circulate through the hotel.

When the EA hotel is looked at in this light, the question then becomes “does it make more sense to fund individual projects and EAs, rather than letting the EA hotel fund them for you?” The EA Hotel has several features that make it a more effective option.

No Rent

The largest living expense for most people (especially the large number of EAs in London, Oxford, and the Bay Area) is rent. When sponsoring someone yourself, most of your money will be going into that black hole. The EA hotel has done the efficient thing and bought the hotel outright. This means that rent is not something you as a funder have to pay, and the longer the hotel lasts and the more residents it helps, the more efficient this mechanism becomes over paying rent.

Cheap Cost of Living and Lower Standard of Living

One unique thing about the EA hotel as a grant-giving mechanism is that it forces the residents to move to Blackpool. While there are some downsides to this, I think there are two huge upsides from a cost-effectiveness perspective. The first is that the cost of living is extremely low. Just like with rent, funding the EA hotel here consistently makes your money go further than funding a random EA who would choose their own place to live.

Another important fact is that the standard of living here is simply lower. While trying to be extremely frugal in San Francisco, I couldn’t help but notice that my standard of living and happiness were impacted by those around me. However, as a consequence of living in Blackpool, and a secondary consequence of only having my savings and a small living stipend, I’ve found that I’ve been happier with a much cheaper standard of living in Blackpool. There’s some data showing that the effect of standard of living on happiness is relative to others in your immediate environment, not absolute. This means that I can be happy and productive at a much cheaper cost at the EA hotel than at a group house in Berkeley, and your donation dollars can stretch further.

Propinquity and Collaboration

By putting all of the projects together under the same roof, the EA hotel does an excellent job of fostering connections, encouraging collaborations, and creating a strong environment for serendipity and synergy among projects. In my short time here, I’ve seen a methodologist help an organization with designing their RCT, a coder help a different organization automate one of their biggest bottlenecks, and an organization which needed help on measuring impact get help from someone who had written an important paper on the matter. More importantly than these individual collaborations, I’ve seen people’s ideas grow and develop as they get exposed to critiques and new ways of thinking. This is an effect you simply don’t get if you sponsor projects separately instead of as a group.

Superconnecting and Status Building

The final thing I’ve seen from the EA hotel is that, while being in a cheap, out-of-the-way city, it’s enough of a unique attraction (and there are always enough free rooms available) that it has become a ‘destination’ for EAs to check out when they’re in Europe. This is an important fact, as normally one of the benefits of being in a more expensive city (and one of the reasons most startup incubators are located there) is that it allows you to begin building connections with the people you’ll need to know when moving to the next stage of the pyramid. However, by having the “hotel” aspect, and becoming a destination, the EA hotel manages to attract a steady stream of individuals from all aspects of the EA community. It has managed to become an effective networking hub while being in a city with a cheap cost of living, and has achieved something for projects that merely funding them to live on their own could not.

The EA Hotel Is An Effective Incubator

Thus far, I’ve made the case that there’s a surplus of potential in the EA community, and a Chasm that needs to be filled to use the surplus. I’ve also made the case that something like the EA Hotel is an effective way to fill that Chasm. What I haven’t done is make the case that this particular team and project have done a good job of realizing that goal.

In the following section, I’ll attempt to give my inside view of why I believe the project and team are suited for filling the goal, as a 3-month resident of the hotel, and someone who has witnessed and created other teams and cultures.

Correct Acceptance Standards

One persistent criticism of the hotel is that it has too low standards for what projects it accepts. However, the standard that the hotel has (accept everyone when there’s space, and only prioritize when they’re over capacity) is the correct choice for an organization that’s trying to fill the Chasm like the EA hotel is.

Let’s return to our Silicon Valley metaphor, and the pre-seed incubator I alluded to earlier, The Founders Institute. The Founders Institute, while I don’t think they admit it publicly, has a similar policy of accepting as many candidates as there are slots, and trying to maximize the amount of projects rather than having some perceived quality cutoff. The Founders Institute knows two things.

  1. At this stage in their career, it’s very hard to vet first time founders. Without a track record, all they have to go on is charisma and clarity of thought, which is actually something that many first time founders will only learn through the process of creating their first startup.
  2. Sometimes the best ideas look completely ridiculous. Consider the idea of creating a website where strangers can rent out their homes to other strangers.

So instead, the Founders Institute does something else—it implements a series of tight feedback loops and standards, causing founders to have to prove both themselves and their projects to graduate the program. While the acceptance rate for the Founders Institute is very high, the graduation rate is only around 30%. The hope is that most of those 30% have achieved enough in their project to get them a more traditional seed round.

Similarly, the EA hotel has weekly check-ins to gauge the progress of their participants, and is working on implementing more stringent feedback loops for the people who enter the hotel. The goal, instead of trying to vet the people and projects up front, is to use the process itself to vet the project and the individual. As they pass increasingly high bars, they eventually cross the bar where they achieve good evidence for their project, and can then move on to the next stage of the pyramid. If it turns out they can’t meet that bar, they go back down to the previous stage of the pyramid, work on leveling up, and try again when they think they’re ready.

Removing Trivial Inconveniences

As a creator or early participant in a new project, focus is everything. Time and attention are wasted when put toward things other than those that directly work to impact your biggest metrics, or validate your biggest assumptions. Furthermore, the type of work you have to do to validate or invalidate these assumptions is scary, hard, and often emotionally draining. Every little bit of energy that you can save by not having to deal with trivial inconveniences is a blessing.

At the EA Hotel, my grocery shopping is taken care of for me. Dinners are cooked for me. Grab and go food for breakfast and lunch is restocked without my having to think about it. My dishes are done for me. My sheets are changed for me. All of this allows me to avoid an incredible amount of context switching that simply doesn’t have to happen because the EA Hotel recognizes the importance of focus. Furthermore, they’re always improving. A big portion of the managers’ job is finding trivial inconveniences and removing them. Areas get more organized over time, systems get refined over time, busy work gets removed over time. This is exactly the environment that I expect to be able to more effectively create valuable projects over time.

A Productivity Culture

I’ve been a part of several group houses in which a large portion of the people who lived there worked from home. I’ve been a part of at least one attempt to instill a strong culture of working hard when in the presence of your peers. There’s only one culture that I’ve been a part of that I think has more of a culture of productivity than the EA Hotel, and I’d say it’s in the top 5-10% of creating and sustaining strong organizational cultures, including both for-profits and non-profits (so much so that it has been called cult-like).

I think that the attitude towards trivial inconveniences is a big part of this. The idea that the management is clearing so much space for work creates a culture where everyone is simply working. I should note that when bringing this to the EA hotel, there was at least one person who said he doesn’t believe the EA Hotel has enough of a productivity culture. When polled, everyone else present agreed that they’re more productive here than they have been in any other context. This is a huge boon for an organization trying to vet projects as quickly as possible, and somehow the EA hotel does this better than any other co-living situation I’ve been a part of.

A Growth Culture

One of the big pushes the EA hotel has made in the last few months is to foster a culture of growth. There are weekly talks by members of the hotel that are highly attended. There are weekly opportunities for debugging bottlenecks in your life, and learning new skills to make that debugging more effective. A sizable portion of the hotel attends the local gym, and there is someone to go with almost every day of the week, at various times that suit you.

As important as these individual activities are, the most important thing is the culture that develops around them. Growth is accepted and expected at the EA hotel, and that’s important for people creating new projects and learning the skills as they go.

A Support Culture

Another big shift I’ve seen the EA hotel make in the past few months is towards a culture of support. In concrete terms, you can see this in the sizable population of people who participate in morning and nightly hugs, greeting each other with a strong hug the first and last time they see each other every day. It’s also strongly visible in the existence of a Hotel Guest Representative, whose main job is to listen when hotel residents are having a hard time and look out for their interests. It’s visible in the nightly group dinner, and the easy discussion that usually accompanies it. However, it’s more visible in the day-to-day interactions you have with guests, such as when someone offered to bring food up to my room when I was sick, or seeing the celebration when a guest got their paper published in a journal.

Creating a project from scratch is hard; first time startup founders often find themselves falling into depression and loneliness, failing simply because they don’t have the support to take on the demands of the job. The support culture of the EA Hotel goes a long way towards making it more bearable.

Consistent Improvement

A final thing that has impressed me about the EA hotel is the ability of Greg and the trustees to take feedback and improve the concept over time. I’ve already mentioned the changes I’ve seen in the past few months, but an even better sign to me is the way Greg listens to criticism and responds to feedback. Whenever he hears a good idea, it’s immediately written down, and the best ideas are tested and implemented over time. This gives me cause to believe that the EA hotel hasn’t just lucked onto the above aspects of its culture, but is likely to continue to develop into an even more effective organization over time.


I’ve made three major claims in this post. First, I’ve made the claim that EA as a movement could be doing a lot more good if it filled the Chasm in its pipeline. Second, I’ve argued that something like the EA Hotel is a good way to fill this Chasm. Finally, I’ve argued that the EA Hotel has functioned well as an organization dedicated to filling this Chasm, and that there’s reason to believe it will continue to do well in the future. While I don’t think it’s impossible the EA Hotel could fail at this goal, my inside view gives me confidence that it’s very likely to succeed. However, even with the outside view, I believe the models given in this post make the case that the EA Hotel, if successful, would be highly useful in expectation, and make a strong cost-effectiveness case for an EA Hotel like project if you think those models are accurate. I look forward to your feedback and comments.

About Me

I’m Matt Goldenberg. I’ve been living at the EA Hotel for about three months, and I’m due to leave in another three. I’ve been a resident at a number of EA and rationality group houses in the Bay Area, including a brief stay at Event Horizon, a stint at Milvia House, and as a cofounder of Gentle Mesa. I’ve previously run a number of small businesses, and one startup. While at the EA hotel, I’ve been working on my main project Project Metis, as well as writing a number of articles on the side for LessWrong.


What would 10x or 100x better than CFAR look like?

March 31, 2019 - 08:55
Published on March 31, 2019 5:55 AM UTC

If you were creating a rationality-increasing org, how would you design it? Focus on whatever properties you care about.


What LessWrong/Rationality/EA chat-servers exist that newcomers can join?

March 31, 2019 - 06:30
Published on March 31, 2019 3:30 AM UTC

A few times over the past few years newcomers to the community have asked me how they can start engaging with the community, and one of the lowest-cost ways to engage has historically been chatrooms.

I know that there has been a pretty significant proliferation of LW/Rationality/EA-related chat-servers in the past few years, and it seems like a good idea to collate all of them into a central repository.

I know about Elo's LW-Slack, and I know of the abstract existence of a bunch of EA-related Discord channels, but haven't been super active in this space, so I would appreciate other people's contributions.

If we get a good list, I think it would make sense to add that list to the community tab to make it easier to discover (or at least link this question from the community section).


Why Planning is Hard: A Multifaceted Model

March 31, 2019 - 05:33
Published on March 31, 2019 2:33 AM UTC

Epistemic confidence: Highly confident. This post doesn’t cover everything relevant to the topic, but I am confident that everything presented is solid.

You may have noticed that planning can be rather difficult. Granted, not all plans are difficult; planning what you’re going to have for dinner usually isn’t too bad. However, serious planning can range from merely daunting to seemingly intractable. Think of the challenge of planning towards a satisfying career, a fulfilling relationship, the success of your startup, or the simple preservation and flourishing of human civilization.

I present here a gears-level, reductionist account of planning which makes it starkly clear why planning is hard. The point is not that we should give up because planning is futile. Far from it. With a detailed model of the factors which make planning hard, we can derive a roadmap for how to get better at planning and ultimately arrive at a unified, powerful approach to making better plans.

What is planning?

We can’t talk about why planning is hard before clarifying what planning is. It’s pretty simple:

Planning is the selection of actions in order to achieve desired outcomes. [1]

The need for planning arises when the following conditions hold:

  • There are multiple states that the world can be in.
  • You have preferences over which states the world is in.
  • There are multiple actions available to you, each of which might cause the world to more likely be in some states rather than others.
  • You cannot take all of the actions, either because you lack the resources to do so or because the actions are inherently exclusive.

These conditions lead to a refinement of the definition above: planning is the selection of actions from among competing alternatives. One must prioritize among one’s available options and allocate resources to whichever options rank most highly.

Often one is allocating resources among multiple options in proportion to their priority, but I think we can still rightly call it prioritization even when one is deciding between allocating 100% of their resources to only one option out of two. In other words, all of planning is prioritization. [2]

Planning is a Prediction/Information Problem

On what basis should we select or prioritize one action over another? Of course, we should select whichever actions we most expect to lead us to the worlds we most prefer. We should choose actions based on the Expected Value (EV) we assign to them.

And therein lies one of the core challenges of planning. Expecting is just another word for predicting. And predicting here just means assigning each action a probability distribution over the world states it will result in. That’s hard to do well.

Planning is hard because predicting is hard. Of course, predicting is a lot easier when you have more information, but usually we have far less than we’d like, so planning is hard because of limited information. Planning is a prediction problem and an information problem.
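
To make the expected-value framing concrete, here is a minimal sketch of choosing among actions by EV. All of the actions, states, probabilities, and utilities below are invented purely for illustration:

```python
# Toy expected-value action selection. Each action induces a probability
# distribution over world states; each state has a utility.
# All numbers below are invented for illustration.

actions = {
    "take_job_offer": {"thriving": 0.5, "okay": 0.4, "miserable": 0.1},
    "stay_put":       {"thriving": 0.1, "okay": 0.8, "miserable": 0.1},
}
utility = {"thriving": 10.0, "okay": 3.0, "miserable": -5.0}

def expected_value(dist):
    """Sum of utility weighted by the probability of each state."""
    return sum(p * utility[state] for state, p in dist.items())

best = max(actions, key=lambda a: expected_value(actions[a]))
# take_job_offer: 0.5*10 + 0.4*3 + 0.1*(-5) = 5.7
# stay_put:       0.1*10 + 0.8*3 + 0.1*(-5) = 2.9
```

The arithmetic is trivial; the hard part, as argued above, is getting those probability distributions right in the first place.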

People think of planning as being about “doing”, but in truth planning is just as much about “knowing”. One thing this amounts to is that the instrumental rationality involved in planning is inseparable from the epistemic rationality of having true beliefs, good models, and making accurate predictions. Any planner’s ability is going to be capped by their epistemic skill.

What we can do with this realization is that in any situation where we’re planning, we can pay deliberate attention to the predictions we need to make, the information we have, and ways in which we might be able to make better predictions. (I call this the Information Context of a plan.) Alternatively stated, one can begin to approach planning with an uncertainty reduction mindset [3]. Rather than allocating all of one’s resources from the outset, one instead allocates a portion of the resources towards “purchasing” information which improves the subsequent allocation of the rest.

We do this already in many cases (reading reviews, asking friends, trials), but the occasions where we fail to do this can cost us big time: the student who didn’t research before starting law school, the company that spent years developing a product before testing with users, the couple who rushed into marriage, etc. (Not the only cause, but emotions often make us impatient to go all out with an option without adequate consideration.)
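
The uncertainty reduction mindset can be made concrete with a toy value-of-information calculation. All numbers are hypothetical, and the test is assumed perfectly informative for simplicity:

```python
# Toy value-of-information calculation: how much is it worth to "purchase"
# information before committing? Numbers are invented; the market test is
# assumed to be perfectly informative for simplicity.

p_good_market = 0.5  # prior probability that the product's market exists

payoff = {
    ("launch", True): 100.0, ("launch", False): -60.0,
    ("abstain", True): 0.0,  ("abstain", False): 0.0,
}

def ev(action, p):
    """Expected payoff of an action given probability p of a good market."""
    return p * payoff[(action, True)] + (1 - p) * payoff[(action, False)]

# Commit now: take the better action under the prior.
ev_now = max(ev("launch", p_good_market), ev("abstain", p_good_market))  # 20.0

# Run the (perfect) test first, then launch only on good news.
ev_after_test = (p_good_market * payoff[("launch", True)]
                 + (1 - p_good_market) * payoff[("abstain", False)])     # 50.0

value_of_information = ev_after_test - ev_now                            # 30.0
```

In this sketch, paying anything less than 30 units for the test is worth it. The same structure underlies reading reviews, asking friends, or running trials before committing.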

Unfortunately, while it’s easy to say that getting more information and reducing uncertainty is a good thing, there’s still a lot of complexity in managing uncertainty: knowing how much is okay and how to efficiently reduce it. Still, always remembering that prediction is a core challenge of planning is a start.

Planning is a Computation Problem

Unfortunately, even having all the right information and perfect prediction would not be enough to make planning easy. Even when one can perfectly predict the outcome of all alternative options, planning often gives rise to intractably large computational problems which are NP-Complete. (I provide a small mathematical treatment in this comment below.)

We are usually allocating a finite set of resources (our time, our money) among a set of options in order to accomplish a variety of goals. Example: I might care about my health, entertainment, career, friendships, romance, art, and education. Towards these values, I could in a given week: sleep, play tennis, go to the gym, watch Netflix, work late, call up my bestie, go on Tinder, draw a picture, read a textbook, etc. I get to allocate my time to some combination of activities, but the thing is, the number of possible combinations for allocating my time in a single week is mind-boggling.

A super-simplified example: I have 30 discretionary hours in a week and 10 possible activities I could spend each of those hours on, but I only spend time on activities in two-hour blocks. This results in 10^(30/2) = 10^15 = one thousand trillion different combinations of how I could spend my time that week. Even if I could perfectly predict how much I’d like each combination of time usage, I could never consider them all explicitly.
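
The arithmetic of the example above can be checked directly, along with what it implies for exhaustive search:

```python
# Checking the combinatorics of the example: 30 discretionary hours spent
# in two-hour blocks, with 10 candidate activities per block.
blocks = 30 // 2                     # 15 two-hour blocks
activities_per_block = 10
combinations = activities_per_block ** blocks
print(combinations)                  # 1000000000000000, i.e. 10^15

# Even evaluating a billion combinations per second, exhaustive search
# would take 10^15 / 10^9 = 10^6 seconds, roughly 11.6 days.
```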

This computational complexity was very salient to me in my last job as a Product Manager, where every six weeks I would choose which projects my team of Data Scientists would work on. Under ideal circumstances, I might get to choose ten projects out of a candidate twenty projects. Choosing was hard, especially since in addition to the pure value of each project, I had to juggle the facts that: a) projects were often not independent, b) projects often have future costs, e.g. maintenance and tech debt, c) there are costs to not doing or delaying some projects, d) the benefits of different projects are spread over different time horizons, e) projects aren’t commensurate, e.g. protecting downside risk vs building new functionality, f) some projects are necessary to preserve future optionality, g) budget constraints were soft, and I could sometimes steal time from elsewhere if it meant I could select a better set of projects.

Even if I’d had perfect information and prediction, which I certainly didn’t, the number of possibilities was large enough to be intractable by any analytical method of prioritization.
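To illustrate, here is a minimal, hypothetical version of the project-selection problem: brute force over every portfolio of 10 projects out of 20, with made-up project values. Even this toy version, which ignores dependencies, soft budgets, and differing time horizons, already requires examining 184,756 candidate portfolios.

```python
from itertools import combinations
from math import comb

# Toy version of the example: pick 10 of 20 candidate projects.
# Values are made up; real projects also interact (dependencies,
# tech debt, optionality), which breaks simple per-project ranking.
n_candidates, n_slots = 20, 10
values = [(i * 7) % 13 + 1 for i in range(n_candidates)]

print(comb(n_candidates, n_slots))  # 184756 possible portfolios

# Exhaustive search over every portfolio, assuming (unrealistically)
# that portfolio value is just the sum of independent project values.
best = max(combinations(range(n_candidates), n_slots),
           key=lambda portfolio: sum(values[p] for p in portfolio))
best_value = sum(values[p] for p in best)
```

Because the toy values are independent and additive, a greedy pick of the ten highest-value projects matches the brute-force optimum here; the interactions listed above (a through g) are exactly what destroys that shortcut in practice.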

I am least certain about how best to tackle the computational problem of planning, but my current guess is that it relies heavily on using our instinctive, intuitive System 1 thinking in conjunction with our System 2 thinking. System 1 is better at handling problems too large to be consciously considered, while System 2 can ensure that System 1 is paying attention to all the relevant considerations. I like cost-benefit analyses and decision matrices not because I think they should make the final decision, but because the exercise of creating them loads the right information into System 1.
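As a sketch of the decision-matrix exercise just mentioned (all options, criteria, weights, and scores below are made up), the value is in being forced to articulate the criteria and scores, not in trusting the final ranking:

```python
# A minimal weighted decision matrix. Everything here is hypothetical;
# per the text, the exercise of filling it in matters more than the output.
weights = {"health": 0.4, "career": 0.35, "fun": 0.25}

scores_by_option = {
    "gym":     {"health": 9, "career": 2, "fun": 4},
    "netflix": {"health": 1, "career": 1, "fun": 8},
    "reading": {"health": 2, "career": 8, "fun": 6},
}

# Weighted total per option.
totals = {name: sum(weights[c] * s for c, s in row.items())
          for name, row in scores_by_option.items()}

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total:.2f}")
```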

Planning is a Self-Knowledge & Self-Mastery Problem

In a three-fold manner, planning is a problem of self-knowledge and self-mastery.

Predicting yourself

The first problem of planning described above was that of prediction and information. With the exception of plans made for teams, organizations, countries, and the like, your plans will concern yourself and your actions. That means that invariably one of the most important things to be able to predict is yourself.

Sometimes this is hard because we lack information. It might be hard to predict how you will behave in novel circumstances. Other times it can be hard to predict ourselves because we are averse to making the best possible predictions we could. We refuse to admit that realistically we are not going to get up at 6am to go to the gym.

A particular aspect of yourself which it is key to be able to predict is your motivation. Plans which rely on unrealistic predictions about how motivated you will feel while executing the plans are probably going to be unsuccessful plans.

Advice I have for this aspect of planning: a) start paying attention to yourself and to predicting yourself; fortunately, you usually have a lot of data to work with; b) adopt a policy of radical self-honesty; what is true is already so, and good predictions are the basis of good plans.

Knowing what you want

This post started with the definition that planning is the selection of actions towards desired outcomes. It is rather important that you correctly identify your actual desired outcomes; otherwise, any actions you select aren’t likely to be worth much. “I thought I wanted X, but actually I didn’t” is too often the epitaph of a supposedly successful plan whose planner evidently lacked good self-knowledge.

I’m of the belief that one’s deep-down desired outcomes (personal and moral) are contained inside one’s brain, regardless of whether one has achieved good conscious access to them or inferred all their consequences. Much of good planning is reducing uncertainty around what it is that you actually want and value.

Self-prediction is a special case of the general prediction/information problem, but it requires different techniques of uncertainty reduction than outside-world prediction does. Introspective methods such as meditation, Focusing, Internal Family Systems, etc. are helpful for having better knowledge of which outcomes you actually want.

Of course, complicating the discovery of your desired outcomes is that the notion of “you” may be a little complicated. I have been persuaded over the years that it can be very useful to think of yourself as being made up of parts or sub-agents, each with their own particular desires. Mastering yourself means coming to understand your parts and their desires, leading to an ability to make plans that satisfy all of yourself. This can be crucial, since making plans which parts of you are not on board with often causes those parts to undermine the plan.

Relatedly, “making yourself do something” should always be a warning sign that some part of you is being disregarded. Sometimes that’s legitimate, but only if you’re accounting for it in your self-predictions and for how it’s going to affect the plan’s overall success.

Self-mastery of your human brain

Human brains are really, really good, but they’re also really complicated, with a lot of different moving pieces and no user manual. Most of us manage, but there are gains to be had from getting better at operating our own minds.

Heuristics and biases

To get better at predicting, it helps to understand when our brains natively do and don’t make good predictions of their own accord. That leads directly into heuristics and biases, and into attempts to become a lens which sees its flaws.

Using System 1 (intuition) and System 2 (explicit reason) in harmony

Human brains run on both, and each system has its advantages and disadvantages. The human who is leveraging their brain to its fullest extent is using all of their mind in harmony, not privileging or disregarding one kind of thinking inappropriately. This is hard, but it is a key part of getting good at planning.

Emotional Mastery

Arguably emotions could be lumped under System 1, but they’re worth calling out separately. While emotions serve multiple purposes, one of the things they do is carry an important signal from your subconscious mind. If you feel anxious about something or have a niggling doubt, that’s because a part of your mind has been processing raw data and finding it significant. The skilled planner and predictor will be someone who can extract that valuable information from their emotions.

However, emotional mastery is perhaps even more important to planning in another way. Many, many plans are bad plans because people lapse into optimizing for their emotions rather than their actual long-term desired outcomes. A person who rushes into a plan with inadequate information because they dislike being in a state of indecision is likely choosing a worse plan because they were unable to handle their unpleasant emotions. So is a person who only executes extremely conservative plans because anything bold makes them feel too anxious. I believe schools of thought which help with emotional mastery, such as ACT, CBT, and DBT, have a place among the training materials for great planners.

Planning is Recursive

We’ve covered that planning requires identifying your options, predicting their outcomes, evaluating the goodness of said outcomes, and then somehow crunching through all the different combinations of possible actions to select the best overall set.

Except planning is recursive. Any difficult plan is going to be composed of multiple sub-plans, and for each of those sub-plans the whole process needs to be repeated: identifying options, making predictions, selecting combinations, etc.

First, this adds to the already extensive amount of computation involved in planning. Second, it can often cause plans to span multiple domains, straining all but the most versatile generalists. Have pity for the aspiring baker whose plan was to make the best muffins in town, but who now has to figure out double-entry accounting and the US tax code. Or the physicist-cum-founder who wanted to create clean energy technologies but now needs to figure out a sales and marketing strategy too.

At a base level, the challenges of good planning remain constant between domains, e.g. making good predictions, as do the advisable steps, e.g. reducing uncertainty. Yet the concrete steps for doing these might look very different. The physicist-founder might have been very good at reducing uncertainty in the laboratory with experiments, but the feedback loop for iterating on sales strategy might be completely different, even though in both cases you’re trying to reduce uncertainty.

Shifting between domains is probably one reason why people who are good at reducing uncertainty and planning in some domains are negligent in others. Arguably, the skilled planner is going to figure out the right specific techniques to achieve good plans across all the domains they touch. In some domains, you might have lots of experts you can poll; in others it might be cheap to run experiments; in some you can build good explicit models; in others it’s all about training intuition; and in yet further cases it might not be easy at all, and you’re left trying to draw tenuous inferences from historical examples.

Summary: What It Takes to Be a Great Planner

We can neatly summarize why planning can be so difficult with a list of all the traits one should have in order to overcome the difficulties.

  • One must be excellent at making predictions across domains, helped by their mastery of epistemic skill and virtue.
  • One must be skilled at identifying which information one needs yet lacks and at executing sub-plans to efficiently obtain this information across different domains and the disparate techniques required to do so.
  • One must be skilled at making choices within combinatorially explosive problems involving the selection of combinations of multiple options towards multiple goals via the use of heuristics, excellent intuition/instinct, or some other reliable method.
  • One must be able to make use of their System 1 and System 2 in harmony.
  • One must have knowledge of their true values, goals, and desired states of the world.
  • One must have knowledge of their sub-parts and alignment between them so that one does not undermine oneself.
  • One must be able to predict one’s own behavior including the behavior of one’s motivations and emotions so that one can plan effectively around them.
  • One must be able to reduce uncertainty about one’s self-model via experiments or introspective tools.
  • One must be able to listen to the information contained in their emotions.
  • One must not let their emotions coerce them into plans which sate their emotions while sacrificing their overall goals.
  • One must be able to plan effectively up and down the recursive tree of their plans: enumerating, evaluating, and reducing uncertainty wherever needed.


[1] This broad definition stands in opposition to a common definition which uses planning primarily in contexts of scheduling, e.g. plan your day, plan your week. The broader definition here is essentially synonymous with decision-making, perhaps differing only in connotation. Decisions somewhat imply a one-off choice between options, whereas planning implies the selection of multiple actions to be taken over time. The term planning also, more than decision-making, highlights that there is a goal one wishes to achieve.

[2] I often hear people faced with clear prioritization tasks plead that they need more resources, that resource scarcity is the problem. Sometimes it is; sometimes the best action is to get more resources. But it is often a fallacy to think that if you can't solve prioritization now, it will somehow be easier when you have more resources. However many resources you have, they will be finite, and you will still be able to think of things you'd like to do with even more resources, things which feel just as necessary. In short, you should get used to choosing sooner rather than later.

[3] To be technical, uncertainty reduction is about concentrating the probability mass of your prior distribution into narrower bands.
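A toy illustration of this footnote, with made-up numbers: a Bayesian update takes a flat prior and concentrates its probability mass onto the hypothesis the evidence favors.

```python
# Toy illustration of footnote [3]: updating on evidence concentrates
# the prior's probability mass into narrower bands. All numbers made up.
hypotheses = ["low", "medium", "high"]
prior = {h: 1 / 3 for h in hypotheses}                 # flat prior
likelihood = {"low": 0.1, "medium": 0.6, "high": 0.3}  # P(evidence | h)

unnormalized = {h: prior[h] * likelihood[h] for h in hypotheses}
z = sum(unnormalized.values())
posterior = {h: p / z for h, p in unnormalized.items()}

# Mass concentrates on "medium": 0.6 in the posterior vs 0.33 in the prior.
```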


Rationality Vienna Meetup April 2019

Published on March 31, 2019 12:46 AM UTC

Open to all people interested in epistemology, overcoming biases, science, self-transformation, and all related topics.

15:00 - 15:30: arrival & socializing with tea/coffee
15:30 - 16:00: round of introductions (if new attendees - warmly welcome!)
16:00 - 18:00 open microphone & discussion TBD
~18:00: quickly cleaning the room together

Followed by dinner together in the city.
You are also welcome to only join for the dinner.


Ductive Defender: a probability game prototype

Published on March 30, 2019 12:31 PM UTC

I made a new version of my probability/inference game, and uploaded it here. It's free and playable in-browser: if using scientific thinking to defend a spaceship from explosive mines sounds interesting to you, try it out and let me know what you think.


Will superintelligent AI be immortal?

Published on March 30, 2019 8:50 AM UTC

There are two points of view. Either such an AI will be immortal: it will find ways to overcome the possible end of the Universe and will perform an infinite number of computations. For example, Tipler's Omega is immortal.

Or the superintelligent AI will die in the end, a few billion years from now, and thus will perform only a finite number of computations (this idea is behind Bostrom's "astronomical waste").

The difference has important consequences for the final goals for AI and for our utilitarian calculations. In the first case (possibility of AI's immortality) the main instrumental goal of AI is to find ways to survive the end of the universe.

In the second case, the goal of AI is to create as much utility as possible before it dies.


List of Q&A Assumptions and Uncertainties [LW2.0 internal document]

Published on March 29, 2019 11:16 PM UTC


1. This is the second in a series of internal LessWrong 2.0 team documents we are sharing publicly (with minimal editing) in an effort to help keep the community up to date with what we're thinking about and working on.

2. Caveat! This is an internal document and does not represent any team consensus or conclusions; it was written by me (Ruby) alone and expresses my own in-progress understanding and reasoning. To the extent that the models/arguments of the other team members are included here, they've been filtered through me and aren't necessarily captured with high fidelity or strong endorsement. Since it was written on March 17th, it isn't even up to date with my own thinking.

3. I, Ruby (Ruben Bloom), am trialling with the LessWrong 2.0 team in a generalist/product/analytics capacity. Most of my work so far has been trying to help evaluate the hypothesis that Q&A is a feasible mechanism to achieve intellectual progress at scale. I've been talking to researchers; thinking about use-cases, personas, and jobs to be done; and examining the data so far.


Epistemic status: Since the 18th, when I first wrote this, I have many new lists and a lot more information. Yet this one still serves as a great intro to all the questions to be asked about Q&A and what it can and should be.

March 18, 2019

Related: Q&A Review + Case for a Marketplace

    • Is it actually the case that Q&A for serious research is this big, new, different thing which requires a big shift for people? Maybe it's not such an adjustment?
    • How willing are people to do serious research work for others on the internet?
  • RESEARCH PROCESS (and suitability for collaboration) <tease these out by talking through their recent research>
    • Can "significant research" be partitioned into discrete questions?
      • Or is it more that there is a deeper, bigger question around which someone needs to become an expert, such that any question posed is downstream of the real question and can't be treated in isolation?
      • Perhaps talk to the Ought folk about this.
    • Do people have general open research questions they vaguely want answered and are willing to have sit unanswered for a relatively long period of time?
      • Or do they mainly have (and prioritize) research questions which are currently part of their work?
    • How much interaction between the research requester and research contributor is required?
      • Can someone take a research question and execute successfully on their own without too much feedback from the person requesting the research?
      • If necessary, does Q&A facilitate this adequately? Are back and forth comments good enough?
      • Are busy research requesters willing to put in the time to interact with people trying to contribute, contributors whom they don't know and haven't necessarily vetted?
    • What kind of research questions are amenable to the format of LessWrong's Q&A?
  • PERCEPTIONS AND PRIOR BELIEFS <should get answered semi-automatically in interviews>
    • Is the mix of research and less research-y questions on Q&A now causing people to not think of Q&A as a place for serious research questions?
    • What are people's current impressions, expectations, anticipations of LW's Q&A, segmented by level of exposure?
      • e.g. if I tell someone LessWrong has a Q&A with the goal of serious research progress, what do they imagine? What's their reaction?
      • Do people think that they could be helped by Q&A? Do they want to use it?
  • INCENTIVES, WILLINGNESS, & EXPERIENCE <get at these questions by talking through how interviewees might or might not use Q&A>
    • How much (and what kind) of incentives are needed for contributors to want to contribute?
      • Are bounties of cash prizes enough?
        • If yes, is it because the money makes the effort worth it, OR
        • is it just that cash prizes are a costly signal that the question is important, and once that's clear, people would be glad to help?
        • Is bounty complexity an actual issue?
        • Are people doing an EV calculation with bounties, such that even if a nominal bounty is $500, people don't necessarily think it's worth a lot of work because their EV is more like $50?
    • How good does the ROI need to be for question askers to want to use the platform?
    • How low does the time and attention cost need to be for question askers to want to use the platform?
    • How much effort are question answerers willing to invest already?
      • It does look like that some StackOverflow questions are very involved. So some people are willing to take time to answer things.
      • A few of the questions/answers on Q&A right now are pretty involved. Not many, but a few.
    • What is the population of adequately skilled and available question answerers within the domains we care about? Is it enough to support a good Q&A ecosystem?
      • How many people believe they're qualified? <probably need more general polling>
        • What's the distribution of people in the 2x2 grid of "thinks they're qualified" x "actually qualified"?
    • What user base of contributors do we have to reach before the question asker experience is good enough to retain users?
  • OTHER <expect these to come up in talking through their use of Q&A>
    • Is privacy a major issue for potential question askers?
      • How do they feel if there are closed groups?
    • Is trust in research quality an issue for question askers?
      • What does it take to evaluate whether a research contribution is good?
        • How much can it be done just by reading the contribution or will it require redoing serious work?
        • Are question askers willing to do this?
        • Are third parties willing to do the evaluation?


Review of Q&A [LW2.0 internal document]

Published on March 29, 2019 11:15 PM UTC


1. This is the first in a series of internal LessWrong 2.0 team documents we are sharing publicly (with minimal editing) in an effort to help keep the community up to date with what we're thinking about and working on.

2. Caveat! This is an internal document and does not represent any team consensus or conclusions; it was written by me (Ruby) alone and expresses my in-progress understanding and reasoning. To the extent that the models/arguments of the other team members are included here, they've been filtered through me and aren't necessarily captured with high fidelity or strong endorsement. Since it was written on March 17th, it isn't even up to date with my own thinking.

3. I, Ruby (Ruben Bloom), am trialling with the LessWrong 2.0 team in a generalist/product/analytics capacity. Most of my work so far has been trying to help evaluate the hypothesis that Q&A is a feasible mechanism to achieve intellectual progress at scale. I've been talking to researchers; thinking about use-cases, personas, and jobs to be done; and examining the data so far.


Epistemic status: this is one of the earlier documents I wrote in thinking about Q&A and my thinking has developed a lot since, especially since interviewing multiple researchers. Subsequent documents (to be published soon) have much more developed thoughts.

In particular, subsequent docs have a much better analysis of the uncertainties and challenges of making Q&A work than this one. This document is worth reading in addition to them mostly as an introduction to thinking about the different kinds of questions, our goals, and how things are going so far.

Originally written March 17th

I’ve been thinking a lot about Q&A the past week since it’s a major priority for the team right now. This doc contains a dump of many of my thoughts. In thinking about Q&A, it also occurred to me that an actual marketplace for intellectual labor could do a lot of good and is strong in a number of places where Q&A is weak. This document also describes that vision and why I think it might be a good idea.

1. Observations of Q&A so Far.

First off, pulling some stats from my analytics report (numbers as of 2019-03-11):

  • How long has Q&A been live? Since 2018-12-07. Just about 3 months as of 2019-03-11 (94 days).
  • How many questions? 94 questions published + 20 drafts.
  • How many answers? 191 answers, 171 direct comments on answers.
  • How many people asking questions? 59 distinct usernames posted questions (including the LW team).
  • How many people answering questions? 117 unique usernames posting answers; 172 unique usernames who answered or posted a direct comment on a question.
  • How many people engaging overall? Including questions, answers, and comments, 226 usernames have actively engaged with Q&A.

Note: "viewCount" is a little unreliable on LW2 (I think it might double-count sometimes); "num_distinct_viewers" refers only to logged-in viewers.

Spreadsheet of Questions as of 2019-03-08

List of Q&A Uncertainties

See Appendix for all my workings on the Whiteboard

Q&A might be a single feature/product in the UI and in the codebase, but there are multiple distinct uses for the single feature. Different people trying to accomplish different things. Going over the questions, I see rough clusters, listed pretty much in order of descending prevalence:

  1. Asking for recommendations, feedback, and personal experience.
    1. “Which self-help has helped you?”, “Is there a way to hire academics hourly?”
  2. Asking for personal advice.
    1. “What’s the best way to improve my English pronunciation?”
  3. Asking conceptual/philosophy/ideology/models/theory type question.
    1. “What are the components of intellectual honesty?”
  4. Asking for opinions
    1. “How does OpenAI's language model affect our AI timeline estimates?”
  5. Asking for help studying a topic.
    1. “What are some concrete problems about logical counterfactuals?”, “Is the human brain a valid choice of Universal Turing Machine . . . ?”
  6. Asking general research/lit-review-ish questions (not sure how to name)
    1. “Does anti-malaria charity destroy the local anti-malaria industry?”, “Understanding Information Cascades”, “How large is the fallout area of the biggest cobalt bomb we can build?”
  7. Asking open research-type questions (not sure how to name this cluster)
    1. “When is CDT Dutch-Bookable?”, “How does Gradient Descent Interact with Goodhart?”, “Understanding Information Cascades”

These questions are roughly ordered from "high prevalence + easier to answer" to "low prevalence + harder to answer".

A few things stick out. I know the team has noticed already, but I want to list them here anyway as part of the bigger argument. The questions which are most prevalent are those which are:

  1. relatively quick to ask, e.g. requiring a few paragraphs at most.
  2. answerable by a [relatively] large population of qualified people.
  3. the kind of questions people are used to asking elsewhere, e.g. the CFAR Alumni Mailing List, Facebook, Reddit, LessWrong (posts and comments), Quora, StackExchange.
  4. the kinds of questions for which there are existing forums, as above.
  5. answerable primarily using the answerer’s existing knowledge, e.g. people who answer advanced math problems drawing on their existing understanding.
  6. answerable in a single session at one’s computer, often without even needing to open another browser tab.

What is apparent is that questions which break from the above trends, e.g. questions which can be hard to explain (taking a long time to write up), require skill/expertise to answer, can’t be answered purely from an answerer’s existing knowledge (unless by fluke they’re an expert in a niche area), and require more effort than simply typing an answer or explanation -- these questions are really of a very different kind. They’re a very different category, and both asking and answering such questions is a very different activity from asking and answering the other kind.

What we see is that LessWrong’s Q&A is doing very well with the first kind -- the kind of questions people are already used to asking and answering elsewhere. There’s been roughly a question per day for the three months Q&A has been live, but the overwhelming majority are requests for recommendations and advice, opinions, and philosophical discussion. Only a small minority (no more than a couple dozen) are solid research-y questions.

There’ve been a few of the “help me understand”/confusion type you might see on StackExchange (which I think are really good). And a few pure research-y type questions, but around half of those were asked by the LessWrong team and friends. That’s around 10% of questions, really on the order of 10 questions or fewer in the last three months by my count.

I think these latter questions are more the sort we’d judge to be “actual serious intellectual progress”, or at least, they are the questions we’d love to see people asking more of. They’re the kinds of questions that predominantly the LessWrong team is creating rather than users.

2. Our vision for Q&A is getting people to do a new and effortful thing. That’s hard.

The previous section can be summarized as follows:

  • Q&A has been getting used since it was launched, but primarily by people doing things they were already used to doing elsewhere, and things which are relatively low effort.
  • The vision for Q&A is scaling up intellectual progress on important problems. Doing real research. People taking their large questions, carving off pieces, and going off to make their own contributions of research (without hiring and all that overhead).

The thing about the LW vision for Q&A is that it means getting people to do a new and different thing from what they’re used to, plus that thing is way more effort. It’s not impossible, but it is hard.

It’s not a new and better way to do something they’re already doing; it’s a new thing they haven’t even dreamt of. Moreover, it looks like something else which they are used to, e.g. Quora, StackExchange, Facebook - so that’s how they use it and how they expect others to use it by default. The term “category creation” comes to mind, if that means anything. AirBnB was a new category. LessWrong is trying to create a new category, but it looks like existing categories.

3. Bounties: the potential solution and its challenges

The most straightforward way to get people to expend effort is to pay them. Or create the possibility of payment. Hence bounties. Done right, I think bounties could work, but I think it’s going to be a tough uphill battle to implement them in a way which does work.

4. Challenges facing bounties (and Q&A in general)
  • Even if we have a system which works well, it’s going to be new and different, and we’re going to have to work to get users to understand it and adopt it. A lot of user education and training.
    • The closest analogue I can think of is Kaggle competitions, but they’re still pretty different: clear objective evaluation, you build transparently valuable skills, it feels good to get a high rank even if you don’t win, there are career rewards just for participating and doing relatively well.
  • Uncertainty around payment. People might do a lot of work for money, but the incentive is much weaker if you’re unsure whether you’ll get paid. People decide based on EV, not the absolute number of dollars pledged.
    • And you might not be bothered to read and understand complicated bounty rules.
  • People with questions and placing bounties might not trust the research quality of whoever random person happens to answer.
    • A mathematical review can be checked, but it’s harder to do that with lit reviews and generalist research.
    • Evaluating research quality might require a significant fraction of the effort required to do the research in the first place.
  • People with questions might usually have a deeper, vaguer, more general question they’re trying to answer. They want the actual thing answered, not a particular sub-question which may or may not be answered. Eli spoke of desiring that someone would become an expert in a topic so they could then ask them lots of questions about it.
  • With Q&A, it’s challenging for the asker and answerer to have a good feedback loop while the answerer is working. It seems harder for the answerer to ask clarifying questions and share intermediate results (and thereby get feedback), and harder for the asker to ask further follow-up questions. This gets worse once there are multiple people working on the question, all potentially needing further time and attention from the asker in order to do a good job.
  • Q&A (even with bounties) faces the two-sided marketplace problem. Question askers aren’t going to bother writing up large and difficult-to-explain questions if they don’t expect them to get answered. (Even less so if they try once and fail to get a response worth the effort.) Potential answerers aren’t going to make it a habit to research for a platform which doesn’t have many real research questions (mostly stuff about freeze-dried mussel powder and English pronunciation and what not).
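The expected-value point above can be made concrete with made-up numbers: a $500 bounty with a subjectively estimated 10% chance of winning is worth only $50 in expectation, before even counting the hours of work.

```python
# Hypothetical numbers for the bounty-EV intuition described above.
bounty = 500.0
p_win = 0.10          # made-up subjective probability of winning the bounty
hours_of_work = 8     # made-up effort estimate

ev = bounty * p_win               # $50 expected payout
ev_per_hour = ev / hours_of_work  # $6.25/hour, far below the headline $500
```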
5. What would it take to get it to work

Thinking about the challenges, it seems it could be made to work if the following happens:

  • We successfully get both askers and answerers to understand that LW’s Q&A is something distinctly different from the other things they’re used to.
    • UI changes, tags, renaming things, etc. might all help, plus explanatory posts and hands-on training with people.
    • Making becoming a question answerer a high-status thing would certainly help. If Luke or Eliezer or Nate were seen using the platform, that might give it a lot of endorsement/legitimacy.
  • We successfully incentivize question answerers to expend the necessary effort to answer research questions.
    • This is partly through monetary reward, but might also include having them believe that they’re actually helping with something important and are actually gaining status. (Weekly or monthly prizes for best answers - separate from bounties - might be a way to do that. Or heck, a leaderboard for Q&A contributions adjusted by karma.)
  • We get question askers to actually believe they can get real, serious progress on questions from Q&A.
    • Easiest to do once we have some examples. Proof of concept goes a long way. Get a few wins and we talk about them with researchers, show them that it works.
      • It’s getting those first few examples which is going to be hardest. As they say, the first ten clients always require hustle.
  • We ensure that question askers have a positive-ROI experience for all the time spent writing up questions, reading the responses, etc.
  • We somehow address concerns that research might not be reliable because you don’t fully trust the research ability of people on the internet - especially not when you’re trying to make important decisions on the basis of your research.

Even then, I think getting it to work will depend on understanding which research questions can be well handled by this kind of system.

6. My Uncertainties/Questions

Much of what I’m saying here is coming from thinking hard about Q&A for several days, using models from startups in general, and some limited user interaction. I could just be wrong about several of the assumptions being used above.

Some of the key questions I want answered to be more sure of models are:

  • What are the beliefs/predictions/anticipations about Q&A of our ideal question askers?
    • In particular, do they think it could actually help them with their work? If not, why not? If yes, how?
    • Is trust a real issue for them? Are they worried about research quality?
    • Do they have “discrete” questions they can ask, or is it usually some deeper topic they want someone to spend several days becoming an expert on?
  • What is the willingness of ideal question answerers to answer questions on Q&A?
    • Which incentives matter to them? (Impact, status, money) How well do they view current Q&A as meeting them?
      • Do they feel like they’re actually doing valuable work in answering questions?

There are other questions, but that’s a starter.

7. Alternative Idea: Marketplace for Intellectual Labor

Once you’re talking about paying out bounties for people researching answers, you’re most of the way towards just outright hiring people to do work. A marketplace. TaskRabbit/Craigslist for intellectual labor. I can see that being a good idea.

How it would work
  • People wanting to be hired have “profiles” on LW which include anything relevant to their ability to do intellectual labor. Links to CV, LinkedIn, questions answered on Q&A, karma, etc.
    • The profiles may be public or private or semi-private.
  • People seeking to hire intellectual labor can create “tasks”
  • Work can be assigned in two directions.
    • Hirers can post their tasks publicly and then people bid/offer to work on the task.
    • Hirers can browse the list of people who have created profiles and reach out to people they’re interested in hiring, without ever needing to make their task or research public.
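The two assignment directions above can be sketched as data structures. This is a hypothetical model - `Profile`, `Task`, `bid`, and `browse` are invented names for illustration, not an existing LW API:

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    """A would-be worker's profile: CV links, karma, Q&A history, etc."""
    name: str
    skills: set
    karma: int = 0
    public: bool = True  # profiles may be public, private, or semi-private

@dataclass
class Task:
    """A unit of intellectual labor a hirer wants done."""
    title: str
    required_skills: set
    public: bool = True
    bids: list = field(default_factory=list)

def bid(profile, task):
    """Direction 1: workers bid/offer on publicly posted tasks."""
    if task.public:
        task.bids.append(profile)

def browse(profiles, task):
    """Direction 2: a hirer searches profiles privately, without posting the task."""
    return [p for p in profiles if task.required_skills <= p.skills]
```

Note that `browse` never exposes the task, which is what makes the private-research direction work.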
Why this is a good idea
  • A marketplace is a really standard thing, people already have models and expectations for how they work and how to interact with them. In this case, it’s just a marketplace for a particular kind of thing, otherwise the mechanics are what people are used to. Say “TaskRabbit for research/intellectual labor” and I bet people will get it.
    • Also, a marketplace and working for money is probably more the right genre for working hard for several days and deterministically getting paid. The thing about Q&A is that it’s somewhat trying to get people to do serious work via something which looks a lot like the things they do recreationally.
  • It reduces uncertainty around payment/incentives. The two parties negotiate a contract (perhaps payment happens via LW) and the worker knows that they will get paid as much as in any real job they might be hired for.
  • It solves the trust thing. The hirers get to select who they trust with their research questions; it’s not open to anyone. The profiles are helpful for this, as a careful hirer can go through someone’s qualifications and past work to see if they trust them.
    • LessWrong could even create a “pipeline” of skill around the marketplace for intellectual labor. People start with simple, low-trust tasks and as they prove themselves and get good reviews, they’re more attractive.
  • It addresses privacy. You might not be willing to place your research questions on the public internet, but you might be willing to entrust them to a single, carefully vetted person whom you hire.
  • It addresses the two-sided marketplace challenge. The format allows you to build each side somewhat asynchronously.
    • Find a few people and convince them to create a few work tasks they’d like done but aren’t urgent (approximately questions). Once they’re up there, you can say “yes, we’ve got some tasks that Paul Christiano would like answers on”
    • Find people who would be interested in the right kind of work, get them to create profiles. They don’t have to commit to doing any particular work, they’re just there in case someone else wants to reach out to hire them. (One could imagine making it behave like Reciprocity.)
  • It lets you hire for things like tutoring.
    • Eli mentioned how much he values 1:1 interaction and tutoring. When he’s got a confusion, he seeks a tutor. That’s not something Q&A really supports, but a marketplace for intellectual labor could.
    • It could be an efficient way for people looking for knowledge from an expert to be able to find one who is available and at the right price.
      • I’ve seen spreadsheets over the years of EAs registering their names, interests, and skills. I don’t know if people ever used them, but it does seem neat if there were just a locked-in service which was a directory of experts on various topics that you could pay to access.
  • [Added] It diversifies the range of work you can hire for.
    • It seems good if people doing research work can hire people to format LaTeX, proofread, edit, and generally handle tasks, freeing them up for more core research.
  • [Added] It doesn’t limit the medium of the work to the LessWrong platform. Once an arrangement is made, the hirer and worker are free to work in person, or via Google Docs, Skype, or whatever else is most convenient and native to their work. In contrast, Q&A makes the platform’s format and UI a bottleneck on communication of research.
    • Needing to write up results formally in a way that is suitable for the public is also a costly step that is avoided in a 1-to-1 work arrangement.
    • It does seem that CKEditor could dovetail really nicely with collaboration via the marketplace, assuming people are otherwise using Google Docs. Once the research content is already on LW, we can streamline the process of making it public.
      • Research being conducted in Google Docs and then polished and shared might be a much more natural flow than needing people to conduct research in whatever tools and then translate it into the format of comments/answers.
        • Another idea: building things like citation management into LW Google Docs and building a generally great research environment.

You could build up several dozen or a hundred worker (laborer?) profiles before you approach highly acclaimed researchers and say “hey, we’ve got a list of people willing to offer intellectual labor - interested in taking a look?” Or “we’ve got tasks from X, Y, Z - would you like to look and see if you can help?”

[redacted]: “I’d help [highly respected person] with pretty much whatever.” Right now [highly respected person] has no easy way to reach out to people who might be able to do work for them. I’m sure X and Y [redacted] wouldn't mind a better way for people to locate their services.

In the earlier stages, LessWrong could do a bit of matchmaking. Using our knowledge and connections to link up suitable people to tasks.

Existing services like this (where the platform is kind of a matchmaker), such as TaskRabbit and Handy, struggle because people use the platform initially to find someone, e.g. a house cleaner, but then bypass the middleman to book subsequent services. But we’re not trying to make money off of this; we don’t need to be in the middle. If a task happens because of the LW marketplace and then two people have an ongoing work relationship - that is fantastic.

Crazy places where this leads

You could imagine this ending up with LessWrong playing the role of some meta-hirer/recruiting agency type thing. People create profiles, upload all kinds of info, get interviewed - and then they are rated and ranked within the system. They then get matched with suitable tasks. Possibly only 5-10% of the entire pool ever gets work, but it’s more surface area on the hiring problem within EA.

80k might offer career advice, but they’re not a recruiting agency and they don’t place people.

Why it might not be that great (uncertainties)

It might turn out that all the challenges of hiring people generally apply when hiring just for more limited tasks, e.g. trusting them to do a good job. If it’s too much hassle to vet all the profiles vying to work on your task, learn how to interact with a new person around research, etc., then people won’t do it.

If it turns out that it is really hard to discretize intellectual work, then a marketplace idea is going to face the same challenges as Q&A. Both would require some solution of the same kind.

I’m sure there’s a lot more to go here. I’ve only spent a couple of hours thinking about this as of 3/17.

Q&A + Marketplace: Synergy

I think there could be some good synergies - ways in which each blends into the other and supports it. Something I can imagine is that there’s a “discount” on intellectual labor hired if those engaged in the work allow it to be made public on LW. The work done through the marketplace gets “imported” as a Q&A where further people can come along and comment and provide feedback.

Or someone is answering your question and you like what they’ve said, but you want more. You could issue an “invite” to hire them to work more on your task. Here you’d get the benefits of a publicly posted question anyone can work on, plus the benefits of a dedicated person you’re paying and working closely with. This person, if they become an expert in the topic, could even begin managing the question thread, freeing up the important person who asked the question to begin with.

8. Appendix: Q&A Whiteboard Workings


Check Yo Self Before You Wreck Yo Self

March 29, 2019 - 23:33
Published on March 29, 2019 7:21 PM UTC

What problems would we face in usefully answering "hard" questions on this blog?

One is trust. Expertise is about more than making accurate predictions or interesting arguments. It's also about establishing trust through the degree-earning and peer review processes, and by having skin in the game. Until LW finds a substitute for these academic safeguards, it will struggle to be taken seriously by political and academic authorities.

Another is redundancy. Any important, tractable questions we can come up with are probably already being examined by professional academics. After all, they are the ones with the resources, training, and extrinsic motivation. LW may occasionally strike gold, but my guess is that the motherlode is in replications, fact- and logic-checking, and reviews. This is critical work that we know is neglected, and the fact that a professional academic wrote a widely-read book or article on topic X is a signal that topic X is novel, important, and answerable. Questions in decision theory or philosophy (particularly from the global priorities institute's research agenda) might also be good candidates.

Scooping is a third issue. If the LW community discovers good answers to good questions, professional academics will try to "steal" the ideas before bloggers can turn them into an excellent piece of scholarship. That's still a good outcome, and I wouldn't be surprised if it's already happening. Yet it does put a cap on LW's potential for impact in this regard.

The LW community lacks the resources to carry out empirical research, with the possible exception of Mechanical Turk-based studies. I suspect anyone capable of advancing the field of mathematics is already a PhD mathematician. So the range of problems the blogging community could usefully work on is small. Money spent on such projects seems more likely to be useful if donated to a professional academic for use in their research. Likewise, time spent blogging might be better spent earning money for said donations, or else on working toward a career as a professional researcher.

It may be that working on "hard" LW questions is a way for wannabe professional researchers like myself to sharpen their skills and see if they have the right stuff. Yet it seems more directly useful for someone like that to focus on practicing their lab/math/programming skills, familiarizing themselves with the academic literature of their intended field, networking, or getting some damned exercise and going grocery shopping.

I have 0.1% confidence that LW could manifest its own peer reviewed journal, 20% confidence that it could generate publishable articles, and 98% confidence my time would be better spent reading another paper on CRISPR/Cas9 or studying calculus.


Microcosmographia excerpt

March 29, 2019 - 21:29
Published on March 29, 2019 6:29 PM UTC

An excerpt from Cornford's Microcosmographia Academica, by way of Hitch's Letters to a Young Contrarian:

There is only one argument for doing something; the rest are arguments for doing nothing.
Since the stone axe fell into disuse at the close of the Neolithic Age, two other arguments of universal application have been added to the rhetorical armory by the ingenuity of mankind. They are closely akin; and, like the stone axe, they are addressed to the Political Motive. They are called the Wedge and the Dangerous Precedent. Though they are very familiar, the principles, or rules of inaction involved in them are seldom stated in full. They are as follows:
The Principle of the Wedge is that you should not act justly now for fear of raising expectations that you may act still more justly in the future – expectations that you are afraid you will not have the courage to satisfy. A little reflection will make it evident that the Wedge argument implies the admission that the persons who use it cannot prove that the action is not just. If they could, that would be the sole and sufficient reason for not doing it, and this argument would be superfluous.
The Principle of the Dangerous Precedent is that you should not now do any admittedly right action for fear you, or your equally timid successors, should not have the courage to do right in some future case, which, ex hypothesi, is essentially different, but superficially resembles the present one. Every public action that is not customary, either is wrong, or, if it is right, is a dangerous precedent. It follows that nothing should ever be done for the first time.
Another argument is that "the Time is not Ripe." The Principle of Unripe Time is that people should not do at the present moment what they think right at that moment, because the moment at which they think it right has not yet arrived.

(Emphasis is either Hitch's or Cornford's.)

Fiat justitia ruat caelum

Cross-posted to my blog.


Parable of the flooding mountain range

March 29, 2019 - 19:54
Published on March 29, 2019 3:07 PM UTC

A mountaineer is hiking in a mountain range. There is a thick fog so he cannot see beyond a few meters.

It is raining heavily and so the mountain range is being flooded, the mountaineer has to climb to a high place so he won’t get washed away.

He will climb towards the highest point in his sight, and if he sees another higher point he will change his direction towards there.

Now the mountaineer is standing on the top of a hill, and to his knowledge every direction is downwards, and there is no higher peak in sight. He sits on the hilltop, anxiously watching the rain and hearing the water rising.

The water floods the hill and drowns him, washing his dead body into the abyss.

Is he on the highest peak of the mountain range? Unlikely.

Can he ever get there if he cannot see beyond a few meters? Very unlikely.


A band of mountaineers are hiking in a mountain range. There is a thick fog so they cannot see beyond a few meters.

It is raining heavily and so the mountain range is being flooded, the mountaineers have to climb to a high place so they won’t get washed away.

They elected the most experienced mountaineer as their leader; in the fog he can see a couple of meters further than everybody else, and so he is the best guide possible for anyone.

The band all followed him onto a hilltop; every direction is downwards, so they stayed there, anxiously watching the rain and hearing the rising waters.

Until the water floods the hilltop and drowns them, washing their dead bodies into the abyss.

This band is functionally the same as a lone mountaineer.


A band of mountaineers are hiking in a mountain range. There is a thick fog so they cannot see beyond a few meters.

It is raining heavily and so the mountain range is being flooded, the mountaineers have to climb to a high place so they won’t get washed away.

They spread out and walked away from each other, then went on to search for a place to stay.

They end up on different hilltops, some higher than others. They individually yet simultaneously watch the rain and hear the water rising, anxiously.

Until the water floods some hills and drowns many mountaineers, washing their corpses into the abyss. A few mountaineers end up on the higher peaks and are left unharmed by the flood.

After the flood recedes and the fog dissipates, they go down to search for their fallen friends, to mourn and to bury them.

And of course, to scavenge supplies from their dead bodies.

Is this the best strategy for the band, if it wants to maximise the chance of someone surviving the flood?
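Read as an optimization story, the lone climber (and the band following one leader) is greedy hill climbing, which stalls on local maxima; the spread-out band is hill climbing with random restarts from scattered starting points. A toy sketch with an invented landscape:

```python
# A toy mountain range: heights at discrete positions; the global peak
# (height 9) is at index 5, with lower local peaks elsewhere.
HEIGHTS = [1, 3, 2, 5, 4, 9, 2, 6, 1]

def climb(start, fog=1):
    """Greedy climber who only sees `fog` positions to either side."""
    pos = start
    while True:
        lo, hi = max(0, pos - fog), min(len(HEIGHTS) - 1, pos + fog)
        best = max(range(lo, hi + 1), key=lambda i: HEIGHTS[i])
        if best == pos:
            return pos  # stuck on a (possibly local) hilltop
        pos = best

def band(starts, fog=1):
    """Climbers who spread out first: the band survives if anyone tops out high."""
    return max(HEIGHTS[climb(s, fog)] for s in starts)
```

A lone climber starting at position 2 tops out on the local peak of height 5 and drowns in any flood above that, while a band spread across positions 0, 2, 6, and 8 has one member reach the height-9 peak.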


It's my first time posting here, please give me some support and suggestions if you can.

Ask me any questions you have if my story seems unclear/vague/confusing because it probably is, I'm not a good writer and I didn't really have an idea what I wanted to write. It initially started as an analogy about evolution, but perhaps it also works as a vague discussion about some future choices facing humanity as well.


Relative exchange rate between preferences

March 29, 2019 - 14:46
Published on March 29, 2019 11:46 AM UTC

Note: working on a research agenda, hence the large amount of small individual posts, to have things to link to in the main documents.

In my initial post on synthesising human preferences, I imagined a linear combination of partial preferences. Later, when talking about identity preferences, I proposed a smoothmin instead (which can also be seen as strong diminishing returns to any one partial preference).
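For concreteness, the two aggregation rules contrasted here might look like the following. The softmin (log-sum-exp) form, parametrised by k, is one standard way to get "strong diminishing returns to any one partial preference"; the specific functional form and weights are my illustration, not something the posts commit to:

```python
import math

def linear_combination(prefs, weights):
    """Linear combination: a surplus in one partial preference can
    buy off an arbitrarily large deficit in another."""
    return sum(w * p for w, p in zip(weights, prefs))

def smoothmin(prefs, k=1.0):
    """Soft minimum: as k grows, the value is dominated by the
    worst-satisfied partial preference."""
    return -math.log(sum(math.exp(-k * p) for p in prefs)) / k
```

With satisfaction levels [10, 0], the linear combination (with unit weights) happily reports 10, while smoothmin with k=5 stays near 0: the neglected preference dominates the synthesis.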

I was trying to formalise how humans seem to trade off their various preferences in different circumstances. However, the ideal is not to decide a priori how humans are trading off the preferences, but instead to copy how humans actually do trade off the preferences.

To do that, we need to imagine the human in situations quite distant from their current ones - situations where some of their partial preferences are more or less fulfilled. This brings up the problem of modelling preferences in distant situations. Assuming some acceptable resolution to that problem, the AI would have an exchange rate between different preferences, in different situations and with different levels of preference fulfilment.

Meta-preferences may constrain the preferences in very distant situations. Most paradoxes of population ethics are in situations very different from today, so population ethics preferences can be seen in this way. Universal moral principles act in a similar way, giving limits on what can happen in all situations, including extreme ones - though note there are arguments to avoid some unusual situations entirely.

Then the AI's task is to come up with a general formula for the exchange rate between different preferences, that extends to all situations and respects the constraints of the meta-preferences. It will probably smooth out quite a bit of "noise" in the exchange rates between different preferences, while respecting the general trends.


Being wrong in ethics

March 29, 2019 - 14:28
Published on March 29, 2019 11:28 AM UTC

  • "I ache for the day when I am in heaven singing the praise of God".

That's a preference that many religious believers have. Not being religious myself, I don't have it, and even find it factually wrong - but how can a preference be wrong?

Here's a preference I do have:

  • "I ache for a world where people spontaneously help each other overcome their problems".

That preference is also "wrong", in a similar sense. Let's see how.

Decomposing partially erroneous variables

For the purpose of this post, I am assuming atheism, which is the position I myself hold; for those of a religious disposition, it may help to assume that "God" refers here to a non-existent god from a different, wrong, religion.

Now, the "praising God" preference could be decomposed into other, simpler variables. It may feel like it's a single node, but it is likely composed of many different preferences linked together:

So, for example, those with that preference might think it good to be humble, feel there's something intrinsically valuable about holiness, enjoy the ecstatic feeling they get when praising God in a religious context, and feel that one should praise God out of a feeling of gratitude. We can consider that these form four of the foreground variables that the believer "cares about" in their partial model (what about the fifth node? We'll get back to that).

Note that the human didn't necessarily use a partial model with these four variables; they may instead have just a single variable entitled "praise god in heaven". However, we're decomposing the partially erroneous variable (since God doesn't exist), using the believer's web of connotation, into simpler variables.

Erroneous foreground variables

There's nothing wrong with the humbleness and ecstatic preferences. The gratitude node is problematic, because there ain't any God. If the believer only feels gratitude to God, then we can just erase that node as an error. However, it's possible that they might feel general gratitude towards their superiors, or towards their parents/mentors/those who've made them who they are. In that case, the preference for "Gratitude to God" is simply a special case of a very valid preference applied to an erroneous situation.

Then we have the interesting preference for "holiness". Holiness is not a well-grounded variable, but has its own web of connotation, maybe consisting of religious rituals, feelings of transcendence, spiritual experiences, and so on. Further splitting the node could allow us to home in on more precise preferences.

Affective spirals and subjective experience

There is a problem with decomposing complex variables into components: affective death spirals / halo effects. "Holiness" probably has a huge amount of positive connotations, as compared to the variables it could be decomposed into.

I'd tend to favour using the decomposed variables as the main indicator of preference. We can still use the halo-ed top variable for some components of human experience; for example, we might assume that the believer here really enjoys the subjective experience of holiness, even if that preference over the outside world is not as strong.

Erroneous values for background variables

What of the last node? Well, as Mark Twain pointed out,

Singing hymns and waving palm branches through all eternity is pretty when you hear about it in the pulpit, but it's as poor a way to put in valuable time as a body could contrive.

This brings us to the last node: no human alive today is capable of spending their life praising, nor would they enjoy the experience much after the first few minutes or hours. Believers presumably don't think of themselves in heaven as mindless zombies; I'm going to assume that this hypothetical believer might imagine that the mere presence of God will transform their desires in such a way that makes praising a desirable activity for them.

Even if that is the case, it still means putting the believer in a situation where most aspects of their identity are irrelevant or overwritten. Their sexual preferences, favourite movies and computer games, favourite sporting events, sense of humour, quirks of personality, etc... all become irrelevant. Therefore, despite the believer's beliefs, they are contemplating a partial model in which the background variables are very different, even though the believer does not realise this difference (see here for another example of preference where one of the variables is wrong).

This is the main way in which the "praise God" preference is erroneous: it involves huge changes to their identity and preferences, but they are not aware of this in their partial models.

Spontaneous help decomposition

My "spontaneous help" preference can probably be decomposed in a similar fashion:

This has a "same identity" problem, just as the "praise God" has, but it's milder: people cooperate far more often than they sing praises for days on end. The bigger problem is the implicit assumption that when people do things for other people, this automatically has the same efficiency as markets or other more selfish coordination mechanisms. People helping each other is not supposed to leave people worse off.

If I were aware of that flaw AND still had that preference (which I mildly do, in practice), then I would have a non-wrong preference. Which doesn't mean that you have to agree with me, for you have your own preferences, but means that my preference is no longer "wrong" in the sense of this post: "Same efficiency" becomes the foreground variable "Efficiency", one that is actually negatively activated in this model, but not enough to overcome the positive activation of the other variables:
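The figure isn't reproduced here, but the arithmetic of one negatively activated foreground variable being outweighed is just a signed weighted sum. All variable names and numbers below are invented for illustration:

```python
# Hypothetical activations for the decomposed "spontaneous help" preference.
# "efficiency" is negatively activated, as discussed above, but not by
# enough to flip the overall preference.
activations = {
    "people helped": 1.0,
    "gratitude and warmth": 0.8,
    "same identity": 0.5,
    "efficiency": -0.6,
}

def net_preference(activations):
    """The preference survives if positive activations outweigh negative ones."""
    return sum(activations.values())
```

Here the net activation is about 1.7, so the preference stays positive despite the efficiency penalty.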


Models of preferences in distant situations

March 29, 2019 - 13:42
Published on March 29, 2019 10:42 AM UTC

Note: working on a research agenda, hence the large amount of small individual posts, to have things to link to in the main documents.

For X, consider three different partial preferences:

  1. If X were poor, they would prioritise consumption over saving.
  2. X: If I were poor, I would prioritise saving over consumption.
  3. X: If I were poor, I'd get my personal accountant to advise me on the best saving/consumption plan for poor people.

1 is what X's judgement would be in a different, distant situation. 2 is X's current judgement about what their judgement would be in that situation. 3 is similar, but is based on a factually wrong model of what that distant situation is.

So what are we to make of these in terms of X's preferences? 3 can be discounted as factually incorrect. 2 is a correct interpretation of X's current (meta-)preferences over that distant situation, but we know that these will change if they actually reach that situation. It might be tempting to see 1 as the genuine preference, but that's tricky. It's a preference that X doesn't have, and may never have. Even if X were certain to end up poor, their preference may depend on the path that they took to get there - medical bankruptcy, alcoholism, or one dubious investment, could result in different preferences. And that's without considering the different ways the AI could put X in that situation - we don't want the AI to influence its own learning process by indirectly determining the preferences it will maximise.

So, essentially, using 1 is a problem because the preference is many steps removed and can be influenced by the AI (though that last issue may have solutions). Using 2 is a problem because the current (meta-)preferences are projected into a situation where they would be wrong. This can end up with someone railing against the preferences of their past self, even if those preferences now constrain them. This is, in essence, a partial version of the Gödel-like problem mentioned here, where the human rebels against the preferences the AI has determined them to have.

So, what is the best way of figuring out X's "true" preferences? This is one of the things that we expect the system to be robust to. Whether type 1 or type 2 preferences are prioritised, the synthesis should still reach an acceptable outcome. And the rebellion against the synthesised values is a general problem with these methods, and should be solved in some way or another - possibly by the human agreeing to freeze their preferences under the partial guidance of the AI.

Avoid ambiguous distant situations

If the synthesis of X's preferences in situation S is ambiguous, that might be an argument to avoid situation S entirely. For example, suppose S involves very lossy uploads of current humans, so that the uploads seem pretty similar to the original human but not identical. Rather than sorting out whether or not human preferences apply here, it might be best to reason "there is a chance that human flourishing has been lost entirely here, so we shouldn't pay too much attention to what human preferences actually are in S, and just avoid S entirely".