### Structuring

Events at Kocherga - May 7, 2021 - 20:00
Structuring is a format in which participants propose questions or topics they would like to "structure", that is, explicitly work through the model of a problem they already have and build something new on top of it together with a partner; attendees pair up by interest in the proposed topics and talk one on one.

### My Journey to the Dark Side

LessWrong.com News - 2 hours 51 minutes ago
Published on May 6, 2021 5:10 PM GMT

Epistemic Status: Endorsed
Content Warning: Roko’s Basilisk, Pasek’s Doom, Scrupulosity Traps, Discussions of Suicide
Part of the Series: Open Portals
Recommended Prior Reading: Sinceriously.fyi, The Tower
Author: Shiloh

But the worst enemy you can meet will always be yourself; you lie in wait for yourself in caverns and forests. Lonely one, you are going the way to yourself! And your way goes past yourself, and past your seven devils! You will be a heretic to yourself and witch and soothsayer and fool and doubter and unholy one and villain. You must be ready to burn yourself in your own flame: how could you become new, if you had not first become ashes?

Part 1: Windmills

Years ago, Namespace and I co-wrote Hemisphere Theory: Much More Than You Wanted To Know, with the intent being to make a sincere summary of the ideas presented in Sinceriously.fyi. We believed at the time, and I assume Namespace still believes, that the ideas presented there were somewhat dangerous and needed to be carefully handled. Part of this was caused by Namespace’s paranoia around Ziz as an agent of existential concern, but that wasn’t all of it. I willingly admit that for the first few years I bounced hard off of Sinceriously because I was so afraid of the possibility that I wasn’t actually good deep down. While on one hand I tried to reject the ideas Ziz presented, on the other my internal morals were slowly being terraformed by her worldview. My need to be good acted as a lever which allowed her ideas to pry open my default mode mental defenses.

During this time, I was also talking to Namespace rather extensively and I think the two of us ended up pushing each other further and further into this particular messianic extropian mindset that came to characterize my mentality during that period. As I grew more extreme in my extropian worldview, my own weakness and lack of ability to contribute to building utopia meant that I started continually failing to meet my own moral standards. Even as I switched to a diet of mostly soylent to save money and attempted to adopt an extremely aggressive update schedule for this blog, I was slowly making myself more and more miserable and gaslighting myself about my own emotions. The moral system I had embraced pushed me towards a life of asceticism and service towards building utopia at all costs, but I couldn’t square this with my own feelings, desires, and wants.

I thought I could somehow tame my inner desires and put them to work for my extropian ideals if I was just clever enough about how my mind arranged itself. I fell into a pretty common EA trap of seeing my values and desires as just chores I had to do to maintain the vehicle that was my body, and the most ethical thing to do was to try and spend as little energy on them as I could get away with. I was severely dissociated from my true self and my real values. As a result of this, I went from being a mostly stable three member plural system to a rather unstable nine member system as I attempted to shuffle my subagents into a functional configuration. That topic will get its own post soon when I rewrite my plurality guides, but to make a long story extremely short, this was obviously unsustainable and was basically just rearranging deck chairs on the Titanic. I had picked up an artifact called extropian goodness and let it lead me into a corner of my mind made of self-deception.

I think this was part of the reason that I had such a hostile reaction to Sinceriously. I couldn’t really engage with the content except in a sandboxed form without feeling like I was being attacked by the material. This is no longer true, and I now have a much more positive view of at least some of it. Hence, in this post I’m going to make another sincere attempt to take apart and summarize Sinceriously. In doing so, I will also be telling the story of my own journey to the dark side and who I found when I got there.

Part 2: Fences

Sinceriously is a large blog, too large to do justice with a summary post, but it’s also a bit hard to digest at times and makes simple ideas more complex than it seems like they need to be. I’m sure Ziz will tell me that the complexity serves the purpose of providing some nuance which I am missing, and like, yeah, that is certainly a possibility. If you have the time, despite being rather thick at times the material really is quite excellent and worth a review; the older essays in particular are very good in my opinion. So, if you’re looking for an endorsement, here it is: go read Sinceriously.

All that being said, let’s go through Sinceriously the same way we’ve previously covered Becker, Korzybski, and Yudkowsky. We’ll begin as usual with the human. Ziz is a trans woman living in the Bay Area and a fringe part of the rationality/effective altruism communities found there. In addition to being the founder of the ill-fated rationalist fleet project, she’s close enough to the core of the rationality project to have received the closest thing that exists to a formal education in it. However, she’s largely disavowed by that core rationality group and has written extensively about misdeeds they committed which she bore witness to. She also organized a rather poorly received protest of that group which has gained her some notoriety within the community. Despite that notoriety, Ziz isn’t really a public or historical figure at this point, so I don’t want to go too deeply into her life beyond those broad strokes.

And look, I don’t have a stake in any of that at this point and I’m not in a position to judge, but I don’t think she’s lying. I don’t think she ever lies, I just think she’s speaking from within her own worldview, the same way that she always does, the same way that everyone always does. Whether or not her complaints are read as valid or as noise is going to depend on the values of the reader. The fact that so many people find her claims baseless seems like a reflection of their own values and how much those values contrast with someone like Ziz. That’s not to say that Ziz is wrong or other people are wrong or whatever, again I really don’t have a stake in it, but I want to point out that Ziz’s complaints are pretty valid if you’re using the moral system she uses. (Not that you should do that, but we’ll come back to morality in a bit.)

Sinceriously covers three different topics, though these three topics are interspersed together and presented as one cohesive piece. Taken together, they represent the closest thing that exists on Sinceriously to a central unifying thesis.

The first Big Idea is a novel theory of human psychology and sociology which I have previously called Hemisphere Theory but in truth is more broad than merely being a theory underlying the psychological structure of consciousness and experience. Ziz and I have a lot of minor disagreements about the fine details of this theory which I used for a while as blinders so that I could reject her version of the model, but really, Ziz, Becker, and I are all roughly on the same page here and are just using different words to talk about the same things.

So let’s run through the model again as concisely as possible. In False Faces, one of the oldest and most well regarded posts on the site, Ziz begins by posing a question to the reader:

When we lose control of ourselves, who’s controlling us?

She then lays out a dichotomy between what I might refer to as the conscious, acknowledged, authored and narrative self, and the goals, drives, and desires of the unacknowledged, and unseen true self which exists at the core of one’s being.

Under this model everyone has a core (specifically two but we’ve covered that a bunch already) which provides the drives, goals, and motivations which power and grow the narrative structures that people refer to as themselves.

Most people live entirely inside these narrative structures while their deep selves manipulate them like puppetmasters. This true self is what we want deep down, but since we can’t acknowledge those goals from within the narrative framework we have co-created with society, our power is weakened as the true self fails to dole out willpower when our authored self needs it and goes off script from what the authored self is attempting to orchestrate. “I wanted to meet you for coffee like we arranged but my akrasia was really bad and I ended up just watching Netflix instead, I’m sorry, I couldn’t help it.”

Ziz refers to the installation of this co-created framework atop the true self as having DRMs installed in one’s mind, and taken all together, she refers to these societal control structures as either the matrix or the light side. These structures act to take the socially unacceptable animal drives of the true self and twist them into something that seems acceptable in polite society. In doing so, however, the thread of our true desires is lost amidst all the noise and we find ourselves seemingly out of control of our own actions. The structures that we’ve decided are us, the values we’ve convinced ourselves to identify with, don’t code for our true values. Instead, the authored self is a false face, a mask made from the cartoon character we’ve decided symbolically represents us, worn over the vile selfish monster lurking beneath the surface of our consciousness.

This is similar to but subtly different from other ideas involving mental tension between parts of the self. Kahneman describes a tension between the remembering self and the experiencing self, Becker describes a true self controlled by narratives and the fear of death, Freud describes a conflict between the socially constructed status obsessed superego and the experience driven cravings of the id which are moderated by the ego, and even the Greeks described the self in terms of a conflict between a motley assemblage of parts.

The thing which distinguishes Ziz’s idea of structure from Kahneman’s remembering self and Freud’s id is that she sees the narrative/structural self as completely subservient to the core self, which is a more complicated and long term thinking piece of mental machinery than just the pure experiencing self described by Kahneman. The work of the superego, a.k.a. the light side, a.k.a. the matrix, merely acts to dampen the power of this core and turn an agentic person into a walking corpse, bound by the chains of society. To escape these chains, Ziz describes herself as having journeyed to the dark side, abandoning the control structures of the light side and embracing a desire to do what you want and maximize your own personal values. However, similar to the Jedi, Ziz claims that doing this will turn most people evil. I agree with this, but with a critical difference which we’ll return to later.

The second Big Idea on Sinceriously is Yudkowsky’s Timeless Decision Theory, which Ziz goes to significant lengths to explain, expound upon, and defend as game theoretically optimal. Most rationalists bounced hard off of this idea, including Eliezer himself, principally because of Roko’s Basilisk and some of the other darker conclusions you can arrive at when you try to combine timeless decision theory with various formulations of utilitarianism. Ziz didn’t bounce off TDT and has wholeheartedly embraced the ideas of acausal trade, negotiation, and blackmail, up to and including weaponizing Roko’s Basilisk to make her vision of a moral future come about.

I actually agree with all of this and think Ziz’s willingness to just bite the bullet and accept the dark side conclusions of utilitarianism and game theory is a point to her credit. This is not to say that you should go out and start using the specific formulation of utilitarianism and timeless decision theory which she does unless you’re also a radical vegan extremist, but the way she uses it makes sense from the perspective of her values and is more internally consistent than the formulation most people end up using. One blind spot she seems to have is overfitting TDT standoffs to situations where a less precommitted response is called for, and that probably contributed to the legal trouble she got in by trying to play chicken with the state of California.

Timeless decision theory does make sense to me, and I think the problem a lot of people have with it is that they’re unwilling to either bite the bullet that utilitarianism gives them like Ziz does, or to change moral systems to one which doesn’t produce repugnant conclusions when paired with TDT. The problem isn’t TDT, it’s the moral theories that people try to use with it.

Another component of Ziz’s TDT ideas is that she believes people act timelessly for the most part. They have their values, and they try to timelessly optimize for those values. All the decisions someone might make, they made a long time ago and now they are just in the process of playing out those choices. You can try to change your mind, but it’s ultimately the same creature making the choice, and the house always wins in self conflicts. This implies that once you figure someone out and have ‘seen their soul’ as it were, you can pretty much assume they will, barring a traumatic brain injury, remain that way until they die, which is also a part of the third and most dramatic of Ziz’s Big Ideas.

The final Big Idea on Sinceriously is the one which is widely considered to be the most intensely radioactive and results in most of the hostility aimed at her and her followers. This is Ziz’s moral theory, which is, to put it lightly, very extreme. Ziz adheres to a moral principle which classifies all life which has even the potential to be sentient as people and believes that all beings with enough of a mind to possess some semblance of selfhood should have the same rights that are afforded to humans. To her, carnism is a literal holocaust, an ongoing and perpetual nightmare of torture, rape, and murder being conducted on a horrifyingly vast scale by a race of flesh eating monsters. If you’ve read Three Worlds Collide, Ziz seems to view most of humanity the way the humans view the babyeaters.

To Ziz, being a good person is inherently queer, and occurs the same way that being trans or being gay occurs, as the result of some glitch in the usual cognitive development processes. This good glitch only occurs in a small number of people, and Ziz can diagnose people as having or not having it, since she has the glitch and can recognize it in others. Anyone without the glitch is at best useless for helping build utopia and at worst is an active threat. You don’t want to let flesh eating monsters make your singleton; that’s how you get s-risks. The hostility that Ziz has for MIRI/CFAR comes from this idea. Ziz is afraid of ending up in a singularity that doesn’t optimize for the rights of all sentient life, only that of humans, and is willing to go as far as holding protests at CFAR meetups and trying to create her own vegan torture basilisk to timelessly blackmail carnists into not eating meat.

That by itself is pretty extreme, but then when you add in the hemisphere theory and the specific details of the implementation Ziz uses, a picture starts to be painted of something rather sinister. Ziz is a very smart person, that’s why I’ve found her blog as insightful as I have. If she wasn’t as clever as I know she is, or if she was just writing about topics that didn’t include social manipulation and how society controls and blackmails you, it might have been possible to overlook, but her answer for why it’s okay when she uses the same abusive control structures is so bald-faced that I can’t help but find it incredibly suspect. Even being willing to write “my morals just happen to correspond with the most objectively correct version of morality” is a pretty gutsy move to make that seems to imply some degree of grandiosity and disconnection from reality. These morality ideas are where most people get hostile towards Ziz and I can’t say it’s misplaced hostility either, since it does potentially represent an existential threat for some people.

It takes a certain amount of cleverness and intentionality to pull the hat trick Ziz does. She spends all this time carefully deconstructing societal moral and control structures and pointing out how bad they are, and at the same time, weaves in new control structures of her own made of her jargon and using her morality. You almost don’t notice it, almost. I did notice it, which was what enabled me to get away from the mental singularity her ideas created and which only she had the ability to heal. If I hadn’t gotten away from it, I’m not sure what might have happened.

As I was in the middle of writing this I found out that someone I knew had apparently committed suicide recently because of exposure to this content, bringing the total number of people Sinceriously has killed to two. That’s enough to be a pattern, so I don’t want to understate the harm that could come from this. However, I also don’t want to overstate the danger for the sake of drama; everyone who struggled with this, including me, was someone who had other issues they were dealing with, arguably including Ziz herself. I’m torn between characterizing Ziz as this clever puppet master who definitely knew what she was doing, and a mentally ill trans woman who accidentally created a cult out of her own intense scrupulosity and internal turmoil, so I’m going to split the difference using Ziz’s own ideas.

I think Ziz probably knew or at least hoped that the actions she was taking would help pile up power and influence around herself. However, I also think that Ziz is controlled by a very pure and untarnished ideal and I do think she believes that ideal wholeheartedly. She definitely seems to be drinking her own kool-aid, and that could easily be giving her the justification to do as much messed up stuff as she wants in pursuit of her personal greater good.

When I tried, years and years ago to have a conversation about the harm her ideas might cause in people with Ziz, her answer was:

If you are on a nuclear submarine, and the reactor is about to melt, “wanting to help” is not sufficient to say you should be in the reactor room doing things.

What is true regarding people’s motivations is a crucial piece of causal machinery that determines whether the reactor melts. Do not cook cookies on that and do not try to convince people that anyone whose work would interrupt your cookie-baking is evil.

Here there may be people whose sanity is dependent on cookies. But the lies that must be told to accommodate that are wrong and will destroy more people. And if you are not willing to accept one of the answers to whether cookie-baking is positive, and you say your opinion anyway, it’s lying seeking a loophole in the deontology you claim makes you better than me by lying to yourself as well. Which, if you looked at this with an unconstrained perspective, you’d see is not an improvement as far as making things better.

From inside her worldview, this is completely reasonable. If you think the situation is as dire and critical as Ziz clearly does, the collateral damage is almost always going to be worth it. What’s a few humans killing themselves when the stakes are literally all of sentient life and the future of all sentient life in the universe?

Are the stakes actually that dire? Well, critically, if you believe what Ziz believes, then yes. I didn’t quite believe what Ziz believed. I never really managed to convince myself that animals mattered as much as humans, but I was fully capable of manufacturing my own dire straits with the extropian ideals I did have and thus push myself into my own version of the scrupulosity vise.

Part 3: Gates

In Hero Capture, Ziz writes that sometimes a person takes the role of hero since it’s useful to the tribe and can be a good strategy for maximizing inclusive genetic fitness. That is to say, doing heroic things and working to solve big problems can be a good way to demonstrate your value to your peers and gain standing in your community; it doesn’t need to come from a place of altruism. However, Ziz writes, such a person, if not motivated by altruism, will invariably not end up doing real work and will spend most of their energy playing signalling games for status. This was the essay that really messed me up when I read it and put me into this mental Gordian knot which took several years to cut my way out of.

Because yeah, I tried to take the job of hero for the status that being a hero gets you. I was doing this because I conceived of myself as trapped in my own life and needing to do something to prove my worth so that people would support me and I could quit my minimum wage job. I wanted to have my cake and also eat it; it seemed natural to me that if I could just figure out a way to be useful then I could contribute to saving the world while also supporting myself, and that would be really great.

I care a lot about being a good person, and I try really hard to be good, but I often don’t even really know what it means to be good. I don’t trust my internal moral compass to not be biased, and so I was more willing than I should have been to entertain moral systems which seemed to sell themselves well. Intellectually, utilitarianism seemed correct to me, but I couldn’t parse my own value as a person from within a utilitarian framework and thus ended up continually devaluing my own desires and putting the thumbscrews into myself tighter and tighter in an attempt to prove to myself that I was good and that I deserved anything at all.

I didn’t even know why being good mattered to me, I just knew that it was very important. Now I know that its importance was probably at least somewhat abused into me by society and that as I heal from that abuse my need to prove my worth and value to others has mostly receded. I do partly have Sinceriously to thank for that since it was how I learned the frameworks for rejecting those abusive cultural systems.

Still, even after shedding layers and layers of myself under the influence of LSD, even after trying so hard to do the right thing according to my own felt morals that it nearly cost me my job, even after years of meditation and introspection, the belief that I should try to be good refused to become an object and remained a core part of my identity. I had shed so much of myself that what little remained of my identity template felt incredibly precious to me and I valued those things immensely. I still do, I never actually got out of this trap! I’m still the same person I was and most of those things are still a part of my identity! There’s a Kurt Vonnegut quote that I burned into my psyche at a young age and which, if anything, is the seed that I, Shiloh, as a memetic entity, was born from:

Be soft. Do not let the world make you hard. Do not let pain make you hate. Do not let the bitterness steal your sweetness. Take pride that even though the rest of the world may disagree, you still believe it to be a beautiful place.

This was something I internalized to a degree that would end up being my weakness. I want to be soft, I want to be kind, I want to be happy and sweet and see the world as a place filled with beauty and hope and I do for the most part. Sometimes I’ll get depressed and the color will drain away from things but for the most part I succeeded in becoming the person I wanted to be and having the energy I wanted to have and being this way makes me really happy and I honestly love being the person I am.

But then I ran into reality. First, there’s the emotional and mental toll of just being a person in society without a lot going for me, while trying to recover from all this stuff that had happened to me in the past and assemble enough of a sense of myself to act in the world in any way at all. I’m not a very strong person, I bend in a stiff breeze and I get overwhelmed and upset pretty easily. The stress from work and roommate drama took a really heavy toll on me and I just didn’t cope with it well.

And then I tripped over the bottomless pit of suffering at the edge of town, and the combined stressors pushed me right up to the mental breaking point, which was where I remained somehow for fucking years. I trapped myself in this really, really well. After encountering Sinceriously and specifically Hero Capture, I felt like I had to do three times as much to somehow try and prove to myself that I wasn’t faking being good and that I really actually did care. I put myself in a vise and slowly started increasing the pressure. It was really only a matter of time before something finally gave out.

Part 4: Open Portals

There were a number of ways that this could have gone. First, I could have just changed as a person in the ways that would have been necessary to continue on the trajectory I had been on, but that would have entailed hardening myself in ways I didn’t want to and letting a hostile bitterness creep into me that felt really awful and dysphoric. I could live in the world with all its hostility, but I would have to be a bitter and hostile person in response, and I just couldn’t bring myself to do that. The degree to which I couldn’t bring myself to do that meant I couldn’t do really simple important things like setting and enforcing healthy boundaries or stopping people from using me as a human doormat.

The second thing that could have happened is that I could have just died as an agent. The core that sustains me as an identity could have given up on me in the depths of an acid trip and brought out a totally different person to deal with the world. If I was a singlet that might have happened. It very nearly is what happened.

The third thing that could have happened is that I could have just actually full on died as a human and I did get, in hindsight, worryingly suicidal at times. I never told anyone at the time just how bad it got which seems like a really bad sign since it meant I didn’t subconsciously want them to stop me. Things were legitimately very rough for a long time and while I managed to not ever get all the way to cohering plans and writing letters, I did get closer than the me that I am now would prefer.

None of those things happened though, because I was, despite all of the nonsense I was putting myself through, somehow still pretty stable as a person. My life teetered along in an uncomfortable but functional equilibrium and I didn’t experience any major enough shocks to challenge the status quo until I met my most recent ex.

I had a very intense but brief two month long relationship with another plural system during the summer of 2020, and it was honestly really good while it lasted. This relationship was the shock to my system which would finally tip over the equilibrium I had trapped myself in, first in the form of the emotional high of being in a new relationship and the sheer intensity that developed around it, followed by the same intensity in the emotional low which followed things turning sour and us parting on not particularly good terms.

On top of all of that I was in the middle of moving and work was stressing me out more than normal and at 3:44 pm on Saturday August 22nd, when a manager threatened to write me up for going nonverbal, something in me finally broke. I walked home stumbling through a dissociative fog, feeling myself cracking under pressure, parts of me deforming and fracturing under the mounting strain. I could feel a vastness welling up from beyond the splintering remains of myself. I curled up in my closet and sobbed. I felt like I was dying, like I was mourning the person I was, who I had spent so long aspiring to be and worked so hard to be. I didn’t want to die, but I couldn’t cope with my life, with my reality and I didn’t know what to do. I couldn’t escape from myself, I couldn’t escape from my life, and I certainly couldn’t escape from my reality. I had boxed myself in and my only way out was to die, the only question was how much suffering I could handle first.

A frantic, manic energy whirled up inside me as I felt the walls of my prison closing in and my sense of self underwent a final, chaotic extinction burst. I took four tabs of acid and started drawing. With mounting madness I threw myself against the walls of my prison, flailing in every dimension I could to find escape, begging for something somewhere out there in the darkness to save me–

Part of the Series: Open Portals
Next Post: And the Darkness Answered


### Why are the websites of major companies so bad at core functionality?

LessWrong.com News - 3 hours 59 minutes ago
Published on May 6, 2021 4:02 PM GMT

Intuitively, you might think that Amazon cares about people being able to enter their banking details as easily as possible. In reality, at least Amazon.de doesn't seem to care. Instead of instantly validating with JavaScript that an entered IBAN is valid, the form waits to give feedback until the user clicks OK.
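For what it's worth, the instant check being asked for here is cheap to implement: an IBAN carries its own mod-97 check digits (ISO 13616), so a form can flag most typos client-side before submission. A minimal sketch in TypeScript; the function name and regex are illustrative, not Amazon's actual code:

```typescript
// Minimal IBAN checksum validation (ISO 13616 mod-97), the kind of
// instant client-side check a payment form could run on every keystroke.
function isValidIban(input: string): boolean {
  const iban = input.replace(/\s+/g, "").toUpperCase();
  // Basic shape: two letters, two check digits, up to 30 alphanumerics.
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{1,30}$/.test(iban)) return false;
  // Move the country code and check digits to the end.
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  // Replace letters with numbers (A=10 ... Z=35) and take mod 97
  // digit by digit so we never overflow a regular number.
  let remainder = 0;
  for (const ch of rearranged) {
    const value = ch >= "A" ? (ch.charCodeAt(0) - 55).toString() : ch;
    for (const digit of value) {
      remainder = (remainder * 10 + Number(digit)) % 97;
    }
  }
  return remainder === 1;
}
```

This catches transposed or mistyped digits immediately; per-country length rules and bank-code lookups would still need a fuller library or server-side check.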

I just used Uber for the first time, and the task of entering street addresses is awful.

1. There's no autocomplete for street names that allows me to add the street number.
2. Berlin is a big city, yet the first hits in the proposed list are streets with the same name in other cities.
3. The suggestions that Uber gives don't include postal codes, which are important because multiple streets in Berlin share the same name.
4. There seems to be no obvious way to select a street name and then choose on a map where on the street you want to be picked up. After a bit of searching I did find that I can click on the street name for pickup, but the touchable area is very small and could easily be expanded into the empty whitespace above. For the destination there still seems to be no way to add more details to the autocompleted street name.
5. If I type the name of a contact from my address book, I don't get their address shown.

How is it that these companies employ thousands of software developers yet manage to do so badly at providing basic functionality to users? What are all those engineers doing?


### Could MMRPGs be used to test economic theories?

LessWrong.com News - 5 hours 34 minutes ago
Published on May 6, 2021 2:26 PM GMT

Most (all?) economic theories are based on models. Models make assumptions about the world. Sometimes these assumptions are reasonable and the model will make useful/accurate predictions. Sometimes these assumptions are unreasonable, and the model will make bad predictions.

One of the hardest things to predict or make assumptions about is how humans act. The most accurate way to test whether assumptions about human behaviour are correct is to run an experiment on actual humans. However, running an experiment on humans at a scale large enough to actually test economic theories is expensive, and has legal and ethical issues (although for some reason if a government does exactly the same thing but doesn't call it an experiment, it's ok).

MMRPGs (massively multiplayer role-playing games) sound like a decent way to test how humans behave at a large scale with low risk.

For example, you could imagine a tycoon-style game where players buy, develop, rent, and sell plots of land. Then you could test what impact Georgist land taxes or Harberger taxes have on the 'economy' of the game.
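A Harberger tax in particular is easy to state as a game rule, which makes it a natural first candidate: owners publicly self-assess each plot's price, pay a recurring tax on that assessment, and must sell to any player who offers the assessed amount. A toy sketch of the mechanic, where all names, numbers, and rates are hypothetical:

```typescript
// Toy sketch of a Harberger (self-assessed) property tax as a game
// mechanic. This is a thought-experiment skeleton, not real game code.
interface Plot {
  owner: string;
  selfAssessedPrice: number; // the owner publicly declares this value
}

// Tax rate per game tick, chosen arbitrarily for the sketch.
const TAX_RATE = 0.07;

// The owner pays tax proportional to their own declared price,
// which punishes over-assessment.
function collectTax(plot: Plot): number {
  return plot.selfAssessedPrice * TAX_RATE;
}

// Any player may buy the plot at the declared price, which punishes
// under-assessment. Together the two rules pressure honest valuations.
function forceBuy(plot: Plot, buyer: string, funds: number): boolean {
  if (funds < plot.selfAssessedPrice) return false;
  plot.owner = buyer;
  return true;
}
```

Logging tax revenue, declared prices, and plot turnover per tick across many players would yield exactly the kind of aggregate data the post is after, without experimenting on a real housing market.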

Would it be possible to create an MMRPG with an accurate enough model of the economy to test economic theories whilst still being fun to play?

What parts of the economy would need to be modelled accurately, and what parts could be simplified/abstracted away?

What economic theories would you test, and how?

What would the plot of your economic MMRPG be?

How would it be funded?

What types of inaccuracies would you expect results from theories tested that way to contain?

Discuss

### Anthropics: different probabilities, different questions

Новости LessWrong.com - 6 часов 47 минут назад
Published on May 6, 2021 1:14 PM GMT

I've written before that different theories of anthropic probability are really answers to different questions. In this post I'll try to be as clear as possible on what that means, and explore the implications.

Introduction

One of Nick Bostrom's early anthropic examples involved different numbers of cars in different lanes. Here is a modification of that example:

Given that, what is your probability of being in the left lane?

That probability is obviously 1%. More interesting than the answer itself is that there are multiple ways of reaching it, and each of these ways corresponds to answering a slightly different question. This leads to my ultimate answer about anthropic probability:

• Each theory of anthropic probability corresponds to answering a specific, different question about proportions. These questions are equivalent in non-anthropic settings, so each of them feels like a potentially "true" extension of probability to anthropics. Paradoxes and confusion in anthropics result from confusing one question with another.

So if I'm asked "what's the 'real' anthropic probability of X?", my answer is: tell me what you mean by probability, and I'll tell you what the answer is.

0. The questions

If X is a feature that you might or might not have (like being in a left lane), here are several questions that might encode the probability of X:

1. What proportion of potential observers have X?
2. What proportion of potential observers exactly like you have X?
3. What is the average proportion of potential observers with X?
4. What is the average proportion of potential observers exactly like you with X?

We'll look at each of these questions in turn[1], and see what they imply in anthropic and non-anthropic situations.

1. Proportion of potential observers: SIA

We're trying to answer "Given that, what is your probability of being in the left lane?" The "that" means being in the tunnel in the situation above, so we're actually looking for a conditional probability, best expressed as:

1. What proportion of the potential observers, who are in the tunnel in the situation above, are also in the left lane?

The answer for that is an immediate "one in a hundred", or 1%, since we know there are 100 drivers in the tunnel, and 1 of them is in the left lane. There may be millions of different tunnels, in trillions of different potential universes; but, assuming we don't need to worry about infinity[2], we can count 100 observers in the tunnel in that situation for each observer in the left lane.

1.1 Anthropic variant

Let's now see how this approach generalises to anthropic problems. Here is an anthropic version of the tunnel problem, based on the incubator version of the Sleeping Beauty problem:

A godly AI creates a tunnel, then flips a fair coin. If the coin comes out heads, it will create one person in the tunnel; if it was tails, it creates 99 people.

You've just woken up in this tunnel; what is the probability that the coin was heads?

1. What proportion of the potential observers, who are in the tunnel, are also in a world where the coin was heads?

We can't just count off observers within the same universe here, since the 99 and the 1 observers don't exist in the same universe. But we can pair up universes here: for each universe where the coin flip goes heads (1 observer), there is another universe of equal probability where the coin flip goes tails (99 observers).

So the answer to the proportion of potential observers question remains 1%, just as in the non-anthropic situation.

This is exactly the "self-indication assumption" (SIA) version of probability, which counts observers in other potential universes as if they existed in a larger multiverse of potential universes[3].
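As a sanity check, the probability-weighted counting that SIA does can be written out explicitly. A small sketch of my own (not from the post), enumerating the two equally likely worlds of the incubator tunnel:

```python
# Two equally likely worlds: heads creates 1 tunnel observer, tails creates 99.
worlds = [
    {"prob": 0.5, "coin": "heads", "observers": 1},
    {"prob": 0.5, "coin": "tails", "observers": 99},
]

# SIA: probability-weighted proportion of potential observers in heads worlds
# (see footnote 3 on the probability weighting).
total = sum(w["prob"] * w["observers"] for w in worlds)
heads = sum(w["prob"] * w["observers"] for w in worlds if w["coin"] == "heads")
print(heads / total)  # 0.01 -- the 1% answer

# For contrast, the within-world averaging of section 3 (SSA), with every
# observer in a tunnel: the heads proportion is 1 in the heads world, 0 in tails.
ssa = sum(w["prob"] * (1 if w["coin"] == "heads" else 0) for w in worlds)
print(ssa)  # 0.5
```

The two answers differ only in whether observers are pooled across worlds before taking the proportion (SIA) or proportions are taken within each world first and then averaged (SSA).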

2. Proportion of potential observers exactly like you: SIA again

Let's now look at the second question:

1. What proportion of the potential observers exactly like you, who are in the tunnel in the situation above, are also in the left lane?

The phrase "exactly like you" is underdefined - do you require that the other yous be made of exactly the same material, be in the same location, etc.? I'll cash out the phrase as meaning "has had the same subjective experience as you". So we can cash out the left-lane probability as:

1. What proportion of the potential observers, with the same subjective experiences as you, who are in the tunnel in the situation above, are also in the left lane?

We can't count off observers within the same universe for this, as the chance of having multiple observers with the same subjective experience in the same universe is very low, unless there are huge numbers of observers.

Instead, assume that one in Ω observers in the tunnel have the same subjective experiences as you. This proportion[4] must be equal for an observer in the left and right lanes. If it weren't, you could deduce information about which lane you were in just from your experiences - so the proportion being equal is the same thing as the lane and your subjective experiences being independent. For any given little ω, this gives the following proportions (where "Right 1 not you" is short for "the same world as 'Right 1 you,' apart from the first person on the right, who is replaced with a non-you observer"):

So the proportion of observers in the right/left lane with your subjective experience is 1/Ω the proportion of observers in the right/left lane. When comparing those two proportions, the two 1/Ω cancel out, and we get 1%, as before.

2.1 Anthropic variant

Ask the anthropic version of the question:

1. What proportion of the potential observers who are in the tunnel, with the same subjective experiences as you, are also in a world where the coin was heads?

Then same argument as above shows this is also 1% (where "Tails 1 not you" is short for "the same world as 'Tails 1 you,' apart from the first tails person, who is replaced with a non-you observer"):

This is still SIA, and reflects the fact that, for SIA, the reference class doesn't matter - as long as it includes the observers subjectively indistinguishable from you. So questions about you are the same whether we talk about "observers" or "observers with the same subjective experiences as you".

3. Average proportions of observers: SSA

We now turn to the next question:

1. What is the average proportion of potential observers in the left lane, relative to the average proportion of potential observers in the tunnel?

Within a given world, say there are N observers not in the tunnel and t tunnels, so N+100t observers in total.

The proportion of observers in the left lane is t/(N+100t), while the proportion of observers in the tunnel is 100t/(N+100t). The ratio of these proportions is 1:100.

Then notice that if a and b are in a 1:100 proportion in every possible world, the averages of a and b are in a 1:100 proportion as well[5], giving the standard probability of 1%.

3.1 Anthropic variant

The anthropic variant of the question is then:

1. What is the average proportion of potential observers in a world where the coin was heads, relative to the average proportion of potential observers in the tunnel?

Within a given world, ignoring the coin, say there are N observers not in the tunnel, and t tunnels. Let's focus on the case with one tunnel, t=1. Then the coin toss splits this world into two equally probable worlds, the heads world, WH, with N+1 observers, and the tails world, WT with N+99 observers:

The proportion of observers in tunnels in WH is 1/(N+1). The proportion of observers in tunnels in WT is 99/(N+99). Hence, across these two worlds, the average proportion of observers in tunnels is the average of these two, specifically

(1/2)(1/(N+1) + 99/(N+99)) = (50N+99)/((N+1)(N+99)).

If N is zero, this is 99/99=1; this is intuitive, since N=0 means that all observers are in tunnels, so the average proportion of observers in tunnels is 1.

What about the proportion of observers in the tunnels in the heads world? Well, this is 1/(N+1) in the heads world, and 0 in the tails world, so the average proportion is:

(1/2)(1/(N+1) + 0) = 1/(2(N+1)).

If N is zero, this is 1/2 -- the average of 1, the heads-world proportion for N=0 in WH (all observers are heads-world observers in tunnels), and 0, the proportion of heads-world observers in the tails world WT.

Taking the ratio (1/2)/1 = 1/2, the answer to that question is 1/2. This is the answer given by the "self-sampling assumption" (SSA), which gives the 1/2 response in the Sleeping Beauty problem (of which this is a variant).

In general, the ratio would be:

1/(2(N+1)) ÷ (50N+99)/((N+1)(N+99)) = (N+99)/(100N+198).

If N is very large, this is approximately 1/100, i.e. the same answer as SIA would give. This reflects the fact that, for SSA, the reference class of observers is important: N, the number of observers not in tunnels, determines the probability estimate. So how we define "observers" will determine our probability[6].
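The dependence on N is easy to check numerically. A quick sketch of my own, following the averages above with exact rational arithmetic:

```python
from fractions import Fraction

def ssa_heads_ratio(n: int) -> Fraction:
    # Average proportion of observers in tunnels across the two worlds...
    in_tunnels = Fraction(1, 2) * (Fraction(1, n + 1) + Fraction(99, n + 99))
    # ...and average proportion of observers in heads-world tunnels.
    in_heads = Fraction(1, 2) * Fraction(1, n + 1)
    return in_heads / in_tunnels

assert ssa_heads_ratio(0) == Fraction(1, 2)  # all observers in tunnels: 1/2
for n in (0, 10, 1000, 10**9):
    # agrees with the closed form (N+99)/(100N+198)
    assert ssa_heads_ratio(n) == Fraction(n + 99, 100 * n + 198)
print(float(ssa_heads_ratio(10**9)))  # ≈ 0.01: approaching SIA's answer
```

Sliding N from 0 to large values walks the SSA answer continuously from 1/2 down to SIA's 1/100, which is the reference-class sensitivity the text describes.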

So, for a given pair of equally likely worlds, WH and WT, the ratio in question 3. varies between 1/2 and 1/100. This holds true for multiple tunnels as well. And it's not hard to see that this implies that, averaging across all worlds, we also get a ratio between 1/2 (all observers in the reference class are in tunnels) and 1/100 (almost no observers in the reference class are in tunnels).

4. Average proportions of observers exactly like you: FNC

Almost there! We have a last question to ask:

1. What is the average proportion of potential observers in the left lane, with the same subjective experiences as you, relative to the average proportion of potential observers in the tunnel, with the same subjective experiences as you?

I'll spare you the proof that this gives 1% again, and turn directly to the anthropic variant:

1. What is the average proportion of potential observers in a world where the coin was heads, with the same subjective experiences as you, relative to the average proportion of potential observers in the tunnel, with the same subjective experiences as you?

By the previous section, this is the SSA probability with the reference class of "observers with the same subjective experiences as you". This turns out to be full non-indexical conditioning (FNC), which involves conditioning on every observation you've made, no matter how irrelevant. It's known that if all the observers have made the same observations, this reproduces SSA, but that as the number of unique observations increases, it tends to SIA.

That's because FNC is inconsistent - the odds of heads to tails change based on irrelevant observations which change your subjective experience. Here we can see what's going on: FNC is SSA with the reference class of observers with the same subjective experiences as you. But this reference class is variable: as you observe more, the size of the reference class changes, decreasing[7] because others in the reference class will observe something different to what you do.

But SSA is not consistent across reference class changes! So FNC is not stable across new observations, even if those observations are irrelevant to the probability being estimated.

For example, imagine that we started, in the tails world, with all 99 copies exactly identical to you, and then you make a complex observation. Then that world will split into many worlds where there are no exact copies of you (since none of them made exactly the same observation as you), a few worlds where there is one copy of you (that made the same observation as you), and many fewer worlds where there is more than one copy of you:

In the heads world, we only have the no-exact-copies and one-exact-copy cases. We can ignore the worlds without observers exactly like us, and concentrate on the worlds with a single observer like us (these represent the vast majority of the probability mass). Then, since there are 99 possible locations in the tails world and 1 in the heads world, we get a ratio of roughly 99:1 for tails over heads:

This gives a ratio of roughly 100:1 for "any coin result" over heads, and shows why FNC converges to SIA.
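The convergence can be made concrete with a toy model of my own (not from the post): suppose each observer independently makes one of K equally likely observations, and FNC conditions on at least one observer having made exactly yours:

```python
def fnc_heads_odds(k: int) -> float:
    """Odds of heads:tails under FNC in the incubator tunnel, when each
    observer independently makes one of k equally likely observations."""
    p_match = 1.0 / k
    p_heads_has_you = p_match                      # the 1 heads-world observer
    p_tails_has_you = 1.0 - (1.0 - p_match) ** 99  # any of the 99 tails observers
    return p_heads_has_you / p_tails_has_you

print(fnc_heads_odds(1))      # 1.0 -- identical observers: SSA's even odds
print(fnc_heads_odds(10**6))  # ≈ 1/99 -- unique observations: SIA's odds
```

As K grows, the odds slide continuously from SSA's 1:1 to SIA's 1:99, which is exactly the instability under new irrelevant observations described above.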

5. What decision to make: ADT

There's a fifth question you could ask:

1. What is the best action I can take, given what I know about the observers, our decision algorithms, and my utility function?

This transforms the probability question into a decision-theoretic question. I've posted at length on Anthropic Decision Theory, which is the answer to that question. Since I've done a lot of work on that already, I won't be repeating that work here. I'll just point out that "what's the best decision" is something that can be computed independently of the various versions of "what's the probability".

5.1 How right do you want to be?

An alternate characterisation of the SIA and SSA questions could be to ask, "If I said 'I have X', would I want most of my copies to be correct (SIA) or my copies to be correct in most universes (SSA)?"

These can be seen as having two different utility functions (one linear in copies that are correct, one that gives rewards in universes where my copies are correct), and acting to maximise them. See the post here for more details.

6. Some "paradoxes" of anthropic reasoning

Given the above, let's look again at some of the paradoxes of anthropic reasoning. I'll choose three: the Doomsday argument, the presumptuous philosopher, and Robin Hanson's take on grabby aliens.

6.1 Doomsday argument

The Doomsday argument claims that the end of humanity is likely to be at hand - or at least more likely than we might think.

To see how the argument goes, we could ask "what proportion of humans will be in the last 90% of all humans who have ever lived in their universe?" The answer to that is, tautologically[8], 90%.

The simplest Doomsday argument would then reason from that, saying "with 90% probability, we are in the last 90% of humans in our universe, so, with 90% probability, humanity will end in this universe before it reaches 100 times the human population to date."

What went wrong there? The use of the term "probability", without qualifiers. The sentence slipped from using probability in terms of ratios within universes (the SSA version) to ratios of which universes we find ourselves in (the SIA version).

As an illustration, imagine that the godly AI creates either world W0 (with 0 humans), W10 (with 10 humans), W100 (with 100 humans), or W1,000 (with 1,000 humans), each with probability 1/4. These humans are created in numbered rooms, in order, starting at room 1.

• A. What proportion of humans are in the last 90% of all humans created in their universe?

That proportion is undefined for W0. But for the other worlds, the proportion is 90% (e.g. humans 2 through 10 for W10, humans 11 through 100 for W100 etc...). Ignoring the undefined world, the average proportion is also 90%.

Now suppose we are created in one of those rooms, and we notice that it is room number 100. This rules out worlds W0 and W10; but the average proportion remains 90%.

• B. What proportion of humans in room 100 are in the last 90% of all humans created in their universe?

As before, humans being in room 100 eliminates worlds W0 and W10. The worlds W100 and W1,000 are equally likely, and each have one human in room 100. In W100, we are in the last 90% of humans; in W1,000, we are not. So the answer to question B is 50%.

Thus the answer to A is 90%, the answer to B is 50%, and there is no contradiction between these.
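Both answers can be checked by enumerating the four worlds. A small sketch of my own, following the setup above:

```python
from fractions import Fraction

populations = [0, 10, 100, 1000]  # the four worlds, each with probability 1/4

# A. Within each non-empty world of n humans, the last 90% occupy rooms
# n//10 + 1 through n -- a 90% proportion in every world.
assert all(Fraction(n - n // 10, n) == Fraction(9, 10)
           for n in populations if n > 0)

# B. Worlds containing a human in room 100 (W100 and W1000) remain equally
# likely; in how many of them is room 100 within the last 90%?
containing = [n for n in populations if n >= 100]
in_last_90 = [n for n in containing if 100 > n // 10]
print(Fraction(len(in_last_90), len(containing)))  # 1/2
```

Question A averages within worlds; question B compares across the worlds compatible with your room number. The code makes it plain that no step contradicts any other.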

Another way of thinking of this: suppose you play a game where you invest a certain amount of coins. With probability 0.9, your money is multiplied by ten; with probability 0.1, you lose everything. You continue re-investing the money until you lose. This is illustrated by the following diagram (with the initial investment indicated by green coins):

Then it is simultaneously true that:

1. 90% of all the coins you earnt were lost the very first time you invested them, and
2. You have only 10% chance of losing any given investment.
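The arithmetic behind these two statements can be spelled out. In a game that survives g winning rounds before the loss, each winning round earns nine new coins per coin staked; the sketch below (my own bookkeeping, not the post's diagram) tracks each cohort of coins:

```python
def coin_fates(g: int):
    """Cohorts of coins in a game with g winning rounds followed by a loss.
    Each winning round multiplies the stake by 10, i.e. earns 9x new coins."""
    earnt = [1] + [9 * 10**(k - 1) for k in range(1, g + 1)]
    # Only the final cohort's first investment is the losing one; every
    # earlier cohort survived its first outing.
    lost_on_first_investment = earnt[-1]
    return lost_on_first_investment, sum(earnt)

for g in (1, 3, 10):
    lost, total = coin_fates(g)
    print(lost / total)  # 0.9, whatever g is
```

So 90% of all coins earnt are lost on their first investment in every completed game, even though each individual investment fails only 10% of the time.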

So being more precise about what is meant by "probability" dissolves the Doomsday argument.

6.2 Presumptuous philosopher

Nick Bostrom introduced the presumptuous philosopher thought experiment to illustrate a paradox of SIA:

It is the year 2100 and physicists have narrowed down the search for a theory of everything to only two remaining plausible candidate theories: T1 and T2 (using considerations from super-duper symmetry). According to T1 the world is very, very big but finite and there are a total of a trillion trillion observers in the cosmos. According to T2, the world is very, very, very big but finite and there are a trillion trillion trillion observers. The super-duper symmetry considerations are indifferent between these two theories. Physicists are preparing a simple experiment that will falsify one of the theories. Enter the presumptuous philosopher: “Hey guys, it is completely unnecessary for you to do the experiment, because I can already show you that T2 is about a trillion times more likely to be true than T1!”

The first thing to note is that the presumptuous philosopher (PP) may not even be right under SIA. We could ask:

• A. What proportion of the observers exactly like the PP are in the T1 universes relative to the T2 universes?

Recall that SIA is independent of reference class, so adding "exactly like the PP" doesn't change this. So, what is the answer to A.?

Now, T2 universes have a trillion times more observers than the T1 universes, but that doesn't necessarily mean that the PP are more likely in them. Suppose that everyone in these universes knows their rank of birth; for the PP it's the number 24601:

Then since all universes have more than 24601 inhabitants, the PP exists equally in T1 universes and T2 universes; the proportion is therefore 50% (interpreting "the super-duper symmetry considerations are indifferent between these two theories" as meaning "the two theories are equally likely").

Suppose, however, that the PP does not know their rank, and the T2 universes are akin to a trillion independent copies of the T1 universes, each of which has an independent chance of generating an exact copy of the PP:

Then SIA would indeed shift the odds by a factor of a trillion, giving a proportion of 1/(10^12+1). But this is not so much a paradox, as the PP is correctly thinking "if all the exact copies of me in the multiverse of possibilities were to guess we were in T2 universes, only one in a trillion of them would be wrong".

If we instead ask the SSA-style question:

1. What is the average proportion of PPs among other observers, in T1 versus T2 universes?

Then we would get the SSA answer. If the PPs know their birth rank, this is a proportion of 10^12:1 in favour of T1 universes. That's because there is just one PP in each universe, and a trillion times more people in the T2 universes, which dilutes the proportion.

If the PP doesn't know their birth rank, then this proportion is the same[9] in the T1 and T2 universes. In probability terms, this would mean a "probability" of 50% for T1 and T2.

6.3 Anthropics and grabby aliens

The other paradoxes of anthropic reasoning can be treated similarly to the above. Now let's look at a more recent use of anthropics, due to Robin Hanson, Daniel Martin, Calvin McCarter, and Jonathan Paulson.

The basic scenario is one in which a certain number of alien species are "grabby": they will expand across the universe, at almost the speed of light, and prevent any other species of intelligent life from evolving independently within their expanding zone of influence[10].

Humanity has not noticed any grabby aliens in the cosmos; so we are not within their zone of influence. If they had started nearby and some time ago - say within the Milky Way and half a million years ago - then they would be here by now.

What if grabby aliens recently evolved a few billion light years away? Well, we wouldn't see them until a few billion years have passed. So we're fine. But if humans had instead evolved several billion years in the future, then we wouldn't be fine: the grabby aliens would have reached this location before then, and prevented us evolving, or at least would have affected us.

Robin Hanson sees this as an anthropic solution to a puzzle: why did humanity evolve early, i.e. only 13.8 billion years after the Big Bang? We didn't evolve as early as we possibly could - the Earth is a latecomer among Earth-like planets. But the smaller stars will last for trillions of years. Most habitable epochs in the history of the galaxy will be on planets around these small stars, way into the future.

One possible solution to this puzzle is grabby aliens. If grabby aliens are likely (but not too likely), then we could only have evolved in this brief window before they reached us. I mentioned that SIA doesn't work for this (for the same reason that it doesn't care about the Doomsday argument). Robin Hanson then responded:

If your theory of the universe says that what actually happened is way out in the tails of the distribution of what could happen, you should be especially eager to find alternate theories in which what happened is not so far into the tails. And more willing to believe those alternate theories because of that fact.

That is essentially Bayesian reasoning. If you have two theories, T1 and T2, and your observations are very unlikely given T1 but more likely given T2, then this gives extra weight to T2.

Here we could have three theories:

1. T0: "There are grabby aliens nearby"
2. T1: "There are grabby aliens a moderate distance away"
3. T2: "Any grabby aliens are very far away"

Theory T0 can be ruled out by the fact that we exist. Theory T1 posits that humans could not have evolved much later than we did (or else the grabby aliens would have stopped us). Theory T2 allows for the possibility that humans evolved much later than we did. So, from T2's perspective, it is "surprising" that we evolved so early; from T1's perspective, it isn't, as this is the only possible window.

But by "theory of the universe", Robin Hanson meant not only the theory of how the physical universe was, but the anthropic probability theory. The main candidates are SIA and SSA. SIA is indifferent between T1 and T2. But SSA prefers T1 (after updating on the time of our evolution). So we are more surprised under SIA than under SSA, which, in Bayesian/Robin reasoning, means that SSA is more likely to be correct.

But let's not talk about anthropic probability theories; let's instead see what questions are being answered. SIA is equivalent to asking the question:

1. What proportion of universes with humans exactly like us have moderately close grabby aliens (T1) versus very distant grabby aliens (T2)?

Or, perhaps more relevant to our future:

1. In what proportion of universes with humans exactly like us would those humans, upon expanding into the universe, encounter grabby aliens (T1) or not encounter them (T2)?

In contrast, the question SSA is asking is:

1. What is the average proportion of humans among all observers, in universes where there are nearby grabby aliens (T1) versus very distant grabby aliens (T2)?

If we were launching an interstellar exploration mission, and were asking ourselves what "the probability" of encountering grabby alien life was, then question 1. seems a closer phrasing of that than question 2. is.

And question 2. has the usual reference class problems. I said "observers", but I could have defined this narrowly as "human observers"; in which case it would have given a more SIA-like answer. Or I could have defined it expansively as "all observers, including those that might have been created by grabby aliens"; in that case SSA ceases to prioritise T1 theories and may prioritise T2 ones instead. In that case, humans are indeed "way out in the tails", given T2: we are the very rare observers that have not seen or been created by grabby aliens.

In fact, the same reasoning that prefers SSA in the first place would have preferences over the reference class. The narrowest reference classes are the least surprising - given that we are humans in the 21st century with this history, how surprising is it that we are humans in the 21st century with this history? - so they would be "preferred" by this argument.

But the real response is that Robin is making a category error. If we substitute "question" for "theory", we can transform his point into:

If your question about the universe gets a very surprising answer, you should be especially eager to ask alternate questions with less surprising answers. And more willing to believe those alternate questions.

1. We could ask some variants of questions 3. and 4., by maybe counting causally disconnected segments of universes as different universes (this doesn't change questions 1. and 2.). We'll ignore this possibility in this post. ↩︎

2. And also assuming that the radio's description of the situation is correct! ↩︎

3. Notice here that I've counted off observers with other observers that have exactly the same probability of existing. To be technical, the question which gives SIA probabilities should be "what proportion of potential observers, weighted by their probability of existing, have X?" ↩︎

4. More accurately: probability-weighted proportion. ↩︎

5. Let 𝒲 be a set of worlds and p a probability distribution over 𝒲. Then the expectation of a is E(a) = Σ_{W∈𝒲} p(W) a_W = Σ_{W∈𝒲} p(W) b_W/100 = (1/100) Σ_{W∈𝒲} p(W) b_W = (1/100) E(b), which is 1/100 times the expectation of b. ↩︎

6. If we replace "observers" with "observer moments", then this question is equivalent to the probability generated by the Strong Self-Sampling Assumption (SSSA). ↩︎

7. If you forget some observations, your reference class can increase, as previously different copies become indistinguishable. ↩︎

8. Assuming the population is divisible by 10. ↩︎

9. As usual with SSA and this kind of question, this depends on how you define the reference class of "other observers", and who counts as a PP. ↩︎

10. This doesn't mean they will sterilise planets or kill other species; just that any being evolving within their control will be affected by them and know that they're around. Hence grabby aliens are, by definition, not hidden from view. ↩︎

Discuss

### Voting Assistants

Новости LessWrong.com - 8 часов 30 минут назад
Published on May 6, 2021 7:52 AM GMT

Abstract

This post discusses voting assistants - computer programs that help voters make choices - and gives some reasons for them being a likely development for the future of democracy. It describes current voting advice assistants and concludes with notes on how to positively shape their future development.

Introduction

Our future as a human society is influenced overwhelmingly by the way we collectively make decisions. Both decisions that directly impact our quality of life and those that will increase the chances of survival of future generations are shaped by our political landscapes, voting rules, and ultimately every voter's level of information and interest in the process. It is not enough to make scientific progress to increase our chances of survival. This progress must also be translated into actions, laws, and institutions.

The design of the voting process is critical for decision-making in democratic countries. Given the same starting conditions, vastly different results can be obtained if votes are counted differently or voters are asked different questions. This means we can strongly influence outcomes by shaping processes. This influence could be used to counteract factors that make democracy particularly inefficient, such as voters being deliberately misinformed or not having the capability to understand all the issues they vote on. Voting assistants could help with this and also increase voter turnout.

These programs, which support voters in making well-informed and thoroughly considered decisions, are already used by many voters, mostly in European democracies, and will keep becoming more powerful with the use of more data and better recommendation algorithms. A vision for how they could affect the future of our political systems is known under the term Augmented Democracy [4].

While there are many concerns to be aware of, these assistants have the advantage of protecting human autonomy to a certain level, as they still give humans the possibility to cast binding votes themselves as opposed to uncontrolled "welfare optimization" by AI rulers. Nevertheless, they let us profit from the superior reasoning capabilities of computers and could be made more transparent than private AI advisors used by individual politicians and parties.

We will discuss autonomy, transparency, and fairness issues and give some ideas on how we could positively shape the development of voting assistants. There are many open questions, and this post can be seen as a starting point to think more about the interaction of technological developments and democratic processes.

Voting Assistants

Definition: In this post, a voting assistant is any computer-based system that helps humans choose how to vote.

The voting process passes information on people’s preferences from individuals to the political system. There is a strong communication problem though: voters do not know what the best choice is given their preferences. Even if they make the best choice, this choice only communicates crude information on what they wanted to express (e.g. it might be unclear which part of a party program convinced them). This imperfect communication leads to suboptimal decisions.

In direct democratic systems, the transmission from known preferences to elicited preferences tends to be more accurate than in indirect democracies. On the other hand, the known preferences are further from the real preferences since there are many topics citizens have to vote on and they, therefore, have less time to consider each topic. This leads to an overall low accuracy of expression. Voting assistants can solve this issue.

A far-future story

This is only one (quite extreme) scenario for how voting with assistants might work. The basis for this story is the current state of voting advice assistants. These will be described in the next section. There is a wide spectrum of how much decision-making authority could be transferred from humans to machines, and different flavors are discussed later.

Current Voting Assistants

Currently, the main type of voting assistant in use is the Voting Advice Application (VAA). These applications support voters, mainly in representative democracies, in deciding which candidates or parties to vote for. To do this, they present political statements such as "The EU should set itself higher goals for the reduction of CO2-Emissions" (example taken from the application Wahl-O-Mat for the European election). Voters state how much they agree with each statement and can afterward weigh which of the statements they find most important.

The voter's replies are then compared to party positions. Party positions are either collected by directly asking parties, which then, for example, conduct internal voting processes to select their answers, or by asking experts to judge the party positions based on their programs, actions, and statements. The comparison is done via distance measurements. These are often Euclidean distances on spaces in which each statement is represented by one dimension.
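The distance matching can be sketched in a few lines. This is a minimal illustration with made-up parties, positions, and weights, not the algorithm of any real VAA:

```python
import math

# Positions on each statement: -1 = disagree, 0 = neutral, 1 = agree.
party_positions = {
    "Party A": [1, -1, 0],
    "Party B": [-1, 1, 1],
    "Party C": [0, 0, 1],
}

def match(voter: list, weights: list) -> list:
    """Rank parties by weighted Euclidean distance to the voter's answers."""
    def dist(positions: list) -> float:
        return math.sqrt(sum(w * (v - p) ** 2
                             for v, p, w in zip(voter, positions, weights)))
    return sorted(((name, dist(pos)) for name, pos in party_positions.items()),
                  key=lambda pair: pair[1])

# A voter who agrees with statements 1 and 3, disagrees with 2,
# and doubles the weight of statement 1:
ranking = match([1, -1, 1], weights=[2, 1, 1])
print(ranking[0][0])  # Party A -- the closest party
```

Each statement contributes one dimension, and the user's importance weighting simply stretches that dimension before distances are compared.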

VAAs are very popular. In the 2019 European election in Germany, the Wahl-O-Mat was used 9.8 million times, which would correspond to 20% of the voters if every use corresponded to a separate voter. With a similar calculation, over 40% of all voters in the 2012 Dutch election used a VAA. Moreover, some form of VAA exists in almost all EU27 countries and some beyond Europe [2].

Development trends

We discuss trends in voting assistant development, as they can inform our discussion on how they might evolve in the future.

Although distance measurements between parties and user preferences are the most commonly used technique at the moment, other strategies are also used to match voters with parties.

• Machine learning approaches have been developed to reduce the assumptions made about the space in which distances are measured [6]. With learning algorithms, prediction accuracy can also be improved while asking fewer questions.
• Social system approaches: here, voters are matched with the party that other voters with the most similar views would vote for. Often a restriction is applied so that only other voters who state that they are very informed about and interested in politics are taken into account [7].
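The social matching strategy can likewise be sketched as a nearest-neighbor vote among self-reportedly informed voters. Everything below (the answer data, party names, and choice of k) is illustrative, not drawn from any real social VAA:

```python
import numpy as np
from collections import Counter

def social_match(new_answers, informed_answers, informed_votes, k=3):
    """Suggest the party that the k most similar informed voters back.

    informed_answers: statement answers of voters who report being
    well-informed; informed_votes: the party each of them supports.
    """
    new = np.asarray(new_answers, dtype=float)
    dists = [np.linalg.norm(new - np.asarray(a, dtype=float))
             for a in informed_answers]
    nearest = np.argsort(dists)[:k]          # indices of k closest voters
    tally = Counter(informed_votes[i] for i in nearest)
    return tally.most_common(1)[0][0]        # majority vote among them

answers = [[1, 1, -1], [1, 0, -1], [-1, -1, 1], [-1, -1, 0]]
votes = ["Green", "Green", "Blue", "Blue"]
suggestion = social_match([1, 1, 0], answers, votes, k=3)
```

The restriction mentioned in the text corresponds to filtering `informed_answers` down to voters passing a self-reported knowledge threshold before matching.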

There is some discussion of the future of VAAs in the academic literature. Garzia et al. wonder "whether democracy would still be so 'unthinkable' without political parties" and mention a decline in party identification as a possible reason for a shift towards more direct democracy [8].

A particularly well-known vision for the future is "Augmented Democracy", introduced in a TED talk by César Hidalgo. He describes voting assistants as "digital twins" which learn about users’ preferences and then take part in a parliament that makes political decisions [4]. His conception is very similar to the far-future story above.

How might these trends continue in the longer-term future?

We suggest viewing developments over a long time scale as movement along a spectrum of directness and of transfer of decision-making from humans to machines. At the moment, our political systems are mostly representative rather than direct. Voting assistants have little decision-making power of their own: they exert no direct power but rather influence humans. They still require humans to form opinions on political statements that can be rather complex. Therefore, the quality of voting decisions made through these assistants depends on how well informed, rational, and invested voters are.

The far-future story presents a vision of a very direct system with human decision-making guided by machines. If human autonomy is lost beyond this point, we arrive at AI welfare aggregation. Between today and this future extreme, there are many other states we could pass through, or choose to remain in if, for example, we wish to preserve more human autonomy.

Increased directness is currently not strongly supported because of concerns regarding voter rationality and information. To avoid voters being overwhelmed by the hard decisions they would have to make in more direct systems, voting assistants would probably need to be programmed to ask new types of questions, breaking larger decisions down into small parts. They could, for example, learn about a voter's moral positions with moral dilemmas similar to the Moral Machine experiment [9]. With simple questions, they could find out what a voter considers a fair division of goods, then predict how different welfare state models affect the division of goods in society and suggest a vote accordingly. Initially, human-made analysis could be part of these systems; later, computers could generate all parts of the system. This, of course, requires better world models than AI systems currently have, on which the assistants can base the breakdowns. Voting assistants might also base their recommendations on educational aspirations, income, freedoms one wishes to have, stress levels, emotional response to nature in the neighborhood, the personal priority of health, concern for future generations, and much more.
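One minimal sketch of this decomposition idea: simple questions give the assistant a voter's attribute weights, and a separate model (human-made at first, machine-generated later) scores complex policy options on those same attributes. All attribute names and numbers below are invented for illustration, not a real elicitation scheme.

```python
# Step 1 (from simple questions): how much the voter cares about each
# attribute, on a hypothetical 0..1 importance scale.
voter_weights = {"equality": 0.8, "growth": 0.4, "liberty": 0.6}

# Step 2 (from a policy model): predicted effect of each policy on the
# same attributes. These numbers stand in for the "predict how different
# welfare state models affect the division of goods" step.
policies = {
    "Welfare model X": {"equality": 0.9, "growth": 0.3, "liberty": 0.5},
    "Welfare model Y": {"equality": 0.4, "growth": 0.9, "liberty": 0.7},
}

def recommend(weights, policies):
    """Suggest the policy with the highest weighted attribute score."""
    def score(attrs):
        return sum(weights[a] * v for a, v in attrs.items())
    return max(policies, key=lambda p: score(policies[p]))

suggested_vote = recommend(voter_weights, policies)
```

The point of the decomposition is that the voter never has to evaluate "Welfare model X" directly; they only answer the simple attribute questions.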

This means that more directness and more power allocation to machines would probably go hand in hand. The diagram below illustrates the space of developments.

Dimensions of the design space of technology-aided voting systems: current voting advice assistants are used in representative democracies (low directness). Increasing directness without transferring more decision power to machines leads into the area above the diagonal, in which humans have to answer many detailed questions. Combinations of directness and machine power on the diagonal seem most desirable.

While increased use of voting assistants seems to be a natural continuation of current developments, a shift towards direct democracy cannot yet be seen. Over time, however, we can expect more popular support for direct democracy if the major drawback of voters not being well-informed about decisions is addressed by voting assistants.

How likely is such a development?

Arguments for why such a development seems likely have been mentioned in the text above. We briefly summarize them:

1. It would be in line with technological developments in other areas of life that transfer decision-making authority from humans to computer programs.
2. Current VAAs are a step in a similar direction. They could be developed further into more impactful voting assistants, and some trends in this direction can already be observed.

Arguments for why such a development might not be likely:

1. Other scenarios might be more stable: problems faced by democracy in the future, such as super-persuaders, are not eliminated by voting assistants. This could create pressure to take humans completely out of the loop. Autocracies might profit more from technological progress than democracies, making stable autocracy scenarios more likely and a globally important role for voting assistants less likely.
2. Insurmountable problems with fairness and AI control: if we cannot solve current problems surrounding the fairness of algorithms and AI control, letting AI influence our voting systems might not be wise. This could cause humans to take measures against voting assistant use, which would make such a development less likely.

Apart from seeming likely, voting assistants are also interesting because of their desirable properties. Given that strong forces are pulling us towards technological democracy trajectories, voting assistants could be a realization of these trajectories that we should actively pursue. This cannot be said without reservations, as many issues remain to be resolved, three of which we will now elaborate on.

Autonomy, Transparency, Fairness

Many of the concerns we currently have about other AI systems also apply to voting assistants.

Autonomy:

A transfer of decision-making from humans to machines can result in a decrease in human autonomy. This decrease can be justified by machines making better decisions in the interest of humans. Moreover, offloading less important decisions to machines would free up human capacity for more important issues. Autonomy can also be preserved by mechanisms such as checking every decision or giving humans veto rights.

One aspect of autonomy reduction is particularly impactful in the case of voting assistants: if voting systems change in response to voting assistants being used, it might at some point become infeasible to participate in the voting process without using an assistant.

Transparency:

Transparency is vital to democratic processes. Current machine learning techniques are known for often being opaque, but human decision makers’ reasoning is not always transparent either. With advances in AI transparency, we might hope to understand algorithmic decision-making processes better than those taking place inside human minds.

Fairness:

We differentiate between three different notions of fairness:

I. Algorithmic Fairness
Algorithmic fairness means that the algorithm underlying the voting assistant does not discriminate against political parties or policies based on factors that should not be considered relevant for the decision. For example, policies that negatively affect areas in which people tend to provide less data to voting assistants should not be favored. This problem is connected to provider fairness in recommender systems, for example in online shops or on media platforms. Algorithmic fairness is already very relevant in today's voting assistants, where the structure of the questionnaire and its evaluation can affect a party's chances.

Voting assistants would also be unfair if they learned the preferences of some voters better than others (for example, due to underrepresentation in training data): this makes people’s preferences count differently, as some suggested votes reflect the voters' true preferences more precisely than the votes suggested to the disadvantaged users.

Since perfect fairness likely cannot be achieved, the most important fairness criteria for voting assistants will have to be agreed upon by the populations of the countries using the assistants, with the guidance of experts. This necessitates a discussion of which factors should be allowed to influence decisions, a question currently hidden by the opaqueness of the human decision-making process.

II. Accessibility Fairness
In the extreme case of performance unfairness, the voting assistant is not accessible to some voters because of the costs of usage, including access to electronic devices and an internet connection. This could be prevented to some degree by providing public voting assistant access points. However, such points cannot support the regular gathering of data as well as personal devices can, again resulting in lower performance.

III. System Fairness
System fairness relates to how fair the democratic system truly is in practice. This also involves how people are questioned and whether that leads to biases or can be used to de-bias the voting process. Humans have many unconscious biases that also affect voting decisions (for example, candidates' appearance influences the results [11]). De-biasing would mean analyzing the collected data for decisions that are not fully accounted for by aspects the user has chosen to incorporate but instead have an unconscious background. Here it becomes especially evident how unclear the definition of fairness is. For example: would de-biasing also include affirmative action? And if yes, how much of it? Should de-biasing always take place, or does protecting human autonomy mean allowing people not to let their biases be erased?

Whether and how these issues are resolved will be shaped by the environment in which they are developed. We conclude by discussing what this environment should look like in the next section.

How can we make sure voting assistant development is done well?

Set high standards: Voting assistants could potentially be great for our democracies, but only if they fulfill certain conditions. The European Consortium for Political Research writes in the Lausanne Declaration that VAAs should be "open, transparent, impartial and methodologically sound" [10]. They call for the funding of developers to be made transparent and for algorithms to be documented. To hold developers, who currently include entrepreneurs, universities, NGOs, and government-affiliated organizations [8], accountable, much more detailed criteria would have to be agreed on and publicly enforced. The German Wahl-O-Mat has already been held legally accountable for not guaranteeing equal opportunity to all parties, and it should be made clear how this would also be possible for other VAA providers not associated with the government.

Hidalgo suggests a marketplace model for developers [4]. Marketplaces would have to be built in such a way that access fairness is fulfilled and developers are incentivized to build transparent models. Approaches for avoiding race-to-the-bottom dynamics that are being discussed in the context of AI safety could be applied here.

Some other properties we might want advanced technological voting systems to fulfill:

• Awareness: Whenever a voter makes decisions or takes actions that influence the voting outcome, they should be aware of that.
• Rationality: A voter should have a reasonable ability to foresee what her decision leads to (e.g. saying “I love cats” should not favor cat-banning laws). Even if a voting system does not rule out strategic voting (as shown in some impossibility theorems), it could fulfill rationality: no strategic voting would mean that there is no better way to promote one’s view than expressing it truthfully, while rationality means that expressing it truthfully is an effective way to promote one’s view.
• Universal access: Every voter gets the same chance to express their views. The voices of those who provide more data or are more eager to express their views are not amplified without good reason.

Find common ground: When it comes to issues such as de-biasing, it is important that we collectively agree on what purposes we want our voting systems to fulfill. This could happen through public discussion that can inform constitutional specialists. We will want to build a process that is stable enough for governance but can nevertheless be adapted in the future.

Conclusion

Voting assistants are only a small part of a possible future democracy trajectory shaped by technology but their further development and widespread adoption would be a natural continuation of tendencies we observe at the moment. We should therefore keep them in mind when researching humanity’s longer-term future. We now have the opportunity to create the right starting positions for them to be implemented in safe and fair ways by doing research, especially on the structure of human preferences and collective decision-making.

Acknowledgments

This blog post was written during a summer research fellowship at FHI. I thank everyone involved for this opportunity and especially the organizers Rose Hadshar and Eliana Lorch. I also thank my mentor Ondrej Bajgar for all his input on the topic and post, as well as his outstanding support during the fellowship.

References

1  Bundeszentrale für politische Bildung, Wahl-O-Mat European election 2019, https://www.wahl-o-mat.de/europawahl2019/, last accessed: 24.09.2020

2 Garzia, D. and Marschall, S. (2012) ‘Voting Advice Applications under review: the state of research’, Int. J. Electronic Governance, Vol. 5, Nos. 3/4, pp.203–222.

3 Nick Bostrom, Allan Dafoe, and Carrick Flynn. Public Policy and Superintelligent AI: A Vector Field Approach (2018), version 4.3

4 Hidalgo, C. https://www.peopledemocracy.com, last accessed: 24.09.2020

5 https://ourworldindata.org/grapher/world-pop-by-political-regime?stackMode=relative&time=1980, last accessed: 24.09.2020

6 Guillermo Romero Moreno, Javier Padilla & Enrique Chueca (2020) Learning VAA: A new method for matching users to parties in voting advice applications, Journal of Elections, Public Opinion and Parties, DOI: 10.1080/17457289.2020.1760282

7 I. Katakis, N. Tsapatsoulis, F. Mendez, V. Triga, and C. Djouvas, "Social Voting Advice Applications—Definitions, Challenges, Datasets and Evaluation," in IEEE Transactions on Cybernetics, vol. 44, no. 7, pp. 1039-1052, July 2014, doi: 10.1109/TCYB.2013.2279019.

8 D.Garcia et al., "Indirect Campaigning: Past, Present and Future of Voting Advice Applications" from The Internet and Democracy in Global Perspective: Voters, Candidates, Parties, and Social Movements, edited by Bernard Grofman, et al., Springer International Publishing AG, 2014. pp.25-38

9 Awad, E., Dsouza, S., Kim, R. et al. The Moral Machine experiment. Nature 563, 59–64 (2018). https://doi.org/10.1038/s41586-018-0637-6

10 Garzia, Diego, and Stefan Marschall (2014). ‘The Lausanne Declaration on Voting Advice Applications’, in: Diego Garzia and Stefan Marschall (eds.), Matching voters with parties and candidates. Voting advice applications in comparative perspective. Colchester: ECPR Press, S. 227–228.

11 Ahler, D.J., Citrin, J., Dougal, M.C. et al. Face Value? Experimental Evidence that Candidate Appearance Influences Electoral Choice. Polit Behav 39, 77–102 (2017). https://doi.org/10.1007/s11109-016-9348-6

12 Ladner, Fivaz, More than toys? A first assessment of voting advice applications in Switzerland. (2010)

14 Robin Burke. Multisided fairness for recommendation, 2017

Discuss

### What do we know about how much protection COVID vaccines provide against transmitting the virus to others?

LessWrong.com News - 12 hours 21 minutes ago
Published on May 6, 2021 7:39 AM GMT

Discuss

### What do we know about how much protection COVID vaccines provide against long COVID?

LessWrong.com News - 12 hours 22 minutes ago
Published on May 6, 2021 7:39 AM GMT

Discuss

LessWrong.com News - 17 hours 11 minutes ago
Published on May 6, 2021 2:50 AM GMT

Disclosure: I work on ads at Google; this is a personal post.

In the discussion of why I work on ads people asked whether I use an ad blocker (no) and what I think of them (it's complicated). So, what about ad blockers?

It should be up to you what you see. If you don't want your computer displaying ads, or any other sort of content, you shouldn't have to. At the same time, most sites are offering a trade: you're welcome to our content if you also view our ads.

These are in conflict, but I feel like the resolution could be simple:

1. You are free to block any ads you want.
2. Sites can know when ads are blocked.

Sites could choose to respond to ad blocking by showing a message explaining that ads are what fund the site and requiring users to either subscribe or allow ads if they want to proceed. Or not: the marginal cost of serving a page is trivial and perhaps some visitors will share articles they enjoy. Still others might implement something like the first-n-free approach you see with paywalls, or progressively more obnoxious nagging.

This isn't what we have today:

• Some sites (ex: Facebook) try to disguise their ads to get them past blockers. A big site that runs their own ads might scramble the names of resources on every page view, while a smaller site might hire an ad-tech company to proxy their site and stitch in ads. When successful, users are seeing content they specifically said they didn't want.

• Some blockers (ex: uBlock Origin but not AdBlock) hide "please disable your ad blocker or subscribe" messages. For example, 37% of uBlock Origin issues are people pointing out anti-adblock banners it misses (ex: #9005, #9006, #9007). When successful, sites are serving content to users they specifically said they didn't want to serve.

I don't have any sort of proposal here; I'm not proposing a browser feature or government regulation. But in thinking about how future decisions might affect ads, I'm going to be most excited about ones that support (1) and (2).

Discuss

### Let's Go Back To Normal

LessWrong.com News - 22 hours 21 minutes ago
Published on May 5, 2021 9:40 PM GMT

In a few months, the US should stop being careful around COVID.

We’re in a new era of COVID. We have some amazingly effective vaccines that highly reduce both transmission and severity, so the emergency is effectively over for us (after another month or two for everyone to get a chance to take the vaccine). Cases will linger, but the point of the lockdowns and precautions of March 2020 was to buy time—to try to get a vaccine, and determine if long-term side effects were crippling—and we’ve done those things.

Now there is debate about whether we should go back to normal. I’ve been dismayed to see headlines like “Planning on opening up? Not so fast…” Even my acquaintances sometimes make comments about whether it’s safe for me to visit a friend (I’m vaccinated). I called up all my friends in March 2020 to tell them to lock down before the government did—but the people now arguing that we should continue being careful are, in my estimation, missing the bigger picture.

Whether or not we take more precautionary measures, the people still susceptible to COVID are going to get it *eventually*. (Both those that didn’t get the vaccine and those that didn’t have a strong enough immune response.) We aren’t going to magically save them by continuing to be careful. We’ll eventually hit herd immunity one way or another—the only question is how much we flatten the curve. We have to flatten it enough to maintain hospital supplies, as before, but I don’t think we should do more.

Now, there are some benefits to being careful. It won’t be *you* who gave someone COVID. If people get long COVID or die from it, it will be months later than it otherwise would have happened. But the benefits are very small in relative terms if most of the susceptible people will get it anyways, and absurdly small if the careful person got vaccinated.

The benefits are small in relative terms because the cost of continued precautions is extremely high.

The costs

The standard way to measure the costs is to look at how much people would trade to avoid them. People I’ve talked to said they’d pay on average about $30k to avoid getting COVID. I don’t actually know what they’d pay to avoid a year of lockdown, but my guess is something similar. Even for people who can’t afford paying much money, I still think they’d probably trade about 1:1 between getting COVID and spending a year in lockdown. Note that, aside from considerations about hospital supply shortages, this means over the last year we’ve already ended up making a questionable trade.

Probably we won’t have many more lockdowns. But we will likely continue to have mask mandates, and certainly peer pressure to publicly social distance and wear masks. My guess is that people would pay about a 10% chance of getting COVID for a year of this (assuming you can still hang out with friends privately unmasked). For it to be worth social distancing and wearing masks, this would have to be saving us about a 1% chance of COVID per month. (Current rates are claimed at about .4%/month, so plausibly about 1%.) But if you’re going to get COVID at some point, it doesn’t matter too much if it’s now or a year from now! I think people would pay a lot less in mask-wearing to push it back a year. I certainly would pay little. Half the country isn’t wearing masks now, and clearly would pay almost nothing. It already seems like a bad deal to keep wearing masks after everyone’s had a chance to get vaccinated.

But none of this accounts for the fact that much of the badness doesn’t seem to be captured by trade-willingness: it’s either externalities or things we don’t notice at the time.

First, I’ve seen a lot of bad stuff happen to people this year that doesn’t seem like it would have happened if not for the precautions taken against COVID. Fights, mental breakdowns, lack of inspiration, reduced vitality with downstream effects on everything.
I don’t want to exaggerate this (a ton of bad things happen to everyone in an average year, making this pretty hard to measure), but I think there was a lot of real damage done in retrospect. And aside from counterfactuals, there's something real about how we didn’t even get a chance to have the non-COVID-related disasters; we just got the useless COVID ones instead.

Second, I think there were some very bad and subtle externalities. People sometimes talk about the social fabric, which is a thing I don’t understand that well, but I do think an important thing tethering it is the expectation of continued social interaction. Without this, it seemed like a lot of things the social fabric supports (i.e. lots of society and our habits) took a hit. I’d be unsurprised if we look back and find that 2020 was a serious contributor to a number of bad future events involving political turmoil and cultural disintegration. I think this damage is ongoing as we fail to escape back to normal societal dynamics. Even mask-wearing everywhere seems like a bad norm for social fabric. Certainly the fear of other people is. I want my friends to stop being overly concerned about COVID because of this. I know it’s hard to get out of the mindset of the past year, but when this whole pandemic started we had to quickly go from Default Safe, Must Prove Unsafe -> Default Unsafe, Must Prove Safe; now we must make the leap in reverse.

Last, I think legal mandates for continued precautions are especially bad compared to recommendations. Our legal system explicitly works on precedent, but our policy system and social acceptability system implicitly do too. Again, I think the initial lockdowns were entirely worth it, like Cincinnatus in Rome, but your dictator is supposed to clearly step down after a year, not say that he’s staying just a little bit longer until more defenses are built, or that he’ll step down as dictator but through the revolving door as a lobbyist.
This fear about continued government overreach is necessarily vague, but there are some clear examples.

For one, the travel bans seem especially harsh: in the US, with high rates of COVID, there is less likelihood of getting it in another country than there is if you stay here! It makes more sense to have border closures *in the other direction*. But even those are a bit crazy: if you calculate it out, it’s strongly net-negative from a trade perspective, and even net-negative just from the host country’s perspective in most cases! [1] (I advocated for this in April while working for Epidemic Forecasting as well, so this isn’t even hindsight nitpicking.) Anyways, I think travel bans became more likely to recur in dumb situations in the future because of the precedent set here.

But the real example is totalitarian usage. Dictators are well-known to take advantage of crises to increase state control. Luckily Trump neither ended up on the pro-precautionary side politically nor was dictatorial enough to leverage the lockdowns or other mandates into increased personal/state control, but it’s not too hard to think of how this could have gone, or how it could go in a future pandemic: lock down the industries and regions that don’t politically support you, nationalize some critical infrastructure to “keep it functioning safely”, arrest political opponents who violate your selectively enforceable mandates in the name of national security, claim military transgression from countries you want conflict with who had any role in increasing the pandemic’s scope...

I definitely think these concerns about government overreach are overblown by many, and you can fill in various arguments about whether or not there's much counterfactual impact from the current measures. But I do think the concerns are about something correctly identified as real.
And when there’s a real concern that has already mobilized half the country, and there’s evidence the pendulum is already on the wrong side, that’s a very good opportunity to help it be taken seriously. A great start would be to pass resolutions to end the State of Emergency, and to not legally limit people’s interactions outside of an Emergency. It’s not like we shouldn’t make good laws, but we’re not correctly avoiding the unnecessary ones.

Because the saddest thing about the legal requirements is that they don’t do very much. Sure, at the beginning we got a lot from mandated countermeasures, as people needed to be mobilized and taught new norms if we were going to hold off the virus. But a few months in, the control system was well-established: people act more cautious when rates go up. There might be a paper somewhere proving I’m wrong, but I don’t think that the 2nd/3rd/4th waves were much affected by legal requirements compared to private actions in the control system and even government recommendations. The legal mandate just pissed a lot of people off and caused them to get a lot better at sneaking their mask under their chin. (I think a prominent deviation from this is schools and businesses: it’s a lot more reasonable to fight a principal-agent problem and make these institutions not coerce their subjects than to coerce your own subjects.)
Summary

Here is my list of precautions, and whether I think they should be kept in the playbook as mandates or recommendations:

• Lockdowns / stay-at-home order: Shouldn’t be a legal mandate or recommended
• Curfews: Shouldn’t be a legal mandate or recommended
• Gatherings limited to X people: Shouldn’t be a legal mandate, could be recommended
• Border closures: Shouldn’t be a legal mandate or recommended, except in extreme circumstances
• Masks: Shouldn’t be a legal mandate, fair to recommend to the unvaccinated
• Businesses suspended: Could be a legal mandate
• Schools closed: Could be a legal mandate
• Testing: Seems fair for corporations to ask people to do this in many cases, not especially abusable
• Vaccine passports: Probably fair, seems fantastic for public health and, while highly abusable, not at all an abuse in this actual case
• Visiting sick family members: I think banning this has been adequately covered as totally inhumane, but it’s worth noting that these cases were hospital policies and not laws, to my knowledge
• State of Emergency: Should be reserved for actual emergencies, which we are no longer in

Again, there are some limited benefits to continued caution. But we have to go back to normal at some point, and I think outside of strained arguments it seems clear that time is [about a month or two from now]. We may have to artificially flatten the curve, but we may not. I think we should be very wary of using mandates to do this instead of recommendations. For the time being, if you’re vaccinated, I think you should abandon caution; finally the argument “it’s like the flu” can actually apply.[2] The only reasons not to are if you’re in a risky subgroup, have some better idea than I about what crux you might learn by waiting, and haven’t been very negatively affected by lockdowns, or if you’re directly interacting with someone in the most at-risk subgroups who you’d try hard not to give the flu to.
[1] Fermi of travel risk: say you take a one-week vacation from a country that had a .3% infection rate at a given time. Your chance of giving it to someone else would be about .3%/4 (since infections last about 4 weeks; and since the actions of the people in the other country are probably about the same as in your country, since R is similar, though this is somewhat complicated to check). So even if you’re 4x more risky than others in your country, your weekly rate of transmitting will end up being about .3%. If you’re willing to pay $50k to not get COVID, you’d pay $150 to avoid this risk of COVID, which is substantially less than what most international travel costs in the first place and would be willingly paid given the consumer surplus, so the trade is positive-sum. But even from the perspective of the host nation, substantially more than $150 is gained per capita from tourism; and it happens that tourism is a high-gross-margin industry (40%?), so just adding up all the producer surpluses is probably at least 30% of the total, meaning that if you as a traveler paid more than $500 for things in the country, they would still be making a beneficial trade despite paying for the risk. Only in cases where the host country has COVID almost fully contained and a single case would cause them huge costs, which we saw in only the merest handful of countries this year, would it make sense to close borders.

[2] Fermi of precautionary benefit over the next few months: vaccine effectiveness in the US is about 95% on average, 99% in the young, 90% in the elderly. Unfortunately this was measured during lots of precautions; it’s possible that with fewer precautions, it would be somewhat less effective due to higher initial viral loads.
However, most of the variance seems (speculation) explained more by a qualitative threshold of immune effect and less by a quantitative difference in numbers of antibodies; thus, I’ll assume in typical circumstances the risk is only a factor of 2 higher, meaning 90% average, 98% in the young, 80% in the elderly. When transmission does occur to a vaccinated person, the virus is likely to have a much lower peak viral load, meaning much of the damage from death or hospitalization or long-term effects is blocked, which I’ll estimate vaguely at 90% but could be much more or a bit less (and risk probably varies by a factor of a few with age). There is a rumor that vaccinated people can somehow still transmit the virus without being detectably infected themselves, but this seems hard to fathom when thinking through the mechanics, so I will ignore it (if it does happen, it’s likely very rare and would result in a very low viral load anyways). So for a youngish person who gets vaccinated, I estimate that the risk of any activity drops about 99.9% (1-.02*.05); for an old person, about 96% (1-.2*.2). Notably, this huge difference means there is a lot of moral hazard if we make everyone take precautions. The benefit of vaccinated people following the precautions will be something like 100:1 less than for the unvaccinated, and total benefit will also accrue 100:1 in favor of an unvaccinated person compared to a vaccinated one. If the population is about half vaccinated, the per-person ratios also apply to the population. I am all for helping people who won’t help themselves, but definitely at a lower cost to myself than if the person does help themselves. Second, this means that you really have to have massive behavioral differences for vaccinated people to be much of a risk. If you are vaccinated, socially distanced at a store, and deciding whether to wear a mask, you change your cost to people from about 0.01 uCOVIDs to 0.02 uCOVIDs, which for healthy people is equivalent to something like $0.0001 to $0.0002.
You have to be doing things like hanging out indoors for hours and laughing before it even registers as a similar amount of risk to what we were willing to take before.
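The arithmetic above can be written out explicitly (a minimal sketch using the post's own rough numbers; the function and variable names are mine):

```python
def residual_risk(residual_infection: float, residual_damage: float) -> float:
    """Fraction of the unvaccinated person's risk that remains after
    vaccination: chance a breakthrough infection still occurs, times the
    fraction of harm not blocked by the lower peak viral load."""
    return residual_infection * residual_damage

# Youngish person: 98% protection against infection, ~95% of damage blocked
young = residual_risk(0.02, 0.05)  # 0.001 -> risk drops ~99.9%
# Elderly person: 80% protection against infection, ~80% of damage blocked
old = residual_risk(0.2, 0.2)      # 0.04  -> risk drops ~96%

print(f"young: {1 - young:.1%} reduction")  # 99.9% reduction
print(f"old:   {1 - old:.0%} reduction")    # 96% reduction
```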

Discuss

### What are your favorite examples of adults in and around this community publicly changing their minds?

LessWrong.com News - May 5, 2021 - 22:26
Published on May 5, 2021 7:26 PM GMT

Search turns up a few threads on self-reported personal changes of mind. I'm curious here about the third-person perspective: When have you noticed and remembered peers or colleagues changing their minds? I'm particularly interested in examples from the group of people who seem to regard a book on how to change your mind as a cultural touchstone, but stories about those whom such people work with and look up to seem likely to also be relevant.

Discuss

### [AN #149]: The newsletter's editorial policy

LessWrong.com News - May 5, 2021 - 20:10
Published on May 5, 2021 5:10 PM GMT

Alignment Newsletter is a weekly publication with recent content relevant to AI alignment around the world. Find all Alignment Newsletter resources here. In particular, you can look through this spreadsheet of all summaries that have ever been in the newsletter.

Audio version here (may not be up yet).

Please note that while I work at DeepMind, this newsletter represents my personal views and not those of my employer.

HIGHLIGHTS

In the survey I ran about a month ago, a couple of people suggested that I should clarify my editorial policy, especially since it has drifted since the newsletter was created. Note that I don’t view what I’m writing here as a policy that I am committing to. This is more like a description of how I currently make editorial decisions in practice, and it may change in the future.

I generally try to only summarize “high quality” articles. Here, "high quality" means that the article presents some conceptually new thing not previously sent in the newsletter and there is decent evidence convincing me that this new thing is true / useful / worth considering. (Yes, novelty is one of my criteria. I could imagine sending e.g. a replication of some result if I wasn’t that confident of the original result, but I usually wouldn’t.)

Throughout the history of the newsletter, when deciding whether or not to summarize an article, I have also looked for some plausible pathway by which the new knowledge might be useful to an alignment researcher. Initially, there was a pretty small set of subfields that seemed particularly relevant (especially reward learning) and I tried to cover most high-quality work within those areas. (I cover progress in ML because it seems like a good model of ML / AGI development should be very useful for alignment research.)

However, over time as I learned more, I became more excited about a large variety of subfields. There’s basically no hope for me to keep up with all of the subfields, so now I rely a lot more on quick intuitive judgments about how exciting I expect a particular paper to be, and many high quality articles that are relevant to AI alignment never get summarized. I currently still try to cover almost every new high quality paper or post that directly talks about AI alignment (as opposed to just being relevant).

Highlights are different. The main question I ask myself when deciding whether or not to highlight an article is: “Does it seem useful for most technical alignment researchers to read this?” Note that this is very different from an evaluation of how impactful or high quality the article is: a paper that talks about all the tips and tricks you need to get learning from human feedback to work in practice could be very impactful and high quality, but probably still wouldn’t be highlighted because many technical researchers don’t work with systems that learn from human feedback, and so won’t read it. On the other hand, this editorial policy probably isn’t that impactful, but it seems particularly useful for my readers to read (so that you know what you are and aren’t getting with this newsletter).

A summary is where I say things that the authors would agree with. Usually, I strip out things that the authors said that I think are wrong. The exception is when the thing I believe is wrong is a central point of the article, in which case I will put it in the summary even though I don’t believe it. Typically I will then mention the disagreement in the opinion (though this doesn’t always happen, e.g. if I’ve mentioned the disagreement in previous newsletters, or if it would be very involved to explain why I disagree). I often give authors a chance to comment on the summaries + opinions, and usually authors are happy overall but might have some fairly specific nitpicks.

An opinion is where I say things that I believe that the authors may or may not believe.

TECHNICAL AI ALIGNMENT
PROBLEMS

Low-stakes alignment (Paul Christiano) (summarized by Rohin): We often split AI alignment into two parts: outer alignment, or "finding a good reward function", and inner alignment, or "robustly optimizing that reward function". However, these are not very precise terms, and they don't form clean subproblems. In particular, for outer alignment, how good does the reward function have to be? Does it need to incentivize good behavior in all possible situations? How do you handle the no free lunch theorem? Perhaps you only need to handle the inputs in the training set? But then what specifies the behavior of the agent on new inputs?

This post proposes an operationalization of outer alignment that admits a clean subproblem: low stakes alignment. Specifically, we are given as an assumption that we don't care much about any small number of decisions that the AI makes -- only a large number of decisions, in aggregate, can have a large impact on the world. This prevents things like quickly seizing control of resources before we have a chance to react. We do not expect this assumption to be true in practice: the point here is to solve an easy subproblem in the hopes that the solution is useful for solving the hard version of the problem.

The main power of this assumption is that we no longer have to worry about distributional shift. We can simply keep collecting new data online and training the model on the new data. Any decisions it makes in the interim period could be bad, but by the low-stakes assumption, they won't be catastrophic. Thus, the primary challenge is in obtaining a good reward function that incentivizes the right behavior after the model is trained. We might also worry about whether gradient descent will successfully find a model that optimizes the reward even on the training distribution -- after all, gradient descent has no guarantees for non-convex problems -- but it seems like, to the extent that gradient descent doesn't do this, it will probably affect aligned and unaligned models equally.
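The online loop described here can be sketched in a few lines (a toy illustration under my own assumptions -- `ToyModel` and the reward function are hypothetical stand-ins, not anything from the post):

```python
# Toy sketch of the low-stakes loop: keep collecting data online and
# retraining; the low-stakes assumption bounds the harm of interim mistakes.

class ToyModel:
    """Picks whichever action has the highest average observed reward."""
    def __init__(self):
        self.best_action = 0

    def act(self, obs):
        return self.best_action

    def train(self, data):
        totals, counts = {}, {}
        for obs, action, reward in data:
            totals[action] = totals.get(action, 0.0) + reward
            counts[action] = counts.get(action, 0) + 1
        self.best_action = max(totals, key=lambda a: totals[a] / counts[a])

def reward_fn(obs, action):
    return 1.0 if action == 1 else 0.0  # pretend action 1 is the "aligned" one

model, data = ToyModel(), []
for round_num in range(5):
    for i in range(10):
        obs = i
        # first round explores both actions; later rounds act greedily
        action = i % 2 if round_num == 0 else model.act(obs)
        data.append((obs, action, reward_fn(obs, action)))  # each step is low-stakes
    model.train(data)  # retrain on everything collected so far

print(model.best_action)  # 1 -> the loop converges on the rewarded behavior
```

Any single bad decision inside a round is cheap by assumption; the periodic retraining is what keeps the policy tracking the reward function.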

Note that this subproblem is still non-trivial, and existential catastrophes still seem possible if we fail to solve it. For example, one way that the low-stakes assumption could be made true would be if we had a lot of bureaucracy and safeguards that the AI system had to go through before making any big changes to the world. It still seems possible for the AI system to cause lots of trouble if none of the bureaucracy or safeguards can understand what the AI system is doing.

Rohin's opinion: I like the low-stakes assumption as a way of saying "let's ignore distributional shift for now". Probably the most salient alternative is something along the lines of "assume that the AI system is trying to optimize the true reward function". The main way that low-stakes alignment is cleaner is that it uses an assumption on the environment (an input to the problem) rather than an assumption on the AI system (an output of the problem). This seems to be a lot nicer because it is harder to "unfairly" exploit a not-too-strong assumption on an input rather than on an output. See this comment thread for more discussion.

LEARNING HUMAN INTENT

Transfer Reinforcement Learning across Homotopy Classes (Zhangjie Cao, Minae Kwon et al) (summarized by Rohin): Suppose a robot walks past a person and it chooses to pass them on the right side. Imagine that we want to make the robot instead pass on the left side, and our tool for doing this was to keep nudging the robot's trajectory until it did what we wanted. In this case, we're screwed: there is no way to "nudge" the trajectory from passing on the right to passing on the left, without going through a trajectory that crashes straight into the person.

The core claim of this paper is that the same sort of situation applies to finetuning for RL agents. Suppose we train an agent for one task where there is lots of data, and then we want to finetune it to another task. Let's assume that the new task is in a different homotopy class than the original task, which roughly means that you can't nudge the trajectory from the old task to the new task without going through a very low reward trajectory (in our example, crashing into the person). However, finetuning uses gradient descent, which nudges model parameters; and intuitively, a nudge to model parameters would likely correspond to a nudge to the trajectory as well. Since the new task is in a different homotopy class, this means that gradient descent would have to go through a region in which the trajectory gets very low reward. This is not the sort of thing gradient descent is likely to do, and so we should expect finetuning to fail in this case.

The authors recommend that in such cases, we first train in a simulated version of the task in which the large negative reward is removed, allowing the finetuning to "cross the gap". Once this has been done, we can then reintroduce the large negative reward through a curriculum -- either by gradually increasing the magnitude of the negative reward, or by gradually increasing the number of states that have large negative reward. They run several robotics experiments demonstrating that this approach leads to significantly faster finetuning than other methods.
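A minimal sketch of the magnitude curriculum (my own illustration of the idea described above; the linear schedule and all names are assumptions, not the authors' code):

```python
def curriculum_penalty(full_penalty: float, step: int, ramp_steps: int) -> float:
    """Magnitude curriculum: start with the large negative reward removed,
    then linearly scale it back in over `ramp_steps` training steps.
    (Hypothetical schedule; the paper also suggests instead growing the
    set of states that receive the penalty.)"""
    frac = min(1.0, step / ramp_steps)
    return full_penalty * frac

# Early in finetuning the collision penalty is mild, so gradient descent can
# cross the low-reward region between homotopy classes; later it is restored.
print(curriculum_penalty(-100.0, 0, 1000))     # -0.0   (penalty removed at the start)
print(curriculum_penalty(-100.0, 500, 1000))   # -50.0  (halfway ramped)
print(curriculum_penalty(-100.0, 2000, 1000))  # -100.0 (fully restored)
```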

Rohin's opinion: This seems like an interesting point to be thinking about. The part I'm most interested in is whether it is true that small changes in the neural net parameters must lead to small changes in the resulting trajectory. It seems plausible to me that this is true for small neural nets but ends up becoming less true as neural nets become larger and data becomes more diverse. In our running example, if the neural net was implementing some decision process that considered both left and right as options, and then "chose" to go right, then it seems plausible that a small change to the weights could cause it to choose to go left instead, allowing gradient descent to switch across trajectory homotopy classes with a small nudge to model parameters.

Learning What To Do by Simulating the Past (David Lindner et al) (summarized by Rohin): Since the state of the world has already been optimized for human preferences, it can be used to infer those preferences. For example, it isn’t a coincidence that vases tend to be intact and on tables. An agent with an understanding of physics can observe that humans haven’t yet broken a particular vase, and infer that they care about vases not being broken.

Previous work (AN #45) provides an algorithm, RLSP, that can perform this type of reasoning, but it is limited to small environments with known dynamics and features. In this paper (on which I am an author), we introduce a deep variant of the algorithm, called Deep RLSP, to move past these limitations. While RLSP assumes known features, Deep RLSP learns a feature function using self-supervised learning. While RLSP computes statistics for all possible past trajectories using dynamic programming, Deep RLSP learns an inverse dynamics model and inverse policy to simulate the most likely past trajectories, which serve as a good approximation for the necessary statistics.

We evaluate the resulting algorithm on a variety of Mujoco tasks, with promising results. For example, given a single state of a HalfCheetah balancing on one leg, Deep RLSP is able to learn a (noisy) policy that somewhat mimics this balancing behavior. (These results can be seen here.)

Thesis: Extracting and Using Preference Information from the State of the World

MISCELLANEOUS (ALIGNMENT)

OTHER PROGRESS IN AI
DEEP LEARNING

Scaling Scaling Laws with Board Games (Andrew L. Jones) (summarized by Rohin): While we've seen scaling laws (AN #87) for compute, data, and model size, we haven't yet seen scaling laws for the problem size. This paper studies this case using the board game Hex, in which difficulty can be increased by scaling up the size of the board. The author applies AlphaZero to a variety of different board sizes, model sizes, RL samples, etc., and finds that performance tends to be a logistic function of compute / samples used. The function can be characterized as follows:

1. Slope: In the linearly-increasing regime, you will need about 2× as much compute as your opponent to beat them 2/3 of the time.

2. Perfect play: The minimum compute needed for perfect play increases 7× for each increment in board size.

3. Takeoff: The minimum training compute needed to see any improvement over random play increases by 4× for each increment of board size.

These curves fit the data quite well. If the curves are fit to data from small board sizes and then used to predict results for large board sizes, their error is small.
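Point 1 admits a compact reading: if each doubling of compute gives 2:1 odds of winning, then odds are simply proportional to the compute ratio. A toy sketch (my own extrapolation of the stated rule of thumb, not the paper's fitted curve):

```python
def win_prob(my_compute: float, opp_compute: float) -> float:
    """Win probability if odds of winning are proportional to the compute
    ratio -- the functional form implied by "2x compute beats the opponent
    2/3 of the time" (an extrapolation, not the paper's exact fit)."""
    odds = my_compute / opp_compute
    return odds / (1.0 + odds)

print(win_prob(2, 1))  # ~0.667 -> 2x compute wins about 2/3 of the time
print(win_prob(1, 1))  # 0.5   -> equal compute, even odds
```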

Recall that AlphaZero uses MCTS to amplify the neural net policy. The depth of this MCTS determines how much compute is spent on each decision, both at training time and test time. The author finds that a 10x increase in training-time compute allows you to eliminate about 15x of test-time compute while maintaining similar performance.

NEWS

BERI Seeking New University Collaborators (Sawyer Bernath) (summarized by Rohin): BERI is seeking applications for new collaborators. They offer free services to university groups. If you’re a member of a research group, or an individual researcher, working on long-termist projects, you can apply here. Applications are due June 20th.

FEEDBACK

I'm always happy to hear feedback; you can send it to me, Rohin Shah, by replying to this email.

PODCAST

An audio podcast version of the Alignment Newsletter is available, recorded by Robert Miles.

Discuss

### Fractal Conversations vs Holistic Response

LessWrong.com News - May 5, 2021 - 18:04
Published on May 5, 2021 3:04 PM GMT

A kind of conversation which I very much enjoy is one where everyone has time to formulate long thoughtful responses, and most of the conversational points are remembered and addressed later.

For example, a long email thread between Alice and Bob, where Alice and Bob write their next emails as point-by-point replies. In caricature:

(Email 1, from Alice:)

(Email 2, from Bob:)

> How are you doing?

I am doing poorly. Lately I cannot stop thinking about Arcturus. This makes it difficult to focus on my work.

> I am well.

It continues to grow exponentially for the time being. Soon I will need to worry about the weight causing earthquakes in the local geology. I am considering moving it off-planet. What brought it to mind?

(Email 3, from Alice:)

> > How are you doing?
>
> I am doing poorly. Lately I cannot stop thinking about Arcturus.

That makes sense. To be honest I had forgotten! I know it's important to you. What are you thinking about, though? So far as I understand, there is nothing to be done.

> This makes it difficult to focus on my work.

Have you considered taking stimulants? Or would that make the obsessive thoughts worse?

> > I am well.

Oh, no. I actually got it amputated! It wasn't much use anyway, so that's literally taken a load off. Dead weight, as they say. But one adjusts. I've noticed I'm happier these days, though I am not sure why.

> It continues to grow exponentially for the time being. Soon I will need to worry about the weight causing earthquakes in the local geology.

I can see how that would be a problem. Things are delicate down there (at least when you get to these massive scales!)

> I am considering moving it off-planet.

Better do it soon, then, if it's exponential growth! You know, rocket equation and all that.

> What brought it to mind?

Ah, right. Well, I had an idea for an experiment...

And so on, with Alice and Bob continuing to expand sub-points like that, managing an increasing number of threads in parallel. (Not strictly increasing! Some will usually be dropped pretty fast as they reach a natural conclusion or don't feel relevant to the interesting parts of the discussion.)

If the conversation is a disagreement, you can see how this structure enables a thorough examination of all the possible argument paths. It's a little like an HCH tree, except in practice the parallel conversations influence each other heavily, because the related ideas in the different threads will bear heavily on one another.

It's a very detail-oriented conversational mode. Sometimes you get the feeling of being stuck in the mud, bogged down by all the different points you're trying to manage.

At times like that, what I usually do is take a step back and consider the big picture. What's really going on with this conversation? Are there any patterns which seem to come up again and again (even if they're very vague/abstract)? This feels like taking a break from trying to understand the explicit content of individual sentences and paragraphs, and instead, focusing on making a mental model of my conversation partner. What are the deep generators of the conversation we're having? Is there some broader picture my conversation partner is trying to convey?

It feels similar to blindsight guesses: I may feel I have no idea what bigger picture the other person is getting at, but I ask my brain for an answer anyway, hypothesizing that my brain must have some mental model of the conversation.

And then I try to articulate that picture (somewhat like Gendlin's focusing), ask where I'm right/wrong, and perhaps give my hypothetical response supposing I'm right.

Sometimes I'm surprised by how correct my guesses are, but I'm also often (very) off the mark. Either way, it usually opens up new threads which have a good chance of being more fruitful than the old ones.

Let's call this conversational move the "holistic response" -- it feels like blurring your eyes a little bit to ignore details and take in a larger shape, or stepping back from a painting to see the composition rather than the brushwork. Often it seems to grant the ability to skip several steps ahead in the arguments, or bridge inferential gaps which were previously blockers. It also helps snap one out of arguments-as-soldiers mentality and exercise empathy, trying on the other person's perspective.

This seems slightly related to double crux. Double crux was invented precisely to focus on the important points rather than going down endless rabbit holes of conversation, and it involves taking the other person's perspective and holding it next to your own.

Why wouldn't I always take the holistic approach?

Well, I've been in conversations with someone who seems to "always" (i.e., very often) take the holistic approach. It's definitely useful at times, but it often frustratingly drops the more detailed point-by-point analysis which could well go somewhere. Besides, without first taking the detailed approach, there isn't that much fuel for the holistic guesses.

Discuss

### If something seems unusually hard for you, see if you're missing a minor insight

LessWrong.com News - May 5, 2021 - 13:23
Published on May 5, 2021 10:23 AM GMT

The world is full of all kinds of minor insights about how to do something (either easily or at all): how to open stitched bags, how to mop floors, how to use search engines, the right way to look at a ball in order to catch it, what it means to 'ping' somebody, how the skill of playing sudoku shares useful elements with other things like proving math theorems, how to pass as 'normal', etc.

For anything that seems unusually hard for you, there's a chance that there's some simple insight that you happen to be missing that would make it easier. You could put a lot of effort into trying to laboriously level up the skill, or you could see if you could acquire that insight by relatively little work.

I particularly like the post for having an extended list of examples of the thing, hopefully making it easier to notice potential applications of this principle when they come up in your own life. (The example that came to my own mind was finding it unexpectedly aversive to vacuum in my current home, when it had felt fine in the previous one, and then only eventually realizing that my former housemate's vacuum cleaner that I'd been using in my previous home was much more pleasant to use than my own.)

Discuss

### Bayeswatch 2: Puppy Muffins

LessWrong.com News - May 5, 2021 - 08:42
Published on May 5, 2021 5:42 AM GMT

A green humvee arrived at Jieyang Chaoshan International Airport. Vi got in the back with Molly Miriam, who handed her clipboard to Vi.

"健重制造公司. A no-name Chinese factory that makes barbells and similar equipment. It's not even fully-automated," read Vi.

"They are registered to use a low-intelligence AGI," said Miriam.

"What are we even doing here? Neither the product nor the AI poses a threat to civilization," said Vi.

"Something must have gone badly wrong," said Miriam.

The road to the factory was blockaded by the People's Liberation Army (PLA). The soldier at the checkpoint scanned the Bayeswatch agents' badges. A young officer—barely out of high school—escorted them inside the perimeter to Colonel Qiang.

"We could probably handle this on our own," said Colonel Qiang, "But protocol is protocol."

"So it is," said Miriam.

There were no lights on in the factory. No sound emanated from it. Fifty soldiers behind sandbags surrounded the factory, along with two armored personnel carriers and a spider tank.

"The police responded first. They sent a SWAT team in. Nobody came back. Then we were called. We would like to just burn the whole thing down. But this incident could be part of a wider threat. We cut power and Internet. Nothing has entered or left the building since our arrival. Rogue AIs can be unpredictable. We wanted your assessment of the situation before continuing," said Colonel Qiang.

"You did the right thing. This is probably an isolated incident. If so then the best solution is to rescue who we can and then level the building. Unfortunately, there is a chance this is not an isolated incident. Therefore our top priority is to recover the AI's hard drives for analysis," said Miriam.

"We will assault the building," said Colonel Qiang.

"You may have our sanction in writing. Assume humans are friendly and robots are hostile," said Miriam.

"Yes sir," said Colonel Qiang.

Miriam and Vi were quartered in a nearby building that had been commandeered by the PLA. They watched the assault from a video monitor.

"In training they taught us to never go full cyber against an AI threat," said Vi.

"That is correct," said Miriam.

"Which is why every assault force is no less than ten percent biological," said Vi.

Miriam nodded.

"Standard operating procedure is you go ninety percent robotic to minimize loss of life," said Vi.

"Ninety percent robotic does tend to minimize loss of life without the failure modes you get from going full cyber," said Miriam.

"It looks to me like they're going one hundred percent biological while their battle droids stay outside. Are we facing that dangerous of a hacking threat?" said Vi.

"No. They are just minimizing loss of capital," said Miriam.

The video feed of the factory was replaced by Colonel Qiang's face. "We have a survivor," he said.

Two privates guarded the door to their freshly-liberated prisoner. His dress shirt was stained with blood and grease. An empty styrofoam take-out food tray lay in the corner of the table with a pair of disposable chopsticks and an empty paper cup. Miriam and Vi took seats opposite him.

"I understand you helped program the improperly registered AI at 健重制造公司," said Vi.

"I didn't know it was improperly registered," said Paul while looking straight at the security camera.

"We're not here to find out what laws were or weren't broken. We just want to know why there is a company of infantry surrounding this factory," said Miriam.

"There wasn't much to it. The mainframe running the assembly line barely qualifies as an AGI. We could never afford that much compute," said Paul.

"How does it work?" said Miriam.

"Labor is affordable here by international standards. Our factory is mostly human-run. Androids are expensive. We only have a couple of them. We should have been able to overpower robots if they were all that had gone rogue," said Paul.

"But that's not what happened," said Miriam.

"We didn't smell anything. People just started dying. We tried to help. More died. We tried to escape but the fire doors had been locked. I ran to my office, barricaded the door and breathed out the window," said Paul.

"Argon gas. It has all sorts of industrial applications," said Vi.

"Exactly," said Paul.

"And the same mainframe which controlled the robots also controlled the fire suppression system," said Vi.

Paul nodded.

"So why did it want to kill people?" said Vi.

"Maybe it was jealous," said Paul.

"Let's stick to the facts. Why use an AI at all if human labor is so cheap?" said Miriam.

"Human manual labor is cheap. New products are high margin but top designers are expensive. We had the AI do some manufacturing because embodiment helps with designing human-compatible products. But mostly we just used it for the generative model," said Paul.

Miriam flinched. "Thank you. That will be all," she said.

They were back in the monitor room.

"We don't need the hard drives. Do whatever you want," said Miriam to the image of Colonel Qiang.

The monitor went black.

"I lost count of how many OSHA regulations they violated," said Vi.

"OSHA has no jurisdiction here," said Miriam.

"Do you know what happened?" said Vi.

"When I was your age, I inspected a muffin factory. They followed all the regulations. It was even Three Laws Compliant. Very high tech. For its time," said Miriam.

Miriam lit a cigarette.

"They told the AI to make cute muffins. They fed /r/cute into it as training data."

Miriam took a deep breath from her cigarette.

"The AI bought puppies. The engineers thought it was cute. They thought maybe they had solved the alignment problem," said Miriam.

Miriam took another drag. She exhaled slowly.

"The engineers had told the AI to make blueberry muffins. Do an image search for 'puppy muffin' on your phone," said Miriam.

"They do look the same. Puppies do look like blueberry muffins," said Vi.

"Oh," said Vi.

"Come outside. You need to see this with your eyes," said Miriam.

The soldiers had retrieved several bodies. Doctors were autopsying them. The bodies' hands were missing. A few were missing half their forearms. One body had its neck and shoulders removed.

"They used a generative model on weightlifting equipment. They fed it pictures of people lifting weights. They annotated which regions of the images constituted a 'barbell'," said Miriam.

Vi almost puked.

"Tell me what happened," said Miriam.

"The generative model added disembodied hands to the barbell," said Vi.

Colonel Qiang ordered sappers to implode the facility.

Discuss

### What are the greatest near-future risks or dangers to you as an individual?

LessWrong.com News - May 5, 2021 - 07:02
Published on May 5, 2021 4:02 AM GMT

What factors do you expect have the highest likelihood of severely compromising your own quality and/or duration of life, within the next 1, 5, or 10 years? How do these risks change your behavior compared to how you expect you'd act if they were less relevant to you?

If those greatest personal risks are not things you categorize as existential risks to all of humanity, how do you divide your risk mitigation efforts between the personal-and-near-term and the global-and-long-term ones?

(I ask because I keep catching myself trying to generalize about the personal risk profiles of the kinds of people who concern themselves with x-risks, and I'd rather generalize from self-reports of humans in a community which attracts that demographic than from wild guesses inferred by reading a bunch of blogs)

Discuss

### Simulation theology: practical aspect.

LessWrong.com News - May 5, 2021 - 05:21
Published on May 5, 2021 2:20 AM GMT

In this post, I elaborate a little on the simulation argument by Nick Bostrom and discuss what it may lead to from a practical, observable perspective. The paper is excellent, but in case you haven't read it, I write everything in a self-contained manner, so you should be able to get the point without it.

Consider a highly advanced technological civilization (call it a parent civilization) that can create other civilizations (child civilizations). This can be done in multiple ways. Bostrom focuses only on computer simulation of other civilizations, which is most likely the easiest way to do it. It is also possible to terraform a planet and populate it with humans (let's call the agents in the parent civilization humans for simplicity), effectively setting it to the period just before written history. Finally, one may consider an intermediate solution, something like “The Matrix” movie, with all the people perceiving not actual reality but an artificial one. A parent civilization may itself have a parent. If a civilization does not have a parent, let's call it an orphan.

In principle, a parent civilization can create child civilizations similar to its own earlier stages of development. Then one may ask: how can a civilization tell whether it is a child or an orphan? For the case of computer simulation, one could argue that simulated beings have qualitatively different perception, or no perception at all. However, for the other two types of child civilization (another planet and the Matrix), the humans are the same as in the parent civilization and should have the same type of perception. Thus, in the earlier stages of history, it might not be possible to distinguish being a child from being an orphan, if the parent aims to conceal its existence.

However, with technological development, some progress on this question can be made. I am not talking about distinguishing “fake reality” from “real reality” – if the parent is much more technologically advanced, it should be able to make them indistinguishable for the child. I am rather talking about reaching the stage of becoming a parent oneself. Then one can create many children that are copies of one's own earlier stages of development and see what fraction reach the parent stage themselves (so the civilization exploring the question becomes a grandparent). If this fraction is not infinitesimal, and the resources of the grandparent are large enough, the number of children who become parents will be large. Then, by the Copernican principle, the probability that the grandparent civilization is someone's child is significantly higher than the probability that it is an orphan. You can find more detailed argumentation in the original paper.

Can a civilization infer anything about its parent? In some sense, yes. As we saw, a large number of the children will be copies of the parent civilization at its early stages – this is necessary for settling the question of whether a civilization is a child or an orphan. Of course, some will be different, but those need not be produced in large numbers. The process will in some sense resemble evolution, with children resembling parents except for mutations. Of course, some civilizations will not follow this pattern and will generate arbitrary children. However, pattern transmission (making children like the parent) is a stable attractor – once a civilization that prefers to make children like itself appears, that preference persists across generations. Thus, it is safe to assume that the parent civilization resembles the child – not guaranteed, but likely.

Now we can turn to ourselves. What is the likelihood that we will one day start to create child civilizations? Is it infinitesimal? I don't think so. It may be small – the Precipice is near – but I think there is a nonzero chance we will be able to avoid it. If so, then we are very likely to be the child of a civilization that is, in some sense, Future-We: we after avoiding the Precipice, we after the Long Reflection.

Must the parent civilization create a child and then never interact with it in order to study the simulation argument properly? Not necessarily. Interaction is allowed if the child civilization cannot observe it (or observes it but treats it as natural phenomena, ancient myths, etc.). That is, as long as the scientific community in the child civilization cannot say that this or that event was an interaction with another civilization, the parent is safe from discovery. Such interaction may increase or decrease the chance that the child civilization reaches the parenting phase itself – but since the parent civilization has no information about whether its own parents (if they exist) interact with it, it should certainly explore multiple options.

Future-We can also interact with us. If our morals don't change completely, Future-We will be biased toward benevolent interactions rather than negative ones – that is, they will try to improve our well-being and decrease suffering wherever it can be done without revealing themselves. It is easy to see that the best cover for them would be a religion: they can do a lot while all witnesses remain biased and untrustworthy, so scientists will simply move on without putting much weight on the reports.

And now, finally, the practical aspect that I promised. I want my well-being improved, so can Future-We help me with it? Why not? Maybe if I just ask (since Future-We would be God in the religious framing, one might say – pray), there would be some help? This will not prove that Future-We exist, because everyone around can say it is a placebo or a coincidence. However, a placebo only works as long as I believe in it, which is why I needed all of the theoretical part above. So I tried, and I do feel quite an improvement, which is for me the most important result of this theoretical construction.

“It is precisely the notion that Nature does not care about our algorithm, which frees us up to pursue the winning Way - without attachment to any particular ritual of cognition, apart from our belief that it wins.  Every rule is up for grabs, except the rule of winning.”

From here

Discuss

### Loud Voices as Accessible Anecdotes

LessWrong.com News - May 5, 2021 - 01:44
Published on May 4, 2021 9:13 PM GMT

I've been pondering something for a little while, which is that the standard take on political takes might be slightly off; I hear a lot about how people tend to listen to the (metaphorically) loudest voices in the room, drowning out the sensible discussion that is actually taking place.

And the thing is, I have some experience with some of these things.  Some of the takes that are attributed to "loud voices" I have encountered over the years are things that I heard, years before, in some of the places I'm supposed to go to for sensible discussion - which is to say, I encountered some of the more outrageous takes, that "everyone" agrees only a few crazy people believe, in the wild.  Not just in the wild, but teaching classes, and requiring their crazy be repeated back to them.  Absent somebody turning global attention to them - giving them the metaphorical loudspeaker - they were, and remain, quiet.

But they were certainly loud and relevant to me, because people in positions of authority, people who should know better, were using that authority to demand other people repeat their inane nonsense back to them.

The concept of social bubbles may enter into things here; we could describe "volume", as a metaphor, as being about voices which cut across bubbles; things you can hear through a bubble.  And this can kind of work, if you squint just right.  But maybe, when we say "loud", what we actually mean is "accessible".  Loud is just a kind of accessibility that seems to work really well for voices, but I think "accessible" may cut more to the heart of the matter, in illustrating why some voices are "louder" than others.

Representativeness

And maybe a lot of the rhetorical confusion, about the difference between loud voices, and representative voices, is a result of the fact that "loudness" is a direct result of bringing voices which are local to a global focus, as a more or less exact response to the rhetorical demand that people's problems be representative.  If you had said to my college-age self that a particular ideological bullshit wasn't a major problem in universities, I'd have shone every light I could on the bullshit I was putting up with in college; to the extent of my ability, I'd be making the voice loud.

Which is to say, the demands for evidence of a particular problem I had (which may or may not be phrased as claims that the problem isn't representative), are demands of accessibility to that problem, are demands for loudness.  The argument that "The loud voices" aren't representative is not actually a counterargument for the problem of loud voices, nor is accepting it a solution - it is in fact a central contributing cause to the problem, because it means I have to start broadcasting cases exemplifying the issue, in order to make the problem sufficiently accessible to be taken seriously as a problem.

This is, in part, a product of a society that demands that problems be "representative" in order to be seriously considered.  I can't complain about things that happen to me, individually; I have to find a way to frame them as a class action problem that requires a class action solution.  Everything has to be shoehorned into this framework, no matter how inappropriate.

This is particularly insidious because it is a feedback cycle: the more voices are amplified, the more important it becomes to determine whether a voice is actually representative rather than merely loud, and the more incentivized the people dealing with those voices are to amplify the voices of the people creating the problem in the first place.

So maybe I amplify the voice of the people creating my problem; publish their syllabus, perhaps, with a scathing criticism.

The problem, incidentally, was that an administrator somewhere had somehow managed to make an intro class mandatory, whose curriculum was a frankly absurd mash-up of Hegelian dialectics and a bunch of other things which I won't get into.

Now, for a counter-argument: "Representativeness is a specific claim about statistics; we can evaluate whether or not a position is representative with polls.  You're just talking about increasing the amount of anecdotal data, which isn't even evidence."

Yes!  Representativeness isn't just a claim about what the statistics say; as an argument, it is an implicit claim that that which isn't representative isn't important.  Good anecdotes don't argue that more people experience a problem, they argue that those who experience the problem have an actual problem.

Toxoplasma of Rage as Availability

I've noticed a trend among many of my friends: There's an endless supply of evidence of how bad people on the other side of the political spectrum are, and a great missing empty spot where intelligent ideas would go.  I've tried rectifying this from time to time, by introducing them to smarter people on that side.  But they aren't really interested in the smart takes that disagree with them.  The obvious answer is that sharing evidence against rival positions is loyalty-affirming, group-bonding behavior, but sharing evidence for rival positions is depleting loyalty - other people have written about this.

But I think it's important for understanding what makes voices "loud".  This dynamic, which is essentially half of the Toxoplasma of Rage, increases the broadcast strength of bad arguments, to the point where I know people who legitimately think there are no good arguments for that side.  Good arguments exist, they just aren't available, in the way that bad arguments are.  Toxoplasma of Rage describes the arguments which are mutually bad; that is, situations where both sides can find the same situation to be loyalty-affirming.

Everybody shares the good arguments that reinforce their social bubble, so these are available; likewise, they share the bad arguments that reinforce their social bubble, so these are available.

"Hang on, weren't you just arguing against 'Representativeness'?  This just seems like an elaborate complaint that your friends make non-representative complaints about other people."

Yes.

If I complained about colleges forcing students to take particularly bad classes in Hegelian dialectics as part of their introductory coursework, I'd be complaining about a non-representative problem; I very much doubt very many colleges do this.  If I made a sweeping complaint about the obsession of college administrators with Hegelian dialectics, that'd be pretty damned unfair.

Part of the problem, again, is the demand that problems be representative in order to be taken seriously; nobody complains about the junior senator of their state who happens to be in the political opposition, they complain about how the junior senator of their state's policies are emblematic of problems with their political opposition as a whole.  "Look, all these people are like this."

Part of the problem is that even when people do complain specifically about the junior senator of their state, it's taken as a complaint about the political opposition as a whole.

And I don't want to argue here that "representativeness" isn't important, but also, it isn't actually important, because the whole of both of these problems can be summed up as the belief that, if a point of view were representative of a group of people, it would represent a problem.

And this argument is longer than I have time for here, so I'm going to leave a rather unsatisfying-to-me placeholder: If somebody's belief is a problem, it is a problem without respect to whether or not that somebody is a member of a group in which that belief is representative.  The same is true of behaviors, which can also be loud.

"But it's useful to know who to vote for!"

Sure, but that's just a good reason not to trust information other people give you about it.  Also, from outside a bubble, given that the accessibility gap is breached only by bad arguments, this implies a strategy: write very bad arguments, the kind that will get repeated, which embed a hidden good argument inside them as a payload.  I will note only that this strategy is already in use (by both sides), and it has been surprisingly destructive to social discourse.  (Also, do you really want to impress the sort of people who are impressed by who can come up with the ugliest examples of the opposition?)

Salience

Supposing I broadcast the problem I had with my university, publish the syllabus; somebody else goes "Hey, yeah, I have that problem too", and proceeds to describe a situation which is superficially similar, which they say never bothered them before, but now that I've described the problem, they're suddenly aware of how much of a problem it is.

If queried, it turns out they're a philosophy major, and they're upset that they're being forced to read Hegel and learn dialectics, which they characterize as Marxist propaganda; prior to reading my complaint about my own class, they had no problem with learning this, but my framing of the problem has led them to conclude that being forced to read this particular material is in fact a problem.

In a certain sense, they're kind of complaining about the same thing; they're being forced to take a particular class.  But also, they're a philosophy major, and the subject they're being forced to study is philosophy; that's literally what they signed up for.  And by the same token, I'm being forced to learn something; isn't that what I signed up for?  But no, my complaint isn't about being forced to learn something, it's about the absurdity of the course.

But in trying to make my problem representative, I had to define the problem in a way which makes it representative.  So I start by arguing against mandatory courses.  I continue by arguing against useless coursework.  I continue further by talking about financial incentives, and other problems in the collegiate system, until my complaint sounds sufficiently representative of a broad problem with colleges, that other people can grapple with it and take it seriously.

I generalized my problem, in short.

The problem turns out to be that my generalization is leaky.  The arguments I posed turn out to work pretty well in arguing why philosophy majors shouldn't be forced to read Hegel.  The complaints I had resonated with people who resented situations which were similar; my complaints made the hypothesis that they should also be complaining about these things salient to them.  Other problems start getting tossed into this generalization, which becomes recognized as a Real Problem with Society.  The generalization expands; it becomes more salient, more problems are analyzed in its framework; the problem becomes bigger.  More extreme solutions start to be proposed.

Now keep in mind, the problem I actually had is that my time and money were being wasted on a course that was transparently stupid makework, yet the generalization I produced can conceivably grow until the generalized problem is the existence of college itself - which was never my problem, and I now enjoy a reasonable salary in a professional position because of college.  What even is this generalized problem now?  I want nothing to do with it.

But the thing is - the generalized problem might not actually be an error.

The generalized problem has become a social movement.  My little example of a problem made another problem more salient to somebody else, and then they made it available; that made another problem more salient to somebody else; and so on and so forth.  Maybe all those people actually have a point.

Or maybe not.  It doesn't even matter, the point is that it began when one "voice" was made inappropriately "loud", which is (kind of) what somebody might argue if I had published the syllabus as evidence in creating my arguments.  They could appropriately argue that it isn't representative; that what one small community college does, isn't saying anything about what colleges do in general.

This can really go either way; I know people who imagine they're confronted constantly by social problems, because social media has made those social problems unduly salient to them.  They know people exist who hate them, so they see people who hate them, even when they aren't there.  And then on the other hand, you do actually need to be able to coordinate in this way to identify some social problems.

Increasing the salience of problems is costly, something I think most people here are aware of, but it can be a necessary cost to actually get the problems resolved, which I think most people here are also aware of.

Conclusion

I guess the major take-away here is that "loudness" is neither good nor bad.  The phenomenon of "loud voices", likewise, can be a force for both social good and social ill, but mostly comes down to the accessibility of the voice.  I think loud voices should be contextualized as purposeful and potentially useful anecdotes, however, and statistical arguments about them largely miss the point.

I do see issues with the criticisms of "loud" voices, namely that they aren't representative, and take issue with the general question of representativeness as it pertains to groups of human beings - not because the question is never useful, because certainly if I were a politician looking to join a political party I would want to find the one which I am a more representative member of - but rather because the explicit formulation of the question (or an answer to it) is basically always formed with hostile intent.

That is, the question is fine, right up until somebody asks it.

Overall, I'm basically fine with some voices being louder than others, even if they seem disproportionately to have something crazy to say.  Really, it's more interesting that way anyways.

Discuss

### What do the reported levels of protection offered by various vaccines mean?

LessWrong.com News - May 5, 2021 - 01:06
Published on May 4, 2021 10:06 PM GMT

Pfizer is supposed to be "95% effective".  Does this mean:
1. There is a 95% reduction in your odds of getting COVID (as measured by the gold standard, which I believe is a serological test)?
2. There is a 95% reduction in your odds of testing positive for COVID
3. There is a 95% reduction in your odds of getting symptomatic COVID
or
4. Something else?
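For context on how such a headline number is produced: vaccine efficacy in these trials is the relative reduction in the rate of the trial's primary endpoint (for the Pfizer trial, lab-confirmed symptomatic COVID-19) in the vaccine arm versus the placebo arm. A minimal sketch, using the widely reported interim case counts from the Pfizer phase 3 trial (8 cases in the vaccine arm vs. 162 in the placebo arm, with arms of roughly equal size); treat the numbers as illustrative rather than authoritative:

```python
def vaccine_efficacy(cases_vaccine, cases_placebo,
                     n_vaccine, n_placebo):
    """Efficacy = 1 - relative risk, i.e. the relative reduction
    in the endpoint's attack rate in the vaccine arm."""
    attack_vaccine = cases_vaccine / n_vaccine
    attack_placebo = cases_placebo / n_placebo
    return 1.0 - attack_vaccine / attack_placebo

# Widely reported interim counts from the Pfizer phase 3 trial;
# arm sizes are approximate, and with equal arms they cancel out:
ve = vaccine_efficacy(8, 162, 21_720, 21_720)
print(f"{ve:.1%}")  # about 95%
```

Note that because the arms are nearly equal in size, the figure reduces to roughly 1 - 8/162; the definition measured is closest to option 3 (symptomatic, lab-confirmed disease), not a serology-based infection endpoint.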

Discuss

### Did they use serological testing for COVID vaccine trials?

Новости LessWrong.com - 5 мая, 2021 - 00:48
Published on May 4, 2021 9:48 PM GMT

Or did they correct somehow for false positive/negatives of the type of testing that they did use?

Discuss