LessWrong.com News

A community blog devoted to refining the art of rationality

Is the Reversal Test overrated?

Published on January 25, 2020 7:01 PM GMT

I see it treated as a reliable way to discover status quo bias. The classic example: "If you don't want to artificially decrease your lifespan by 50 years, why don't you want to increase it by 50 years?"

Except it's basically always rational to prefer the status quo (change is bad).

If someone asks me, "Would you like to increase your body temperature by 50 degrees?" I would say I wouldn't, but saying that I also wouldn't like to decrease my body temperature by 50 degrees wouldn't make me irrational.

Now this is a very obvious example, but I worry that when the community uses the test on someone not trained in the Less Wrong jargon/way of thinking, and they can't immediately explain why they prefer the status quo, their argument gets labeled as irrational.

Just keep in mind that it's only a small tool to explore someone's reasoning behind an opinion, not a way to disprove something.


SSC Zürich February Meetup

Published on January 25, 2020 5:21 PM GMT


Ten Causes of Mazedom

Published on January 25, 2020 1:40 PM GMT

We continue answering the questions we asked earlier. It was claimed last time that maze levels and the danger of mazes were lower in the past than they are now, and that overall maze levels have been rising, as measured both by maze levels within organizations across the board and by maze levels within overall society.

The world is better for people than it was back then. There are many things that have improved. This is not one of them.

I am confident this is the case and have pointed to ways to see it. I recognize that I have in no way proven this is the case. I don’t have a way to do that. Rather I am relying on your own observations and analysis. If you disagree, I hope the discussion that follows will still prove useful as a comparison to what you see as an alternate possible scenario where these dynamics are less toxic.

Now we ask what may have been different in the past, and whether we can duplicate those causes.

Why was it different? Can we duplicate those causes?

The sketched model suggests several causes.

One can model this as a two-level problem. Something happens (e.g. technological change) resulting in a change in circumstances (e.g. more real need for large organizations, cause one), which then causes higher overall maze levels. 

Since this question asks for potential action on a large scale, political proposals will need to be among the proposals discussed. Let us do our best to debate models and how the gears in them work, and stay away from actual politics. To that end, I am not claiming any of these policies are good ideas and should be implemented, or bad and should not be implemented. I am only claiming that they would have particular physical results in the world. If they are obviously good or bad ideas, I shouldn’t need to say so.

Cause 1: More Real Need For Large Organizations

Modern life has larger organizations with more levels of hierarchy. Corporations are bigger. Governments are bigger. Political organizations are bigger. Universities are bigger. They wrap that bigness in more levels of hierarchy than ever before, with increasing amounts of bureaucracy and administration.

Jobs are done that previously did not get done. Where the same job persists, that same job takes more people. 

There is little doubt that some of the problem is caused by the increased complexity of real needs. What is difficult is figuring out how much is necessary versus how much is unnecessary.

Do we see these bigger organizations mostly because technological and economic development requires it? Or do we mostly see bigger organizations because maze-supporting behaviors and rules have made it so?

How much of what goes on in a large organization is bullshit, and could be eliminated, shrinking its size?

How much of needing a large organization is due to regulation and subsidy? How much of this additional size and regulation is due to additional demands for safety, either real safety or the illusion of safety? 

How much is our unwillingness to let ‘too big to fail’ organizations fail, and the perception that size is necessary, good and/or prestigious?

To the extent that this is the problem, we are mostly stuck. We can still make different trade-offs, since we can give up some of the additional complexity and its benefits to reduce the large-organization costs paid, but there are no easy wins to be had here.

My low confidence belief is that this accounts for a substantial portion of the problem, but less than half. It would be foolish precision to give a more exact number.

Cause 2: Laws and Regulations Favor Large Organizations

There is genuine need for larger organizations with more levels of hierarchy. But once you create those organizations, they have a logic of their own that generates further need for them. Those within these organizations steer conditions in their direction. A lot of the bigger organizations are there because they are structurally advantaged, whether through regulatory capture or other barriers to smaller competition, rather than because being larger makes any economic sense.

While we were making life harder for the small, we made life easier for the large. We removed technological and legal costs and barriers to larger organizations of all kinds. The friction caused by those costs previously ensured that big organizations only existed when there was a big payoff. Now we can also get them because of mission creep, because it gives people making the decisions more power, or often because someone’s ego gets stroked by the idea.

We were so busy figuring out how we could go big, we didn’t stop to ask whether we should.

As opposed to doing new things, where we are now so busy asking whether we should we no longer stop to ask whether we could.

Cause 3: Less Disruption of Existing Organizations

Wars, natural disasters, economic collapses and other large negative shocks are the standard historical method of preventing large organizations from hanging around for too long.  

Fortunately, we’ve managed to mostly avoid major destructive wars for some time. Natural disasters are not the threat they once were. While we had an economic collapse of sorts in 2008, it was nothing like the Great Depression. The title of one of the definitive books on that crisis, and the one I read to better understand it, was literally Too Big to Fail.

Existing large organizations were protected by government intervention. When they were so damaged that they could not be defended, instead of letting them fail, they were forcibly merged into even larger organizations.

With our much larger productivity and surplus, our greater ability to do finance and run large deficits for long periods, and greater ability to absorb negative shocks, and our growing distaste for risk and loss of any kind, along with a long peace, and our judgment of leaders on extreme short term perception of outcomes, we have biased things greatly against disruption.

That then combines with the effects of regulatory capture and crony capitalism. The rise of mazes and maze behaviors is a vicious cycle. 

We’ve also seen a dearth of positive shocks. Technological breakthroughs, economic developments and other big positive events allow new entrants to gain advantage over existing entrenched interests whose cached actions are increasingly inefficient and no longer make sense.

It is unknown to what extent this dearth of positive shocks is due to inherent difficulty in pushing forward at the current technological frontier, to what extent it is a cost of choices we have made, and to what extent it was the intent of the choices that were made.

In many cases, here and now, progress and disruption that would challenge existing systems is being forcibly suppressed. Some of the mechanisms of this will be among this list of causes, again with a mix between inherent, unintentional and intentional. Incumbents of all kinds by default are fragile and would prefer the status quo not be disrupted. Being mazes only enhances these incentives. In other cases, we’ve run into some combination of increasing difficulty of underlying problems, and our increasing lack of ability to do things even when not inhibited. 

Cause 4: Increased Demand for Illusion of Safety and Security

Safety and security are important. They are not infinitely important. They are extremely poor choices for sacred values.  

Illusion of safety and illusion of security are increasingly seen as important sacred values. Occasionally so are actual safety and security. 

In important ways, we are less secure than in the past.

Due to cultural changes, atomization and government interventions, we cannot count on our family, our friends or our community to support us. Even when we can count on them to support us, what we are counting on is a different order of magnitude of support from far fewer people. Government safety nets, even at their best, are a poor replacement.

We are taught that fulfilling all the requirements to responsibly have children and a family requires many stars to simultaneously align. We are put on very tight deadlines. There is huge risk that these things will not happen for us at all. They are the things that many people value the most today, and which most people valued most in the past. One is not permitted to 'skip ahead'; one is often kept in educational limbo, and then in a post-education period where one is not expected to be able to seriously attempt a family, until the clock has almost run out, and then the scramble begins.

We also in important ways have longer memories and less ability to recover from mistakes or poor trajectories. The very idea of things going on one’s ‘permanent record’ puts one in continuous danger even as a young child. Path dependence goes up. We have this concept that one’s ‘life can be over’ as the result of quite minor misfortunes. Potentially this can be as small as an ill-advised tweet, or a poorly navigated social interaction.

Then we are expected to end this journey with millions of dollars in ‘life savings’ as a substitute for the old support systems. That money can still be expropriated at any time by the medical or elder care systems. 

Rates of depression, suicide, traumatic stress and other mental health issues have skyrocketed, likely largely as a result of all of these other dangers and pressures, and also because we are taught to find lesser and lesser issues unacceptable and crippling. Increasing numbers of people generally feel unprepared for problems that would not have much fazed them in the past, or in some cases would not have even qualified as problems. We can't even.

In important ways, we are more physically safe than in the past.

Our trauma care and treatment of infections has dramatically improved. The rates of physical trauma, street crime and infection have gone down and the consequences tend to be less severe. Murder rates are down. Cars are vastly safer than in the past. Planes are much less risky than the car ride to the airport. Whenever I hear about accidents from travel these days, it’s almost universally in fiction, or what would have in the past been a minor accident that now makes the news. Our food supply is vastly safer. Our toys and playgrounds are vastly safer.

Our physical products, physical spaces and lives are almost universally vastly safer.

Life really shouldn't feel physically scary the same way that it used to. The rising physical risks are suicide and drug abuse, which are a different kind of physical risk, and the consequences of too much safety, such as allergies.

In important ways, we are also more emotionally safe than in the past, as well. In other important ways, we are far more emotionally at risk.

We are safer in the sense that many unpleasant things that were considered normal parts of life are now considered unacceptable, and thus we guard against them. We still must deal with violence, with racism, with sexism, with abuse, with religious intolerance, with homophobia and transphobia, and hosts of other similar problems. Insert all additional concerns here. But without in any way saying that any of our current concerns are illegitimate, oh my are all those problems in a vastly better place than they used to be. As Jim Norton commented in his most recent stand-up special, many of the stories our grandparents tell of how they met would, to modern ears, horrify us.

Such questions are treated as much more of a priority than they used to be, especially in educational settings. We also make aggressive use of psychiatric medication.

We are less safe from such things in two ways. 

We are less safe in the sense that we no longer are taught that such things are acceptable, or that we should be able to 'walk it off' when they happen or take it on the chin. Nor are we forced to practice dealing with them. Thus we have a much harder time dealing with them when they happen. We have the problems that come with a revolution of rising expectations.

We are also less safe in the sense that mazes turn everything into a signal and tool in the maze’s political games. Even when one could shrug off an event and its direct consequences, that event also comprises a move within the game. If people observe that offense can be done to you without retaliation, that is a license to everyone to ramp up the level of offense. This transforms your social identity into one to whom offense can safely be done, thus destroying your status and any hopes for success, including assistance with high placement into another maze.

A lot of this safety comes at the direct expense of fun, learning and efficiency. And a lot of it makes us so safe that people are failing to encounter the adversity they need to become resilient.

This all seems quite bad to me on its own merits, but one could disagree. It largely reflects the very good news that life is safer and thus we are free to demand it be safer still. This being a cause of increased maze activity doesn’t depend on it being otherwise good or bad.  

This rise in risk of failure tracks the physical and social risks of ‘falling off the track’ described above. Emotionally falling off the track has also become far more dangerous. 

The risk of things going wrong has gone down at many steps. So has our tolerance for those things going wrong, and our tolerance for risk of things going wrong. We have gone from resilient to fragile. Meanwhile, we’ve lost traditional supports, and many circumstances and life transitions have become far more perilous than they have ever been.

This has bled into even tiny decisions, such as what to have for lunch in a strange town.

One of the highest value practical lessons of rationality is to recognize what is actually safe and unsafe and to what degree, and when it makes sense to take an unconventional path.

How do these phenomena raise maze levels?

Mazes are, among other things, engines for producing the illusion of safety and the illusion of security. In some ways and cases, they do this by producing the real thing. In other cases, it is pure illusion. Either way, they have a huge comparative advantage. 

Working for large organizations is seen as safe and secure. So is working as part of a profession whose rents are protected by regulation and other barriers to entry. Working elsewhere is seen as risky and insecure. 

A product made by a maze is seen as reliable and consistent, even if low quality or a bad value.

Existing risks are ‘grandfathered in.’ Once an activity, product, drug or what have you is approved and becomes standard, that becomes subject to a different frame, where taking that risk isn’t considered risk, and it doesn’t open you up to being blamed if something goes wrong. If no one gets fired for buying IBM, as the saying goes, it’s going to take a damn good reason to buy something else.

Inside a maze, that is a huge support for doing business with and helping other mazes over non-mazes. Outside a maze, people are still largely seeking to avoid being blamed and/or blameworthy.

When your options are bad, it is even more important to show that you have made the conventional, standard and therefore ‘safe’ choice, so that when it blows up in your face, it is not your fault. For a current real example in my own life, I will be using a health insurance broker to navigate the dystopian wasteland that is the individual health insurance market in New York, and placing a high priority on a policy that provides illusion of security.

Even if all is well and safe and good, and that is not in dispute, when something is new, or isn't the generic standard thing, the default consumer does not know this. Why go to the trouble of finding out, when anything that goes wrong is then your fault, even if it had nothing to do with your choice, or your choice actively made it less likely?

New risks, and any downside of a new proposal to basically anyone, are taken vastly out of proportion. Mazes are much better equipped to handle this than non-mazes, as this forces one to play the system and the games of propaganda and public relations, hiding what needs to be hidden, and spending huge amounts of time, money and expertise on the right to do anything. Many of the costs involved are fixed. Small scale efforts often don’t survive this. 

Safety regulations and other requirements for doing things are almost always conceptualized based on what currently exists. Lobbyists made sure that existing systems could handle them, and protested loudly if that would be expensive. If you’re thinking of a new way of doing things, even if it makes the existing system look vastly unsafe in some way, it is likely to violate one of the existing technical requirements, or make some aspect slightly less safe in some fashion. 

Real shame, that. 

I leave as an open question how often the purpose of the safety regulations is the prevention of innovation and competition.

Cause 5: Rent Seeking is More Widespread and Seen as Legitimate

This is a result as well as a cause.

Mazes have comparative advantage at seeking, securing and extracting rents. Rents are what keep the mazes running once they are no longer capable of efficient creation of value.

Thus, part of their reversal of morality is to turn the destructive act of rent seeking into the legitimate and respectable thing that one does to earn a living, and make positive-sum actions seem increasingly illegitimate.  

That major forces in our society are pushing this, and that it has been working, have been addressed earlier in the series. To the extent that they haven’t been, it’s another post or even sequence, and I won’t argue further for it here. 

Cause 6: Big Data, Machine Learning and Internet Economics

Big data and machine learning swing the balance in favor of mazes. 

Mazes are full of people who could not communicate with each other (an extreme case of the SNAFU principle), who are many steps removed from the object-level considerations they neither understand nor have the data or skill to process, and who make decisions based on metrics and political considerations largely around those metrics, without caring about incidental effects.

Machine learning gives a powerful tool to anyone with a metric to optimize and lots of data that can be gathered. The explicit price of this is giving up on a gears-level understanding of what is going on or how things work, not caring too much about incidental effects, and trusting the algorithm. Thus, machine learning helps mazes be much better at what they wanted to do anyway, while charging them prices they’ve already paid.

Others are then judged by those mazes and metrics and pushed towards using more machine learning style strategies. If they give in to this, outsiders lose their advantages of actually understanding how things work and what is important and seeing incidental effects, and are at a severe data disadvantage because the only data that counts is no longer the type of data they have. 

Mazes also benefit because big data and machine learning love huge amounts of data. Huge amounts of data demand scale. Instead of having humans in charge of decisions, now algorithms are in charge of those decisions. The blindness of those inside the maze becomes a blindness from lack of data for those who don’t have sufficient scale.

Internet economics is then shaped by this machine learning and by optimizing for metrics, especially when bidding for advertising and trying to reach the point where one can use advertising to scale. A lot of costs are fixed, which favors large over small. Other aspects do favor small over large, and favor better over worse. So it’s not a complete loss. And there’s a lot of pushback.

I am not claiming this is one directional or simple. 

In addition to the issue of what algorithms do to mazes that use them, there is the issue of what algorithms do to the people the algorithms are reacting to, when they want to change and/or are shaped by how the algorithms react. When algorithms are optimizing how to react to us, we face similar underappreciated pressures even if we are not using them, which also contributes to all this. Our entire mode of thinking about the world becomes warped. That’s a much bigger topic beyond scope.

Cause 7: Ignorance

Mazes are an example of a thing that, unless they have already sufficiently corrupted values in general, could be destroyed by the truth. 

Mazes depend upon the public, and in particular on potential workers and associates, not understanding how mazes work or what working in or with a maze will do to their lives and souls. People do understand that there is something soul-crushing about ‘working for the man’ and that is true as well, but mostly misses the point.

It takes a minimum of years inside a maze to start to understand how mazes work from the inside, or how bad it gets in middle management, if no one explicitly tells you. By that time, many are too invested to stop, or have already self-modified in ways that leave them no longer able to perceive the problem. Even if you have the data needed to put it together, my experience is that to fully appreciate what you are dealing with, you have to quit first. Which would mean you need to quit before you realize how badly you need to quit. Whoops.

If someone does explicitly tell you beforehand, you’re inclined not to believe it, because such warnings usually lack the gears and sound absurd, and come from people who appear biased against business or are experiencing sour grapes. Because things would sound so absurd, those doing the warning will round down their warnings to sound reasonable. Then the listener does the same. Even when the source is credible and provides gears, as I hope this source does when combined with the original book, the brain constantly instinctively rounds it all down to avoid seeing the thing. Those who recoil in horror at the thought of working for a maze, who are not doing so based on hard won experience, usually don’t know the half of it.

This is enough to ensure that workers at large corporations make more money than those at smaller ones. Tyler Cowen and other defenders of large corporations use this as evidence that large corporations make people more productive. I think this represents a terribly incomplete model of how labor markets work. Workers are not primarily paid in some relation to their marginal product. The accounting identity neither holds nor does it enlighten much.

My model is more along the lines of: Workers are paid in relation to what it costs to attract talent to a position, and then the business either is or is not viable. One of the inputs to that is indeed the worker’s perception of their marginal or average production at that job. There is some expectation that great success will be shared with the workers to some extent. Not to do so would make them upset. This is mostly a threshold effect and acts largely on marginal compensation via reciprocity and fairness. This isn’t centrally about supply and demand curves meeting. In good times one gets a raise. 

The exception is when there is a strong union. A strong union changes the calculation from what workers will accept to work here instead of elsewhere, to a zero-sum battle by the union to extract maximal resources from the corporation without too quickly driving it into bankruptcy, versus the corporation trying to retain profits and not let them do that. This conflict plays out in a legal structure where many otherwise legal things become illegal, other otherwise illegal things become legal or mandatory, and the outcomes are weird. Union jobs really are different. 

Another input is the status that goes along with a job, and what it feels like it ‘should’ pay on that basis. In most cases this runs in the perverse direction – a ‘better’ job requires more compensation to keep people happy about their pay. Workers have the instinctual sense that higher status jobs should strictly dominate low status ones. They also have the (usually correct) instinctual sense that if the job does not pay, the promised status is fake slash won’t be accepted. And that if lower status jobs are not paid less, or equal status people who produce better are paid more, then their status is under threat, so they will revolt against such practices. Thus, everyone wants the higher status jobs. You cannot effectively balance this with pay.

It really is a terribly bad idea to let the workers find out what the other workers are paid.

A lot of this, one hopes, is based on how much someone wants to do that job, and have the life that goes along with having that job.

When we see that jobs in big corporations pay more, rather than conclude they must be more productive, I primarily take this as evidence of some combination of two things.

One, that working at a big corporation is something people dislike, so you have to pay them more to do it. This despite them not knowing how much they should dislike it, and therefore demand in compensation, and despite a lot of career lock-in. We also see above-standard pay in large government bureaucracies. 

That can be combined with two, that working at a big corporation is seen as higher status, and therefore demands higher pay.

We often see poor pay in academia, because people enter thinking they'll get to think all day about things they find interesting, and to hold that position securely for life; they like that lifestyle, and that makes them want the job more. They can then accept lower pay because they (incorrectly) see what they are doing as not being part of a profitable operation, and (correctly) see that others think this, so they do not need higher pay to hold onto the job's high status.

Prospective academics are almost always, as far as I can tell, at least in willful denial about their job prospects, and about how their jobs will work if they find them. They certainly don’t process the news about the incentives and pressures they will face, and the petty battles and tasks that lie ahead.

Big business still has to write those larger checks every two weeks while staying in business. Isn’t that evidence of higher productivity? It’s some evidence of otherwise higher profitability of the corporation, that this does not drive them out of business. Profitability is not productivity. If we think those higher profits are largely due to extractive practices, rents and regulatory advantages rather than production, that does not count. If we think that it is due to things inherent to the corporation, such as intellectual property or monopoly or returns to scale, and this in turn causes employees to demand more compensation, that also does not count if it’s not the employees (or not all but a handful of employees) doing the production. 

If people knew the truth about working for mazes and especially being in middle management, and/or were facing less outside pressure, they would demand much higher pay. This would make mazes bear more of the real costs they impose on people, while making it harder to attract strong talent. It would be much easier to justify declining such jobs, and harder to push someone into accepting one. Doing other things would get relatively higher status. 

This would all make mazes less competitive.

So would general knowledge of how mazes derive their advantages and rents from regulation and subsidy. Regulatory capture is rife. Much of what opponents of business and other long entrenched mazes do ends up being used by those entrenched mazes to cement their advantages. A common pattern is outrage at the actions of corrupt actors, creating demand for new rules and regulations, which turn out primarily to entrench and enrich those same corrupt actors.

If people understood these dynamics, and knew what mazes were, they would use different tools. They would demand different interventions. Then there are the things that business convinces the government to do directly. If those were better understood, and people could coordinate to stop them, that would be an even bigger boon.  

One could object that this isn’t a difference with past conditions. Did people in the past know these things? 

Ignorance often comes from not knowing the alternatives to the maze way of doing things, or that there exist or ever have existed things other than mazes, or of the ways in which mazes take over non-mazes and how to prevent that. People object that things are terrible, and are told in response that since the incentives naturally trend towards these terrible things, how could things ever not be terrible? Despite things sometimes not being terrible, and things not being terrible in general. Those in charge of non-maze spaces invite the maze-runners in willingly, without knowing they have sealed the space's fate.

This was not true in the sufficiently distant past. It is not even true for your sufficiently distant past. It is learned behavior. We do not begin life this way. Children begin only knowing the non-maze way of doing things. Then we send them to school. Then we give them jobs.

The hypothesis here is that things were different in the past because people were regularly exposed to non-mazes as central actors in their worlds. This taught them that small maze levels were the normal way of being, instead of being mostly exposed to mazes as central actors, building their lives around them, and being taught that this was the normal way of being. People used to realize that the things involved in this were weird and now they aren’t weird anymore. This leaves us blind to what mazes result in. People used to expect better.

Ignorance often also takes the form of not knowing how to do actual business. Most people today have no idea how to go about working on their own, feeling mostly that There Be Dragons. Or generally how they or their worlds might survive without giant engines taking care of things for them on all levels. That did not used to be the case.

Cause 8: Atomization and the Delegitimization of Human Social Needs

A large cost of working in mazes is that they do not permit full independent outside social lives. People are told to isolate themselves and neglect basic human needs. Our society’s willingness to treat such a condition as normal, acceptable and appropriate means that they can pull this off. 

In the past, in most times and places, I believe that people would have realized how terrible all of this increasingly has been and prioritized fixing it. This would have made people realize that joining an all-consuming maze was a huge life cost not to be taken lightly, rather than a source of palliatives to help one suffer through an impoverished social life.

For those in middle management or above, they are being evaluated on everything they do. This includes who they choose to associate with, including who they marry and how they raise their kids, how they spend their time outside of work, what church if any they go to, and so forth. They are also under increasing pressure to spend every possible waking moment on job-related things. Socializing with coworkers mostly counts.

Even for those not in middle management, jobs play havoc with their schedules and flexibility. Almost total reliability is demanded at most jobs in large organizations during standard hours, and most require also working outside of those hours when asked to do so. Any exceptions are quite costly. Vacation time is both slim and usually planned well in advance. This is hard on someone trying to live life, especially with a family. You end up tired all the time. Your life becomes a grind. Any free time largely needs to be spent recovering. 

The majority of your waking hours are spent interacting with your coworkers, who you mostly did not choose. They then effectively become your friends, and often even your family, the same way they do in most workplace television shows. They are the people who have things in common with you, who you can reasonably keep up with and make plans with. Often they are the people you date, despite all attempts at policy to the contrary, and whether or not there are bad dynamics involved.

On a broader scale, you do not only keep the hours they tell you to keep. You live where they tell you to live, so you can work where they tell you to work. In the ongoing battle to create a rationalist community in New York, there is the constant assumption by everyone that finding a good job somewhere determines where you live. One of my three closest remaining friends, a former Magic: The Gathering professional, recently uprooted his family, including three kids, and sold his apartment, because job offers in another city were somewhat better than those here. His entire family knows zero people in the new city. That is now simply what one does. People who have skills or ambition are expected to walk away from everyone they have ever known for a job. 

This happens because basic human social needs have been delegitimized. Work provides one with one’s ‘social life’ and ‘friends’ and even ‘family,’ and we are expected to count that as a benefit. I believe that in most times and places, this would rightfully be seen as a deeply impoverished bastardization of these concepts. People think their Facebook ‘friends’ are friends, or even close friends. They’re not.

Atomization has many other causes as well. I have wide uncertainty about how much higher maze levels caused atomization, as opposed to atomization causing higher maze levels. The delegitimization of basic human needs seems more tied to maze issues.

Given all that, how does one invest in real and deep long term friendships and groups, when they are increasingly unlikely to last? 

If you have a better answer than ‘persevere and hope for the best’ please share. I give great thanks for the few old friends that are still around.

Cause 9: Educational System

We indoctrinate our citizens to the horrors of mazes early and often.

Our first step is to imprison our children, starting from around the age of five, for the bulk of their waking hours. During that time, all of their behaviors are tightly controlled, and they are taught that their success in life depends on satisfying the arbitrary demands of the person put in charge of them, who will mostly use negative selection to determine outcomes. When asked to justify doing this, the reply is typically that if you do not do this to your child now, your child will be ill-prepared to be subjected to increasingly intense versions of it as they progress through the educational system.

You had better suffer and kill your soul now, or else you won’t be prepared to suffer worse things later.

Once taught that life is about obeying arbitrary dictates and doing work with no object level application whatsoever, giving most of your life over to arbitrary schedules and demands, and gathering together credentials and approvals necessary to get the labels that get others to give you status and compensation, where everyone is on the lookout for reasons to put black marks on your record, you are ready to work in a maze without recoiling in horror.

This process also saddles its students with massive debts that force them to then take jobs they hate, and gives them credentials that give them an advantage in narrow rent extraction in one area, preventing choice or exit.

I previously wrote at somewhat more length on this in these posts: The Case Against Education, The Case Against Education: Foundations, The Case Against Education: Splitting the Education Premium Pie and Considering IQ, and Something Was Wrong.

Cause 10: Vicious Cycle

Mazes are self-reinforcing.

Many of these causes make this clear. Mazes cause a problem and a shift in priorities and values, which in turn makes mazes more attractive and devalues non-mazes. It makes people rely more and more on mazes for their needs. More mazes give maze activities more interactions with other maze activities and those caught in mazes, legitimize maze activities and delegitimize non-maze activities, causing a vicious cycle.

Once people are sufficiently dependent on mazes, or mazes gain sufficient influence and power, there is a critical phase shift. Morality’s sign flips. The forces of social approval, which previously were pushing against maze behaviors, instead start pushing for them. People punish others for not orienting towards mazes and away from the object-level, where before they did the opposite and some people chose mazes in spite of this.

That phase shift seems to have happened in the bulk of general society, including our school system and our political system, and increasingly in social life lived via social media, causing an accelerating problem.

Reputational systems in general have largely become maze-oriented reputation systems. School is about ‘what looks good on an application.’ Work is about ‘what looks good on a resume’ when it isn’t about the boss. Romantic life is even based largely on ‘what looks good on a dating profile’ or in the first moments of interaction, favoring the legible over the illegible more than previous systems.

The more one must likely deal with mazes in the future, and the more they determine our futures, the more ‘guard one’s reputation’ means ‘make sure the searchable record does not contain things mazes will dislike, and let one tell mazes the story one wants to tell.’

These new concerns crowd out what one would otherwise be concerned about. One key thing they crowd out is wanting to be known in good ways for who you actually are and what you do, by people who observe you, who will then tell other people. Mazes can and often do start to destroy your connection to object level reality long before you ever set foot in one. 

Even when the reputation system is based on explicitly being judged by those who interact with you, such as customer reviews, it’s about an average quantification. This largely becomes another metric game. Such systems do excellent work detecting when things are wrong, but mostly operate via negative selection due to clustering of ratings near the top of the range allowed. Thus they likely will become increasingly distortionary with time towards prioritizing avoiding the things that legibly cause bad ratings, effectively punishing anything non-standard or any attempt to be exceptional, because when you earn that sixth star, you’re not allowed to collect it.

A costly way to go above the non-costly maximum would potentially help, but has its own problems, and further discussion is fascinating to me but beyond the scope here. It is a hard problem.

More Mazes, More Problems

Next post I will present the best potential solutions I could come up with. This is not a problem that can be fully solved, but certainly there are changes on the margin that would decrease the rate at which things get worse, or even roll the damage back somewhat.

Implementing them, of course, is a whole additional problem.




On the ontological development of consciousness

January 25, 2020 - 08:56
Published on January 25, 2020 5:56 AM GMT

This post is about what consciousness is, ontologically, and how ontologies that include consciousness develop.

The topic of consciousness is quite popular, and confusing, in philosophy. While I do not seek to fully resolve the philosophy of consciousness, I hope to offer an angle on the question I have not seen before. This angle is that of developmental ontology: how are "later" ontologies developed from "earlier" ontologies? I wrote about developmental ontology in a previous post, and this post can be thought of as an elaboration, which can be read on its own, and which specifically tackles the problem of consciousness.

Much of the discussion of stabilization is heavily inspired by On the Origin of Objects, an excellent book on reference and ontology, to which I owe much of my ontological development. To the extent that I have made any philosophical innovation, it is in combining this book's concepts with the minimum-description-length principle, and analytic philosophy of mind.

World-perception ontology

I'm going to write a sequence of statements, which each make sense in terms of an intuitive world-perception ontology.

  • There's a real world outside of my head.
  • I exist and am intimately connected with, if not identical with, some body in this world.
  • I only see some of the world. What I can see is like what a camera placed at the point my eyes are can see.
  • The world contains objects. These objects have properties like shape, color, etc.
  • When I walk, it is me who moves, not everything around me. Most objects are not moving most of the time, even if they look like they're moving in my visual field.
  • Objects, including my body, change and develop over time. Changes proceed, for the most part, in a continuous way, so e.g. object shapes and sizes rarely change, and teleportation doesn't happen.

These all seem common-sensical; it would be strange to doubt them. However, achieving the ontology by which such statements are common-sensical is nontrivial. There are many moving parts here, which must be working in their places before the world seems as sensible as it is.

Let's look at the "it is me who moves, not everything around me" point, because it's critical. If you try shaking your head right now, you will notice that your visual field changes rapidly. An object (such as a computer screen) in your vision is going to move side-to-side (or top-to-bottom), from one side of your visual field to another.

However, despite this, there is an intuitive sense of the object *not* moving. So, there is a stabilization process involved. Image stabilization (example here) is an excellent analogy for this process (indeed, the brain could be said to engage in image stabilization in a literal sense).

The world-perception ontology is, much of the time, geocentric, rather than egocentric or heliocentric. If you walk, it usually seems like the ground is still and you are moving, rather than the ground moving while you're still (egocentrism), or both you and the ground moving very quickly (heliocentrism). There are other cases, such as vehicle interiors, where what is stabilized is not the Earth but the vehicle itself; and "tearing" between this reference frame and the geocentric reference frame can cause motion sickness.

Notably, world-perception ontology must contain both (a) a material world and (b) "my perceptions of it". Hence, the intuitive ontological split between material and consciousness. To take such a split to be metaphysically basic is to be a Descartes-like dualist. And the split is ontologically compelling enough that such a metaphysics can be tempting.

Pattern-only ontology

William James famously described the baby's sense of the world as a "blooming, buzzing confusion". The image presented is one of dynamism and instability, very different from world-perception ontology.

The baby's ontology is closer to raw percepts than an adult's is; it's less developed, fewer things are stabilized, and so on. Babies generally haven't learned object permanence; this is a stabilization that is only developed later.

The most basic ontology consists of raw percepts (which cannot even be considered "percepts" from within this ontology), not even including shapes; these percepts may be analogous to pixel-maps in the case of vision, or spectrograms in the case of hearing, but I am unsure of these low-level details, and the rest of this post would still apply if the basic percepts were e.g. lines in vision. Shapes (which are higher-level percepts) must be recognized in the sea of percepts, in a kind of unsupervised learning.

The process of stabilization is intimately related to a process of pattern-detection. If you can detect patterns of shapes across time, you may reify such patterns as an object. (For example, a blue circle that is present in the visual field, and retains the same shape even as it moves around the field, or exits and re-enters, may be reified as a circular object). Such pattern-reification is analogous to folding a symmetric image in half: it allows the full image to be described using less information than was contained in the original image.

In general, the minimum description length principle says it is epistemically correct to posit fewer objects to explain many. And, positing a small number of shapes to explain many basic percepts, or a small number of objects to explain a large number of shapes, are examples of this.

From having read some texts on meditation (especially Mastering the Core Teachings of the Buddha), and having meditated myself, I believe that meditation can result in getting more in touch with pattern-only ontology, and that this is an intended result, as the pattern-only ontology necessarily contains two of the three characteristics (specifically, impermanence and no-self).

To summarize: babies start from a confusing point, where there are low-level percepts, and patterns progressively recognized in them, which develops ontology including shapes and objects.

World-percept ontology results from stabilization

The thesis of this post may now be stated: world-percept ontology results from stabilizing a previous ontology that is itself closer to pattern-only ontology.

One of the most famous examples of stabilization in science is the movement from geocentrism to heliocentrism. Such stabilization explains many epicycles in terms of few cycles, by changing where the center is.

The move from egocentrism to geocentrism is quite analogous. An egocentric reference frame will contain many "epicycles", which can be explained using fewer "cycles" in geocentrism.

These cycles are literal in the case of a person spinning around in a circle. In a pattern-only ontology (which is, necessarily, egocentric, for the same reason it doesn't have a concept of self), that person will see around them shapes moving rapidly in the same direction. There are many motions to explain here. In a world-percept ontology, most objects around are not moving rapidly; rather, it is believed that the self is spinning.

So, the egocentric-to-geocentric shift is compelling for the same reason the geocentric-to-heliocentric shift is. It allows one to posit that there are few motions, instead of many motions. This makes percepts easier to explain.

Consciousness in world-percept ontology

The upshot of what has been said so far is: the world-percept ontology results from Occamian symmetry-detection and stabilization of a pattern-only ontology (or, some intermediate ontology).

And, the world-percept ontology has conscious experience as a component. For, how else can what were originally perceptual patterns be explained, except by positing that there is a camera-like entity in the world (attached to some physical body) that generates such percepts?

The idea that consciousness doesn't exist (which is asserted by some forms of eliminative materialism) doesn't sit well with this picture. The ontological development that produced the idea of the material world, also produced the idea of consciousness, as a dual. And both parts are necessary to make sense of percepts. So, consciousness-eliminativism will continue to be unintuitive (and for good epistemic reasons!) until it can replace world-percept ontology with one that achieves percept-explanation that is at least as effective. And that looks to be difficult or impossible.

To conclude: the ontology that allows one to conceptualize the material world as existing and not shifting constantly, includes as part of it conscious perception, and could not function without including it. Without such a component, there would be no way to refactor rapidly shifting perceptual patterns into a stable outer world and a moving point-of-view contained in it.


Have epistemic conditions always been this bad?

January 25, 2020 - 07:42
Published on January 25, 2020 4:42 AM GMT

In the last few months, I've gotten increasingly alarmed by leftist politics in the US, and the epistemic conditions that it operates under and is imposing wherever it gains power. (Quite possibly the conditions are just as dire on the right, but they are not as visible or salient to me, because most of the places I can easily see, either directly or through news stories, i.e., local politics in my area, academia, journalism, large corporations, seem to have been taken over by the left.)

I'm worried that my alarmism is itself based on confirmation bias, tribalism, catastrophizing, or any number of other biases. (It confuses me that I seem to be the first person to talk much about this on either LW or EA Forum, given that there must be people who have been exposed to the current political environment earlier or to a greater extent than me. On the other hand, all my posts/comments on the subject have generally been upvoted on both forums, and nobody has specifically said that I'm being too alarmist. One possible explanation for nobody else raising an alarm about this is that they're afraid of the current political climate and they're not as "cancel-proof" as I am, or don't feel that they have as much leeway to talk about politics-adjacent issues here as I do.)

So I want to ask, have things always been like this, or have they actually gotten significantly worse in recent (say the last 5 or 10) years? If they've always been like this, then perhaps there is less cause for alarm, because (1) if things have always been this bad, and we muddled through them in the past, we can probably continue to muddle through in the future (modulo new x-risks like AGI), and (2) if there is no recent trend towards worsening conditions then we don't need to worry so much about conditions getting worse in the near future. (Obviously if we go back far enough, say to the Middle Ages, then things were almost certainly as bad or worse, but I'm worried about more recent trends.)

If there are other reasons to not be very alarmed aside from the past being just as bad, please feel free to talk about those as well. But in case one of those reasons is "why be alarmed when there's little that can be done about it", my answer is that being alarmed motivates one to try to understand what is going on, which can help with (1) deciding personal behavior now in expectation of future changes (for example if there's going to be a literal Cultural Revolution in the future, then I need to be really really careful what I say today), (2) planning x-risk strategy, and (3) defending LW/EA from either outside attack or similar internal dynamics.

Here's some of what I've observed so far, which has led me to my current epistemic state:

In local politics, "asking for evidence of oppression is a form of oppression" or even more directly "questioning the experiences of a marginalized group that you don't belong to is not allowed and will result in a ban" has apparently been an implicit norm, and is being made increasingly explicit. (E.g., I saw a FB group explicitly codifying this in their rules.) As a result, anyone can say "Policy X or Program Y oppresses Group Z and must be changed" and nobody can argue against that, except by making the same kind of claim based on a different identity group, and then it comes down which group is recognized as being more privileged or oppressed by the current orthodoxy. (Even if someone does belong to Group Z and wants to argue against the claim on that basis, they'll be dismissed based on "being tokenized" or "internalized oppression".)

In academia, even leftist professors are being silenced or kicked out on a regular basis for speaking out against an ever-shifting "party line". ("Party line" is in quotes because it is apparently not determined in a top-down fashion by people in charge of a political party, but instead seems to arise from the bottom up, which is even scarier as no one can decide to turn this off, like the Chinese Communist Party did to end the Cultural Revolution after Mao died.) See here for a previous comment on this with links. I don't recall reading this kind of story before about 5 years ago.

The thing that most directly prompted me to write this post was this (the most "recommended") comment on a recent New York Times story about "cancel culture":

Having just graduated from the University of Minnesota last year, a very liberal college, I believe these examples don't adequately show how far cancel culture has gone and what it truly is. The examples used of disassociating from obvious homophobes, or more classic bullying that teenage girls have always done to each other since the dawn of time is not new and not really cancel culture. The cancel culture that is truly new to my generation is the full blocking or shutting out of someone who simply has a different opinion than you. My experience in college was it morphed into a culture of fear for most. The fear of cancellation or punishment for voicing an opinion that the "group" disagreed with created a culture where most of us sat silent. My campus was not one of fruitful debate, but silent adherence to whatever the most "woke" person in the classroom decided was the correct thing to believe or think. This is not how things worked in the past, people used to be able to disagree, debate and sometimes feel offended because we are all looking to get closer to the truth on whatever topic it may be. Our problem with cancel culture is it snuffs out any debate, there is no longer room for dissent or nuance, the group can decide that your opinion isn't worth hearing and - poof you've been canceled into oblivion. Whatever it's worth I'd like to note I'm a liberal, voted for Obama and Hillary, those who participate in cancel culture aren't liberals to me, they've hijacked the name.

I went to the University of Washington (currently also quite liberal; see one of the above linked "professor" stories, which took place at UW) in the late 90s, and I don't remember things being like this back then, but some of the replies to this comment say that things were this bad before:

@Cal thoughtful comment, however, i grew up in the late 60s-70s and what you described was going on then also. the technology of course is different today, and the issues different. we never had a name ("cancel") for it, but it existed.

This sounds a lot like my college experience in the late 80s and early 90s. I think when people get out into the “real world” and have to work with people of varying ages and from varying backgrounds, they realize they need to be more tolerant to get by in the workplace. I remember being afraid to voice any opinion in liberal arts classes, for fear it would be the wrong one and inadvertently offend someone.

So LW, what to make of all this?


Cambridge Prediction Game

January 25, 2020 - 06:57
Published on January 25, 2020 3:57 AM GMT

In order to improve our prediction and calibration skills, the Cambridge, MA rationalist community has been running a prediction game (and keeping score) at a succession of rationalist group houses since 2013.

Below are the rules:


The game consists of a series of prediction markets. Each market consists of a question that will (within a reasonable timeframe) have a well-defined binary or multiple-choice answer, an initial probability estimate (called the "house bet"), and a sequence of players' probability estimates. Each prediction is scored on how much more or less accurate it is than the preceding prediction (we do it this way because the previous player's prediction is evidence, and one of the skills this game is meant to develop is updating properly based on what other people think).

Creating Markets

Any player can create a market. To create a market, a player writes the question to be predicted on a whiteboard or on a giant sticky note on the wall with a house bet. The house bet should be something generally reasonable, but does not need to be super well-informed (this is abusable in theory but has not been abused in practice).

Making Predictions

To make a prediction, a player writes their name and their probability estimate under the most recent prediction (or the house bet if there are no predictions so far). The restrictions on predictions are:

  • The player who set the house bet cannot make the first prediction (otherwise they could essentially award themself points by setting a bad house bet).
  • No predicted probability can be < 0.01.
  • A player without a positive score cannot lower any predicted probability by more than a factor of 2 (in order to avoid creating too many easy points from going immediately after an inexperienced player).
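As a sketch, these restrictions could be checked with something like the following (the function name and signature are my own, purely illustrative):

```python
def prediction_allowed(prev_probs, new_probs, player_has_positive_score):
    """Check a proposed prediction against the restrictions above.

    prev_probs and new_probs each give one probability per possible
    outcome (for a binary market, use [p, 1 - p]). The house-bet
    restriction (the market creator may not predict first) is assumed
    to be enforced separately.
    """
    for p_old, p_new in zip(prev_probs, new_probs):
        if p_new < 0.01:
            return False  # no predicted probability below 0.01
        if not player_has_positive_score and p_new < p_old / 2:
            return False  # players without a positive score may not
                          # lower any probability by more than 2x
    return True
```

For example, under these rules a player without a positive score may move an outcome's probability from 0.8 down to 0.4, but not down to 0.3.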

When a market is settled (i.e. the correct answer becomes known), each prediction is given points equal to:

100 * log2( probability given to the correct answer / previous probability given to the correct answer)

In a binary market where the correct answer is no, each prediction's implied probability of "no" is used (e.g. if a player predicted 0.25, that is treated as p(no)=0.75).

This is a strictly proper scoring rule, meaning that the optimal strategy (strategy with the highest expected points) is to bet one's true beliefs about the question.
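The scoring formula can be sketched in a few lines of Python (the function name is illustrative; the numbers reproduce the binary example market later in the post):

```python
import math

def prediction_score(prev_prob, new_prob):
    """Points for one prediction, relative to the preceding one.

    Both arguments are the probabilities given to the answer that
    turned out correct; in a binary market that resolved "no", pass
    the implied no-probabilities (1 - p) instead.
    """
    return 100 * math.log2(new_prob / prev_prob)

# Binary example market below: house 0.5, then Alice 0.4, Bob 0.2,
# Alice 0.3, Carol 0.6, resolving "yes".
probs = [0.5, 0.4, 0.2, 0.3, 0.6]
scores = [round(prediction_score(a, b)) for a, b in zip(probs, probs[1:])]
print(scores)  # [-32, -100, 58, 100]
```

Note that a prediction identical to the previous one always scores exactly zero, which is what makes scoring relative to the prior prediction fair.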

The points from each market are tracked in a spreadsheet, along with the date each market settled. The points from each market decay by a factor of e every 180 days.
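The decay rule can be sketched similarly (again, the function name is my own):

```python
import math

def decayed_points(points, days_since_settled):
    # Points decay by a factor of e every 180 days.
    return points * math.exp(-days_since_settled / 180)

# A +100 score is worth about 36.8 points after 180 days,
# and about 13.5 after 360 days.
```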

The score of each player with a positive score is written on one of our whiteboards and is updated semi-regularly.

Example Markets

Example binary outcome market:

Does the nearest 7-11 sell coconut water?

             p(yes)   Points
    House    0.5
    Alice    0.4      -32
    Bob      0.2      -100
    Alice    0.3      +58
    Carol    0.6      +100
    Outcome: Yes

Example multiple-choice market:

Faithless electors in 2016

             0      1-5    6-36   37+    Points
    House    0.4    0.4    0.1    0.1
    Alice    0.2    0.4    0.1    0.3    0
    Bob      0.2    0.5    0.2    0.1    +100
    Carol    0.25   0.55   0.15   0.05   -42
    Bob      0.1    0.3    0.58   0.02   +195
    Outcome: 6-36


SSC Halifax Meetup -- January 25

January 25, 2020 - 04:15
Published on January 25, 2020 1:15 AM GMT

Come to the Humani-T cafe (on South Street, mind you!), drink some coffee/tea, and discuss interesting topics.


Litany Against Anger

January 25, 2020 - 03:56
Published on January 25, 2020 12:56 AM GMT

The map is not the territory,
the word is not the thing.

My anger is not the trigger,
triggers do not become more or less important
as I feel anger wax and wane.

Emotions are motivation,
they're supposed to get me to do the right thing.

Defuse from my frustration,
breathe and wait for clarity calm will bring.


AI alignment concepts: philosophical breakers, stoppers, and distorters

January 24, 2020 - 22:23
Published on January 24, 2020 7:23 PM GMT

Meta: This is one of my abstract existential risk strategy concept posts that are designed to be about different perspectives or foundations upon which to build further.


When thinking about philosophy, one may encounter philosophical breakers, philosophical stoppers, and philosophical distorters: thoughts or ideas that cause an agent (such as an AI) to break, get stuck, or take a random action. They are philosophical crises for that agent (and can in theory sometimes be information hazards). For some less severe human examples, see this recent post on reality masking puzzles. In AI, example breakers, stoppers, and distorters are logical contradictions (in some symbolic AIs), inability to generalize from examples, and mesa optimizers, respectively.

Philosophical breakers, stoppers, and distorters all pose both possible problems and opportunities for building safe and aligned AGI and for preventing unaligned AGI from becoming dangerous. They may be encountered or solved by explicit philosophy, implicitly as part of developing another field (like mathematics or AI), by accident, or by trial and error. An awareness of the idea of philosophical breakers, stoppers, and distorters provides another complementary perspective for solving AGI safety and may prompt the generation of new safety strategies and AGI designs (see also this complementary strategy post on safety regulators).

Concept definitions

Philosophical breakers:

  • Philosophical thoughts and questions that cause an agent to break or otherwise take a lot of damage that are hard to anticipate beforehand for that agent.

Philosophical stoppers:

  • Philosophical thoughts and questions that cause an agent to get stuck in an important way that are hard to anticipate beforehand for that agent.

Philosophical distorters:

  • Philosophical thoughts and questions that cause an agent to choose a random or changed philosophical answer rather than the one it was using (possibly implicitly) earlier. An example in the field of AGI alignment would be something that causes an aligned AGI to in some sense randomly choose its utility function to be paperclip maximizing because of an ontological crisis.

Concepts providing context, generalization, and contrast

Thought breakers, stoppers, and distorters:

  • Generalizations of their philosophical versions that cover thoughts and questions in general: a thought that would cause an agent to halt, implementing algorithms in buggy ways, deep meditative realizations, self-reprogramming that causes unexpected failures, getting stuck in a thought loop, and so on, that are hard to anticipate beforehand for that agent.

System breakers, stoppers, and distorters:

  • A further generalization that also includes system environment and architecture problems. For instance, system environments could be full of hackers, noisy, or adversarial examples and the architecture could involve genetic algorithms.

Threats vs breakers, stoppers, and distorters:

  • Generalizations of breakers, stoppers, and distorters to include those things that are easy to anticipate beforehand for that agent.

Viewpoints: The agent’s viewpoint and an external viewpoint.

Application domains

The natural places to use these concepts are philosophical inquiry, the philosophical parts of mathematics or physics, and AGI alignment.

Concept consequences

If there is a philosophical breaker or stopper for an AGI when undergoing self-improvement into a superintelligence, and it isn’t a problem for humans or it’s one that we’ve already passed through, then by not disarming it for that AGI we are leaving a barrier in place for its development (a trivial example of this is that general intelligence isn’t a problem for humans). This can be thought of as a safety method. Such problems can be either naturally found as consequences of an AGI design, or an AGI may be designed to encounter them if it undergoes autonomous self-improvement.

If there is a philosophical distorter in front of a safe and aligned AGI, we’ll need to disarm it either by changing the AGI’s code/architecture or making the AGI aware of it in a way such that it can avoid it. We could, for instance, hard code an answer or we could point out some philosophical investigations as things to avoid until it is more sophisticated.

How capable an agent may become and how fast it reaches that capability will partially depend on the philosophical breakers and stoppers it encounters. If the agent has a better ability to search for and disarm them then it can go further without breaking or stopping.

How safe and aligned an agent is will partially be a function of the philosophical distorters it encounters (which in turn partially depends on its ability to search for them and disarm them).

Many philosophical breakers and stoppers are also philosophical distorters. For instance, if a system gets stuck when generalizing beyond a point, it may rely on evolution instead. In this case we must think more carefully about disarming philosophical breakers and stoppers. If a safe and aligned AGI encounters a philosophical distorter, it is probably not safe and aligned anymore; but if an unaligned AGI encounters a philosophical stopper or breaker, it may be prevented from going further. In some sense, an AGI can never be fully safe and aligned if, upon autonomous self-improvement, it will encounter a philosophical distorter.

A proposed general AGI safety strategy with respect to philosophical breakers, stoppers, and distorters:
  1. First, design and implement a safe and aligned AGI (safe up to residual philosophical distorters). If the AGI isn’t safe and aligned, then proceed no further until you have one that is.
  2. Then, remove philosophical distorters that are not philosophical breakers or stoppers.
  3. Then, remove philosophical distorters that are also philosophical breakers or stoppers.
  4. And finally, remove the remaining philosophical breakers and stoppers.
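
The ordering constraint in the four steps can be sketched as a small priority function. The boolean encoding of problems and all names here are hypothetical, introduced only to make the gating and the ordering explicit:

```python
def disarm_order(problems, agi_is_aligned):
    """Return problems in the order the four-step strategy disarms them.

    `problems` is a list of dicts with boolean keys 'distorter' and
    'breaker_or_stopper' (a hypothetical encoding). Step 1 gates
    everything: if the AGI is not safe and aligned, disarm nothing.
    """
    if not agi_is_aligned:
        return []  # proceed no further until alignment is established

    def priority(p):
        if p['distorter'] and not p['breaker_or_stopper']:
            return 2  # step 2: pure distorters first
        if p['distorter'] and p['breaker_or_stopper']:
            return 3  # step 3: distorters that also break or stop
        return 4      # step 4: pure breakers/stoppers last

    return sorted(problems, key=priority)
```

The ordering encodes the trade-off discussed above: distorters threaten alignment and go first, while breakers and stoppers double as safety barriers and are removed only at the end.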


The two-layer model of human values, and problems with synthesizing preferences

January 24, 2020 - 18:17
Published on January 24, 2020 3:17 PM GMT

I have been thinking about Stuart Armstrong's preference synthesis research agenda, and have long had the feeling that there's something off about the way it is currently framed. In this post I try to describe why. I start by describing my current model of human values and how I interpret Stuart's implicit assumptions to conflict with it, and then talk about my confusion with regard to reconciling the two views.

The two-layer/ULM model of human values

In Player vs. Character: A Two-Level Model of Ethics, Sarah Constantin describes a model where the mind is divided, in game terms, into a "player" and a "character". The character is everything that we consciously experience, but our conscious experiences are not our true reasons for acting. As Sarah puts it:

In many games, such as Magic: The Gathering, Hearthstone, or Dungeons and Dragons, there’s a two-phase process. First, the player constructs a deck or character from a very large sample space of possibilities. This is a particular combination of strengths and weaknesses and capabilities for action, which the player thinks can be successful against other decks/characters or at winning in the game universe. The choice of deck or character often determines the strategies that deck or character can use in the second phase, which is actual gameplay. In gameplay, the character (or deck) can only use the affordances that it’s been previously set up with. This means that there are two separate places where a player needs to get things right: first, in designing a strong character/deck, and second, in executing the optimal strategies for that character/deck during gameplay. [...]

The idea is that human behavior works very much like a two-level game. [...] The player determines what we find rewarding or unrewarding. The player determines what we notice and what we overlook; things come to our attention if it suits the player’s strategy, and not otherwise. The player gives us emotions when it’s strategic to do so. The player sets up our subconscious evaluations of what is good for us and bad for us, which we experience as “liking” or “disliking.”

The character is what executing the player’s strategies feels like from the inside. If the player has decided that a task is unimportant, the character will experience “forgetting” to do it. If the player has decided that alliance with someone will be in our interests, the character will experience “liking” that person. Sometimes the player will notice and seize opportunities in a very strategic way that feels to the character like “being lucky” or “being in the right place at the right time.”

This is where confusion often sets in. People will often protest “but I did care about that thing, I just forgot” or “but I’m not that Machiavellian, I’m just doing what comes naturally.” This is true, because when we talk about ourselves and our experiences, we’re speaking “in character”, as our character. The strategy is not going on at a conscious level. In fact, I don’t believe we (characters) have direct access to the player; we can only infer what it’s doing, based on what patterns of behavior (or thought or emotion or perception) we observe in ourselves and others.

I think that this model is basically correct, and that our emotional responses, preferences, etc. are all the result of a deeper-level optimization process. This optimization process, then, is something like that described in The Brain as a Universal Learning Machine:

The universal learning hypothesis proposes that all significant mental algorithms are learned; nothing is innate except for the learning and reward machinery itself (which is somewhat complicated, involving a number of systems and mechanisms), the initial rough architecture (equivalent to a prior over mindspace), and a small library of simple innate circuits (analogous to the operating system layer in a computer). In this view the mind (software) is distinct from the brain (hardware). The mind is a complex software system built out of a general learning mechanism. [...]

An initial untrained seed ULM can be defined by 1.) a prior over the space of models (or equivalently, programs), 2.) an initial utility function, and 3.) the universal learning machinery/algorithm. The machine is a real-time system that processes an input sensory/observation stream and produces an output motor/action stream to control the external world using a learned internal program that is the result of continuous self-optimization. [...]

The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regards to the utility function (reward system). We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on. Without loss of generality, any system capable of computing a large number of decision variables can also compute internal self-modification decisions.

Conceptually the learning machinery computes a probability distribution over program-space that is proportional to the expected utility distribution. At each timestep it receives a new sensory observation and expends some amount of computational energy to infer an updated (approximate) posterior distribution over its internal program-space: an approximate 'Bayesian' self-improvement.

Rephrasing these posts in terms of each other, in a person's brain "the player" is the underlying learning machinery, which is searching the space of programs (brains) in order to find a suitable configuration; the "character" is whatever set of emotional responses, aesthetics, identities, and so forth the learning program has currently hit upon.
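
The quoted passage describes the learning machinery as maintaining a distribution over program-space proportional to expected utility. A toy numerical sketch of one such update, with made-up numbers and a deliberately crude normalization; this illustrates the idea, not the actual algorithm from the ULM post:

```python
def ulm_update(prior, expected_utility):
    """Toy posterior over candidate programs: prior belief times a
    shifted, nonnegative version of each program's expected utility,
    renormalized to sum to one."""
    shift = min(expected_utility.values())
    weights = {p: prior[p] * (expected_utility[p] - shift + 1e-9)
               for p in prior}
    z = sum(weights.values())
    return {p: w / z for p, w in weights.items()}

# Three candidate "programs" (strategies the character might run),
# with invented prior weights and expected utilities:
prior = {"confident": 0.2, "avoidant": 0.5, "neutral": 0.3}
utility = {"confident": 3.0, "avoidant": 1.0, "neutral": 2.0}
posterior = ulm_update(prior, utility)
# Mass shifts toward the higher-utility "confident" program.
```

In these terms, an Unlocking the Emotional Brain style update is the utility estimate for a program like "avoidant" dropping, after which the player reallocates probability mass away from it.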

Many of the things about the character that seem fixed can in fact be modified by the learning machinery. One's sense of aesthetics can be updated by propagating new facts into it, and strongly-held identities (such as "I am a technical person") can change in response to new kinds of strategies becoming viable. Unlocking the Emotional Brain describes a number of such updates, such as - in these terms - the ULM eliminating subprograms blocking confidence after receiving an update saying that the consequences of expressing confidence will not be as bad as previously predicted.

Another example of this kind of thing is the framework that I sketched in Building up to an Internal Family Systems model: if a system has certain kinds of bad experiences, it makes sense for it to spawn subsystems dedicated to ensuring that those experiences do not repeat. Moral psychology's social intuitionist model claims that people often have an existing conviction that certain actions or outcomes are bad, and that they then level seemingly rational arguments for the sake of preventing those outcomes. Even if you rebut the arguments, the conviction remains. This kind of model is compatible with an IFS/ULM style model, where the learning machinery sets the goal of preventing particular outcomes, and then applies the "reasoning model" for that purpose.

Qiaochu Yuan notes that once you see people being upset at their coworker for criticizing them, do therapy approaches with them, and get to the point where they are crying about how their father never told them that he was proud of them... then it gets really hard to take people's reactions to things at face value. Many of our consciously experienced motivations actually have nothing to do with our real motivations. (See also: Nobody does the thing that they are supposedly doing, The Elephant in the Brain, The Intelligent Social Web.)

Preference synthesis as a character-level model

While I like a lot of the work that Stuart Armstrong has done on synthesizing human preferences, I have a serious concern about it, which is best described as: everything in it is based on the character level, rather than the player/ULM level.

For example, in "Our values are underdefined, changeable, and manipulable", Stuart - in my view, correctly - argues for the claim stated in the title... except that it is not clear to me to what extent the things we intuitively consider our "values" are actually our values. Stuart opens with this example:

When asked whether "communist" journalists could report freely from the USA, only 36% of 1950 Americans agreed. A follow-up question about American journalists reporting freely from the USSR got 66% agreement. When the order of the questions was reversed, 90% were in favour of American journalists - and an astounding 73% in favour of the communist ones.

From this, Stuart suggests that people's values on these questions should be thought of as underdetermined. I think that this has a grain of truth to it, but that calling these opinions "values" in the first place is misleading.

My preferred framing would rather be that people's values - in the sense of some deeper set of rewards which the underlying machinery is optimizing for - are in fact underdetermined, but that is not what's going on in this particular example. The order of the questions does not change those values, which remain stable under this kind of consideration. Rather, consciously-held political opinions are strategies for carrying out the underlying values. Receiving the questions in a different order caused the system to consider different kinds of information when choosing its initial strategy, leading to different strategic choices.

Stuart's research agenda does talk about incorporating meta-preferences, but as far as I can tell, all the meta-preferences are about the character level too. Stuart mentions "I want to be more generous" and "I want to have consistent preferences" as examples of meta-preferences; in actuality, these meta-preferences might exist because of something like "the learning system has identified generosity as a socially admirable strategy and predicts that to lead to better social outcomes" and "the learning system has formulated consistency as a generally valuable heuristic and one which affirms the 'logical thinker' identity, which in turn is being optimized because of its predicted social outcomes".

My confusion about a better theory of values

If a "purely character-level" model of human values is wrong, how do we incorporate the player level?

I'm not sure and am mostly confused about it, so I will just babble & boggle at my confusion for a while, in the hopes that it would help.

The optimistic take would be that there exists some set of universal human values which the learning machinery is optimizing for. There exist various therapy frameworks which claim to have found something like this.

For example, the NEDERA model claims that there exist nine negative core feelings whose avoidance humans are optimizing for: people may feel Alone, Bad, Helpless, Hopeless, Inadequate, Insignificant, Lost/Disoriented, Lost/Empty, and Worthless. And pjeby mentions that in his empirical work, he has found three clusters of underlying fears which seem similar to these nine:

For example, working with people on self-image problems, I've found that there appear to be only three critical "flavors" of self-judgment that create life-long low self-esteem in some area, and associated compulsive or avoidant behaviors:

  • Belief that one is bad, defective, or malicious (i.e. lacking in care/altruism for friends or family)
  • Belief that one is foolish, incapable, incompetent, unworthy, etc. (i.e. lacking in ability to learn/improve/perform)
  • Belief that one is selfish, irresponsible, careless, etc. (i.e. not respecting what the family or community values or believes important)

(Notice that these are things that, if you were bad enough at them in the ancestral environment, or if people only thought you were, you would lose reproductive opportunities and/or your life due to ostracism. So it's reasonable to assume that we have wiring biased to treat these as high-priority long-term drivers of compensatory signaling behavior.)

Anyway, when somebody gets taught that some behavior (e.g. showing off, not working hard, forgetting things) equates to one of these morality-like judgments as a persistent quality of themselves, they often develop a compulsive need to prove otherwise, which makes them choose their goals, not based on the goal's actual utility to themself or others, but rather based on the goal's perceived value as a means of virtue-signalling. (Which then leads to a pattern of continually trying to achieve similar goals and either failing, or feeling as though the goal was unsatisfactory despite succeeding at it.)

So - assuming for the sake of argument that these findings are correct - one might think something like "okay, here are the things the brain is trying to avoid, we can take those as the basic human values".

But not so fast. After all, emotions are all computed in the brain, so "avoidance of these emotions" can't be the only goal any more than "optimizing happiness" can. It would only lead to wireheading.

Furthermore, it seems that one of the things the underlying machinery also learns is which situations should trigger these feelings. E.g. feelings of irresponsibility can serve as an internal carrot-and-stick scheme, in which the system comes to predict that feeling persistently bad will cause parts of it to pursue specific goals in an attempt to make those negative feelings go away.

Also, we are not only trying to avoid negative feelings. Empirically, it doesn't look like happy people end up doing less than unhappy people, and guilt-free people may in fact do more than guilt-driven people. The relationship is nowhere near linear, but it seems like there are plenty of happy, energetic people who are happy in part because they are doing all kinds of fulfilling things.

So maybe we could look at the inverse of negative feelings: positive feelings. The current mainstream model of human motivation and basic needs is self-determination theory, which explicitly holds that there exist three separate basic needs:

  • Autonomy: people have a need to feel that they are the masters of their own destiny and that they have at least some control over their lives; most importantly, people have a need to feel that they are in control of their own behavior.
  • Competence: another need concerns our achievements, knowledge, and skills; people have a need to build their competence and develop mastery over tasks that are important to them.
  • Relatedness (also called Connection): people need to have a sense of belonging and connectedness with others; each of us needs other people to some degree.

So one model could be that the basic learning machinery is, first, optimizing for avoiding bad feelings; and then, optimizing for things that have been associated with good feelings (even when doing those things is locally unrewarding, e.g. taking care of your children even when it's unpleasant). But this too risks running into the wireheading issue.

A problem here is that while it might make intuitive sense to say "okay, if the character's values aren't the real values, let's use the player's values instead", the split isn't actually anywhere that clean. In a sense the player's values are the real ones - but there's also a sense in which the player doesn't have anything that we could call values. It's just a learning system which observes a stream of rewards and optimizes it according to some set of mechanisms, and even the reward and optimization mechanisms themselves may end up getting at least partially rewritten. The underlying machinery has no idea about things like "existential risk" or "avoiding wireheading" or necessarily even "personal survival" - thinking about those is a character-level strategy, even if it is chosen by the player using criteria that it does not actually understand.

For a moment it felt like looking at the player level would help with the underdefinability and mutability of values, but the player's values seem like they could be even less defined and even more mutable. It's not clear to me that we can call them values in the first place, either - any more than it makes meaningful sense to say that a neuron in the brain "values" firing and releasing neurotransmitters. The player is just a set of code, or going one abstraction level down, just a bunch of cells.

To the extent that there exists something that intuitively resembles what we call "human values", it feels like it exists in some hybrid level which incorporates parts of the player and parts of the character. That is, assuming that the two can even be very clearly distinguished from each other in the first place.

Or something. I'm confused.


How much do we know about how brains learn?

January 24, 2020 - 17:46
Published on January 24, 2020 2:46 PM GMT

In particular, how much do we know about how human brains learn?

I'm also particularly interested in details at a level similar to this post:

What I remember about this is vague, but my understanding is that we don't (or didn't until very recently) have particularly strong evidence about how (human) brains do the equivalent of reinforcement learning in { decision theory / statistics / AI / ML }.

Do we know significantly more now versus five years ago? What do we know now (and how do we know it)?


Epistea Summer Experiment (ESE)

January 24, 2020 - 14:00
Published on January 24, 2020 10:49 AM GMT

Remark: This post was written collectively by the organizing team of the Epistea Summer Experiment.

Cross-posted to the EA Forum here.

Epistea Summer Experiment (ESE, /ˈiːzi/) was an experimental summer workshop in Prague combining elements of applied rationality and experiential education. The main goals were to:

  • Try new ideas about rationality education, such as multi-agent models of minds, and ideas about group epistemics and coordination
  • Try to import insights and formats from experiential education
  • Connect people interested in rationality education

We consider the event successful and plan to use the insights gained for creating more content along similar lines.

The remainder of the post will outline our motivation for focusing on these goals, our takeaways, and future plans.

Motivations

Group Rationality

Most of today’s rationality curriculum and research is focused on individual rationality, or ‘how to think clearly (alone)’. The field of group rationality - or ‘how to think well and act as groups of humans’ - is less developed, and open problems are the norm.

[...] I feel like group rationality remains a largely open and empty field. There’s a lot of literature on what we do wrong, but not a lot of ready-made “techniques” just sitting there to help us get it right — only a scattering of disconnected traditions in things like management or family therapy or politics. My sense is that group rationality is in a place similar to where individual rationality was ~15 years ago [...]. (cited from ‘Open Problems In Group Rationality’, by Duncan Sabien)

The central problem is that people use beliefs for many purposes - including tracking what is true. But another, practically important purpose is coordination. We think it’s likely that if an aspiring rationalist decides to “stop bullshitting”, they lose some of the social technology often used for successfully coordinating with other people. How exactly does this dynamic affect coordination? Can we do anything about it?

There are more reasons why we believe progress in the domain of group rationality is needed. There are for example the classical problems in coordination (e.g. Moloch), communication (e.g. common knowledge and miasma), social dynamics (e.g. groupthink, pluralistic ignorance, etc.) and others. In particular, we have little know-how about how to teach the skills that are important for effectively solving any of these group rationality problems (this is also called pedagogical content knowledge, or PCK).

Group rationality and group epistemics are important for all kinds of EA, long-termist and x-risk efforts, possibly more so than in most other domains. It has been observed in the past that topics related to EA are prone to coordination failures. Given the sheer size of these problems, it is unlikely any one of us will solve them alone.

Overall, we are confident that the field of group rationality is both important and neglected.

Experiential Education

Experiential education is an educational tradition with a roughly 40-year history in Czechia. The central idea is that the process of learning often happens through directly experiencing the content - in contrast to other, more passive ‘classroom-style’ approaches to learning. Clearly, different types of content lend themselves more to one or the other pedagogical approach. For example, advanced mathematics is hard to experience directly. Understanding team dynamics or the role of trust in cooperation, on the other hand, can effectively be taught via relevant experiences.

Several members of our team have organized events of this type, some of them targeted to talented high schoolers. Judging by our personal experience, the methodology succeeds in giving people more agency and helps them coordinate in a group context. This is why we were excited to transfer the pedagogical know-how from the experiential education school of thought to teaching group rationality in particular.

Multiagent Models of Mind

We developed a new technique for increasing internal alignment called ‘Internal Communication Framework’ (ICF). [1] On the theoretical side, it is based on multi-agent mind models, predictive processing and game theory (for more theory behind our effort, see Jan's post and Kaj's sequence). On the practical side, it draws from Internal Family Systems (IFS) and Internal Double Crux (IDC).

Our goal for the event was to gather more data and we have anecdotal evidence that the technique is useful for increasing internal alignment in individuals.


Since the event was highly experimental, we did not extend our outreach efforts to a broad audience and instead targeted the event to people who already had prior experience with applied rationality. Our participants were CFAR alumni, members of the Moscow Kocherga hub and other people with a similar background. The motivation behind this was to minimize the risks in cases where our content wouldn’t work. Specifically, we wanted to avoid several failure modes, including idea inoculation, PR risks and destabilizing participants prone to mental health issues. (We also warned everyone that the content was experimental.)

Curriculum design

In creating the activities, we drew a lot of inspiration from the Czech Experiential Learning Centre. Over the years, they have accumulated an impressive amount of knowledge and a library of activities ranging from outdoor games to deep introspection methods. We adjusted some of these for our goals and created a couple of new ones. We also incorporated a few classes which were taught traditionally (i.e. classroom-style) to introduce relevant concepts. Part of the content was created originally by Epistea, based mainly on theoretical reasoning.

After generating a set of individual activities, we arranged them into a 6-day long program while taking into account program coherence, the need to balance mental and physical intensity for participants, the overarching plotline, and other factors.

Examples of activities or techniques used are:

  • Role-playing game focused on the mechanisms of depleting shared resources (tragedy of the commons)
  • Game in which participants work in teams on a range of problem-solving tasks while one or two of them are secret “saboteurs”
  • Trust-o-meter: similarly to CFAR's toolification of accessing implicit representations of surprise (i.e. the surprise-o-meter), we can access implicit representations of various social variables and calibrate them over time
  • "Tea tasting": accessing implicit models (similar to some forms of Focusing)
  • Class on Internal Communication Framework (ICF)
  • Campfire ritual prepared by participants
  • Outdoor games

Overall we consider the workshop successful:

  • Results of the survey filled in by the participants at the end of the event indicate that experiential education may be a good medium to convey group rationality concepts. [2]
    • For example, two experiential games focused on coordination were evaluated by half of the participants as "extremely useful to improve external coordination".
  • We received positive feedback on the Internal Communication Framework.
    • “ICF is a more general and powerful tool than the stuff I was using before (IDC, NVC, CBT) and I think it might help me to make more robust inner peace.”
    • “ICF has succeeded where several similar systems were less successful.”
    • “[ICF] appears to have unblocked me with respect to introspection and system 1.”
  • We deepened connections among the currently somewhat scattered European rationality space.
    • All of the participants reported they would be comfortable reaching out to more people than before the event for help with a rationality-related task. (The imprecise estimate of the average delta is ~8 new people.)
  • The team gained valuable insights into group rationality and how to teach it.
  • Some of the activities from ESE were successfully re-used at ESPR (European Summer Program on Rationality).

We also made mistakes:

  • We created a logistical nightmare for ourselves by running the event back-to-back with the Human-aligned AI Summer School. (This was motivated by the wish to make it easier for people to participate in both events, but was likely not worth it.)
  • We were miscalibrated about people's physical abilities and overestimated e.g. their stamina.
  • Some of the activities were prepared last minute at the expense of the energy of organizers who could otherwise use it to make the event run more smoothly or get more rest.

Though the participants’ feedback was positive, the sample size collected last summer is small. We will continue to experiment with rationality education along similar lines and gather more data. Our future plans include events focused on group rationality, epistemics, ICF and also exchange of ideas among rationality teachers. Further information on these will follow soon. If you want to make sure not to miss information on future events we organize, you can fill in this form.

[1] As the technique and the right method of teaching it are still in development, there isn’t any public write up available.

[2] There is going to be a follow-up survey to check if the participants retained the knowledge and how they evaluate the experience after a longer period of time.


Strategic Frames for LW in 2020

January 24, 2020 - 06:19
Published on January 24, 2020 3:19 AM GMT

When not crunching numbers or reading old books, I try to think about the LessWrong Big Picture. In this post, I share a few of the major lines of thought I've been pursuing.

These are Ruby's frames and don't necessarily represent the views of the rest of the LW team.

LW's Enduring, Compounding Intellectual Tradition

Many places on the Internet offer interesting content people can browse for their enjoyment: FB, Reddit, Twitter, and a thousand blogs and news sites. Most of the content on those sites is ephemeral entertainment; viewed once, and then people move on.

LessWrong aspires to be in a different reference class. LessWrong is trying to build a lasting body of knowledge akin to an academic field or university library. Each post is a brick, each comment some mortar. Hopefully, people are building knowledge on top of their predecessors in ongoing intellectual conversation.

I say enduring because the vision is that content persists and is archived for decades to come.

I say compounding because the vision is that people build upon each other's knowledge.

On the shoulders of giants.

Projects that fit within this frame are pingbacks, tagging, improved search, a wiki, the Library page, the 2018 Review, and the recommendations section. These all make it easier to locate older content of interest and shift focus away from the Latest Posts list.

LW as a Community of Practice / A Place of Effort
  1. Communities of Practice have members with a particular work role or expertise. These communities are focused on developing expertise, skills, and proficiency in the specialty. The motivation is to master the discipline, learn about the specialty, and solve problems together. An example of a role-based community is project management, and an example of an expertise-based community is Microsoft SharePoint.
  2. Communities of Interest are groups of people who want to learn about a particular topic, or who are passionate about one. They make no commitment to deliver something together. The motivation is to stay current on the topic and to be able to ask and answer questions about it. An example is all people who have an interest in photography. - Stan Garfield

The Community of Interest version of LessWrong is a place where we all share common interests in self-improvement, AI, technology, philosophy, etc., and we get together to read and discuss these ideas.

The Community of Practice version is where we're intentionally trying to make progress. Trying to grow as individuals and improve our rationality, trying to grow as a group and collectively generate valuable knowledge.

LessWrong has always been a mix of the two (which it should be!), but the idea is to more firmly establish LessWrong as a place where you can invest effort to get worthwhile outcomes. LessWrong as a gym, dojo, and research institute.

A place to train

Projects that fit within this frame include creating exercises for content (already allowed somewhat), rewards for reading the Sequences and other core content, and providing assistance to meetups where people train.

The Open Questions platform was an attempt to create a place for more high-effort, novel intellectual contributions. It became apparent that on no major existing questions platform do people perform much new research on the questions posed. At best they'll spend several hours writing up their existing knowledge. If we want to achieve that grand vision for Open Questions, we'll probably have to address the overall incentives for doing that kind of work.

Making the Incentives Work

I'm motivated by the goal of creating a LessWrong where people i) level up in rationality, ii) contribute to intellectual progress that matters, or iii) learn or grow in ways that help them do good things.

Naturally, people will engage in productive activities on LessWrong to the extent that the balance of benefits, costs, and risks works out in their minds. People come to LessWrong for multiple reasons beyond my three goals for them: hedonics, entertainment, status, having an audience, social interaction, community belonging, etc.

Within this frame, I try to think about where the current balance of incentives lies and what we might do to shift it so that investing more in LessWrong becomes the correct choice for people. [I'm not interested in trying to trick people into LW counter to their own interests; I'm interested in making LessWrong a compelling top choice for their time and attention.]

Under this frame, I wonder whether people would be more motivated if we had more "major authors" on the site. Should we push harder on that? Or maybe more major authors would come if there was clearly a larger audience (chicken and egg then). Or maybe we need to provide rewards to contributors with prizes and including them in things like books. I worry about the experience of new users and whether they clash with established users who communicate differently.

I have a model that people like to get involved in things that they think are going well. This motivates me to share data on LessWrong's growth, though I try to commit to sharing the data even when the conclusions are inconvenient too.


Related to this frame, I'm very interested in conducting user interviews where I can get an inkling of what motivates (or un-motivates) people from being on LessWrong.

LW as a Recruitment Tool

One of the greatest things that LessWrong has accomplished historically is attracting a bunch of great people into the same community. We could focus on doing this more intentionally.

There's a host of analytics work I intend to do shortly on understanding the funnel: where do people come to LW from, what's their initial experience like, and which of them stick around, read the core content, and start making valuable contributions? Can we model this funnel and then improve it? I think so.
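As a sketch of what modeling that funnel could look like (the stage names and counts below are entirely hypothetical, standing in for the real analytics data):

```python
# Hypothetical funnel stages and counts; real numbers would come from
# site analytics. Each stage records how many users reached it.
funnel = [
    ("visited", 10000),
    ("read a core post", 2000),
    ("created an account", 400),
    ("commented", 120),
    ("posted", 30),
]

def conversion_rates(stages):
    """Stage-to-stage conversion: what fraction of users survives each step."""
    return [
        (b_name, b / a)
        for (a_name, a), (b_name, b) in zip(stages, stages[1:])
    ]

for name, rate in conversion_rates(funnel):
    print(f"{name}: {rate:.0%}")
```

Even this crude model makes the question actionable: the stage with the worst conversion rate is the natural first target for improvement.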

Joining the team!

If we successfully figured out who could ultimately become a major contributor to LessWrong, maybe we could actively seek them out. I dunno, posting to relevant Reddit forums or buying ads on SlateStarCodex.

Those are some frames

The above are a few ways I've been thinking about LessWrong recently. Each suggests that certain projects might be a good idea. I hope this post communicates both how I'm thinking about things and how those thoughts lead to some of the projects we work on.

Thoughts, advice, feedback, recommendations, and objections are all welcome.


2018 Review: Voting Results!

24 January 2020 - 05:00
Published on January 24, 2020 2:00 AM GMT

The votes are in!

59 of the 430 eligible voters participated, evaluating 75 posts. Meanwhile, 39 users submitted a total of 120 reviews, with most posts getting at least one review. 

Thanks a ton to everyone who put in time to think about the posts - nominators, reviewers and voters alike. Several reviews substantially changed my mind about many topics and ideas, and I was quite grateful for the authors participating in the process. I'll mention Zack_M_Davis, Vanessa Kosoy, and Daniel Filan, as great people who wrote the most upvoted reviews.

In the coming months, the LessWrong team will write further analyses of the vote data, and use the information to form a sequence and a book of the best writing on LessWrong from 2018.

Below are the results of the vote, followed by a discussion of how reliable the result is and plans for the future.

Top 15 posts
  1. Embedded Agents
  2. The Rocket Alignment Problem
  3. Local Validity as a Key to Sanity and Civilization
  4. Arguments about fast takeoff
  5. The Costly Coordination Mechanism of Common Knowledge
  6. Toward a New Technical Explanation of Technical Explanation
  7. Anti-social Punishment
  8. The Tails Coming Apart As Metaphor For Life
  9. Babble
  10. The Loudest Alarm Is Probably False
  11. The Intelligent Social Web
  12. Prediction Markets: When Do They Work?
  13. Coherence arguments do not imply goal-directed behavior
  14. Is Science Slowing Down?
  15. Robustness to Scale
Top 15 posts not about AI
  1. Local Validity as a Key to Sanity and Civilization
  2. The Costly Coordination Mechanism of Common Knowledge
  3. Anti-social Punishment
  4. The Tails Coming Apart As Metaphor For Life
  5. Babble
  6. The Loudest Alarm Is Probably False
  7. The Intelligent Social Web
  8. Prediction Markets: When Do They Work?
  9. Is Science Slowing Down?
  10. A voting theory primer for rationalists
  11. Toolbox-thinking and Law-thinking
  12. A Sketch of Good Communication
  13. A LessWrong Crypto Autopsy
  14. Unrolling social metacognition: Three levels of meta are not enough.
  15. Varieties Of Argumentative Experience
Top 10 posts about AI

(The vote included 20 posts about AI.)

  1. Embedded Agents
  2. The Rocket Alignment Problem
  3. Arguments about fast takeoff
  4. Toward a New Technical Explanation of Technical Explanation
  5. Coherence arguments do not imply goal-directed behavior
  6. Robustness to Scale
  7. Paul's research agenda FAQ
  8. An Untrollable Mathematician Illustrated
  9. Specification gaming examples in AI
  10. 2018 AI Alignment Literature Review and Charity Comparison
The Complete Results

Click Here If You Would Like A More Comprehensive Vote Data Spreadsheet

To help users see the spread of the vote data, we've included swarmplot visualizations.

  • For space reasons, only votes with weights between -10 and 16 are plotted. This covers 99.4% of votes.
  • Gridlines are spaced 2 points apart.
  • Concrete illustration: The plot immediately below has 18 votes ranging in strength from -3 to 12.
(Swarmplot visualizations of each post's vote spread are not reproduced here; the outlier notes from the plots are retained below.)

  1. Embedded Agents (209; one outlier vote of +17 not shown)
  2. The Rocket Alignment Problem (183)
  3. Local Validity as a Key to Sanity and Civilization (133)
  4. Arguments about fast takeoff (98)
  5. The Costly Coordination Mechanism of Common Knowledge (95)
  6. Toward a New Technical Explanation of Technical Explanation (91)
  7. Anti-social Punishment (90; one outlier vote of +20 not shown)
  8. The Tails Coming Apart As Metaphor For Life (89)
  9. Babble (85)
  10. The Loudest Alarm Is Probably False (84)
  11. The Intelligent Social Web (79)
  12. Prediction Markets: When Do They Work? (77)
  13. Coherence arguments do not imply goal-directed behavior (76)
  14. Is Science Slowing Down? (75)
  15. Robustness to Scale (74)
  15. A voting theory primer for rationalists (74)
  17. Toolbox-thinking and Law-thinking (73)
  18. A Sketch of Good Communication (72)
  19. A LessWrong Crypto Autopsy (71)
  20. Paul's research agenda FAQ (70)
  21. Unrolling social metacognition: Three levels of meta are not enough. (69)
  22. An Untrollable Mathematician Illustrated (65)
  23. Specification gaming examples in AI (64)
  23. Will AI See Sudden Progress? (64)
  23. Varieties Of Argumentative Experience (64)
  26. Meta-Honesty: Firming Up Honesty Around Its Edge-Cases (62)
  27. My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms (60)
  27. Naming the Nameless (60)
  27. Inadequate Equilibria vs. Governance of the Commons (60)
  30. 2018 AI Alignment Literature Review and Charity Comparison (57)
  31. Noticing the Taste of Lotus (55)
  31. On Doing the Improbable (55)
  31. The Pavlov Strategy (55)
  31. Being a Robust, Coherent Agent (V2) (55)
  35. Spaghetti Towers (54)
  36. Beyond Astronomical Waste (51)
  36. Research: Rescuers during the Holocaust (51)
  38. Open question: are minimal circuits daemon-free? (48)
  38. Decoupling vs Contextualising Norms (48; one outlier vote of +23)
  40. On the Loss and Preservation of Knowledge (47)
  41. Is Clickbait Destroying Our General Intelligence? (46)
  42. What makes people intellectually active? (43)
  43. Why everything might have taken so long (40)
  44. Challenges to Christiano’s capability amplification proposal (39)
  45. Public Positions and Private Guts (38)
  46. Clarifying "AI Alignment" (36)
  46. Expressive Vocabulary (36)
  48. Bottle Caps Aren't Optimisers (34)
  49. Argue Politics* With Your Best Friends (32)
  50. Player vs. Character: A Two-Level Model of Ethics (30)
  51. Conversational Cultures: Combat vs Nurture (V2) (29)
  51. Act of Charity (29)
  53. Optimization Amplifies (27)
  53. Circling (27; one outlier vote of -17)
  55. Realism about rationality (25; two outliers of -30 and +18)
  55. Caring less (25)
  57. Lessons from the Cold War on Information Hazards: Why Internal Communication is Critical (24)
  57. The Bat and Ball Problem Revisited (24)
  59. Argument, intuition, and recursion (21)
  59. Unknown Knowns (21)
  61. Competitive Markets as Distributed Backprop (18)
  62. Towards a New Impact Measure (14)
  62. Explicit and Implicit Communication (14)
  62. On the Chatham House Rule (14)
  62. Historical mathematicians exhibit a birth order effect too (14)
  66. Everything I ever needed to know, I learned from World of Warcraft: Goodhart’s law (13)
  67. The funnel of human experience (11)
  68. Understanding is translation (9)
  69. Preliminary thoughts on moral weight (7)
  70. Metaphilosophical competence can't be disentangled from alignment (3)
  71. Two types of mathematician (2)
  72. How did academia ensure papers were correct in the early 20th Century? (-2)
  73. Birth order effect found in Nobel Laureates in Physics (-5)
  74. Give praise (-10)
  75. Affordance Widths (-14; one outlier of -29)

How reliable is the output of this vote?

For most posts, between 10 and 20 people voted on them (median of 17). A change of 10-15 points in a post's score is enough to move it up or down around 10 positions within the rankings. This is equal to a few moderate-strength votes from two or three people, or an exceedingly strong vote from a single voter. This means that the system is somewhat noisy, though it seems to me very unlikely that posts at the very top could end up placed much differently.

The vote was also affected by two technical mistakes the team made:

  1. The post-order was not randomized. For the first half of the voting period, the posts on the voting page appeared in order of number of nominations (least to most) instead of appearing randomly, thereby giving more visual attention to the first ~15 or so posts (these were posts with 2 nominations). Ruby looked into it and says that 15-30% more people cast votes on these earlier-appearing posts compared to those appearing elsewhere in the list. Credit to gjm for identifying this issue.
  2. Users were given some free negative votes. When calculating the cost of users' votes, we used a simple equation, but missed that it produced an off-by-one error for negative numbers. Essentially, users got a free 1-negative-vote-weight on all the posts to which they had voted on negatively. To correct for this, for those who had exceeded their budget - 18 users in total - we reduced the strength of their negative votes by a single unit, and for those who had not spent all their points their votes were unaffected. This didn't affect the rank-ordering very much, a few posts changed by 1 position, and a smaller number changed by 2-3 positions.
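The second bug is easy to reproduce in a plausible reconstruction (the exact cost formula LessWrong used is my assumption here; quadratic-style costs, where a vote of strength s costs 1 + 2 + ... + |s| points, are standard for this kind of vote):

```python
def buggy_cost(s: int) -> int:
    # Closed form s*(s+1)/2 for the sum 1 + 2 + ... + s. Correct for s >= 0,
    # but for negative s it silently undercharges by one "step": the sign of
    # (s + 1) shifts, so a -3 vote is billed like a -2 vote.
    return s * (s + 1) // 2

def correct_cost(s: int) -> int:
    # The cost should depend only on the vote's magnitude.
    return abs(s) * (abs(s) + 1) // 2

# A -3 vote should cost the same as a +3 vote (6 points), but the buggy
# formula charges only 3 -- the price of a strength-2 vote, i.e. one free
# negative vote-weight, matching the fix of reducing each over-budget
# negative vote by a single unit.
print(buggy_cost(3), buggy_cost(-3), correct_cost(-3))
```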

The effect size of these errors is not certain since it's hard to know how people would have voted counterfactually. My sense is that the effect is pretty small, and that the majority of noise in the system comes from elsewhere.

Finally, we discarded the ballot of one user, who spent 10,000 points on voting instead of the allotted 500.

Overall, I think the vote is a good indicator to about 10 places within the rankings, but, for example, I wouldn't agonise over whether a post is at position #42 vs #43.

Future Years

This has been the first LessWrong Annual Review. This project was started with the vision of creating a piece of infrastructure that would

  1. Create common knowledge about how the LessWrong community feels about various posts and topics and the progress we've made.
  2. Improve our longterm incentives, feedback, and rewards for authors.
  3. Help create a highly curated "Best of 2018" Sequence and Book.

The vote reveals much disagreement between LessWrongers. Every post received at least five positive votes, and every post received at least one negative vote – except for An Untrollable Mathematician Illustrated by Abram Demski, which was evidently just too likeable – and many people had strongly different feelings about many posts. Many of these disagreements seem more interesting to me than the specific ranking of a given post.

In total, users wrote 207 nominations and 120 reviews, and many authors updated their posts with new thinking, or clearer explanations, showing that both readers and authors reflected a lot (and I think changed their mind a lot) during the review period. I think all of this is great, and like the idea of us having a Schelling time in the year for this sort of thinking.

Speaking for myself, this has been a fascinating and successful experiment - I've learned a lot. My thanks to Ray for pushing me and the rest of the team to actually do it this year, in a move-fast-and-break-things kind of way. The team will be conducting a Review of the Review where we take stock of what happened, discuss the value and costs of the Review process, and think about how to make the review process more effective and efficient in future years.

In the coming months, the LessWrong team will write further analyses of the vote data, award prizes to authors and reviewers, and use the vote to help design a sequence and a book of the best writing on LW from 2018. 

I think it's awesome that we can do things like this, and I was honestly surprised by the level of community participation. Thanks to everyone who helped out in the LessWrong 2018 Review - everyone who nominated, reviewed, voted and wrote the posts.


Improving Group Decision Making

24 January 2020 - 04:29
Published on January 24, 2020 1:29 AM GMT

This was a talk at Effective Altruism Global (London 2019) by Mahendra Prasad. It's a non-technical introduction to the ideas in his working paper I previously posted. It also covers some additional ground. For instance, from Republican polling data, we can see the difference voting methods can make with Trump in first place in plurality voting vs last place with approval voting. Towards the end, there's also a sketch of a research program applying algorithmic game theory to better understand and improve the background conditions of multiagent systems like deliberative democracy.
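The plurality-vs-approval gap can be shown with a toy election (the ballots below are made up for illustration, not drawn from the polling data the talk cites): a polarizing candidate can lead on first choices while being approved by the fewest voters.

```python
from collections import Counter

# Each ballot: (first_choice, set_of_approved_candidates).
# Made-up ballots: a cohesive bloc backs only Trump, while the remaining
# voters split their first choices but approve of each other's candidates.
ballots = [
    ("Trump",  {"Trump"}),
    ("Trump",  {"Trump"}),
    ("Rubio",  {"Rubio", "Kasich", "Cruz"}),
    ("Kasich", {"Kasich", "Rubio", "Cruz"}),
    ("Cruz",   {"Cruz", "Rubio", "Kasich"}),
]

plurality = Counter(first for first, _ in ballots)
approval = Counter(c for _, approved in ballots for c in approved)

print(plurality.most_common())  # Trump leads on first choices
print(approval.most_common())   # but has the fewest approvals
```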


Emergency Prescription Medication

24 January 2020 - 04:20
Published on January 24, 2020 1:20 AM GMT

In the comments on yesterday's post on planning for disasters people brought up the situation of medications. As with many things in how the US handles healthcare and drugs, this is a mess.

The official recommendation is to prepare emergency supply kits for your home and work that contain:

At least a week-long supply of prescription medicines, along with a list of all medications, dosage, and any allergies

Running out of some medications can kill you: running out of blood pressure medication (ex: clonidine, propranolol) risks strokes or heart attacks, running out of anxiety medication (specifically, benzodiazepines) risks seizures, running out of insulin risks a diabetic coma. For medications like these, a week's worth seems low to me, since the harm of not having them is very high and maintaining extra that you rotate through should be low cost.

Should be low cost, but is it? If I decide I want to stock an extra month's worth of non-perishable food and rotate through it this is just bringing an expense forward a month, and is relatively cheap. But that's not how it works with medication.

Let's say I go to my doctor and ask for an extra month's worth of my medication to keep on hand for emergencies, and they are willing to write a prescription. My insurance company isn't required to cover backup medication, so they don't, which means I'd need to pay the sticker price.

Now, the US health insurance system is a mess, and part of how it's a mess is that it's mostly not insurance. In the case of prescription drugs it is more of a buyers club. While an individual is in a poor position to negotiate with a drug company, an insurance company can often use its large membership to get lower rates. Many drugs are far more expensive when bought individually than when bought with insurance, so it's likely that my extra month's worth of medication would cost me much more than it would cost my insurance company. And that's in addition to my insurance not helping me pay for it!

This is also assuming that my doctor is willing to write the prescription. If I'm on benzodiazepines, which are a controlled substance, my doctor would probably get in trouble for writing that prescription. Legally, there's nothing I can do except get every refill on the first day it's available.

If you read patient discussions you see some strategies that are probably not legal:

  • Refilling slightly early each month and building up a surplus (doesn't work with controlled medications).

  • Getting the doctor to prescribe a higher dose, but continuing to take the lower dose and saving the difference.

  • Skipping some days or dividing a dose to build up some slack.

  • Buying the medication from sketchy foreign websites.

The medical community is aware of this problem (ex) but policy here, especially for controlled substances, is pretty limiting.

It's hard for me to think of solutions here when the drug policy I'd advocate is substantially less restrictive than the status quo, but for people who have a decent chance of dying if their medicine is interrupted the current approach is clearly not reasonable.

What do other countries do?

Comment via: facebook


LW/SSC Warsaw February Meetup

23 January 2020 - 22:49
Published on January 23, 2020 7:49 PM UTC

A meetup for people interested in improving their reasoning and decision-making skills, technology, the long-term future, and philosophy. You don't have to know what LessWrong or Slate Star Codex are, but it definitely helps.

Here's a list of discussion prompts for you to use or ignore:
- Identifying our mental blind spots
- Noticing systematic biases in our behaviour and ways to fix them
- How to evaluate contrarian ideas/when is it better to trust consensus and tradition
- Recent progress in AI, ML, or science and technology in general
- Future of work given AI progress
- AI safety and ethics
- Effective learning techniques
- Designing better institutions
- Your favourite philosopher/thinker and how they relate to any of those issues
- Interesting things you have learned recently and want to share

Discussions will be held in English, unless everyone present is comfortable with Polish.

This time we are meeting in "Piętro Niżej", a craft beer bar with Polish cuisine. It is located in the basement of the "PAST" building on Zielna 39, just next to the entrance to Świętokrzyska metro station.

There are 4 tables reserved for us this time (so switching subgroups and topics will be easier), with 16 spots in total - more people could likely be accommodated, but I can't guarantee more free tables/chairs. Due to the group reservation, be prepared for a 10% menu price hike. More about the venue: http://notabene.webd.pl/ntbn2/


Theory of Causal Models with Dynamic Structure?

23 January 2020 - 22:47
Published on January 23, 2020 7:47 PM UTC

I'm looking for any work done involving causal models with dynamic structure, i.e. some variables in the model determine the structure of other parts of the model.

I know some probabilistic programming languages support dynamic structure (e.g. the pyro docs mention it directly at one point). And of course one can always just embed the dynamic structure in a static structure (i.e. a model of any general-purpose computer), although that's messy enough that I'd expect it to create other problems.
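To make the question concrete, here is a minimal sketch of what "dynamic structure" means, with plain Python standing in for a probabilistic program (all variable names are illustrative):

```python
import random

def model(rng: random.Random) -> dict:
    """One execution of a model whose graph depends on a sampled variable."""
    trace = {}
    # Structural variable: n determines how many causes x_0..x_{n-1} exist.
    n = rng.randint(1, 3)
    trace["n"] = n
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for i, x in enumerate(xs):
        trace[f"x_{i}"] = x
    # y's parent set (and in-degree) therefore varies between executions:
    # the edges x_i -> y only exist when x_i was instantiated.
    trace["y"] = sum(xs) + rng.gauss(0.0, 0.1)
    return trace

# Different runs instantiate different graphs over {n, x_0, ..., y}.
for seed in (0, 1, 2):
    print(sorted(model(random.Random(seed))))
```

A static-structure encoding would instead fix three x variables up front and use n only to mask some of them out, which is exactly the messy embedding mentioned above.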

I haven't found much by quick googling (too many irrelevant things use similar terms) so I'd appreciate any pointers at all. At this point I've found basically-zero directly relevant work other than PPLs, and I don't even know of any standard notation.