Simple Rules of Law

Published on May 19, 2019 12:10 AM UTC

Response To: Who Likes Simple Rules?

Epistemic Status: Working through examples with varying degrees of confidence, to help us be concrete and eventually generalize.

Robin Hanson has, in his words, “some puzzles” that I will be analyzing. I’ve added letters for reference.

  • A] People are often okay with having either policy A or policy B adopted as the standard policy for all cases. But then they object greatly to a policy of randomly picking A or B in particular cases in order to find out which one works better, and then adopt it for everyone.
  • B] People don’t like speed and red-light cameras; they prefer human cops who will use discretion. On average people don’t think that speeding enforcement discretion will be used to benefit society, but 3 out of 4 expect that it will benefit them personally. More generally people seem to like a crime law system where at least a dozen different people are authorized to in effect pardon any given person accused of any given crime; most people expect to benefit personally from such discretion.
  • C] In many European nations citizens send their tax info into the government who then tells them how much tax they owe. But in the US and many other nations, too many people oppose this policy. The most vocal opponents think they benefit personally from being able to pay less than what the government would say they owe.
  • D] The British National Health Service gets a lot of criticism from choosing treatments by estimating their cost per quality-adjusted-life-year. US folks wouldn’t tolerate such a policy. Critics lobbying to get exceptional treatment say things like “one cannot assume that someone who is wheel-chair bound cannot live as or more happily. … [set] both false limits on healthcare and reducing freedom of choice. … reflects an overly utilitarian approach”
  • E] There’s long been opposition to using an official value of life parameter in deciding government policies. Juries have also severely punished firms for using such parameters to make firm decisions.
  • F] In academic departments like mine, we tell new professors that to get tenure they need to publish enough papers in good journals. But we refuse to say how many is enough or which journals count as how good. We’d keep the flexibility to make whatever decision we want at the last minute.
  • G] People who hire lawyers rarely know their track record at winning vs. losing court cases. The info is public, but so few are interested that it is rarely collected or consulted. People who hire do know the prestige of their schools and employers, and decide based on that.
  • H] When government leases its land to private parties, sometimes it uses centralized, formal mechanisms, like auctions, and sometimes it uses decentralized and informal mechanisms. People seem to intuitively prefer the latter sort of mechanism, even though the former seems to work better. In one study “auctioned leases generate 67% larger up-front payments … [and were] 44% more productive”.
  • I] People consistently invest in managed investment funds, which after the management fee consistently return less than index funds, which follow a simple clear rule. Investors seem to enjoy bragging about personal connections to people running prestigious investment funds.
  • J] When firms go public via an IPO, they typically pay a bank 7% of their value to manage the process, which is supposedly spent on lobbying others to buy. Google famously used an auction to cut that fee, but banks have succeeded in squashing that rebellion. When firms try to sell themselves to other firms to acquire, they typically pay 10% if they are priced at less than $1M, 6-8% if priced $10-30M, and 2-4% if priced over $100M.
  • K] Most elite colleges decide who to admit via opaque and frequently changing criteria, criteria which allow much discretion by admissions personnel, and criteria about which some communities learn much more than others. Many elites learn to game such systems to give their kids big advantages. While some complain, the system seems stable.
  • L] In a Twitter poll, the main complaints about my fire-the-CEO decision markets proposal are that they don’t want a simple clear mechanical process to fire CEOs, and they don’t want to explicitly say that the firm makes such choices in order to maximize profits. They instead want some people to have discretion on CEO firing, and they want firm goals to be implicit and ambiguous.

Some of these examples don’t require explanation beyond why they are bad ideas for rules. They wouldn’t work. Others would, or are even obviously correct, and require more explanation. I do think there is enough of a pattern to be worth trying to explain. People reliably dislike the rule of law, and prefer to substitute the rule of man.

Why do people oppose the rule of law?

You can read his analysis in the original post. His main diagnosis is that people promote discretion for two reasons:

  1. They believe they are unusually smart, attractive, charismatic and well-liked, just the sort of people who tend to be favored by informal discretion.
  2. They want to signal to others that they have confidence in elites who have discretion, and that they expect to benefit from that discretion.

While I do think these are related to some of the key reasons, I do not think these point at the central things going on. Below I tackle all of these cases and their specifics. Overall I think the following are the five key stories:

  1. Goodhart’s Law. We see a lot of Regressional and Adversarial Goodhart, and also some Extremal and Causal as well. B, F, G, K and L all have large issues here.
  2. Copenhagen Interpretation of Ethics. If you put a consideration explicitly into your rule, that makes you blameworthy for interacting with it. And you’re providing all the information others need to scapegoat you not only for what you do, but what you would do in other scenarios, or why you are doing all of this. A, B, D, E, F, H and K all have to worry about this.
  3. Forbidden Considerations and Lawsuit Protection. We can take the danger of transparency a step further. There are things that one wants to take into consideration that are illegal to consider, or which you are considered blameworthy for considering. The number of ways one can be sued or boycotted for their decision process goes up every year. You need a complex system in order to disguise hidden factors, and you certainly can’t put them into explicit rules. And in general, the less information others have, the less material those who don’t like your decision have to justify a lawsuit or make other trouble. B, D, E, F, K and L have this in play.
  4. Power. Discretion and complexity favor those with power. If you have power, others worry that anything they do can influence decisions, granting you further power over everything in a vicious cycle. If you use a simple rule, you give up your power, including over unrelated elements. Even if you don’t have power directly over a decision, anyone with power anywhere can threaten to use that power whenever any discretion is available, so everyone with power by default favors complex discretionary rules everywhere. This shows up pretty much everywhere other than A, but is especially important in B, F, H, K and L.
  5. Theft. Complex rules are often nothing more than a method of expropriation. This is especially central in B, C, D, H, I and J.

I’ll now work through the examples. I’ll start with C, since it seems out of place, then go in Robin’s order.

Long post is long. If a case doesn’t interest you, please do skip it, and it’s fine to stop here.

C] Tax Returns

The odd policy out here is C, the failure to have the government tell citizens what it already knows about citizens’ tax returns. This results in many lost hours searching down records, and much money lost paying for tax preparation. As a commenter points out, the argument that ‘this allows one to pay less than they owe’ doesn’t actually make sense as an explanation. The government still knows what it knows, and still cross-checks that against what you say and pay. In other countries one can still choose to adjust the numbers shared with you by the government.

In theory, one could make a case similar to those I’ll make in other places, that telling people what information the government knows and doesn’t know allows people to hide anything that the government doesn’t know about. But that seems quite minor.

What’s going on here is simple regulatory capture, corruption, rent seeking and criminal theft. Robin’s link explains this explicitly. Tax preparation corporations like H&R Block are the primary drivers here, because the rules generate more business for them. There is also a secondary problem that fanatical anti-tax conservatives like that taxes annoy people.

But I’ve never heard of a regular person who thinks this policy is a good idea, and I never expect to find one. We’re not this crazy. We have a dysfunctional government.

A] Random Experiments

Robin’s explanations don’t fit case A. If you’re choosing randomly, no one can benefit from discretion. If you choose the same thing for everyone, again no one can benefit from discretion. If anything, the random system allows participants to potentially cheat or find a way around a selection they dislike, whereas a universal system makes this harder. Other things must be at work here.

This is the opposite of C, a case where people do oppose the change, but the change would be obviously good.

I call this the “too important to know” problem.

To me this is a clear case of The Copenhagen Interpretation of Ethics and Asymmetric Justice interacting with sacred values.

An experiment interacts with the problem and in particular interacts with every subject of the experiment, and with every potential intervention, in a way sufficient to render you blameworthy for not doing more, or not doing the optimal thing.

The contrast between the two cases is clear.

Without an experiment, we’re forced to make a choice between options A and B. People mostly accept that one must do their best, and potentially sacred values are up against other potentially sacred values, and one must guess and try their best.

In the cases in the study, it’s even more extreme. We’re choosing to implement A or implement B, in a place where normally one would do nothing. So we’re comparing doing something about the situation to doing nothing. It’s no surprise that ‘try to reduce infections’ comes out looking good.

With an experiment, the choice is between experimentation and non-experimentation. You are choosing to prioritize information over all the sacred values the non-objectionable choices are trading off. Even if the two choices are fully non-objectionable, choosing between them still means placing the need to gather information over the needs of the people in the experiment. 

The needs of specific people are, everywhere and always, a sacred value. Someone, a particular real person, is on the line. When put up against “information” what chance does this amorphous information have?

Copenhagen explains why it is pointless to say that the experiment is better for these patients than not running one. Asymmetric Justice explains why the benefits to future patients don’t make up for it.

There are other reasons, too.

People don’t like their fates coming down to coin flips. They don’t like uncertainty.

People don’t like asymmetry or inequality – if I get A and you get B, someone got the better deal, and that’s not fair.

If you choose a particular action, that provides evidence that there was a reason to choose it. So people instinctively adjust some for the fact that it was chosen. Whereas in an experiment, it’s clear you don’t know which choice is better (unless you do know and are simply out to prove it, in which case you are a monster). That doesn’t inspire confidence.

A final note is that if you look at the study in question, it suggests another important element. If you choose A, you’re blameworthy for A and for ~B, but you’re certainly not blameworthy for ~A or for B! Whereas if you choose (50% A, 50% B) then you are blameworthy for A, ~A, B and ~B, plus experimentation in general. That’s a lot of blame.

Remember Asymmetric Justice. If any element of what you do is objectionable, everything you do, together, is also objectionable. A single ‘problematic’ element ruins all.

So if we look at Figure 1 in the study, we see in case C that the objection score for the A/B test is actually below what we’d expect if we thought the chances of objecting to A and B were independent, and people were objecting to the experiment whenever they disliked either A or B (or both). In cases B and D, we see only a small additional rate of objection. It’s only in case A that we see substantial additional objection. Across the data given, it looks like this phenomenon explains about half of the increased rate of objection to the experiments.
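The independence baseline in that comparison can be made concrete. What follows is a minimal sketch with made-up objection rates, since the study's actual figures are not reproduced here; the function names are invented for illustration. If people object to the experiment exactly when they dislike A or B (or both), and those dislikes are independent, the expected objection rate is one minus the chance of disliking neither.

```python
def independent_baseline(p_a, p_b):
    """Expected share objecting to at least one of A and B,
    assuming objections to A and to B are independent."""
    return 1 - (1 - p_a) * (1 - p_b)


def excess_objection(p_a, p_b, p_experiment):
    """Objection to the A/B test beyond what disliking A or B
    alone would predict. Negative means below the baseline."""
    return p_experiment - independent_baseline(p_a, p_b)


# Hypothetical rates, NOT taken from the study.
baseline = independent_baseline(0.10, 0.20)
excess = excess_objection(0.10, 0.20, p_experiment=0.40)

print(f"baseline objection: {baseline:.2f}")
print(f"excess objection:   {excess:.2f}")
```

Only the excess term is evidence of objection to experimentation as such; the baseline is what Asymmetric Justice alone would predict.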

It also looks like a lot of people explicitly cited things like ‘playing with people’s lives’ via experiment, and object to experimentation as such at least when the stakes are high.

B] Police Discretion

I do not think Robin’s story of expectation of personal benefit is the central story here, either. The correlation isn’t even that high in his poll.

If police have discretion IN GENERAL regarding who they arrest, do you think they will on average use that discretion to arrest those who actually do more net social harm? Do you think that you will tend to be favored by this discretion?

  • 49% Yes to both
  • 13% No to both
  • 10% Yes re net harm, No re me
  • 28% No re net harm, Yes re me

Robin Hanson (@robinhanson), May 6, 2019

If you think net harm is reduced (59%), you’re (49/59) 83% to think you’ll benefit. If you think net harm is not reduced (41%), you are (28/41) 68% to think you’ll benefit. Given that you’d expect models to give correlated answers to the two questions – if discretion is used wisely, it should tend to benefit both most people and a typical civilian looking to avoid doing social harm, and these are Robin’s Twitter followers – I don’t think personal motivation is explaining much variance here.
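The conditional percentages in this paragraph are simple ratios of the poll shares. Here is a minimal sketch, with the four shares copied from the poll; the dictionary keys and the function name are invented for illustration.

```python
# Poll shares in percent, keyed by (answer on net harm, answer on "me").
poll = {
    ("harm_yes", "me_yes"): 49,
    ("harm_no", "me_no"): 13,
    ("harm_yes", "me_no"): 10,
    ("harm_no", "me_yes"): 28,
}


def share_expecting_benefit(harm_answer):
    """Among respondents giving `harm_answer` on net harm,
    the fraction who also expect to be personally favored."""
    total = sum(v for (harm, _), v in poll.items() if harm == harm_answer)
    return poll[(harm_answer, "me_yes")] / total


yes_group = share_expecting_benefit("harm_yes")  # 49 / 59
no_group = share_expecting_benefit("harm_no")    # 28 / 41

print(f"{yes_group:.1%} vs {no_group:.1%}")
```

The gap between the two groups is what any "personal benefit" story has to explain, and part of that gap is already expected from the two answers being correlated.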

The question is also asking about only part of the picture. Yes, we would hope that police (and prosecutors and others in the system) would use discretion, at least in part, to arrest those who do more net social harm over those who do less net social harm.

But that’s far from the only goal of criminal justice, or of punishment. We would also hope that authorities would use discretion to accomplish other goals, as well.

Some of these goals are good. Others, not so much.

What are some other goals we might have? What are other reasons to use discretion?

I think there are five broad reasons, beyond using it to judge social harm.

  1. Discretion gives law enforcement authorities power. This power allows them to maintain and exert authority. It keeps the system running, ensures cooperation and allows them to solve cases and enforce the law.
  2. Discretion gives law enforcement authorities power and status. This power and status is a large part of the compensation we give to law enforcement.
  3. Discretion allows those with status and/or power to gain additional benefits from that status and/or power.
  4. Discretion allows us to enforce rules without having to state those rules explicitly, be able to well-specify those rules, or specify them in advance.
  5. Discretion guards against Goodhart’s Law and the gaming of the system.

To this we would add Robin’s explanations, that one might want to benefit from this directly, and/or one might want to signal support for such authorities. And the more discretion they have, the more one would want to signal one’s support – see reason 1.

A case worth grappling with can be made for or against each of these five justifications being net good. One could argue in favor with arguments like these (I am not endorsing these, nor do they necessarily argue for today’s American level of discretion):

  1. No one would cooperate with authorities. Every potential witness will stonewall, every defendant will fight to the end. Intimidation of witnesses would be impossible to stop. Good luck getting anyone even associated with someone doing anything shady to ever talk to the police. Cases wouldn’t get solved, enforcement would break down. Criminals would use Blackmail to take control of people and put them outside the law. Authorities’ hands would be tied and we’d be living in Batman’s Gotham.
  2. In many ways, being a cop is a pretty terrible job. Who wants to constantly have hostile and potentially dangerous interactions with criminals? Others having to ‘respect my authoritah’ improves those interactions, and goes a long way in making up for things on many fronts.
  3. Increasing returns for status and power increase the power of incentives. We want people to do things we approve of, and strive to gain society’s approval, so we need levers to reward those who do, and punish those who don’t. We want to reward those who walk the straight and narrow track, and make good, and also those who strive and achieve. This has to balance the lure of anti-social or criminal activity, especially in poorer communities. And it has to balance the lure of inactivity, as we provide increasingly livable baskets of goods to those who opt out of status and prestige fights to play video games.
  4. If we state our rules explicitly, we are now blameworthy for every possible corner case in which those rules result in something perverse, and for every motivation or factor we state. Given that we can be condemned for each of these via Asymmetric Justice, this isn’t an option. We also need to be able to implement systems without being able to specify them and their implications in an unambiguous way. If we did go by the letter of the law, then we’re at the mercy of the mistakes of the legislature, and the law would get even more complex than it is as we attempt to codify every case, beyond the reach of any person to know how it works. Humans are much better at processing other types of complex systems. And we need ways for local areas to enforce norms and do locally appropriate things, without having their own legislatures. If the system couldn’t correct obvious errors, both Type I and Type II, then it would lose the public trust. And so on.
  5. If people know exactly what is and isn’t illegal in every case, and what the punishments are, then it will be open season for anything you didn’t make explicitly illegal. People will find technically permitted ways to do all sorts of bad things. They’ll optimize for all the wrong things. Even when they don’t, they’ll do the maximum amount of every bad thing that your system doesn’t punish enough to stop them from doing, and justify it by saying ‘it was legal’. Your laws won’t accomplish what they set out to accomplish. Any complex system dealing with the messiness of the real world needs some way to enforce the ‘spirit of the law.’ One would be shocked if extensive gaming of the system and epic Goodhart failures didn’t result from a lack of discretion.

Or, to take the opposite stances:

  1. Power corrupts. It is bad and should be minimized. The system will use its power to protect and gather more power for itself, and rule for the benefit of its members and those pulling their strings, not you. If we didn’t have discretion, we could minimize the number of laws and everyone wouldn’t need to go around afraid of authority. And people would see that authority was legitimate, and would be inclined to cooperate.
  2. Power not only corrupts, it also attracts the corrupt. If being a cop lets you help people and stand for what’s right, and gives you a good and steady paycheck, you’ll get good people. If being a cop is an otherwise crappy job where you get to force people to ‘respect my authoritah’ and make bank with corrupt deals and spurious overtime, you’ll get exactly the people you don’t want, with exactly the authority you don’t want them to have.
  3. The last thing we want is to take the powerful (and hence corrupt) and give them even more discretion and power, and less accountability. The inevitable result is increasing amounts of scapegoating and expropriation. This is how the little guy, and the one minding their own business trying to produce, get crushed by corporations and petty tyrants of all sorts.
  4. If you can’t state what your rules are, how am I supposed to follow them? How can it be fair to punish me for violating rules you won’t tell me? This isn’t rules at all, or rule of law. It’s rule of man, rule of power begetting power. The system is increasingly rigged, and you’re destroying the records to prevent anyone from knowing that you’re rigging it.
  5. The things you’re trying to protect from being gamed are power and the powerful, which are what are going to determine what the ‘spirit of the rules’ is going to be. So the spirit becomes whatever is good for them. The point of rule of law, of particular well-specified rules, is to avoid this. Those rules are not supposed to be the only thing one cares about. Rather, the law is supposed to guard against specific bad things, and other mechanisms are allowed to take care of the rest. If the law is resulting in Goodhart issues, it’s a bad law and you should change it.

The best arguments I know about against discretion have nothing to do with the social harm caused by punished actions. They are arguments for rule of law, and to guard against what those with discretion will do with that power. These effects are rather important and problematic even when the system is working as designed.

The best arguments I know about in favor of discretion also have nothing to do with the social harm caused by punished actions. They have to do with the system depending on discretion in order to be able to function, and in order to ensure cooperation. A system without discretion by default makes the spread of any local information everyone’s enemy, and provides no leverage to overcome that. If we didn’t have discretion, we would have to radically re-examine all of our laws and our entire system of enforcement, lest everything fall apart.

My model says that we currently give authorities too much discretion, and (partly) as a result have punishments that are too harsh. And also that the authorities have so much discretion partly because punishments have been made too harsh. Since discretion and large punishments give those with power more power, it would be surprising if this were not the case.

D] National Health Service

The National Health Service gets criticized constantly because it is its job to deny people health care. There is not enough money to provide what we would think of as an acceptable level of care under all circumstances, because our concept of acceptable level of care is all of the health care. In such a circumstance, there isn’t much it could do.

Using deterministic rules based on numbers is the obviously correct way to ration care. Using human discretion in each case will mean either always giving out care, since the choice is between care or no care – which is a lot of why health care costs are so high and going higher – or not always giving out care when able to do so, which will have people screaming unusually literal bloody murder.

Deterministic rules let individuals avoid blame, and allow health care budgets to be used at all. But that doesn’t mean people are going to like it. If anything, they’re going to be mad about both the rules and the fact that they don’t have a human they can either blame or try to leverage to get what they want. There’s also the issue of putting a value on human life at all, which is bad enough but clearly unavoidable.

More than that, once you explicitly say what you value by putting numbers on lives and improvements in quality of life, you’re doing something both completely necessary and completely unacceptable. The example of someone in a wheelchair is pretty great. If you don’t provide some discount in value of quality of life for physical disability, then you are saying that physical disabilities don’t decrease quality of life. Which has pretty terrible implications for a health care system trying to prevent physical disabilities. If you do say they decrease quality of life, you’re saying people with disabilities have less value. There are tons of places like this.
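The wheelchair dilemma falls straight out of the arithmetic of quality-adjusted life years. Below is a minimal sketch under simplifying assumptions: a single constant quality weight per health state, no discounting, and a purely hypothetical weight of 0.8 for the disabled state. The function names are invented for illustration and this is not a description of how the NHS actually computes anything.

```python
def qalys_from_extra_years(years, quality_weight):
    """QALYs credited for extending a life by `years`
    lived at the given quality weight (1.0 = full health)."""
    return years * quality_weight


def qalys_from_preventing_disability(years, disabled_weight, healthy_weight=1.0):
    """QALYs credited for keeping someone at `healthy_weight`
    instead of `disabled_weight` for `years`."""
    return years * (healthy_weight - disabled_weight)


# Hypothetical weights, chosen only to show the two horns of the dilemma.
for weight in (1.0, 0.8):
    extend = qalys_from_extra_years(10, weight)
    prevent = qalys_from_preventing_disability(10, weight)
    print(f"weight={weight}: extend life -> {extend:.1f} QALYs, "
          f"prevent disability -> {prevent:.1f} QALYs")
```

With a weight of 1.0, preventing the disability is worth nothing to the formula. With any weight below 1.0, ten extra years for the disabled patient count for less than ten extra years for anyone else. No choice of number avoids both.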

Another way to view this is that the only way to make health care decisions that ration care, or otherwise sacrifice sacred values to stay on budget, without blame is to have all those decisions be seen as out of your control and not your choice. The only known way to do that is to have a system in place, and point to that. That system then becomes a way to not interact with the system, avoiding blame. Whereas proposing or considering any other system involves interaction, and thus blame.

E] Value of Life

If you are caught making a trade-off between a sacred value (life) and a non-sacred value (money), it’s not going to go well. Of course a company doing an explicit calculation here is going to get punished, as is a government policy making an explicit comparison. Humans don’t care about the transitive property.

Thus, firms and governments, who obviously need to value risk to human life at a high but finite numerical cost, will need to do this without writing the number down explicitly in any way. This is one of the more silly things one cannot consider, that one obviously must consider. In a world where we are blameworthy (to the point of being sued for massive amounts) for doing explicit calculations that acknowledge trade-offs or important facts about the world, firms and governments are forced to make their decisions in increasingly opaque ways. One of those opaque preferences will be to favor those who rely on opaqueness and destroy records, and to get rid of anyone who is transparent about their thinking or otherwise, and keeps accurate records.

F] Tenure Requirements

Tenure is about evaluating what a potential professor would bring to the university. No matter to what extent politics gets involved, this is someone you’ll have to work with for decades. After this, rule of law does attach. You won’t be able to fire them afterwards unless they violate one of a few well-defined rules – or at least, that’s how it’s supposed to work, to protect academic freedom, whether or not it does work that way. You’ll be counting on them to choose and do research, pursue funding, teach and advise, and help run the school, and be playing politics with them.

That’s a big commitment. There are lots more people who want it and are qualified on paper than there are slots we can fund. And there are a lot more things that matter than how much research one can do. Some of them are things that are illegal to consider, or would look bad if you were found to be considering them. Others simply are not research done. You can’t use a formula, because people bring unique strengths and weaknesses, and you’re facing other employers who consider these factors. Even if a simple system could afford to mostly ‘take its licks’ you would face massive adverse selection, as everyone with bad intangibles would knock at your door.

You need to hold power over the new employees, so they’ll do the work that tenured employees don’t want to do, and so they’ll care about all aspects of their job, rather than doing the bare technical minimum everywhere but research.

Then there are the Goodhart factors on the papers directly. One must consider how the publications themselves would be gamed. If there were a threshold requirement for journal quality, the easiest journals that count would be the only place anything would be published. If you have a point system, they’d game that system, and spend considerable time doing it. If you don’t evaluate paper quality or value, they won’t care at all about those factors, focusing purely on being good enough to make it into a qualifying journal. Plus, being able to evaluate these questions yourself without an outside guide or authority will be part of the job you’re trying to get. We need to test that, too.

What you’re really testing for when you consider tenure, ideally, is not only skill but also virtue. You want someone who is naturally driven to scholarship and the academy, to drive forward towards important things. While also caring enough to do a passable job with other factors. Otherwise, once they can’t be fired, you won’t be able to get them to do anything. Testing for virtue isn’t something you can quantify. You want someone who will aim for the spirit rather than the letter, and who knows what the spirit is and cares about it intrinsically. If you judge by the letter, you’ll select for the opposite, and if you specify that explicitly, you’ll lose your signal that way as well.

I’d chalk this one up to power and exploitation of those lower on the totem pole, the need to test for factors that you can’t say out loud, the need to test for virtue, and the need to test for knowing what is valuable.

G] A Good Lawyer

People rightly don’t think this number will tell us much, even now, when no one is gaming it and it is not yet under Goodhart pressure. Robin seems to be assuming that one should think that a previous win percentage should be predictive of a lawyer’s ability to win a particular case, rather than being primarily a selection effect, or a function of when they settle cases.

I doubt this is the case, even with a relatively low level of adversarial Goodhart effects.

Most lawyers or at least their firms have great flexibility in what cases they pursue and accept. They also have broad flexibility in how and when they settle those cases, as clients largely rely on lawyers to tell them when to settle. Some of them will mostly want cases that are easy wins, and settle cases that likely lose. Others, probably better lawyers for winning difficult cases, will take on more difficult cases and be willing to roll the dice rather than settle them.
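A toy calculation shows how case selection alone can swamp skill in a win rate. This is a minimal sketch with invented numbers: each case has a baseline win probability for an average lawyer, and skill is modeled, purely as an assumption, as a flat additive bonus to that probability.

```python
def expected_win_rate(baseline_win_probs, skill_bonus):
    """Average win probability over a caseload.

    baseline_win_probs: chance an average lawyer wins each case.
    skill_bonus: hypothetical additive edge this lawyer brings, capped at 1.0.
    """
    probs = [min(1.0, p + skill_bonus) for p in baseline_win_probs]
    return sum(probs) / len(probs)


# Hypothetical caseloads, not real data.
# An average lawyer who accepts only easy cases and settles the rest.
cherry_picker = expected_win_rate([0.90, 0.85, 0.80], skill_bonus=0.0)
# A better lawyer who takes hard cases to verdict.
hard_case_specialist = expected_win_rate([0.40, 0.50, 0.30], skill_bonus=0.15)

print(f"cherry picker:        {cherry_picker:.2f}")
print(f"hard-case specialist: {hard_case_specialist:.2f}")
```

In this setup the more skilled lawyer has the worse record, so ranking by win rate picks out caseload choice rather than ability.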

I don’t even know what counts as a ‘win’ in a legal proceeding. In a civil case you strategically choose what to ask for, which might have little relation to realistic expectations for a verdict, so getting a lesser amount might or might not be a ‘win’ and any settlement might be a win or loss even if you know the terms, and often the terms are confidential.

Thus, if I were looking for a lawyer, I would continue to rely on personal recommendations, especially from lawyers I trust, rather than look at overall track records, even if those track records were easily available. I don’t think those track records are predictive. Asking about someone’s success in similar cases, with richer detail on each case, seems better, but one has to pay careful attention.

If people started using win-loss records to choose lawyers, and lawyers started optimizing their win-loss records, what little information those records contain would become even less useful. You would mostly be measuring which lawyers prioritize win-loss records, by selecting winners and forcing them to verdict, while avoiding, settling or pawning off losers, by getting onto winning teams, and by manipulating clients into doing whatever was necessary. It’s not like lawyers don’t mostly know which cases are winners. By choosing a lawyer with too good a win-loss record, you’d be getting someone who cares more about how they look in a statistic than doing what’s right for their clients, and also who has the flexibility to choose which cases they have to take.

The adverse selection here, it burns.

That’s what I’d actually expect now. Some lawyers do care a lot about their track records, they’ll have better track records, and they’re exactly who you want to avoid. I’d take anyone bragging about their win rate as a very negative sign, not a positive one.

So I don’t think this is about simple rules, or about people’s cognitive errors, or anything like that. I think Robin is just proposing a terrible measure that is not accurate, not well-defined and easily gamed, and asking why we aren’t making it available and using it.

Contrast this with evaluations of doctors or hospitals for success rates or death rates from particular surgeries. That strikes me as a much better place to implement such strategies, although they still have big problems with adversarial Goodhart if you started looking. But you can get a much better idea of what challenges are being tackled and about how hard they are, and a much better measure of the rate of success. I’d still worry a lot about doctors selecting easy cases and avoiding hard ones, both for manipulation and because of what it would do to patient care.

A general theme of simple rules is that when you reward and punish based on simple rules, one of the things you are rewarding is a willingness to prioritize maximizing for the simple rule over any other goal, including the thing you’re trying to measure. Just like any other rule you might use to reward and punish. The problem with simple rules is that they explicitly shut out one’s ability to notice such optimization and punish it, which is the natural way to keep such actions in check. Without it, you risk driving out anyone who cares about anything but themselves and gaming the system, and creating a culture where gaming the system and caring about yourself are the only virtues.

H] Land Allocations

If all you care about is the ‘productivity’ of the asset and/or the revenue raised, then of course you use an auction. Easy enough, and I think people recognize this. They don’t want that. They want a previously public asset to be used in ways the public prefers, and think that we should prefer some uses to other uses because of the externalities they create.

It seems reasonable to use the opportunity of selling previously public goods to advance public policy goals that would otherwise require confiscating private property. Private sellers will also often attach requirements to sales, or choose one buyer over another, sacrificing productivity and revenue for other factors they care about.

We can point out all we like how markets create more production and more revenue, but we can’t tell people that they should care mostly about the quantity of production and revenue instead of other things. When there are assets with large public policy implications and externalities to consider, like the spectrum, it makes sense to think about monopoly and oligopoly issues, about what use the assets will be put to by various buyers, and what we want the world to look like.

That doesn’t mean that these good factors are the primary justifications. If they were, you’d see conditional contracts and the like more often, rather than private deals. The real reason is usually that other mechanisms allow insiders to extract public resources for private gains. This is largely a story of brazen corruption and theft. But if we’re going to argue for simple rules because they maximize simple priorities, we need to also argue for why those priorities cover what we care about, or we’ll be seen as tone deaf at best, allowing the corrupt to win the argument and steal our money.

I] Investment Funds

Low fee index funds are growing increasingly popular each year, taking in more money and a greater share of assets. Their market share is so large that being included in a relevant index has a meaningful impact on share prices.

Managed funds are on the decline. Most of these funds are not especially prestigious and most people invested in them don’t brag about them, nor do they have much special faith in those running the funds. They’re just not enough on the ball to realize they’re being taken for a ride by professional thieves.

Nor do I think most people care about associating with high status hedge funds or anything like that. I don’t see it, at all.

Also, those simple rules? You can find them in active funds, too. A lot of them are pretty popular. Simple technical analysis, simple momentum, simple value rules, and so on. What counts as simple? That’s a matter of perspective. Index providers are often doing staggeringly complex things under the hood. And indexing off someone else’s work is a magician’s trick, free riding off the work of others in a way that gets dangerous if too many start relying on it.

Most regular investors who think about what they’re doing at all, know they should likely be in index-style funds, and increasingly that’s where they are. If there’s a mystery at all it’s at least contained at the high end, in hedge funds with large minimums.

One can split the remaining ‘mystery’ into two halves. One is, why do some people think there exist funds that have sufficient alpha to justify their fees? Two is, why do some people think they’ve found one of those funds?

The first mystery is simple. They’re right. There exist funds that have alpha, and predictably beat the market. The trick is finding them and getting your money in (or the even better trick is figuring out how to do it yourself). I don’t want to get into an argument over efficient markets here and won’t discuss it in the comments, but the world in which no one can beat the market doesn’t actually make any sense.

The second mystery is also simple. Marketing, winner’s curse, fooled by randomness and adverse selection, and the laws of markets. Of course a lot more people think they’ve found the winner than have actually found one.

This is a weird case in many ways, but my core take here is that the part of this that does belong on this list is an example of complexity as justification for theft.

J] Selling the Company

Google is the auction company. They were uniquely qualified to run an auction and bypass the banks, and did it (as I understand it) largely because it was on brand and they’d have felt terrible doing otherwise. A more interesting case is Spotify, who recently simply let people start trading its stock without an IPO at all. Although they still paid the banks fees, which I find super weird and don’t understand. There never was a rebellion.

How do banks extract the money?

My model is something like this, coming mostly from reading Matt Levine. The banks claim that they provide essential services. They find and line up customers to buy the stock, they vouch for the stock, they price the stock properly to ensure a nice bump so everyone feels happy, they backstop things in case something goes wrong, they handle a ton of details.

What they really do are two things. Both are centered around the general spreading by banks of FUD: Fear, Uncertainty and Doubt.

First, they prevent firms from suddenly having to navigate a legally tricky and potentially risky, and potentially quite complex, world they know nothing about, where messing up could be a disaster. One does not simply sell the company or take it public, as much as it might look simple from the outside. And while the bank’s marginal costs are way, way lower than what they charge, trying to get that expertise in house in a confident way is hard.

Second, they are what people are comfortable with. You’re not blameworthy for paying the bank. It’s the null action. If you do it, no one says ‘hey they’ve robbed us all of a huge amount of money.’ Instead, they say ‘good on you for not being too greedy and trying to maximize the price while risking the company’s future.’

They’re doing this at the crucial moment when how you look is of crucial importance, when you’re about to get a huge windfall for years or a lifetime of work and give the same to everyone who helped you. When you’re spending all your energy negotiating lots of other stuff. A disruption threatens to unravel all of that. What’s a few percent in that situation? So what if you don’t price your IPO as high as you could have so that bankers can enjoy their bounce?

Banks are conspiring with the buyers to cheat the sellers out of the value of what they bring to the table. Buyers who object are threatened with ostracism and being someone no one is comfortable with, with the other side walking away from the table after buyers put in the work to get here.

Is this all guillotine-worthy highway robbery? Hell yes. Completely.

Banks (and the buyers who are their best customers and allies) are colluding with this pricing, and that’s the nicest way to put this. Again, this is theft. Complexity is introduced to allow rent seeking and theft, exploiting a moment of vulnerability.

K] College Admissions

Interesting that Robin says the system ‘appears stable.’ To me it does not seem stable. We just had a huge college admissions scandal that damaged faith in the system and a quite well-justified lawsuit against Harvard. We have the SAT promising to introduce ‘adversity scores.’ We have increasingly selective admissions eating more and more of childhood, and the rule that what can’t go on forever, won’t. This calls for some popcorn.

What’s causing the system to be complex? We see several of the answers in play here.

We see the ‘factors you can’t cite explicitly’ problem and the ‘we don’t want something we can be sued or blamed for’ here. Admissions officers are trying to pick kids who will be successful at school and in life, as well as satisfy other goals. A lot of the things that predict good outcomes in life are not things you would want to be caught dead using as a determinant in admissions even if they weren’t illegal to use in admissions. The only solution is to make the system complex and opaque, so no one can prove what you were thinking.

We also see complexity as a way for the rich and powerful to expropriate resources, in the sense that the rich and powerful and their children are likely to be more successful, and more likely to give money to the school. And of course, if the school has discretion, that gives the school power. It can extract resources and prestige from others who want to get their kids in. Employees, especially high-up ones, can extract things even without illegal bribes. Why pass that up?

We see the Goodhart’s Law and adverse selection problems. If you admit purely on the basis of a test, and the other schools admit on the basis of a range of factors, you don’t get the best test scorers unless you’re Harvard. You get the kids who are an epic fail at those other factors.

If you give kids an explicit target, they and their parents will structure their entire lives around it. They’ll do that even with a vague implicit target, as they do now. If it’s explicit, you get things like you see in China, where (according to an eyewitness who once came to dinner) many kids are pulled from school and do nothing but cram facts into their heads for the college admissions exam for years. And why shouldn’t they?

So you get kids whose real educations are crippled, who have no life experience and no joy of childhood. The only alternative is to allow a general sense of who the kid is and what they’ve done to matter. To be able to holistically judge kids and properly adjust.

As always, the more complex and hard to understand the game, the greater the expert’s advantage. The rich and powerful who understand the system and can make themselves look good will have a large edge from that. And the more we explicitly penalize them for those advantages, but not for their gaming of the system, the more we force them to game the system even harder. If you use an adversity score to set impossibly high standards for rich kids, they’re going to use every advantage they have to make up for that even more than they already do.

And of course, part of the test is seeing how you learn to game the test and what approach you take. Can you do it with grace? Do you do too much of it, not enough or the right amount?

This is all an anti-inductive arms race. The art of gaming the system is in large part the art of making it look like you’re not gaming the system, which is an argument for simpler rules. At this point, what portion of successful admissions is twisting the truth? How much is flat out lying? How much is presenting yourself in a misleading light? To what extent are we training kids from an early age to have high simulacrum levels and sacrifice their integrity? A lot. Integrity being explicitly on the test just makes it one more thing you need to learn to fake.

I hate the current situation, and the educational system in general, but I think the alternative of a simple, single written test, with the system otherwise unchanged, would be worse. But of course, we’d never let it be that simple. That’s all before the fights over how to adjust those scores for ‘adversity’ and ‘diversity,’ and how to quantify that, and the other things we’d want to factor in. Can you imagine what happens to high schools if grades don’t matter? What if grades did matter in a formulaic manner and students and teachers were forced to confront the incentives? The endless battles over what other life activities should score points, the death of any that don’t, and the twisting into point machines of those that do?

So here we have all the Goodhart problems, and the theft problems, and the power problems, and the blameworthy considerations and justifications problems and lawsuit problems with their incentive to destroy all information. The gang’s all here.

L] Fire that CEO

I love me a prediction market, but you have to do it right. Would enough people and money participate? If they did, would they have the right incentives? If both of those, would you want that to be how you make decisions?

I think the answer to the first question is yes, if you structure it right. If there are only two possibilities and one of them will happen, you can make it work.

The answer to the second question is, no.

We can consider two possibilities.

In scenario one, this acts as an advisory for the board, to help them decide what to do.

In scenario two, this is the sole thing looked at, and CEOs are fired if and only if they are judged to be bad for the stock price, or can otherwise only be fired for specific causes (e.g. if found shooting a man on Fifth Avenue or stealing millions in company funds and spending them on free to play games, you need to pull the plug without stopping to look at the market).

The problem with scenario one is that you’re trading on how much the company is worth given that the CEO was fired. That’s very different from what you think the company would be worth if we decided to fire the CEO. The scenarios where the CEO is fired are where the board is unhappy with them, which is usually because of bad things that would make us think the stock is likely to be less valuable, like the stock price having gone down or insiders knowing the CEO has done or will do hard to measure long term damage. That doesn’t mean it won’t also take into account other things like whether the CEO is paying off the board, but the correlation we’re worried about is still super high. Giving the board discretion, that market participants would expect the board to use, hopelessly muddles things.
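The gap between "worth given that the CEO was fired" and "worth if we decided to fire the CEO" can be made concrete with a toy calculation. The sketch below is only an illustration: the two hidden states, their probabilities, the firm values and the board's firing probabilities are all hypothetical numbers, chosen so that firing helps by the same amount in every state while the board mostly fires when the firm is already in trouble.

```python
# Hypothetical numbers, chosen only to illustrate the selection effect.
# state: (probability, value if CEO kept, value if CEO fired, P(board fires | state))
STATES = {
    "healthy":  (0.8, 100.0, 105.0, 0.05),
    "troubled": (0.2,  60.0,  65.0, 0.90),
}


def conditional_values(states):
    """Return (E[value | fired], E[value | kept]), as a conditional market would price them."""
    fired_mass = fired_value = kept_mass = kept_value = 0.0
    for p_state, v_keep, v_fire, p_fire in states.values():
        fired_mass += p_state * p_fire
        fired_value += p_state * p_fire * v_fire
        kept_mass += p_state * (1 - p_fire)
        kept_value += p_state * (1 - p_fire) * v_keep
    return fired_value / fired_mass, kept_value / kept_mass


def causal_effect_of_firing(states):
    """Average change in value from firing, holding the hidden state fixed."""
    return sum(p * (v_fire - v_keep) for p, v_keep, v_fire, _ in states.values())


value_if_fired, value_if_kept = conditional_values(STATES)
effect = causal_effect_of_firing(STATES)
print(f"market price given fired: {value_if_fired:.1f}")
print(f"market price given kept:  {value_if_kept:.1f}")
print(f"true effect of firing:    {effect:+.1f}")
```

With these made-up inputs the conditional price given a firing comes out well below the price given retention, even though firing raises the firm's value in both states. The market is pricing what a firing reveals about the hidden state, not what the firing causes.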

You could try to solve that problem by having the market trade only very close in time to the board decision. You kind of have to do that anyway, to avoid having a lame duck CEO. But it still depends on a lot of private information, and the decision will still reveal a lot about the firm. So I think that realistically

The problem with scenario two is that you’ve taken away any ability to punish or reward the CEO for anything other than future stock prices. This effectively gives the CEO absolute power, and allows them to get away with any and all bad behavior of all kinds. Even if past behavior lowers the stock price, it only matters to the extent that it predicts future actions which would further lower the stock price. So CEOs don’t even have to care about the stock price. They only need to care about the stock price predictions in relation to each other. So the best thing the CEO can do is make getting rid of them as painful as possible. Even more than now, they want to make sure that losing them destroys the company as much as possible. Their primary jobs now are to hype themselves as much as possible to outsiders, and to spend capital manipulating these prediction markets.

Again, we’re seeing Goodhart problems, we’re seeing reinforcement of power (in this case, of the board over the CEO, so it’s a balance of power we likely welcome), and the ability to take things into consideration without needing to make them explicit or measurable, as companies both care about things they’re not legally allowed to care about and which we wouldn’t like hearing they cared about, especially explicitly, and they need to maintain confidentiality.


What are good practices for using Google Scholar to research answers to LessWrong Questions?

15 hours 33 minutes ago
Published on May 18, 2019 9:44 PM UTC

Answers could provide workflows, tips, tricks, trigger-action patterns, things-to-avoid, useful and/or unknown Google Scholar features -- whatever helps someone get up to speed as a power user.


Simulation Typology and Termination Risks

May 18, 2019 - 15:42
Published on May 18, 2019 12:42 PM UTC

The main idea of the article is that we likely live in a “Fermi simulation”, that is, a simulation created by aliens to solve the Fermi paradox by simulating possible global risks, or in a "Singularity simulation", a simulation in which a future AI models its own origin (gaming simulations will also play with our period of history as one of the most interesting). It means that our simulation will likely be turned off soon, as at least one of three conditions will be reached:

- Our simulation will model a global catastrophe, which is subjectively equal to the termination.
- Our simulation will reach a goal unknown to us (but likely related to our AI's origin conditions) and will be terminated.
- Our simulation will reach a nested simulation level shortly after strong AI creation, which will drastically increase computational resources demand and thus the simulation will be terminated.

We also suggested a classification of different types of simulations, patched the Simulation Argument and suggested a Universal Simulation Argument, which now has to take into account all possible civilisations in the universe. This makes the SA's branches where we are not in a simulation ("extinction soon" or "ethical future") much less likely, as they would require that all possible civilisations go extinct or that all possible non-human civilisations are ethical.


Yes Requires the Possibility of No

May 18, 2019 - 01:39
Published on May 17, 2019 10:39 PM UTC

1. A group wants to try an activity that really requires a lot of group buy in. The activity will not work as well if there is doubt that everyone really wants to do it. They establish common knowledge of the need for buy in. They then have a group conversation in which several people make comments about how great the activity is and how much they want to do it. Everyone wants to do the activity, but is aware that if they did not want to do the activity, it would be awkward to admit. They do the activity. It goes poorly.

2. Alice strongly wants to believe A. She searches for evidence of A. She implements a biased search, ignoring evidence against A. She finds justifications for her conclusion. She can then point to the justifications, and tell herself that A is true. However, there is always this nagging thought in the back of her mind that maybe A is false. She never fully believes A as strongly as she would have believed it if she just implemented an unbiased search, and found out that A was, in fact, true.

3. Bob wants Charlie to do a task for him. Bob phrases the request in a way that makes Charlie afraid to refuse. Charlie agrees to do the task. Charlie would have been happy to do the task otherwise, but now Charlie does the task while feeling resentful towards Bob for violating his consent.

4. Derek has an accomplishment. Others often talk about how great the accomplishment is. Derek has imposter syndrome and is unable to fully believe that the accomplishment is good. Part of this is due to a desire to appear humble, but part of it stems from Derek's lack of self trust. Derek can see lots of pressures to believe that the accomplishment is good. Derek does not understand exactly how he thinks, and so is concerned that there might be a significant bias that could cause him to falsely conclude that the accomplishment is better than it is. Because of this he does not fully trust his inside view which says the accomplishment is good.

5. Eve has an aversion to doing B. She wants to eliminate this aversion. She tries to do an internal double crux with herself. She identifies a rational part of herself who can obviously see that it is good to do B. She identifies another part of herself that is afraid of B. The rational part thinks the other part is stupid and can't imagine being convinced that B is bad. The IDC fails, and Eve continues to have an aversion to B and internal conflict.

6. Frank's job or relationship is largely dependent on his belief in C. Frank really wants to have true beliefs, and so tries to figure out what is true. He mostly concludes that C is true, but has lingering doubts. He is unsure if he would have been able to conclude C is false under all the external pressure.

7. George gets a lot of social benefits out of believing D. He believes D with probability 80%, and this is enough for the social benefits. He considers searching for evidence of D. He thinks searching for evidence will likely increase the probability to 90%, but it has a small probability of decreasing the probability to 10%. He values the social benefit quite a bit, and chooses not to search for evidence because he is afraid of the risk.

8. Harry sees lots of studies that conclude E. However, Harry also believes there is a systematic bias that makes studies that conclude E more likely to be published, accepted, and shared. Harry doubts E.

9. A Bayesian wants to increase his probability of proposition F, and is afraid of decreasing the probability. Every time he tries to find a way to increase his probability, he runs into an immovable wall called the conservation of expected evidence. In order to increase his probability of F, he must risk decreasing it.
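The numbers in item 7 can be checked against the conservation of expected evidence from item 9: the prior must equal the probability-weighted average of the possible posteriors. The sketch below solves for how "small" George's downside probability has to be, assuming (as a simplification not stated in the post) that the search has exactly two outcomes, one moving him to 90% and one moving him to 10%. The function name is made up for this illustration.

```python
def prob_belief_rises(prior: float, if_up: float, if_down: float) -> float:
    """Probability q that the search moves belief to `if_up` rather than `if_down`.

    Conservation of expected evidence requires
        prior == q * if_up + (1 - q) * if_down,
    so q = (prior - if_down) / (if_up - if_down).
    """
    if not if_down <= prior <= if_up or if_up == if_down:
        raise ValueError("prior must lie between two distinct possible posteriors")
    return (prior - if_down) / (if_up - if_down)


# George from item 7: prior 80%, search ends at either 90% or 10%.
q = prob_belief_rises(prior=0.80, if_up=0.90, if_down=0.10)
expected_posterior = q * 0.90 + (1 - q) * 0.10
print(f"chance belief rises to 90%: {q:.3f}")
print(f"chance belief falls to 10%: {1 - q:.3f}")
print(f"expected posterior:         {expected_posterior:.2f}")
```

Under that two-outcome assumption the upside probability works out to 0.7 / 0.8 = 7/8, so the "small probability" of dropping to 10% is 1/8, and the expected posterior is the 80% George started with. He cannot arrange a search whose outcomes lie only above his prior.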


"One Man's Modus Ponens Is Another Man's Modus Tollens"

May 18, 2019 - 01:03
Published on May 17, 2019 10:03 PM UTC


Announcing my YouTube channel

May 17, 2019 - 20:50
Published on May 17, 2019 5:50 PM UTC

Social theory. Geopolitics. Power. New videos every week.

Recently I launched a YouTube channel. This channel provides another medium in which to share my thoughts, as well as a place to access recordings of my talks and interviews.

New content

The first several videos dive into my thoughts on institutions, history, and modern society.

  • Silicon Valley Was Wrong: The Internet Centralized Society
    It is commonly claimed that the Internet has been a decentralizing force for society, providing more power to individuals who can wield the new technology. This theme is ubiquitous within hacker culture and the cyberpunk literary genre, for example. However, today we find precisely the opposite: the Internet has, on the whole, been a centralizing force for society. A few large media companies have massive influence over public discourse as well as access to data about the behaviors of millions of users. While this has made individuals more transparent and more legible to large institutions at great scale, I argue that it has not made those large institutions more legible to us.
  • Why America is Not an Open Society
    In this video, I explore three common models given as explanations for the success of America, and argue that they don’t capture the complete picture. If these common perceptions are not true, then what more nuanced theory of history explains America’s success and prosperity?
      • I. America as an open, transparent society. Do ideas rise and fall on their own merits and strength of evidence, or is it possible to manipulate public opinion towards misinformation given enough material resources? To answer this question, I explore Edward Bernays’ 1928 book Propaganda, psychiatry in communist Yugoslavia, Lysenkoism, and the (lack of) transparency of modern media institutions such as Facebook.
      • II. The American public as rational, self-interested actors. I discuss the success of Sweden’s welfare state and examples of how individuals often make economic choices that depend on trust and that reflect care for others around them, as opposed to making all choices out of pure economic self-interest.
      • III. Decisions in American governance as the output of democratic processes. In reality, many decisions are not made by officials in elected positions, because much political steering power is instead held by entrenched bureaucracies and civil servants.
  • Will China Out-Innovate the United States?
    The United States prides itself on being a hub for world innovation and on attracting top talent from all over the world. However, China’s economy is now comparable to that of the United States, and its international influence is growing to match. What forces drive this rise, and will there be consequences for American innovation? Furthermore, what can we learn by observing the books Xi Jinping keeps on his desk?
  • How to Predict the Next Global Hub
    What sociological factors have made modern Silicon Valley a hub for thinkers, innovators, and entrepreneurs? The most important factors may be unexpected, and the most expected factors may be unimportant; for example, London at its peak was crime-ridden. I explore Alexandria under the patronage of the Ptolemaic dynasty, economic opportunities in Paris during the 18th century, and the social landscapes of Los Angeles, San Francisco, and Shanghai.
  • How I Learn History
    When learning history, how can we reconstruct what has truly happened? The most useful method is to ingest information from primary sources directly; these sources are not filtered through somebody else’s interpretation. Do in-depth case studies, read the firsthand accounts of those who were there, and reconstruct how individuals and situations were affected by the institutions and bureaucracies around them.
  • What Is Your Theory of History?
    Whether they realize it or not, everyone has their own implicit theory of history. We use our theories of history to make predictions and to decide what is important at the largest scale for our societies. An unexamined theory of history, however, can easily be inconsistent in how it reasons about the past, present, and future — and poor predictions are the result. By applying systematic thinking, you can build a theory of history that is consistent and coherent.

Talks and interviews

In addition to new content, the YouTube channel provides a location for recordings of my talks and interviews.

  • Civilization: Institutions, Knowledge, and the Future
    This is a talk I gave with the Foresight Institute; I’ve written about it here. For the YouTube channel, I’ve also curated some standalone excerpts from this talk:
      • The Lycurgus Cup: What happens when a civilization’s technology becomes lost for over a thousand years? What can we learn about the economic output of the Roman Empire at its peak and before its fall? What technologies might our own civilization stand to lose? When our descendants read about our achievements, will they believe us?
      • Intellectual Dark Matter: Physicists have inferred the existence of dark matter not by direct observation per se, but by observing the force it exerts on surrounding matter. Likewise, through observing history we can infer the existence of certain knowledge that has been developed and used by historical civilizations and which, though lost to the ages, has nonetheless shaped the trajectory of future civilizations.
  • Artificial Intelligence: Existential Hope Scenarios
    This is a panel discussion with Mark Miller, Jessica Cussins, and De Kai in which I propose concrete actions towards guiding AI research to safe outcomes. I also discuss how to identify the highest risk areas of research, the feasibility of regulating software, and international cooperation.

I hope you find it interesting!

Samo Burja


What makes a scientific fact 'ripe for discovery'?

May 17, 2019 - 12:01
Published on May 17, 2019 9:01 AM UTC

The existence of multiple discovery seems to suggest that there are certain factors that make scientific facts ready to be discovered. What are these factors, and how could one measure them?


Why exactly is the song 'Baby Shark' so catchy?

May 17, 2019 - 09:26
Published on May 17, 2019 6:26 AM UTC

(Epistemic status: low-level infohazard, class Earworm. You have been warned.)

The video, "Baby Shark Dance", has 2,773,743,743 views as of the time of this post. It bit into me while I was captive to a parent soothing their child on the BART subway some time ago and has occasionally reared its ugly snout ever since.

I am curious what models of music psychology have to say about this phenomenon and also what those explanations would suggest for increasing the virality of a given piece of music, audiovisuals, or any memetic content in general if applicable.


Offer of collaboration and/or mentorship

May 16, 2019 - 17:16
Published on May 16, 2019 2:16 PM UTC

I have two motivations for making this offer. First, there have been discussions regarding the lack of mentorship in the AI alignment community, and that beginners find it difficult to enter the field since the experienced researchers are too busy working on their research to provide guidance. Second, I have my own research programme which has a significant number of shovel ready open problems and only one person working on it (me). The way I see it, my research programme is a very promising approach that attacks the very core of the AI alignment problem.

Therefore, I am looking for people who would like to either receive mentorship in AI alignment relevant topics from me, or collaborate with me on my research programme, or both.


I am planning to allocate about 4 hours / week to mentorship, which can be done over Skype, Discord, email or any other means of remote communication. For people who happen to be located in Israel, we can do in person sessions. The mathematical topics in which I feel qualified to provide guidance include: linear algebra, calculus, functional analysis, probability theory, game theory, computation theory, computational complexity theory, statistical/computational learning theory. I am also more or less familiar with the state of the art in the various approaches other people pursue to AI alignment.

Naturally, people who are interested in working on my own research programme are those who would benefit the most from my guidance. People who want to work on empirical ML approaches (which seem to be dominant in OpenAI, DeepMind and CHAI) would benefit somewhat from my guidance, since many theoretical insights from computational learning theory in general, and my own research in particular, are to some extent applicable even to deep learning algorithms whose theoretical understanding is far from complete. People who want to work on MIRI's core research agenda would also benefit somewhat from my guidance, but I am less knowledgeable about, and less interested in, formal logic and approaches based on formal logic.


People who want to collaborate on problems within the learning-theoretic research programme might receive a significantly larger fraction of my time, depending on details. The communication would still be mostly remote (unless the collaborator is in Israel), but physical meetings involving flights are also an option.

The original essay about the learning-theoretic programme does mention a number of more or less concrete research directions, but since then more shovel ready problems joined the list (and also, there are a couple of new results). Interested people are advised to contact me to hear about those problems and discuss the details.


Anyone who wants to contact me regarding the above should email me at vanessa.kosoy@intelligence.org, and give me a brief intro about emself, including knowledge in math / theoretical compsci and previous research if relevant. Conversely, you are welcome to browse my writing on this forum to form an impression of my abilities. If we find each other mutually compatible, we will discuss further details.


Which discovery was most ahead of its time?

May 16, 2019 - 15:58
Published on May 16, 2019 12:58 PM UTC

Looking into the history of science, I've been struck by how continuous scientific progress seems. Although there are many examples of great intellectual breakthroughs, most of them build heavily on existing ideas which were floating around immediately beforehand - and quite a few were discovered independently at roughly the same time (see https://en.m.wikipedia.org/wiki/List_of_multiple_discoveries).

So the question is: which scientific advances were most ahead of their time, in the sense that if they hadn't been made by their particular discoverer, they wouldn't have been found for a long time afterwards?


European Community Weekend 2019

May 16, 2019 - 14:44
Published on May 16, 2019 11:44 AM UTC

From Friday August 30th to Monday 2nd September aspiring rationalists from across Europe and further afield will gather for 4 days of socializing, fun and intellectual exploration. There will be scheduled talks, but the majority of the content will be unconference style and participant driven.

On Friday afternoon we put up four wall-sized daily planners, and by Saturday morning the attendees fill them up with 50+ workshops, talks and activities of their own devising, such as icebreaker games, rationality techniques, EA community building discussions, comfort zone expansion workshops, polyamory and relationships workshops, morning meditation sessions in the winter garden and many more.

This is our 6th year and we feel that the atmosphere and sense of community at these weekends is something that’s really special. If that sounds like something you would enjoy and you have some exciting ideas and skills to contribute do come along and get involved. This year is the biggest one yet and it’s an entire day longer than previous years!

We have a very limited number of spots. If you would like to be among the first selected in the beginning of May, sign up today: www.tiny.cc/lwcw2019_signup and make sure to let us know what experience and ideas you may contribute to this event: www.tiny.cc/lwcw2019_contribution.


When? 30th August – 2nd September 2019

Where? jh-wannsee.de

Tickets? Regular Ticket (200€), Supporter Ticket (300/400€): www.tiny.cc/lwcw2019_signup

If you have any questions, please email us at lwcw.europe@gmail.com.

Looking forward to a great weekend, The Community Weekend organizers and LessWrong Deutschland e.V.


Kevin Simler's "Going Critical"

May 16, 2019 - 07:36

What is this new (?) Less Wrong feature? (“hidden related question”)

May 16, 2019 - 02:51
Published on May 15, 2019 11:51 PM UTC

What is this? I just noticed it (I see it when editing one of my posts):


Feature Request: Self-imposed Time Restrictions

May 16, 2019 - 01:35
Published on May 15, 2019 10:35 PM UTC

Hacker News has a feature called "noprocrast". Here's how they explain it in the FAQ:

In my profile, what is noprocrast? It's a way to help you prevent yourself from spending too much time on HN. If you turn it on you'll only be allowed to visit the site for maxvisit minutes at a time, with gaps of minaway minutes in between. The defaults are 20 and 180, which would let you view the site for 20 minutes at a time, and then not allow you back in for 3 hours.

If you try to use HN when you precommitted to not using it, you'll get the following message from them:

Get back to work! Sorry, you can't see this page. Based on the anti-procrastination parameters you set in your profile, you'll be able to use the site again in 43 minutes.
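To make the rule concrete, here is a minimal sketch of how such a check could work. It is not Hacker News's actual code: the function name `noprocrast_check`, the idea of storing a session start time per user, and the choice to start the lockout the moment the visit window ends are all assumptions made for illustration. Only `maxvisit`, `minaway` and their defaults come from the FAQ quoted above.

```python
from datetime import datetime, timedelta


def noprocrast_check(session_start, now, maxvisit=20, minaway=180):
    """Decide whether a page view is allowed (hypothetical helper).

    Returns (allowed, minutes_to_wait, session_start_to_store).
    Assumption: the away period begins as soon as the visit window ends.
    """
    if session_start is None:
        # First visit ever: open a new visit window now.
        return True, 0, now
    elapsed = (now - session_start).total_seconds() / 60
    if elapsed < maxvisit:
        # Still inside the current visit window.
        return True, 0, session_start
    if elapsed < maxvisit + minaway:
        # Visit window used up and the away period is not over yet.
        return False, maxvisit + minaway - elapsed, session_start
    # Away period is over: open a fresh visit window.
    return True, 0, now


start = datetime(2019, 5, 15, 12, 0)
allowed, wait, _ = noprocrast_check(start, start + timedelta(minutes=157))
print(allowed, wait)  # blocked, with the remaining wait in minutes
```

With the default parameters, a request made 157 minutes after the visit window opened is refused with 43 minutes left to wait, which is the situation in the message quoted above.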

I was thinking that something like this would be awesome for LessWrong. Personally, I have a rather large problem with browsing the web - which includes browsing LessWrong - when I should be doing other things. After reading Digital Minimalism, I get the impression that such struggles are more the norm than the exception.


Why I've started using NoScript

May 16, 2019 - 00:32
Published on May 15, 2019 9:32 PM UTC

NoScript is a browser extension[1] that prevents your browser from loading and running JavaScript without your permission. I recently started using it, and I highly recommend it.

I had first tried using NoScript around a decade ago. At the time it seemed like too much of a hassle. I ended up wanting to enable almost all the scripts that were included, and this was somewhat annoying to do. Things have changed a lot since then.

For one, NoScript's user interface has become much better: Now, if a page isn't working right, you simply click the NoScript icon and whitelist any domains you trust, or temporarily whitelist any domains you trust less. You can set it to automatically whitelist domains you directly visit (thereby only blocking third-party scripts).

A more pressing change is that I'm now much less comfortable letting arbitrary third parties run code on my computer. I used to believe that my browser was fundamentally capable of keeping me safe from the scripts that it ran. Sure, tracking cookies and other tricks allowed web sites to correlate data about me, but I thought that my browser could, at least in principle, prevent scripts from reading arbitrary data on my computer. With the advent of CPU-architecture-based side channel attacks (Meltdown and Spectre are the most publicized, but it seems like new ones come out every month or so), this belief now seems quite naïve.

Finally, in that decade, third-party scripts for tracking and ads have become almost literally ubiquitous on the web. Just about every web site I visit, I've discovered, has at least a couple of third-party dependencies, whose provenance I don't trust, and which I'd rather not spend (even a minuscule proportion of) my energy bill on. Even disregarding the new hardware vulnerabilities, I don't think arbitrary third party trackers ought to be trusted to run in your browser[2]; if even one of the hundreds of tracking scripts is compromised, this could easily leak your passwords or other data to attackers.

An added benefit has been that NoScript works better than my ad blocker. Around the time I started using NoScript, I was watching a show on a streaming site I don't normally visit, that shall remain nameless. This site is extremely annoying. It plays more ads per minute than content, somehow evading uBlock Origin, and often the ads seem to break the actual video player so that the show stops partway through. After installing NoScript, I spent about 3 minutes wading through the ~50 script sources, enabling not-ads until eventually the video played. I was thrilled to see that the video played perfectly, with no interruptions.

In summary, just go try it. You might not like it, but at least then you'll know.

  1. There is a version for Firefox and one for Chrome; it also has a Wikipedia page. ↩︎

  2. Some decent background on the problem can be read here ↩︎


Emotional valence as cognition mutator (not a bug, but a feature)

May 15, 2019 - 15:49
Published on May 15, 2019 12:49 PM UTC

Maybe because of my neurotype, I consider it important to have a good general way of handling things. I noticed a pattern where people would process seemingly similar requests very differently, and got curious whether there is an actual difference or whether people are being needlessly inconsistent.

  • Can you pass the salt, please?
  • Step aside
  • Can I have your wallet, please?

Two of these are supposed to get a "well, he did ask" kind of reaction and the remaining one a "no, why would I?". The difference in suggestibility is great, although they are all supposed to be straightforward requests for X, or "can I have X? yes/no" kinds of questions.

Now if I give a complicated answer for why you should give me your wallet, you are likely to remain pessimistic about complying. This also has the weird property that increasing the sophistication of the argument is unlikely to increase the probability of compliance. It would seem that the person is not willing to entertain the proposition but is prejudiced to reject it from the get-go. This would seem to run contrary to "intellectual openness/readiness".

However, the behaviour is not really mysterious and the reasons for it are pretty well founded. Your wallet contains a lot of your money, which is important for a lot of things you do. Should something bad happen to it, a lot of things will get a lot messier. It's also an attractive prey target. It's plausible that someone might lie to you just to get hold of your wallet and keep it.

When I thought about a straight-up thief trying to get the wallet by asking, I realised that increasing the sophistication of the reason to give up the wallet is not a good strategy. However, asking for any "innocent" target is likely to encounter a lot more suggestibility. Some of these suggestions might put you in a position to better grab the wallet. That is, saying "Hey, what's over there?", pointing away and then physically grabbing the wallet is likely to be more effective than a sophisticated argument for handing over the wallet. The strange thing here seems to be that if the target perceives your motivation to be the wallet, it's effectively game over for you as the thief. A lot of the target's psychological defences know to activate on that cue.

The surprising result that I ended up with upon thinking is that the phenomenon is a legitimate psychological defence and its presence is actually constructive. Abstracting it into other spheres, it means that in situations where we are handling requests concerning system-critical phenomena, we put increased weight on understanding the request to a higher degree. It's not that the agent goes emotional and throws reason out of the window. It's precisely the opposite, in that the agent correctly identifies that this needs to be understood and processed correctly. And it can't be blanket rejected, because there is a minority of situations where we actually want to comply with the request.

There is also a principle of a kind of burden of proof here. If I don't understand it, I am going to reject it, even if you use logic that is "more intellectual" than mine. This burden of proof doesn't apply normally. Normally "you know what you are doing" can be a reason to give you the benefit of the doubt. I don't need to know where you are about to walk in order to step out of your way. But in these high-stakes situations I do need to know the details.

The ultimate such high-stakes situation would be the AI-boxing situation. In basic communication, when people hear about the problem they liken it more to the "Can you pass the salt, please" kind of problem, or pattern-recognise "friendliness" as a kind of academic curiosity. A method of communication that likened it to asking for a wallet would make people apply their latent psychological skills to the problem. Here is one mini attempt at it. Situation 1: You have Hitler in a cell and he asks you to let him out of the cell. You reply "No, you are f Hitler". Situation 2: You have Hitler and an innocent person in a cell. For some strange reason you don't know what Hitler looks like. A man behind the bars says "Let me go, I am innocent". You reply "No, you are Hitler, lying that you are the innocent one to get out of jail". He asks "What can I do to prove to you I am not Hitler?". In this kind of situation it's clear that letting an innocent man walk is desirable, but clearly not worth having to deal with Hitler again, even if we have no reason to think that Hitler is immediately about to do something bad.

As the AI-boxing problem was presented, it sometimes felt like a unique problem, perhaps requiring unique answers. But the variable suggestibility scale highlights that natural intelligences box each other all the time already. We are capable of trusting each other in some situations but also capable of forgoing trust when it's necessary.


Data Analysis of LW: Activity Levels + Age Distribution of User Accounts

May 15, 2019 - 02:53
Published on May 14, 2019 11:53 PM UTC

Epistemic: I rarely trust other people’s data analysis, I only half trust my own. Right now, analytics is only getting a slice of my attention and this work is not as thorough as I’d like, but I think the broad strokes picture is correct. I have probably failed to include enough clarifications and disclaimers on where we should expect the data to be inaccurate. Feedback on my approach welcome.

I’ve been doing some analytics work for the LessWrong 2.0 team since September last year (since March I’ve been doing other work too, but that’s not relevant here). This post will hopefully be the first in a series which will eliminate the backlog of analytics results I’ve been wanting to share.

This post is probably not the ideal starting point - that would probably be a big-picture general overview of LessWrong usage since the beginning - but it is some of my most recent work and therefore is easiest to share. Still, it does show things about the bigger picture.

Warning: The graphs are repetitive even though they’re showing different things. I’ve included them all for completeness, but you can just read my summary/interpretations while looking at only some of them.

Distribution of User Account Age

Question: LW2 seems to be doing well, but is that just because we’re retaining/re-engaging a devoted base of older users despite not signing up new users?

Answer: Activity on LW2 is coming from both new and old users across all activity types (posts, comments, votes, and views). The project is succeeding at getting new people to create accounts and engage.

In fact, there have consistently been more new users voting and viewing each month since LW2 launched than throughout LW’s past. The number of new users posting each month is roughly the same as historical levels. The number of new users who are commenting has declined (though the percentage of new users is roughly the same); however, this is consistent with the trend that comment volume on LW2 has not recovered from The Great Decline of 2015-2017 the way other metrics have.

Meaning of the Graphs

I plotted graphs for each activity type (posts, comments, votes, and views) and the corresponding population of users which engages in those activities. For each user engaging in each activity type, I calculated the “age” of that account since it first engaged in that activity type, i.e. in the graph for users who post, the age of the user account in a given month is the number of days elapsed since that account first posted. In the graph for commenters, it is the days elapsed since the account first commented. This avoids certain complications arising from inconsistencies in how the data was recorded for different activities over LessWrong’s history.

I segmented the user accounts into four “buckets” based on their “age” [since first engaging in activity of type X].

  • 0 - 90 days
  • 90 - 360 days (~3 months to 12 months old)
  • 360 - 720 days (~1 to 2 years)
  • 720 - 10,000 days (~2+ years)

When I’ve said new user accounts, I have been meaning 0 - 90 days; when I’ve said old users, I’ve meant the 720+ days bucket.
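The bucketing above can be sketched in a few lines of Python. This is an illustration rather than the code used for the analysis: the function name `account_age_bucket`, the choice to measure age at the first day of each month, and the half-open boundaries (an account at exactly 90 days falls in the second bucket) are assumptions; only the four ranges come from the list above.

```python
from datetime import date

# (lower bound inclusive, upper bound exclusive, label), in days.
# The exclusive upper bound is an assumption about boundary handling.
BUCKETS = [
    (0, 90, "0 - 90 days"),
    (90, 360, "90 - 360 days"),
    (360, 720, "360 - 720 days"),
    (720, 10_000, "720 - 10,000 days"),
]


def account_age_bucket(first_activity, month_start):
    """Bucket an account by days since it first did a given activity.

    first_activity: date the account first posted/commented/voted/viewed.
    month_start: the date at which age is measured (assumed: first of month).
    Returns the bucket label, or None if the age falls outside all buckets.
    """
    age_days = (month_start - first_activity).days
    for low, high, label in BUCKETS:
        if low <= age_days < high:
            return label
    return None


print(account_age_bucket(date(2019, 1, 1), date(2019, 3, 1)))
print(account_age_bucket(date(2017, 1, 1), date(2019, 5, 1)))
```

Running this once per activity type, with `first_activity` taken from that activity's own history, gives the per-activity "age" described in the previous section.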

Caveat: we believe that many old users created new accounts when LW2 launched and this is somewhat confounding the data, though not necessarily a lot.

Reading the Graphs

  • X-Axis is time
  • Values plotted are the total values for each month
  • Y-Axis is the number of individuals engaging in a behavior in a given month, e.g. there were ~600 people who viewed posts, 30% of whom had fewer than 90 days elapsed since they were first recorded viewing a post as a logged-in user***.
  • In each set of graphs by activity:
    • The first set (2x2) shows a time series line for each age bucket segment alone.
    • The second long graph shows an area plot time series with age bucket segments stacked. This lets you see the overall size of the population engaging in an activity type over time.
    • The third graph is a 100% area plot which shows the composition of the overall population by “age” of the user accounts over time.
  • A moving-average filter of three months has been applied for smoothing.

***All data here is from logged-in users!!! Including views. View counts of non-logged in users are over an order of magnitude higher.
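For readers unfamiliar with the smoothing step listed above, here is a minimal sketch of a three-month moving average over monthly totals. The post does not say whether the window is trailing or centered, or how the first months are handled, so the trailing window and the shorter averages at the start are assumptions, as is the function name `moving_average`.

```python
def moving_average(values, window=3):
    """Trailing moving average over a list of monthly totals.

    Assumption: each month is averaged with the (window - 1) months
    before it; the earliest months use however many values exist.
    """
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


monthly_posters = [3, 6, 9, 12]  # made-up numbers, for illustration only
print(moving_average(monthly_posters))
```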

Poster Distribution of Age

Questions and posts with 2 or fewer upvotes have been filtered out. Event/meetup posts have not.

In addition to our primary focus on the age distribution of accounts, we can note the inflection point occurring in September/October 2017. This corresponds to the launch of the LessWrong 2.0 Open Beta 9-20 and publishing of Eliezer’s Inadequate Equilibria on LW* on 10-28. I have marked 2017-10-01 on the graphs with a dotted black line.

*The LW2 team requested Inadequate Equilibria be published on LW2 as an initial draw.

We see from these graphs that the number of users making posts each month is almost as high (~75%) as historical levels, especially those after 2013.

Unsurprisingly, over time more profiles fall into the “2+ years since their first post” bucket, since the longer LW has existed, the more profiles can exist which are at 2+ years. The percentage of users posting with accounts less than 90 days since first post (this includes their first post) has remained almost constant over time, with the exception of the decline period 2015-2017.

A small aside: it’s interesting to note that the nature of posts has shifted somewhat. The median post length on LW2 (~1000 words) is double that of old LW (~500 words). On old LW, Main posts were on average much longer than Discussion posts (median ~1000 words vs ~300 words). The distribution of post length on LW2 almost exactly matches that of LW’s Main section despite LW2 having far more posts. The net result is that at least as many total words of posts are being written on LW2 as on legacy LW.

Inserting some analysis from a few months ago. I haven’t re-checked this before including though, so slightly higher chance that I messed something up.

Post Length Distributions for LW1 Discussion, LW1 Main, and LW2

Word count is naively calculated as character count divided by 6, hence the fractional values.
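As a minimal sketch of that estimate (the function name `naive_word_count` is made up for illustration, and whether whitespace or markup was stripped before counting characters is not stated, so the raw string length is an assumption):

```python
def naive_word_count(text):
    """Estimate word count as character count divided by 6."""
    return len(text) / 6


print(naive_word_count("a" * 15))  # a 15-character post counts as 2.5 "words"
```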

I vaguely suspect that the shift in post length signifies a change in how LessWrong is used and that this is related to the large reduction in comment volume (see next section). A hypothesis is that old LW used to serve some of the same purposes that Facebook and other social media currently fulfil for people, and that new LW2 is now primarily serving some other need.

Commenter Distribution of Age

The graphs for commenters reveal a significant reality for LW2: while posts, votes, and views have resurged since The Great Decline of 2015-2017, commenting levels have not returned to anything near historical levels. Since the LW2.0 launch, the percentage of commenters who are new commenters has been at its highest level since 2013, while the share of commenters who began commenting 2+ years ago has been steady at 50%. The top-left graph (blue line) shows that there were no new users commenting in the period before LW2 but that this changed with the launch of LW2 and Inadequate Equilibria.

Voter Distribution of Age

The graphs for population of voters tell an interesting tale. There has been a dramatic increase in the number of new users voting while the number and proportion of accounts who first voted 2+ years ago has stayed almost steady/declined a little. The net result is that voters who first voted within the last two years are making up 60% of the voting population. (The effects on overall karma distributed can’t be straightforwardly inferred from this alone since it will depend on how many votes each user makes and their karma scores.)

Logged-In Viewer Distribution of Age

The distribution of user account age for logged-in viewers is similar to that for voters. There is a large uptick for new accounts, yet no growth among older user accounts. The data here, however, is “compromised” since in March 2018 all users were logged out. Users who failed to log in again (which is unnecessary if you are not posting, commenting, or voting) would no longer be detected. The drop in logged-in viewer population can be seen in early 2018, particularly in the 2+ year time series (red line). After that point, it is mostly flat, similar to the case for voters.

Concluding Thoughts

It’s heartening to see that LW2 has made a difference to the trajectory of LW. A site which was nearly put into read-only archive mode is definitely alive and kicking. Counter to my fears, LW2 is drawing in new users rather than purely being sustained by a committed core of older users from LW1. This is despite not yet focusing on recruiting new users, e.g. via promotion of content on social media.

However, the rate at which new users arrive is matched by the number of users failing to return (be they older users or new users who aren’t sticking around). Overall, most of the straightforward analytics metrics for LW have not grown significantly since its launch. I suspect that if we understand what is going on with retention, we might be able to hold onto more of the new users and actually cause upwards growth . . . assuming we want that. I and others on the team don’t blindly believe that growth for the sake of growth is good. We’ll continue to think carefully about any actions that, even if they caused “growth”, might cause LW2 not to be the place we want it to be.

I’ve only had a very cursory look at retention. I found that new users were returning in the first month after signing up at historical levels (~30%), but that retention three months after joining is less than half of historical levels (~20% -> 5%). This is odd. However, these numbers are only from a cursory glance, and I haven’t yet been very thorough either in checking for coding mistakes or even in thinking about it properly. This paragraph is low confidence.

Another point is that users of LW2 don’t use the site, on average, as much as they did during LW’s peak. Up to 2014, the median user was present on LW for 4-5 days each month (i.e. 4/31 days); in the last couple of years that has been 2-3 days. This might correspond to fewer people commenting since being engaged in ongoing comment threads might keep people coming back. The team is curious how a revamped email notification system (currently under development) will affect frequency of visiting LW.

Lastly, and I think it’s okay for me to say this, many of the most significant contributors to LW in the past are still present on the site - lurking - even if they post and comment far less. (I will soon write a post on how we use user data for analytics and decision-making; rest assured we have an extremely hard policy against ever sharing individual user data which is not public.) I think it’s a good sign that LW2 is generating enough content and discussion that these users still want to keep up to date with LW2. It makes me hopeful that LW2 might even become a (the?) central place of discussion.

For those interested in working with LessWrong data

To protect user privacy, we’re not able to grant full access to our database to the public; however, there might be more limited datasets which we can release. If there’s enough interest, I’ll discuss this with the team.


Boo votes, Yay NPS

May 14, 2019 - 22:07
Published on May 14, 2019 7:07 PM UTC


Many votes on LW are "boos" and "yays", and consequently they aren't very useful for determining what is worth reading. A modified version of a Net Promoter Score (NPS) on each post may provide a better metric for determining read worthiness.


It's come up a couple of times in my recent comments that I've expressed a theory that votes on LW, AF, and EAF are "boos" and "yays". I have an idea about how we could do better, assuming the purpose of votes is not to jeer and cheer but to provide information about the post, specifically how much the post is worth reading, so I'm finally writing it up so others can, yes, boo or applaud my effort, but more importantly so we might discuss ways to improve the system. If you don't like my proposal and agree we could do better than votes, I encourage you to write up your ideas and share them.

So, there are many things votes could be for, but I view votes as a solution to a problem, so what's the problem votes are trying to solve? The number one question I want answered about every post is some version of "should I read this?". There are subtly different ways to phrase this question: "is this worth engaging with?", "should I read this carefully or just skim it?", "is this worth my time and energy?", etc.

I want a solution to this problem because when I come to LW/AF/EAF every day I want a reliable signal about what it's worth me spending my energy engaging with (I generally don't want to just read, but also comment, discuss, understand, grow). Right now votes don't provide this to me, as I'll explain below, but they do provide other things. So keep in mind that my goal in this proposal is primarily to solve the particular problem of "should I read this?" and not the many other problems votes might be solutions to like "how to deliver simple positive/negative feedback?", "how can I express my pleasure or displeasure with a post?", "how do we determine status within the forum?", or "how do we increase platform engagement?". I don't ignore these other purposes, but I take them as secondary—and maybe there's other purposes I forgot to list and so forgot to take into account! The point being I want it to be clear I'm making a proposal that's trying to solve a particular problem, and if you complain "but wait, it doesn't solve this other problem" my response will be "yep, sure doesn't", so any discussion of this sort should be sure to explain why we should care about this other thing.

Okay, all that out of the way, let's talk about votes, and then NPS.

Boo Votes

Up/down voting is very simple and has a long history on LW, thanks to its presence on Reddit (from which, if I recall correctly, the original forum's codebase was forked). It has a number of nice features, and LW has made them nicer:

  • everyone knows how it works
  • it lets you express yourself in two ways (unlike on Twitter where the only option is to vote up something, and a "downvote" requires writing your own tweet expressing dislike)
  • the aggregate votes on a post can be used to generate a user score (karma)
  • the user score can be used to meter access to various site features
  • votes are proportional to the status of users, as measured by karma

And of course lots of popular forums of all sorts use votes: Facebook, Twitter, Reddit, Tumblr. Even when votes aren't present something like voting is in the form of "reacts" where a person can choose from a list of named images/sounds/etc. to express something and that something generally can include a simple vote (usually using a universally recognized vote react, like thumbs up/down); cf. Slack, Discord, most massively multiplayer games, Twitch. So it would seem that people like votes a lot and they are used to some effect in lots of places.

Unfortunately for our purposes of trying to figure out "should I read this?", most of what votes are doing is only indirectly engaged with this question. Votes, especially if we think of them as a degenerate case of reacts, are more used to express an opinion on the content than to determine whether or not the content is worth reading, and when there are two voting options they tend to be rounded off to down = boo and up = yay. If you have any doubts about this, just spend more time on social media and let me know if you still disagree in general, i.e. you disagree that most people do this, not that you don't do this or your small group of friends don't do this.

On that point of using votes for something else, it's tempting to think "hey, this is LW; we're rational AF; we know better than to use votes as boos and yays". To which I say "please, tell us more about how you've managed to create a community of perfectly rational agents".

Joking aside, my point is that I've been on the receiving end of all kinds of voting patterns, so I've gotten a chance to see how people use votes on LW. Further, I've talked to people about my posts (either in comments or elsewhere) and in some cases explicitly learned how they voted on my posts and why, and it's led me to a few conclusions about how people use votes here.

  • Sometimes votes are attempts to increase or decrease visibility of something, regardless of how someone feels about what's in a post or comment.
  • Sometimes votes are a genuine expression of "you should/shouldn't read this".
  • Most often votes say "yay, I like this" or "boo, I don't like this" in response to one of several things:
    • like/dislike the author
    • like/dislike the subject matter
    • like/dislike the content
    • like/dislike the presentation

The result is what I consider a lot of voting anomalies from the perspective of trying to answer the question "should I read this?". Some claims of things I've seen (I won't link specific posts because I don't want to risk applying shame to anyone for what happened to their post in the votes, and also it's a lot of work to dig up all the examples that caused me to form these beliefs):

  • Low content/quality posts voted highly because people like the author
  • High content/quality posts voted lowly because people dislike the author
  • Posts voted down for heresy, regardless of quality
  • Posts voted up for applause lights, regardless of quality

My personal experience is mainly with writing heretical posts of good quality such that I get more up votes than down but also a lot of down votes (maybe 1/3 down and 2/3 up), and it caused me to pay more attention to voting patterns, engage more with low score posts, and try to figure out just what was going on when posts got low scores that I gave upvotes. What I learned led me to surmise what I've presented above.

So votes seem to be largely used to signal approval and disapproval of posts, which I suggest is only weakly correlated with telling me whether or not I should read a post. As a result I basically ignore votes and have to skim everything to figure out where the good stuff is. But what if we could do something better...?


Net Promoter Score (NPS) is a simple metric many companies use to evaluate questions of customer satisfaction. To calculate it, people are asked "how likely are you to recommend our product or service to a friend or colleague?" and asked for a number from 0 to 10, 0 meaning "not likely at all" and 10 meaning "already have". I really like NPS because it asks people to imagine recommending something and then asks them for something like a probability of how likely they are to do it, although I've never seen a version that did this explicitly.

Responses are then converted into a score by first segmenting respondents into detractors, passives, and promoters, and then taking percent promoters minus percent detractors. I find this metric to be of limited value, and prefer to engage directly with the full distribution of responses, but if you really needed a single scalar this is one way to get it.
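For concreteness, here is a minimal sketch of that calculation. The cutoffs used (0-6 detractor, 7-8 passive, 9-10 promoter) are the conventional NPS segmentation and are not spelled out above; the function name `net_promoter_score` is just for illustration.

```python
def net_promoter_score(responses):
    """Percent promoters minus percent detractors, on a -100..100 scale.

    responses: iterable of integers from 0 to 10.
    Conventional cutoffs: 0-6 detractor, 7-8 passive, 9-10 promoter.
    """
    responses = list(responses)
    if not responses:
        raise ValueError("need at least one response")
    promoters = sum(1 for r in responses if r >= 9)
    detractors = sum(1 for r in responses if r <= 6)
    return 100 * (promoters - detractors) / len(responses)


print(net_promoter_score([10, 10, 9, 7]))       # mostly promoters
print(net_promoter_score([10, 9, 8, 7, 6, 3]))  # promoters and detractors cancel
```

The second example shows why the single number can mislead: two strong promoters and two detractors cancel out, which is exactly the information the full distribution would preserve.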

What I imagine doing is asking people to score posts like this:

How likely are you to recommend a friend or colleague read this post?


So they are asked the question and given a slider to mark their likelihood, which includes 100% because they may have already shared it (but there's probably some UI work here to make it clear that 100% and 99% are drastically different responses).

Does this answer our question "should I read this?"? I think it may do a better job than votes, to be sure. Rather than casting an ambiguous vote, people are now at least being asked to respond directly to a question. Also, we could better use the distribution of responses to make reading decisions. For example, heretical posts might get bimodal distributions of scores, with clusters of strong detractors and strong promoters, and maybe you choose to read a post when it has at least n promoters, regardless of detractors. Maybe you choose to filter out posts with more than n detractors because you don't like controversy or low quality content. Maybe you filter on NPS or mean or median or something else, or sort based on it. And for every post, rather than showing a single number for its score like we do now, you show a box plot or some other suitable visualization of the distribution of responses.
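A rough sketch of what such reader-side filtering could look like, assuming slider responses from 0 to 100 percent. The thresholds (90 and above counts as a promoter, 60 and below as a detractor) and all function names are my own placeholders, not anything LW has or that the NPS convention fixes for a percentage scale.

```python
import statistics
from typing import Dict, Optional, Sequence

PROMOTER_MIN = 90   # assumed cut-off, in percent
DETRACTOR_MAX = 60  # assumed cut-off, in percent


def should_read(responses: Sequence[float],
                min_promoters: int,
                max_detractors: Optional[int] = None) -> bool:
    """Reader-chosen rule: enough promoters, optionally few detractors."""
    promoters = sum(1 for r in responses if r >= PROMOTER_MIN)
    detractors = sum(1 for r in responses if r <= DETRACTOR_MAX)
    if promoters < min_promoters:
        return False
    return max_detractors is None or detractors <= max_detractors


def five_number_summary(responses: Sequence[float]) -> Dict[str, float]:
    """The numbers a box plot would draw (needs at least two responses)."""
    q1, _, q3 = statistics.quantiles(responses, n=4)
    return {
        "min": min(responses),
        "q1": q1,
        "median": statistics.median(responses),
        "q3": q3,
        "max": max(responses),
    }


# A made-up bimodal ("heretical") post: five strong fans, five strong critics.
heretical = [5, 10, 0, 15, 95, 100, 90, 98, 5, 92]
print(should_read(heretical, min_promoters=3))                    # True
print(should_read(heretical, min_promoters=3, max_detractors=2))  # False
print(five_number_summary(heretical)["median"])                   # 52.5
```

The median of 52.5 looks like indifference, while the promoter count shows that a sizeable group would strongly recommend the post. That is the information a single score throws away.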

Now unfortunately NPS is more complicated than votes, so it may work against other problems people are trying to solve with votes. How does NPS help us deal with the problems addressed by karma? How do we prevent NPS from devolving into a binary where people always vote 100% to upvote and everything else is a downvote (the eBay/Uber/Lyft voting problem, where anything less than 5-stars is considered a downvote)? And do we measure comment quality with NPS, or keep votes there, or do something else?

I also don't really expect the LW team to drop everything and implement NPS. Heck, if I were working on LW I probably wouldn't jump all over this. My goal in writing this, maybe more than anything, is to get us thinking about how to better answer the question "should I read this?", and I wanted to provide at least one solution I've thought of and think could be better in some ways. I mostly think we could do more to give better signals of quality on LW and make them less distorted by, and less entangled with, the other signals people try to send with votes.

So, what do you think of the current state of votes? What problems do you want to solve on LW that votes or something else may be solutions to? And how would you improve votes or something else to solve those problems?


Coordination Problems in Evolution: The Rise of Eukaryotes

May 14, 2019 - 20:40
Published on May 14, 2019 5:40 PM UTC


This is a series of posts about coordination problems as they appear in the course of biological evolution. It is based on the book "The Major Transitions in Evolution" by John Maynard Smith and Eörs Szathmáry. The previous part, discussing Eigen's paradox as well as the origin of chromosomes, can be found here.

In this part we are going to look at the origin of the eukaryotic cell, specifically at its acquisition of endosymbiotic organelles, and at the origin of multicellularity.

Prokaryota vs. eukaryota

While all single-celled organisms may look like similar wiggly little creatures to us, there is a huge difference between prokaryota (like bacteria) and eukaryota (like protozoa or, for that matter, our own cells). The cell wall is different. The interior of the cell is different. One has a rigid cell wall; in the other it's the cytoskeleton that holds the cell together. One has a single-origin DNA strand attached to the cell wall; the other has a nucleus containing chromosomes. One has mitochondria and chloroplasts; the other does not. Even the mechanism of cell division is different. If we didn't know that we share part of the genome, it would be easy to make a mistake and believe that life on Earth had originated on two separate occasions.

The transition from prokaryotes to eukaryotes is likely the most complex transition in the entire course of evolution. It took two billion years to happen. More than the emergence of life itself.

All that being said, we are going to look only at a single part of the evolution of eukaryotes, namely at their acquisition of mitochondria. Mitochondria were free-living cells once. But then they became an inseparable part of the eukaryotic cell. Hence, back to the coordination problems!

How did it come to be that some cells started living within other cells? Well, assuming that a flexible cell wall and phagocytosis evolved before the domestication of mitochondria, getting them inside wouldn't be a big problem. It happens each time one cell eats another.

What's more interesting is how the guest cell survived and how the cooperative behavior between the host and the guest evolved.


Let's make a digression and think about symbiosis for a second. If we assume that there are only two strategies for the host (cultivate the symbiont or try to kill it) and two strategies for the symbiont (cooperate with the host or parasitize) then the problem gets reduced to variants of the prisoner's dilemma game.

Consider this kind of arrangement of payoffs. The numbers specify the fitness of the host (left) and the symbiont (right):

Host / Symbiont    cooperate    parasitize
cultivate          20 / 20      5 / 30
kill               15 / 0       10 / 5

It can be easily seen that there is only one equilibrium: Whatever the host does it's better for the symbiont to parasitize. And if the symbiont is a parasite it's always better for the host to kill it.

Under what conditions do we see this kind of game? The authors point out that this happens when each individual host acquires some genetically different symbionts from the environment. The reason is that it doesn't pay for the symbiont to invest in the cooperation with the host if the host is going to be killed by a different symbiont anyway.

How about a different scenario?

Host / Symbiont    cooperate    parasitize
cultivate          20 / 20      5 / 10
kill               15 / 0       10 / 5

The ideal strategy for the symbiont is not clear in this case. If the host is a cultivator, it may pay for the symbiont to cooperate. If, on the other hand, the host tries to kill it, the best thing for the symbiont to do would be to multiply as fast as possible, regardless of any damage to the host.

This kind of setup is expected if each host acquires only a single symbiont from the environment:

However, with hosts infected by a single symbiont, cooperative mutualism is likely to be stable once it evolves. The evolution from parasitism to mutualism will be favored if the hosts killing response is ineffective, and if the further spread of the symbiont is greater if the host does survive. It will not occur if the host can rapidly rid itself of the parasite, or if the parasite spreads only by killing the host.

Finally, let's have a look at the following scenario:

Host / Symbiont    cooperate    parasitize
cultivate          20 / 20      5 / 10
kill               15 / 15      10 / 0

Again, there's only one equilibrium. It's always better for the symbiont to cooperate and once it's cooperating, it's better for the host to cultivate it.

This happens when the host acquires one or a few symbionts from one of its parents.

It makes sense: If the only place you can disperse to are your host's children you really don't want to kill it.
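The three payoff tables above can be checked mechanically. Below is a small sketch that enumerates the pure-strategy equilibria of each 2x2 game, i.e. the cells where neither the host nor the symbiont gains by switching strategy on its own. The dictionary names are my own labels for the three scenarios; the numbers are copied from the tables.

```python
from typing import Dict, List, Tuple

# (host move, symbiont move) -> (host fitness, symbiont fitness)
Payoffs = Dict[Tuple[str, str], Tuple[int, int]]

HOST_MOVES = ("cultivate", "kill")
SYMBIONT_MOVES = ("cooperate", "parasitize")


def pure_equilibria(payoffs: Payoffs) -> List[Tuple[str, str]]:
    """Return the cells where no player benefits from deviating alone."""
    found = []
    for h in HOST_MOVES:
        for s in SYMBIONT_MOVES:
            host_fit, symb_fit = payoffs[(h, s)]
            host_ok = all(payoffs[(h2, s)][0] <= host_fit for h2 in HOST_MOVES)
            symb_ok = all(payoffs[(h, s2)][1] <= symb_fit for s2 in SYMBIONT_MOVES)
            if host_ok and symb_ok:
                found.append((h, s))
    return found


# Scenario 1: genetically different symbionts acquired from the environment.
mixed_symbionts: Payoffs = {
    ("cultivate", "cooperate"): (20, 20), ("cultivate", "parasitize"): (5, 30),
    ("kill", "cooperate"): (15, 0), ("kill", "parasitize"): (10, 5),
}
# Scenario 2: a single symbiont acquired from the environment.
single_symbiont: Payoffs = {
    ("cultivate", "cooperate"): (20, 20), ("cultivate", "parasitize"): (5, 10),
    ("kill", "cooperate"): (15, 0), ("kill", "parasitize"): (10, 5),
}
# Scenario 3: symbionts inherited from a parent.
inherited_symbiont: Payoffs = {
    ("cultivate", "cooperate"): (20, 20), ("cultivate", "parasitize"): (5, 10),
    ("kill", "cooperate"): (15, 15), ("kill", "parasitize"): (10, 0),
}

print(pure_equilibria(mixed_symbionts))     # [('kill', 'parasitize')]
print(pure_equilibria(single_symbiont))     # [('cultivate', 'cooperate'), ('kill', 'parasitize')]
print(pure_equilibria(inherited_symbiont))  # [('cultivate', 'cooperate')]
```

The middle scenario is the only one with two equilibria, which matches the text: whether the symbiont should cooperate depends on what the host is doing.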

So there's a rule of thumb emerging here: If the transmission of the symbiont happens between unrelated individuals (horizontal transmission) the symbiosis will evolve towards parasitism. If the symbiont is passed only from one parent to its children, then the relationship will evolve towards mutualism.

In fact, both experiments and observations in the wild show that vertical transmission of the symbiont leads to mutualism and horizontal transmission leads to parasitism. There are some exceptions though. For example, the transmission of luminous bacteria in deep-sea fish is horizontal, yet the symbionts are essential to the survival of their hosts.

Parasites or livestock?

Now, let's get back to the origin of the eukaryotic cell. What was the relationship between the early host cells and early mitochondria?

It may have been that the mitochondria were parasites. Maybe they sometimes escaped the host cell and infected different cells. However, the authors hint at an interesting alternative: The host cells may have farmed the mitochondria for later consumption, just as we do with cattle.

One important point to understand here is that, however we feel about slaughtering cows, from the population genetics point of view it's a mutually beneficial arrangement. Homo sapiens gets steaks. Bos taurus becomes one of the most common terrestrial vertebrates around.

So, the host cells may have first consumed mitochondria, but then learned to keep them around (or rather inside) so that they can be consumed later.

And we do see some evidence that the host cell adopts active measures to keep the relationship mutualistic. In sexually reproducing species the transmission of mitochondria happens from one parent only. When a human egg merges with a human sperm, all the mitochondria from the sperm are discarded and only those from the egg make their way into the embryo. That, according to the model described above, prevents competition between the different strains of mitochondria at the expense of the host cell.

Later on in evolution, straightforward consumption of the symbionts must have been replaced by the protein "taps" that we see installed in the membrane of mitochondria today. The taps allow the nutrients produced by the mitochondria to flow into the host cell. Think of Maasai puncturing the flesh of a cow and drinking the blood without killing the animal. The fact that the tap protein is always encoded in the DNA of the host cell rather than in mitochondrial DNA is a hint that the idea of the host cells "farming" mitochondria may not be that implausible.

Gene transfer to the nucleus

Once the mitochondria were living inside the host cell a curious process began. The genes from the mitochondria started "jumping" into the host cell's nucleus.

By losing their genes, mitochondria lost the chance to break free from the eukaryotic cell for good. So why did it happen? And who benefited?

I really like this process because it shows how complex the interplay between different levels of selection can be. In particular, we are dealing with three distinct levels of selection here: Selection on the level of the host cell, selection on the level of mitochondria and selection on the level of a single mitochondrial gene.

First, we can imagine a mitochondrial gene getting attached to a nuclear chromosome. It would be clearly advantageous for the gene: One more copy! Hooray! What's not to like?

But why didn't the gene get discarded from the nuclear DNA given that it performed no useful function? Well, it turns out that nuclear DNA can contain humungous amounts of dead code and yet the code doesn't get discarded by natural selection. Contrast that with prokaryotes, which tend to keep their genetic code short, sweet and streamlined.

But wait. The gene would still be translated into protein. That would be a useless expenditure of energy and thus it would be selected against. To make it advantageous for the cell there must have been a mechanism to transport the protein back into mitochondria. Luckily, all that is needed for that is to add to the protein a "transit peptide", a handle which would be recognized by a receptor in the mitochondrial membrane and used to carry the protein inside. Creating such a handle is easy. Baker & Schatz pasted randomly chosen pieces of E. coli and mouse DNA in front of protein genes, and found that 2.7 per cent of the bacterial inserts and 5 per cent of the mammalian ones were successful transit peptides.

Another hint that the transition may be easy is that there is no distinct pattern to the transit peptides. In other words, the transit peptides probably evolved 700 times independently — once for each mitochondrial gene that was transferred into the nucleus.

When the gene is in the nucleus and the peptide handle is in place, the mitochondria can gain an advantage by discarding the genes that they don't need anymore (they are importing those proteins from the outside anyway). And, as already mentioned, prokaryotes are very good at stripping their genome of any unnecessary baggage. Nick Lane does some back-of-the-envelope calculations and concludes that the energy savings are truly huge.

All of that being the case, the question is rather why all the genes haven't been transferred to the nucleus.

Why the gene transfer stopped

For mitochondria, the process was stopped by a change in the mitochondrial genetic code (see this only kind-of-related but fun-to-read column by Douglas Hofstadter). As soon as one of the mitochondrial codons began coding for a different amino acid, the genes could no longer jump to the nucleus. When they did, they were turned into defective proteins by the old, unmodified nuclear translation machinery.
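To see why a single reassigned codon is enough to break the transfer, here is a toy sketch. It relies on one well-known difference: in vertebrate mitochondria the codon UGA is read as tryptophan, while the standard nuclear code reads it as a stop signal. The codon tables below are deliberately tiny (only the codons used in the example), and the "gene" itself is made up for illustration.

```python
from typing import Dict, List, Optional

# Partial codon tables; None marks a stop codon.
STANDARD_CODE: Dict[str, Optional[str]] = {
    "AUG": "Met", "GCU": "Ala", "UGG": "Trp", "AAA": "Lys",
    "UGA": None, "UAA": None,
}
# Same table, except that UGA codes for tryptophan instead of stop.
MITOCHONDRIAL_CODE: Dict[str, Optional[str]] = dict(STANDARD_CODE, UGA="Trp")


def translate(mrna: str, code: Dict[str, Optional[str]]) -> List[str]:
    """Read codons left to right until a stop codon or the end of the message."""
    protein: List[str] = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return protein


# Hypothetical mitochondrial gene: AUG GCU UGA AAA UAA
gene = "AUGGCUUGAAAAUAA"
print(translate(gene, MITOCHONDRIAL_CODE))  # ['Met', 'Ala', 'Trp', 'Lys']
print(translate(gene, STANDARD_CODE))       # ['Met', 'Ala']
```

The same sequence that yields a full protein inside the mitochondrion is cut short when the nuclear machinery reads it, so a copy that jumps to the nucleus produces a defective protein.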

But that can't be the entire story. The chloroplast genes are encoded in the plain old genetic code. Yet the chloroplasts still keep some of the genes for themselves. This may (or may not) indicate that there's still some level of separate identity to the organelles, that they may have goals of their own not fully aligned with the goals of the enclosing eukaryotic cell. An example of that would be mitochondria trying to distort the sex ratio of the host species.

(As a side note, there are organelles called peroxisomes that were once thought to be endosymbionts, very much like mitochondria. Except that they had no genes at all. It has been suggested that they may be endosymbionts that have transferred all of their genes to the nucleus. However, that idea has been recently challenged.)

Multicellular life and Orgel's second rule

Multicellular life sure looks like it has a coordination problem. All those billions and trillions of cells have to cooperate somehow. Most of them have to give up individual reproduction and rather work for the benefit of all. Hell, there's even programmed cell death where the cell is expected to willingly die when there's no use for it any more.

But when you take a step back the argument doesn't make sense. All those cells are genetically identical. There aren't multiple entities engaging in a coordination problem. There's just one entity: The multi-cellular organism itself.

Or is there?

It may be instructive to pause here for a while and do an exercise in evolutionary thinking.

Consider what happens if a somatic cell mutates.

The mutation may cause the cell to divide in an unregulated manner.

But there's an even more intriguing possibility: Imagine that the cell mutates in such a way that it's more likely to give rise to a germ cell. For example, a plant cell that would otherwise produce a leaf would give rise to a flower instead. By doing so it would lower the overall fitness of the organism: The plant would now have fewer leaves than is optimal. However, at the same time the renegade cell would increase its own fitness because any pollen or seed produced by that flower would carry the mutated gene instead of the original one.

So what do you think? Does the above make sense or does it not? Is it really an intra-organism conflict? Think about that. I'll wait.

Smith and Szathmáry approach the problem by splitting it into two parts.

First, they discuss whether a mutation that increases the chance of giving rise to a gamete creates an internal conflict and, consequently, selection pressure for other cells to evolve a mechanism to prevent such mutations. They conclude that it doesn't. After all, this is not much different from when the mutation occurs in a germ cell. The child will be slightly genetically different from the parent, but that's just how evolution works. If the child happens to be more fit than the parent, it will eventually prevail in the competition on the organism level. If not, the new strain will be eliminated by natural selection.

The second part of the question is what happens if the mutant is malignant, i.e. if it causes unregulated cell proliferation. We call that cancer. In that case, the authors conclude, there will be an actual selective force to prevent, or delay, the malignancy.

Have you got that right? If not so, don't be disappointed and think about Orgel's second rule. The rule says: "Evolution is cleverer than you are."

If you want to remember just one thing about evolution, Orgel's second rule may be a good choice.

In fact, Smith and Szathmáry, as good evolutionary biologists, have the rule ingrained and conclude the section by hedging their bets:

There is, therefore, no reason to think that [specific mechanisms discussed in the book] evolved to suppress cell-cell competition. But the question is important, and we do not regard our arguments as decisive.

To be continued

In the following parts I will cover the origin of sex and of social species. In the end I am going to speculate about possible parallels between coordination problems in evolution and coordination problems in human society.

October 14th, 2018

by martin_sustrik


Unikernels: No Longer an Academic Exercise

May 14, 2019 - 20:40
Published on May 14, 2019 5:40 PM UTC


I've been following the unikernel area for years and I really liked the idea, but I was unconvinced about the possibility of the wide-scale adoption of the technology.

The cost was just too high. It required you to forget everything you knew, to drop all the existing code on the floor, to rewrite all your applications and tools and start anew. (I am exaggerating, but not by much.) If microkernels never made it, unikernels are not going to either.

Whatever the benefits, the cost was prohibitive.

Unikernels as processes

The recent "Unikernels as Processes" paper by Koller, Lucina, Prakash and Williams (free download) turns the situation on its head. It proposes running unikernels as good old boring OS processes. The idea is that most of the stack that's currently in the kernel will be in libraries linked directly into the application. Only a few calls would cross the user/kernel boundary.

One-click security

Assuming that libraries implementing POSIX APIs are available (and they are; see e.g. rumpkernel) it should be possible to take your existing application and just recompile it as a unikernel. The application would work as it did before, but it would use only a few system calls.

So, on one hand, any vulnerability in the kernel outside of those few functions won't affect your application.

On the other hand, any vulnerability in, say, the TCP or filesystem implementation would compromise your application, but the problem would at the same time be kept at bay by the separation between OS processes (different address spaces etc.). It wouldn't result in compromising other applications running on the same machine.

Now think about that from an economic point of view.

Vendors are finally forced to take security seriously. But all the options they face are rather unpalatable.

They can keep the status quo and pray that nobody bothers to hack them.

Or they have to fix the security flaws, which likely means a security audit of the entire stack and then rewriting most of the legacy code. That's prohibitively expensive. Few companies, if any, are able to afford that.

The unikernels-as-processes model is no panacea. It won't fix a SQL-injection vulnerability for you, but it addresses a broad class of highly dangerous security flaws (for estimates, see, for example, this paper).

More importantly though, it's a one-click solution! It does improve security at close to zero cost.

That, in turn, gives business people an easy way out from this uncomfortable dead-end situation.

Based on the above, my guess is that the technology, once it truly provides a one-click experience, will see quick and widespread adoption.


Compared to the one-click security, portability is just a minor selling point, but still:

At the moment, it's not trivial to port applications even between POSIX-compliant OSes, and porting to Windows is a pain in the ass, plain and simple.

The unikernels-as-processes model has an interface between the OS and the application consisting of a single-digit number of functions. Once those few functions are available and work the same on all mainstream OSes, an application written on one OS would just run on a different OS. (And I am not even mentioning the possibility of running it directly under the hypervisor or on bare metal.)
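To illustrate the shape of the idea (not the actual interface from the paper), here is a toy sketch. Everything in it is hypothetical: the `HostInterface` class, its four functions, and the `TinyLog` library are invented for the example. The point is only that the "OS" is reduced to a handful of functions, and that everything else, here a tiny record store, lives in library code that never touches anything beyond those functions.

```python
import time
from abc import ABC, abstractmethod
from typing import Dict, List

BLOCK_SIZE = 64  # bytes per block in this toy example


class HostInterface(ABC):
    """Hypothetical narrow boundary: the only functions the app may call."""

    @abstractmethod
    def block_read(self, index: int) -> bytes: ...

    @abstractmethod
    def block_write(self, index: int, data: bytes) -> None: ...

    @abstractmethod
    def console_write(self, text: str) -> None: ...

    @abstractmethod
    def clock_ns(self) -> int: ...


class InMemoryHost(HostInterface):
    """One possible host. Porting means re-implementing just this class."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.console: List[str] = []

    def block_read(self, index: int) -> bytes:
        return self.blocks.get(index, bytes(BLOCK_SIZE))

    def block_write(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("must write exactly one block")
        self.blocks[index] = data

    def console_write(self, text: str) -> None:
        self.console.append(text)

    def clock_ns(self) -> int:
        return time.monotonic_ns()


class TinyLog:
    """Library-level 'filesystem': one length-prefixed record per block."""

    def __init__(self, host: HostInterface) -> None:
        self.host = host
        self.next_block = 0

    def append(self, payload: bytes) -> int:
        if len(payload) > BLOCK_SIZE - 1:
            raise ValueError("record too large for one block")
        block = bytes([len(payload)]) + payload
        self.host.block_write(self.next_block, block.ljust(BLOCK_SIZE, b"\0"))
        self.next_block += 1
        return self.next_block - 1

    def read(self, index: int) -> bytes:
        block = self.host.block_read(index)
        return block[1:1 + block[0]]


host = InMemoryHost()
log = TinyLog(host)
record_id = log.append(b"hello")
host.console_write(log.read(record_id).decode())
print(host.console)  # ['hello']
```

`TinyLog` would run unchanged on any other class that implements the same four functions, which is the portability argument in miniature. It also means a debugger can step straight into `TinyLog`, since it is ordinary application code.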

Keep your tools

One of the things that hindered the adoption of unikernels was that they broke the existing tools and processes.

Typical question: If I am going to run my application as unikernel, how the hell am I going to debug it?

And while the debugger is the tool that comes to mind first, the same problem applies to any tool that assumes that the application is a standard executable and that it runs as a standard OS process. That will likely break most build and deployment toolchains. It can break control interfaces, monitoring and who knows what.

With unikernels-as-processes it is no longer a problem. The application IS a standard executable. And it DOES run as a standard OS process. Only minor changes to tools and processes are needed.

In fact, the switch can even improve the tools. Consider debugging an application. Currently, you stop at the kernel boundary. There's no way to step into the implementation of, say, the TCP protocol. If, on the other hand, the TCP implementation is just a library linked into your application, then sure, step into it, place breakpoints inside it, do whatever you want.

Next productivity boom

When open source infrastructure became widely adopted, we experienced a productivity boom. Instead of waiting for years until the commercial vendor implemented the feature you wanted, you could suddenly choose a free solution that aligned with your needs. And you could fix any broken or missing stuff yourself.

However, we've seen only a dampened-down version of this revolution in the operating systems space.

Sure, you can fork the Linux kernel and implement the feature you need. But then you are faced with the dilemma of either maintaining the fork yourself or upstreaming the change to the mainline kernel.

In the former case, you are doomed to maintain the fork forever. That's annoying and costly. Moreover, you have to ask the users of your application to run your fork of the operating system. That's not going to make them happy. They'll imagine you going bankrupt next year and them having to maintain a custom fork of the kernel for the next decade. They will politely back down.

If, on the other hand, you choose to upstream the change, you'll have to fight Linus Torvalds to get your patch in. If he doesn't like it, you are out of luck and back to the previous option. Even if you manage to get the patch in, it'll take years till everyone updates their OS to include your feature. Do you have a customer running RHEL 5 with kernel from 2007? Oops! Not going to work! You will end up with time-to-market of 10 years and that's guaranteed to kill almost any business plan.

With the unikernels-as-processes model, the problem disappears.

You want to tweak the IP protocol implementation? Yeah, sure. Find an existing IP library on GitHub, patch it as needed and link it to your application. Done. Anyone can run it on their off-the-shelf OS.

October 23rd, 2018

by martin_sustrik