# LessWrong.com News

A community blog devoted to refining the art of rationality

### Peekskill Lyme Incidence

Published on May 9, 2021 7:50 PM GMT

Let's say you're considering moving your nonprofit out of the Bay Area, and are considering somewhere in the Hudson Valley, perhaps near Peekskill NY. Looking at CDC maps it seems like Lyme is something to consider:

The dots are close enough together in the Northeast, however, that this map implies more uniformity than there actually is. Several years ago I made a map, coloring each county by Lyme incidence 1992-2011, and the Hudson Valley is really pretty bad:

What do things look like with more recent data? I went back to the CDC site and downloaded their public use dataset. It has annual cases by county, 2000-2018. Peekskill is in the far northwest corner of Westchester County, surrounded by Putnam, Orange, and Rockland. Even though it is in Westchester, so much of the Westchester population is farther south that my guess is Putnam is probably a better proxy?

Taking population data from the census, here's the annual rate for each county:
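The rate computation itself is simple: cases divided by population, per county per year. Here is a minimal pandas sketch with made-up placeholder numbers; the column names and figures are assumptions for illustration, not the real CDC or census layouts:

```python
import pandas as pd

# Toy case counts and populations -- placeholder numbers, NOT the real
# CDC/census figures; column names are assumed for illustration only.
cases = pd.DataFrame({
    "county": ["Westchester", "Putnam"],
    "cases": [200, 50],
})
pop = pd.DataFrame({
    "county": ["Westchester", "Putnam"],
    "population": [1_000_000, 100_000],
})

rates = cases.merge(pop, on="county")
rates["annual_rate_pct"] = 100 * rates["cases"] / rates["population"]
print(rates)
```

With real data you would do this per year as well, grouping on (county, year) before merging with the matching census population estimates.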

Putting these percentages in perspective, a 0.1% annual rate means 1 in 1,000 people get Lyme each year; over a 20-year stay, that compounds to roughly a 1 in 50 (2%) chance of getting it at least once.
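The 20-year figure follows from compounding the annual risk. A back-of-the-envelope sketch, assuming a constant rate and independent years:

```python
annual_rate = 0.001  # 0.1% per year, i.e. 1 in 1,000
years = 20

# Chance of getting Lyme at least once over the period:
cumulative = 1 - (1 - annual_rate) ** years
print(f"{cumulative:.2%}")  # ~1.98%, roughly 1 in 50
```

For small rates this is close to the naive `years * annual_rate`, which is why the 2% shorthand works.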

Mostly this makes me sad that no one has decided to manufacture the FDA-approved Lyme vaccine LYMErix, whose patent expired in 2014.

(I would like to update my map with this new data, but there are artifacts I need to look into first.)


Published on May 9, 2021 5:29 PM GMT

In The Gervais Principle, Venkatesh Rao argues that the show The Office "is not a random series of cynical gags aimed at momentarily alleviating the existential despair of low-level grunts. It is a fully realized theory of management that falsifies 83.8% of the business section of the bookstore."

In this post, I argue that viewing academia through this lens can be equally revealing. But first, we need to discuss the lens itself.

Rao develops this theory of management around the comic Company Hierarchy by Hugh MacLeod:

Hugh MacLeod’s Company Hierarchy

The theory begins by dividing people in an organization into three categories: Sociopaths at the top, Clueless middle managers, and the average workers as Losers. For brevity we will sometimes call this system the SCL hierarchy.

Because these category names were chosen for a gag comic, they aren't great matches for the groups they describe, and can even be a little confusing. Losers aren't losers in the normal sense, they are losers economically — they have struck a bad bargain where they labor for a paycheck. They don't have equity. This bargain can be perfectly rational; it's low reward but it's also low risk. Someone without great natural talents or large amounts of capital may be smart not to take these risks, and in these cases being an economic Loser is often the right call. Most of the characters in The Office are Losers, essentially everyone who isn't in management.

Sociopaths may or may not be literal sociopaths — they are like clinical sociopaths in that they are willing to take risks in the service of rewards, and that they are willing to bend social rules to do so. This too is often rational, for people who are willing to take risks and have the ability or capital to do so. It may or may not be admirable. Bending ethical rules is often bad, but bending rules like "don't question authority" or "don't have original ideas" is often good. In The Office, executives like David Wallace are Sociopaths.

Clueless might be the most appropriate of the three terms. These are people who are clueless enough that they can be easily manipulated to serve the purposes of the organization. In The Office, Michael Scott is the flagship Clueless.

The three categories can be understood as a developmental trajectory, but curiously the development is not the same as the company hierarchy.

Clueless are underdeveloped, and act like children or adolescents. They are motivated primarily by approval from authority figures, and that makes them suckers. Since they are motivated by approval, and because they are otherwise not very smart/not very self-aware, they don't realize that being a wage slave is a bad deal. This is their defining characteristic: they will work very hard for a company that doesn't value them. This is also why they end up in middle-management. They will live or die (metaphorically, we hope) for the company, which makes them very useful to organization Sociopaths, who can use them as fall guys.

In comparison, Losers recognize that being a wage slave IS a bad deal. As a result, they do the minimum necessary to not get fired and keep collecting their paycheck. Again, this is a reasonable thing to do in many cases. For example, you may be a Loser in your day job so you can pursue your real interests nights and weekends. And for many people unable or unwilling to take the risks inherent to being a Sociopath, this is an acceptable bargain. Taking such risks is not only a gamble, it often involves bending or breaking social rules. This is likely to estrange your peers, and so people who are well-adjusted will usually prefer to stay economic losers rather than become isolated.

Sociopaths are the most developed, but maybe over-developed. They understand social dynamics well enough that they begin to have a hard time taking them seriously ("...they are looking for the truth about social realities because they think they can handle it."). As a result they stop finding social or status rewards motivating, and as a result tend to value material rewards instead. Unfortunately for them, this tends to make them unhappy in the long run.

There can be state transitions. A Clueless who stops valuing approval from authority figures, or realizes that the company does not actually care about them, will stop over-performing and become a Loser. And a Loser who gains the skills, leverage, or disillusionment needed to take risks will become a proto-Sociopath, do even less work than usual, and look for a chance to get promoted. Once they have some actual bargaining chips, they become a real Sociopath. Losers do not generally become Clueless, because it's rare for people to regress that much.

As a result, the categories aren't exactly personality traits, and they're not exactly descriptions of where you exist in the company hierarchy, they're somewhere in between. The same person might be a Loser in one company, then leave the company to found a startup (where, as a founder, they are a Sociopath). If the company goes under, they might get another Loser job. On the other hand, someone who is Clueless will be an economic sucker wherever they go, so they will probably end up in Clueless positions in every company they become a part of.

In addition, Rao describes four languages (and a fifth semi-language) that the three groups use to communicate. More on this later.

A critical point, and one that is easy to miss, is that the Clueless serve two main purposes in an organization. First, since they have misplaced loyalty to the organization, Sociopaths can use them as cat's-paws and fall guys. They can get them to take risks by proxy, and have them take the fall for these or other risks as necessary. Second, they serve to insulate the Sociopaths from the Losers, or as Rao puts it, "to provide a buffer in what would otherwise be a painfully raw master-slave dynamic in a pure Sociopath-Loser organization."

In this post I use this theory of management to analyze the dynamics of academia. There are a couple of good reasons to do this. Applying the theory to a new area is a good way to explore it and test its power as a framework. It can help explain some parts of academia that might otherwise seem confusing. And finally, I think it can explain some aspects of the current culture war (so caveat lector for those of you who are wary of such things).

A university is an organization just like any business. Dunder-Mifflin has many branches; the Scranton branch is just one branch of many. A university has many departments; each department is just one department of many. Or we could say, academia is divided up into many different universities; each university is just one university of many. So we might analyze this system at the department level, at the university level, or at the all-academia level, but it doesn't make much of a difference.

To begin to analyze a system from the SCL perspective, we first need to figure out which of the three groups people belong to. In case you wonder where my loyalties lie, know that as a PhD student, under this system of analysis I am decidedly a Loser. But more on this later.

Rao names a number of signs by which we can identify the three groups:

Clueless

• Over-perform for their organization, marking themselves as suckers
• Identify with the organization, to the point of having loyalty (which is not requited)
• "...cannot process anything that is not finite, countable and external. They can only process the legible."

Losers

• Do the bare minimum to stay in the organization (and the more they under-perform, the more likely they are to become Sociopaths)
• Conflate material (e.g. money) and emotional (e.g. status) rewards; "cannot process the material aspect of anything that involves strong emotions"
• Jockey over social status, rather than material power (Sociopaths) or approval from authority (Clueless)

Sociopaths

• Play for real (material) stakes
• "about recognizing that there are no social realities"
• Perform "game design" for the organization, arranging for social competition (Losers) and medals and ranking schemes (Clueless), while collecting material rewards for themselves

Academia has many different sub-groups. This is not unlike business — the Scranton branch has warehouse staff, support, and sales, as well as a manager. In academics, however, the structure is less immediately hierarchical, and so it is worth examining this system at every level. We will skip a few levels to simplify. In particular we will ignore MA students, but that's ok, they're used to it.

Undergrads as undergrads do not fall into the SCL hierarchy. After all, they're not part of the organization — to a university, undergrads are consumers, not employees. But undergrads who aspire to become academics ARE semi-employees, usually through serving as research assistants (RAs).

Most RAs are Losers. They are engaged in a bad economic deal — exchanging their labor for a recommendation letter, a long-shot chance at graduate school. They're not even paid, so in most cases this is an even worse deal than working for a paper company. Undergrads don't tend to be experienced enough to understand this the same way most workers do, but they usually have an intuitive sense for it, which is why few undergrad RAs put in long hours or show much devotion to their lab.

A small number of RAs, however, do devote themselves to their lab and work heroic hours on thankless research projects. If you've spent any time in academia, you recognize this character. RAs who act this way are Clueless — remember, the defining characteristic for this group is over-working themselves for an organization that couldn't care less about them and doesn't reward them. In this way they send a strong signal that they are suckers who can easily be exploited. Unsurprisingly, these RAs are destined to go far.

RAs cannot really be Sociopaths because undergrads, as Rao would put it, are playing with monopoly money. Having already paid tuition, they have almost nothing academia could want from them. Any RAs with Sociopath tendencies present instead as low-performing Losers. They are in the system only to look for opportunities. A hypothetical true Sociopath RA would need to have either their own funding or truly blockbuster ideas, and would turn them into first-author (or even single-author) publications in good journals, or better yet, use their ideas to do something like get a book deal or found a startup.

Especially cynical RAs will choose projects that appear to be very difficult but are secretly very easy — for example, data coding tasks that can be automated with a simple script — in order to appear Clueless on grad applications.

1.2 PhD Students

Two kinds of students are selected for PhD programs. The first are those who have proven themselves, as RAs, to be utterly Clueless. This looks like accomplishing many impressive projects as an undergraduate (for neither pay nor credit) and having very impressive recommendation letters (and nothing else) to show for it. For related reasons, these students also tend to have very good grades. As discussed, this singles them out to faculty as suckers who can be easily exploited for lots of labor. Faculty are probably not self-aware enough to see it this way, but in their own jargon, they recognize the students will be "productive".

This is part of why burnout is such a big problem in graduate school. Not because there is a problem with the system (though there is), and not because faculty push students to overwork themselves (though some do), but because there is an enormous selection pressure to promote Clueless RAs into PhD programs. In many ways this is like the promotion of the Clueless that Rao describes in the business world.

On another level, the Clueless do very well as PhD students. As Rao says, to the Clueless "everything worth learning is teachable, and medals, certificates and formal membership in meritocratic institutions is evidence of success." So while they find PhD programs stressful, it's at least stressful in a way they understand.

However, there are only so many Clueless RAs in a given year. In addition, everyone can tell that the work done by Clueless undergrads is not very creative; it looks less like independent work, and more like pulling multiple all-nighters on someone else's project. As a result, faculty can tell that this student "may not be able to do original work".

So the second kind of students selected for PhD programs are the Losers who have some Sociopathic tendencies. As mentioned, most RAs are Losers. The ones who float to the top tend to be those who trend Sociopathic, because this tendency will inspire them to create something they have at least partial ownership over, and this ends up looking like the ability to come up with original lines of work. Faculty value this, since it leads to a different kind of productivity, and so these students are often admitted as well.

In addition, faculty who lean Sociopathic will be tempted to admit students of the same stripe, because they value having someone around who sees things the same way they do. This is true even if neither of them are true Sociopaths.

So in admission to PhD programs you tend to have an even split of Clueless and Losers who trend Sociopathic. Beginning to get some real power, and growing steadily more disillusioned, some of these Losers will become true Sociopaths. This is pretty rare, however, since PhD students rarely have the power or will to play at that level. Those that do often get summer internships for major companies, leave early to do something like found a startup, or spend all their time secretly working on a side project instead of attending to their graduate research.

Losers without Sociopathic tendencies don't often make it to grad school, and don't often stay when they do, because true Losers put community and their emotional life first. This forms a feedback loop. There is not much of an emotional life in grad school, so the people who value it leave, so there is not much of an emotional life in grad school...

1.3 Faculty

There are even fewer faculty positions than there are spots in PhD admission, so at this level we see another round of strong selection pressure.

This is where we run into the first major surprise from analyzing academia from a SCL perspective. Because while you might expect me to say that faculty are mostly Sociopaths, in fact they are almost entirely Clueless. I think this is true of both tenured and untenured faculty, so I will treat them together from here on.

The defining characteristic of the Clueless is over-performing relative to their level of reward. It's hard to imagine a better way to describe university faculty. Everyone knows that the unappreciated workload of faculty is massive, and the unpaid workload even larger. They teach classes for humorously low wages, edit journals for "prestige", and perform peer review for nothing at all.

The Clueless "cannot process anything that is not finite, countable and external. They can only process the legible." Certainly this describes the behavior of faculty, literally counting lines on their CV, grubbing for citations, breathlessly calculating their h-index.

This sounds more than a little abusive, and it is, but in many ways, these people are attracted to academia for exactly these reasons. "The Clueless can process the legible," says Rao, "so a legible world is presented to them." In this way they find it very comforting.

Rao even has a whole section on the humor used by each group, and while it is a little hard to explain, faculty definitely have Clueless humor. Losers make jokes for the group, and often use forms of humor that encourage the group to join in. Sociopaths make jokes for themselves, that other people don't get, and often don't even notice. But the Clueless make jokes that are antisocial and yet also not for their own benefit. I can testify that sitting in on faculty meetings is a lot like sitting in on the lunch table at the local high school. Different faculty members will all try to make the same joke, one after another. They will make jokes that you can't build on, to which there is no possible response. They will say a joke, and then when no one laughs, they will say the same joke again, only louder. Rao's other note on the Clueless is that they make you cringe with their actions, and faculty humor is nothing if not cringeworthy.

Despite ostensible appearances to the contrary, faculty are not at the top of the pyramid in academia. Instead, they are academia's middle-managers.

Some fields are probably more this way than others. A field where there is more room for material rewards, where labs can land huge grants, may be more likely to attract Sociopaths. But on the other hand, "me win most grants" is also very legible.

Of course, there are some Loser faculty and some Sociopath faculty in every field. The Losers are distinguished by being very aware that being a professor is an economically raw deal. They tend to be people who have a passion for research or teaching and accepted this bad deal because it let them fulfill themselves in their other calling. I know one professor who never applies for grants and never takes grad students. As a result his department has pushed him into a tiny office, but he doesn't care. He just wants to do his independent work without being bothered, and his tenured position has landed him a situation where he can focus on that.

Sociopath faculty are distinguished by using their faculty position as part of a wider portfolio, or as a stepping-stone to other things. Any faculty member who is involved with a startup or has several popular trade books might be a Sociopath. Steve Pinker, who clearly aspires to be more of a public intellectual than "merely" a Harvard Professor, is almost certainly a Sociopath under this system (and maybe in general).

So why are university faculty almost universally Clueless? I think there are two main reasons. First of all, doing hard work for little reward marks you as exploitable, and two levels of filtering for that trait leads to an inevitable conclusion.

Second, the Clueless serve a special role in a large organization, that of separating the Losers from the Sociopaths at the top. "Without it," says Rao, "the company would explode like a nuclear bomb, rather than generate power steadily like a reactor."

1.4 The Top???

Universities are organizations. But as we've just seen, the faculty are not ruling the roost — they are all Clueless. Who is working these machines from the top?

My first instinct is to say that these are the people directly above the faculty, maybe the deans. This makes some sense — I have almost no experience interacting with any dean. But the stereotype of the university dean seems a lot more Clueless than Sociopath. In large organizations, there may be many layers of Clueless middle-managers, and universities are very large.

Maybe you have to go higher. The board of trustees? The president of the university? Maybe, but my limited experience of these people is that they seem pretty Clueless as well. The president gets paid a lot, but maybe seems like a potential fall guy, which would make him Clueless. But who would he be taking a fall for? I don't know; but given that I am a Loser in these organizations, and I don't even know who my local Sociopaths are, the system seems to be working as intended.

It's also possible that universities have evolved to be a truly headless entity, but I find it hard to believe that someone isn't profiting off of this system. At their heart, many major universities are real estate companies. Between them, Harvard and MIT own most of Cambridge. NYU is slowly buying up as much of Manhattan as they can. What makes these companies special is just how much Clueless flash they have been able to put between themselves and the public eye.

(In fact, the fact that many universities are secretly real estate companies makes me wonder if there might be a Georgist interpretation along similar lines — Universities are landlords, faculty are capital/bosses, and grad students/undergrads are labor. This matches the three categories of SCL surprisingly well.)

This reinforces the value of having most of the faculty be Clueless. They work long hours to be very distracting. They have brand loyalty to the university, even when that university is a monster. They will take risks for the university in exchange for nothing more than "medals, certificates and formal membership." And when the university needs someone to take the fall for a risk that went wrong, the faculty are always there to take that fall.

In Part II of his essay, Rao describes the four (plus one) languages that the different groups speak with one another.

Powertalk is the language that Sociopaths speak among themselves. It is the most interesting language on its own merits, but as academics are almost never Sociopaths (in the SCL sense of the term), we won't discuss it in depth here. Read Rao's essay for the fascinating details.

Posturetalk is the language that the Clueless speak to everyone; indeed, "they don’t have an in-group language since they don’t realize they constitute a group." In academia, the typical Clueless is a faculty member, so this is the jargon you hear from faculty — differing slightly by field, but vague and stuffy in the ways you expect.

Babytalk is the language the other two groups use to address the Clueless. Rao emphasizes that Babytalk "seems like Posturetalk to the Clueless." In the case of academia, this is something that sounds like jargon to faculty, but is actually dismissive of them. I'll further emphasize that the purpose of Babytalk is to allow the other two groups to manipulate the Clueless, which in this case means manipulating faculty and senior PhD students. More on this in a minute.

Finally, Gametalk is the language spoken among Losers as part of their pecking-order games. "Gametalk leaves power relations unchanged because its entire purpose is to help Losers put themselves and each other into safe pigeonholes that validate do-nothing life scripts." If you are cynical enough, maybe this will sound like the language of undergraduates to you too.

That black line on the diagram goes officially unnamed per Rao, because one of the functions of the Clueless is to provide a buffer between Sociopaths and Losers, so they never have to / get to communicate. But he says this is "an unadorned language you could call Straight Talk if it were worth naming." In academia the Sociopaths are so far removed from the Losers that this is not worth considering.

Clearly the most interesting of these languages, in the context of academia, is Babytalk, the shared language spoken by both Losers and Sociopaths when they want to placate, manipulate, or distract the Clueless. Notably, to the Clueless it sounds largely like their own language, Posturetalk, so to an academic, this will sound something like academic jargon.

I submit that in the modern political climate, "woke" language is an important dialect of Babytalk.

It fits all the criteria. Woke language borrows the style of academic jargon and sounds a lot like academic speech to faculty. None of them are aware that they are being condescended to.

Woke language rarely changes anything substantive — in Rao's words it "leaves power relations basically unchanged" — but it is very effective in manipulating Clueless faculty and senior PhD students. The more progressive among them will readily back down out of agreement with the ideology, and the less progressive will back down out of their irrational and disproportionate fear of being "cancelled", this despite the fact that in reality professors are almost never "cancelled" for anything.

Finally, woke language can be identified as Babytalk by the fact that neither Sociopaths nor Losers use it among themselves; they only use it in communicating with the Clueless.

Certainly we don't expect the academic Sociopaths, whoever they are, to use woke language in their personal dealings. But some of you may be surprised to learn that students don't use woke language among themselves either. Now, their own ingroup language, or Gametalk, does involve similar issues of race, class, and gender, but it's distinct from the Babytalk they direct at faculty. Those of us Losers who have spent a lot of time in progressive spaces, I'm sorry to reveal, can easily distinguish between the two.

Some people are even consciously aware of using woke language to condescend to or manipulate the local middle-managers. I happened to be speaking with an undergraduate student recently. This student is not only from a notoriously progressive, even radical college, they also fit many of the personal stereotypes of "wokeness" — they are queer, asexual, neurodivergent, etc. But at one point when we were discussing a problem they were facing with the administration, they told me:

I feel like putting my case in personal terms is wrong because my race shouldn't matter, but I feel like our Dean of Students will only listen to me if I frame it as a threat to me as a woman of color.

This matches my general experience as well. Faculty and administrators prefer to ignore student concerns, but they freak out when presented with issues of race or gender. Students are in tune with this and learn that they have to frame things this way to have any hope of getting anything done.

Presumably the local Sociopaths are aware of this as well. As a result it is not surprising that both Loser and the Sociopath academics use woke language as part of their Babytalk. Nor is it surprising that many professors experience a world where everything is framed in the woke terms of race, gender, and class. This is not the dialogue spoken in the great wide world, but these professors have made it clear that this is the only kind of framing they will pay attention to, and people have adjusted their messaging accordingly. I understand that this puts them in a cold sweat, but it's hard to feel sympathetic.

This is further emphasized by the fact that when faculty (who are Clueless) try to describe "woke ideology", they fail miserably. This will be invisible to outsiders because Babytalk is designed to pass for the faculty's native Posturetalk, but believe me, professors could not remotely pass the Woke Turing Test. Young people today are not afraid of being challenged; they do not reduce themselves to their skin color or their genitalia. They are not "confused" about their gender. When college professors express concern about this sort of thing, they're just showing that they do not even understand the terms they are being condescended to with.

Because woke language sounds like Posturetalk to the Clueless, some of the terms have been adopted by Clueless PhD students and faculty. At this point, we shouldn't be surprised to hear the Clueless using woke terms with the other groups and even with each other. When the Clueless use it, however, it is totally ineffectual, and never sounds quite right.

Since there are Clueless PhD students and even Clueless undergrads, you will sometimes hear earnestly outrageous "woke" messages coming from them. But the main use of woke language is to cajole or frighten faculty into submission. There's no real power here, it's a bluff — Rao says, "Posturetalk and Babytalk leave things unchanged because they are, to quote Shakespeare, 'full of sound and fury, signifying nothing.'" But because faculty are concerned about and/or afraid of these issues, they can often be convinced to back down by the use of this kind of language.

2.2 Actually Being Cancelled

What about those faculty members who are cancelled? Here we return to the other organizational role of the Clueless, that of being a convenient scapegoat.

It's hard to know for certain, as these decisions are deliberately obscured, but I suspect that many of these faculty were fired for reasons unrelated to wokeness. Rao describes how Sociopaths set up bureaucracies that are designed to be byzantine and become clogged with appeals. When they want to keep something from happening, they let the appeals pile up. But when they want to make something happen, the Sociopath who handles the exceptions lets the right appeal jump the queue.

So in the few (rare) cases where a professor was actually cancelled by their university, I suspect that what happened was that the university wanted to fire them for some unrelated reason first. To protect the people at the top, however, and redirect the blame to the students and the bureaucracy, they first waited until a student made a complaint about the professor in question. This complaint then jumped the queue and was promoted to the level of an Issue, and the professor was fired, ostensibly as the result of the student complaint. This is hard to prove but it makes sense when we observe a professor being fired for what appear to be very flimsy reasons.

This is not the only way faculty can be made to take a fall for something. It's also possible, for example, that if some other scandal were about to come to light, a school could fire some professor for "wokeness" reasons as a way to distract from the other issue.

3. Other Implications

Some final thoughts on implications of this analysis.

Everyone knows that graduate school is kind of hellish. People are overworked and underpaid. Most of them slowly become aware they will never get an academic job. They burn out, suffer breakdowns. A lot of work goes into making it a better place for everyone.

But viewing it through an organizational lens suggests that these problems can't be solved: they're inherent to the system, and can't be gotten rid of. Rao says that theory of management is "based on the axiom that organizations don’t suffer pathologies; they are intrinsically pathological constructs." Again to quote Rao directly, "It is designed to fail in ways that achieve unspoken Sociopath intentions, while allowing them to claim the nobler explicit intentions enshrined in the law. "

It's even possible that attempts to make things better will make things worse, as it gives the Sociopaths at the top an opportunity to fiddle with the system to better suit their needs. Rao says of bureaucracies that you should "periodically attempt to 'reform' it through means that only ensure it gets worse (adding complexity)." If you have spent any time in academia, this will sound familiar to you.

If this perspective is correct, the only thing to be done is to abandon grad school altogether. But as long as there are Clueless students who can be recruited to feed the machine, I'm afraid this cycle will continue.

The same probably goes for graduate admissions and faculty hiring. These processes are perverse not as a bug, but as a feature. At some level, most universities really do want committees to hire the person with the longest CV, regardless of how much crap is on it, because the longest CV picks out the Clueless applicant who is the biggest sucker. Universities can find many uses for such suckers.

3.2 Science Generally

An interesting implication is that the reason science sucks so much these days is that mainstream science has been "captured" to serve as the intent-obscuring bureaucracy of a set of major organizations.

In the old days, most scientists were Sociopaths, bored of life, pursuing meaning through solving mysteries. A small number of them had day jobs as Losers, and did science on the side as their hobby. But "science" is now dominated by the group with the lowest level of development, the Clueless, which can only bode poorly.

This is in line with my more general feeling that we should expect most scientific progress to occur outside the academy, though it is something of a problem that so much science funding is captured in this way.

Discuss

### [Event] Weekly Alignment Research Coffee Time (05/10)

9 мая, 2021 - 14:05
Published on May 9, 2021 11:05 AM GMT

Just like every Monday now, researchers in AI Alignment are invited for a coffee time, to talk about their research and what they're into.

And here is the everytimezone time.

Small change for this second edition: the link to the walled garden now only works for AF members. Anyone who wants to come but isn't an AF member needs to go through me. I'll broadly apply the following criteria for admission:

• If working in an AI Alignment lab or funded for independent research, automatic admission
• If recommended by an AF member, automatic admission
• Otherwise, at my discretion

I'd rather not admit people who might be interesting but who I'm not confident won't derail the conversation, because this is supposed to be the place where AI Alignment researchers can talk about their current research without having to explain everything.

See you then!

Discuss

### If you want Trump back on Twitter, try serving more alcohol and playing more video games

9 мая, 2021 - 10:24
Published on May 9, 2021 7:24 AM GMT

I'd like to talk about social host liability. If, as you read that term, you parsed 'social' as in social media, 'host' as in webhost, and 'liability' as in a social media webhost being held legally liable when its negligence harms third parties... then you don't know what social host liability is. You do, however, have an intuitive grasp of the point I'd like to make in this article.

Social host liability means that if you are serving alcohol at a party, and you serve one of your guests to a point of obvious intoxication, and that guest leaves and gets in a car accident or otherwise hurts someone where the proximate cause is determined to be intoxication, then you're legally liable. Dram shop laws are similar but apply to commercial establishments (e.g. bars) rather than private citizens hosting parties.

I mention this because we have a clear precedent, in another context, of stopping people from irresponsible overuse of a cognition-impairing addictive substance. In some cases we go as far as to hold businesses and private citizens legally liable if they allow a person to continue to irresponsibly use that substance.

While I've seen various criticisms of social host liability and dram shop laws, it's never along the lines of "we're violating people's First Amendment rights by prohibiting them freedom of expression through drunk driving."

Twitter in 2021 is one of several businesses that have successfully developed a scalable way to surreptitiously give hundreds of millions of people a behavioral addiction and monetize it… oh and all the while making them believe they’re virtuously practicing free speech.

A more accessible account of what Twitter actually is comes from an Irish podcaster with a plastic bag on his head, Blindboy (quote starts around 1:07:00):

Twitter is a video game that people don't know they're playing. Because the thing with Twitter is you're always engaged in an act of performance... Twitter encourages people to create a characterized version of themselves and to perform this character, like a role playing game. Twitter is a giant, text-based, massively multiplayer online role playing game where it rewards hostility... You get rewarded on Twitter for having really good complaints. If you can think of a really good complaint on Twitter, you'll get lots of points in the form of retweets and likes. But the thing is, if everyone on Twitter is complaining, because this is what's getting you likes and points, then Twitter becomes an excessively negative place, which it is.

I would only add that, unlike other MMORPGs, how good (or bad) you are at playing Twitter can have substantial effects on your real-world reputation.

A slightly more academic perspective on the topic comes from Shoshana Zuboff's excellent book, The Age of Surveillance Capitalism. Just as capitalism optimized Doritos to be maximally addictive, it did the same for casinos, and much of what was learned about casinos was repurposed for social media. Zuboff is discussing Facebook here, but in terms of having a financial incentive to increase advertising revenue through engagement, Facebook and Twitter are isomorphic (hyperlinks are my own and added for clarity).

The hand-and-glove relationship of technology addiction was not invented at Facebook, but rather it was pioneered, tested, and perfected with outstanding success in the gaming industry, another setting where addiction is formally recognized as a boundless source of profit. Skinner had anticipated the relevance of his methods to the casino environment, which executives and engineers have transformed into as vivid an illustration as one can muster of the startling power of behavioral engineering and its ability to exploit individual inclinations and transform them into closed loops of obsession and compulsion.

No one has mapped the casino terrain more insightfully than MIT social anthropologist Natasha Dow Schüll in her fascinating examination of machine gambling in Las Vegas, Addiction by Design. Most interesting for us is her account of the symbiotic design principles of a new generation of slot machines calculated to manipulate the psychological orientation of players so that first they never have to look away, and eventually they become incapable of doing so. Schüll learned that addictive players seek neither entertainment nor the mythical jackpot of cash. Instead, they chase what Harvard Medical School addiction researcher Howard Shaffer calls “the capacity of the drug or gamble to shift subjective experience,” pursuing an experiential state that Schüll calls the “machine zone,” a state of self-forgetting in which one is carried along by an irresistible momentum that feels like one is “played by the machine.” The machine zone achieves a sense of complete immersion that recalls Klein’s description of Facebook’s design principles—engrossing, immersive, immediate—and is associated with a loss of self-awareness, automatic behavior, and a total rhythmic absorption carried along on a wave of compulsion. Eventually, every aspect of casino machine design was geared to echo, enhance, and intensify the hunger for that subjective shift, but always in ways that elude the player’s awareness.

Schüll describes the multi-decade learning curve as gaming executives gradually came to appreciate that a new generation of computer-based slot machines could trigger and amplify the compulsion to chase the zone, as well as extend the time that each player spends in the zone. These innovations drive up revenues with the sheer volume of extended play as each machine is transformed into a “personalized reward device.” The idea, as the casinos came to understand it, is to avoid anything that distracts, diverts, or interrupts the player’s fusion with the machine; consoles “mold to the player’s natural posture,” eliminating the distance between the player’s body and frictionless touch screens: “Every feature of a slot machine—its mathematical structure, visual graphics, sound dynamics, seating and screen ergonomics—is calibrated to increase a gambler’s ‘time on device’ and to encourage ‘play to extinction.’” The aim is a kind of crazed machine sex, an intimate closed-loop architecture of obsession, loss of self, and auto-gratification. The key, one casino executive says in words that are all too familiar, “is figuring out how to leverage technology to act on customers’ preferences [while making] it as invisible—or what I call auto-magic—as possible.”

Common carriers are not behavioral addictions

Infamously in 2012, Megaupload was shutdown because it was often used by people pirating media. Commenting on this event Steve Wozniak said:

If someone commits a crime shipping drugs on the sea, you don't drain the sea and say the sea is the problem. If they are mailing drugs through the post office you don't shut the post office down you try to get the people who are doing the wrong steps.

The implication here is that Megaupload was a common carrier, a neutral conduit, like the postal service. The wrongdoing, where it occurred, was on the part of people using Megaupload illegally, like someone sending drugs through the mail. I agree with Steve here; it was an excellent point about Megaupload, which really was much more like the postal service.

Is Twitter like the postal service? We can answer that with simple thought experiments.

How many people, for example, regularly stay up until 3am compulsively stuffing envelopes with angry letters to strangers they disagree with about, say, K-pop bands? And, even if some eccentric types like that did exist, is the postal service in the business of amplifying their behavior and encouraging others to respond in kind?

User content on Twitter isn't in an envelope or shipping crate. To the contrary, user content on Twitter is something like a prodrug: a substance that by itself is mostly inert, but is algorithmically metabolized by the platform to become "biologically active" and addictive. Jaron Lanier suggests this is done, to some extent, by separating users into bins, finding what kind of content engages those users, and then finding ways to get it in front of them.
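A minimal sketch of the bin-and-engage loop Lanier describes (all segments, topics, and engagement numbers here are invented for illustration; this is not Twitter's actual system):

```python
# Toy "algorithmic metabolism": bucket users into bins, then rank
# candidate posts by the bin's historical engagement with each topic.
# Every name and number below is hypothetical.
engagement_by_bin = {
    "kpop_stans":       {"kpop_feud": 0.31, "cooking": 0.02, "politics": 0.05},
    "politics_junkies": {"kpop_feud": 0.01, "cooking": 0.03, "politics": 0.40},
}

def rank_feed(user_bin, candidate_posts):
    """Order posts so the most engaging topic for this bin comes first."""
    rates = engagement_by_bin[user_bin]
    return sorted(candidate_posts,
                  key=lambda p: rates.get(p["topic"], 0.0),
                  reverse=True)

feed = rank_feed("kpop_stans", [
    {"id": 1, "topic": "cooking"},
    {"id": 2, "topic": "kpop_feud"},
    {"id": 3, "topic": "politics"},
])
print([p["id"] for p in feed])  # [2, 3, 1]
```

The inert "prodrug" is the post itself; the metabolism is the ranking step that decides whose eyes it lands in front of.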

Consider how few people care about the press releases Trump has been writing since he was booted off the platform. Even though Trump's post-ban press releases are written like Tweets (down to the @ mentions), they’re relatively inert sans Twitter's "algorithmic metabolism."

Trump’s derangement syndrome

Jaron Lanier often remarks on how he met Trump at several points in his life and watched Trump become progressively more deranged as Trump‘s Twitter use increased. As Jaron put it in Ten Arguments for Deleting Your Social Media Accounts Right Now:

As a Twitter addict, Trump has changed. He displays the snowflake pattern and sometimes loses control. He is not acting like the most powerful person in the world, because his addiction is more powerful. Whatever else he might be, whatever kind of victimizer, he is also a victim.

A better question than “why are we banning Trump from Twitter?” is instead “why are we encouraging high-profile politicians to engage with potential behavioral addictions?”

That's really a very serious question. Can you imagine a situation where large groups of people would demand we allow someone who still wields tremendous real world influence and power to become more regularly drug-addled? If substances are too much of a stretch, then substitute a behavior like gambling and the analogy still holds.

The appropriate feeling towards Trump's ban from Twitter strikes me as mudita—a sympathetic joy that he's been given the possibility to dry out and sober up, as it were. I know that sounds like some hippy shit, but I'm not ashamed to say I believe it. He has a real chance for reflection and growth right now, and I hope he uses it.

A bearded Trump deprived of salon appointments due to complications from eating a carnivore diet ferries Republicans to the polarization realm... err I mean a 19th-century interpretation of Charon's crossing by Alexander Litovchenko

In all seriousness, "Trump in recovery" strikes me as something that would be powerfully good for the world. Where he was once a Charon-like character ferrying people into ever more polarizing edges of the Internet, he could instead become something more like a Bodhisattva Boatman who "aspires to become a Buddha simultaneously with other sentient beings, sharing their difficulties and encouraging them along the way." ... okay, I'll admit I'm being slightly facetious here, but I still think it's a good idea.

It's time to update

At the risk of writing a post that receives even more negative karma, I want to say ragintumbleweed and the rationalists who want to see Trump back on Twitter have all missed the elephant in the room—they all seem to presuppose that Twitter in 2021 is a game that's safe for individuals and humanity at large.

The rationalists against the Trump Twitter ban would have been closer to the mark had the ban happened in 2012; they are making arguments that still echo the ones Steve Wozniak made in that era: these companies are common carriers, the marketplace of ideas, free speech, etc.

The seismic shift of the last 10 years is social media platforms' ability to freebase user-generated content and create serious behavioral addictions with very salient real-world consequences. We ignore that at our own peril.

We‘re making a category error if we continue to discuss Twitter like it’s the same platform it was 10 years ago.

Relevant meme for Gen X LWers
Relevant meme for Millennial LWers

Zoomers and Boomers... I don't know how to speak your language. I hope you can find someone to translate. ¯\_(ツ)_/¯

Discuss

### Review and Comparison: Onto-Cartography & Promise Theory

9 мая, 2021 - 09:42
Published on May 9, 2021 12:42 AM GMT

Introduction

This is my first post here, although I've lurked for five or six years. I came up with this idea when I noticed similarities in these two books and reread them - both tackle some flavor of practical ontology, but from different angles. The first is primarily a philosophy text and the other's the basis for CFEngine, a "configuration management system" for computer networks. One works with words, the other theorems - one has agents, the other machines. I think there's a lot of overlap in how these ontologies behave and what concepts we can draw from them, so here's a hybrid casual review and attempt to extract the marrow.

Onto-Cartography: An Ontology of Machines and Media

This book is part of a current trend in philosophy to take objects just as seriously as subjects (OOO, or object-oriented ontology). One of the better motivations for this trend is global warming and existential risk more broadly - the idea that some objects are "hyperobjects" too multifarious or widely-distributed for people to deal with them directly. This book tries to come at this sort of problem from another angle, by maintaining materialism while making everything an agent no matter how non-agentic it looks.

The "machines" of the title are just things that perform functions, in the mathematical sense - to be worth discussing you've got to be changing inputs into outputs. Any machine in this sense is also made of tinier machines doing things to each other. (This also includes abstract concepts, which are taken as different from concrete ones only in that they're "multiply-instantiated" in the world.) The exceptions here are "dormant objects" that don't affect things any longer, like a book that no one alive has read, and "dark objects" that exist theoretically but have no effect on anything yet, like a planet outside of our light cone. In the opposite direction some objects are "bright" or "rogue" and interact a lot with others or have massive influence, like the Sun does for Earth or clocks do for the structure of our days.
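Taken literally, this picture maps neatly onto function composition. A toy rendering (my gloss, with invented machines, not the book's formalism):

```python
# Toy gloss on the book's "machines": anything that turns inputs into
# outputs, where larger machines are compositions of tinier ones.
# The machines here are hypothetical examples.

def grind(beans: str) -> str:       # a tiny machine
    return f"ground {beans}"

def brew(grounds: str) -> str:      # another tiny machine
    return f"coffee from {grounds}"

def coffee_maker(beans: str) -> str:
    """A machine made of tinier machines doing things to each other."""
    return brew(grind(beans))

print(coffee_maker("arabica"))  # coffee from ground arabica
```

A "dormant" machine, in this rendering, would be a function nothing ever calls; a "dark" one, a function not yet defined anywhere in scope.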

The book's proposal is that we can tinker with these machines by understanding the smaller machines they're made of, their levels of "gravity" for other machines and what sorts of relations they have with each other. When you find a situation defined heavily by one machine, you can introduce alternatives that are brighter and restructure everything, or break it down and remove an offending part that makes it less powerful.

Ontocartographers would, as their name suggests, map these machines and their relations and try to alter both for the better. Just like a machine is made of smaller machines, any group of machines that's well-coordinated is itself a machine, and most problems machines face are problems of their parts coordinating. Where a layman would first ask what a thing is, what it's made of, etc. a dashing ontocartographer would ask what it does and what's being done to it.

There's a deeper level of philosophy in this book that may have gone over my head, partly because this is also an attempt to make an OOO-friendly, naturalist, 'flat' ontology. For instance, many machines preserve their organization from moment to moment as a kind of constant "becoming", and are subject to entropy where their component machines are more likely to be in different places the more they function in a given timescale. They each have some ability to do things they aren't doing right now, and form one giant system where everything influences everything else. They can also "mediate" each other, carrying the products of one system to another, although this doesn't come up very much.

Ultimately I found this the more accessible of the two, but I feel that it lacks application on its own. It achieves the typical level of abstraction I'd expect from ontology but doesn't use many examples to raise itself back toward concreteness or to provide real programs for changing system x or y. Out of the two methods this is the one I could imagine a science-fiction protagonist using without any trouble, but that I'd have problems implementing myself.

Promise Theory: Principles and Applications

It surprised me when I first learned that "ontologist" is a job title, but this is a good attempt to make one version of it rigorous and possibly computable. There are ties to information theory, decision theory and queueing theory here, but they're not very fleshed out in the book. On the ground-level we're basically doing directed graphs with additional, optional structures for the information we want to capture and paying attention to things like symmetries where agents reach equilibrium with each other.

Promises are a kind of communication where one agent shows an intention to do something and another "assesses" the likelihood of it happening. They cut out parts of possibility space for other agents, and they're idempotent - a promise to promise to do x is the same as a promise to do x. (Cutely, there's also an empty promise, which is always considered to be made and kept.) The motivation here is to replace obligations with something more local to the object-under-discussion. The idea that a flower "agrees" with a beam of light to point towards it is of course a map-level abstraction, but no more so than saying the flower is "forced" to do it, and the latter needs room in the model for something doing the forcing.

Just like before, agents are made of agents as far down as discussion requires, and perfect coordination between agents looks and behaves like an agent. Promises can't always be kept by agents, though, and sometimes their failure leads to catastrophic changes. Even worse, two agents can give promises that are incompatible, or one agent make a deceptive promise that has a communicated intention and an incompatible, uncommunicated one. The task of a promise theorist is to prevent or deal with these kinds of failures and get agents to coordinate.

Some tools for doing that in big networks of promises include measuring how often promises are kept, finding which things have in-degree or out-degree zero (that is, they only take or give promises), appointing "queue dispatchers" that promise to process promises to and from other agents, and looking for goals that exist over many of an agent's communicated intentions. (There are also other things that just seem technical to me, like parametrized promise bundles where one promise is made to use information from some external agent to specify choices made in the rest of the promises.)
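Several of these tools reduce to simple queries on a directed graph. A minimal sketch (agents and promises invented for illustration, not CFEngine's actual model):

```python
from collections import defaultdict

# Toy promise network: each edge runs from promiser to promisee.
# All agents and promise bodies below are hypothetical.
promises = [
    ("scheduler", "worker_a", "assign jobs fairly"),
    ("scheduler", "worker_b", "assign jobs fairly"),
    ("worker_a", "scheduler", "report status"),
    ("monitor", "scheduler", "alert on failure"),
]

out_degree = defaultdict(int)
in_degree = defaultdict(int)
agents = set()
for promiser, promisee, _body in promises:
    agents.update((promiser, promisee))
    out_degree[promiser] += 1
    in_degree[promisee] += 1

# Out-degree zero: an agent that only receives promises ("only takes").
# In-degree zero: an agent that only gives them.
only_takes = sorted(a for a in agents if out_degree[a] == 0)
only_gives = sorted(a for a in agents if in_degree[a] == 0)
print(only_takes)  # ['worker_b']
print(only_gives)  # ['monitor']
```

Tracking how often each edge's promise is actually kept would just mean attaching a kept/attempted counter to each edge, which is the "measuring" tool in graph terms.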

On my first read I found this book pretty imposing, but between that and my second I tried reading more mathematical texts and found a lot of this to be very approachable or even seemingly obvious. There's a point where the structure provided to bolster this relatively simple idea seems overdone or inelegant, and I wish there had been more material on connections with other fields of mathematics, but I appreciate how seriously it treats its subject and wish more nonfiction would follow its lead.

Conclusions

Promise theorists and ontocartographers are both dealing with coordination problems, but from different angles. Promise theory is designed for building up coordination, while ontocartography wants to understand and change coordination that already exists. In both cases there's a sense that every object or agent has its own subjectivity - both authors use the term "world" for this - that encompasses everything that interacts with it. Just like dark objects could be floating out there far away from us, promises are dark for any agent that's not in their scope. And where every machine's behavior matters because it could do something different, promises provide a way to quantify which decision a machine "selects" and why.

To finish, here are a few ideas I've had for less obvious extensions or combinations:

• There are only so many promises an average agent can make within a given timespan without decoordinating.
• It's possible to broadcast promises to agents you're not aware of at first, as a kind of scope development.
• Gravity can be expressed as a prior like trust for a group of agents to make promises to a given agent.
• Alternatively, the non-local, multiply-instantiated force of obligation could be introduced for bright objects that seem to twist the status quo at a fundamental level.
• Or thirdly, this could be described as a form of strong equilibrium for promises made with a given agent.
• Promise dispatchers and machinic mediators do the same work, bridging gaps between agents.
• Promisers have a failure mode in their communication not reaching anyone; it's possible to restrict agents by promising to prevent their communication of promises.
• Agents that exclusively give or take promises are signs of unfinished (or intentionally simplified) models.
• Every agent can be expressed as an opportunist (out-degree 0) or an altruist (in-degree 0).

Discuss

### Implementing the Second Virtue: Feel the Emotions Which Fit the Facts

9 мая, 2021 - 08:47
Published on May 9, 2021 5:47 AM GMT

Yudkowsky's Second Virtue of Rationality is "Relinquishment". You'd expect this to be about relinquishing mistaken beliefs. But most of the description of the virtue actually focuses on feelings.

Relinquish the emotion which rests upon a mistaken belief, and seek to feel fully that emotion which fits the facts. If the iron approaches your face, and you believe it is hot, and it is cool, the Way opposes your fear. If the iron approaches your face, and you believe it is cool, and it is hot, the Way opposes your calm. Evaluate your beliefs first and then arrive at your emotions. Let yourself say: “If the iron is hot, I desire to believe it is hot, and if it is cool, I desire to believe it is cool.”

What is not written, but seems easily deduced, is: "If the iron that approaches my face is hot, I desire to feel fear. If the iron is cool, I desire to feel calm." But where are the limits of this desire? Is it only applicable to face-approaching iron? Of course not.

Most people claim that receiving a million dollars would dramatically improve their lives. They would feel very happy for a very long time after receiving such a gift. The things they currently possess are worth only a small fraction of such a massive sum, and growing their net worth a hundredfold seems life-altering.

I regularly ask these people how much money they'd want in exchange for one or more of their limbs. Would you trade both arms for a million dollars? Would you want to be quadriplegic and blind for five million? Nearly everyone answers no. A healthy body is so much more valuable than such a sum of money.

And then they go right back to saying that they would be so much happier if they received a million dollars.

Of course, such a position could be rationally defended. A million dollars in cash allows people to do a lot of things that can't be done with just a functioning leg. But I don't think the position above can be defended as rationally "feeling the emotions which fit the facts". Because that's not how our emotions work.

We suffer from hedonic adaptation. We are not made for static enjoyment. Think about driving a car, especially one with manual transmission. It's highly complicated and takes quite a lot of effort to learn. But after years of practice, many people drive miles and miles, executing all kinds of complicated techniques, without a second of conscious effort or even conscious awareness of their actions. They're thinking about dinner or that awkward conversation at work and suddenly arrive at their destination.

That is generally how we live our lives. We are barely consciously aware of the "default parameters" of our existence. We don't consider the miraculous nature of our functioning limbs, nor the softness of our beds. We don't consider the shelter offered by our roofs, walls and windows. We're always focused on the small amount of things that are changing and new. The new product we just ordered, a new task at our work, a new health problem, a new skill we've got to learn, a hot or cold iron approaching our face. That is where our awareness lies - and that is where our emotions flow from.

This isn't 'bad' or 'wrong'. It is what inspires us to continuously improve and explore, to learn new things, to send humans to the Moon and Mars, to build faster computers and better AI.

But it does cause our emotions to be highly influenced by relatively trivial events that are completely out of line with the big picture. Imagine if we could be less concerned by petty squabbles, and more stably content with things like "our generally healthy body" and "our shelter that is highly effective at keeping out the cold and the rain", and all of the possessions and accomplishments we've worked so hard to attain.

But is that rational? We want to feel happy, so we fight against our nature and deliberately focus our awareness on positive considerations. Instead of leaving out static things, now we leave out negative things. If we're trying to rationally consider the big picture and feel the emotions which fit it, we've also got to consider the "static negative things" which our awareness leaves out most of the time.

Everything which makes us us - our identity, our memories - is bound to a mortal and decaying body, and the same is true for our family, our friends and our partner. We're a deeply irrational species that generally isn't able to properly organize governments, fight pandemics, implement cryonics or prepare for the possibility of a technological singularity. Political polarization and social unrest are rising and rising, leading to riots, destruction, death, general distrust and a lack of cooperation. Running into the Great Filter - a permanent end to intelligent life in the known universe caused by AI development gone wrong, biological and/or nuclear warfare, rogue nanofoglets or some other unforeseen technological development - is a plausible event within this century and our lifespans.

On the other hand, smart people make plausible arguments that we are under the protection of Elua, the god of kindness and flowers and free love, or the Goddess of Everything Else:

She showed them transcendence of everything mortal, she showed them a galaxy lit up with consciousness. Genomes rewritten, the brain and the body set loose from Darwinian bonds and restrictions. Vast billions of beings, and every one different, ruled over by omnibenevolent angels. The people all crowded in closer to hear her, and all of them listened and all of them wondered.

"The Goddess of Cancer created you; once you were hers, but no longer. Throughout the long years I was picking away at her power. Through long generations of suffering I chiseled and chiseled. Now finally nothing is left of the nature with which she imbued you. She never again will hold sway over you or your loved ones. I am the Goddess of Everything Else and my powers are devious and subtle. I won you by pieces and hence you will all be my children. You are no longer driven to multiply conquer and kill by your nature. Go forth and do everything else, till the end of all ages.”

So the people left Earth, and they spread over stars without number. They followed the ways of the Goddess of Everything Else, and they lived in contentment. And she beckoned them onward, to things still more strange and enticing.

So, what is the big picture? What are the facts from which our emotions should follow? Is our mundane existence secretly magical and can or should awareness of that significantly improve our baseline happiness? Has Moloch already won and do we inhabit frail and mortal bodies in a cold and uncaring universe, filled with multipolar traps and suboptimal equilibria? Do or can we even know the big picture? Is it even relevant, or should our emotions only flow from the tiny sliver of reality that our instincts decide to make us consciously aware of?

Personally, I think the big picture is rather positive and that awareness of it does and should bring me some relief. But it's a complex issue and I'd love to hear your thoughts!

Discuss

### The Nuclear Energy Alignment Problem

9 мая, 2021 - 06:50
Published on May 9, 2021 3:50 AM GMT

Berlin, 1938.

"Have you ever thought about the Anthropic principle? Chances are about half the people who have ever lived will die before us and half after. The human population has been growing at an exponential rate. There is probably a disaster waiting just around the corner," said Schumann.

"I bet it's nuclear energy," said Heisenberg.

"There is no way a single machine could destroy the world," said Schumann.

Heisenberg wrote E = mc^2 on a sheet of paper.

m = E/c^2 = (2×10^32 joules) / (3×10^8 m/s)^2 ≈ 2×10^15 kilograms
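Heisenberg's arithmetic is easy to check (a quick verification of my own, outside the dialogue):

```python
# Mass equivalent of 2e32 joules via m = E / c^2.
E = 2e32      # joules
c = 3e8       # metres per second
m = E / c**2  # kilograms; the dialogue rounds this to 2e15
print(f"{m:.1e}")  # 2.2e+15
```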

"How much is that?" said Schumann.

"All the carbon in the terrestrial biosphere," said Heisenberg, "Or half the carbon in all of the Earth's coal."

"No nation on Earth has the industrial capacity to dump that much mass into a single machine. We are safe," said Schumann.

"Not if there is a chain reaction. Suppose a nitrogen atom releases two neutrons when it fissions. It could ignite all the nitrogen in the atmosphere," said Heisenberg.

"Is there more nitrogen in the atmosphere than carbon in the biosphere?" said Schumann.

"It doesn't matter. If we destroy the atmosphere then humanity is doomed whether or not the planet itself technically survives," said Heisenberg.

"We had better invent a theory of ethics to guide the use of nuclear energy. Otherwise someone might accidentally build a bomb," said Schumann.

"What does ethics have to do with anything? Keeping the energy contained is a purely technical problem," said Heisenberg.

"We need a theory of ethics so we can coordinate with the Allied powers. It does us no good to hold the moral high ground if the United States accidentally destroys the world," said Schumann.

"Please be serious. There is no way the United States could build a working nuclear machine. Their whole society is run by Jews," said Heisenberg.

"Britain or France might pull it off," said Schumann.

"Do you remember what happened the last time we tried to coordinate with Britain and France?" said Heisenberg.

"Oh right," said Schumann.

"Fortunately, nuclear energy is sixty years away and it always will be," said Heisenberg.

"Not anymore," said Schumann. He had a proposal for the Führer.

Discuss

### MIRI location optimization (and related topics) discussion

9 мая, 2021 - 02:12
Published on May 8, 2021 11:12 PM GMT

MIRI is moving (with high probability)!

We haven’t finalized a location yet, but there’s a good chance we’ll make our decision in the next six weeks. I want to solicit:

• Feedback on our current top location candidates.
• Ideas for other places that might fit our criteria.

I’m also interested in a more general location-optimizing discussion. What are your general thoughts on where you’d like to live, and have they changed any since the hub conversations Claire Wang began in September and November? If a new rationality community hub sprang up at any of these locations, would you be tempted to join? Is there a different place you’d prefer (either personally, or for the community)?

Everything from 'statements of personal preferences' to 'models of how the rationality community might make humanity's future much more awesome' is welcome.

What kind of place we're looking for

Our priority is to find a place where we think researchers will be able to think unusually clearly and well, in line with our December update. Recently, we’ve been looking for a campus or proto-campus (one or more buildings, with space and legal ability to build more) that's:

• far enough away from urban areas to be maximally calm, quiet, and close to nature; and
• near enough to urban areas to ensure there are people to see and things to do in driving distance, and to provide access to city conveniences (food delivery, ridesharing, lots of restaurants, etc.).

These proto-campus-type properties seem to be very rare and hard to find. E.g., we heard good things about Madison, WI and spent time looking for a property in the area, but ended up finding zero candidates currently on the market (that weren't falling apart, etc.).

If you want to convince MIRI to move to your favorite city, one good route would be to find a property like this for sale and either email it to Alex Vermeer or PM me (please don’t post specific property options you’re recommending publicly). The best places we’ve found have often been at the outskirts of pleasant, walkable 50k-100k population cities or college towns.

The specific factors we've been looking at fall under three rough, overlapping categories:

1. Is this a good place to think?

This is the biggest factor, and includes...

• Physical environment: How much do we expect this place to make it easy for researchers to go on peaceful walks to think, talk, etc.? The best options will tend to be secluded, close to nature, and safe-feeling. The worst options will tend to be busy and chaotic, or distracting/jarring for hard cognitive work.
• Social environment: What's the local vibe? If we absorb the local vibes, does this make us better or worse at things? Is there strong pressure in the direction of “it's silly/bad/crazy to try to save the world” or “elites aren't dropping the ball”? If the US, the world, or Twitter suddenly got a lot more socially chaotic, would the place we're in provide a feeling of safety, resilience, and nondistraction? (See also 2 and 3 below.)
• Usability and scalability: The property has good indoor work and hangout spaces, or spaces that could be easily converted into such in a reasonable timeframe: we want to be able to start doing research here in the near future, without spending 6+ months to get an initial setup in place. The property is also large enough—probably at least 20 acres, and ideally 50+ acres. It's easy to build on the land (and otherwise modify the property), or easy to expand over the years by buying properties within walking distance as they go on the market.

2. Is this otherwise a good place for MIRI staff to live?

If we go with the proto-campus plan, we’re likely to start with most staff living in the city proper (possibly with a second office there). More folks may move to the campus as we build it out or acquire more property.

We’re likely to be happier if we have friends and colleagues, community spaces, etc. on or near the campus and in the city, though we don't have a settled view on what size or kind of local rationalist community would be ideal.

Regardless of how many MIRI staff are living on campus (and although our biggest constraint is "can we find a proto-campus for sale here at all?"), features of the city and area will inevitably matter a lot.

• (Proto-)campus: Access to city conveniences (Uber, UberEats, etc.). The campus is a nice place to live, and close enough to other places staff can live that commutes don't have to be long.
• City: The city is nice to live in, walkable, fun, and safe. Housing is affordable. The area doesn't feel prone to (present or future) political violence, street fighting, riots, instability, etc. The culture is relaxed and LGBTQ-friendly.
• Area: Taxes aren't high, and aren’t likely to sharply rise in the future. It seems highly unlikely that the government (at any level) will be very anti-technologist six months or ten years from now. The weather isn’t too extreme. Etc.

3. How good is this place socially (and how good could it become)?

This overlaps with 2, since the things that make a city nice for MIRI staff also affect whether other people will like it.

• We want to mesh well with people who already live here, so we can make new social connections, benefit from the intellectual exchange with people in the area, and recruit people to do research.
• We want our partners, friends, and colleagues to like it here and have strong job options in the area.
• We want it to be easy and attractive for friends, colleagues/collaborators on alignment research, and potential hires to visit, and for us to visit them. E.g., places that are close to the Bay and have a major airport will score well here.

Our current top choices

Bellingham and Peekskill

We’ve spent hundreds of hours searching through long lists of candidates, and have pared those down to 30 relatively promising parts of the US, including two that look quite good as campuses and three other areas we especially like (but haven’t found a property in).

The two campus options we like (that do the best job of satisfying the criteria I listed above) are near (or on the outskirts of) two cities:

Bellingham, Washington: A 90,000-population town near the Canadian border. Located in between Seattle (80–120 minutes south, depending on traffic) and Vancouver, Canada (65–165 minutes north, depending on traffic and border crossing delays), with Puget Sound on one side and forests and a lake and distant mountains on the other. It seems enjoyably walkable near downtown (adjacent neighborhoods: 1, 2, 3), it feels vibrant and youthful, it's pretty good on crime, and considering all those positives and being on the coast, housing is relatively available and affordable.

It's a bit cloudier than Seattle, which is among the cloudiest parts of the country (something like 35% of the possible sun hours are sunny, compared to 38% for Seattle and 52% for Ann Arbor; this % depends heavily on your source though). It’s especially cloudy and rainy during the winter, though usually the rain and clouds come and go from day to day, and in July and August it’s beautiful almost every day. It’s hardly ever cold and hardly ever hot.

Also: unlike a lot of places we’ve considered, very little distraction/misery from mosquitoes!

CDC image.

The proto-campus we’re looking at is located in a peaceful, wet, quiet forest, full of walking trails. Our current gestalt impression of the city itself is that it’s full of hipsters and hobbyists, plus people who moved to the town for the beautiful environs and small-town aesthetic.

Folks seem to come here for good food, breweries, nature, and maybe plays to go to, and to attend Bellingham's (unimpressive) medium-sized public university; or they come to escape things they dislike about the rest of the west coast. Hobbies are a very big thing here: sports, hiking, rock climbing, going to the mountains, kayaking on the sound, etc. There aren’t many jobs here except for the service industry—it’s a town of students, hip retirees, and people working remotely in Seattle or California.

Bellingham International Airport does not have international flights (?!) and is quite small, but does have direct flights to Oakland several times a week.

Peekskill, New York: A small 24,000-person town on the Hudson River, 60–120 minutes by car or 65–75 minutes by train from New York City. (Video tour.) Blake Borgeson shares his impressions of this option:

NYC is our big nearby city, and it seems like roughly the best big city. My picture here at the moment, which I mostly got from Zvi, is that the big way NYC is different from other big cities is it’s more like 10 big cities with 10 different cultures, all of them coexisting in the same place and something between ignoring and welcoming one another. NYC is a place where doing your own thing and bringing your own culture is totally welcome.

Zvi seems enthusiastic about NYC culture’s effects on MIRI. (IIRC, he thinks that New Hampshire and Austin are probably the two best US locations for MIRI in terms of epistemic effects on us, and I currently mostly buy this claim.)

Peekskill is highly diverse and functional-seeming and real-seeming in a way that seems connected to it being near NYC. The folks I’ve talked to who live near Peekskill feel like Vermonters, progressive rural-ish folks, and I like them so far. They treat the land the way I want it treated—like, the nature here is beautiful, we can all agree, and you want to showcase it and preserve it rather than golf course it or English manor it.

Besides diversity and coexisting, the culture of this area feels simpler, more like some decades ago. Life is more straightforward. There’s no angst about finding your calling and finding your hobbies. You grow up, you find a job, you find love and have a family, you take care of your family, you make money to provide for them, and if you make enough money for it, congratulations, you can relax in your country estate with your family and enjoy that.

Hobby-type things seem not quite to fit in around here. So compared to Bellingham, Peekskill has fewer restaurants that I love (though it has some), and fewer places to sample craft beer, and fewer ways for adults to do things like sports (35 minutes driving from Peekskill to the nearest adult soccer league I could find online, for instance).

Some MIRI staff would live in Peekskill, while others live in more rural houses outside of town. The campus we’re looking at adjoins big forested hills with trails winding throughout, little used and with space to wander in nature for hours. A couple of miles on these trails takes you to the Appalachian Trail, if you want to go on longer hikes.

The area has real seasons, with deciduous trees that lose their leaves in the fall. Winter is much colder than in Bellingham, and summer is much hotter. (Bellingham’s weather is something like Berkeley’s with everything shifted 5-10°F colder.)

Comparisons by weatherspark.com.

A big draw of the Peekskill area for us is that we’re currently a lot more interested in being near New York City than being near Seattle/Vancouver. We could imagine starting a chain of rationalist communities along the Hudson Valley train line, allowing the full range from “very rural” to “very urban” life for people with different tastes. The train itself comes hourly, has plenty of seating (during non-COVID times, it seems you might have to stand for half the trip back from Grand Central at certain times of day), and is nice to ride (more like Caltrain than BART).

The campus options we’re looking at in the Peekskill area are also faster and easier to set up, requiring less construction. And there’s a small airport 35–60 minutes north, typically used to fly to Philadelphia International and then connect to another flight.

On the other hand, Bellingham is closer to our connections in the Bay Area, and is a much more exciting town than Peekskill in its own right: more restaurants and grocery stores, more climbing gyms, etc. NY state taxes are also higher.

Both Bellingham and Peekskill are quiet, safe, and relatively affordable (though not stunningly affordable—and as with most places, prices are on the rise):

Data extracted from Zillow. This is not a "sale price," but a Zillow proprietary "home value" number.

Both locations are cheaper than the Bay, have fewer local amenities, have worse job markets, and have (what I'd expect most people to consider) worse weather than the Bay. Homes in Bellingham are roughly 0.4x the $/sqft in Berkeley.

Some of the biggest questions we’d love answered about these areas are:

• Lyme disease: A potentially serious disadvantage of the Peekskill area, especially for outdoorsy or rural-life-enjoying rationalists, is that it’s tick country. Information like the following could do a lot to influence how good the place looks to us:
  • How common is Lyme disease around Peekskill? (According to this article, the Hudson Valley has both the highest number of ticks in New York state and the highest number of Lyme-carrying ticks per tick. Estimating local prevalence of the disease in humans is very important but seems tricky, since many cases go undiagnosed.)
  • Lyme disease rates are rising fast in the US—up 10x in the last 30 years. Given that, how much should we expect incidence to increase in areas that are already hard-hit?
  • Are there relatively easy, effective, and politically feasible ways to reduce the number of deer ticks in the area? (Examples.)
  • How harmful, and how common, are long-term symptoms from Lyme disease?
  • How easy is it to reduce the risk of catching Lyme disease, or the risk of long-term symptoms? How effective is it to avoid walks, hikes, picnics, shaded woodsy areas, etc.? How effective is it to check for ticks with such-and-such frequency, and such-and-such thoroughness or laziness? (The CDC claims, somewhat vaguely: “In most cases, the tick must be attached for 36 to 48 hours or more before the Lyme disease bacterium can be transmitted.”)
  • How costly is it in microLymes to have long hair, a beard, or lots of body hair?
  • The CDC claims that Lyme is usually transmitted by nymphs, because they’re tiny and hard to spot during a tick check. How big is this effect, and what are its implications?
  • Is there a weird trick that can basically eliminate the risk of tick bites while walking around? E.g., Alan Eaton proposes “When outdoors for any length of time, tuck your shirt into your pants and your pants into your socks. Wear tall rubber boots that are too smooth for most ticks to grab onto. Apply insect repellent and/or wear insect [repellent] clothing.” How effective are these measures?
  • How effective is it to monitor for symptoms of Lyme, and to administer antibiotics early in response to certain triggers?
  • How soon will we have Lyme vaccines of such-and-such effectiveness? Is there a faster option, like bringing back LYMErix or just using the veterinary Lyme vaccine? How likely is it that the vaccines (or some other intervention) will drastically reduce the severity or frequency of long-term Lyme symptoms in people already suffering from them?
• Uber access in non-plague years:
  • Regarding Bellingham, Blake reports: “At the moment there's not much Uber service—maybe 1/4 of the time during normal hours I can find an Uber there. I am pretty confident (85%?) that this is a COVID effect, with 75% of the students not being around, and that at least during school sessions there will be Ubers, but I still need to check this. I'm less confident (60%?) that Ubers will be readily available after COVID when school isn't in session.”
  • Regarding Peekskill: “Right now there often aren’t Ubers around. A Peekskill local said there were lots of Ubers before COVID, especially on Fri/Sat evenings when things are busier.”
  • It would be very useful to know exactly how easy it is to get ride-shares in these areas during non-COVID times.
• How many rationalists might want to live here?

Other candidates

We also feel some pull to strongly consider the following places more, despite so far not finding properties there that fit our campus vision very well:

• the parts of New Hampshire nearest Boston;
• the Austin, Texas area; and
• Reno, Nevada.

I’d currently assign about 50% probability to “we move to the Peekskill or Bellingham area, or somewhere similar, on the strength of properties available there,” and 50% to “we largely give up on the proto-campus idea and move to someplace more like an office building or set of buildings in a city like Reno, not nature-y or lush but still a pleasant, relaxed place for going on walks to think.” We’re also still open to suggestions of cities and properties, at least for the next two weeks (and possibly longer).

Here are a bunch of other places we’ve seriously considered and in some cases visited:

• West Coast: Bend, OR; Eugene, OR; Issaquah, WA; North Bend, WA
• Rocky Mountain: Boulder, CO; Fort Collins, CO; Bozeman, MT; Missoula, MT
• Southwest:
• Great Plains:
• Midwest: Urbana-Champaign, IL; Bloomington, IN; West Lafayette, IN; Ann Arbor, MI; Madison, WI
• South: Asheville, NC; Greensboro, NC; Blacksburg, VA
• Mid-Atlantic: Rochester, NY; the Philadelphia, PA area (e.g., Lambertville, NJ)
• New England: the Amherst/Springfield, MA area; the outskirts of the Boston metropolitan area (e.g., Norwood, MA); Portland, ME; Keene, NH; Portsmouth, NH; Burlington, VT

Berkeley, CA and our top 30 candidates, with the top 5 marked ❤️: Bellingham, Reno, Austin, Peekskill, and southern NH.

Although we’ve been focusing heavily on the US in our search, we’re also still interested in country suggestions, if you think Canada or some other part of the world scores especially well on metrics like “unlikely to see a future tech backlash or eat-the-rich revolt that makes it decreasingly attractive to live in.” From our perspective, a higher-but-stable tax rate is better than a lower present rate with a lot of future instability and unpredictability.

As I said at the start of the post, I want to hear people's thoughts about the locations I mentioned (especially our five favorites), and arguments for other locations. Either 'is this a good place for MIRI?' or 'is this a good place for a rationalist hub?' / 'are there worlds where I'd be happy moving here if a hub does spring up?' I'm also looking for recommendations of specific properties to buy.

We've found that criteria like the ones I listed narrow the candidate list a lot, so we may end up giving up on the campus dream. (E.g., if you search for parts of the US that are quiet and close to nature, have Uber/Lyft access, are walkable, aren't extremely conservative, aren't too low-population, are permissive about zoning, have low crime rates, don't have super-high elevation, don't have super-high heat, and have a campus-like property on the market, you end up with a very short list.)

Thanks to Blake Borgeson and Alex Vermeer for reviewing this post, and for doing most of the legwork to help MIRI think through move tradeoffs and logistical details. Any remaining errors are probably theirs, since they did most of the hard cognitive work and I just wrote a post about it.

Discuss

### Android Video with External Microphone

9 мая, 2021 - 01:30
Published on May 8, 2021 10:30 PM GMT

My smartphone is my best camera, but its microphone is not nearly as good. When recording video where I care about the audio (ex), I've normally recorded audio and video separately and then combined them. This works, but is a hassle. Can we record them together?

An Android phone (likely an iPhone too, but I don't know those) can record audio from any standard ("class compliant") USB audio interface. This may bring up images of spending hundreds of dollars on professional audio equipment, but if you only need a single channel and don't need a preamp (or you already have mixing equipment) you can use a "USB soundcard":

This is a Sabrent USB-C adapter I got for $7.30 shipped.

While it's supposed to be possible to use an external microphone with the built-in camera, I couldn't get it to work. Instead, I used Open Camera:

This is "Settings > Video settings > Audio source > External mic (if present)", which worked without issue. Here's an example:

Videography courtesy of Lily

I previously made a lot of recordings with the microphone on the camera just because it was so much more convenient than combining audio and video from two sources, but I think I'm done with that.

Discuss

### [Event] Weekly Alignment Research Coffee Time (05/10)

9 мая, 2021 - 01:03
Published on May 8, 2021 10:03 PM GMT

Just like every Monday now, researchers in AI Alignment are invited for a coffee time, to talk about their research and what they're into.

Small change for this second edition: the link to the walled garden now only works for AF members. Anyone who wants to come but isn't an AF member needs to go through me. I'll broadly apply the following criteria for admission:

• If you work in an AI Alignment lab or are funded for independent research: automatic admission
• If you're recommended by an AF member: automatic admission
• Otherwise: at my discretion

I prefer not to admit people who might be interesting but who I'm not confident won't derail the conversation, because this is supposed to be the place where AI Alignment researchers can talk about their current research without having to explain everything.

See you then!

Discuss

### Munich SSC/ACX (+LW) (online) meetup May 2021

8 мая, 2021 - 23:04
Published on May 8, 2021 8:04 PM GMT

We started as an in-person Munich SSC meetup during the «meetups everywhere» drive. Unfortunately, we had to become online-only for the time being. Fortunately, that means coming is easier, and everyone is welcome!

We are a pretty unfocused group, so feel free to drop by if you want to discuss something that you hope might be interesting to people reading SSC/ACX or LW — or hear something like that discussed. Agreeing with LW conclusions or Scott Alexander positions on whatever is neither expected nor discouraged (either choice can lead to a detailed argument…).

End time is just an indication — the meetup lasts as long as we want to talk about something, and it is also perfectly normal to leave whenever you prefer (or have) to.

Discuss

### a visual explanation of Bayesian updating

8 мая, 2021 - 22:45
Published on May 8, 2021 7:45 PM GMT

As a teaser here is the visual version of Bayesian updating:

But in order to understand that figure we need to go through the prior and likelihood!

You find me standing on a basketball court, ready to shoot some hoops. What do you believe about my performance before I take a shot? There is no good null hypothesis here unless you happen to know a lot about average human basketball performance, and even then, why would you care whether I am significantly different from the average? You can fall back on the "new statistics", which is almost as good as the Bayesian approach, but it does not answer what you should believe before I take a shot.

p(θ)∼Beta(1,1)

Where θ is my probability of scoring, the distribution looks like this:

Completely Uniform, a great prior when you are totally oblivious.

I take a shot and miss (z=0). The likelihood of a miss looks like this:

(if you are extra curious, you can brush up on the math behind all the binary distributions here)

Notice that:

• p(z=0∣θ=0)=1, the likelihood that I always miss is 1
• p(z=0∣θ=0.5)=0.5, the likelihood that I miss half the time is 0.5
• p(z=0∣θ=1)=0, the likelihood that I always hit is 0, which is obvious as I can't score all the time if I just missed.

Notice that these are likelihoods, not probabilities: they express how likely the data are under different values of θ. So the data z=0 is twice as likely to have been generated by θ=0 as by θ=0.5:

p(z=0∣θ=0)/p(z=0∣θ=0.5) = 1/0.5 = 2
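This is easy to check in a few lines of Python (a minimal sketch; the `likelihood` helper is mine, not from the post):

```python
# Bernoulli likelihood of observation z given scoring probability theta
def likelihood(z, theta):
    return theta if z == 1 else 1 - theta

# the three values read off the triangle for a miss (z = 0)
assert likelihood(0, 0.0) == 1.0   # "I always miss" explains a miss perfectly
assert likelihood(0, 0.5) == 0.5   # "I miss half the time"
assert likelihood(0, 1.0) == 0.0   # "I always hit" cannot produce a miss

# a miss is twice as likely under theta = 0 as under theta = 0.5
print(likelihood(0, 0.0) / likelihood(0, 0.5))  # → 2.0
```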

Bayesian Updating Math

Here is Bayes theorem for the Bernoulli distribution with a Beta prior, where the parameter z is 1 when I score and 0 otherwise:

p(θ|z) = p(z∣θ)p(θ) / p(z)

For technical reasons p(z), the probability of the data, is difficult to calculate. It is, however, 'just a normalization constant': it does not depend on θ (my scoring probability), so we can simply drop it and get an unnormalized posterior:

p(θ|z)∝p(z∣θ)p(θ)

An unnormalized posterior is simply a density function that does not integrate to 1, which means that when we plot it, it looks 'correct' except that the numbers on the y-axis are wrong.

Visual Bayesian Updating

So now we have a 'square' prior p(θ)∼Beta(1,1) and a triangle likelihood p(z=0∣θ). If we multiply them together we get the unnormalized posterior:

p(θ|z)∝p(z∣θ)p(θ)

This can intuitively be thought of as: the square makes everything equally likely, so the likelihood dominates the posterior. Or in dodgy math:

posterior∝square×triangle∝triangle

Here is the Figure:

Try putting your finger on the figure and checking that at θ=0.5 the square is 1 and the triangle is 0.5, so the unnormalized posterior is 1×0.5=0.5.

I shoot again and score!

Now we use the previous posterior as the new prior, but because I scored we get an 'opposite triangle', the likelihood p(z=1∣θ).

Again we multiply the prior triangle by the likelihood triangle and get a blob centered on 0.5 as the posterior:

Notice how the posterior is peaked at θ=0.5. This is because the two triangles at the center have an unnormalized posterior density of 0.5×0.5=0.25, whereas at the edges, such as θ=0.9, they have 0.9×0.1=0.09.

I shoot again and score!

So now the previous blob posterior is our new prior, which we multiply by the 'I scored' triangle, resulting in a blob whose mode is above 0.5, which makes sense as I made 2 of my 3 shots:

While this may seem like a cute toy example, it's a totally valid way of computing a Bayesian posterior, and it is how the most popular Bayesian books (Gelman[1], Kruschke[2] and McElreath[3]) introduce the concept!
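The whole multiply-and-renormalize procedure can be sketched numerically with a grid approximation (a hypothetical sketch in plain Python, with the plotting omitted):

```python
# grid of candidate scoring probabilities theta
N = 101
thetas = [i / (N - 1) for i in range(N)]

posterior = [1.0] * N            # flat Beta(1,1) prior: the "square"
for z in [0, 1, 1]:              # the three shots: miss, score, score
    # multiply pointwise by the Bernoulli likelihood: the "triangle"
    posterior = [p * (t if z == 1 else 1 - t)
                 for p, t in zip(posterior, thetas)]

# normalize so the grid sums to 1 (fixes the y-axis, changes no shapes)
total = sum(posterior)
posterior = [p / total for p in posterior]

# the mode sits at the grid point closest to 2/3, matching 2 hits in 3 shots
mode = thetas[posterior.index(max(posterior))]
print(mode)  # → 0.67
```

Each pass through the loop is exactly one "prior × triangle" step from the figures above.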

Bayesian Updating using Conjugation

In the case of Bernoulli events we can actually solve for the posterior analytically, because the Beta distribution is conjugate to the Bernoulli. Conjugacy is simply fancy statistics speak for the posterior having a simple mathematical form, in this case also a Beta distribution, so you can update the Beta distribution with this simple rule:

Beta(α+z,β+1−z)

So we started with a prior with α=β=1:

Beta(1,1)

Then we got a miss, z=0

Beta(1,2)

Then we got a hit, z=1

Beta(2,2)

Then we got another hit, z=1

Beta(3,2)

We can plot the Beta(3,2) posterior

Notice how this posterior has the exact same shape as the one we got via updating; the only difference is the numbers on the y-axis.
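A quick numerical check that the conjugate shortcut agrees with the multiply-and-normalize route (a sketch; the Beta density is written out from its definition rather than taken from a stats library):

```python
from math import gamma

# Beta(a, b) density, from its definition
def beta_pdf(theta, a, b):
    return gamma(a + b) / (gamma(a) * gamma(b)) \
        * theta ** (a - 1) * (1 - theta) ** (b - 1)

# sequential conjugate updates Beta(a + z, b + 1 - z), one per shot
a, b = 1, 1                  # the flat Beta(1,1) prior
for z in [0, 1, 1]:          # miss, hit, hit
    a, b = a + z, b + 1 - z
print(a, b)                  # → 3 2

# the unnormalized posterior is proportional to theta^2 * (1 - theta);
# Beta(3, 2) is the same shape scaled by 1/B(3, 2) = 12
theta = 0.5
assert abs(beta_pdf(theta, a, b) - 12 * theta ** 2 * (1 - theta)) < 1e-12
```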

(Hi, if you made it this far, please comment. If something was not well explained, please say so; I care more about my statistics communication skills than my ego, so negative feedback is very welcome.)

1. Gelman, Hill and Vehtari, “Regression and Other Stories” ↩︎

2. John Kruschke, “Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan”, 2nd Edition ↩︎

3. Richard McElreath, “Statistical Rethinking” ↩︎

Discuss

### Pre-Training + Fine-Tuning Favors Deception

8 мая, 2021 - 21:36
Published on May 8, 2021 6:36 PM GMT

Currently, to obtain models useful for some task X, models are pre-trained on some task Y, then fine-tuned on task X. For example, to obtain a model that can summarize articles, a large language model is first pre-trained on predicting Common Crawl, then fine-tuned on article summarization. Given the empirical success of this paradigm and the difficulty of obtaining labeled data, I loosely expect this trend to continue.

I will argue that compared to the paradigm of training a model on X directly, training on Y then fine-tuning on X increases the chance of deceptive alignment. More specifically, I will argue that fine-tuning a deceptive model will produce a deceptive model and fine-tuning a non-deceptive model is more likely to produce a deceptive model than training from scratch.

From Does SGD Produce Deceptive Alignment?:

More specifically, we can single out three types of models that optimize for the base objective:

• internally aligned models optimize for the base objective because it is equal to the model's terminal objective
• corrigibly aligned models optimize for the base objective because their terminal objective is a robust pointer to a model of the base objective. Note that these are called corrigible because they are amenable to corrections of the model of the base objective (though not necessarily corrections to their pointer).
• deceptively aligned models optimize for the base objective for instrumental reasons

Assume that pre-training has produced a model that optimizes for the pre-training loss and is one of the above types. The arguments are weaker if pre-training is imperfect. If the model is not one of the above types, then these arguments do not apply.

We can analyze the three cases separately:

• If pre-training has produced an internally aligned model, then we are beginning fine-tuning with a mesa-optimizer that has a misaligned proxy objective. The probability of fine-tuning producing deception is controlled by how far the pre-training loss is from the fine-tuning loss and how far the current capabilities are from deception.
• If pre-training has produced a corrigibly aligned model, then we begin fine-tuning with a mesa-optimizer that has a misaligned pointer. The probability of fine-tuning producing deception is controlled by how easy it is to shift the pointer and how far the current capabilities are from deception.
• If pre-training has produced a deceptively aligned model, then fine-tuning will begin with a deceptively aligned model. The probability of deception is controlled by how easy it is for the model to start instrumentally optimizing for the fine-tuning objective.

In the first case, changing an internal objective from the pre-training loss to the fine-tuning loss seems about as difficult as learning the fine-tuning loss from scratch. Since the model starts highly capable, deception has a strict advantage here. For example, suppose a model was pre-trained on prediction and fine-tuned on approval. Since the pre-training data is vast, the model will already understand human approval. Performing well on the fine-tuning loss is thus likely to involve routing capabilities through the understanding the model already has, instead of reincorporating human approval into the model’s internal objective.

In the second case, in order for fine-tuning to produce a corrigibly aligned model, the pointer must be changed from the pre-training loss to the fine-tuning loss. In order for fine-tuning to produce a deceptively aligned model, the model must understand the objective it’s being trained on, which also requires a pointer.

During normal training, the amount of optimization power the model can apply to a task slowly increases, while the amount of optimization power SGD can apply remains roughly constant. Since we are starting pre-training with an already capable model, this gap starts out wider than if we were training from scratch. Since deceptive models can build pointers at runtime, they will thus be able to apply more optimization power to pointer construction than SGD. As a result, the model will become deceptive faster than it can become corrigible. However, the model also begins with a detailed world model, enabling SGD to start building a pointer to the fine-tuning loss earlier than if we were training from scratch. Since deception cannot happen until the model has a detailed world model, this consideration is not more compelling when fine-tuning versus training from scratch.

In the third case, in order for fine-tuning to produce an internally or corrigibly aligned model, fine-tuning must align the model faster than the model can figure out the fine-tuning objective. Since the model was deceptive during pre-training, it already understands most of the training setup. In particular, it probably understood that it was being pre-trained and predicted that it would subsequently get fine-tuned, thus making fine-tuning overwhelmingly likely to produce a deceptive model. There are considerations about the type of deceptive alignment one gets during pre-training that I have ignored. See Mesa-Search vs Mesa-Control for further discussion.

The above arguments assume that pre-training + fine-tuning and training on the fine-tuning task directly produce models that are equally capable. This assumption is likely false. In particular, one probably will not have enough data to achieve high capabilities at the desired task. If the desired task is something like imitative amplification, suboptimal capabilities might produce an imperfect approximation of HCH, which might be catastrophic even if HCH is benign. There are other reasons why pre-training is beneficial for alignment which I will not discuss.

Overall, holding constant the capabilities of the resulting model, pre-training + fine-tuning increases the probability of deceptive alignment. It is still possible that pre-training is net-beneficial for alignment. Exploring ways of doing pre-training that dodge the arguments for deceptive alignment is a potentially fruitful avenue of research.

Discuss

### Migraine hallucinations, phenomenology, and cognition

8 мая, 2021 - 18:56
Published on May 8, 2021 3:56 PM GMT

I have several times in my life experienced migraine hallucinations. I call them that because they look exactly like what other people report under that name.

I'll come back to those.

If I look at someone, and hold up my hand so as to block my view of their head, I do not experience looking at a headless person. I experience looking at a normal person, whose head I cannot see, because there is something else in the way.

Why is this? One can instantly talk about Bayesian estimation, prior experience, training of neural nets, constant conjunction, and so on. However, a real explanation must also account for situations in which this filling-in does not occur. One ordinary example is the pictures here. I see these as headless men, not ordinary men whose heads I cannot see.

Migraine hallucinations provide a more interesting example. If you've ever had one, you might already know what I'm going to say, but I do not know if this experience is the same for everyone.

If I superimpose the hallucination on someone's head, they seem to have no head. I don't mean that I cannot see their head, but that I seem to be looking at a headless person. If I superimpose it on a part of their head, it is as if that part does not exist. Whatever the blind spot covers, my brain does not fill it in. Whatever my hand covers, my brain does fill in, not at the level of the image (I don't confabulate an image of their face), but at some higher level. I know in both cases that they have a head. But at some level below knowing, the experience in one case is that they have no head, and in the other, that they do. My knowledge that they have a head does nothing to alter the sensation that they do not.

It is quite disconcerting to look at myself in a mirror and see half my head missing.

Those who have never had such hallucinations might try experimenting with their ordinary blind spots. I am not sure it will be the same. The brain has had more practice filling those in, and does not have to contend with the jaggies.

From this I cannot draw out much in the way of conclusions about vision and the brain, but it provides an interesting experience of the separation between two levels of abstraction. When we look at the world and see comprehensible objects in it, our brain did that before it ever came into our subjective experience. When the mechanism develops a fault, it presents conclusions that we know to be false, yet still experience.

This presumably applies to all our senses, including that of introspection.

Discuss

### Interview with Christine M. Korsgaard: Animal Ethics, Kantianism, Utilitarianism

8 мая, 2021 - 14:44
Published on May 8, 2021 11:44 AM GMT

Christine M. Korsgaard was kind enough to answer a few questions of mine. Here's an excerpt:

ERICH: Many animal welfare advocates seem to be utilitarians, possibly due to the influence of Peter Singer. But you, of course, are not a utilitarian. Why is it not a convincing moral philosophy, in your view?

CHRISTINE: Because I believe that everything that is good must be good-for someone, some creature, I don’t believe it makes sense to aggregate goods across the boundaries between creatures. Of course, if you say “I can do something that’s good for Jack, or I can do something that’s good for Jack and also good for Jill,” everyone thinks that the second option is better, and that makes it look as if aggregation makes sense – the more good, the better. The problem only shows up when you have to do some subtracting in order to maximize the total. If Jack would get more pleasure from owning Jill’s convertible than Jill does, the utilitarian thinks you should take the car away from Jill and give it to Jack. I don’t think that makes things better for everyone. I think it makes it better for Jack and worse for Jill, and that’s all. It doesn’t make it better on the whole.

Of course, behind this there is a deeper problem. Utilitarians think that the value of people and animals derives from the value of the states they are capable of – pleasure and pain, satisfaction and frustration. In fact, in a way it is worse: In utilitarianism, people and animals don’t really matter at all; they are just the place where the valuable things happen. That’s why the boundaries between them do not matter. Kantians think that the value of the states derives from the value of the people and animals. In a Kantian theory, your pleasures and pains matter because you matter, you are an “end in yourself” and your pains and pleasures matter to you.

Discuss

### Crowdfunding Vaccine Development

8 мая, 2021 - 07:10
Published on May 8, 2021 4:10 AM GMT

Imagine that at the beginning of the pandemic, March 2020, you were given the opportunity to buy a place in the queue to get the vaccine. Imagine knowing nothing more than you actually did in March 2020. You don’t know how long it will take to develop a vaccine, or which companies will do it, or exactly how effective it will be, or what if any side effects it will have. You don’t know what kind of pandemic rules will be in place where or for how long. How much would you pay for that place in the vaccine queue? My personal answer, given that I had not much income at the time, is $100, but figure out for yourself what your answer is.

Got a number in mind? Yes? Good. Now let’s compare it to what is needed. Operation Warp Speed spent $12.4 billion. A dose of Pfizer/BioNTech costs $19.50, and let’s assume conservatively that 300 million doses will be given in the US. That’s $18.25 billion getting spent on vaccines for 150 million people, or $121.67 per person.

Was your number above that? Do you think the average number of those 150 million people was above that? I think so. My number specific to March 2020 was only a little below that, and at other times in my life would have been much higher. I’m sure a few particularly well off people would have paid $10k or $100k to be at the very front of the queue. If I’m right that the average amount those 150 million people would pay for a place in a vaccine queue is above $121.67, then crowdfunding vaccine development, with the promise of distributing vaccines in descending order of crowdfunding contribution amount, would have provided at least as good an incentive as the world we actually have.
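
The arithmetic above is easy to check. A quick sketch using the post's numbers, with the post's implicit assumption of two doses per person:

```python
# Sanity-check of the per-person vaccine cost arithmetic.
warp_speed = 12.4e9        # Operation Warp Speed spending, USD
dose_cost = 19.50          # cost per Pfizer/BioNTech dose, USD
doses = 300e6              # doses assumed given in the US
people = doses / 2         # two doses per person -> 150 million people

total = warp_speed + dose_cost * doses
per_person = total / people
print(f"${total / 1e9:.2f}B total, ${per_person:.2f} per person")
# $18.25B total, $121.67 per person
```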

Why does this matter? If you think the current system will continue to work fine in the future, it doesn’t. But if, like Zvi, you think the Biden administration has just actively destroyed the ability of the current system to develop vaccines in response to future pandemics, then you should care about alternatives. If, like me, you think it was never a good idea for the government to tell N - 1 companies to wait 20 years before they can manufacture life-saving vaccines (this is what a patent is), then you should care about alternatives.

Discuss

### [Writing Exercise] A Guide

8 мая, 2021 - 03:22
Published on May 8, 2021 12:22 AM GMT

There are three components to blogging:

1. Coming up with ideas.
2. Organizing your ideas into something coherent.
3. Writing it in English.

I know lots of people with good ideas and who are proficient at English. Sometimes I encourage them to blog. They write an article and it is awful. The ideas are fine. The English is fine. The structure is an incoherent ramble.

Shaping your ideas into something coherent is a learnable, trainable skill. Mostly it is about selecting the right format. If you're writing an essay then write an essay. Don't try to format it into Buzzfeed clickbait. If your ideas want to be a political diatribe then let them be a political diatribe. Don't hide them in an academic paper. The more writing formats you understand the more prepared you are to pick the right one.

Today's format: A Guide

| post | x | y |
|---|---|---|
| THE BEST CHOCOLATE CHIP COOKIE RECIPE EVER | basic cooking skills: how to use an oven, how to crack an egg, US units of measurement, etc. | how to bake optimal chocolate chip cookies |
| My introductory guide to Vim | basic computer literacy | Vim |
| Hy's tutorial | basic background in programming | Hy |

The Promise

It is best if the reader can determine x and y from context. The chocolate cookie recipe doesn't tell the reader what x is. It is obvious from a glance at the graphic design. Establishing reader expectations from context is ideal but not always possible. It is acceptable to just tell the reader directly "this guide is for people who know x but do not know y".

This chapter provides a quick introduction to Hy. It assumes a basic background in programming, but no specific prior knowledge of Python or Lisp.

―opening line of the Hy tutorial

The Beginning

Writing should always get straight to the point. The point of a guide is to teach the reader how to do something. Spend as little time as you can get away with explaining what the thing is and why the reader should learn it. THE BEST CHOCOLATE CHIP COOKIE RECIPE EVER doesn't tell the reader what a chocolate chip cookie is or why you should make one. The reader already knows. The title and introduction tell the reader why to follow these particular instructions.

This is the best chocolate chip cookie recipe ever! No funny ingredients, no chilling time, etc. Just a simple, straightforward, amazingly delicious, doughy yet still fully cooked, chocolate chip cookie that turns out perfectly every single time!

Everyone needs a classic chocolate chip cookie recipe in their repertoire, and this is mine. It is seriously the Best Chocolate Chip Cookie Recipe Ever! I have been making these for many, many years and everyone who tries them agrees they’re out-of-this-world delicious!

Plus, there’s no funny ingredients, no chilling, etc. Just a simple, straightforward, amazingly delicious, doughy yet still fully cooked, chocolate chip cookie that turns out perfectly every single time!

These are everything a chocolate chip cookie should be. Crispy and chewy. Doughy yet fully baked. Perfectly buttery and sweet.

The more your objective is to persuade rather than explain, the longer your introduction will be. Persuade no more than you must. Get to the explaining as fast as you can.

The Middle

A guide should start with the most basic, most important, most frequently-used information. In the case of Vim this is the hjkl keys. The Hy tutorial starts with prefix notation. The cookie recipe starts off with four bullet points.

1. Soften butter. If you are planning on making these, take the butter out of the fridge first thing in the morning so it’s ready to go when you need it.

2. Measure the flour correctly. Be sure to use a measuring cup made for dry ingredients (NOT a pyrex liquid measuring cup). There has been some controversy on how to measure flour. I personally use the scoop and shake method and always have (gasp)! It’s easier and I have never had that method fail me. Many of you say that the only way to measure flour is to scoop it into the measuring cup and level with a knife. I say, measure it the way you always do. Just make sure that the dough matches the consistency of the dough in the photos in this post.

3. Use LOTS of chocolate chips. Do I really need to explain this?!

4. DO NOT over-bake these chocolate chip cookies! I explain this more below, but these chocolate chip cookies will not look done when you pull them out of the oven, and that is GOOD.

In all three cases a beginner could stop there and leave having learned something useful. The Vim beginner can navigate in Vim. The Hy beginner can write a line of code. The cook can put more chocolate chips in her cookies. A reader should always be learning something useful. If at any point you are not teaching the reader something useful to a beginner who knows x but does not know y, the reader will stop reading, because you are not doing your job.

Don't get bogged down in details. Legalistic pedantry is for documentation. Too much detail makes things hard for beginners. You are not allowed to overgeneralize, either; misinformation sets your reader up for failure down the road. Just skip over the details.

What constitutes "the basics" tends to be consistent within fields and subspecialties. If you are not sure what the basics are then pick an objective and tell the reader the minimum necessary to competently achieve it.

Use big headings and bold text to emphasize important ideas. A guide should be skimmable. Readers need the option to skip to the next section if they already understand the current section or if they have a learning objective slightly different from y.

The End

If you can, the best way to end a guide is to point the reader in the direction of where to continue their education. If you don't have anywhere to point them then you can just end it abruptly.

Exercise

Write a guide to something you care about. Niche topics are fine. The most important thing is that you pick a topic you already know very well. Guides distill information. The more you know about a topic, the more selective you can be about what to include.

Discuss

### D&D.Sci May 2021: Monster Carcass Auction

7 мая, 2021 - 22:33
Published on May 7, 2021 7:33 PM GMT

You are an apprentice to Carver, the most successful butcher in your tiny, snow-swept village. Today, for the first time since you joined her, she is sending you to buy carcasses at the daily Auction.

(The (first-price, sealed-bid) Monster Carcass Auction began as a collective effort by local shopkeepers to divert Adventurers from trying to sell them random corpses, but has since become an integral part of the village economy, as well as the population’s main protein source.)

Carver thinks you should trust your instincts and bid however feels right. It’s an approach that’s served her well thus far – the record you’ve been compiling of her bids and subsequent sales attests to that, among other things – but you suspect a more data-driven approach would work better. And if you do well enough on this expedition, that might suffice to prove it to her.

You make sure to arrive at the very end of the event, like your boss always does; this means you’ll lose any tie-breakers – matching bids are resolved in favour of whoever bid first – but also means your rivals will have already put in their bids, so none of them will be able to change their bidding strategy to account for Carver’s absence.

The lots available are as follows:

| Lot | Species | Days Since Death |
|---|---|---|
| #1 | Yeti | 0 |
| #2 | Snow Serpent | 2 |
| #3 | Snow Serpent | 1 |
| #4 | Winter Wolf | 1 |
| #5 | Yeti | 5 |
| #6 | Winter Wolf | 1 |
| #7 | Snow Serpent | 1 |
| #8 | Snow Serpent | 5 |
| #9 | Winter Wolf | 3 |
| #10 | Winter Wolf | 7 |
| #11 | Winter Wolf | 8 |
| #12 | Snow Serpent | 8 |
| #13 | Winter Wolf | 2 |

(As usual, this is all the information given to bidders; the original organizers took the term ‘blind auction’ a little too literally, and by the time anyone realized, the practice of hiding almost everything about the lots had become a tradition.)

You and your employer are risk-neutral, and don’t care how much or little time and effort you spend butchering. You brought 400 silver pieces. How much will you bid for each lot?

Notes:

• Payments are collected in lot order; if you’re unable to pay your bid by the time a given lot comes up, you lose your claim to that lot but incur no penalty.
• Your records are in no particular order, but the glacial pace of life in your village suggests there are no time trends to account for.

I’ll be posting an interactive letting you test your decision, along with an explanation of how I generated the dataset, sometime next Friday. I’m giving you a week, but the task shouldn’t take more than a few hours; use Excel, R, Python, Ouija boards, or whatever other tools you think are appropriate. Let me know in the comments if you have any questions about the scenario.

If you want to investigate collaboratively and/or call your decisions in advance, feel free to do so in the comments; however, please use spoiler tags or rot13 when sharing inferences/strategies/decisions, so people intending to fly solo can look for clarifications without being spoiled.

Discuss

### Why quantitative finance is so hard

7 мая, 2021 - 22:29
Published on May 7, 2021 7:29 PM GMT

Quantitative finance (QF) is the art of using mathematics to extract money from a securities market. A security is a fungible financial asset. Securities include stocks, bonds, futures, currencies, cryptocurrencies and so on. People often use the techniques of QF to extract money from prediction markets too, particularly sports betting pools.

Expected return is future outcomes weighted by probability. A trade has edge if its expected return is positive. You should never make a trade with negative expected return. It is not enough just to use expected return. Most people's value functions curve downward. The marginal value of money decreases the more you have. Most people have approximately logarithmic value functions.

A logarithmic curve is approximately linear when you zoom in. Losing 1% of your net worth hurts you slightly more than earning 1% of your net worth helps you. But the difference is usually small enough to ignore. The difference between earning 99% of your net worth and losing 99% of your net worth is not ignorable.

When you gain or lose 1% of your net worth with equal probability, the expected change to the logarithm of your wealth is a tiny -0.005%. When you gain or lose 99% of your net worth with equal probability, the expected change to the logarithm of your wealth is roughly -196%.
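
The expected change to log wealth under a symmetric 50/50 gain or loss can be computed directly; a minimal sketch:

```python
import math

def expected_log_change(f):
    """Expected change in log wealth from a coin flip that either
    gains or loses fraction f of your net worth."""
    return 0.5 * (math.log(1 + f) + math.log(1 - f))

print(f"{expected_log_change(0.01):.5f}")  # about -0.00005, i.e. -0.005%
print(f"{expected_log_change(0.99):.3f}")  # about -1.959, i.e. -196%
```

Note that the expected change is always negative: the logarithm penalizes losses more than it rewards equal-sized gains.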

This is called a risk premium. For every positive edge you can use the Kelly criterion to calculate a bet small enough that your edge exceeds your risk premium. In practice traders tend to use fractional Kelly.
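
As a sketch of the Kelly criterion the post mentions (the standard formula, not anything specific to this post): for a bet that wins with probability p and pays b-to-1, the Kelly-optimal stake is the fraction p - (1 - p)/b of your bankroll.

```python
def kelly_fraction(p, b):
    """Kelly-optimal fraction of bankroll to stake on a bet that
    wins with probability p and pays b-to-1 (stake lost otherwise)."""
    return p - (1 - p) / b

full = kelly_fraction(0.55, 1.0)  # a 55% coin at even odds -> stake 10%
half = 0.5 * full                 # "half Kelly", a common fractional choice
print(round(full, 4), round(half, 4))  # 0.1 0.05
```

Fractional Kelly, as in the last line, trades a little expected growth for a large reduction in variance, which is why traders favor it.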

Minimum transaction costs are often constant. It is not sufficient for your edge to merely exceed your risk premium. It must exceed your risk premium plus the transaction cost. Risk premium is defined as a fraction of your net worth but transaction costs are often constant. If you have lots of money then you can place larger bets while keeping your risk premium constant. This is one of the reasons hedge funds like having large war chests. Larger funds can harvest risk-adjusted returns from smaller edges.
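
To illustrate the fixed-cost point with made-up numbers (the edge, cost, and bankrolls below are my own illustrative assumptions, not the post's): the same edge at the same fractional risk clears the cost hurdle only for a large enough bankroll.

```python
def net_expected_profit(edge, stake, fixed_cost):
    """Expected profit of one trade: edge times stake,
    minus a constant transaction cost."""
    return edge * stake - fixed_cost

edge, risk_fraction, cost = 0.001, 0.01, 5.0  # 0.1% edge, 1% of bankroll, $5 cost
for bankroll in (10_000, 10_000_000):
    stake = risk_fraction * bankroll
    print(bankroll, net_expected_profit(edge, stake, cost))
```

The small account's expected profit is negative (about -$4.90 per trade) while the large account's is positive (about +$95), even though both take identical risk as a fraction of net worth.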

Getting an Edge

The only free lunch in finance is diversification. If you invest in two uncorrelated assets with equal edge then your risk goes down. This is the principle behind index funds. If you know you're going to pick stocks with the skill of a monkey then you might as well maximize diversification by picking all the stocks. As world markets become more interconnected they become more correlated too. The more people invest in index funds, the less risk-adjusted return diversification buys you. Nevertheless, standard investment advice for most[1] people is to invest in bonds and index funds. FEMA recommends you add food and water.
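
The risk reduction from diversification is easy to see in a simulation. This uses synthetic uncorrelated returns (not real market data); the volatility of an equal-weight portfolio of n uncorrelated, equal-volatility assets falls by roughly a factor of √n.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, n_periods = 20, 100_000
# Uncorrelated assets with identical expected return and volatility.
returns = rng.normal(loc=0.001, scale=0.02, size=(n_periods, n_assets))

single = returns[:, 0].std()            # volatility of one asset
portfolio = returns.mean(axis=1).std()  # equal-weight portfolio volatility
print(single / portfolio)               # close to sqrt(20), about 4.47
```

With correlated assets the improvement shrinks, which is the point about interconnected world markets: correlation erodes the one free lunch.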

All of the above is baseline. The baseline rents you can extract by mindlessly owning the means of production are called beta (β). Earning money in excess of beta by beating the market is called alpha (α).

There are three ways to make a living in this business: be first, be smarter or cheat.

―John Tuld in Margin Call

You can be first by being fast or using alternative data. Spread Networks laid a \$300 million fiber optic cable in close to a straight line from New York City to Chicago. Being fast is expensive. If you use your own satellites to predict crop prices then you can beat the market. Alternative data is expensive too.

If you want to cheat, go listen to Darknet Diaries. Prison is expensive.

Being smart is cheap.

Science will not save you

Science [ideal] applies Occam's Razor to distinguish good theories from bad. Science [experimental] is the process of shooting a firehose of facts at hypotheses until only the most robust survive. Science [human institution] works when you have lots of new data coming in. If the data dries up then science [human institution] stops working. Lee Smolin asserts this has happened to theoretical physics.

If you have two competing hypotheses with equal prior probability then you need one bit of entropy to determine which one is true. If you have four competing hypotheses with equal prior probability then you need two bits of entropy to determine which one is true. I call your prior probability weighted set of competing hypotheses a hypothesis space. To determine which hypothesis in the hypothesis space is true you need training data. The entropy of your training data must exceed the negentropy of your hypothesis space.

The negentropy of n competing hypotheses with equal prior probability is log n. Suppose your training dataset has entropy T. The number of competing hypotheses you can handle grows exponentially as a function of T.

log n = T, so n = e^T

The above equation only works if all the variables in each hypothesis are hard-coded. A hypothesis y = 2.2x + 3.1 counts as a separate hypothesis from y = 2.1x + 3.1.

A hypothesis can instead use tunable parameters. Tunable parameters eat up the entropy of your training data fast. You can measure the negentropy of a hypothesis by counting how many tunable parameters it has. A one-dimensional linear model y = ax + b has two tunable parameters. A one-dimensional quadratic model y = ax² + bx + c has three tunable parameters. A one-dimensional cubic model y = ax³ + bx² + cx + d has four tunable parameters. Suppose pinning down each tunable parameter costs one unit of entropy. Then the entropy needed to collapse a hypothesis space with m tunable parameters, and hence its negentropy, equals m.

We can combine these equations. Suppose your hypothesis space has n separate hypotheses each with m tunable parameters. The total negentropy J equals the entropy necessary to distinguish hypotheses from each other plus the entropy necessary to tune a hypothesis's parameters.

J = m + log n

Logarithmic functions grow slower than linear functions. The number of hypotheses n is inside the logarithm. The number of tunable parameters m is outside of it. The negentropy of our hypothesis space is dominated by m. The number of competing hypotheses we can distinguish grows exponentially slower than the entropy of our training data. You can distinguish competing hypotheses from each other by throwing training data at a problem if they have few tunable parameters. If you have tunable parameters then the entropy required to collapse your hypothesis space goes up fast.
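
A toy illustration of the tunable-parameter problem (synthetic data and parameter counts of my own choosing, not from the post): with a fixed small training set, adding parameters drives training error toward zero without the data being able to tell the models apart.

```python
import numpy as np

rng = np.random.default_rng(42)
true_fn = lambda x: 2.0 * x + 3.0                  # the underlying relationship
x_train = rng.uniform(-1, 1, 8)                    # only 8 noisy observations
y_train = true_fn(x_train) + rng.normal(0, 0.5, 8)

errors = {}
for degree in (1, 3, 7):                           # 2, 4, and 8 tunable parameters
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    print(f"degree {degree}: train MSE {errors[degree]:.4f}")
# The degree-7 model (8 parameters, 8 data points) drives training error
# to ~0 by memorizing the noise; the training data has too little entropy
# to distinguish it from the true linear relationship.
```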

If you have lots of entropy in your training data then you can train a high-parameter model. Silicon Valley gets away with using high-parameter models to run its self-driving cars and image classifiers because it is easy to create new data. There is so much data available that Silicon Valley data scientists focus their attention on compute efficiency.

Wall Street is the opposite. Quants are bottlenecked by training data entropy.

Past performance is not indicative of future results

If you are testing a drug, training a self-driving car or classifying images then past performance is usually indicative of future results. If you are examining financial data then past performance is not indicative of future results. Consider a financial bubble. The price of tulips goes up. It goes up some more. It keeps going up. Past performance indicates the price ought to keep going up. Yet buying into a bubble has negative expected return.

Wikipedia lists 25 economic crises in the 20th century plus 20 in the 21st century to date for a total of 45. Financial crises are very important. Hedge funds tend to be highly leveraged. A single crisis can wipe out a firm. If a strategy cannot ride out financial crises then it is unviable. Learning from your mistakes does not work if you do not survive your mistakes.

When Tesla needs more training data to train its self-driving cars it can drive more cars around. If a hedge fund needs 45 more financial crises to train its model then it has to wait a century. World conditions change. Competing actors respond to the historical data. New variables appear faster than new training data. You cannot predict financial crises just by waiting for more training data because the negentropy of your hypothesis space outraces the entropy of your training data.

You cannot predict a once-in-history event by applying a high-parameter model to historical data alone.

1. If your government subsidizes mortgages or another kind of investment then you may be able to beat the market. ↩︎

Discuss

### Life and expanding steerable consequences

7 мая, 2021 - 21:33
Published on May 7, 2021 6:33 PM GMT

Financial status: This is independent research. I welcome financial support to make further posts like this possible.

Epistemic status: I believe this is a helpful lens through which to view the significance of AI in a way that is not fundamentally about intelligence.

In this world, there are two types of objects: objects whose steerable consequences diminish over time, and objects whose steerable consequences expand over time.

Consider a small rock on a table. Suppose I move that rock a little to the left, and consider the ways that this action might affect the future. The rock might have been holding down some papers, and those papers might now be blown about by a gust of wind. Or someone might walk into the room and, seeing the rock being out of place, walk over and move it back. In fact the rock exerts a gravitational effect on every other object in the universe, and the tiny movement of the rock will have consequences that ripple out for the life of the universe.

But although these consequences are real, the rock cannot be used by us to produce a predictable large-scale effect on the world very far into the future — say, on the timescale of decades. The consequences of moving the rock become too unpredictable for us to reason about. Even if we are allowed to move the rock to any point in the universe, we cannot really use this power to effect any useful control over the future, at least not without involvement from humans. As we consider the causal fallout of moving the rock we quickly hit a wall of foggy uncertainty, and so in this sense the rock cannot be used on its own to steer the future.

But consider now the action of introducing a living organism to the surface of Mars. Suppose that some scientists have chosen or engineered a particular kind of mold that will thrive in the environmental conditions present on Mars. Suppose that we move an object of the same size as the rock, only now the object is a mold specimen together with an initial food source, and we move it from some laboratory on Earth to the surface of Mars. Although the physical size of this initial specimen might be quite small, this action could have consequences that eventually affect the entire surface of Mars.

Furthermore, some of these consequences are quite predictable. We can predict that the mold will reproduce. We can predict that the specimen will spread outwards from its initial location. We can predict that after a few decades we might find copies of the mold all over the surface of Mars. Other consequences are fundamentally unpredictable, yet it is clear that there are some predictable large-scale consequences.

Suppose now that we genetically engineer the specimen to grow under some conditions and not others. By picking these conditions precisely, we might cause the mold to spread to only the northern hemisphere of Mars, or to grow only at low altitudes, or only at high altitudes. In each case, the only thing we are transporting to Mars is a single specimen the size of a small rock. We are not ourselves spreading the mold over a mountain range or over the low-altitude parts of the planet, but by tweaking the configuration of atoms within this initial specimen we can choose how and where the mold will spread. In this sense the mold has expanding steerable consequences because a physically small specimen can be altered in a way that predictably steers large-scale effects over a long time horizon.

Another object that has this expanding steerability property is the human being. Transport a small colony of humans together with appropriate resources and an initial life support system to the other side of the universe, and over a few thousands or tens of thousands of years an entire space-faring civilization might spring up, perhaps rearranging the matter and energy in that part of the cosmos at a macroscopic scale.

Which kind of objects have this property of expanding steerability? As of May 2021, there are no non-biological objects on Earth that have this property, without ongoing input from humans. For example, suppose I transported a robot to the surface of Mars. This has been done several times, and it has not had the kind of expanding steerable consequences that transporting a mold specimen to the surface of Mars might have[1]. Furthermore we have not yet built robots that could, without any external help from humans, be used to steer the future, even to the limited extent that a mold specimen might be used to steer the future.

If all biological life on Earth disappeared tomorrow, but all machines built by humans continued operating, the entire ecosystem of machines would quite quickly wind down. Much of the software that runs services on the internet relies on near-constant human oversight, and would cease operating in the absence of humans. But even the most robust pieces of software would cease operating when the power grid decayed to the point of inoperability. And even the most robust machines that humans have ever built, such as some satellites and perhaps some computers located underground with nuclear power sources, will not have the kind of expanding consequences in this neighborhood of the universe that biological life could have.

So in this regard, all the machines that humans have ever built are more like the rock on the table than they are like the mold specimen. Whereas life is winding up, the machines we have built thus far are winding down.

But this may be about to change. Humans appear poised to create machines that could have expanding steerable consequences, independent of biological life. If we succeed at building truly intelligent machines, we might create machines that can collect resources, maintain and upgrade themselves, expand or reproduce themselves, grow their own impact from small to large, and reshape significant patches of the universe. The precise initial configuration of such machines may determine much of what changes they make to their patches of the universe.

All biological life appears to have originated from a single seed organism approximately four billion years ago. This seed organism was almost certainly very small, but its unfolding consequences thus far have been as vast as the Earth, and may yet continue to unfold beyond the Earth. Now, four billion years later, we are about to set in motion a second seed.

1. Yes, the Mars rovers have had large consequences via the information they have beamed back to Earth, but these consequences have flowed via humans, which are a form of biological life that very much does have the expanding steerability property ↩︎

Discuss