LessWrong.com News

A community blog devoted to refining the art of rationality

The Colonization of Cults, Nonprofit Organizations, and Society

Published on October 17, 2021 10:02 PM GMT

Over the past 8+ years of nonprofit experience, and during a brief stint of training with a high-demand group focused on meditation and leadership development (The Monastic Academy), I have observed how patterns and ideologies rooted in the complex socio-emotional and historical contexts of American culture and colonization show up, again and again, both within the broader systemic issues nonprofits exist to address and within the organizations themselves.

Recently, while reviewing a list of characteristics and patterns common in “cult” dynamics, I recognized that I was also looking at a list that describes colonization. According to a simple Google search, colonization is the action or process of settling among and establishing control and domination over the indigenous people of an area. Historically, global colonization has often targeted and disproportionately affected communities of color: the genocide of indigenous peoples, forced assimilation into cultural and religious practices, loss of language and culture, the taking of indigenous lands, the enslavement of Africans and other peoples, the forced separation and abuse of indigenous children in boarding schools, and more. Before then, many groups within Europe had their own histories of invasion, conquest, and colonization (e.g. the spread of the Roman Empire, the English clearing of the Scottish Highlands). These practices and the history of colonization have left a deep psychological, physical, emotional, and spiritual imprint on people from all walks of life. Unfortunately, the unconscious and conscious patterns and attitudes inherent in “colonization” still show up throughout all levels of our society, perpetuating harm and inequity, and are often deeply embedded within our frameworks for and understanding of community, leadership, and institutional and organizational management. This makes it critically important to be intentional about recognizing and addressing unhealthy and dysfunctional patterns of behavior, structures, and practices that perpetuate harm directly and indirectly within our communities and organizations.

On the far end of this spectrum, we see high-demand groups, commonly known as “cults”. Many of these groups operate under 501(c)(3) nonprofit status and under the guise of a mission to bring transformative change or to be of service to the world. However, their outward-facing mission and values often prove incongruent with the internal narrative and actual impacts of the organization. Perhaps these groups are not as separate from the dominant culture as we might like to think, but are in fact intense microcosms in which particular underlying ideologies, structures, and behaviors are taken to an extreme. Common characteristics, ideologies, and patterns within cults include, but are not limited to: recruiting of “elite or special ones”; we are the chosen ones; we are going to save the world; we have the right to have and exercise power over others because we are better in x, y, or z ways; unlimited expansion (Manifest Destiny, anyone?); use of religion/spirituality and power to control people and governments; the dominated group submitting to the will of the dominator; hierarchical and authoritarian (often patriarchal) styles of leadership; breaking down of one's personal and cultural identity and replacing it with a new cult dogma and identity; abuses of power and lack of accountability for those abuses; distorted and disempowered relationships between feminine/masculine energies and persons; disconnection from and distrust of one's own body and emotions; and an unhealthy relationship to resources (i.e. money, land) and resource extraction (i.e. unethical fundraising practices, illegal activities), etc. Involvement and hierarchy within these groups often, though not always, fall along the same lines of class, gender, and race found in the broader society, since high-demand groups frequently target people with money and greater social influence.

At the same time, "cults" themselves are likely a long-term cultural byproduct of colonization, which has left many people rootless, carrying intergenerational trauma, experiencing "the loss of the village", with inadequate socio-emotional support networks and a lack of cultural identity and connection to the cultures their ancestors came from. These impacts also extend to many people of European descent. It is important to acknowledge that capitalism has played a significant role in this breakdown and the “loss of the village”. Many people today, especially young people, are hungry for a sense of cultural identity, belonging, initiation, community, guidance, and mentorship; are searching for solutions to societal and environmental breakdowns; and need the shared purpose and meaning that is lacking in the broader culture. Throw in a major loss, life transition, or past childhood trauma without the support of a "village", and people are incredibly vulnerable to charismatic leaders who more or less promise to give them everything they have been looking for, at a "price.” This price is often their agency, their power, their silence, access to their resources (money, sex, social influence), and their complicity in perpetuating harmful power structures and dynamics.

Many nonprofits and companies that do not fit the defining characteristics of a "cult" have also been guilty of perpetuating these systems, patterns, and practices that further disempower marginalized groups, individuals, and local communities (especially poor and BIPOC communities). Some of the ways these patterns of "colonization" show up in nonprofit organizations and companies that may or may not meet the criteria for “cults”, but that are nonetheless problematic and cause harm within our communities, are:

• Mission-driven vs. Community-centered.  The “mission” and/or the needs/desires of the institution/leaders are put above the needs of the communities they serve (i.e. clients, participants, customers) and employees even in cases where the actions of the organization cause harm to those who interact with it. Healthy organizations intentionally use metrics for evaluation and feedback processes to gather data about their impact and the needs of communities through surveys, focus groups, listening to feedback and grievances within the community, and centering the experiences and needs of the community as being fundamental to their work and mission. They do not put "the mission" above the needs of the community or use it to justify unethical and/or harmful behaviors and impacts.
• Hierarchical and inequitable power structures that fall along lines of class, race, ability, and gender (often unconsciously), creating and perpetuating longstanding patterns of harm, inequitable access and opportunity, power imbalances, and abuses of power. For this reason, many organizations have started to shift towards collaborative and decentralized models of leadership; focused education and training in anti-oppression models are also essential.
• Lack of effective accountability and grievance processes. Healthy organizations create intentional structures and processes that ensure that leaders, employees, and community members understand standards of conduct and are accountable for their actions and impact. Some examples include a well-developed and diverse board with at least 7 members without conflicts of interest (per nonprofit best practices), clearly outlined grievance and feedback processes for employees and participants, checks and balances within the system, distribution of power, committees and/or employees devoted to handling grievances/complaints and accountability, acknowledging and making amends for harms done, active engagement in restorative practices and mediation processes, etc.
• Erasure of personal and cultural identities and differences through policies that limit the expression of identities (i.e. sexual, religious, political, etc.), fear-based compliance and silencing of voices of dissent, lack of inclusion, and lack of power given to those with different backgrounds and perspectives. While a healthy company or organizational culture is important to cultivate and can create strong group cohesion, when branding, policies, uniforms, and other practices seek to exclude or replace existing identities, this can lead to unhealthy group dynamics. Healthy organizations value diverse backgrounds, thinking, and approaches, and see the essential and valuable contributions these bring to any organization.
• “Save the world” narratives and the marketing of self-aggrandizing narratives that many organizations and companies engage in by exaggerating the importance, role, or uniqueness of their mission and work. Healthy organizations demonstrate awareness of other organizations that engage in similar types of transformative and social change work in their focus area, and/or others that offer similar products or services in the for-profit world, as well as what is unique about their own approach, methodology, service, or product. They understand that social change and transformation is a collaborative process, and they understand the importance of accurately representing and demonstrating the claims they are making, especially when positioning themselves as “best” or better than alternative options.
• Engaging in narrative control through the use of nondisclosure agreements, threats, or other forms of manipulation and coercion. By silencing accounts of harm and unethical conduct within the organization they effectively control the narrative and sharing of information. Withholding or distorting information that would reflect negatively on the organization and/or impairs people’s ability to make fully informed and consent-based decisions through PR, reports, and solicitations that do not accurately describe the activities of an organization or events being reported on; especially to stakeholders, funders, and major donors.
• People are treated as a means to an end, and actions that are unethical or that either intentionally or unintentionally result in harm are dismissed or rationalized through the argument that “the ends justify the means.” Healthy organizations and companies center the experience and needs of community members, participants, employees, clients, and customers; and do not sacrifice the wellbeing and safety of any of these in pursuit of “the mission.”
• Appropriation and consumption of cultural identities (especially BIPOC identities) practices, knowledge, attire, lands, and more without the permission of those who are a part of that cultural identity and without the proper cultural and historical context for the things we are partaking in. Appropriation can also extend to using and/or taking credit for other people's ideas and intellectual property without their permission.

Because of the prevalence of these patterns within organizations and community spaces, I believe it is critical that all organizations (regardless of whether they use a nonprofit or for-profit model) engage in accountability processes, staff training, and community dialogue focused on decolonization, anti-racism, anti-sexism, and other anti-oppression frameworks, and engage in preventive conversations and measures designed to address systemic barriers and common pitfalls, and to intentionally nourish and maintain healthy organizational practices and community dynamics. It takes a lot of work to examine and de-program the many toxic ideas of leadership and community we've received and to build healthy organizations, because many of the models we have inherited are harmful.

Please feel free to comment below on other ways you’ve seen patterns and behaviors of “colonization” show up in organizational cultures. By no means is this list of examples comprehensive. How have you seen these patterns show up within the communities and organizations you’ve been a part of? What steps can we take to radically “decolonize” our organizations and communities? What practices and structures do you think best support the development of healthy organizations and community dynamics?


Applied Mathematical Logic For The Practicing Researcher

Published on October 17, 2021 8:28 PM GMT

Asking for a friend[1]: what happened to Richard Hamming's social status after he started asking those pointed questions about the importance of research questions and individual career decisions? Was he, like, actually banished from the lunch table?

Technically, I am modeling for a living

A couple of months ago I started asking my colleagues during lunch what their definition of a "model" is. This question is important: our job consists of building, evaluating, and comparing models. I am not hoping for an Aristotelian list of necessary & sufficient conditions, but it still seems like a good idea to "survey the land". Also, admittedly, lunch can get a bit boring without challenging questions.

An abstract drawing of a computational model. CGD generated.

I got a range of responses:

"a description of a phenomenon from which you can reason (= a description you can manipulate to tell more about the phenomenon than you would have been able to tell without it)"

"It should be something like a representation of the modelled system without representing it completely. Perhaps most importantly that it preserves the causal relationships between the system elements without completely mapping these elements?"

"an abstraction of reality"

I also ran into this adage again and again (attributed to a different person every time):

"All models are false, but some are useful."

Along similar lines, there is a quote from the influential computational neuroscientist Larry Abbott:

"the term 'realistic' model is a sociological rather than a scientific term."

Alright, survey done, lunch is over. Back to...

In search of tighter concepts

No! I'm not satisfied. What do you mean it's a sociological term? What do you mean they are false? Can a model have a truth value? If a model is a "representation" / "abstraction" / "description" then what exactly is a "representation" / "abstraction" / "description"? This is not some idle philosophical nitpicking, this question is immediately important. As a reviewer, I have to judge whether a model is good (enough). As a researcher, I want to build a good model. I'm not going to devote my career to building models if I don't have a really good idea of what a model is.

I hope you can tell from my extensive use of italicized words that this is a topic I am rather passionate about. If the question of a good model is a sociological question, then it's subject to trends and fads[2]. And if the term "model" is broad enough to fit "detailed biophysical models", "abstract phenomenological models", "linear regression" and "a cartoon in Figure 8" under its umbrella, then it's inevitable that our intuitive understandings of what constitutes a good model diverge. Heck, the term is so broad, technically even this should qualify:

An abstract painting of a very attractive albatross that could totally be a fashion model. CGD generated.

So in the spirit of conceptual engineering and dissolving questions, here is my attempt at laying out what I think of when I think of models. This is obviously not authoritative and it's far from rigorous. This is just my "working definition", which I wrote down to force myself to tighten my terminology.

Mathematical logic to the rescue

Since we mean so many different things by the term "model" it makes sense to start very general, i.e. mathematical. There is indeed a subfield of mathematics called "model theory" that makes some very useful distinctions! I'll trample over all subtleties to get to the core quickly, but consider checking out this or this for accessible introductory reading.

Here goes the central definition:

A model is a (mathematical) object that satisfies all the sentences of a theory.

To make this useful, we have to further define the used terms.

What is a theory? It's a set of sentences. What is a sentence? Well, it's pretty much what you would expect - it's a string of symbols constructed from an alphabet according to some fixed rules. A famous example of a theory is Peano arithmetic, but really the definition is much more general:

1. A dynamical system, given as a set of differential equations[3], is a theory.
2. A cellular automaton, given as a set of transition rules, is a theory.
3. Any recursively enumerable set of sentences of a formal language, given as a set of production rules, is a theory.
An abstract drawing of a cellular automaton. CGD generated.

And here are the corresponding models:

1. A particular trajectory through state space, e.g. specified through initial conditions.
2. A particular evolution of the cellular automaton, again specified through the initial conditions.
3. A particular Turing machine that implements the production rules, specified through... (you get the idea).
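To make the theory/model distinction concrete, here is a toy sketch (my own construction, not from the post) of item 2: the transition rules of an elementary cellular automaton (Rule 90) play the role of the theory, and each particular evolution from an initial condition is one model satisfying it.

```python
# Theory: the transition rules of elementary cellular automaton Rule 90.
RULE_90 = {  # (left, center, right) -> next center; equivalent to left XOR right
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 0, (1, 0, 0): 1,
    (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 1, (0, 0, 0): 0,
}

def step(row):
    """Apply the transition rules once (wrapping around at the edges)."""
    n = len(row)
    return [RULE_90[(row[(i - 1) % n], row[i], row[(i + 1) % n])] for i in range(n)]

def evolve(initial, steps):
    """Model: one particular evolution, i.e. an object satisfying the theory."""
    rows = [initial]
    for _ in range(steps):
        rows.append(step(rows[-1]))
    return rows

# Different initial conditions yield different models of the same theory,
# each with "additional properties" the theory itself does not prescribe.
model_a = evolve([0, 0, 0, 1, 0, 0, 0], 3)
model_b = evolve([1, 0, 0, 0, 0, 0, 1], 3)
```

Both `model_a` and `model_b` satisfy every sentence of the theory (each row follows from the previous one by the rules), yet they differ in properties such as symmetry of the pattern, which illustrates the point about models having properties that do not follow from the theory.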

If we are allowed to be even more hand-wavy, then we can also incorporate models à la Tyler Cowen: To "model this [headline]" we have to come up with a theory (a set of sentences) from which the headline follows.

One important thing to note here is that every model "inherits" every property that follows from the theory. But the converse does not hold[4]: just because a model has a certain property, that property does not necessarily follow from the theory. In general, there will always be multiple models that satisfy a theory, each with different "additional properties" that go beyond what is prescribed by the theory[5].

Defining a model as an object satisfying a theory is broad enough to cover all the ways in which the term is used:

• the entire spectrum of mathematical models, from detailed biophysical to normative Bayesian, is specified by a set of equations (a theory) and instantiated with parameter choices.
• the "cartoon in Figure 8" is one particular (rather toothless) object that satisfies an implicit theory (consisting of a set of conjectured sentences).
• the albatross fashion model... doesn't fit. But you can't have everything, I'm told.

It also includes an interesting pathological case: to model a particular set of observations, we could just come up with a theory that contains all the observations as axioms, but no production rules. Then the observations themselves trivially satisfy the theory. This is clearly useless in some sense[6] (a dataset shouldn't be a model?) - but looking deeper into why it's useless reveals something about what constitutes a good model - or, by extension, a good theory.

Here is my definition:

A good model of a phenomenon is one that allows us to understand something about the phenomenon. If all the models of a theory are good models, the theory is a good theory.

Again, we need to define our terms for this to make sense. What is a phenomenon? A phenomenon is some (conjunction of) physical process(es). It's something out there in the territory. What does understand mean? Understanding a phenomenon means predicting (better than chance level) the state of the phenomenon at time t+1 given the state at time t.

Why does it make sense to set up things like this?

Models with benefits

First, it establishes a neat hierarchy. Understanding is gradual: It goes from non-existing (chance level) to poor (consistently above chance[7]) to great (almost perfect prediction) to complete (100% prediction accuracy).

With this definition, a "black box" deep learning model that is able to predict a percentage of brain activity does provide some understanding about a part of the brain. Similarly, a mean-field model that has "lost" some conversion factor in its units can also still be a good model, as long as it is able to get the direction of the evolution of the state correct.

Second, making predictions the central criterion for model quality helps us avoid unproductive disputes resulting from mismatched terms. The usual example here is "If a tree falls in the forest, does it make a sound?", which can lead to a highly unproductive discussion if asked at the lunch table. But when explanations are evaluated according to their predictive power, misunderstandings are resolved quickly: Either a tape recorder will or won't record airwaves. Either there is or there isn't activation in some auditory cortex.

Third, to have a good theory, you need to demonstrate that all its models are good (according to the definition above). This gets naturally easier if there are fewer models that satisfy the theory, thus incentivizing you to remove as many free parameters from the theory as possible[8]. Ideally, you'll want a unique characterization of a good model from your theory.

Finally, this definition formalizes the "all models are wrong, but some are useful" adage. To get 100% prediction accuracy for a physical process you have to go down to the level of particles. For example, a fluid dynamics model of water motion will get you very far in terms of predictive power. In that sense, it's a very good model. But to get even close to 100%, you'll want an atomic model of water. And if you keep pushing for ever more predictive power, you'll have to decompose your problem further and further, and eventually you will get into very weird territory[9].

Thus, to determine whether a model is good or bad, you have to figure out which phenomenon it is trying to explain and then determine whether the model allows you to predict the time-evolution of the phenomenon better than chance level. This is a relatively low bar, but in my experience, it's still not easy to clear. Actually demonstrating that your performance is different from chance requires explicit performance metrics, which are not usually adopted. But that's a different story.
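The evaluation criterion can be sketched in a few lines. This is a toy example of my own (not from the post): the "phenomenon" is a two-state process that persists in its current state 80% of the time, the "model" predicts persistence, and we check that its accuracy beats the 50% chance level.

```python
# Minimal sketch: "understanding" = predicting the state at t+1 from the
# state at t better than chance. Toy phenomenon and toy model are my own.
import random

random.seed(0)  # make the run reproducible

def phenomenon(state):
    """Toy physical process: stays in its current state 80% of the time."""
    return state if random.random() < 0.8 else 1 - state

def model_prediction(state):
    """Toy model: predict that the state persists."""
    return state

# Generate a trajectory of the phenomenon.
states = [0]
for _ in range(10_000):
    states.append(phenomenon(states[-1]))

# Score the model against the chance level for a two-state process (0.5).
hits = sum(model_prediction(s) == s_next for s, s_next in zip(states, states[1:]))
accuracy = hits / (len(states) - 1)
print(f"model accuracy: {accuracy:.2f} (chance level: 0.50)")
```

The accuracy should land near 0.80, comfortably above chance, so by the definition above this crude model provides *some* understanding of the toy phenomenon, without being anywhere near complete.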

Cliffhanger!

This is almost all I wanted to say on the topic. But I glossed over an important point in that exposition: If a model is a mathematical object, why might we expect that it can predict physical processes out there in the territory? In fact, why should there be any similarity between the solar system and the atom[10]? Why does analogical reasoning work?

I'm glad you ask. Stay tuned - I'll dig into that next time.

[1] Okay, okay, I can't lie to you. That friend is me. I'm worried about getting banished from the lunch table. ↩︎

[2] And it's usually up to an influential "ingroup" to decide what fits in and what doesn't. ↩︎

[3] Plus ZFC, I guess. ↩︎

[4] This converse only holds when the theory uniquely and completely specifies the model, which is pretty hard to achieve in principle. See Logical Pinpointing. ↩︎

[5] One might be tempted to argue that if many different models all satisfy the same theory and share a certain property, this is evidence that the property actually does follow from the theory. This isn't guaranteed, but it might work in some cases. In computational neuroscience, this corresponds to the practice of demonstrating that the desired result holds even when the parameters are slightly perturbed. ↩︎

[6] This has some overlap with Chomsky's levels of adequacy: a theory that includes only the observations as axioms has observational adequacy, but neither descriptive nor explanatory adequacy. ↩︎

[7] or below! If you're consistently worse than chance that is very useful information. ↩︎

[8] Thus we arrive at an interesting version of Occam's razor. ↩︎

[9] Let's not talk about quantum stuff on this Substack, okay? ↩︎

[10] Yes, I know that the Bohr model is not the end of the story. But it is still able to explain basically all of chemistry. And also "we don't talk about quantum physics on this Substack". ↩︎


How much should you update on a COVID test result?

Published on October 17, 2021 7:49 PM GMT

This is a writeup of COVID test accuracies that I put together for my own interest, and shared with friends and housemates to help us reason about COVID risk. Some of these friends suggested that I post this to LessWrong. I am not a statistician or an expert in medical research.

Background

We often hear that some kinds of COVID tests are more accurate than others — PCR tests are more accurate than rapid antigen tests, and rapid antigen tests are more accurate if you have symptoms than if you don't. A test's accuracy is often presented as two separate terms: sensitivity (what proportion of diseased patients the test accurately identifies as diseased) and specificity (what proportion of healthy people the test accurately identifies as healthy). But it's not obvious how to practically interpret those numbers: if you test negative, what does that mean about your odds of having COVID?

This writeup attempts to answer the question, "how much more (or less) likely am I to have COVID given a positive (or negative) test result?" In particular, this is an attempt to calculate the Bayes factor for different types of COVID test results.

The Bayes factor is a number that tells you how much to update your prior odds of an event (in this case, your initial guess at how likely someone is to have COVID) given some piece of new evidence (in this case, a test result). It's calculated based on the test's sensitivity and specificity. If a test has a Bayes factor of 10x for a positive test result, and you test positive, then you should multiply your initial estimated odds of having COVID by 10x. If the same test has a Bayes factor of 0.3x for a negative test result, and you test negative, then you should update your prior odds of having COVID by 0.3x.

Using Bayes factors

(For an excellent explanation of Bayes factors and the motivation behind using them to interpret medical tests, I highly recommend this 3Blue1Brown video, which inspired this post.)

There's a well-known anecdote where doctors in a statistics seminar were asked how they would interpret a positive cancer test result. They were given the information that the test has a sensitivity of 90% (10% false negative rate), a specificity of 91% (9% false positive rate), and that the base rate of cancer for the patient's age and sex is 1%. Famously, nearly half of doctors incorrectly answered that the patient had a 90% probability of having cancer. [1] The actual probability is only 9%, since the base rate of cancer is low in the patient's population. One important lesson from this anecdote is that test results are an update on your priors of having the disease; the same positive test result implies different probabilities of disease depending on the disease's base rate.
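The arithmetic behind the correct answer is quick to check. The following snippet (my own, not from the post) reproduces the seminar problem using odds and a Bayes factor:

```python
# Seminar problem: sensitivity 90%, specificity 91%, base rate 1%.
sensitivity, specificity, base_rate = 0.90, 0.91, 0.01

bayes_factor_pos = sensitivity / (1 - specificity)  # 0.90 / 0.09 = 10
prior_odds = base_rate / (1 - base_rate)            # 1:99
posterior_odds = prior_odds * bayes_factor_pos      # 10:99
posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"posterior probability of cancer: {posterior_prob:.0%}")  # → 9%, not 90%
```

The low base rate is doing all the work: a tenfold update on 1:99 odds still leaves the patient much more likely healthy than not.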

Bayes factors make this update easy. A test's Bayes factor is a single number that, when multiplied by your prior odds, gives you your posterior odds. For a COVID test, you can start with your initial estimate of how likely you are to have COVID (based on the prevalence in your area, or your current number of microCOVIDs) and update from there.

To calculate the Bayes factor for a negative COVID test, you take the probability that you'd test negative in the world where you do have COVID and divide it by the probability that you'd test negative in the world where you do not have COVID. Expressed mathematically:

Bayes factor(−) = p(−|COVID) / p(−|no COVID) = (false negative rate) / (true negative rate) = (1 − sensitivity) / specificity

Similarly, the Bayes factor for a positive COVID test is the probability of a positive result in the world where you do have COVID, divided by the probability of a positive result in the world where you do not have COVID:

Bayes factor(+) = p(+|COVID) / p(+|no COVID) = (true positive rate) / (false positive rate) = sensitivity / (1 − specificity)

To interpret the test result, express your prior probability of having COVID as an odds, and then multiply those odds by the Bayes factor. If you initially believed you had a 10% chance of having COVID, and you got a negative test result with a Bayes factor of 0.1x, you could multiply your prior odds (1:9) by 0.1 to get a posterior odds of 0.1:9, or about 1%.
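The whole update fits in a small helper function. This is a sketch of my own (the function name is made up, not from the post) that takes a prior probability and a Bayes factor and returns the posterior probability:

```python
def update_with_bayes_factor(prior_prob, bayes_factor):
    """Convert probability to odds, multiply by the Bayes factor,
    convert back to a probability."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

# The example above: 10% prior, negative test with a Bayes factor of 0.1.
p = update_with_bayes_factor(0.10, 0.1)
print(f"posterior probability: {p:.1%}")  # ≈ 1.1%
```

A Bayes factor of exactly 1 leaves the prior unchanged, which is a handy sanity check: a totally uninformative test should not move your estimate at all.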

List of COVID tests with Bayes factors

Below are my calculations for the Bayes factors of rapid nucleic acid amplification tests (which includes rapid PCR tests) as well as rapid antigen tests (the type available for home use in the US). I used sensitivity and specificity estimates from a Cochrane metastudy on rapid tests [2] initially published in August 2020 and last updated in March 2021.

Rapid Antigen Test

This is a test for fragments of SARS-Cov-2 protein [3]. It's typically administered via nasal swab, is available to purchase in the US as at-home test kits, and can be very quick (15 minutes for some brands). It has lower sensitivity (aka more false negatives) than most nucleic acid tests.

Are you symptomatic?

The Cochrane metastudy reviewed 3 brands of rapid antigen test (Coris Bioconcept COVID-19 Ag, Abbott Panbio COVID-19 Ag, and SD Biosensor Standard Q COVID-19 Ag) and found that the sensitivity of all these tests was notably higher for symptomatic patients than for patients with no symptoms. They also found that these tests were most sensitive within the first week of developing symptoms.

The review's estimates for sensitivity were:

• No symptoms: 58.1%  (95% CI 40.2% to 74.1%)
• Symptomatic, symptoms first developed <1 week ago: 78.3% (95% CI 71.1% to 84.1%)
• Symptomatic, symptoms first developed >1 week ago: 51.0% (95% CI 40.8% to 61.0%)

The review found that specificity was similar across all patients regardless of symptom status — about 99.6% (95% CI 99.0% to 99.8%).

Rapid antigen tests: if you don't have symptoms
• Estimated Bayes factor for a negative result: about 0.4x ((1 − 0.581) / 0.996 ≈ 0.42)
• Estimated Bayes factor for a positive result: about 145x (0.581 / (1 − 0.996) ≈ 145)

So, if you got a negative result, you can lower your estimated odds that you have COVID to 0.4x what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 145x what they were before.

Rapid antigen tests: if you have symptoms that developed <1 week ago
• Estimated Bayes factor for a negative result: about 0.2x ((1 − 0.783) / 0.996 ≈ 0.22)
• Estimated Bayes factor for a positive result: about 196x (0.783 / (1 − 0.996) ≈ 196)

So, if you got a negative result, you can lower your estimated odds that you have COVID to 0.2x what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 196x what they were before.

Rapid antigen tests: if you have symptoms that developed >1 week ago
• Estimated Bayes factor for a negative result: about 0.5x ((1 − 0.510) / 0.996 ≈ 0.49)
• Estimated Bayes factor for a positive result: about 128x (0.510 / (1 − 0.996) ≈ 128)

So, if you got a negative result, you can lower your estimated odds that you have COVID to 0.5x what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 128x what they were before.
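The three sets of antigen-test figures above follow directly from the review's point estimates. Here is a short script (my own, not from the post) that reproduces them from the shared specificity of 99.6% and the per-group sensitivities:

```python
def bayes_factors(sensitivity, specificity):
    """Return (negative-result, positive-result) Bayes factors."""
    bf_negative = (1 - sensitivity) / specificity
    bf_positive = sensitivity / (1 - specificity)
    return bf_negative, bf_positive

specificity = 0.996  # shared across symptom groups per the review
for group, sens in [("no symptoms", 0.581),
                    ("symptoms < 1 week", 0.783),
                    ("symptoms > 1 week", 0.510)]:
    bf_neg, bf_pos = bayes_factors(sens, specificity)
    print(f"{group}: negative ≈ {bf_neg:.2f}x, positive ≈ {bf_pos:.0f}x")
```

The printed values round to the figures quoted above (0.42x/145x, 0.22x/196x, 0.49x/128x). Note how the huge positive-result factors all come from the very high specificity, while the modest negative-result factors come from the mediocre sensitivity.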

The Abbott BinaxNOW At-Home Test

Unfortunately the Cochrane metastudy didn't include data for the Abbott BinaxNOW at-home test, which I was particularly interested in because it's the most common at-home test in the US, and is the test my household uses most frequently. I've seen a few sources (e.g. [4]) that claim that the Abbott BinaxNOW test is slightly more sensitive and about as specific as the Abbott Panbio Ag test, which was reviewed by the Cochrane metastudy, so it's possible that this test has slightly higher predictive power than the ones reviewed above.

Nucleic Acid Amplification Test (NAAT)

This test looks for viral RNA from the SARS-Cov-2 virus [3]. It is typically administered via nasal swab. It's also called a "nucleic acid test" or "molecular test". PCR tests are a type of NAAT. The Cochrane metastudy indicated that sensitivity and specificity differed by brand of test.

All Rapid NAATs

If you got a rapid NAAT but don't know what brand of test it was, you could use these numbers, which are from the initial August 2020 revision of the Cochrane metastudy. This version analyzed data from 11 studies on rapid NAATs, and didn't break up the data into subgroups by brand. They calculated the average sensitivity and specificity of these tests to be:

• Sensitivity: 95.2% (95% CI 86.7% to 98.3%)
• Specificity: 98.9% (95% CI 97.3% to 99.5%)
• Estimated Bayes factor for a negative result: about 0.05x ( (1−0.952)/0.989 ≈ 0.05 )
• Estimated Bayes factor for a positive result: about 87x ( 0.952/(1−0.989) ≈ 87 )

So if you got a negative test result, you can lower your estimated odds of having COVID to 0.05 times what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 87x what they were before.
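The arithmetic behind these Bayes factors can be sketched in a few lines of Python (a minimal sketch; the function name is mine, not from the post):

```python
def bayes_factors(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (Bayes factor for a negative result, Bayes factor for a positive result)."""
    bf_negative = (1 - sensitivity) / specificity   # P(neg | COVID) / P(neg | no COVID)
    bf_positive = sensitivity / (1 - specificity)   # P(pos | COVID) / P(pos | no COVID)
    return bf_negative, bf_positive

# Rapid NAATs, all brands: sensitivity 95.2%, specificity 98.9%
bf_neg, bf_pos = bayes_factors(0.952, 0.989)
print(round(bf_neg, 2), round(bf_pos))  # 0.05 87
```

Multiplying your prior odds by the returned factor gives your posterior odds.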

Cepheid Xpert Xpress Molecular Test

This is an RT-PCR test [5]. The March 2021 revision of the Cochrane metastudy included a separate analysis for this brand of test.

• Sensitivity: 100% (95% CI 88.1% to 100%)
• Specificity: 97.2% (95% CI 89.4% to 99.3%)
• Estimated Bayes factor for a negative result: very, very low?
If we use the Cochrane study's figures for sensitivity and specificity, we get: false negative rate / true negative rate (specificity) = (1−1.00)/0.972 = 0

If the sensitivity is actually 100%, then we get a Bayes factor of 0, which is weird and unhelpful — your odds of having COVID shouldn't go to literally 0. I would interpret this as extremely strong evidence that you don't have COVID, though. I'd love to hear from people with a stronger statistics background than me if there's a better way to interpret this.
• Estimated Bayes factor for a positive result: about 36x ( 1.00/(1−0.972) ≈ 36 )

So if you got a positive test result, your estimated odds of having COVID are increased by a factor of 36.
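One conservative way to avoid the degenerate factor of 0 (my own workaround, not anything from the Cochrane study) is to plug in the lower bound of the sensitivity confidence interval instead of the 100% point estimate:

```python
# Conservative estimate: use the lower 95% CI bound for sensitivity (88.1%)
# instead of the 100% point estimate, so the Bayes factor isn't exactly 0.
sensitivity_lower = 0.881
specificity = 0.972

bf_negative = (1 - sensitivity_lower) / specificity
print(round(bf_negative, 2))  # 0.12
```

This still represents a strong update toward not having COVID, without the implausible claim that your odds drop to literally zero.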

Abbott ID Now Molecular Test

This is an isothermal amplification test [5]. The March 2021 revision of the Cochrane metastudy included a separate analysis for this brand of test.

• Sensitivity: 73.0% (95% CI 66.8% to 78.4%)
• Specificity: 99.7% (95% CI 98.7% to 99.9%)
• Estimated Bayes factor for a negative result: about 0.3x ( (1−0.730)/0.997 ≈ 0.27 )
• Estimated Bayes factor for a positive result: about 243x ( 0.730/(1−0.997) ≈ 243 )

So if you got a negative test result, you can lower your estimated odds of having COVID to 0.3 times what they were before. If you got a positive result, you should increase your estimated odds that you have COVID to 243x what they were before.

I was surprised to see how different the accuracies of Abbott ID Now and Cepheid Xpert Xpress tests were; I'd previously been thinking of all nucleic acid tests as similarly accurate, but the Cochrane metastudy suggests that the Abbott ID Now test is not meaningfully more predictive than a rapid antigen test. This is surprising enough that I should probably look into the source data more, but I haven't gotten a chance to do that yet. For now, I'm going to start asking what brand of test I'm getting whenever I get a nucleic acid test.

Summary of all tests

| Test | Bayes factor for negative result | Bayes factor for positive result |
| --- | --- | --- |
| Rapid antigen test, no symptoms | 0.4x | 145x |
| Rapid antigen test, symptoms developed <1 week ago | 0.2x | 196x |
| Rapid antigen test, symptoms developed >1 week ago | 0.5x | 128x |
| Rapid NAAT, all brands | 0.05x | 87x |
| Rapid NAAT: Cepheid Xpert Xpress | probably very low, see calculation | 36x |
| Rapid NAAT: Abbott ID Now | 0.3x | 243x |

Caveats about infectiousness

From what I've read, while NAATs are highly specific to COVID viral RNA, they don't differentiate as well between infectious and non-infectious people. (Non-infectious people might have the virus, but at low levels, or in inactive fragments that have already been neutralized by the immune system) [6] [7]. I haven't yet found sensitivity and specificity numbers for NAATs in detecting infectiousness as opposed to illness, but you should assume that the Bayes factor for infectiousness given a positive NAAT result is lower than the ones for illness listed above.

Relatedly, the sensitivity of rapid antigen tests is typically measured against RT-PCR as the "source of truth". If RT-PCR isn't very specific to infectious illness, then this would result in underreporting the sensitivity of rapid antigen tests in detecting infectiousness. So I'd guess that if your rapid antigen test returns negative, you can be somewhat more confident that you aren't infectious than the Bayes factors listed above would imply.

What if I take multiple tests?

A neat thing about Bayes factors is that you can multiply them together! In theory, if you tested negative twice, with a Bayes factor of 0.1 each time, you can multiply your initial odds of having the disease by (0.1)² = 0.01.

I say "in theory" because this is only true if the test results are independent, and I'm not sure that assumption holds for COVID tests (or medical tests in general). If you get a false negative because you have a low viral load, or because you have an unusual genetic variant of COVID that's less likely to be amplified by PCR*, presumably that will cause correlated failures across multiple tests. My guess is that each additional test gives you a less significant update than the first one.

*This scenario is just speculation, I'm not actually sure what the main causes of false negatives are for PCR tests.
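Under the (questionable) independence assumption above, combining tests is just multiplying your odds by each test's Bayes factor, sketched here:

```python
# Naive combination of repeated negative tests, assuming full independence.
# As discussed above, correlated failure modes (e.g. low viral load) mean
# the real update from each additional test is probably smaller than this.
prior_odds = 0.01    # example prior odds of having COVID (1 : 100)
bf_per_test = 0.1    # Bayes factor for one negative result
n_tests = 2

posterior_odds = prior_odds * bf_per_test ** n_tests
print(round(posterior_odds, 6))  # 0.0001
```

Treat the result as an optimistic lower bound rather than a precise posterior.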

Use with microCOVID

If you use microCOVID.org to track your risk, then you can use your test results to adjust your number of microCOVIDs. For not-too-high numbers of microCOVIDs, the computation is easy: just multiply your initial microCOVIDs by the Bayes factor for your test. For example, if you started with 1,000 microCOVIDs, and you tested negative on a rapid NAAT with a Bayes factor of 0.05, then after the test you have 1000⋅0.05=50 microCOVIDs.

The above is an approximation. The precise calculation involves converting your microCOVIDs to odds first:

1. Express your microCOVIDs as odds:
1,000 microCOVIDs → probability of 1,000 / 1,000,000 → odds of 1,000 : 999,000
2. Multiply the odds by the Bayes factor of the test you took. For example, if you tested negative on a rapid nucleic acid test (Bayes factor of 0.05):
1,000 / 999,000 * 0.05 = 50 / 999,000
3. Convert the resulting odds back into microCOVIDs:
odds of 50 : 999,000 → probability of 50 / 999,050 ≈ 0.00005 ≈ 50 microCOVIDs

But for lower numbers of microCOVIDs (less than about 100,000) the approximation yields almost the same result (as shown in the example above, where we got "about 50 microCOVIDs" either way).
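The exact three-step conversion above can be written as a short function (a sketch following the post's steps; the function name is my own):

```python
def update_microcovids(microcovids: float, bayes_factor: float) -> float:
    """Convert microCOVIDs to odds, apply the Bayes factor, convert back."""
    p = microcovids / 1_000_000        # microCOVIDs -> probability
    odds = p / (1 - p)                 # probability -> odds
    new_odds = odds * bayes_factor     # Bayesian update
    new_p = new_odds / (1 + new_odds)  # odds -> probability
    return new_p * 1_000_000           # probability -> microCOVIDs

# 1,000 microCOVIDs, negative rapid NAAT (Bayes factor 0.05):
print(round(update_microcovids(1_000, 0.05), 1))  # 50.0
```

For small microCOVID counts this agrees with the simple multiplication to within a rounding error, as the worked example shows.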

Acknowledgements

Thank you to swimmer963, gwillen, flowerfeatherfocus, and landfish for reviewing this post and providing feedback.

References


Feature Suggestion: one-way anonymity

October 17, 2021 - 20:54
Published on October 17, 2021 5:54 PM GMT

Sometimes there's something I want to write, where I would be perfectly happy for whoever happens to read it on Less Wrong and is interested in who the author is to check and see it's me, but I don't want the post to appear in search results for my name.

I also don't want anyone who opens my profile on Less Wrong to see the post.

There could be various reasons for this - perhaps I'm about to claim that tabs are better than spaces and I don't want potential employers to see such heresy if they search for me. Perhaps I'm asking for suggestions for my wife's birthday present and don't want her to find out what it is.

My suggestion is to allow posting semi-anonymously. The name that would appear on such posts is "anonymous", but the name would link to the author's user profile. The post wouldn't appear on the user's profile, but any votes would change the user's karma as usual.

Perhaps it should only be possible for a user with sufficient karma to click through to the user profile, so that a bad actor can't scrape all semi-anonymous posts and publish the authors' names in a searchable format; or at least, if they do, it should be straightforward to detect and block that user.


"Redundant" AI Alignment

October 17, 2021 - 14:31
Published on October 16, 2021 9:32 PM GMT

This is a post I wrote on my personal blog after a discussion with a deep learning professor at the University of Chicago. I don't know if this particular topic has been studied in much depth elsewhere, so I figured I would share it here. If you know of any related work (or have any other comments on this, of course), let me know.


Explaining Capitalism Harder

October 17, 2021 - 05:40
Published on October 17, 2021 2:40 AM GMT

A friend recently shared a sharing of a screenshot of a reblogging of a reblogging of this tumblr post:

Pro-Capitalist's defense of capitalism is just explaining how it works, and then when you say "yes I know, I just think it shouldn't be like that" they explain it to you again but angrier this time
strawberry-crocodile

I really like this perspective, even as someone relatively pro-capitalism, because I think it captures something that often goes wrong in these discussions.

The strongest argument in favor of capitalism is that in practice it works for most things, better than the other systems we've tried. Not because it was designed to work, but because that's just how it falls together. When someone points at a piece of the system that seems unfair or wasteful and says "I just think it shouldn't be like that," removing that piece is going to have effects elsewhere in the system, often negative ones. And so pro-capitalism folks often respond by trying to explain capitalism harder: what role is the thing you want to change filling? When people propose removing something without engaging with how it ties into the rest of the system, it is natural to assume they don't know about its function and try to explain.

As in the opening quote, however, people don't want more explanation of the workings of the status quo. Instead, I think a better response is to think about what you expect would go wrong, and ask if they would expect that. Perhaps they don't, and you can try and figure out where specifically your expectations diverge. Perhaps they do, and they think it's worth it. Perhaps they have additional proposals which work together. Whichever way the conversation goes, I think it probably is more productive?

(Overall my perspective is that while things are much worse than they could be, they're also much better than they have ever been. I really don't want us to break the system that keeps improving our ability to turn time and stuff into what people need. At the same time, to the extent that we can do it without breaking this cycle of improvement, I'd like to see far more redistribution of wealth. In my own life this looks like giving.)


My experience at and around MIRI and CFAR (inspired by Zoe Curzi's writeup of experiences at Leverage)

October 17, 2021 - 00:28
Published on October 16, 2021 9:28 PM GMT

I appreciate Zoe Curzi's revelations of her experience with Leverage.  I know how hard it is to speak up when no or few others do, and when people are trying to keep things under wraps.

I haven't posted much publicly about my experiences working as a researcher at MIRI (2015-2017) or around CFAR events, to a large degree because I've been afraid.  Now that Zoe has posted about her experience, I find it easier to do so, especially after the post was generally well-received by LessWrong.

I felt moved to write this, not just because of Zoe's post, but also because of Aella's commentary:

I've found established rationalist communities to have excellent norms that prevent stuff like what happened at Leverage. The times where it gets weird is typically when you mix in a strong leader + splintered, isolated subgroup + new norms. (this is not the first time)

This seemed to me to be definitely false, upon reading it.  Most of what was considered bad about the events at Leverage Research also happened around MIRI/CFAR, around the same time period (2017-2019).

I don't want to concentrate on the question of which is "worse"; it is hard to even start thinking about that without discussing facts on the ground and general social models that would apply to both cases.  I also caution against blame in general, in situations like these, where many people (including me!) contributed to the problem, and have kept quiet for various reasons.  With good reason, it is standard for truth and reconciliation events to focus on restorative rather than retributive justice, and include the possibility of forgiveness for past crimes.

As a roadmap for the rest of the post, I'll start by describing some background, describe some trauma symptoms and mental health issues I and others have experienced, and describe the actual situations that these mental events were influenced by and "about" to a significant extent.

Background: choosing a career

After I finished my CS/AI Master's degree at Stanford, I faced a choice of what to do next.  I had a job offer at Google for machine learning research and a job offer at MIRI for AI alignment research.  I had also previously considered pursuing a PhD at Stanford or Berkeley; I'd already done undergrad research at CoCoLab, so this could have easily been a natural transition.

I'd decided against a PhD on the basis that research in industry was a better opportunity to work on important problems that impact the world; since then I've gotten more information from insiders that academia is a "trash fire" (not my quote!), so I don't regret this decision.

I was faced with a decision between Google and MIRI.  I knew that at MIRI I'd be taking a pay cut.  On the other hand, I'd be working on AI alignment, an important problem for the future of the world, probably significantly more important than whatever I'd be working on at Google.  And I'd get an opportunity to work with smart, ambitious people, who were structuring their communication protocols and life decisions around the content of the LessWrong Sequences.

These Sequences contained many ideas that I had developed or discovered independently, such as functionalist theory of mind, the idea that Solomonoff Induction was a formalization of inductive epistemology, and the idea that one-boxing in Newcomb's problem is more rational than two-boxing.  The scene attracted thoughtful people who cared about getting the right answer on abstract problems like this, making for very interesting conversations.

Research at MIRI was an extension of such interesting conversations to rigorous mathematical formalism, making it very fun (at least for a time).  Some of the best research I've done was at MIRI (reflective oracles, logical induction, others).  I met many of my current friends through LessWrong, MIRI, and the broader LessWrong Berkeley community.

When I began at MIRI (in 2015), there were ambient concerns that it was a "cult"; this was a set of people with a non-mainstream ideology that claimed that the future of the world depended on a small set of people that included many of them.  These concerns didn't seem especially important to me at the time.  So what if the ideology is non-mainstream as long as it's reasonable?  And if the most reasonable set of ideas implies high impact from a rare form of research, so be it; that's been the case at times in history.

(Most of the rest of this post will be negative-valenced, like Zoe's post; I wanted to put some things I liked about MIRI and the Berkeley community up-front.  I will be noting parts of Zoe's post and comparing them to my own experience, which I hope helps to illuminate common patterns; it really helps to have an existing different account to prompt my memory of what happened.)

Trauma symptoms and other mental health problems

Back to Zoe's post.  I want to disagree with a frame that says that the main thing that's bad was that Leverage (or MIRI/CFAR) was a "cult".  This makes it seem like what happened at Leverage is much worse than what could happen at a normal company.  But, having read Moral Mazes and talked to people with normal corporate experience (especially in management), I find that "normal" corporations are often quite harmful to the psychological health of their employees, e.g. causing them to have complex PTSD symptoms, to see the world in zero-sum terms more often, and to have more preferences for things to be incoherent.  Normal startups are commonly called "cults", with good reason.  Overall, there are both benefits and harms of high-demand ideological communities ("cults") compared to more normal occupations and social groups, and the specifics matter more than the general class of something being "normal" or a "cult", although the general class affects the structure of the specifics.

Zoe begins by listing a number of trauma symptoms she experienced.  I have, personally, experienced most of those on the list of cult after-effects in 2017, even before I had a psychotic break.

The psychotic break was in October 2017; although people around me to some degree tried to help me, this "treatment" mostly made the problem worse, so I was placed in 1-2 weeks of intensive psychiatric hospitalization, followed by 2 weeks in a halfway house.  This was followed by severe depression lasting months, and less severe depression from then on, which I still haven't fully recovered from.  I had PTSD symptoms after the event and am still recovering.

During this time, I was intensely scrupulous; I believed that I was intrinsically evil, had destroyed significant parts of the world with my demonic powers, and was in a hell of my own creation.  I was catatonic for multiple days, afraid that by moving I would cause harm to those around me.  This is in line with scrupulosity-related post-cult symptoms.

Talking about this is to some degree difficult because it's normal to think of this as "really bad".  Although it was exceptionally emotionally painful and confusing, the experience taught me a lot, very rapidly; I gained and partially stabilized a new perspective on society and my relation to it, and to my own mind.  I have much more ability to relate to normal people now, who are, for the most part, also traumatized.

(Yes, I realize how strange it is that I was more able to relate to normal people by occupying an extremely weird mental state where I thought I was destroying the world and was ashamed and suicidal regarding this; such is the state of normal Americans, apparently, in a time when suicidal music is extremely popular among youth.)

Like Zoe, I have experienced enormous post-traumatic growth.  To quote a song, "I am Woman": "Yes, I'm wise, but it's wisdom born of pain.  I guess I've paid the price, but look how much I've gained."

While most people around MIRI and CFAR didn't have psychotic breaks, there were at least 3 other cases of psychiatric institutionalizations by people in the social circle immediate to MIRI/CFAR; at least one other than me had worked at MIRI for a significant time, and at least one had done work with MIRI on a shorter-term basis.  There was, in addition, a case of someone becoming very paranoid, attacking a mental health worker, and hijacking her car, leading to jail time; this person was not an employee of either organization, but had attended multiple CFAR events including a relatively exclusive AI-focused one.

I heard that the paranoid person in question was concerned about a demon inside him, implanted by another person, trying to escape.  (I knew the other person in question, and their own account was consistent with attempting to implant mental subprocesses in others, although I don't believe they intended anything like this particular effect).  My own actions while psychotic later that year were, though physically nonviolent, highly morally confused; I felt that I was acting very badly and "steering in the wrong direction", e.g. in controlling the minds of people around me or subtly threatening them, and was seeing signs that I was harming people around me, although none of this was legible enough to seem objectively likely after the fact.  I was also extremely paranoid about the social environment, being unable to sleep normally due to fear.

There are even cases of suicide in the Berkeley rationality community associated with scrupulosity and mental self-improvement (specifically, Maia Pasek/SquirrelInHell, and Jay Winterford/Fluttershy, both of whom were long-time LessWrong posters; Jay wrote an essay about suicidality, evil, domination, and Roko's basilisk months before the suicide itself).  Both these cases are associated with a subgroup splitting off of the CFAR-centric rationality community due to its perceived corruption, centered around Ziz.  (I also thought CFAR was pretty corrupt at the time, and I also attempted to split off another group when attempts at communication with CFAR failed; I don't think this judgment was in error, though many of the following actions were; the splinter group seems to have selected for high scrupulosity and not attenuated its mental impact.)

The cases discussed are not always of MIRI/CFAR employees, so they're hard to attribute to the organizations themselves, even if they were clearly in the same or a nearby social circle.  Leverage was an especially legible organization, with a relatively clear interior/exterior distinction, while CFAR was less legible, having a set of events that different people were invited to, and many conversations including people not part of the organization.  Hence, it is easier to attribute organizational responsibility at Leverage than around MIRI/CFAR.  (This diffusion of responsibility, of course, doesn't help when there are actual crises, mental health or otherwise.)

Obviously, for every case of poor mental health that "blows up" and is noted, there are many cases that aren't.  Many people around MIRI/CFAR and Leverage, like Zoe, have trauma symptoms (including "cult after-effect symptoms") that aren't known about publicly until the person speaks up.

Why do so few speak publicly, and after so long?

Zoe discusses why she hadn't gone public until now.  She first cites fear of response:

Leverage was very good at convincing me that I was wrong, my feelings didn't matter, and that the world was something other than what I thought it was. After leaving, it took me years to reclaim that self-trust.

Clearly, not all cases of people trying to convince each other that they're wrong are abusive; there's an extra dimension of institutional gaslighting, people telling you something you have no reason to expect they actually believe, people being defensive and blocking information, giving implausible counter-arguments, trying to make you doubt your account and agree with their bottom line.

Jennifer Freyd writes about "betrayal blindness", a common problem where people hide from themselves evidence that their institutions have betrayed them.  I experienced this around MIRI/CFAR.

Some background on AI timelines: At the Asilomar Beneficial AI conference, in early 2017 (after AlphaGo was demonstrated in late 2016), I remember another attendee commenting on a "short timelines bug" going around.  Apparently a prominent researcher was going around convincing people that human-level AGI was coming in 5-15 years.

This trend in belief included MIRI/CFAR leadership; one person commented that he noticed his timelines trending only towards getting shorter, and decided to update all at once.  I've written about AI timelines in relation to political motivations before (long after I actually left MIRI).

Perhaps more important to my subsequent decisions, the AI timelines shortening triggered an acceleration of social dynamics.  MIRI became very secretive about research.  Many researchers were working on secret projects, and I learned almost nothing about these.  I and other researchers were told not to even ask each other about what others of us were working on, on the basis that if someone were working on a secret project, they may have to reveal this fact.  Instead, we were supposed to discuss our projects with an executive, who could connect people working on similar projects.

I had disagreements with the party line, such as on when human-level AGI was likely to be developed and about security policies around AI, and there was quite a lot of effort to convince me of their position, that AGI was likely coming soon and that I was endangering the world by talking openly about AI in the abstract (not even about specific new AI algorithms). Someone in the community told me that for me to think AGI probably won't be developed soon, I must think I'm better at meta-rationality than Eliezer Yudkowsky, a massive claim of my own specialness.  I experienced a high degree of scrupulosity about writing anything even somewhat critical of the community and institutions (e.g. this post).  I saw evidence of bad faith around me, but it was hard to reject the frame for many months; I continued to worry about whether I was destroying everything by going down certain mental paths and not giving the party line the benefit of the doubt, despite its increasing absurdity.

Like Zoe, I was definitely worried about fear of response.  I had paranoid fantasies about a MIRI executive assassinating me.  The decision theory research I had done came to life, as I thought about the game theory of submitting to a threat of a gun, in relation to how different decision theories respond to extortion.

This imagination, though extreme (and definitely reflective of a cognitive error), was to some degree reinforced by the social environment.  I mentioned the possibility of whistle-blowing on MIRI to someone I knew, who responded that I should consider talking with Chelsea Manning, a whistleblower who is under high threat.  There was quite a lot of paranoia at the time, both among the "establishment" (who feared being excluded or blamed) and "dissidents" (who feared retaliation by institutional actors).  (I would, if asked to take bets, have bet strongly against actual assassination, but I did fear other responses.)

Zoe further talks about how the experience was incredibly confusing and people usually only talk about the past events secretively.  This matches my experience.

Like Zoe, I care about the people I interacted with during the time of the events (who are, for the most part, colleagues who I learned from), and I don't intend to cause harm to them through writing about these events.

Zoe discusses an unofficial NDA people signed as they left, agreeing not to talk badly of the organization.  While I wasn't pressured to sign an NDA, there were significant security policies discussed at the time (including the one about researchers not asking each other about research).  I was discouraged from writing a blog post estimating when AI would be developed, on the basis that a real conversation about this topic among rationalists would cause AI to come sooner, which would be more dangerous (the blog post in question would have been similar to the AI forecasting work I did later, here and here; judge for yourself how dangerous this is).  This made it hard to talk about the silencing dynamic; if you don't have the freedom to speak about the institution and limits of freedom of speech, then you don't have freedom of speech.

(Is it a surprise that, after over a year in an environment where I was encouraged to think seriously about the possibility that simple actions such as writing blog posts about AI forecasting could destroy the world, I would develop the belief that I could destroy everything through subtle mental movements that manipulate people?)

Years before, MIRI had a non-disclosure agreement that members were pressured to sign, as part of a legal dispute with Louie Helm.

I was certainly socially discouraged from revealing things that would harm the "brand" of MIRI and CFAR, by executive people.  There was some discussion at the time of the possibility of corruption in EA/rationality institutions (e.g. Ben Hoffman's posts criticizing effective altruism, GiveWell, and the Open Philanthropy Project); a lot of this didn't end up on the Internet due to PR concerns.

This seemed culty to me and some friends; it's especially evocative in relation to Julian Jaynes' writing about bronze age cults, which detail a psychological model in which idols/gods give people voices in their head telling them what to do.

(As I describe these events in retrospect they seem rather ridiculous, but at the time I was seriously confused about whether I was especially crazy or in-the-wrong, and the leadership was behaving sensibly.  If I were the type of person to trust my own judgment in the face of organizational mind control, I probably wouldn't have been hired in the first place; everything I knew about how to be hired would point towards having little mental resistance to organizational narratives.)

Strange psycho-social-metaphysical hypotheses in a group setting

Zoe gives a list of points showing how "out of control" the situation at Leverage got.  This is consistent with what I've heard from other ex-Leverage people.

The weirdest part of the events recounted is the concern about possibly-demonic mental subprocesses being implanted by other people. As a brief model of something similar to this (not necessarily the same model as the Leverage people were using): people often pick up behaviors ("know-how") and mental models from other people, through acculturation and imitation. Some of this influence could be (a) largely unconscious on the part of the receiver, (b) partially intentional on the part of the person having mental effects on others (where these intentions may include behaviorist conditioning, similar to hypnosis, causing behaviors to be triggered under certain circumstances), and (c) overall harmful to the receiver's conscious goals. According to IFS-like psychological models, it's common for a single brain to contain multiple sub-processes with different intentions. While the mental subprocess implantation hypothesis is somewhat strange, it's hard to rule out based on physics or psychology.

As weird as the situation got, with people being afraid of demonic subprocesses being implanted by other people, there were also psychotic breaks involving demonic subprocess narratives around MIRI and CFAR. These strange experiences are, as far as I can tell, part of a more general social phenomenon around that time period; I recall a tweet commenting that the election of Donald Trump convinced everyone that magic was real.

Unless there were psychiatric institutionalizations or jail time resulting from the Leverage psychosis, I infer that Leverage overall handled their metaphysical weirdness better than the MIRI/CFAR adjacent community.  While in Leverage the possibility of subtle psychological influence between people was discussed relatively openly, around MIRI/CFAR it was discussed covertly, with people being told they were crazy for believing it might be possible.  (I noted at the time that there might be a sense in which different people have "auras" in a way that is not less inherently rigorous than the way in which different people have "charisma", and I feared this type of comment would cause people to say I was crazy.)

As a consequence, the people most mentally concerned with strange social metaphysics were marginalized, and had more severe psychoses with less community support, hence requiring normal psychiatric hospitalization.

The case Zoe recounts of someone "having a psychotic break" sounds tame relative to what I'm familiar with.  Someone can mentally explore strange metaphysics, e.g. a different relation to time or God, in a supportive social environment where people can offer them informational and material assistance, and help reality-check their ideas.

Alternatively, like me, they can explore these metaphysics while:

• losing days of sleep
• becoming increasingly paranoid and anxious
• feeling delegitimized and gaslit by those around them, unable to communicate their actual thoughts with those around them
• fearing involuntary psychiatric institutionalization
• experiencing involuntary psychiatric institutionalization
• having almost no real mind-to-mind communication during "treatment"
• learning primarily to comply and to play along with the incoherent, shifting social scene (there were mandatory improv classes)
• being afraid of others in the institution, including being afraid of sexual assault, which is common in psychiatric hospitals
• believing the social context to be a "cover up" of things including criminal activity and learning to comply with it, on the basis that one would be unlikely to exit the institution within a reasonable time without doing so

Being able to discuss somewhat wacky experiential hypotheses, like the possibility of people spreading mental subprocesses to each other, in a group setting, and have the concern actually taken seriously as something that could seem true from some perspective (and which is hard to definitively rule out), seems much more conducive to people's mental well-being than refusing to have that discussion, so they struggle with (what they think is) mental subprocess implantation on their own.  Leverage definitely had large problems with these discussions, and perhaps tried to reach more intersubjective agreement about them than was plausible (leading to over-reification, as Zoe points out), but they seem less severe than the problems resulting from refusing to have them, such as psychiatric hospitalization and jail time.

"Psychosis" doesn't have to be a bad thing, even if it usually is in our society; it can be an exploration of perceptions and possibilities not before imagined, in a supportive environment that helps the subject to navigate reality in a new way; some of R.D. Laing's work is relevant here, describing psychotic mental states as a result of ontological insecurity following from an internal division of the self at a previous time. Despite the witch hunts and so on, the Leverage environment seems more supportive than what I had access to. The people at Leverage I talk to, who have had some of these unusual experiences, often have a highly exploratory attitude to the subtle mental realm, having gained access to a new cognitive domain through the experience, even if it was traumatizing.

World-saving plans and rarity narratives

Zoe cites the fact that Leverage has a "world-saving plan" (which included taking over the world) and considered Geoff Anders and Leverage to be extremely special, e.g. Geoff being possibly a better philosopher than Kant.

Like Leverage, MIRI had a "world-saving plan".  This is no secret; it's discussed in an Arbital article written by Eliezer Yudkowsky.  Nate Soares frequently talked about how it was necessary to have a "plan" to make the entire future ok, to avert AI risk; this plan would need to "backchain" from a state of no AI risk and may, for example, say that we must create a human emulation using nanotechnology that is designed by a "genie" AI, which does a narrow task rather than taking responsibility for the entire future; this would allow the entire world to be taken over by a small group including the emulated human.

I remember taking on more and more mental "responsibility" over time, noting the ways in which people other than me weren't sufficient to solve the AI alignment problem, and I had special skills, so it was uniquely my job to solve the problem.  This ultimately broke down, and I found Ben Hoffman's post on responsibility to resonate (which discusses the issue of control-seeking).

The decision theory of backchaining and taking over the world is somewhat beyond the scope of this post.  There are circumstances where backchaining is appropriate, and "taking over the world" might be necessary, e.g. if there are existing actors already trying to take over the world and none of them would implement a satisfactory regime.  However, there are obvious problems with multiple actors each attempting to control everything, which are discussed in Ben Hoffman's post.

This connects with what Zoe calls "rarity narratives".  There were definitely rarity narratives around MIRI/CFAR.  Our task was to create an integrated, formal theory of values, decisions, epistemology, self-improvement, etc ("Friendliness theory"), which would help us develop Friendly AI faster than the rest of the world combined was developing AGI (which was, according to leaders, probably in less than 20 years).  It was said that a large part of our advantage in doing this research so fast was that we were "actually trying" and others weren't.  It was stated by multiple people that we wouldn't really have had a chance to save the world without Eliezer Yudkowsky.

Though I don't remember people saying explicitly that Eliezer Yudkowsky was a better philosopher than Kant, I would guess many would have said so.  No one there, as far as I know, considered Kant worth learning from enough to actually read the Critique of Pure Reason in the course of their research; I only did so years later, and I'm relatively philosophically inclined.  I would guess that MIRI people would consider a different set of philosophers relevant, e.g. would include Turing and Einstein as relevant "philosophers", and I don't have reason to believe they would consider Eliezer more relevant than these, though I'm not certain either way.  (I think Eliezer is a world-historically-significant philosopher, though not as significant as Kant or Turing or Einstein.)

I don't think it's helpful to oppose "rarity narratives" in general.  People need to try to do hard things sometimes, and actually accomplishing those things would make the people in question special, and that isn't a good argument against trying the thing at all.  Intellectual groups with high information integrity, e.g. early quantum mechanics people, can have a large effect on history.  I currently think the intellectual work I do is pretty rare and important, so I have a "rarity narrative" about myself, even though I don't usually promote it.  Of course, a project claiming specialness while displaying low information integrity is, effectively, asking for more control and resources than it can beneficially use.

Rarity narratives can make a group more insular, concentrating relevance around itself rather than learning from other sources (in the past or the present), centering local social dynamics on a small number of special people, and increasing pressure on people to try to do (or pretend to try to do) things beyond their actual abilities; Zoe and I both experienced these effects.

(As a hint to evaluating rarity narratives yourself: compare Great Thinker's public output to what you've learned from other public sources; follow citations and see where Great Thinker might be getting their ideas from; read canonical great philosophy and literature; get a quantitative sense of how much insight is coming from which places throughout spacetime.)

The object-level specifics of each case of world-saving plan matter, of course; I think most readers of this post will be more familiar with MIRI's world-saving plan, especially since Zoe's post provides few object-level details about the content of Leverage's plan.

Debugging

Rarity ties into debugging; if what makes us different is that we're Actually Trying and the other AI research organizations aren't, then we're making a special psychological claim about ourselves, that we can detect the difference between actually and not-actually trying, and cause our minds to actually try more of the time.

Zoe asks whether debugging was "required"; she notes:

The explicit strategy for world-saving depended upon a team of highly moldable young people self-transforming into Elon Musks.

I, in fact, asked a CFAR instructor in 2016-17 whether the idea was to psychologically improve yourself until you became Elon Musk, and he said "yes".  This part of the plan was the same.

Self-improvement was a major focus around MIRI and CFAR, and at other EA orgs.  It often used standard CFAR techniques, which were taught at workshops.  It was considered important to psychologically self-improve to the point of being able to solve extremely hard, future-lightcone-determining problems.

I don't think these are bad techniques, for the most part.  I think I learned a lot by observing and experimenting on my own mental processes.  (Zoe isn't saying Leverage's techniques are bad either, just that you could get most of them from elsewhere.)

Zoe notes a hierarchical structure where people debugged people they had power over:

Trainers were often doing vulnerable, deep psychological work with people with whom they also lived, made funding decisions about, or relied on for friendship. Sometimes people debugged each other symmetrically, but mostly there was a hierarchical, asymmetric structure of vulnerability; underlings debugged those lower than them on the totem pole, never their superiors, and superiors did debugging with other superiors.

This was also the case around MIRI and CFAR.  A lot of debugging was done by Anna Salamon, head of CFAR at the time; Ben Hoffman noted that "every conversation with Anna turns into an Anna-debugging-you conversation", which resonated with me and others.

There was certainly a power dynamic of "who can debug who"; to be a more advanced psychologist is to be offering therapy to others, being able to point out when they're being "defensive", when one wouldn't accept the same from them.  This power dynamic is also present in normal therapy, although the profession has norms such as only getting therapy from strangers, which change the situation.

How beneficial or harmful this was depends on the details.  I heard that "political" discussions at CFAR (e.g. determining how to resolve conflicts between people at the organization, which could result in people leaving the organization) were mixed with "debugging" conversations, in a way that would make it hard for people to focus primarily on the debugged person's mental progress without imposing pre-determined conclusions.  Unfortunately, when there are few people with high psychological aptitude around, it's hard to avoid "debugging" conversations having political power dynamics, although it's likely that the problem could have been mitigated.

It was really common for people in the social space, including me, to have a theory about how other people are broken, and how to fix them, by getting them to understand a deep principle you do and they don't.  I still think most people are broken and don't understand deep principles that I or some others do, so I don't think this was wrong, although I would now approach these conversations differently.

A lot of the language from Zoe's post, e.g. "help them become a master", resonates.  There was an atmosphere of psycho-spiritual development, often involving Kegan stages.  There is a significant degree of overlap between people who worked with or at CFAR and people at the Monastic Academy.

Although I wasn't directly financially encouraged to debug people, I infer that CFAR employees were, since instructing people was part of their job description.

Other issues

MIRI did have less time pressure imposed by the organization itself than Leverage did, despite the deadline implied by the AGI timeline; I had no issues with absurdly over-booked calendars.  I vaguely recall that CFAR employees were overworked especially around workshop times, though I'm pretty uncertain of the details.

Many people's social lives, including mine, were spent mostly "in the community"; much of this time was spent on "debugging" and other psychological work.  Some of my most important friendships at the time, including one with a housemate, were formed largely around a shared interest in psychological self-improvement.  There was, therefore, relatively little work-life separation (which has upsides as well as downsides).

Zoe recounts an experience with having unclear, shifting standards applied, with the fear of ostracism.  Though the details of my experience are quite different, I was definitely afraid of being considered "crazy" and marginalized for having philosophy ideas that were too weird, even though weird philosophy would be necessary to solve the AI alignment problem.  I noticed more people saying I and others were crazy as we were exploring sociological hypotheses that implied large problems with the social landscape we were in (e.g. people thought Ben Hoffman was crazy because of his criticisms of effective altruism). I recall talking to a former CFAR employee who was scapegoated and ousted after failing to appeal to the winning internal coalition; he was obviously quite paranoid and distrustful, and another friend and I agreed that he showed PTSD symptoms.

Like Zoe, I experienced myself and others being distanced from old family and friends, who didn't understand how high-impact the work we were doing was.  Since leaving the scene, I am more able to talk with normal people (including random strangers), although it's still hard to talk about why I expect the work I do to be high-impact.

An ex-Leverage person I know comments that "one of the things I give Geoff the most credit for is actually ending the group when he realized he had gotten in over his head. That still left people hurt and shocked, but did actually stop a lot of the compounding harm."  This is to some degree happening with MIRI and CFAR, with a change in the narrative about the organizations and their plans, although the details are currently less legible than with Leverage.

Conclusion

Perhaps one lesson to take from Zoe's account of Leverage is that spending relatively more time discussing sociology (including anthropology and history), and less time discussing psychology, is more likely to realize benefits while avoiding problems.  Sociology is less inherently subjective and meta than psychology, having intersubjectively measurable properties such as events in human lifetimes and social network graph structures.  My own thinking has certainly gone in this direction since my time at MIRI, to great benefit.  I hope this account I have written helps others to understand the sociology of the rationality community around 2017, and that this understanding helps people to understand other parts of the society they live in.

There are, obviously from what I have written, many correspondences, showing a common pattern for high-ambition ideological groups in the San Francisco Bay Area.  I know there are serious problems at other EA organizations, which produce largely fake research (and probably took in people who wanted to do real research, who became convinced by their experience to do fake research instead), although I don't know the specifics as well.  EAs generally think that the vast majority of charities are doing low-value and/or fake work.  I also know that San Francisco startup culture produces cult-like structures (and associated mental health symptoms) with regularity.  It seems more productive to, rather than singling out specific parties, think about the social and ecological forces that create and select for the social structures we actually see, which include relatively more and less cult-like structures.  (Of course, to the extent that harm is ongoing due to actions taken by people and organizations, it's important to be able to talk about that.)

It's possible that after reading this, you think this wasn't that bad.  Though I can only speak for myself here, I'm not sad that I went to work at MIRI instead of Google or academia after college.  I don't have reason to believe that either of these environments would have been better for my overall intellectual well-being or my career, despite the mental and social problems that resulted from the path I chose.  Scott Aaronson, for example, blogs about "blank faced" non-self-explaining authoritarian bureaucrats being a constant problem in academia.  Venkatesh Rao writes about the corporate world, and the picture presented is one of a simulation constantly maintained through improv.

I did grow from the experience in the end.  But I did so in large part by being very painfully aware of the ways in which it was bad.

I hope that those that think this is "not that bad" (perhaps due to knowing object-level specifics around MIRI/CFAR justifying these decisions) consider how they would find out whether the situation with Leverage was "not that bad", in comparison, given the similarity of the phenomena observed in both cases; such an investigation may involve learning object-level specifics about what happened at Leverage.  I hope that people don't scapegoat; in an environment where certain actions are knowingly being taken by multiple parties, singling out certain parties has negative effects on people's willingness to speak without actually producing any justice.

Aside from whether things were "bad" or "not that bad" overall, understanding the specifics of what happened, including harms to specific people, is important for actually accomplishing the ambitious goals these projects are aiming at; there is no reason to expect extreme accomplishments to result without very high levels of epistemic honesty.

Optimization Concepts in the Game of Life

16 October 2021 - 23:51
Published on October 16, 2021 8:51 PM GMT

Abstract: We define robustness and retargetability (two of Flint’s measures of optimization) in Conway’s Game of Life and apply the definitions to a few examples. The same approach likely works in most embedded settings, and provides a frame for conceptualizing and quantifying these aspects of agency. We speculate on the relationship between robustness and retargetability, and identify various directions for future work.

Motivation

We would like to better understand the fundamental principles of agency (and related phenomena including optimization and goal-directedness). We focus on agency because we believe agency is a core source of risk from AI systems, especially in worlds with one (or few) most-capable systems. The goals of the most competent consequence-driven systems are more likely to be achieved, because trying outperforms not trying or less competent trying. We do not want to create a world where such systems are working against us. By better understanding agency, we hope to improve our ability to avoid mistakenly building systems working capably against us, and to correct course if we do.

A rich source of confusions about agency comes from attending to the fact that goal-directed systems are part of – embedded in – the environment that their goals are about. Most practical work on AI avoids the confusions of embedded agency by constructing and enforcing a Cartesian boundary between agent and environment, using frameworks such as reinforcement learning (RL) that define an interaction protocol. We focus on embedded agency because we expect not to be able to enforce a Cartesian boundary for highly capable agents in general domains, and, as a particularly strong instance of this, because agents may emerge unexpectedly in systems where we did not design how they interface with the rest of the world.

Our approach to deconfusion in this post is to identify concepts that seem relevant to embedded agency but do not have technical definitions, to propose some definitions, and see how they fare on some examples. More generally, we are interested in analyzing small examples of agency-related phenomena in the hope that some examples will be simple enough to yield insight while retaining essential features of the phenomenon.

Optimization in the Game of Life

Concepts

We draw two concepts from Alex Flint’s essay The Ground of Optimization. Flint defines an optimizing system as a system that evolves towards a small set of target configurations from a broad basin of attraction, despite perturbations. The essay introduces measures for quantifying optimization systems. One is robustness: how robust to perturbations is the process of reaching the target set, e.g. the number of dimensions on which perturbations can be made or the magnitude of the perturbations. Another measure is retargetability: whether the system can be transformed into another optimizing system with a different target configuration set via a small change.

Here, we develop more precise definitions of these concepts by concentrating on a particular concrete domain: Conway’s Game of Life. This is a natural setting for studying embedded agency because it is a deterministic environment with no pre-specified Cartesian boundaries, which is rich enough to support emergent goal-directed behavior, yet simple enough to define the concepts above explicitly.

Examples

Before getting to the definitions, let’s look at how we might draw analogies between some of the examples of systems (including optimizing systems) from the Ground of Optimization post and structures in the Game of Life.

| The Ground of Optimization | Game of Life | Optimizing system? |
|---|---|---|
| Bottle cap | 4-square still life | No |
| Satellite in orbit | Glider | No |
| Ball in a valley | Eater | Yes |
| Ball in a valley with robot | Mobile eater (hypothetical) | Yes |
| Entropy / death | Empty board | Yes |

A 4-square still life is like a bottle cap in that it has been designed (or selected) to stay in place and not spontaneously disintegrate, but it does not robustly produce more specific outcomes than simply existing, and can easily be perturbed away from this state.

A glider is like a satellite in orbit: it can be redirected but does not recover its original trajectory on perturbation.

An eater is like a ball in a valley in the sense that it ends up in the same state from a variety of starting configurations. This is the state with the eater alone on the board, analogous to the state with the ball at the bottom of the valley.

We can imagine a hypothetical "mobile eater" that walks around looking for other patterns to consume. This would be more robust than the regular eater, similarly to a ball in a valley with a robot, which is more robust than just a ball in a valley.

An empty board is also an example of an optimizing system that is robust to adding non-viable collections of live cells (e.g., fewer than 3 live cells next to each other).

Preliminary Definitions

Like any embedded setting, Life does not come with a privileged Cartesian boundary. Instead we will define an operation, instantiation, that combines an agent with an environment, and thereby substantiates counterfactual questions such as “What would this agent do in a different context?” that are otherwise meaningless in a deterministic non-Cartesian world.

What kinds of things are agents and environments? We start with a very general mathematical object, a pattern, which we define as simply a state of the Game of Life world. That is, a pattern is an infinite two-dimensional Boolean grid, or equivalently a function of type ℤ×ℤ→{true, false}, indicating which cells are alive and which are dead. A pattern is finite if it has only finitely many cells alive.

We represent an agent as a finite pattern and an environment as a context (formally defined as a pattern). Thus, agents and environments have the same type signature, since they are made of the same "stuff" in an embedded setting.

To put the two together, we make use of a third concept, also formally represented by a pattern, which we call a mask, which specifies which parts of the context are the “holes” the agent is supposed to fit into (and replace whatever else was there). As mentioned above, the operation that combines agent and environment is instantiation:

c(p)(i,j) = p(i,j) if m(i,j), and c(i,j) otherwise,

where c is a context, m is a mask, and p is a pattern (the "agent").

Instantiating p in c results in the pattern that is the same as p wherever the mask is true, and the same as c everywhere else. By default we take the mask m to be the padding mask of one cell around all the agent’s live cells: pad(p)(i,j)=∃x,y∈{−1,0,1}.p(i+x,j+y).
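As an illustrative sketch (not code from the post), finite patterns, contexts, and masks can all be represented as Python sets of live-cell (or mask-true) coordinates, making instantiation a small set computation:

```python
def pad(p):
    """Default mask: p's live cells plus a one-cell border."""
    return {(i + x, j + y) for (i, j) in p
            for x in (-1, 0, 1) for y in (-1, 0, 1)}

def instantiate(c, p, m=None):
    """The pattern that agrees with p wherever the mask is true, and with c elsewhere."""
    if m is None:
        m = pad(p)
    return (set(c) - m) | (set(p) & m)
```

Since the padding mask contains all of p's live cells, `set(p) & m` is just p itself; the subtraction clears whatever the context had inside the mask's "hole".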

In any deterministic discrete dynamical system, if we have an operation like instantiation that can combine two states of the system to produce another, then we can similarly represent potential agents and their surroundings by system states. This might allow these definitions to be generalized to other settings besides the Game of Life.

We’ll use the following notation for computations and properties in discrete dynamical systems:

• Given a state p (we use p because our Life states are patterns), step(p) is one step of evolution according to the system’s dynamics.
• The sequence p,step(p),step(step(p)),…, i.e., n↦stepn(p), is the computation seeded at p (or a “trajectory” in dynamical systems terminology).
• A property is a set of states (patterns).
• A property P is achieved by a computation s if there exists some number of steps n such that s(n)∈P. A property is fixed by a computation if s(n)∈P for all n above some bound.
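Under the same set-of-live-cells representation (again our illustrative choice), the dynamics and the "achieved" check can be sketched, with the caveat that achievement is only tested up to a finite step bound while the definition quantifies over all n:

```python
def neighbors(cell):
    """The 8 cells surrounding a given cell."""
    i, j = cell
    return {(i + x, j + y) for x in (-1, 0, 1) for y in (-1, 0, 1)} - {cell}

def step(p):
    """One step of Life: a live cell survives with 2 or 3 live neighbors;
    a dead cell becomes alive with exactly 3 live neighbors."""
    candidates = p | {n for c in p for n in neighbors(c)}
    return {c for c in candidates
            if len(neighbors(c) & p) == 3
            or (c in p and len(neighbors(c) & p) == 2)}

def achieves(p, prop, max_steps=100):
    """Bounded check that the computation seeded at p reaches prop
    (prop is a predicate standing in for a set of states)."""
    s = set(p)
    for _ in range(max_steps + 1):
        if prop(s):
            return True
        s = step(s)
    return False

# Example: a vertical blinker achieves its horizontal phase in one step.
vertical = {(0, -1), (0, 0), (0, 1)}
horizontal = {(-1, 0), (0, 0), (1, 0)}
assert step(vertical) == horizontal
```

Representing a property as a predicate on states avoids having to enumerate the (generally infinite) set of patterns it contains.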
Robustness

Defining robustness

We have defined patterns very generally. Which patterns are optimizing systems? As Flint noted, an optimizing system has a measure of robustness to perturbations. We can characterize this formally by considering the optimization target as a set of states P (target configurations), and the set C of possible contexts in which a pattern p might be placed.

Definition. A pattern p is robust for P within C iff for all c∈C, the computation seeded at c(p) achieves P.

In this way, the variation within C represents perturbations the system faces, and can recover from, when optimizing for the target configuration represented by P.

Examples:

• Eater. An eater p is robust for P={p} within any context c that contains n≥0 spaceships traveling in the direction of the eater (and nothing else on the board). In these contexts, the eater eventually achieves a board empty apart from itself.
• Periodic patterns. An oscillator or spaceship p with period N is robust for Pn={q∣q∼stepn(p)} (for any n) within the empty board C={⊥} (where ∼ is equivalence up to translation). This includes still lifes (N=1), blinkers (N=2), gliders (N=4), etc.
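The robustness definition can be exercised directly on small cases. The following self-contained sketch (our illustrative set-of-cells representation, with a bounded step horizon standing in for the unbounded quantifier) checks a blinker against two contexts, the empty board and a distant block:

```python
def neighbors(cell):
    i, j = cell
    return {(i + x, j + y) for x in (-1, 0, 1) for y in (-1, 0, 1)} - {cell}

def step(p):
    """Standard Life dynamics on a set of live cells."""
    candidates = p | {n for c in p for n in neighbors(c)}
    return {c for c in candidates
            if len(neighbors(c) & p) == 3
            or (c in p and len(neighbors(c) & p) == 2)}

def pad(p):
    """One-cell padding mask around p's live cells."""
    return {(i + x, j + y) for (i, j) in p
            for x in (-1, 0, 1) for y in (-1, 0, 1)}

def instantiate(c, p):
    """Place p into context c, clearing the context inside p's padding mask."""
    return (set(c) - pad(p)) | set(p)

def achieves(p, prop, max_steps=50):
    s = set(p)
    for _ in range(max_steps + 1):
        if prop(s):
            return True
        s = step(s)
    return False

def robust(p, prop, contexts, max_steps=50):
    """p is robust for prop within contexts iff every c(p) achieves prop
    (bounded check; a False here only rules out achievement within the horizon)."""
    return all(achieves(instantiate(c, p), prop, max_steps) for c in contexts)

blinker = {(0, -1), (0, 0), (0, 1)}
block = {(10, 10), (10, 11), (11, 10), (11, 11)}
contexts = [set(), block]
# "Contains a vertical blinker phase" is achieved in both contexts...
assert robust(blinker, lambda s: blinker <= s, contexts)
# ...but exact equality to the lone blinker is not robust to the block context.
assert not robust(blinker, lambda s: s == blinker, contexts)
```

This illustrates how the "size" of the target property matters: the looser containment property is robust within a larger context set than the exact-equality property.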

Our use of contexts to represent perturbations is a little different from the intuitive notion. In particular, we do not directly consider perturbations that happen during the computation, that is, interventions on the state of the board at some step after the initial state c(p). One could consider this kind of external perturbation in an alternative definition, which may also be illuminating. An advantage of our approach is that it recognises that many perturbations can be achieved within the Game of Life computation itself  – one might call these embedded perturbations. Specifically, one can include in C a context c that contains a pattern that is “going to perturb p after k timesteps” (e.g., a glider that is going to collide with p after k timesteps).

The more robust a system is, and the more restrictive its target is, the more it seems like an optimizing system. These two axes correspond to the “size” of the two components of our formal robustness definition: the contexts C and the target P. If C is “larger”, the system is robust to more variation, and if P is “smaller”, the target is more restrictive. We will leave quantification of size unspecified for now, since there are various candidate definitions but we haven’t found a clearly correct one yet.

Definitions building on robustness

Definition. The basin of attraction for a pattern p and a property P is the largest context set B such that p is robust for P within B.

Examples:

• Eater. Let p be an eater and P={p}. B is the context set containing n≥0 spaceships moving in the direction of the eater and nothing else (in any other context, the contents of the board don't get consumed by the eater).
• Any pattern. Let p be an arbitrary pattern and P={q∣∃c.q=c(p)}. Then B is the set of all contexts: P is achieved immediately by c(p).

Definition. Now if we keep C fixed and vary P instead, we can define the minimal property of a pattern p within a context set C as the "smallest" property P such that p is robust for P within C.

We will discuss some options for quantifying the size of a property in the next section. For now, we consider some examples of minimal properties using set cardinality as the measure of size.

Examples:

• Still life. Let p be a still life and C={c∣pad(p)&c=⊥∧c is a still life} (still lifes not overlapping with p). Then P = {still life q | q=c(p) for some c} (since c(p) is a still life that is different for every context).
• Eater. Let p be an eater and C be the context set containing spaceships moving in the direction of the eater. Then P={p}.

The concept of a minimal property is related to the idea of a behavioral objective: a goal that a system appears to be optimizing for given its behavior in a set of situations. Given a pattern p and a context set C, the set of properties that p is robust for within C corresponds to the set of possible behavioral objectives for p within the set of situations C. We may be interested in the simplest behavioral objectives, corresponding to the set of minimal properties that p is robust for within C.

Options for robustness definitions

How might we quantify size in our definitions above? Primarily, we seek a notion of size for a property, which is a set of patterns. Set cardinality is one option, which ignores the contents of the patterns, counting each of them equally. Another option would be to combine (e.g., take an average of) sizes of the constituent patterns. Natural possibilities in Life for the size of a pattern include number of live cells or size of the smallest rectangle bounding the live cells. A different option, that may capture the sense needed better, is a complexity-based definition such as Kolmogorov complexity (of the whole property) or Levin complexity. It remains to be worked out whether any of these notions of size give our definitions above a natural semantics, or whether we need a different notion of size.

We defined robustness in terms of achieving a property. We could have defined it instead in terms of fixing a property, which is a stronger condition (any computation that fixes a property also achieves it, but not vice versa). However, the two definitions are equivalent if we restrict attention to stable properties that satisfy stepn(p)∈P whenever p∈P. We can stabilize a property P by unioning it with all the states any elements produce after any number of steps.
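Since, as noted earlier, the construction applies to any deterministic discrete dynamical system, stabilization can be sketched abstractly; here `step` is an arbitrary function on hashable states, and we use a toy four-state cycle for illustration (the same closure terminates for Life properties built from periodic patterns, with states as frozensets):

```python
def stabilize(states, step, max_iters=10_000):
    """Close a finite property under the dynamics: union in every state
    its elements ever produce. Returns once a fixed point is reached."""
    closed = set(states)
    frontier = set(states)
    for _ in range(max_iters):
        frontier = {step(q) for q in frontier} - closed
        if not frontier:
            return closed
        closed |= frontier
    raise RuntimeError("property did not stabilize within max_iters")

# Toy dynamics: a four-state cycle 0 -> 1 -> 2 -> 3 -> 0.
assert stabilize({0}, lambda n: (n + 1) % 4) == {0, 1, 2, 3}
```

The resulting set is stable by construction: applying `step` to any of its elements lands back inside it.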

Retargetability

The orthogonality thesis states that "more or less any level of intelligence could in principle be combined with more or less any final goal", suggesting that the capability to achieve goals in general (intelligence) is separate from the particular goal being pursued.  Not all optimizing systems satisfy this separation, as Flint's examples show, but those that do should score more highly on his measures of duality and retargetability.  We find duality and retargetability hard to distinguish, and will focus on the latter.

To get more precise about retargetability, let’s use the definition of robustness above for the aspect of retargetability that requires a notion of goal pursuit.

Definition. A pattern p is retargetable for a set G of properties (the “possible goals”) if there exists a context set C, such that for any property Pi in G there is a pattern pi that is a “small” change from p, such that pi is robust for Pi within C.

The degree of retargetability depends on the size of the set G (more, or more interesting, possible goals are better), the size of the changes (smaller, or less complex changes required for retargeting are better), and the size of the context set (larger is better).

This definition is again dependent on a way to measure sizes, for example, the size of the change between p and pi. Some candidates include: Kolmogorov complexity of the change, the number of cells changed, and the size of the area in which changes are made.

Examples:

• Glider gun. Let p be a glider gun positioned at (0,0) on the board. Let the target set of goals be G={P1,P2,P3,P4}, where Pi is the property that there is a glider in the ith quadrant of the board. Namely, Pi={q∣∃gi.gi∼g∧gi≤Qi∧gi≤q}, where g is a glider and Qi is a pattern covering the ith quadrant of the board.
Then for any Pi, we obtain pi by rotating the glider gun to fire into the quadrant Qi, which is a small change by the complexity definition. Let C be the set of still life contexts that don't overlap with the glider gun or its firing path for any of the four rotations of the glider gun. Then pi is robust for Pi within C, so p is retargetable for the target set G.
• Turing machine. Let p be a pattern implementing a Turing machine computing some function f(x), e.g., "given input x, compute x+1". For any input x, let Px be the set of board states where the output tape of the Turing machine contains f(x). Let the target set of goals be G={Px∣x is a possible input}.
Then for any Px we can obtain px by placing x on the input tape of the Turing machine, which is a small change by all the definitions of size we considered (number of cells, size of area, and complexity). Let C be the set of still life contexts that don't overlap with the Turing machine. Then px is robust for Px within C, so p is retargetable for the target set G.

Our setup suggests a possible relationship between robustness and retargetability: it may be difficult for a pattern to be both robust and retargetable. A retargetable pattern needs to be “close to” robustly achieving many targets, but this may be in tension with robustly achieving a single target property. The reason is that a context may cause retargeting via an embedded perturbation, and the new target property may not overlap with the original target property. For example, since the Turing machine is retargetable by changing the input, it's not robust to contexts that change its input.

Conclusions and open questions

We have proposed some definitions for robustness and retargetability in Conway’s Game of Life, and shown examples of how they work. Our definitions are not fully specified - they lack a good specification of how to quantify sizes of patterns and sets of patterns. We hope they nevertheless illustrate an interesting way of looking at optimizing systems in a concrete deterministic setting.

Here are some open questions that we would be excited to get your input on:

• To what extent is there a tradeoff between robustness and retargetability?
• Is robustness or retargetability of a system a greater concern from the alignment perspective?
• It seems feasible to extend our definitions from the Game of Life to other environments where instantiation can be defined. We'd be interested in your suggestions of interesting environments to consider.
• More examples of interesting robust patterns. What could they tell us about the properties that C should have in the definition of robustness?
• Possible theorems restricting the size or content of robust patterns. E.g., for some class of contexts do you need to be “agent-like” in some way, such as doing something like perception, in order to be robust?
• No-free-lunch type theorems on what kinds of combinations of context set C and property P are impossible for any robust pattern.

Discuss

What if we should use more energy, not less?

16 октября, 2021 - 22:51
Published on October 16, 2021 7:51 PM GMT

Declining growth rates and technological stagnation since the 70s correlate with flatlining energy use per capita.

I've been reading "Where Is My Flying Car?: A Memoir of Future Past" by J Storrs Hall. Ostensibly a book about a promised future that never arrived, it's a broader commentary on a technologically stagnating culture and society. A few others have reviewed it here on Less Wrong already, and Jason Crawford has a great summary of and commentary on the book at Roots of Progress.

I'm sharing this post here to get feedback on the coherence of the idea that declining growth rates and the technological stagnation since the 70s strongly correlate with flatlining energy use per capita:

A key takeaway from the book has been the counter-intuitive realization that perhaps one of the main reasons for the so-called "Great Stagnation" is the western world's flatlining energy usage (per capita). But given our overall increase in energy consumption (mostly due to the growth of the developing world), one can be forgiven for having missed this - especially as we're trying to deal with a warming climate by using less energy. Yet this drive towards efficiency and the resultant decline in energy usage among developed nations, at least according to Hall, may be one of our biggest mistakes of the past half-century.

The Great Stagnation - as readers of Less Wrong are likely familiar with - is the term coined/popularized by economist Tyler Cowen, now used to describe the current period since the ~70s of declining economic and technological growth experienced by most developed nations. This is perfectly exemplified by the fact that almost every important piece of technology we use in our day-to-day lives and in industry was invented before the 60s. This includes things like refrigerators, freezers, vacuum cleaners, gas and electric stoves, and washing machines; indoor plumbing, detergent, and deodorants; electric lights; cars, trucks, and buses; tractors and combines; fertilizer; air travel, containerized freight, the vacuum tube, and the transistor; the telegraph, telephone, phonograph, movies, radio, and television.

This stagnation is surely a result of a wide range of social, economic, and technological factors, like the fact that we've already picked a lot of the "low-hanging fruit" of technological innovation since the start of the industrial revolution, or that most women have already moved into the workforce since the second world war. But according to (my interpretation of) Hall, the underlying cause of this stagnation is our stagnating energy usage. Or put differently, the decline in the growth rate of energy usage in advanced economies, which is a result of a shift from a focus on progress in tech to a focus on the efficiency of tech.

In other words, we have had a very long-term trend in history going back at least to the Newcomen and Savery engines of 300 years ago, a steady trend of about 7% per year growth in usable energy available to our civilization. Let us call it the “Henry Adams Curve.” The optimism and constant improvement of life in the 19th and first half of the 20th centuries can quite readily be seen as predicated on it. To a first approximation, it can be factored into a 3% population growth rate, a 2% energy efficiency growth rate, and a 2% growth in actual energy consumed per capita. Here is the Henry Adams Curve, the centuries-long historical trend, as the smooth red line. Since the scale is power per capita, this is only the 2% component. The blue curve is actual energy use in the US,   which up to the 70s matched the trend quite well. But then energy consumption flatlined.
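Hall's factorization works because small growth rates compound approximately additively. A quick sanity check of how the quoted components combine into roughly 7% growth in usable energy (the figures are the ones from the quote; treating them as exact is my simplification):

```python
pop_growth = 0.03         # population growth rate
efficiency_growth = 0.02  # useful work extracted per unit of energy
per_capita_growth = 0.02  # energy consumed per person

# Total usable-energy growth: the three factors multiply...
combined = ((1 + pop_growth) * (1 + efficiency_growth)
            * (1 + per_capita_growth) - 1)

# ...which for small rates is approximately their sum.
approx = pop_growth + efficiency_growth + per_capita_growth

print(round(combined, 4))  # ~0.0716
print(round(approx, 2))    # 0.07
```

The smooth red "Henry Adams Curve" in Hall's chart plots only the 2% per-capita component, which is why flatlining per-capita consumption shows up so starkly against it.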

The story of human progress is largely a story of how much energy we have been able to harness and put to productive use. Starting with our early ancestor’s ability to harness fire to the discovery that we could split the atom, and beyond. Declining energy usage is therefore a problem because technological innovation and growth are tightly correlated with increased energy consumption, and technological innovation is one of the main drivers of progress. Or perhaps it would be more accurate to say that in order to drive innovation broadly, we have to use more energy because advanced technologies are generally more energy-intensive. All things equal, increased energy efficiency is great, but all things aren't equal and we've traded growth for efficiency.

The extent to which a technology didn’t live up to its Jetson’s-era expectations is strongly correlated with its energy intensity. The one area where progress continued most robustly—Moore’s Law in computing and communications—was the one where energy was not a major concern.

The one notable exception to this is the computing revolution which birthed the Information Technology industry - arguably the only technological revolution we've had since the ~60s. Computing, driven by improvements in semiconductor performance according to Moore's Law, is perhaps the only area where keeping up hasn't required energy consumption beyond what was available in the 70s. As such, it has been able to grow despite the focus on energy efficiency. Not surprisingly, nearly all of today's most valuable companies by market cap are tech/IT companies.

No one captured this as eloquently as Peter Thiel when he said:

"We wanted flying cars, instead we got 140 characters.

So if we want to see continued improvements to our quality of life and progress as a civilization, we need new technological revolutions in the world of atoms and not just bits. Nanotech, flying cars, space travel, biomedicine. All those things are possible but they will demand a lot more energy; in addition to less regulation, better academic institutions, and a culture that wants growth, according to Hall.

Needless to say, the flatlining of energy usage has had a profound effect on our current predicament. Given how our modern society and political systems work, steady economic growth seems crucial for the continued existence of peaceful and prosperous civilizations (stagnation or "degrowth" leads to zero-sum competition for resources). The effects of stagnating growth may even be more detrimental than the effects of climate change in the short to medium term. And faster technological progress may be just what we need in order to deal with climate change, rather than less progress.

What is then the cause of this decline in the growth of energy usage? Probably the drive for efficiency over effectiveness (for ideological reasons), stricter regulation of science and technology since the middle of the 20th century, including many energy-producing industries, and a culture that's often opposed to technological progress. The decline in energy use per capita started in the 70s and largely coincides with the spread of the counter-cultures of the 60s, including the green movement. Of course, not all regulations are bad or unnecessary, but it's not very clear that strict regulation actually makes us much safer and healthier overall, and in fact, the opposite may be true:

"Economists John Dawson and John Seater recently published a study in the Journal of Economic Growth, “Federal Regulation and Aggregate Economic Growth”, [82]  that put some hard numbers to these observations. The result is startling: America’s median household income is now $53,000. If we had simply maintained the amount of regulation we had in 1949 since then, our income would now be$185,000 per household."

In retrospect, the utopian science fiction from the first half of the 20th century looks ridiculous. Flying cars and robot maids, perhaps best illustrated by The Jetsons cartoon, are things we laugh at today. But when you plot the growth trends from that era, it was perfectly reasonable to predict their continuation for many more decades or centuries. In fact, the economist Alex Tabarrok (a colleague of Tyler Cowen) made just this point in a recent blog post on Marginal Revolution, The Future is Getting Farther Away:

If total factor productivity had continued to grow at its 1957 to 1973 rate then we today would be living in the world of 2076 rather than in the world of 2014.

While we probably can't expect growth to continue forever, it's not at all a given that it must decline now. And the better tech we have, the more smart people (and machines) there are, and the better our political systems, the better our chances of dealing with that eventuality.

It's quite clear to me that the current myopic and dystopian narrative that's captured a large part of the western zeitgeist is largely explained by our declining growth, just as the perhaps overoptimistic extrapolations of the early 20th century are explained by that era's ever-increasing growth trends.

Nevertheless, if we look at our (economic) growth trajectory on the time frame of many thousands of years, we're still in the early stages of hypergrowth and judging by the trend, we can expect it to continue for some time - which is a cause for optimism.

The need for growth in a finite world is often criticized as a greedy and selfish impulse, but I believe we need continued growth exactly to solve some of our most pressing problems, including climate change, poverty, and a stagnating quality of life. And to do that, we must dare to use more energy, not less.

Obviously, we don't want to burn more fossil fuels than we are doing, yet we shouldn’t minimize the importance that fossil fuels have played and still play in our society. Luckily, we already have a viable technology that provides clean and reliable energy with zero emissions: it's called nuclear energy. Although we should and have to continue improving other alternative energy sources (including solar, wind, and even nuclear fusion!), and stop using fossil fuels asap, we can only do that if we create the right incentives and provide the right mechanisms for technological innovation. The conclusion to me, then, is that our drive towards efficiency before we've reached some sort of technological maturity may be undermining the long-term potential of our civilization and the near-term viability of our modern societies.

P.S. Tyler Cowen has recently said that he now thinks we might be coming out of the great stagnation in the near future, something mainly driven by innovations across various fields of technology and science like computing, AI, Space tech, biotech, etc; that all converge into general-purpose innovations.

Discuss

Is moral duty/blame irrational because a person does only what they must?

16 октября, 2021 - 22:30
Published on October 16, 2021 5:00 PM GMT

I should note a few things from the start. I understand that there is much prewritten work available here, namely the sequences, the codex, and my favorite fanfic ever, HPMOR. I have tried to find and understand where any of these or any other prewritten works associated with LessWrong.com might have already addressed these questions. I am writing this however because either I did not find the answers I was looking for or I have not recognized them; either way I ask for assistance.

Also, full disclosure, while I have spent the majority of the past three and a half decades (of my 53 total years) on my own exploring applied rationality and discussing it face to face with others in my life’s orbit, as of the last three years or so I have become drawn to the idea of making a book out of my conclusions so far, on the off chance that doing so could be helpful to even a single other person who comes across it. My time on this planet is limited; I’d like to leave some collected thoughts, the fruits of my time to date, to survive me – again, on the off chance that they can be helpful.

One of the realizations I’ve had more recently (largely because it was not an area of thought that interested me until recently) is the question of whether we have the option of making any other choices than the ones we actually do. This by some is called the question of free will, but I am choosing to largely avoid that phrase here in order to attempt to be much more specific, to wit:

When a sentient being (such as I or you) makes a decision, is it ever possible that they could have made a decision different from the one that they actually made? (Or, if speaking of a future decision, will make.)

I am imagining that my chain of thinking on this topic may well seem simplistic, even naïve to the minds found here, but here it is nonetheless, the best effort I have been able to make in applying reason to this question so far, in abbreviated form:

• In order to seek explanations for the patterns we see in reality, we must first embrace the idea that even when we don’t know the reasons why something has happened, we still hold that there are reasons why it happened. We must first begin with the idea that explanations are possible in order to seek them, even if some wind up being statistical in nature. Even when we cannot determine how something occurred, we still understand that it had a reason for occurring.
• Thus, we embrace that whatever happens in reality happens because of reality – because of the specific details and configuration of reality in the moment.
• Human decisions and choices are also things that happen in reality.
• Thus, human decisions and choices also happen due to the specific details and configuration of reality: both the reality of the environment that the person is confronted with when making the choice, and the reality of the full nature of the person making the choice at the time of doing so.
• Therefore, a person who makes choice A had to make that choice, for to have made any other choice the person would have either had to be different than they actually were, or the situation/environment would have had to be different than it actually was. Given the specific reality of both the person and the situation they are in when they make a choice, it stands to reason that the choice they do make is the choice they must make, else they would be making a different choice instead.

The above seems inescapably tautologically true, to my best effort to find otherwise. When a sentient being makes any choice of any kind, the “rules of reality”, whatever they may be, dictate what that choice will be. This is not to say we know the rules well enough to be able to predict the choice, nor to say we will necessarily ever be able to learn the rules well enough to be able to make such an accurate prediction. But we don’t need to be able to predict a future choice to know that the choice that will be made will be the one that reality dictates must be made.

It therefore becomes irrational to insist that people “ought” to have made any choice other than the one that they do, if it is indeed true that the choice a person actually makes in any circumstance is the only choice they are capable of making, given whatever the rules of reality actually are. How does it make sense to blame a person for not doing what they are incapable of doing?

If human choice works this way, as I think the above demonstrates, we cannot be reasonably any more upset with people for their “wrong” choices than we can be upset at a car that doesn’t function: both have no choice in their response, and simply work how they must.

There was an illustration I read somewhere with this exact comparison: that holding people responsible for their choices is no different from Basil Fawlty warning his car that it better stop misbehaving and start working, and then giving his car a “well-deserved” thrashing with an umbrella when it failed to shape up.

Note: I am not saying that we should not visit consequences on those who do things that we dislike, merely that those consequences make more sense from a view to changing the future than punishing the past. Imprisoning someone who enjoys murdering people makes sense if doing so reduces future murders, assuming that is one of our goals – regardless of whether or not we “blame” the murderer.

So this is where I am at. It seems to me that no one can have the capacity to do anything other than the rules of reality demand, and as such, whenever we make a decision or choice, it was the one we had to make given all the circumstances. Since we had no option of making any other decision, we cannot reasonably have any moral duty to do what we cannot do. Thus no person, no matter what they choose to do, can be reasonably told that they “ought” to have done any different, since they simply did not have that option.

And so I have concluded that there is no basis on which to judge others as “to blame” or as “morally wrong”. Thus punishment makes no sense in terms of being “deserved”, and neither does vengeance (except possibly in terms of the vengeance seeker seeking their own emotional pleasure/relief/closure).

The silver lining if I am right I think is threefold:

• This would more clearly delineate the difference between punishment and justice: whereas punishment makes no rational sense from a moral stance, justice still makes sense because justice is about influencing the future using the past only as data, but not living in the past.
• We would have to stop blaming both others and ourselves, and instead start asking better questions, like “if I don’t like what the other person did, what do I want to do about it?” Hopefully less raging and more pragmatics.
• We would have to finally embrace a practical morality. A preacher I discourse with asked me, given that fact I cannot be held to blame for any of my actions, what’s to stop me from pushing my own mother down the stairs. My answer was simple. I don’t do such things because they are not who I am, because doing things like that would distress me. And that is also the reason I didn’t do any of that before I reached this conclusion too.
(For people who don’t have such natural instincts, that is one reason we do need consequences/justice: to both disincentivize unwanted behaviors and to remove the capabilities of those to commit such acts in the future.)

One of the main reasons I am putting this all here is it seemed to me that the sequence on Free Will, as best as I could understand it, may have been disputing some of the elements above – although much of it was either not about these items directly or over my head (or both, perhaps). For all I know, the Free Will sequence is about something entirely different than what is above.

For instance, it seems to me that the Free Will sequence was focused a lot more on the question of “why does it seem to us that we have freedom of choice” rather than examining whether we actually do have it or not. And I didn’t see anything in that sequence that addressed any of my thoughts above – either because it doesn’t or because it went over my head and I may need help making it not do that.

So, where do we go from here? As far as I can see at this moment, it seems unavoidably and even factually correct to state that no being in reality can possibly have freedom to act in any way other than the rules of reality require, and those rules must thus determine what choice they must make in any given moment. Without a capability to have made any choices differently given the circumstances, we cannot rationally assign any moral duty or blame.

If this thinking is not correct, then please demonstrate to me why? I am, as always, trying to become less wrong.

Discuss

The AGI needs to be honest

16 октября, 2021 - 22:24
Published on October 16, 2021 7:24 PM GMT

Imagine that you are a trained mathematician and you have been assigned the job of testing an arbitrarily intelligent chatbot for its intelligence.

You, being knowledgeable about a fair amount of computer-science theory, won’t test it with the likes of the Turing test or similar, since such a bot might not have any useful priors about the world.

You have asked it to find a proof of the Riemann hypothesis. The bot started its search program, and after several months it gave you a gigantic proof written in Coq.

You have tried to run the proof through a proof-checking assistant, but quickly realized that checking it would itself take years or decades; also, no computer except the one running the bot is sophisticated enough to check such a gigantic proof.

You have asked the bot to provide you a zero-knowledge proof, but being a trained mathematician you know that a zero-knowledge proof of sufficient credibility requires as much compute as the original one. Also, the correctness is directly linked to the length of the proof it generates.

You know that the bot may have formed increasingly complex abstractions while solving the problem, and it would be very hard to describe those in exact language to you.

You have asked the bot to summarize the proof for you in natural-language, but you know that the bot can easily trick you into accepting the proof.

You have now started to think about a bigger question. The bot is essentially a powerful optimizer. In this case, the bot is trained to find proofs; its reward is based on producing what a group of mathematicians agree a correct proof looks like.

But the bot, being a bot, doesn’t care about being true to you or to itself. It is not rewarded for being “honest”; it is only rewarded for finding proof-like strings that humans may accept or reject.

So it may be far easier for it to find a large Coq program, large enough that you cannot check it by any other means, than to find a genuine proof of the Riemann hypothesis.

Now you have concluded that before you certify that the bot is intelligent, you have to prove that the bot is being honest.

Going by the current trend, it is reasonable to assume that such an arbitrarily intelligent bot would be based in significant part on the principles of the current deep-learning stack. Assume it is a large neural-network-based agent, and that its language-understanding component is broadly based on current language-model design.

So how do you know that the large language model is being honest?

A quick look at the plots of results on the TruthfulQA dataset shows that truthfulness decreases with model size; going by this trend, large models trained on large datasets are more likely to give fluke answers to significantly complex questions.

Any significantly complex decision question, if cast into an optimization problem, has one hard-to-find global minimum called “truth” but an extremely large number of easy-to-find local minima of falsehoods. How do you then make a powerful optimizer optimize for honesty?

Discuss

Memetic hazards of AGI architecture posts

16 октября, 2021 - 19:10
Published on October 16, 2021 4:10 PM GMT

I've been thinking recently and writing a post about a potential AGI architecture that seems possible to build with current technology in 3 to 5 years, and even faster if significant effort is put toward that goal.

It is a bold claim, and that architecture very well might not be feasible, but it got me thinking about the memetic hazard of similar posts.

It might very well be true that there is an architecture combining current AI tech in a manner as to create AGI out there; in that case, should we treat it as a memetic hazard? If so, what is the course of action here?

I'm thinking that the best thing to do is to covertly discuss it with the AI Safety crowd, both to understand its feasibility and to start working on how to keep this particular architecture aligned (which is a much easier task than aligning something when you don't even know what it will look like).

What are your thoughts on this matter?

Discuss

16 октября, 2021 - 10:30
Published on October 16, 2021 7:30 AM GMT

Good writing illuminates surprising things about reality. It must therefore be grounded in reality. Losing touch with reality is boring.

The best way to keep your writing grounded in reality is to write concretely. Don't write "the United States committed war crimes". Write "the United States firebombed women and children". Personal experience is always concrete.

• Don't argue. Arguing shifts your focus from things to ideas. It distances you from empirical reality. Don't anticipate counterarguments. Preemptive counterargument is a form of arguing.
• Don't write positively about other people's opinions. I love George Orwell but putting him on a pedestal is no less shallow than arguing against him.

Writing about facts, feelings, and faith is fine. Fiction has its place too. What you shouldn't write about is other people's beliefs. Doing so opens the Box of Infinite Recursion and ultimately leads to the Black Hole of Drama.

Reversed conformity is orthogonal to independent thought. Independent thought equals ignoring others' opinions.

Most people are right most of the time about most things. Deviating from consensus makes you less correct on average. How correct you are on average is unimportant when you are inventing radical ideas. You must weigh according to impact.

• Personal attacks cause collateral damage. Before you make a personal attack you should be extremely confident that your claim is true and that the good will outweigh the harm.
• It's worth publishing weird ideas even when most of them are wrong because if you publish a weird idea and your idea is good then it will be adopted by many people whereas if the idea is bad then it will be quickly forgotten.

Make your claims easy to falsify. Claims that aren't falsifiable aren't grounded in reality.

Surprise!

Unsurprising facts are boring. Good writing focuses on the surprising ones. If you're ignoring others' opinions then "surprising" means "surprising to you". Explore.

Surprise is temporary. If you discover something surprising then you should write about it immediately, before you acclimatize.

Discuss

Long Covid Informal Study Results

16 октября, 2021 - 10:10
Published on October 16, 2021 7:10 AM GMT

Introduction

Yesterday* I talked about a potential treatment for Long Covid, and referenced an informal study I’d analyzed that tried to test it, which had seemed promising but was ultimately a let down. That analysis was too long for its own post, so it’s going here instead.

Gez Medinger ran an excellent-for-its-type study of interventions for long covid, with a focus on niacin, the center of the stack I took. I want to emphasize both how very good for its type this study was, and how limited the type is. Surveys of people in support groups who chose their own interventions are not a great way to determine anything. But really rigorous information will take a long time and some of us have to make decisions now, so I thought this was worth looking into.

Medinger does a great analysis in this youtube video. He very proactively owns all the limitations of the study (all of which should be predictable to regular readers of mine) and does what he can to make up for them in the analysis, while owning where that’s not possible. But he delivers the analysis in a video rather than a text post ugh why would you do that (answer: he was a professional filmmaker before he got long covid). I found this deeply hard to follow, so I wanted to play with the data directly. Medinger generously shared the data, at which point this snowballed into a full-blown analysis.

I think Medinger attributes his statistics to a medical doctor, but I couldn’t find it on relisten and I’m not watching that damn video again. My statistical analysis was done by my dad/Ph.D. statistician R. Craig Van Nostrand. His primary work is in industrial statistics but the math all transfers, and the biology-related judgment calls were made by me (for those of you just tuning in, I have a BA in biology and no other relevant credentials or accreditations).

The Study

As best I can determine, Medinger sent a survey to a variety of long covid support groups, asking what interventions people had tried in the last month, when they’d tried them, and how they felt relative to a month ago. Obviously this has a lot of limitations – it will exclude people who got better or worse enough that they didn’t engage with support groups, it was in no way blinded, people chose their own interventions, it relied entirely on self-assessment, etc.

Differences in Analysis

You can see Medinger’s analysis here. He compared the rate of improvement and decline among groups based on treatments. I instead transformed the improvement bucket to a number and did a multivariate analysis.

Much better (near or at pre-covid): 1
Significantly better: 0.5
A little better: 0.1
No change: 0
A little worse: -0.2
Significantly worse: (curiously unused)
Much worse: -1.2

You may notice that the numerical values of the statements are not symmetric- being “a little worse” is twice as bad as “a little better” is good. This was deliberate, based on my belief that people with chronic illness on average overestimate their improvement over short periods of time. We initially planned on doing a sensitivity analysis to see how this changed the results; in practice the treatment groups had very few people who got worse so this would only affect the no-treatment control, and it was obvious that fiddling with the numbers would not change the overall conclusion.

Also, no one checked “significantly worse”, and when asked Medinger couldn’t remember if it was an option at all. This suggests to me that “Much worse” should have a less bad value and “a little worse” a more bad value. However, we judged this wouldn’t affect the outcome enough to be worth the effort, and ignored it.

We tossed all the data where people had started a treatment less than two weeks ago (slightly more than half of it), except for the no-change control group (140 people). Most things take time to have an effect, and even more things take time to have an effect you can be sure isn’t random fluctuation. The original analysis attempted to fix this by looking at who had a sudden improvement or worsening, but I don’t necessarily expect a sudden improvement with these treatments.

We combined prescription and non-prescription antihistamines because the study was focused on the UK which classifies several antihistamines differently than the US.

On row 410, a user used slightly nonstandard answers, which we corrected to be equivalent to “much improved”, since they said they were basically back to normal.

Medinger uses both “no change” and “new supplements but not niacin” as control groups, in order to compensate for selection and placebo effects from trying new things. I think that was extremely reasonable but felt I’d covered it by limiting myself to subjects with >2 weeks on a treatment and devaluing mild improvement.

Results

I put my poor statistician through many rounds on this before settling on exactly which interventions we should focus on. In the end we picked five: niacin, antihistamines, and low-histamine diet, which the original analysis evaluated, plus vitamin D (because it’s generally popular) and selenium (because it had the strongest evidence of the substances prescribed in the larger protocol, which we’ll discuss soon).

Unfortunately, people chose their vitamins themselves, and there was a lot of correlation between the treatments. Below is the average result for people with no focal treatments, everyone with a given focal treatment, and everyone who did that and none of the other focal treatments for two weeks (but may have done other interventions). I also threw in a few other analyses we did along the way. These sample sizes get really pitifully small, and so should be taken as preliminary at best.

| Treatment | Niacin, >2 weeks | Selenium, >2 weeks | Vitamin D, >2 weeks | Antihistamines, >2 weeks | Low-histamine diet, >2 weeks | Change (1 = complete recovery) | 95% Confidence Interval | n |
|---|---|---|---|---|---|---|---|---|
| No change | 0 | 0 | 0 | 0 | 0 | 0.04 | ±0.07 | 140 |
| Niacin, >2 weeks | 1 | – | – | – | – | 0.23 | ±0.07 | 91 |
| Selenium, >2 weeks | – | 1 | – | – | – | 0.24 | ±0.07 | 88 |
| Vitamin D, >2 weeks | – | – | 1 | – | – | 0.15 | ±0.05 | 261 |
| Antihistamines, >2 weeks | – | – | – | 1 | – | 0.18 | ±0.06 | 164 |
| Low-histamine diet | – | – | – | – | 1 | 0.18 | ±0.06 | 195 |
| Niacin, >2 weeks, no other focal treatments | 1 | 0 | 0 | 0 | 0 | 0.15 | ±0.2 | 11 |
| Selenium, >2 weeks, no other focal treatments | 0 | 1 | 0 | 0 | 0 | 0.05 | ±0.06 | 4 |
| Vitamin D, >2 weeks, no other focal treatments | 0 | 0 | 1 | 0 | 0 | 0.07 | ±0.08 | 106 |
| Antihistamines, >2 weeks, no other focal treatments | 0 | 0 | 0 | 1 | 0 | 0.08 | ±0.13 | 26 |
| Low-histamine diet, >2 weeks, no other focal treatments | 0 | 0 | 0 | 0 | 1 | 0.13 | ±0.14 | 44 |
| All focal treatments | 1 | 1 | 1 | 1 | 1 | – | – | 0 |
| Niacin + antihistamines, >2 weeks | 1 | – | – | 1 | 0 | 0.33 | ±0.07 | 38 |
| Niacin + low-histamine diet, >2 weeks | 1 | 0 | 0 | 0 | 1 | 0.29 | ±0.10 | 36 |
| Selenium + niacin, no histamine interventions | 1 | 1 | – | 0 | 0 | 0.05 | ±0.19 | 17 |
| Niacin, >2 weeks, no other focal treatments, ignore D | 1 | 0 | – | 0 | 0 | 0.13 | ±0.12 | 19 |
| Selenium, >2 weeks, no other focal treatments, ignore D | 0 | 1 | – | 0 | 0 | 0.16 | ±0.12 | 18 |

1 = treatment used

0 = treatment definitely not used

– = treatment not excluded

Confidence interval calculation assumes a normal distribution, which is a stretch for data this lumpy and sparse, but there’s nothing better available.
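For the curious, here is a minimal sketch of the standard normal-approximation interval presumably used (my reconstruction, not the statistician’s actual code):

```python
import math

def ci95_half_width(values):
    """95% confidence half-width for the mean, assuming normality."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance (n - 1 denominator), then standard error of the mean
    sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 1.96 * math.sqrt(sample_var / n)
```

With data this lumpy and sparse, the normality assumption is the weakest link.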

[I wanted to share the raw data with you but Medinger asked me not to. He was very fast to share with me though, so maybe if you ask nicely he’ll share with you too]

You may also be wondering how the improvements were distributed. The raw count isn’t high enough for really clean curves, but the results were clumped rather than bifurcated, suggesting it helps many people some rather than a few people lots. Here’s a sample graph from Niacin (>2 weeks, no exclusions)

Reasons this analysis could be wrong
• All the normal reasons this kind of study or analysis can be wrong.
• Any of the choices I made that I outlined in “Differences…”
• There were a lot of potential treatments with moderate correlations with each other, which makes it impossible to truly track the cause of improvements.
• Niacin comes in several forms, and the protocol I analyze later requires a specific form of niacin (I still don’t understand why). The study didn’t ask people what form of niacin they took. I had to actively work to get the correct form in the US (where 15% of respondents live); it’s more popular but not overwhelmingly so in the UK (75% of respondents), and who knows what other people took. If the theory is correct and if a significant number of people took the wrong form of niacin, it could severely underestimate the improvement.
• This study only looked at people who’d changed things in the last month. People could get better or worse after that.
• There was no attempt to look at dosage.
Conclusion

For a small sample of self-chosen interventions and opt-in participation, this study shows modest improvements from niacin and low-histamine diets, although the confidence intervals overlap with the no-treatment group’s if you exclude people using other focal interventions. The overall results suggest that either something in the stack is helping, or that trying lots of things is downstream of feeling better, which I would easily believe.

Thank you to Gez Medinger for running the study and sharing his data with me, R. Craig Van Nostrand for statistical analysis, and Miranda Dixon-Luinenburg⁩ for copyediting.

* I swear I scheduled this to publish the day after the big post, but here we are three days later with it unpublished, so…

Discuss

NLP Position Paper: When Combatting Hype, Proceed with Caution

16 октября, 2021 - 03:09
Published on October 15, 2021 8:57 PM GMT

Linkpost for https://cims.nyu.edu/~sbowman/bowman2021hype.pdf. To appear on arXiv shortly.

I'm sharing a position paper I put together as an attempt to introduce general NLP researchers to AI risk concerns. From a few discussions at *ACL conferences, it seems like a pretty large majority of active researchers aren't aware of the arguments at all, or at least aren't aware that they have any connection to NLP and large language model work.

The paper makes a slightly odd multi-step argument to try to connect to active debates in the field:

• It's become extremely common in NLP papers/talks to claim or imply that NNs are too brittle to use, that they aren't doing anything that could plausibly resemble language understanding, and that this is a pretty deep feature of NNs that we don't know how to fix. These claims sometimes come with evidence, but it's often bad evidence, like citations to failures in old systems that we've since improved upon significantly. Weirdly, this even happens in papers that themselves show positive results involving NNs.
• This seems to be coming from concerns about real-world harms: Current systems are pretty biased, and we don't have great methods for dealing with that, so there's a pretty widely-shared feeling that we shouldn't be deploying big NNs nearly as often as we are. The reasoning seems to go: If we downplay the effectiveness of this technology, that'll discourage its deployment.
• But is that actually the right way to minimize the risk of harms? We should expect the impacts of these technologies to grow dramatically as they get better—the basic AI risk arguments go here—and we'll need to be prepared for those impacts. Downplaying the progress that we're making, both to each other and to outside stakeholders, limits our ability to foresee potentially-impactful progress or prepare for it.

I'll be submitting this to ACL in a month. Comments/criticism welcome, here or privately (bowman@nyu.edu).

Discuss

Do you think you are a Boltzmann brain? If not, why not?

15 октября, 2021 - 09:02
Published on October 15, 2021 6:02 AM GMT

For more on Boltzmann brains, see here.

Discuss

Intelligence, epistemics, and sanity, in three short parts

15 октября, 2021 - 07:01
Published on October 15, 2021 4:01 AM GMT

Epistemic status: Boggling. This is early, messy work.

Part 1: A Brief Tale

You’re exploring a vast land filled with forests and brush. You thrash your sword to carve out a path and make sense of things.

There are monsters. Lovecraftian winged monstrosities. Their attacks damage your sanity. When injured, you don’t notice, but your mind will begin to wander. You begin seeing things that aren’t real.

You have a shield that protects you from the beasts. You have armor as well. They attack; you deflect. You hear a thud, a squeal, then light flapping sounds when they retreat. Occasionally you can even catch them with your sword but they move quickly.

You return to the village. Your companions have returned too. Some of them aren’t right. They’re obsessed with bizarre imaginings of angels and gods. Some speak of finding ancient labyrinths in places you know are full of swamps.

Your companions size each other up. Factions emerge. One cluster suggests that the monsters bring forward wisdom. The monsters should be brought directly into the village and released.

The pro-monster faction is visibly scratched up; they must have been wounded. They deny taking damage, insisting the marks on their bodies are extraneous. Some hallucinate that they are wearing a great deal of armor, even though they are visibly unprotected. They do, however, possess large weapons (likely a trade-off from having light armor), so others are nervous about disagreeing.

The bulk of the crowd turns against this faction. Some attack back; a few people are harmed, but in the end, the faction loses. Some are thrown into jails, others relegated to the very safest of tasks. But this cycle will repeat. You might not be so lucky next time.

The village learns to cope. There are vast regions of bush and thorn but no monsters. Soldiers with little armor are sent there. Some dual wield axes or use two-handed broadswords.

Other areas have many monsters but plain land. This is where you send your fully plated knights. They’ll move slowly but have the best chances of keeping their sanity.

Part 2: An Explanation

The topic is intelligence, epistemics, and sanity.

Weapons are raw intellect, cleverness, narrow technical abilities. These are powerful but dangerous.

Shields and armor represent epistemics, wisdom, humility, calibration. Not believing false things. Not going functionally insane. Winning the intellectual war, though perhaps losing the battle.

The monsters are epistemic hazards that lead good people to believe bad things. See: Politics is the Mind-Killer. Pure math is (relatively) safe.[1] Few people learn new math proofs and proceed to imagine ridiculous things about epistemology. But politics, social sciences, news, and religion are quite treacherous. They’re full of gradients that cause people to believe false things. These fields have incentives that encourage dramatic overconfidence and motivated reasoning.

Mein Kampf is a token example. For people with solid epistemics and who are otherwise well-read, reading Mein Kampf can be enlightening. It’s necessary reading to deeply understand what exactly happened in Hitler’s rise to power. But there are clearly other readers who would come away with exactly the wrong lessons, and wind up becoming sympathetic to Adolf Hitler. Mein Kampf represents dangerous territory to these people.

Televised news and social media news are both highly hazardous. People who learn from these sources often become extremists of one form or another. We haven’t found easy ways to give them the right defenses; to allow them to come away from these sources consistently more correct about the world, instead of less.

Individuals with powerful intellectual weapons but weak epistemics can be extremely dangerous. Consider all of the highly well-coordinated destruction caused by certain religious groups, or, on an individual level, vocal advocates of these groups.

Individuals with great epistemics but weak intellect are safe, but often not very useful. Think of soft-spoken people who are poor at arguing. It takes great effort to discern these folks and listen to them accordingly.

Increases in intellect can be dangerous. Increases in epistemics are occasionally so. If your enemies have improved epistemics, but still don’t see your point of view, that could well be a bad thing. The scariest threat is a force with brutal intelligence and epistemics that are excellent except for one crucial detail.

Sanity in the tale is just that: sanity in real life. Perhaps we might not call crazed conspiracy theorists *insane*, but I hope we can admit they often have low *sanity*. At some point, fundamentally incorrect beliefs infect a great deal of one’s worldview and lifestyle. When you take epistemic damage, from epistemic hazards, you lose sanity. Losing sanity lessens your epistemics, sometimes leading to self-reinforcing cycles.

Part 3: A Dungeons and Dragons Variant

Amelia has high intelligence, medium sanity, and medium epistemics.

She reads an epistemically dangerous article about Holocaust denialism. Her epistemic level is taken into account, and she must roll a 6-sided die.

5-6: She understands false news better. +1 to epistemics.

4: She glances over it. No effect.

2-3: She’s a tiny bit convinced. -1 to sanity.

1: She’s deeply convinced. -3 to sanity, -1 to epistemics.

Amelia rolls a 2.

Much later, Amelia is interested in writing a public blog post of her choosing. Her new intelligence, sanity, and epistemics are taken into account. A die is rolled.

Enlightened Utility Points are used as the victory points in this game. It’s a concept similar to CEV. It’s meant to represent what Amelia would value if she were completely sane and intelligent.

4-6: She writes it about a generally reasonable topic. +1 Enlightened Utility Point

2-3: She writes a post presenting some evidence for Holocaust denialism. -2 Enlightened Utility Points

1: She comes up with novel and effective arguments that will greatly help the Holocaust denialism side. -20 Enlightened Utility Points.

Amelia rolls a 4.
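The reading-event mechanic above can be sketched in a few lines of code (a toy sketch; the stat effects and die results come from the example, everything else is my own naming):

```python
import random

# Die result -> (change to epistemics, change to sanity), per the table above.
READ_OUTCOMES = {
    6: (1, 0),    # 5-6: understands false news better
    5: (1, 0),
    4: (0, 0),    # 4: glances over it, no effect
    3: (0, -1),   # 2-3: a tiny bit convinced
    2: (0, -1),
    1: (-1, -3),  # 1: deeply convinced
}

def read_hazardous_article(epistemics, sanity, roll=None):
    """Resolve one epistemically hazardous read; rolls a d6 if none is given."""
    if roll is None:
        roll = random.randint(1, 6)
    d_epistemics, d_sanity = READ_OUTCOMES[roll]
    return epistemics + d_epistemics, sanity + d_sanity
```

Amelia’s roll of 2 costs her a point of sanity, and her reduced stats then feed into later events like the blog post roll.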

----

[1] To be specific, pure math is mostly safe from epistemic hazards. However, I would classify it more as an intelligence field than an epistemics field. Organizations with poor sanity might use mathematical knowledge in harmful ways.

Discuss

Book Review Review (end of the bounty program)

15 октября, 2021 - 06:23
Published on October 15, 2021 3:23 AM GMT

This post calls an official end to the LessWrong Sep/Oct 2021 Book Review Bounty Program!! A huge thank you to everyone who submitted! You've produced some superb content.

I launched the program with the goals of getting more great content on the site, encouraging people to practice useful skills, getting people to learn about many aspects of the world, and just surfacing some talented writers/thinkers. I think the program succeeded.

36 book reviews were posted in the last month. I have paid out $500 to nine of them so far, and expect to pay out several more once: (a) I've finished reading them all, (b) the authors have messaged me to request payment. If you want to collect the bounty for your book review, please message me via Intercom or email (ruby@lesswrong.com).

In terms of encouraging people to learn about many aspects of the world, the submitted book reviews spanned parenting, history, philosophy, immigration policy, magic, the microbiome, operational excellence, mathematical logic, moral psychology, and yet more topics. These are some of my personal favorites that I've read so far, in no particular order:

You can see all of the reviews in the Book Reviews tag. Make sure to click "load more".

As far as surfacing new talent goes, quite a few contributors were making their first post on LessWrong. Kudos to Sam Marks and TheRealSlimHippo, authors of two of my favorites listed above, who are new to posting. Great first contributions. One review author told me that he was initially too shy to write anything on LessWrong, but that the $500 incentive was actually enough to get him to do it. He sent me this image:

Reflections

The bounty program demonstrated to me that we can incentivize the creation of good content with bounties, perhaps (in this case) making it easier for people to spend the 10-30 hours required to produce a good review. I plan for LessWrong to experiment more with such programs.

If I have any reservations about this program, it's that I feel some of the entries were lacking in "core LessWrong virtue". Something like they were missing the epistemic focus that most LessWrong essays have, even when they were otherwise engagingly and enjoyably written. I don't think this is insurmountable–clearer and more actionable requirements, as well as better onboarding for new contributors, can be provided–but it is something to be mindful of in the design of these programs.

What comes next?

I expect to run more writing/research bounty programs in the near future! Probably we will cycle through a variety of writing/research tasks beyond book reviews, e.g. distillation/summarization tasks, writing wiki articles, answering open questions and similar.

If you have an idea for the kind of writing we should incentivize with bounties, please comment below.

Thanks again to everyone who wrote a book review! (Or three!)

Discuss

Interpreting Diagnostic Tests

15 октября, 2021 - 06:20
Published on October 15, 2021 2:58 AM GMT

Most people—including most physicians—don’t understand how to correctly interpret diagnostic medical tests.

Here’s a concrete example: the BinaxNOW home COVID test has 68% sensitivity (chance of giving a positive result if you have COVID) and 99% specificity (chance of giving a negative result if you don’t have COVID). So does that mean positive test results are 68% accurate and negative results are 99% accurate? Unfortunately, it isn’t that simple.

My COVID risk level means that for me, a positive test only has a 6% chance of being accurate but a negative test has a 99.97% chance of being accurate. Your odds might be completely different, depending on your risk level.

In this post, I’ll explain how it all works and show you how to understand test results. (Spoiler: aside from some medical terminology, this is just an application of Bayes’ Theorem).

Sensitivity and specificity

Let’s start with the easy part. Sensitivity and specificity measure the intrinsic accuracy of a test regardless of who’s taking it.

Sensitivity is how often a test gives a positive result when testing someone who is actually positive. BinaxNOW has a sensitivity of 68%, so if 100 people with COVID take a BinaxNOW test, 68 of them will test positive.

Specificity is how often a test gives a negative result when testing someone who is actually negative. BinaxNOW has a specificity of 99%, so if 100 people who do not have COVID take a BinaxNOW test, 99 of them will test negative.

Why does your risk level matter?

Let’s do a thought experiment:

If 100 people who have COVID take BinaxNOW tests, they will get 68 positive results and 32 negative results. All the positives are correct (0% false positive rate) and all the negatives are incorrect (100% false negative rate).

If 100 people who don't have COVID take BinaxNOW tests, they will get 1 incorrect positive result (100% false positive rate) and 99 correct negative results (0% false negative rate).

The same test has completely different false positive and false negative rates, depending on how likely it is that the person taking it has COVID. So how do I calculate the test’s accuracy based on my risk level?

First, some terminology

The numbers I want are:

Positive predictive value (PPV): how accurate is a positive test result? If I get a positive test result, the PPV is the chance that I truly have COVID.

Negative predictive value (NPV): how accurate is a negative test result? If I get a negative test result, the NPV is the chance that I truly don’t have COVID.

In order to calculate those, I need to know:

Prior probability (sometimes called pre-test probability or prevalence): what is the probability that I have COVID based on my risk level, symptoms, known exposure, etc.?

I’m on a 500 microCOVID per week risk budget, which means my chance of having COVID at any given time is approximately 0.1%. So my prior probability is 0.1%. (Assuming I don’t have any symptoms: if I suddenly get a fever and lose my sense of smell, my prior probability might be close to 100%).

This calculator by Epitools lets you calculate PPV and NPV. In my case, I put 0.68 in the Sensitivity box, 0.99 in the Specificity box, .001 in the Prior probability box, and press Submit. After waiting a surprisingly long time, the calculator tells me my PPV is 6.4% and my NPV is 99.97%.

A graphical explanation

The BMJ has an excellent Covid-19 test calculator that does a nice job of graphically representing how this works. I recommend you take two minutes to play with it if you want to develop an intuitive understanding of how this works.

Unfortunately, the BMJ calculator doesn’t have the precision to calculate PPV and NPV for someone with a very low prior probability.

A numerical explanation

If you’re familiar with Bayes’ Theorem, you already know how to do the math. If not, here’s a quick summary of how to calculate PPV and NPV yourself.

Probabilities can be hard to think about: for most people, it’s easiest to imagine a large number of people taking the same test. So let’s imagine 1,000,000 of me taking the test. (Is the world ready for a million Tornuses? I say yes!)

Because my prior probability is 0.1%, 1,000 of the hypothetical Tornuses have COVID. The sensitivity of the test is 68%, so 680 of them get correct positive results and 320 get incorrect negative results.

What about the remaining 999,000 Tornuses who don’t have COVID? The specificity is 99%, so 989,010 get correct negative results and 9,990 get incorrect positive results.

So how do we calculate PPV? There are 10,670 positive tests (680 + 9,990), of which 680 are accurate. So the odds of a positive test being accurate are 6.4% (680 / 10,670).

If I get a positive test result, it has a 6.4% chance of being accurate.

How about the NPV? There are 989,330 negative tests, of which 989,010 are accurate. NPV = 989,010 / 989,330 = 99.97%.

If I get a negative test result, it has a 99.97% chance of being accurate.

Step by step

1a. Imagine 1,000,000 people taking the test

2a. Truly positive people = 1,000,000 x (prior probability)
2b. Correct positives = (truly positive people) x (sensitivity)
2c. Incorrect negatives = (truly positive people) x (1 - sensitivity)

3a. Truly negative people = 1,000,000 x (1 - prior probability)
3b. Correct negatives = (truly negative people) x (specificity)
3c. Incorrect positives = (truly negative people) x (1 - specificity)

4a. PPV = (correct positives) / (correct positives + incorrect positives)
4b. NPV = (correct negatives) / (correct negatives + incorrect negatives)
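The whole recipe collapses into a few lines of code (a sketch of the steps above; the 1,000,000-person framing cancels out, so we can work with probabilities directly):

```python
def predictive_values(sensitivity, specificity, prior):
    """Compute PPV and NPV via Bayes' theorem, following steps 2-4 above."""
    correct_pos = prior * sensitivity                # step 2b
    incorrect_neg = prior * (1 - sensitivity)        # step 2c
    correct_neg = (1 - prior) * specificity          # step 3b
    incorrect_pos = (1 - prior) * (1 - specificity)  # step 3c
    ppv = correct_pos / (correct_pos + incorrect_pos)   # step 4a
    npv = correct_neg / (correct_neg + incorrect_neg)   # step 4b
    return ppv, npv

# BinaxNOW at the author's 0.1% prior:
ppv, npv = predictive_values(0.68, 0.99, 0.001)  # ≈ (0.064, 0.9997)
```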

In closing

Don’t feel too bad if this doesn’t make intuitive sense to you. If you understand the question and you know where to calculate the answer, you’re ahead of most physicians.

Discuss

Book Review: A Pattern Language by Christopher Alexander

15 октября, 2021 - 04:11
Published on October 15, 2021 1:11 AM GMT

This is an ambitious, opinionated book about how to live.

Ambitious, because its scope is enormous -- how far apart cities should be placed ("2. the distribution of towns"), zoning ("3. city country fingers"), street maps ("23. parallel roads"), recreation ("31. promenade"), beauty ("62. high places"), interior architecture ("133. staircase as a stage"), interior design ("200. open shelves"), building ("214. root foundations"), and decoration ("253. things from your life").

Opinionated, because it has specific prescriptions for all these 253 things. Some are fairly wacky: "3. city country fingers" recommends interlacing urban and countryside in "fingers", so that nobody in the city is ever more than a ten-minute walk from the countryside. Some are incredibly specific: "22. nine percent parking" sets an upper limit on land area used by parking, and has dire warnings about "the fabric of society is threatened" if exceeding that even in a small area.

A Pattern Language is meticulously organized and numbered from big to small -- high to low abstraction level. Each prescription has an epistemic status:

In the patterns marked with two asterisks, we believe that we have succeeded in stating a true invariant: in short, that the solution we have stated summarizes a property common to all possible ways of solving the problem. [...]

In the patterns marked with one asterisk, we believe that we have made some progress towards identifying such an invariant: but that with careful work it will certainly be possible to improve on the solution. In these cases, we believe it would be wise for you to treat the pattern with a certain amount of disrespect. [...]

Finally, in the patterns without an asterisk, we are certain that we have not succeeded in defining a true invariant... In these cases we have still stated a solution, in order to be concrete.

We even see a form of wiki-style hyperlinking -- each pattern references certain other patterns as being particularly related. These hyperlinks are grouped into going upwards or downwards on the abstraction scale:

Notice that the other patterns mentioned by name at the beginning and at the end, of the pattern you are reading, are also possible candidates for your language. The ones at the beginning will tend to be "larger" than your project. Don't include them, unless you have the power to help create these patterns, at least in a small way, in the world around your project. The ones at the end are "smaller." Almost all of them will be important. Tick all of them, on your list, unless you have some special reason for not wanting to include them.

Now your list has some more ticks on it. Turn to the next highest pattern on the list which is ticked, and open the book to that pattern....

A Pattern Language was published in 1977 and is full of Hanson or Thiel-style contrarian gems: "the nuclear family is not by itself a viable social form" (75. the family); "people cannot be genuinely comfortable and healthy in a house which is not theirs" (79. your own home); "individuals have no effective voice in any community of more than 5000-10000 persons" (12. community of 7000); "high buildings make people crazy" (21. four story limit).

My favorite (and possibly the most radical) pattern is "39. Housing Hill", recommending that in a place you want dense housing (30-50 houses per acre), build stepped/terraced apartment buildings. Each home is a single story with a garden on the below home's roof. "The terraces must be south-facing, large, and intimately connected to the houses, and solid enough for earth, and bushes, and small trees... served by a great central open stair which also faces south and leads toward a common garden."

The book is overtly about architecture and design, but its secret identity is a manual for how to live a better life. I think this touches on what some people may dislike about it. The authors' lifestyle & ethos pervades the book: they have figured out what makes them happy and are trying to spread those ideas. But then they write it in this academic style -- here is the best way to build, here are the studies we ran. It seems overconfident.

Still, I find it inspirational. I already wanted, but now even more, to design places.

Thanks to Ben Kuhn for recommending I read it.

Discuss